Mikayel Samvelyan

mikayel [at] samvelyan [dot] com

I am a PhD student splitting my time between University College London (UCL) and Meta AI (FAIR). I am supervised by Tim Rocktäschel at the UCL Deciding, Acting, and Reasoning with Knowledge (DARK) Lab. I am also part of the ELLIS PhD & Postdoc Program.

I hold an MSc in Computer Science from the University of Oxford, where I worked in the Whiteson Research Lab, advised by Shimon Whiteson. Prior to that, I obtained Master’s and Bachelor’s degrees in Informatics and Applied Mathematics from Yerevan State University. I have previously held research and development engineering positions at Reddit, Mentor, Toptal, and the USC Information Sciences Institute.

My research interests lie in the areas of deep reinforcement learning, multi-agent learning, and open-endedness.

News

Libraries

Publications

Journal Papers

Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
T Rashid*, M Samvelyan*, C Schroeder de Witt, G Farquhar, J Foerster, S Whiteson
Journal of Machine Learning Research (JMLR), 2020

@article{rashid20monotonic,
 author  = {Tabish Rashid and Mikayel Samvelyan and Christian Schroeder de Witt and Gregory Farquhar and Jakob Foerster and Shimon Whiteson},
 title   = {Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning},
 journal = {Journal of Machine Learning Research},
 year    = {2020},
 volume  = {21},
 number  = {178},
 pages   = {1--51},
}

Conference Papers

GriddlyJS: A Web IDE for Reinforcement Learning
C Bamford, M Jiang, M Samvelyan, T Rocktäschel
Conference on Neural Information Processing Systems (NeurIPS), 2022

@inproceedings{bamford2022griddlyjs,
title={Griddly{JS}: A Web {IDE} for Reinforcement Learning},
author={Christopher Bamford and Minqi Jiang and Mikayel Samvelyan and Tim Rockt{\"a}schel},
booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
year={2022},
url={https://openreview.net/forum?id=YmacJv0i_UR}
}

Evolving Curricula with Regret-Based Environment Design
J Parker-Holder*, M Jiang*, M Dennis, M Samvelyan, J Foerster, E Grefenstette, T Rocktäschel
International Conference on Machine Learning (ICML), 2022

@article{parkerholder2022evolving,
title={Evolving Curricula with Regret-Based Environment Design},
author={Parker-Holder, Jack and Jiang, Minqi and Dennis, Michael D and Samvelyan, Mikayel and Foerster, Jakob Nicolaus and Grefenstette, Edward and Rockt{\"a}schel, Tim},
journal={arXiv preprint arXiv:2203.01302},
year={2022}
}

Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning
M Matthews, M Samvelyan, J Parker-Holder, E Grefenstette, T Rocktäschel
Conference on Lifelong Learning Agents (CoLLAs), 2022

@misc{matthews2022hierarchical,
url = {https://arxiv.org/abs/2207.11584},
author = {Matthews, Michael and Samvelyan, Mikayel and Parker-Holder, Jack and Grefenstette, Edward and Rockt{\"a}schel, Tim},
keywords = {Machine Learning (cs.LG), Artificial Intelligence (cs.AI), FOS: Computer and information sciences},
title = {Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning},
publisher = {arXiv},
year = {2022},
}

MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research
M Samvelyan, R Kirk, V Kurin, J Parker-Holder, M Jiang, E Hambro, F Petroni, H Küttler, E Grefenstette, T Rocktäschel
Conference on Neural Information Processing Systems (NeurIPS), 2021

@inproceedings{samvelyan2021minihack,
title={MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research},
author={Mikayel Samvelyan and Robert Kirk and Vitaly Kurin and Jack Parker-Holder and Minqi Jiang and Eric Hambro and Fabio Petroni and Heinrich K{\"u}ttler and Edward Grefenstette and Tim Rockt{\"a}schel},
booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1)},
year={2021},
url={https://openreview.net/forum?id=skFwlyefkWJ}
}

Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning
A Mahajan, M Samvelyan, L Mao, V Makoviychuk, A Garg, J Kossaifi, S Whiteson, Y Zhu, A Anandkumar
International Conference on Machine Learning (ICML), 2021

@inproceedings{mahajan2021tesseract,
title = {Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning},
author = {Mahajan, Anuj and Samvelyan, Mikayel and Mao, Lei and Makoviychuk, Viktor and Garg, Animesh and Kossaifi, Jean and Whiteson, Shimon and Zhu, Yuke and Anandkumar, Animashree},
booktitle = {Proceedings of the 38th International Conference on Machine Learning},
publisher = {PMLR},
volume = {139},
pages = {7301--7312},
year = {2021},
}

MAVEN: Multi-Agent Variational Exploration
A Mahajan, T Rashid, M Samvelyan, S Whiteson
Conference on Neural Information Processing Systems (NeurIPS), 2019

@incollection{mahajan2019maven,
title = {{MAVEN}: {Multi}-{Agent} {Variational} {Exploration}},
author = {Mahajan, Anuj and Rashid, Tabish and Samvelyan, Mikayel and Whiteson, Shimon},
booktitle = {Advances in Neural Information Processing Systems 32},
pages = {7611--7622},
year = {2019},
}

The StarCraft Multi-Agent Challenge
M Samvelyan*, T Rashid*, C Schroeder de Witt, G Farquhar, N Nardelli, T Rudner, C Hung, P Torr, J Foerster, S Whiteson
International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2019

@inproceedings{samvelyan2019smac,
title = {{The} {StarCraft} {Multi}-{Agent} {Challenge}},
author = {Samvelyan, Mikayel and Rashid, Tabish and Schroeder de Witt, Christian and Farquhar, Gregory and Nardelli, Nantas and Rudner, Tim G. J. and Hung, Chia-Man and Torr, Philip H. S. and Foerster, Jakob and Whiteson, Shimon},
booktitle = {Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems},
pages = {2186--2188},
year = {2019},
}

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
T Rashid*, M Samvelyan*, C Schroeder de Witt, G Farquhar, J Foerster, S Whiteson
International Conference on Machine Learning (ICML), 2018

@inproceedings{rashid18qmix,
title = {{QMIX}: {Monotonic} {Value} {Function} {Factorisation} {for} {Deep} {Multi}-{Agent} {Reinforcement} {Learning}},
author = {Rashid, Tabish and Samvelyan, Mikayel and Schroeder de Witt, Christian and Farquhar, Gregory and Foerster, Jakob and Whiteson, Shimon},
booktitle = {Proceedings of the 35th International Conference on Machine Learning},
publisher = {PMLR},
volume = {80},
pages = {4295--4304},
year = {2018},
}

Preprints

SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning
B Ellis, S Moalla, M Samvelyan, M Sun, A Mahajan, J Foerster, S Whiteson
arXiv, 2022

@misc{ellis2022smacv2,
url = {https://arxiv.org/abs/2212.07489},
author = {Ellis, Benjamin and Moalla, Skander and Samvelyan, Mikayel and Sun, Mingfei and Mahajan, Anuj and Foerster, Jakob N. and Whiteson, Shimon},
title = {SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning},
publisher = {arXiv},
year = {2022},
}

Generalization in Cooperative Multi-Agent Systems
A Mahajan, M Samvelyan, T Gupta, B Ellis, M Sun, T Rocktäschel, S Whiteson
arXiv, 2022

@article{mahajan2022generalization, 
title={Generalization in Cooperative Multi-Agent Systems}, 
author={Mahajan, Anuj and Samvelyan, Mikayel and Gupta, Tarun and Ellis, Benjamin and Sun, Mingfei and Rockt{\"a}schel, Tim and Whiteson, Shimon}, 
journal={arXiv preprint arXiv:2202.00104}, 
year={2022},
}

Teaching

  • Spring 2020 - Data Structures (TA)
  • Fall 2019 - Machine Learning (Lecturer)
  • Fall 2018 - Machine Learning (Lecturer)
  • Fall 2018 - Operating Systems (TA)