I'm a final-year PhD student at Meta AI (FAIR) and University College London (UCL). At UCL, I am supervised by Tim Rocktäschel in the UCL DARK Lab. I'm also part of the ELLIS PhD & Postdoc Program.
I hold an MSc degree in Computer Science from the University of Oxford, where I worked in the Whiteson Research Lab advised by Shimon Whiteson. Prior to that, I obtained Master’s and Bachelor’s degrees in Informatics and Applied Mathematics from Yerevan State University. I have held research and development positions at Reddit, Mentor, Toptal and USC Information Sciences Institute.
My research focuses on reinforcement learning, multi-agent learning, and open-endedness. My early work includes widely used tools for multi-agent RL, such as the QMIX method and the SMAC benchmark. Much of my follow-up work focuses on using open-ended learning to train generally capable RL agents and diagnose their robustness. Recently, I used these techniques to enhance the safety of LLMs with approaches like Rainbow Teaming, which identifies vulnerabilities and generates synthetic data to improve LLM robustness. I also contributed to Meta Llama 3.
My long-term goal is to develop methods that give AI agents endless learning opportunities, enabling them to perform an ever-expanding range of tasks and become increasingly robust.
Contact: mikayel [at] samvelyan [dot] com
News
- [Sep 2024] Rainbow Teaming and JaxMARL have been accepted to NeurIPS 2024.
- [Apr 2024] We released Meta Llama 3, the most capable openly available LLM to date. See our model card.
- [Feb 2024] We released Rainbow Teaming, a new approach for generating diverse adversarial prompts!
- [Jan 2024] MADRID has been accepted to AAMAS 2024.
- [Sep 2023] SMACv2 has been accepted to NeurIPS 2023.
- [Jul 2023] Co-organizing the 2nd Workshop on Agent Learning in Open-Endedness (ALOE) at NeurIPS 2023.
- [Mar-May 2023] Invited talks at UC Berkeley MARL Seminar, InstaDeep, and University of Maryland.
- [Feb 2023] MAESTRO has been accepted to ICLR 2023.
- [Dec 2022] We released SMACv2, an improved version of the StarCraft Multi-Agent Challenge.
Highlighted Papers
The Llama 3 Herd of Models
Llama Team
arXiv, 2024
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
M Samvelyan*, S Raparthy*, A Lupu*, E Hambro, A Markosyan, M Bhatt, Y Mao, M Jiang, J Parker-Holder, J Foerster, T Rocktäschel, R Raileanu
NeurIPS, 2024
@misc{samvelyan2024rainbow, title={Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts}, author={Mikayel Samvelyan and Sharath Chandra Raparthy and Andrei Lupu and Eric Hambro and Aram H. Markosyan and Manish Bhatt and Yuning Mao and Minqi Jiang and Jack Parker-Holder and Jakob Foerster and Tim Rocktäschel and Roberta Raileanu}, year={2024}, eprint={2402.16822}, archivePrefix={arXiv}, primaryClass={cs.CL} }
Multi-Agent Diagnostics for Robustness via Illuminated Diversity
M Samvelyan*, D Paglieri*, M Jiang, J Parker-Holder, T Rocktäschel
AAMAS, 2024 (Oral)
MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning
M Samvelyan, A Khan, M Dennis, M Jiang, J Parker-Holder, J Foerster, R Raileanu, T Rocktäschel
ICLR, 2023
@inproceedings{samvelyan2023maestro, title={{MAESTRO}: Open-Ended Environment Design for Multi-Agent Reinforcement Learning}, author={Mikayel Samvelyan and Akbir Khan and Michael D Dennis and Minqi Jiang and Jack Parker-Holder and Jakob Nicolaus Foerster and Roberta Raileanu and Tim Rockt{\"a}schel}, booktitle={International Conference on Learning Representations}, year={2023}, url={https://openreview.net/forum?id=sKWlRDzPfd7} }
MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research
M Samvelyan, R Kirk, V Kurin, J Parker-Holder, M Jiang, E Hambro, F Petroni, H Küttler, E Grefenstette, T Rocktäschel
NeurIPS, 2021
@inproceedings{samvelyan2021minihack, title={MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research}, author={Mikayel Samvelyan and Robert Kirk and Vitaly Kurin and Jack Parker-Holder and Minqi Jiang and Eric Hambro and Fabio Petroni and Heinrich K{\"u}ttler and Edward Grefenstette and Tim Rockt{\"a}schel}, booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1)}, year={2021}, url={https://openreview.net/forum?id=skFwlyefkWJ} }
Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
T Rashid*, M Samvelyan*, C Schroeder de Witt, G Farquhar, J Foerster, S Whiteson
Journal of Machine Learning Research (JMLR), 2020
@article{rashid20monotonic, author = {Tabish Rashid and Mikayel Samvelyan and Christian Schroeder de Witt and Gregory Farquhar and Jakob Foerster and Shimon Whiteson}, title = {Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning}, journal = {Journal of Machine Learning Research}, year = {2020}, volume = {21}, number = {178}, pages = {1--51}, }
The StarCraft Multi-Agent Challenge
M Samvelyan*, T Rashid*, C Schroeder de Witt, G Farquhar, N Nardelli, T Rudner, C Hung, P Torr, J Foerster, S Whiteson
AAMAS, 2019
@inproceedings{samvelyan2019smac, title = {{The} {StarCraft} {Multi}-{Agent} {Challenge}}, author = {Samvelyan, Mikayel and Rashid, Tabish and Schroeder de Witt, Christian and Farquhar, Gregory and Nardelli, Nantas and Rudner, Tim G. J. and Hung, Chia-Man and Torr, Philip H. S. and Foerster, Jakob and Whiteson, Shimon}, booktitle = {Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems}, pages = {2186--2188}, year = {2019}, }
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
T Rashid*, M Samvelyan*, C Schroeder de Witt, G Farquhar, J Foerster, S Whiteson
ICML, 2018
@inproceedings{rashid18qmix, title = {{QMIX}: {Monotonic} {Value} {Function} {Factorisation} {for} {Deep} {Multi}-{Agent} {Reinforcement} {Learning}}, author = {Rashid, Tabish and Samvelyan, Mikayel and Schroeder, Christian and Farquhar, Gregory and Foerster, Jakob and Whiteson, Shimon}, booktitle = {Proceedings of the 35th International Conference on Machine Learning}, publisher = {PMLR}, volume = {80}, pages = {4295--4304}, year = {2018}, }
Other Selected Papers
See Google Scholar for more publications.
JaxMARL: Multi-Agent RL Environments and Algorithms in JAX
A Rutherford, B Ellis, M Gallici, J Cook, A Lupu, G Ingvarsson, T Willi, A Khan, C Schroeder de Witt, A Souly, S Bandyopadhyay, M Samvelyan, M Jiang, R Lange, S Whiteson, B Lacerda, N Hawes, T Rocktäschel, C Lu, J Foerster
NeurIPS, 2024
@misc{rutherford2023jaxmarl, title={JaxMARL: Multi-Agent RL Environments in JAX}, author={Alexander Rutherford and Benjamin Ellis and Matteo Gallici and Jonathan Cook and Andrei Lupu and Gardar Ingvarsson and Timon Willi and Akbir Khan and Christian Schroeder de Witt and Alexandra Souly and Saptarashmi Bandyopadhyay and Mikayel Samvelyan and Minqi Jiang and Robert Tjarko Lange and Shimon Whiteson and Bruno Lacerda and Nick Hawes and Tim Rockt{\"a}schel and Chris Lu and Jakob Nicolaus Foerster}, year={2023}, eprint={2311.10090}, archivePrefix={arXiv}, primaryClass={cs.LG} }
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning
M Matthews, M Beukman, B Ellis, M Samvelyan, M Jackson, S Coward, J Foerster
ICML, 2024 (Spotlight)
SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning
B Ellis, J Cook, S Moalla, M Samvelyan, M Sun, A Mahajan, J Foerster, S Whiteson
NeurIPS, 2023
@inproceedings{ellis2023smacv2, title={{SMAC}v2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning}, author={Benjamin Ellis and Jonathan Cook and Skander Moalla and Mikayel Samvelyan and Mingfei Sun and Anuj Mahajan and Jakob Nicolaus Foerster and Shimon Whiteson}, booktitle={Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track}, year={2023}, url={https://openreview.net/forum?id=5OjLGiJW3u} }
Mix-ME: Quality-Diversity for Multi-Agent Learning
G Ingvarsson, M Samvelyan, M Flageat, B Lim, A Cully, T Rocktäschel
ALOE Workshop @ NeurIPS, 2023
@inproceedings{ingvarsson2023mixme, title={Mix-{ME}: Quality-Diversity for Multi-Agent Learning}, author={Gar{\dh}ar Ingvarsson and Mikayel Samvelyan and Manon Flageat and Bryan Lim and Antoine Cully and Tim Rockt{\"a}schel}, booktitle={Second Agent Learning in Open-Endedness Workshop}, year={2023}, url={https://openreview.net/forum?id=acD8BxMjwV} }
GriddlyJS: A Web IDE for Reinforcement Learning
C Bamford, M Jiang, M Samvelyan, T Rocktäschel
NeurIPS, 2022
@inproceedings{bamford2022griddlyjs, title={Griddly{JS}: A Web {IDE} for Reinforcement Learning}, author={Christopher Bamford and Minqi Jiang and Mikayel Samvelyan and Tim Rockt{\"a}schel}, booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track}, year={2022}, url={https://openreview.net/forum?id=YmacJv0i_UR} }
Evolving Curricula with Regret-Based Environment Design
J Parker-Holder*, M Jiang*, M Dennis, M Samvelyan, J Foerster, E Grefenstette, T Rocktäschel
ICML, 2022
@article{parkerholder2022evolving, title={Evolving Curricula with Regret-Based Environment Design}, author={Parker-Holder, Jack and Jiang, Minqi and Dennis, Michael D and Samvelyan, Mikayel and Foerster, Jakob Nicolaus and Grefenstette, Edward and Rockt{\"a}schel, Tim}, journal={arXiv preprint arXiv:2203.01302}, year={2022} }
Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning
M Matthews, M Samvelyan, J Parker-Holder, E Grefenstette, T Rocktäschel
CoLLAs, 2022
@misc{matthews2022hierarchical, url = {https://arxiv.org/abs/2207.11584}, author = {Matthews, Michael and Samvelyan, Mikayel and Parker-Holder, Jack and Grefenstette, Edward and Rocktäschel, Tim}, keywords = {Machine Learning (cs.LG), Artificial Intelligence (cs.AI), FOS: Computer and information sciences, FOS: Computer and information sciences}, title = {Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning}, publisher = {arXiv}, year = {2022}, }
Generalization in Cooperative Multi-Agent Systems
A Mahajan, M Samvelyan, T Gupta, B Ellis, M Sun, T Rocktäschel, S Whiteson
arXiv, 2022
Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning
A Mahajan, M Samvelyan, L Mao, V Makoviychuk, A Garg, J Kossaifi, S Whiteson, Y Zhu, A Anandkumar
ICML, 2021
@inproceedings{mahajan2021tesseract, title = {Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning}, author = {Mahajan, Anuj and Samvelyan, Mikayel and Mao, Lei and Makoviychuk, Viktor and Garg, Animesh and Kossaifi, Jean and Whiteson, Shimon and Zhu, Yuke and Anandkumar, Animashree}, booktitle = {Proceedings of the 38th International Conference on Machine Learning}, publisher = {PMLR}, volume = {139}, pages = {7301--7312}, year = {2021}, }
MAVEN: Multi-Agent Variational Exploration
A Mahajan, T Rashid, M Samvelyan, S Whiteson
NeurIPS, 2019
Teaching
- Spring 2023 - COMP0087 Statistical Natural Language Processing (TA)
- Spring 2022 - COMP0089 Reinforcement Learning (TA)
- Spring 2022 - COMP0087 Statistical Natural Language Processing (TA)
- Spring 2021 - COMP0089 Reinforcement Learning (TA)
- Spring 2021 - COMP0087 Statistical Natural Language Processing (TA)
- Spring 2020 - Data Structures (TA)
- Fall 2019 - Machine Learning (Lecturer)
- Fall 2018 - Machine Learning (Lecturer)
- Fall 2018 - Operating Systems (TA)
- Fall 2018 - Artificial Intelligence (Guest Lecturer and TA)