Mohammad Gheshlaghi Azar

Cited by

	All	Since 2019
Citations	11760	11252
h-index	26	24
i10-index	34	33

Citations per year

Year	Citations
2015	40
2016	52
2017	75
2018	262
2019	559
2020	844
2021	1807
2022	2938
2023	3709
2024	1384

Public access

2 articles

0 articles

available

not available

Based on funding mandates

Co-authors

Rémi Munos	DeepMind	Verified email at inria.fr
Bilal Piot	Google DeepMind	Verified email at google.com
Michal Valko	Llama @ Meta Paris & Inria & MVA - Ex: Gemini and BYOL @ Google DeepMind	Verified email at meta.com
Zhaohan Daniel Guo	DeepMind	Verified email at google.com
Florent Altché	Research Engineer, DeepMind	Verified email at google.com
Jean-bastien Grill		Verified email at google.com
Corentin Tallec	DeepMind	Verified email at google.com
Hado van Hasselt	Research Scientist, DeepMind; Honorary Professor, UCL	Verified email at google.com
Hilbert Johan Kappen	Radboud University	Verified email at science.ru.nl
Will Dabney	DeepMind	Verified email at google.com
Pierre Richemond	Google DeepMind	Verified email at deepmind.com
Florian STRUB	DeepMind	Verified email at google.com
Elena Buchatskaya	Research Engineer, Google DeepMind	Verified email at google.com
Matteo Hessel	Research Engineer, Google DeepMind	Verified email at google.com
Dan Horgan	Google DeepMind	Verified email at google.com
Eva L. Dyer	Georgia Institute of Technology	Verified email at gatech.edu
Carl Doersch	Google DeepMind	Verified email at google.com
Shantanu Thakoor	Research Engineer at DeepMind	Verified email at google.com
Tom Schaul	Senior Staff Scientist, DeepMind	Verified email at nyu.edu
Mark Rowland	Research Scientist, Google DeepMind	Verified email at google.com

Mohammad Gheshlaghi Azar

Cohere

Verified email at google.com

RL for Generative AI, Self-Supervised Learning, Exploration, Optimization


Title	Authors	Venue	Cited by	Year
Bootstrap your own latent: a new approach to self-supervised learning	JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...	Advances in neural information processing systems 33, 21271-21284, 2020	5874	2020
Rainbow: Combining improvements in deep reinforcement learning	M Hessel, J Modayil, H Van Hasselt, T Schaul, G Ostrovski, W Dabney, ...	Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018	2496	2018
Minimax regret bounds for reinforcement learning	MG Azar, I Osband, R Munos	International conference on machine learning, 263-272, 2017	781	2017
Large-scale representation learning on graphs via bootstrapping	S Thakoor, C Tallec, MG Azar, M Azabou, EL Dyer, R Munos, P Veličković, ...	arXiv preprint arXiv:2102.06514, 2021	337*	2021
Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model	M Gheshlaghi Azar, R Munos, HJ Kappen	Machine learning 91, 325-349, 2013	281	2013
Speedy Q-Learning	MG Azar, M Ghavamzadeh, HJ Kappen, R Munos	Advances in Neural Information Processing Systems, 2411-2419, 2011	199*	2011
The reactor: A fast and sample-efficient actor-critic agent for reinforcement learning	A Gruslys, W Dabney, MG Azar, B Piot, M Bellemare, R Munos	arXiv preprint arXiv:1704.04651, 2017	165*	2017
Dynamic Policy Programming	M Gheshlaghi Azar, V Gomez, HJ Kappen	Journal of Machine Learning Research 13, 3207-3245, 2012	144	2012
Bootstrap latent-predictive representations for multitask reinforcement learning	ZD Guo, BA Pires, B Piot, JB Grill, F Altché, R Munos, MG Azar	International Conference on Machine Learning, 3875-3886, 2020	138	2020
Observe and look further: Achieving consistent performance on Atari	T Pohlen, B Piot, T Hester, MG Azar, D Horgan, D Budden, G Barth-Maron, ...	arXiv preprint arXiv:1805.11593, 2018	129	2018
Sequential transfer in multi-armed bandit with finite set of models	MG Azar, A Lazaric, E Brunskill	Advances in Neural Information Processing Systems, 2220-2228, 2013	113	2013
On the sample complexity of reinforcement learning with a generative model	MG Azar, R Munos, B Kappen	arXiv preprint arXiv:1206.6461, 2012	113	2012
Hindsight credit assignment	A Harutyunyan, W Dabney, T Mesnard, M Gheshlaghi Azar, B Piot, ...	Advances in neural information processing systems 32, 2019	89	2019
Neural predictive belief representations	ZD Guo, MG Azar, B Piot, BA Pires, R Munos	arXiv preprint arXiv:1811.06407, 2018	83	2018
Meta-learning of sequential strategies	PA Ortega, JX Wang, M Rowland, T Genewein, Z Kurth-Nelson, ...	arXiv preprint arXiv:1905.03030, 2019	82	2019
Stochastic optimization of a locally smooth function under correlated bandit feedback	MG Azar, A Lazaric, E Brunskill	31st International Conference on Machine Learning (ICML), 2014	66*	2014
k. kavukcuoglu, R	JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...	Munos, and M. Valko, "Bootstrap your own latent-a new approach to self …, 2020	63	2020
A cryptography-based approach for movement decoding	EL Dyer, M Gheshlaghi Azar, MG Perich, HL Fernandes, S Naufel, ...	Nature biomedical engineering 1 (12), 967-976, 2017	63	2017
A general theoretical paradigm to understand learning from human preferences	MG Azar, ZD Guo, B Piot, R Munos, M Rowland, M Valko, D Calandriello	International Conference on Artificial Intelligence and Statistics, 4447-4455, 2024	53	2024
BYOL-Explore: Exploration by bootstrapped prediction	Z Guo, S Thakoor, M Pîslar, B Avila Pires, F Altché, C Tallec, A Saade, ...	Advances in neural information processing systems 35, 31855-31870, 2022	53	2022
