PaLM: Scaling language modeling with pathways A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ... arXiv preprint arXiv:2204.02311, 2022 | 2112 | 2022 |
Deep graph infomax P Veličković, W Fedus, WL Hamilton, P Liò, Y Bengio, RD Hjelm arXiv preprint arXiv:1809.10341, 2018 | 1933 | 2018 |
Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity W Fedus, B Zoph, N Shazeer The Journal of Machine Learning Research 23 (1), 5232-5270, 2022 | 1012 | 2022 |
Emergent abilities of large language models J Wei, Y Tay, R Bommasani, C Raffel, B Zoph, S Borgeaud, D Yogatama, ... arXiv preprint arXiv:2206.07682, 2022 | 954 | 2022 |
Scaling instruction-finetuned language models HW Chung, L Hou, S Longpre, B Zoph, Y Tay, W Fedus, Y Li, X Wang, ... arXiv preprint arXiv:2210.11416, 2022 | 789 | 2022 |
MaskGAN: Better Text Generation via Filling in the ______ W Fedus, I Goodfellow, AM Dai International Conference on Learning Representations (ICLR 2018), 2018 | 565 | 2018 |
In silico labeling: Predicting fluorescent labels in unlabeled images SF Eric Christiansen, Samuel J. Yang, D. Michael Ando, Ashkan Javaherian ... Cell, 2018 | 545 | 2018 |
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... arXiv preprint arXiv:2206.04615, 2022 | 444 | 2022 |
Revisiting ResNets: Improved training and scaling strategies I Bello, W Fedus, X Du, ED Cubuk, A Srinivas, TY Lin, J Shlens, B Zoph Advances in Neural Information Processing Systems 34, 22614-22627, 2021 | 260 | 2021 |
Many Paths to Equilibrium: GANs Do Not Need to Decrease a Divergence At Every Step W Fedus, M Rosca, B Lakshminarayanan, AM Dai, S Mohamed, ... International Conference on Learning Representations (ICLR 2018), 2017 | 245 | 2017 |
The case for a directional dark matter detector and the status of current experimental efforts S Ahlen, N Afshordi, JBR Battat, J Billard, N Bozorgnia, S Burgos, ... International Journal of Modern Physics A 25 (01), 1-51, 2010 | 238 | 2010 |
GLaM: Efficient scaling of language models with mixture-of-experts N Du, Y Huang, AM Dai, S Tong, D Lepikhin, Y Xu, M Krikun, Y Zhou, ... International Conference on Machine Learning, 5547-5569, 2022 | 230 | 2022 |
Revisiting fundamentals of experience replay W Fedus, P Ramachandran, R Agarwal, Y Bengio, H Larochelle, ... International Conference on Machine Learning, 3061-3071, 2020 | 217 | 2020 |
Language GANs Falling Short M Caccia, L Caccia, W Fedus, H Larochelle, J Pineau, L Charlin International Conference on Learning Representations (ICLR 2020), 2018 | 212 | 2018 |
ChatGPT: Optimizing language models for dialogue J Schulman, B Zoph, C Kim, J Hilton, J Menick, J Weng, JFC Uribe, ... OpenAI blog, 2022 | 200 | 2022 |
First dark matter search results from a surface run of the 10-L DMTPC directional dark matter detector S Ahlen, JBR Battat, T Caldwell, C Deaconu, D Dujmic, W Fedus, P Fisher, ... Physics Letters B 695 (1-4), 124-129, 2011 | 107 | 2011 |
Hyperbolic discounting and learning over multiple horizons W Fedus, C Gelada, Y Bengio, MG Bellemare, H Larochelle Reinforcement Learning and Decision Making (RLDM 2019), 2019 | 98 | 2019 |
Do transformer modifications transfer across implementations and applications? S Narang, HW Chung, Y Tay, W Fedus, T Fevry, M Matena, K Malkan, ... arXiv preprint arXiv:2102.11972, 2021 | 94* | 2021 |
On bonus-based exploration methods in the arcade learning environment AA Taiga, W Fedus, MC Machado, A Courville, MG Bellemare arXiv preprint arXiv:2109.11052, 2021 | 85* | 2021 |
Deep Graph Infomax. P Veličković, W Fedus, WL Hamilton, P Liò, Y Bengio, RD Hjelm ICLR (Poster) 2 (3), 4, 2019 | 74* | 2019 |