Title | Authors | Venue | Cited by | Year |
--- | --- | --- | --- | --- |
Deep double descent: Where bigger models and more data hurt | P Nakkiran, G Kaplun, Y Bansal, T Yang, B Barak, I Sutskever | Journal of Statistical Mechanics: Theory and Experiment 2021 (12), 124003, 2021 | 941 | 2021 |
On the information bottleneck theory of deep learning | AM Saxe, Y Bansal, J Dapello, M Advani, A Kolchinsky, BD Tracey, ... | Journal of Statistical Mechanics: Theory and Experiment 2019 (12), 124020, 2019 | 579 | 2019 |
Gemini: a family of highly capable multimodal models | G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... | arXiv preprint arXiv:2312.11805, 2023 | 568 | 2023 |
Revisiting model stitching to compare neural representations | Y Bansal, P Nakkiran, B Barak | Advances in Neural Information Processing Systems 34, 225-236, 2021 | 75 | 2021 |
The unreasonable effectiveness of few-shot learning for machine translation | X Garcia, Y Bansal, C Cherry, G Foster, M Krikun, M Johnson, O Firat | International Conference on Machine Learning, 10867-10878, 2023 | 48 | 2023 |
Distributional Generalization: A New Kind of Generalization | P Nakkiran, Y Bansal | arXiv preprint arXiv:2009.08092, 2020 | 34 | 2020 |
Data Scaling Laws in NMT: The Effect of Noise and Architecture | Y Bansal, B Ghorbani, A Garg, B Zhang, C Cherry, B Neyshabur, O Firat | International Conference on Machine Learning, 1466-1482, 2022 | 32 | 2022 |
For self-supervised learning, Rationality implies generalization, provably | Y Bansal, G Kaplun, B Barak | arXiv preprint arXiv:2010.08508, 2020 | 28 | 2020 |
Minnorm training: an algorithm for training over-parameterized deep neural networks | Y Bansal, M Advani, DD Cox, AM Saxe | arXiv preprint arXiv:1806.00730, 2018 | 24* | 2018 |
Limitations of the NTK for Understanding Generalization in Deep Learning | N Vyas, Y Bansal, P Nakkiran | arXiv preprint arXiv:2206.10012, 2022 | 21 | 2022 |
Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modelling | A Srivastava, Y Bansal, Y Ding, C Hurwitz, K Xu, B Egger, P Sattigeri, ... | arXiv preprint arXiv:2010.13187, 2020 | 4 | 2020 |
On Privileged and Convergent Bases in Neural Network Representations | D Brown, N Vyas, Y Bansal | arXiv preprint arXiv:2307.12941, 2023 | 2 | 2023 |
Empirical Limitations of the NTK for Understanding Scaling Laws in Deep Learning | N Vyas, Y Bansal, P Nakkiran | Transactions on Machine Learning Research, 2023 | 2 | 2023 |
Building the Theoretical Foundations of Deep Learning: An Empirical Approach | Y Bansal | Harvard University, 2022 | | 2022 |