PaLM: Scaling language modeling with Pathways A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ... Journal of Machine Learning Research 24 (240), 1-113, 2023 | 4654 | 2023 |
Towards a human-like open-domain chatbot D Adiwardana, MT Luong, DR So, J Hall, N Fiedel, R Thoppilan, Z Yang, ... arXiv preprint arXiv:2001.09977, 2020 | 1129 | 2020 |
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... arXiv preprint arXiv:2206.04615, 2022 | 1036 | 2022 |
TFX: A TensorFlow-based production-scale machine learning platform D Baylor, E Breck, HT Cheng, N Fiedel, CY Foo, Z Haque, S Haykal, ... Proceedings of the 23rd ACM SIGKDD international conference on knowledge …, 2017 | 475 | 2017 |
Gemma: Open models based on Gemini research and technology G Team, T Mesnard, C Hardin, R Dadashi, S Bhupatiraju, S Pathak, ... arXiv preprint arXiv:2403.08295, 2024 | 471 | 2024 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ... arXiv preprint arXiv:2403.05530, 2024 | 411 | 2024 |
TensorFlow-Serving: Flexible, high-performance ML serving C Olston, N Fiedel, K Gorovoy, J Harmsen, L Lao, F Li, V Rajashekhar, ... arXiv preprint arXiv:1712.06139, 2017 | 347 | 2017 |
WT5?! Training text-to-text models to explain their predictions S Narang, C Raffel, K Lee, A Roberts, N Fiedel, K Malkan arXiv preprint arXiv:2004.14546, 2020 | 193 | 2020 |
TALM: Tool augmented language models A Parisi, Y Zhao, N Fiedel arXiv preprint arXiv:2205.12255, 2022 | 152 | 2022 |
Scaling up models and data with T5X and SeqIO A Roberts, HW Chung, G Mishra, A Levskaya, J Bradbury, D Andor, ... Journal of Machine Learning Research 24 (377), 1-8, 2023 | 144 | 2023 |
Do transformer modifications transfer across implementations and applications? S Narang, HW Chung, Y Tay, W Fedus, T Fevry, M Matena, K Malkan, ... arXiv preprint arXiv:2102.11972, 2021 | 126* | 2021 |
Storage and distribution of content for a user device group V Mallet, J Cheng, N Fiedel, EW Gillum, G Ramanarayanan, NJ Woods US Patent 8,438,233, 2013 | 98 | 2013 |
Sharing content among a group of devices V Mallet, J Cheng, N Fiedel, EW Gillum, G Ramanarayanan, NJ Woods US Patent 8,386,619, 2013 | 84 | 2013 |
User device group formation V Mallet, J Cheng, N Fiedel, EW Gillum, G Ramanarayanan, NJ Woods US Patent 8,539,086, 2013 | 79 | 2013 |
Levels of AGI: Operationalizing Progress on the Path to AGI MR Morris, J Sohl-Dickstein, N Fiedel, T Warkentin, A Dafoe, A Faust, ... arXiv preprint arXiv:2311.02462, 2023 | 75 | 2023 |
Gemma 2: Improving open language models at a practical size G Team, M Riviere, S Pathak, PG Sessa, C Hardin, S Bhupatiraju, ... arXiv preprint arXiv:2408.00118, 2024 | 68 | 2024 |
Pushing tuning parameters for logical group scoring V Mallet, J Cheng, N Fiedel, EW Gillum, G Ramanarayanan, NJ Woods US Patent 8,892,653, 2014 | 68 | 2014 |
Understanding HTML with large language models I Gur, O Nachum, Y Miao, M Safdari, A Huang, A Chowdhery, S Narang, ... arXiv preprint arXiv:2210.03945, 2022 | 67 | 2022 |
Sharing content among multiple devices V Mallet, J Cheng, N Fiedel, EW Gillum, G Ramanarayanan, NJ Woods US Patent 8,392,526, 2013 | 59 | 2013 |
Beyond human data: Scaling self-training for problem-solving with language models A Singh, JD Co-Reyes, R Agarwal, A Anand, P Patil, PJ Liu, J Harrison, ... arXiv preprint arXiv:2312.06585, 2023 | 47 | 2023 |