Exploring the limits of transfer learning with a unified text-to-text transformer C Raffel, N Shazeer, A Roberts, K Lee, S Narang, M Matena, Y Zhou, W Li, ... Journal of Machine Learning Research 21 (140), 1-67, 2020 | 18394 | 2020 |
Merging models with Fisher-weighted averaging MS Matena, CA Raffel Advances in Neural Information Processing Systems 35, 17703-17716, 2022 | 219 | 2022 |
Do transformer modifications transfer across implementations and applications? S Narang, HW Chung, Y Tay, W Fedus, T Fevry, M Matena, K Malkan, ... arXiv preprint arXiv:2102.11972, 2021 | 96 | 2021 |
Exploring the limits of transfer learning with a unified text-to-text transformer C Raffel, N Shazeer, A Roberts, K Lee, S Narang, M Matena, Y Zhou, W Li, ... arXiv preprint arXiv:1910.10683, 2019 | 87 | 2019 |
T5: Exploring the limits of transfer learning with a unified text-to-text transformer C Raffel, N Shazeer, A Roberts, K Lee, S Narang, M Matena, Y Zhou, W Li, ... Journal of Machine Learning Research 21, 1-67, 2020 | 54 | 2020 |
Exploring the limits of transfer learning with a unified text-to-text transformer A Roberts, C Raffel, K Lee, M Matena, N Shazeer, PJ Liu, S Narang, W Li, ... Google, Tech. Rep., 2019 | 51 | 2019 |
A combinatorial perspective on the optimization of shallow ReLU networks MS Matena, CA Raffel Advances in Neural Information Processing Systems 35, 22187-22198, 2022 | 2 | 2022 |
NPEFF: Non-Negative Per-Example Fisher Factorization M Matena, C Raffel arXiv preprint arXiv:2310.04649, 2023 | | 2023 |