Scaling language models: Methods, analysis & insights from training Gopher JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ... arXiv preprint arXiv:2112.11446, 2021 | 1058 | 2021 |
Red teaming language models with language models E Perez, S Huang, F Song, T Cai, R Ring, J Aslanides, A Glaese, ... arXiv preprint arXiv:2202.03286, 2022 | 518 | 2022 |
Improving alignment of dialogue agents via targeted human judgements A Glaese, N McAleese, M Trębacz, J Aslanides, V Firoiu, T Ewalds, ... arXiv preprint arXiv:2209.14375, 2022 | 443 | 2022 |
Teaching language models to support answers with verified quotes J Menick, M Trebacz, V Mikulik, J Aslanides, F Song, M Chadwick, ... arXiv preprint arXiv:2203.11147, 2022 | 206 | 2022 |
Fine-tuning language models to find agreement among humans with diverse preferences M Bakker, M Chadwick, H Sheahan, M Tessler, L Campbell-Gillingham, ... Advances in Neural Information Processing Systems 35, 38176-38189, 2022 | 199 | 2022 |
Open-ended learning leads to generally capable agents OEL Team, A Stooke, A Mahajan, C Barros, C Deck, J Bauer, J Sygnowski, ... arXiv preprint arXiv:2107.12808, 2021 | 165 | 2021 |
Cyprien de Masson d’Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew J Johnson, Blake A. Hechtman, Laura Weidinger, Iason Gabriel, William S. Isaac … JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, HF Song, J Aslanides, ... 2021 | 69 | 2021 |
Scaling Language Models: Methods, Analysis & Insights from Training Gopher JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ... arXiv, 2021 | 31 | 2021 |
LLM critics help catch LLM bugs N McAleese, RM Pokorny, JFC Uribe, E Nitishinskaya, M Trebacz, J Leike arXiv preprint arXiv:2407.00215, 2024 | 29 | 2024 |
Scaling language models: Methods, analysis & insights from training Gopher. arXiv 2021 JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ... arXiv preprint arXiv:2112.11446, 2021 | 17 | 2021 |
Fine-tuning language models via epistemic neural networks I Osband, SM Asghari, B Van Roy, N McAleese, J Aslanides, G Irving arXiv preprint arXiv:2211.01568, 2022 | 9 | 2022 |
Prover-verifier games improve legibility of LLM outputs JH Kirchner, Y Chen, H Edwards, J Leike, N McAleese, Y Burda arXiv preprint arXiv:2407.13692, 2024 | 3 | 2024 |
Prover-Verifier Games improve legibility of LLM outputs J Hendrik Kirchner, Y Chen, H Edwards, J Leike, N McAleese, Y Burda arXiv e-prints, arXiv:2407.13692, 2024 | | 2024 |