BLOOM: A 176B-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | 1186 | 2023 |
Quality at a glance: An audit of web-crawled multilingual datasets J Kreutzer, I Caswell, L Wang, A Wahab, D van Esch, N Ulzii-Orshikh, ... Transactions of the Association for Computational Linguistics 10, 50-72, 2022 | 89 | 2022 |
A few thousand translations go a long way! Leveraging pre-trained models for African news translation DI Adelani, JO Alabi, A Fan, J Kreutzer, X Shen, M Reid, D Ruiter, ... arXiv preprint arXiv:2205.02022, 2022 | 32 | 2022 |
Quality at a glance: An audit of web-crawled multilingual datasets I Caswell, J Kreutzer, L Wang, A Wahab, D van Esch, N Ulzii-Orshikh, ... arXiv e-prints, arXiv: 2103.12028, 2021 | 32 | 2021 |
BLOOM: A 176B-parameter open-access multilingual language model. CoRR, abs/2211.05100, 2022. doi: 10.48550 T Le Scao, A Fan, C Akiki, E Pavlick, S Ilic, D Hesslow, R Castagné, ... arXiv preprint arXiv:2211.05100, 2022 | 20 | |
Bloom Library: Multimodal datasets in 300+ languages for a variety of downstream tasks C Leong, J Nemecek, J Mansdorfer, A Filighera, A Owodunni, ... arXiv preprint arXiv:2210.14712, 2022 | 12 | 2022 |
Documenting geographically and contextually diverse data sources: The BigScience catalogue of language data and resources A McMillan-Major, Z Alyafeai, S Biderman, K Chen, F De Toni, G Dupont, ... arXiv preprint arXiv:2201.10066, 2022 | 12 | 2022 |
Guyo Jarso, Oreen Yousuf, Andre Niyongabo Rubungo, Gilles Hacheme, Eric Peter Wairagala, Muhammad Umair Nasir D Adelani, J Alabi, A Fan, J Kreutzer, X Shen, M Reid, D Ruiter, D Klakow, ... | 11 | 2022 |
BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus J Meyer, DI Adelani, E Casanova, A Öktem, DWJ Weber, S Kabongo, ... arXiv preprint arXiv:2207.03546, 2022 | 9 | 2022 |
Phone-ing it in: Towards flexible multi-modal language model training by phonetic representations of data C Leong, D Whitenack Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022 | 3 | 2022 |
Adapting to the Low-Resource Double-Bind: Investigating Low-Compute Methods on Low-Resource African Languages C Leong, H Shandilya, BFP Dossou, AL Tonja, J Mathew, AH Omotayo, ... arXiv preprint arXiv:2303.16985, 2023 | 2 | 2023 |
JWSign: A Highly Multilingual Corpus of Bible Translations for more Diversity in Sign Language Processing S Gueuwou, S Siake, C Leong, M Müller arXiv preprint arXiv:2311.10174, 2023 | 1 | 2023 |
The eBible Corpus: Data and Model Benchmarks for Bible Translation for Low-Resource Languages V Akerman, D Baines, D Daspit, U Hermjakob, T Jang, C Leong, M Martin, ... arXiv preprint arXiv:2304.09919, 2023 | 1 | 2023 |
Characterization of CNN classifier performance with respect to variation in optical contrast, using synthetic electro-optical data C Menart, C Leong, O Mendoza-Schrock, E Zelnio Automatic Target Recognition XXIX 10988, 143-153, 2019 | 1 | 2019 |
Enhancing Multi-Domain Automatic Short Answer Grading through an Explainable Neuro-Symbolic Pipeline F Künnecke, A Filighera, C Leong, T Steuer arXiv preprint arXiv:2403.01811, 2024 | | 2024 |
A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation D Ifeoluwa Adelani, J Oluwadara Alabi, A Fan, J Kreutzer, X Shen, M Reid, ... arXiv e-prints, arXiv: 2205.02022, 2022 | | 2022 |