Beyond the imitation game: Quantifying and extrapolating the capabilities of language models A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... arXiv preprint arXiv:2206.04615, 2022 | 742 | 2022 |
Gemini: a family of highly capable multimodal models Gemini Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2023 | 503 | 2023 |
Fleurs: Few-shot learning evaluation of universal representations of speech A Conneau, M Ma, S Khanuja, Y Zhang, V Axelrod, S Dalmia, J Riesa, ... 2022 IEEE Spoken Language Technology Workshop (SLT), 798-805, 2023 | 108 | 2023 |
Marketing internacional F Bradley, H Calderón, CE Rivera Pearson Prentice Hall, 2006 | 103 | 2006 |
Quality at a glance: An audit of web-crawled multilingual datasets J Kreutzer, I Caswell, L Wang, A Wahab, D van Esch, N Ulzii-Orshikh, ... Transactions of the Association for Computational Linguistics 10, 50-72, 2022 | 89 | 2022 |
Open-source multi-speaker speech corpora for building Gujarati, Kannada, Malayalam, Marathi, Tamil and Telugu speech synthesis systems F He, SHC Chu, O Kjartansson, CE Rivera, A Katanova, A Gutkin, ... 12th Language Resources and Evaluation Conference (LREC 2020), 6494-6503, 2020 | 69 | 2020 |
Multimodal pretraining for dense video captioning G Huang, B Pang, Z Zhu, C Rivera, R Soricut arXiv preprint arXiv:2011.11760, 2020 | 67 | 2020 |
Open-source multi-speaker corpora of the English accents in the British Isles I Demirsahin, O Kjartansson, A Gutkin, C Rivera 12th Language Resources and Evaluation Conference (LREC 2020), 6532-6541 …, 2020 | 57 | 2020 |
Quality at a glance: An audit of web-crawled multilingual datasets I Caswell, J Kreutzer, L Wang, A Wahab, D van Esch, N Ulzii-Orshikh, ... arXiv e-prints, arXiv:2103.12028, 2021 | 32 | 2021 |
Open-source high quality speech datasets for Basque, Catalan and Galician O Kjartansson, A Gutkin, A Butryna, I Demirsahin, C Rivera Proceedings of the 1st Joint Workshop on Spoken Language Technologies for …, 2020 | 25 | 2020 |
Developing an open-source corpus of Yoruba speech A Gutkin, I Demirsahin, O Kjartansson, CE Rivera, K Túbòsún Proc. Interspeech 2020, Shanghai, China, 404-408, 2020 | 24 | 2020 |
Writing system and speaker metadata for 2,800+ language varieties D van Esch, T Lucassen, S Ruder, I Caswell, C Rivera Proceedings of the Thirteenth Language Resources and Evaluation Conference …, 2022 | 17 | 2022 |
XTREME-S: Evaluating Cross-lingual Speech Representations A Conneau, A Bapna, Y Zhang, M Ma, P von Platen, A Lozhkov, C Cherry, ... Interspeech 2022, 2022 | 13 | 2022 |
Google crowdsourced speech corpora and related open-source resources for low-resource languages and dialects: an overview A Butryna, SHC Chu, I Demirsahin, A Gutkin, L Ha, F He, M Jansche, ... 2019 UNESCO International Conference Language Technologies for All (LT4All …, 2020 | 9 | 2020 |
AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages O Ogundepo, TR Gwadabe, CE Rivera, JH Clark, S Ruder, DI Adelani, ... arXiv preprint arXiv:2305.06897, 2023 | 4 | 2023 |
MD3: The Multi-Dialect Dataset of Dialogues J Eisenstein, V Prabhakaran, C Rivera, D Demszky, D Sharma arXiv preprint arXiv:2305.11355, 2023 | 2 | 2023 |
TaTa: A Multilingual Table-to-Text Dataset for African Languages S Gehrmann, S Ruder, V Nikolaev, JA Botha, M Chavinda, A Parikh, ... arXiv preprint arXiv:2211.00142, 2022 | 2 | 2022 |
How an Interest in Mindfulness Influences Linguistic Markers in Online Microblogging Discourse CE Rivera, RJ Kaunhoven, GM Griffith Mindfulness 14 (4), 818-829, 2023 | 1 | 2023 |
XTREME-S: Evaluating Cross-lingual Speech Representations A Bapna, D van Esch, J Riesa, J Clark, M Johnson, M Kale, O Firat, ... Proc. Interspeech, 2022 | 1 | 2022 |
Cross-lingual Open-Retrieval Question Answering for African Languages O Ogundepo, T Gwadabe, C Rivera, JH Clark, S Ruder, D Adelani, ... Findings of the Association for Computational Linguistics: EMNLP 2023, 14957 …, 2023 | | 2023 |