The Llama 3 herd of models A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ... arXiv preprint arXiv:2407.21783, 2024 | 1312 | 2024 |
Transformer-based acoustic modeling for hybrid speech recognition Y Wang, A Mohamed, D Le, C Liu, A Xiao, J Mahadeokar, H Huang, ... ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 270 | 2020 |
Voicebox: Text-guided multilingual universal speech generation at scale M Le, A Vyas, B Shi, B Karrer, L Sari, R Moritz, M Williamson, V Manohar, ... Advances in neural information processing systems 36, 2024 | 211 | 2024 |
Torchaudio: Building blocks for audio and speech processing YY Yang, M Hira, Z Ni, A Astafurov, C Chen, C Puhrsch, D Pollack, ... ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 207 | 2022 |
Transformer-transducer: End-to-end speech recognition with self-attention CF Yeh, J Mahadeokar, K Kalgaonkar, Y Wang, D Le, M Jain, K Schubert, ... arXiv preprint arXiv:1910.12977, 2019 | 176 | 2019 |
Contextual RNN-T for open domain ASR M Jain, G Keren, J Mahadeokar, G Zweig, F Metze, Y Saraf arXiv preprint arXiv:2006.03411, 2020 | 99 | 2020 |
Prompting large language models with speech recognition abilities Y Fathullah, C Wu, E Lakomkin, J Jia, Y Shangguan, K Li, J Guo, W Xiong, ... ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 95 | 2024 |
Contextualized streaming end-to-end speech recognition with trie-based deep biasing and shallow fusion D Le, M Jain, G Keren, S Kim, Y Shi, J Mahadeokar, J Chan, ... arXiv preprint arXiv:2104.02194, 2021 | 84 | 2021 |
Deep shallow fusion for RNN-T personalization D Le, G Keren, J Chan, J Mahadeokar, C Fuegen, ML Seltzer 2021 IEEE Spoken Language Technology Workshop (SLT), 251-257, 2021 | 82 | 2021 |
The Llama 3 herd of models, 2024 A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ... URL https://arxiv.org/abs/2407.21783 | 76 | |
Alignment restricted streaming recurrent neural network transducer J Mahadeokar, Y Shangguan, D Le, G Keren, H Su, T Le, CF Yeh, ... 2021 IEEE Spoken Language Technology Workshop (SLT), 52-59, 2021 | 70 | 2021 |
RNN-T for latency controlled ASR with improved beam search M Jain, K Schubert, J Mahadeokar, CF Yeh, K Kalgaonkar, A Sriram, ... arXiv preprint arXiv:1911.01629, 2019 | 44 | 2019 |
Improved neural language model fusion for streaming recurrent neural network transducer S Kim, Y Shangguan, J Mahadeokar, A Bruguier, C Fuegen, ML Seltzer, ... ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 29 | 2021 |
Dissecting user-perceived latency of on-device E2E speech recognition Y Shangguan, R Prabhavalkar, H Su, J Mahadeokar, Y Shi, J Zhou, C Wu, ... arXiv preprint arXiv:2104.02207, 2021 | 26 | 2021 |
Computerized system and method for automatically identifying and providing digital content based on physical geographic location data V Mahadevan, SS Farfade, JK Mahadeokar, A Arasu, VKR Barakam, ... US Patent 11,194,856, 2021 | 19 | 2021 |
AudioChatLlama: Towards General-Purpose Speech Abilities for LLMs Y Fathullah, C Wu, E Lakomkin, K Li, J Jia, Y Shangguan, J Mahadeokar, ... Proceedings of the 2024 Conference of the North American Chapter of the …, 2024 | 18 | 2024 |
Dynamic encoder transducer: A flexible solution for trading off accuracy for latency Y Shi, V Nagaraja, C Wu, J Mahadeokar, D Le, R Prabhavalkar, A Xiao, ... arXiv preprint arXiv:2104.02176, 2021 | 16 | 2021 |
Streaming transformer transducer based speech recognition using non-causal convolution Y Shi, C Wu, D Wang, A Xiao, J Mahadeokar, X Zhang, C Liu, K Li, ... ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 15 | 2022 |
Federated domain adaptation for ASR with full self-supervision J Jia, J Mahadeokar, W Zheng, Y Shangguan, O Kalinli, F Seide arXiv preprint arXiv:2203.15966, 2022 | 13 | 2022 |
Streaming parallel transducer beam search with fast-slow cascaded encoders J Mahadeokar, Y Shi, K Li, D Le, J Zhu, V Chandra, O Kalinli, ML Seltzer arXiv preprint arXiv:2203.15773, 2022 | 13 | 2022 |