Nithin Rao Koluguri
Title
Cited by
Year
Titanet: Neural model for speaker representation with 1d depth-wise separable convolutions and global context
NR Koluguri, T Park, B Ginsburg
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 115; Year: 2022
Fast conformer with linearly scalable attention for efficient speech recognition
D Rekesh, NR Koluguri, S Kriman, S Majumdar, V Noroozi, H Huang, ...
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023
Cited by: 70; Year: 2023
SpeakerNet: 1D depth-wise separable convolutional network for text-independent speaker recognition and verification
NR Koluguri, J Li, V Lavrukhin, B Ginsburg
arXiv preprint arXiv:2010.12653, 2020
Cited by: 43; Year: 2020
Multi-scale speaker diarization with dynamic scale weighting
TJ Park, NR Koluguri, J Balam, B Ginsburg
arXiv preprint arXiv:2203.15974, 2022
Cited by: 29; Year: 2022
Comparison of Speech Tasks and Recording Devices for Voice Based Automatic Classification of Healthy Subjects and Patients with Amyotrophic Lateral Sclerosis.
BN Suhas, D Patel, NR Koluguri, Y Belur, P Reddy, A Nalini, R Yadav, ...
INTERSPEECH, 4564-4568, 2019
Cited by: 26; Year: 2019
Meta-learning for robust child-adult classification from speech
NR Koluguri, M Kumar, SH Kim, C Lord, S Narayanan
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 23; Year: 2020
Spectrogram enhancement using multiple window Savitzky-Golay (MWSG) filter for robust bird sound detection
NR Koluguri, GN Meenakshi, PK Ghosh
IEEE/ACM Transactions on Audio, Speech, and Language Processing 25 (6), 1183 …, 2017
Cited by: 20; Year: 2017
Enhancing speaker diarization with large language models: A contextual beam search approach
TJ Park, K Dhawan, N Koluguri, J Balam
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 15; Year: 2024
Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition
KC Puvvada, NR Koluguri, K Dhawan, J Balam, B Ginsburg
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 14; Year: 2024
Open automatic speech recognition leaderboard
V Srivastav, S Majumdar, N Koluguri, A Moumen, S Gandhi
Open automatic speech recognition leaderboard, 2023
Cited by: 11; Year: 2023
Property-aware multi-speaker data simulation: A probabilistic modelling technique for synthetic data generation
TJ Park, H Huang, C Hooper, N Koluguri, K Dhawan, A Jukic, J Balam, ...
arXiv preprint arXiv:2310.12371, 2023
Cited by: 7; Year: 2023
Less is more: Accurate speech recognition & translation without web-scale data
KC Puvvada, P Żelasko, H Huang, O Hrinchuk, NR Koluguri, K Dhawan, ...
arXiv preprint arXiv:2406.19674, 2024
Cited by: 6; Year: 2024
Ambernet: A compact end-to-end model for spoken language identification
F Jia, NR Koluguri, J Balam, B Ginsburg
CoRR, 2022
Cited by: 6*; Year: 2022
Spectral Codecs: Spectrogram-Based Audio Codecs for High Quality Speech Synthesis
R Langman, A Jukić, K Dhawan, NR Koluguri, B Ginsburg
arXiv preprint arXiv:2406.05298, 2024
Cited by: 5; Year: 2024
The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System
TJ Park, H Huang, A Jukic, K Dhawan, KC Puvvada, N Koluguri, N Karpov, ...
arXiv preprint arXiv:2310.12378, 2023
Cited by: 5; Year: 2023
Investigating End-to-End ASR Architectures for Long Form Audio Transcription
NR Koluguri, S Kriman, G Zelenfroind, S Majumdar, D Rekesh, V Noroozi, ...
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 3; Year: 2024
Sortformer: Seamless integration of speaker diarization and asr by bridging timestamps and tokens
T Park, I Medennikov, K Dhawan, W Wang, H Huang, NR Koluguri, ...
arXiv preprint arXiv:2409.06656, 2024
Cited by: 2; Year: 2024
Bestow: Efficient and streamable speech language model with the best of two worlds in gpt and t5
Z Chen, H Huang, O Hrinchuk, KC Puvvada, NR Koluguri, P Żelasko, ...
arXiv preprint arXiv:2406.19954, 2024
Cited by: 2; Year: 2024
NEST: Self-supervised Fast Conformer as All-purpose Seasoning to Speech Processing Tasks
H Huang, T Park, K Dhawan, I Medennikov, KC Puvvada, NR Koluguri, ...
arXiv preprint arXiv:2408.13106, 2024
Cited by: 1; Year: 2024
NeMo Open Source Speaker Diarization System.
T Park, NR Koluguri, F Jia, J Balam, B Ginsburg
INTERSPEECH, 853-854, 2022
Cited by: 1; Year: 2022