Igor Gitman
Applied Scientist, NVIDIA
Verified email at nvidia.com
Title
Cited by
Year
Large batch training of convolutional networks
Y You, I Gitman, B Ginsburg
arXiv preprint arXiv:1708.03888, 2017
Cited by 1278*, 2017
Understanding the role of momentum in stochastic gradient methods
I Gitman, H Lang, P Zhang, L Xiao
Advances in Neural Information Processing Systems 32, 2019
Cited by 113, 2019
Comparison of batch normalization and weight normalization algorithms for the large-scale image classification
I Gitman, B Ginsburg
arXiv preprint arXiv:1709.08145, 2017
Cited by 78, 2017
Mixed-precision training for nlp and speech recognition with openseq2seq
O Kuchaiev, B Ginsburg, I Gitman, V Lavrukhin, J Li, H Nguyen, C Case, ...
arXiv preprint arXiv:1805.10387, 2018
Cited by 51, 2018
Openseq2seq: extensible toolkit for distributed and mixed precision training of sequence-to-sequence models
O Kuchaiev, B Ginsburg, I Gitman, V Lavrukhin, C Case, P Micikevicius
Proceedings of Workshop for NLP Open Source Software (NLP-OSS), 41-46, 2018
Cited by 43, 2018
Openmathinstruct-1: A 1.8 million math instruction tuning dataset
S Toshniwal, I Moshkov, S Narenthiran, D Gitman, F Jia, I Gitman
arXiv preprint arXiv:2402.10176, 2024
Cited by 42, 2024
Nemotron-4 340B Technical Report
B Adler, N Agarwal, A Aithal, DH Anh, P Bhattacharya, A Brundyn, ...
arXiv preprint arXiv:2406.11704, 2024
Cited by 37, 2024
Large batch training of convolutional networks with layer-wise adaptive rate scaling
B Ginsburg, I Gitman, Y You
Cited by 21, 2018
Novel prediction techniques based on clusterwise linear regression
I Gitman, J Chen, E Lei, A Dubrawski
arXiv preprint arXiv:1804.10742, 2018
Cited by 14, 2018
Scaling SGD batch size to 32k for imagenet training. CoRR abs/1708.03888 (2017)
Y You, I Gitman, B Ginsburg
arXiv preprint arXiv:1708.03888, 2017
Cited by 9, 2017
Confidence-based ensembles of end-to-end speech recognition models
I Gitman, V Lavrukhin, A Laptev, B Ginsburg
arXiv preprint arXiv:2306.15824, 2023
Cited by 5, 2023
Convergence analysis of gradient descent algorithms with proportional updates
I Gitman, D Dilipkumar, B Parr
arXiv preprint arXiv:1801.03137, 2018
Cited by 5, 2018
Openmathinstruct-2: Accelerating ai for math with massive open-source instruction data
S Toshniwal, W Du, I Moshkov, B Kisacanin, A Ayrapetyan, I Gitman
arXiv preprint arXiv:2410.01560, 2024
Cited by 2, 2024
Weighted finite state transducer frameworks for conversational ai systems and applications
A Laptev, V Bataev, I Gitman, B Ginsburg
US Patent App. 18/355,653, 2024
2024
Weighted finite state transducer frameworks for conversational ai systems and applications
A Laptev, V Bataev, I Gitman, B Ginsburg
US Patent App. 18/355,646, 2024
2024
Powerful and Extensible WFST Framework for Rnn-Transducer Losses
A Laptev, V Bataev, I Gitman, B Ginsburg
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
2023
Canonical Least Squares Clustering on Sparse Medical Data
I Gitman, J Chen, A Dubrawski