Konstantin Mishchenko
Verified email at samsung.com - Homepage
Title · Cited by · Year
Tighter Theory for Local SGD on Identical and Heterogeneous Data
A Khaled, K Mishchenko, P Richtárik
International Conference on Artificial Intelligence and Statistics, 4519-4529, 2020
Cited by 481 · 2020
Distributed learning with compressed gradient differences
K Mishchenko, E Gorbunov, M Takáč, P Richtárik
Optimization Methods and Software, 1-16, 2019
Cited by 227* · 2019
Stochastic distributed learning with gradient quantization and double-variance reduction
S Horváth, D Kovalev, K Mishchenko, P Richtárik, S Stich
Optimization Methods and Software, 1-16, 2022
Cited by 184 · 2022
First Analysis of Local GD on Heterogeneous Data
A Khaled, K Mishchenko, P Richtárik
NeurIPS FL Workshop, arXiv preprint arXiv:1909.04715, 2019
Cited by 179 · 2019
Random Reshuffling: Simple Analysis with Vast Improvements
K Mishchenko, A Khaled, P Richtárik
Advances in Neural Information Processing Systems 33, 17309-17320, 2020
Cited by 141 · 2020
ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!
K Mishchenko, G Malinovsky, S Stich, P Richtárik
International Conference on Machine Learning, 15750-15769, 2022
Cited by 133 · 2022
Adaptive gradient descent without descent
Y Malitsky, K Mishchenko
International Conference on Machine Learning, 6702-6712, 2020
Cited by 104 · 2020
SEGA: Variance Reduction via Gradient Sketching
F Hanzely, K Mishchenko, P Richtárik
Advances in Neural Information Processing Systems, 2082-2093, 2018
Cited by 86 · 2018
Revisiting stochastic extragradient
K Mishchenko, D Kovalev, E Shulgin, P Richtárik, Y Malitsky
International Conference on Artificial Intelligence and Statistics, 4573-4582, 2020
Cited by 85 · 2020
Learning-Rate-Free Learning by D-Adaptation
A Defazio, K Mishchenko
International Conference on Machine Learning, 2023
Cited by 63 · 2023
Asynchronous SGD Beats Minibatch SGD under Arbitrary Delays
K Mishchenko, F Bach, M Even, B Woodworth
Advances in Neural Information Processing Systems 35, 420-433, 2022
Cited by 51 · 2022
Stochastic Newton and cubic Newton methods with simple local linear-quadratic rates
D Kovalev, K Mishchenko, P Richtárik
NeurIPS Workshop Beyond First Order Methods in ML, arXiv preprint arXiv:1912 …, 2019
Cited by 49 · 2019
A Delay-tolerant Proximal-Gradient Algorithm for Distributed Learning
K Mishchenko, F Iutzeler, J Malick, MR Amini
International Conference on Machine Learning, 3584-3592, 2018
Cited by 48 · 2018
Regularized Newton Method with Global O(1/k²) Convergence
K Mishchenko
SIAM Journal on Optimization 33 (3), 1440-1462, 2023
Cited by 45 · 2023
Dualize, split, randomize: Toward fast nonsmooth optimization algorithms
A Salim, L Condat, K Mishchenko, P Richtárik
Journal of Optimization Theory and Applications 195 (1), 102-130, 2022
Cited by 39 · 2022
Proximal and Federated Random Reshuffling
K Mishchenko, A Khaled, P Richtárik
International Conference on Machine Learning, 15718-15749, 2022
Cited by 36 · 2022
99% of worker-master communication in distributed optimization is not needed
K Mishchenko, F Hanzely, P Richtárik
Conference on Uncertainty in Artificial Intelligence, 979-988, 2020
Cited by 34* · 2020
DAve-QN: A Distributed Averaged Quasi-Newton Method with Local Superlinear Convergence Rate
S Soori, K Mischenko, A Mokhtari, MM Dehnavi, M Gurbuzbalaban
AISTATS 2020, 2019
Cited by 30 · 2019
A distributed flexible delay-tolerant proximal gradient algorithm
K Mishchenko, F Iutzeler, J Malick
SIAM Journal on Optimization 30 (1), 933-959, 2020
Cited by 29 · 2020
IntSGD: Adaptive Floatless Compression of Stochastic Gradients
K Mishchenko, B Wang, D Kovalev, P Richtárik
ICLR 2022 - International Conference on Learning Representations, 2022
Cited by 26* · 2022