Yanping Huang
Title
Cited by
Year
Regularized evolution for image classifier architecture search
E Real, A Aggarwal, Y Huang, QV Le
Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 4780-4789, 2019
Cited by: 3281; Year: 2019
Scaling instruction-finetuned language models
HW Chung, L Hou, S Longpre, B Zoph, Y Tay, W Fedus, Y Li, X Wang, ...
Journal of Machine Learning Research 25 (70), 1-53, 2024
Cited by: 1785; Year: 2024
GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
Y Huang, Y Cheng, A Bapna, O Firat, MX Chen, D Chen, HJ Lee, J Ngiam, ...
Advances in Neural Information Processing Systems 32, 103-112, 2019
Cited by: 1514; Year: 2019
LaMDA: Language models for dialog applications
R Thoppilan, D De Freitas, J Hall, N Shazeer, A Kulshreshtha, HT Cheng, ...
arXiv preprint arXiv:2201.08239, 2022
Cited by: 1283; Year: 2022
PaLM 2 technical report
R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ...
arXiv preprint arXiv:2305.10403, 2023
Cited by: 943; Year: 2023
Gemini: a family of highly capable multimodal models
G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ...
arXiv preprint arXiv:2312.11805, 2023
Cited by: 774; Year: 2023
GShard: Scaling giant models with conditional computation and automatic sharding
D Lepikhin, HJ Lee, Y Xu, D Chen, O Firat, Y Huang, M Krikun, N Shazeer, ...
International Conference on Learning Representations (ICLR), 2020
Cited by: 750; Year: 2020
Predictive coding
Y Huang, RPN Rao
Wiley Interdisciplinary Reviews: Cognitive Science 2 (5), 580-593, 2011
Cited by: 688; Year: 2011
GLaM: Efficient scaling of language models with mixture-of-experts
N Du, Y Huang, AM Dai, S Tong, D Lepikhin, Y Xu, M Krikun, Y Zhou, ...
International Conference on Machine Learning, 5547-5569, 2022
Cited by: 496*; Year: 2022
Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning
L Zheng, Z Li, H Zhang, Y Zhuang, Z Chen, Y Huang, Y Wang, Y Xu, ...
16th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2022
Cited by: 205; Year: 2022
Lingvo: a modular and scalable framework for sequence-to-sequence modeling
J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ...
arXiv preprint arXiv:1902.08295, 2019
Cited by: 201; Year: 2019
Scaling instruction-finetuned language models
HW Chung, L Hou, S Longpre, B Zoph, Y Tay, W Fedus, E Li, X Wang, ...
arXiv preprint arXiv:2210.11416, 2022
Cited by: 180*; Year: 2022
Just pick a sign: Optimizing deep multitask models with gradient sign dropout
Z Chen, J Ngiam, Y Huang, T Luong, H Kretzschmar, Y Chai, D Anguelov
Advances in Neural Information Processing Systems 33, 2039-2050, 2020
Cited by: 166; Year: 2020
BigSSL: Exploring the frontier of large-scale semi-supervised learning for automatic speech recognition
Y Zhang, DS Park, W Han, J Qin, A Gulati, J Shor, A Jansen, Y Xu, ...
IEEE Journal of Selected Topics in Signal Processing 16 (6), 1519-1532, 2022
Cited by: 153; Year: 2022
Mixture-of-experts with expert choice routing
Y Zhou, T Lei, H Liu, N Du, Y Huang, V Zhao, AM Dai, QV Le, J Laudon
Advances in Neural Information Processing Systems 35, 7103-7114, 2022
Cited by: 140; Year: 2022
GSPMD: general and scalable parallelization for ML computation graphs
Y Xu, HJ Lee, D Chen, B Hechtman, Y Huang, R Joshi, M Krikun, ...
arXiv preprint arXiv:2105.04663, 2021
Cited by: 90; Year: 2021
Beyond distillation: Task-level mixture-of-experts for efficient inference
S Kudugunta, Y Huang, A Bapna, M Krikun, D Lepikhin, MT Luong, O Firat
arXiv preprint arXiv:2110.03742, 2021
Cited by: 78; Year: 2021
Designing effective sparse expert models
B Zoph, I Bello, S Kumar, N Du, Y Huang, J Dean, N Shazeer, W Fedus
arXiv preprint arXiv:2202.08906, 2022
Cited by: 74; Year: 2022
AlpaServe: Statistical multiplexing with model parallelism for deep learning serving
Z Li, L Zheng, Y Zhong, V Liu, Y Sheng, X Jin, Y Huang, Z Chen, H Zhang, ...
17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023
Cited by: 64; Year: 2023
ST-MoE: Designing stable and transferable sparse expert models
B Zoph, I Bello, S Kumar, N Du, Y Huang, J Dean, N Shazeer, W Fedus
arXiv preprint arXiv:2202.08906, 2022
Cited by: 63; Year: 2022