Zixiang Chen
Graduate Student in Computer Science, UCLA
Verified email at cs.ucla.edu

Title / Cited by / Year

How much over-parameterization is sufficient to learn deep ReLU networks?
Z Chen, Y Cao, D Zou, Q Gu
arXiv preprint arXiv:1911.12360, 2019
Cited by 126 (2019)

A generalized neural tangent kernel analysis for two-layer neural networks
Z Chen, Y Cao, Q Gu, T Zhang
Advances in Neural Information Processing Systems 33, 13363-13373, 2020
Cited by 78* (2020)

Benign overfitting in two-layer convolutional neural networks
Y Cao, Z Chen, M Belkin, Q Gu
Advances in Neural Information Processing Systems 35, 25237-25250, 2022
Cited by 68 (2022)

Towards understanding the mixture-of-experts layer in deep learning
Z Chen, Y Deng, Y Wu, Q Gu, Y Li
Advances in Neural Information Processing Systems 35, 23049-23062, 2022
Cited by 53* (2022)

Almost optimal algorithms for two-player zero-sum linear mixture Markov games
Z Chen, D Zhou, Q Gu
International Conference on Algorithmic Learning Theory, 227-261, 2022
Cited by 51* (2022)

Stein neural sampler
T Hu, Z Chen, H Sun, J Bai, M Ye, G Cheng
arXiv preprint arXiv:1810.03545, 2018
Cited by 43 (2018)

A general framework for sample-efficient function approximation in reinforcement learning
Z Chen, CJ Li, A Yuan, Q Gu, MI Jordan
arXiv preprint arXiv:2209.15634, 2022
Cited by 28 (2022)

Self-play fine-tuning converts weak language models to strong language models
Z Chen, Y Deng, H Yuan, K Ji, Q Gu
arXiv preprint arXiv:2401.01335, 2024
Cited by 24 (2024)

Self-training converts weak learners to strong learners in mixture models
S Frei, D Zou, Z Chen, Q Gu
International Conference on Artificial Intelligence and Statistics, 8003-8021, 2022
Cited by 16 (2022)

Benign overfitting in two-layer ReLU convolutional neural networks
Y Kou, Z Chen, Y Chen, Q Gu
International Conference on Machine Learning, 17615-17659, 2023
Cited by 14* (2023)

Rephrase and respond: Let large language models ask better questions for themselves
Y Deng, W Zhang, Z Chen, Q Gu
arXiv preprint arXiv:2311.04205, 2023
Cited by 13 (2023)

How many pretraining tasks are needed for in-context learning of linear regression?
J Wu, D Zou, Z Chen, V Braverman, Q Gu, PL Bartlett
arXiv preprint arXiv:2310.08391, 2023
Cited by 11 (2023)

Learning high-dimensional single-neuron ReLU networks with finite samples
J Wu, D Zou, Z Chen, V Braverman, Q Gu, SM Kakade
arXiv preprint arXiv:2303.02255, 2023
Cited by 5* (2023)

Faster perturbed stochastic gradient methods for finding local minima
Z Chen, D Zhou, Q Gu
International Conference on Algorithmic Learning Theory, 176-204, 2022
Cited by 4 (2022)

Why does sharpness-aware minimization generalize better than SGD?
Z Chen, J Zhang, Y Kou, X Chen, CJ Hsieh, Q Gu
Advances in Neural Information Processing Systems 36, 2024
Cited by 3 (2024)

Implicit bias of gradient descent for two-layer ReLU and leaky ReLU networks on nearly-orthogonal data
Y Kou, Z Chen, Q Gu
Advances in Neural Information Processing Systems 36, 2024
Cited by 2 (2024)

How does semi-supervised learning with pseudo-labelers work? A case study
Y Kou, Z Chen, Y Cao, Q Gu
The Eleventh International Conference on Learning Representations, 2023
Cited by 2 (2023)

Understanding train-validation split in meta-learning with neural networks
X Zuo, Z Chen, H Yao, Y Cao, Q Gu
The Eleventh International Conference on Learning Representations, 2023
Cited by 2 (2023)

Fast sampling via de-randomization for discrete diffusion models
Z Chen, H Yuan, Y Li, Y Kou, J Zhang, Q Gu
arXiv preprint arXiv:2312.09193, 2023
Cited by 1 (2023)

Self-play fine-tuning of diffusion models for text-to-image generation
H Yuan, Z Chen, K Ji, Q Gu
arXiv preprint arXiv:2402.10210, 2024
Cited by 0 (2024)