Xianzhi Du
Research Scientist, Apple AI/ML
Verified email at apple.com
Title
Cited by
Year
Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection
X Du, M El-Khamy, J Lee, L Davis
2017 IEEE winter conference on applications of computer vision (WACV), 953-961, 2017
Cited by: 360; Year: 2017
Revisiting ResNets: Improved training and scaling strategies
I Bello, W Fedus, X Du, ED Cubuk, A Srinivas, TY Lin, J Shlens, B Zoph
Advances in Neural Information Processing Systems 34, 22614-22627, 2021
Cited by: 346; Year: 2021
SpineNet: Learning scale-permuted backbone for recognition and localization
X Du, TY Lin, P Jin, G Ghiasi, M Tan, Y Cui, QV Le, X Song
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020
Cited by: 247; Year: 2020
A Simple Single-Scale Vision Transformer for Object Detection and Instance Segmentation
W Chen, X Du, F Yang, L Beyer, X Zhai, TY Lin, H Chen, J Li, X Song, ...
Cited by: 221*; Year: 2022
Ferret: Refer and ground anything anywhere at any granularity
H You, H Zhang, Z Gan, X Du, B Zhang, Z Wang, L Cao, SF Chang, ...
arXiv preprint arXiv:2310.07704, 2023
Cited by: 145; Year: 2023
MM1: Methods, analysis & insights from multimodal LLM pre-training
B McKinzie, Z Gan, JP Fauconnier, S Dodge, B Zhang, P Dufter, D Shah, ...
arXiv preprint arXiv:2403.09611, 2024
Cited by: 100; Year: 2024
Cyber-physical system enabled nearby traffic flow modelling for autonomous vehicles
B Chen, Z Yang, S Huang, X Du, Z Cui, J Bhimani, X Xie, N Mi
2017 IEEE 36th international performance computing and communications …, 2017
Cited by: 99; Year: 2017
TensorFlow model garden
H Yu, C Chen, X Du, Y Li, A Rashwan, L Hou, P Jin, F Yang, F Liu, J Kim, ...
Model Garden for TensorFlow, 2020
Cited by: 98; Year: 2020
System and method for deep network fusion for fast and robust object detection
M El-Khamy, X Du, J Lee
US Patent 10,657,364, 2020
Cited by: 74; Year: 2020
AMNet: Deep atrous multiscale stereo disparity estimation networks
X Du, M El-Khamy, J Lee
arXiv preprint arXiv:1904.09099, 2019
Cited by: 56; Year: 2019
Fused deep neural networks for efficient pedestrian detection
X Du, M El-Khamy, VI Morariu, J Lee, L Davis
arXiv preprint arXiv:1805.08688, 2018
Cited by: 44; Year: 2018
Guiding instruction-based image editing via multimodal large language models
TJ Fu, W Hu, X Du, WY Wang, Y Yang, Z Gan
arXiv preprint arXiv:2309.17102, 2023
Cited by: 41; Year: 2023
Simple training strategies and model scaling for object detection
X Du, B Zoph, WC Hung, TY Lin
arXiv preprint arXiv:2107.00057, 2021
Cited by: 41; Year: 2021
Auto-scaling vision transformers without training
W Chen, W Huang, X Du, X Song, Z Wang, D Zhou
arXiv preprint arXiv:2202.11921, 2022
Cited by: 30; Year: 2022
Development of Dose‐Response Models to Predict the Relationship for Human Toxoplasma gondii Infection Associated with Meat Consumption
M Guo, A Mishra, RL Buchanan, JP Dubey, DE Hill, HR Gamble, JL Jones, ...
Risk Analysis 36 (5), 926-938, 2016
Cited by: 25; Year: 2016
Towards a unified foundation model: Jointly pre-training transformers on unpaired images and text
Q Li, B Gong, Y Cui, D Kondratyuk, X Du, MH Yang, M Brown
arXiv preprint arXiv:2112.07074, 2021
Cited by: 24; Year: 2021
AdaMV-MoE: Adaptive multi-task vision mixture-of-experts
T Chen, X Chen, X Du, A Rashwan, F Yang, H Chen, Z Wang, Y Li
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 22; Year: 2023
Revisiting 3D ResNets for video recognition
X Du, Y Li, Y Cui, R Qian, J Li, I Bello
arXiv preprint arXiv:2109.01696, 2021
Cited by: 22; Year: 2021
Provable stochastic optimization for global contrastive learning: Small batch does not harm performance
Z Yuan, Y Wu, ZH Qiu, X Du, L Zhang, D Zhou, T Yang
International Conference on Machine Learning, 25760-25782, 2022
Cited by: 21; Year: 2022
From scarcity to efficiency: Improving CLIP training via visual-enriched captions
Z Lai, H Zhang, W Wu, H Bai, A Timofeev, X Du, Z Gan, J Shan, ...
arXiv preprint arXiv:2310.07699, 2023
Cited by: 20; Year: 2023