Haotian Zhang
Research Scientist, Apple
Verified email at apple.com
Title
Cited by
Year
Grounded language-image pre-training
LH Li*, P Zhang*, H Zhang*, J Yang, C Li, Y Zhong, L Wang, L Yuan, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 1040, Year: 2022
GLIPv2: Unifying localization and vision-language understanding
H Zhang*, P Zhang*, X Hu, YC Chen, LH Li, X Dai, L Wang, L Yuan, ...
NeurIPS, 2022
Cited by: 290, Year: 2022
Exploit the connectivity: Multi-object tracking with TrackletNet
G Wang, Y Wang, H Zhang, R Gu, JN Hwang
Proceedings of the 27th ACM international conference on multimedia, 482-490, 2019
Cited by: 234, Year: 2019
Simple applications of BERT for ad hoc document retrieval
W Yang, H Zhang, J Lin
arXiv preprint arXiv:1903.10972, 2019
Cited by: 234, Year: 2019
TransMVSNet: Global context-aware multi-view stereo network with transformers
Y Ding, W Yuan, Q Zhu, H Zhang, X Liu, Y Wang, X Liu
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022
Cited by: 199, Year: 2022
Ferret: Refer and ground anything anywhere at any granularity
H You*, H Zhang*, Z Gan, X Du, B Zhang, Z Wang, L Cao, SF Chang, ...
ICLR, 2023
Cited by: 197, Year: 2023
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
B McKinzie*, Z Gan*, JP Fauconnier, S Dodge, B Zhang, P Dufter, D Shah, ...
ECCV, 2024
Cited by: 138, Year: 2024
An internal learning approach to video inpainting
H Zhang, L Mai, N Xu, Z Wang, J Collomosse, H Jin
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019
Cited by: 89, Year: 2019
Eye in the sky: Drone-based object tracking and 3D localization
H Zhang, G Wang, Z Lei, JN Hwang
Proceedings of the 27th ACM international conference on multimedia, 899-907, 2019
Cited by: 85, Year: 2019
VisDrone-SOT2019: The vision meets drone single object tracking challenge results
D Du, P Zhu, L Wen, X Bian, H Ling, Q Hu, J Zheng, T Peng, X Wang, ...
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019
Cited by: 55, Year: 2019
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
K You, H Zhang, E Schoop, F Weers, A Swearngin, J Nichols, Y Yang, ...
ECCV, 2024
Cited by: 50, Year: 2024
VisDrone-MOT2019: The vision meets drone multiple object tracking challenge results
L Wen, P Zhu, D Du, X Bian, H Ling, Q Hu, J Zheng, T Peng, X Wang, ...
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019
Cited by: 29, Year: 2019
From scarcity to efficiency: Improving CLIP training via visual-enriched captions
Z Lai*, H Zhang*, W Wu, H Bai, A Timofeev, X Du, Z Gan, J Shan, ...
ECCV2024, 2023
Cited by: 26, Year: 2023
How easy is it to fool your multimodal LLMs? An empirical analysis on deceptive prompts
Y Qian, H Zhang, Y Yang, Z Gan
arXiv preprint arXiv:2402.13220, 2024
Cited by: 25, Year: 2024
Ferret-v2: An improved baseline for referring and grounding with large language models
H Zhang
arXiv preprint arXiv:2404.07973 3, 19, 2024
Cited by: 22*, Year: 2024
Bundle adjustment for monocular visual odometry based on detections of traffic signs
Y Zhang, H Zhang, G Wang, J Yang, JN Hwang
IEEE transactions on vehicular technology 69 (1), 151-162, 2019
Cited by: 21, Year: 2019
IA-MOT: Instance-aware multi-object tracking with motion consistency
J Cai, Y Wang, H Zhang, HM Hsu, C Ma, JN Hwang
CVPR2020, 2020
Cited by: 15, Year: 2020
Apple intelligence foundation language models
T Gunter, Z Wang, C Wang, R Pang, A Narayanan, A Zhang, B Zhang, ...
arXiv preprint arXiv:2407.21075, 2024
Cited by: 14, Year: 2024
Empowering unsupervised domain adaptation with large-scale pre-trained vision-language models
Z Lai, H Bai, H Zhang, X Du, J Shan, Y Yang, CN Chuah, M Cao
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer …, 2024
Cited by: 14, Year: 2024
LIFTS: Lidar and monocular image fusion for multi-object tracking and segmentation
H Zhang, Y Wang, J Cai, HM Hsu, H Ji, JN Hwang
BMTT Challenge Workshop, IEEE Conference on Computer Vision and Pattern …, 2020
Cited by: 14, Year: 2020