Xiaolong Xie
Software/hardware Engineer, Database BU, Alibaba Inc.
Verified email at alibaba-inc.com
Title
Cited by
Year
Coordinated static and dynamic cache bypassing for GPUs
X Xie, Y Liang, Y Wang, G Sun, T Wang
2015 IEEE 21st International Symposium on High Performance Computer …, 2015
Cited by 103, 2015
An efficient compiler framework for cache bypassing on GPUs
X Xie, Y Liang, G Sun, D Chen
2013 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 516-523, 2013
Cited by 86, 2013
Enabling Coordinated Register Allocation and Thread-level Parallelism Optimization for GPUs
X Xie, Y Liang, X Li, Y Wu, G Sun, T Wang, D Fan
IEEE/ACM International Symposium on Microarchitecture, 2015
Cited by 53, 2015
CuMF_SGD: Parallelized stochastic gradient descent for matrix factorization on GPUs
X Xie, W Tan, LL Fong, Y Liang
Proceedings of the 26th International Symposium on High-Performance Parallel …, 2017
Cited by 19, 2017
An Efficient Compiler Framework for Cache Bypassing on GPUs
Y Liang, X Xie, G Sun, D Chen
IEEE, 2015
Cited by 19, 2015
Performance-centric register file design for GPUs using racetrack memory
S Wang, Y Liang, C Zhang, X Xie, G Sun, Y Liu, Y Wang, X Li
2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC), 25-30, 2016
Cited by 12, 2016
Cumf_sgd: Fast and scalable matrix factorization
X Xie, W Tan, LL Fong, Y Liang
arXiv preprint arXiv:1610.05838, 2016
Cited by 10, 2016
Optimizing cache bypassing and warp scheduling for GPUs
Y Liang, X Xie, Y Wang, G Sun, T Wang
IEEE Transactions on Computer-Aided Design of Integrated Circuits and …, 2017
Cited by 5, 2017
Exploring cache bypassing and partitioning for multi-tasking on GPUs
Y Liang, X Li, X Xie
2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 9-16, 2017
Cited by 4, 2017
CuLDA: solving large-scale LDA Problems on GPUs
X Xie, Y Liang, X Li, W Tan
Proceedings of the 28th International Symposium on High-Performance Parallel …, 2019
Cited by 3, 2019
CuLDA_CGS: Solving Large-scale LDA Problems on GPUs
X Xie, Y Liang, X Li, W Tan
arXiv preprint, 2018
Cited by 2, 2018
CRAT: Enabling Coordinated Register Allocation and Thread-Level Parallelism Optimization for GPUs
X Xie, Y Liang, X Li, Y Wu, G Sun, T Wang, D Fan
IEEE Transactions on Computers 67 (6), 890-897, 2017
Cited by 2, 2017
Adaptive parallelism of task execution on machines with accelerators
LL Fong, W Tan, X Xie, H Zhou
US Patent 10,203,988, 2019
Cited by 1, 2019
Matrix factorization with two-stage data block dispatch associated with graphics processing units
E Duesterwald, LL Fong, W Tan, X Xie
US Patent 10,380,222, 2019
2019
Efficient Data-Parallel Primitives on Heterogeneous Systems
Z Lai, Q Luo, X Xie
Proceedings of the 48th International Conference on Parallel Processing, 1-10, 2019
2019