Enabling coordinated register allocation and thread-level parallelism optimization for GPUs | X Xie, Y Liang, X Li, Y Wu, G Sun, T Wang, D Fan | Proceedings of the 48th International Symposium on Microarchitecture, 395-406, 2015 | 79 | 2015 |
TGPA: Tile-grained pipeline architecture for low latency CNN inference | X Wei, Y Liang, X Li, CH Yu, P Zhang, J Cong | 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 1-8, 2018 | 68 | 2018 |
A coordinated tiling and batching framework for efficient GEMM on GPUs | X Li, Y Liang, S Yan, L Jia, Y Li | Proceedings of the 24th symposium on principles and practice of parallel …, 2019 | 50 | 2019 |
AMOS: enabling automatic mapping for tensor computations on spatial accelerators with hardware abstraction | S Zheng, R Chen, A Wei, Y Jin, Q Han, L Lu, B Wu, X Li, S Yan, Y Liang | Proceedings of the 49th Annual International Symposium on Computer …, 2022 | 21 | 2022 |
Performance-centric register file design for GPUs using racetrack memory | S Wang, Y Liang, C Zhang, X Xie, G Sun, Y Liu, Y Wang, X Li | 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC), 25-30, 2016 | 21 | 2016 |
Enabling efficient fast convolution algorithms on GPUs via MegaKernels | L Jia, Y Liang, X Li, L Lu, S Yan | IEEE Transactions on Computers 69 (7), 986-997, 2020 | 16 | 2020 |
CRAT: Enabling coordinated register allocation and thread-level parallelism optimization for GPUs | X Xie, Y Liang, X Li, Y Wu, G Sun, T Wang, D Fan | IEEE Transactions on Computers 67 (6), 890-897, 2017 | 16 | 2017 |
Efficient kernel management on GPUs | X Li, Y Liang | 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), 85-90, 2016 | 16 | 2016 |
cuMBIR: An efficient framework for low-dose X-ray CT image reconstruction on GPUs | X Li, Y Liang, W Zhang, T Liu, H Li, G Luo, M Jiang | Proceedings of the 2018 International Conference on Supercomputing, 184-194, 2018 | 13 | 2018 |
Efficient kernel management on GPUs | Y Liang, X Li | ACM Transactions on Embedded Computing Systems (TECS) 16 (4), 1-24, 2017 | 13 | 2017 |
Exploring cache bypassing and partitioning for multi-tasking on GPUs | Y Liang, X Li, X Xie | 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 9-16, 2017 | 12 | 2017 |
Neoflow: A flexible framework for enabling efficient compilation for high performance DNN training | S Zheng, R Chen, Y Jin, A Wei, B Wu, X Li, S Yan, Y Liang | IEEE Transactions on Parallel and Distributed Systems 33 (11), 3220-3232, 2021 | 8 | 2021 |
CuLDA: solving large-scale LDA Problems on GPUs | X Xie, Y Liang, X Li, W Tan | Proceedings of the 28th International Symposium on High-Performance Parallel …, 2019 | 7 | 2019 |
CuLDA_CGS: Solving large-scale LDA problems on GPUs | X Xie, Y Liang, X Li, W Tan | Proceedings of the 24th Symposium on Principles and Practice of Parallel …, 2019 | 6 | 2019 |
Chimera: An Analytical Optimizing Framework for Effective Compute-intensive Operators Fusion | S Zheng, S Chen, P Song, R Chen, X Li, S Yan, D Lin, J Leng, Y Liang | 2023 IEEE International Symposium on High-Performance Computer Architecture …, 2023 | 3 | 2023 |
Theoretical linear convergence of deep unfolding network for block-sparse signal recovery | R Fu, Y Liu, X Li | Third International Conference on Computer Science and Communication …, 2022 | 3 | 2022 |
LongTail-Bench: A Benchmark Suite for Domain-Specific Operators in Deep Learning | X Li, S Yan, L Jiang, P Xu, J Ma, X Zhang, D Lin | 2022 IEEE International Symposium on Workload Characterization (IISWC), 282-295, 2022 | 1 | 2022 |
EasyView: Enabling and Scheduling Tensor Views in Deep Learning Compilers | L Jiang, P Xu, Q Zhu, X Li, S Yan, X Zhang, D Lin, W Ma, Z Li, J Liu, J Ma, ... | Proceedings of the 51st International Conference on Parallel Processing, 1-11, 2022 | 1 | 2022 |
FlashDecoding++: Faster Large Language Model Inference on GPUs | K Hong, G Dai, J Xu, Q Mao, X Li, J Liu, K Chen, H Dong, Y Wang | arXiv preprint arXiv:2311.01282, 2023 | | 2023 |
Proteus: Simulating the Performance of Distributed DNN Training | J Duan, X Li, P Xu, X Zhang, S Yan, Y Liang, D Lin | arXiv preprint arXiv:2306.02267, 2023 | | 2023 |