Analytical modeling is enough for high-performance BLIS TM Low, FD Igual, TM Smith, ES Quintana-Ortí ACM Transactions on Mathematical Software (TOMS) 43 (2), 1-18, 2016 | 185 | 2016 |
The BLIS framework: Experiments in portability FG Van Zee, TM Smith, B Marker, TM Low, RA van de Geijn, FD Igual, ... ACM Transactions on Mathematical Software (TOMS) 42 (2), 1-19, 2016 | 127 | 2016 |
SPIRAL: Extreme performance portability F Franchetti, TM Low, DT Popovici, RM Veras, DG Spampinato, ... Proceedings of the IEEE 106 (11), 1935-1968, 2018 | 126 | 2018 |
A unified coded deep neural network training strategy based on generalized PolyDot codes S Dutta, Z Bai, H Jeong, TM Low, P Grover 2018 IEEE International Symposium on Information Theory (ISIT), 1585-1589, 2018 | 121 | 2018 |
3D-stacked memory-side acceleration: Accelerator and system design Q Guo, N Alachiotis, B Akin, F Sadi, G Xu, TM Low, L Pileggi, JC Hoe, ... 2nd Workshop on Near Data Processing, WONDP 2014, 2014 | 115 | 2014 |
High performance zero-memory overhead direct convolutions J Zhang, F Franchetti, TM Low Proceedings of the 35th International Conference on Machine Learning 80 …, 2018 | 95 | 2018 |
Efficient SpMV operation for large and highly sparse matrices using scalable multi-way merge parallelization F Sadi, J Sweeney, TM Low, JC Hoe, L Pileggi, F Franchetti Proceedings of the 52nd Annual IEEE/ACM International Symposium on …, 2019 | 82 | 2019 |
Exploiting symmetry in tensors for high performance: Multiplication with symmetric tensors MD Schatz, TM Low, RA van de Geijn, TG Kolda SIAM Journal on Scientific Computing 36 (5), C453-C479, 2014 | 67 | 2014 |
An API for manipulating matrices stored by blocks TM Low, RA van de Geijn FLAME Working Note, Computer Science Department, University of Texas at Austin, 2004 | 60 | 2004 |
Accumulating Householder transformations, revisited T Joffrain, TM Low, ES Quintana-Ortí, RA van de Geijn, FG Van Zee ACM Transactions on Mathematical Software (TOMS) 32 (2), 169-179, 2006 | 56 | 2006 |
Evaluation of graph analytics frameworks using the GAP benchmark suite A Azad, MM Aznaveh, S Beamer, MP Blanco, J Chen, L D'Alessandro, ... 2020 IEEE International Symposium on Workload Characterization (IISWC), 216-227, 2020 | 37 | 2020 |
Analytical cache modeling and tilesize optimization for tensor contractions R Li, A Sukumaran-Rajam, R Veras, TM Low, F Rastello, A Rountev, ... Proceedings of the International Conference for High Performance Computing …, 2019 | 34 | 2019 |
Scalable parallelization of FLAME code via the workqueuing model FG Van Zee, P Bientinesi, TM Low, RA van de Geijn ACM Transactions on Mathematical Software (TOMS) 34 (2), 2008 | 30 | 2008 |
FFTX and SpectralPack: A first look F Franchetti, DG Spampinato, A Kulkarni, DT Popovici, TM Low, ... 2018 IEEE 25th International Conference on High Performance Computing …, 2018 | 28 | 2018 |
CodeNet: Training large scale neural networks in presence of soft-errors S Dutta, Z Bai, TM Low, P Grover arXiv preprint arXiv:1903.01042, 2019 | 27 | 2019 |
First look: Linear algebra-based triangle counting without matrix multiplication TM Low, VN Rao, M Lee, D Popovici, F Franchetti, S McMillan 2017 IEEE High Performance Extreme Computing Conference (HPEC), 1-6, 2017 | 27 | 2017 |
Masterless coded computing: A fully-distributed coded FFT algorithm H Jeong, TM Low, P Grover 2018 56th Annual Allerton Conference on Communication, Control, and …, 2018 | 26 | 2018 |
High-assurance SPIRAL: End-to-end guarantees for robot and car control F Franchetti, TM Low, S Mitsch, JP Mendoza, L Gui, A Phaosawasdi, ... IEEE Control Systems Magazine 37 (2), 82-103, 2017 | 26 | 2017 |
Large bandwidth-efficient FFTs on multicore and multi-socket systems DT Popovici, TM Low, F Franchetti 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2018 | 24 | 2018 |
Exploration of fine-grained parallelism for load balancing eager k-truss on GPU and CPU MP Blanco, TM Low, K Kim 2019 IEEE High Performance Extreme Computing Conference (HPEC), 1-7, 2019 | 23 | 2019 |