Fast shared-memory algorithms for computing the minimum spanning forest of sparse graphs DA Bader, G Cong Journal of Parallel and Distributed Computing 66 (11), 1366-1378, 2006 | 285* | 2006 |
On the convergence properties of a K-step averaging stochastic gradient descent algorithm for nonconvex optimization F Zhou, G Cong arXiv preprint arXiv:1708.01012, 2017 | 260 | 2017 |
Solving large, irregular graph problems using adaptive work-stealing G Cong, S Kodali, S Krishnamoorthy, D Lea, V Saraswat, T Wen 2008 37th International Conference on Parallel Processing, 536-545, 2008 | 132 | 2008 |
On the architectural requirements for efficient execution of graph algorithms DA Bader, G Cong, J Feo 2005 International Conference on Parallel Processing (ICPP'05), 547-556, 2005 | 125 | 2005 |
Fast PGAS implementation of distributed graph algorithms G Cong, G Almasi, V Saraswat SC'10: Proceedings of the 2010 ACM/IEEE International Conference for High …, 2010 | 67 | 2010 |
Accelerating data loading in deep neural network training CC Yang, G Cong 2019 IEEE 26th International Conference on High Performance Computing, Data …, 2019 | 50 | 2019 |
Automated detection of application performance bottlenecks IH Chung, G Cong, DJ Klepacki, S Sbaraglia, SR Seelam, HF Wen US Patent 8,225,291, 2012 | 50 | 2012 |
Iterative, non-uniform profiling method for automatically refining performance bottleneck regions in scientific code G Cong, PK Malkin US Patent 8,214,806, 2012 | 46 | 2012 |
Optimizing large-scale graph analysis on multithreaded, multicore platforms G Cong, K Makarychev 2012 IEEE 26th International Parallel and Distributed Processing Symposium …, 2012 | 45 | 2012 |
An experimental study of parallel biconnected components algorithms on symmetric multiprocessors (SMPs) G Cong, DA Bader 19th IEEE International Parallel and Distributed Processing Symposium, 9 pp., 2005 | 44 | 2005 |
Programmable framework for automatic tuning of software applications IH Chung, G Cong, DJ Klepacki, S Sbaraglia, SR Seelam, HF Wen US Patent 8,327,325, 2012 | 34 | 2012 |
Artificial intelligence for accelerating time integrations in multiscale modeling C Han, P Zhang, D Bluestein, G Cong, Y Deng Journal of Computational Physics 427, 110053, 2021 | 30 | 2021 |
Application data prefetching on the IBM Blue Gene/Q supercomputer IH Chung, C Kim, HF Wen, G Cong SC'12: Proceedings of the International Conference on High Performance …, 2012 | 29 | 2012 |
Profiling application performance according to data structure IH Chung, G Cong, K Ekanadham, D Klepacki, S Sbaraglia, HF Wen US Patent 8,490,061, 2013 | 28 | 2013 |
Lock-free parallel algorithms: An experimental study G Cong, D Bader High Performance Computing-HiPC 2004: 11th International Conference …, 2005 | 28 | 2005 |
A framework for automated performance bottleneck detection IH Chung, G Cong, D Klepacki, S Sbaraglia, S Seelam, HF Wen 2008 IEEE International Symposium on Parallel and Distributed Processing, 1-7, 2008 | 25 | 2008 |
Techniques for designing efficient parallel graph algorithms for SMPs and multicore processors G Cong, DA Bader Parallel and Distributed Processing and Applications: 5th International …, 2007 | 23 | 2007 |
An Empirical Analysis of Parallel Random Permutation Algorithms on SMPs. G Cong, DA Bader PDCS, 27-34, 2005 | 22 | 2005 |
A productivity centered tools framework for application performance tuning H Wen, S Sbaraglia, S Seelam, I Chung, G Cong, D Klepacki Fourth International Conference on the Quantitative Evaluation of Systems …, 2007 | 21 | 2007 |
IBM System Blue Gene Solution: Performance Analysis Tools G Lakner, IH Chung, G Cong, S Fadden, N Goracke, D Klepacki, J Lien, ... IBM Redpaper Publication, 2008 | 19 | 2008 |