Pin: building customized program analysis tools with dynamic instrumentation CK Luk, R Cohn, R Muth, H Patil, A Klauser, G Lowney, S Wallace, ... ACM SIGPLAN Notices 40 (6), 190-200, 2005 | 5754 | 2005 |
Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping CK Luk, S Hong, H Kim Proceedings of the 42nd Annual IEEE/ACM international symposium on …, 2009 | 766 | 2009 |
Compiler-based prefetching for recursive data structures CK Luk, TC Mowry Proceedings of the seventh international conference on Architectural support …, 1996 | 566 | 1996 |
The Pochoir stencil compiler Y Tang, RA Chowdhury, BC Kuszmaul, CK Luk, CE Leiserson Proceedings of the twenty-third annual ACM symposium on Parallelism in …, 2011 | 458 | 2011 |
Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors CK Luk Proceedings of the 28th annual international symposium on Computer …, 2001 | 376 | 2001 |
Asim: A performance model framework J Emer, P Ahuja, E Borch, A Klauser, CK Luk, S Manne, SS Mukherjee, ... Computer 35 (2), 68-76, 2002 | 321 | 2002 |
PyTorch 2: Faster machine learning through dynamic Python bytecode transformation and graph compilation J Ansel, E Yang, H He, N Gimelshein, A Jain, M Voznesensky, B Bao, ... Proceedings of the 29th ACM International Conference on Architectural …, 2024 | 270 | 2024 |
CMP$im: A Pin-based on-the-fly multi-core cache simulator A Jaleel, RS Cohn, CK Luk, B Jacob Proceedings of the Fourth Annual Workshop on Modeling, Benchmarking and …, 2008 | 241 | 2008 |
Analyzing parallel programs with Pin M Bach, M Charney, R Cohn, E Demikhovsky, T Devor, K Hazelwood, ... Computer 43 (3), 34-41, 2010 | 161 | 2010 |
SD3: A scalable approach to dynamic data-dependence profiling M Kim, H Kim, CK Luk 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 535-546, 2010 | 146 | 2010 |
PinOS: A programmable framework for whole-system dynamic instrumentation PP Bungale, CK Luk Proceedings of the 3rd international conference on Virtual execution …, 2007 | 137 | 2007 |
Cooperative prefetching: Compiler and hardware support for effective instruction prefetching in modern processors CK Luk, TC Mowry Proceedings. 31st Annual ACM/IEEE International Symposium on …, 1998 | 108 | 1998 |
Ispike: a post-link optimizer for the Intel® Itanium® architecture CK Luk, R Muth, H Patil, R Cohn, G Lowney International Symposium on Code Generation and Optimization, 2004. CGO 2004 …, 2004 | 88 | 2004 |
Automatic compiler-inserted prefetching for pointer-based applications CK Luk, TC Mowry IEEE Transactions on Computers 48 (2), 134-141, 1999 | 87 | 1999 |
Predicting data cache misses in non-numeric applications through correlation profiling TC Mowry, CK Luk Proceedings of 30th Annual International Symposium on Microarchitecture, 314-320, 1997 | 69 | 1997 |
Prospector: A dynamic data-dependence profiler to help parallel programming M Kim, H Kim, CK Luk HotPar’10: Proceedings of the USENIX workshop on Hot Topics in parallelism …, 2010 | 60 | 2010 |
Memory forwarding: Enabling aggressive layout optimizations by guaranteeing the safety of data relocation CK Luk, TC Mowry Proceedings of the 26th Annual International Symposium on Computer …, 1999 | 57 | 1999 |
Profile-guided post-link stride prefetching CK Luk, R Muth, H Patil, R Weiss, PG Lowney, R Cohn Proceedings of the 16th international conference on Supercomputing, 167-178, 2002 | 55 | 2002 |
Controlling program execution through binary instrumentation H Pan, K Asanović, R Cohn, CK Luk ACM SIGARCH Computer Architecture News 33 (5), 45-50, 2005 | 51 | 2005 |
Methods and apparatus for stride profiling a software application CK Luk, G Lowney US Patent 7,181,723, 2007 | 48 | 2007 |