Pin: building customized program analysis tools with dynamic instrumentation CK Luk, R Cohn, R Muth, H Patil, A Klauser, G Lowney, S Wallace, ... ACM SIGPLAN Notices 40 (6), 190-200, 2005 | 5231 | 2005 |
Applied machine learning at Facebook: A datacenter infrastructure perspective K Hazelwood, S Bird, D Brooks, S Chintala, U Diril, D Dzhulgakov, ... 2018 IEEE International Symposium on High Performance Computer Architecture …, 2018 | 487 | 2018 |
Profiling a warehouse-scale computer S Kanev, JP Darago, K Hazelwood, P Ranganathan, T Moseley, GY Wei, ... Proceedings of the 42nd Annual International Symposium on Computer …, 2015 | 381 | 2015 |
Where is the data? Why you cannot debate CPU vs. GPU performance without the answer C Gregg, K Hazelwood (IEEE ISPASS) IEEE International Symposium on Performance Analysis of …, 2011 | 376 | 2011 |
Machine learning at Facebook: Understanding inference at the edge CJ Wu, D Brooks, K Chen, D Chen, S Choudhury, M Dukhan, ... 2019 IEEE International Symposium on High Performance Computer Architecture …, 2019 | 333 | 2019 |
MLPerf training benchmark P Mattson, C Cheng, G Diamos, C Coleman, P Micikevicius, D Patterson, ... Proceedings of Machine Learning and Systems 2, 336-349, 2020 | 179 | 2020 |
Analyzing parallel programs with Pin M Bach, M Charney, R Cohn, E Demikhovsky, T Devor, K Hazelwood, ... Computer 43 (3), 34-41, 2010 | 151 | 2010 |
The architectural implications of Facebook's DNN-based personalized recommendation U Gupta, CJ Wu, X Wang, M Naumov, B Reagen, D Brooks, B Cottel, ... 2020 IEEE International Symposium on High Performance Computer Architecture …, 2020 | 150 | 2020 |
Reducing DRAM footprint with NVM in Facebook A Eisenman, D Gardner, I AbdelRahman, J Axboe, S Dong, K Hazelwood, ... Proceedings of the Thirteenth EuroSys Conference, 1-13, 2018 | 135 | 2018 |
Deep learning inference in Facebook data centers: Characterization, performance optimizations and hardware implications J Park, M Naumov, P Basu, S Deng, A Kalaiah, D Khudia, J Law, P Malani, ... arXiv preprint arXiv:1811.09886, 2018 | 128 | 2018 |
Fine-Grained Resource Sharing for Concurrent GPGPU Kernels C Gregg, J Dorn, K Hazelwood, K Skadron 4th USENIX Workshop on Hot Topics in Parallelism (HotPar 12), 2012 | 120 | 2012 |
Enabling task parallelism in the CUDA scheduler M Guevara, C Gregg, K Hazelwood, K Skadron Workshop on Programming Models for Emerging Architectures 9, 84, 2009 | 119 | 2009 |
SuperPin: Parallelizing dynamic instrumentation for real-time performance S Wallace, K Hazelwood International Symposium on Code Generation and Optimization (CGO'07), 209-220, 2007 | 118 | 2007 |
A dynamic binary instrumentation engine for the ARM architecture K Hazelwood, A Klauser Proceedings of the 2006 international conference on Compilers, architecture …, 2006 | 106 | 2006 |
Adaptive online context-sensitive inlining K Hazelwood, D Grove International Symposium on Code Generation and Optimization, 2003. CGO 2003 …, 2003 | 83 | 2003 |
RecNMP: Accelerating personalized recommendation with near-memory processing L Ke, U Gupta, BY Cho, D Brooks, V Chandra, U Diril, A Firoozshahian, ... 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture …, 2020 | 82 | 2020 |
Dynamic heterogeneous scheduling decisions using historical runtime data C Gregg, M Boyer, K Hazelwood, K Skadron Workshop on Applications for Multi-and Many-Core Processors (A4MMC), 1-12, 2011 | 80 | 2011 |
Tradeoffs between power management and tail latency in warehouse-scale applications S Kanev, K Hazelwood, GY Wei, D Brooks 2014 IEEE International Symposium on Workload Characterization (IISWC), 31-40, 2014 | 79 | 2014 |
Improving region selection in dynamic optimization systems D Hiniker, K Hazelwood, MD Smith 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05 …, 2005 | 67 | 2005 |
Dynamic program analysis of Microsoft Windows applications A Skaletsky, T Devor, N Chachmon, R Cohn, K Hazelwood, V Vladimirov, ... 2010 IEEE International Symposium on Performance Analysis of Systems …, 2010 | 59 | 2010 |