Wesley Bland
Title
Cited by
Year
Post-failure recovery of MPI communication capability: Design and rationale
W Bland, A Bouteiller, T Herault, G Bosilca, J Dongarra
The International Journal of High Performance Computing Applications 27 (3 …, 2013
Cited by: 275; Year: 2013
An Evaluation of User-Level Failure Mitigation Support in MPI
W Bland, A Bouteiller, T Herault, J Hursey, G Bosilca, J Dongarra
Recent Advances in the Message Passing Interface, 193-203, 2012
Cited by: 143; Year: 2012
A proposal for User-Level Failure Mitigation in the MPI-3 Standard
W Bland, G Bosilca, A Bouteiller, T Herault, J Dongarra
Department of Electrical Engineering and Computer Science, University of …, 2012
Cited by: 58; Year: 2012
An evaluation of user-level failure mitigation support in MPI
W Bland, A Bouteiller, T Herault, J Hursey, G Bosilca, JJ Dongarra
Computing 95, 1171-1184, 2013
Cited by: 56; Year: 2013
Fault injection framework for system resilience evaluation: fake faults for finding future failures
T Naughton, W Bland, G Vallée, C Engelmann, SL Scott
Proceedings of the 2009 workshop on Resiliency in High Performance Computing …, 2009
Cited by: 53; Year: 2009
A Checkpoint-on-Failure protocol for algorithm-based recovery in standard MPI
W Bland, P Du, A Bouteiller, T Herault, G Bosilca, J Dongarra
Euro-Par 2012 Parallel Processing, 477-488, 2012
Cited by: 51; Year: 2012
Fault tolerant MapReduce-MPI for HPC clusters
Y Guo, W Bland, P Balaji, X Zhou
Proceedings of the International Conference for High Performance Computing …, 2015
Cited by: 47; Year: 2015
Why is MPI so slow? Analyzing the fundamental limits in implementing MPI-3.1
K Raffenetti, A Amer, L Oden, C Archer, W Bland, H Fujita, Y Guo, ...
Proceedings of the International Conference for High Performance Computing …, 2017
Cited by: 39; Year: 2017
MPI sessions: Leveraging runtime infrastructure to increase scalability of applications at exascale
D Holmes, K Mohror, RE Grant, A Skjellum, M Schulz, W Bland, ...
Proceedings of the 23rd European MPI Users' Group Meeting, 121-129, 2016
Cited by: 37; Year: 2016
User level failure mitigation in MPI
W Bland
Euro-Par 2012: Parallel Processing Workshops: BDMC, CGWS, HeteroPar, HiBB …, 2013
Cited by: 35; Year: 2013
Extending the scope of the Checkpoint‐on‐Failure protocol for forward recovery in standard MPI
W Bland, P Du, A Bouteiller, T Herault, G Bosilca, JJ Dongarra
Concurrency and Computation: Practice and Experience 25 (17), 2381-2393, 2013
Cited by: 30; Year: 2013
VOCL-FT: introducing techniques for efficient soft error coprocessor recovery
AJ Peña, W Bland, P Balaji
Proceedings of the International Conference for High Performance Computing …, 2015
Cited by: 28; Year: 2015
MPICH user’s guide
P Balaji, W Bland, W Gropp, R Latham, H Lu, AJ Pena, K Raffenetti, S Seo, ...
Argonne National Laboratory, 2014
Cited by: 25; Year: 2014
Lessons learned implementing user-level failure mitigation in MPICH
W Bland, H Lu, S Seo, P Balaji
2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2015
Cited by: 22; Year: 2015
Simplifying the recovery model of user-level failure mitigation
W Bland, K Raffenetti, P Balaji
2014 Workshop on Exascale MPI at Supercomputing Conference, 20-25, 2014
Cited by: 20; Year: 2014
Portable, MPI-Interoperable Coarray Fortran
C Yang, W Bland, J Mellor-Crummey, P Balaji
Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of …, 2014
Cited by: 20; Year: 2014
Design and implementation of a menu based OSCAR command line interface
W Bland, T Naughton, G Vallee, SL Scott
High Performance Computing Systems and Applications, 2007. HPCS 2007. 21st …, 2007
Cited by: 16; Year: 2007
Memory compression techniques for network address management in MPI
Y Guo, CJ Archer, M Blocksome, S Parker, W Bland, K Raffenetti, P Balaji
2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2017
Cited by: 10; Year: 2017
Toward Message Passing Failure Management
W Bland
University of Tennessee, Knoxville, 2013
Cited by: 9; Year: 2013
MPICH user’s guide
A Amer, P Balaji, W Bland, W Gropp, Y Guo, R Latham, H Lu, L Oden, ...
Mathematics and Computer Science Division-Argonne National Laboratory, 2015
Cited by: 8; Year: 2015