




  • D. Buntinas, C. Coti, T. Hérault, P. Lemarinier, L. Pilard, A. Rezmerita, E. Rodriguez and F. Cappello, "Blocking vs. Non-blocking Coordinated Checkpointing for Large-Scale Fault Tolerant MPI", Future Generation Computer Systems, Volume 24, Issue 1, January 2008, Pages 73-84. (pdf)
  • P. Balaji, W. Feng, S. Bhagvat, D. K. Panda, R. Thakur, and W. Gropp, "Analyzing the Impact of Supporting Out-of-Order Communication on In-Order Performance with iWARP," in Proc. of SC07, November 2007. (pdf)
  • J. L. Träff, W. Gropp, and R. Thakur, "Self-Consistent MPI Performance Requirements," in Proc. of the 14th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2007), September 2007, pp. 36-45. (pdf) (selected for the outstanding papers session)
  • R. Thakur and W. Gropp, "Test Suite for Evaluating Performance of MPI Implementations That Support MPI_THREAD_MULTIPLE," in Proc. of the 14th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2007), September 2007, pp. 46-55. (pdf) (selected for the outstanding papers session)
  • W. Gropp and R. Thakur, "Revealing the Performance of MPI RMA Implementations," in Proc. of the 14th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2007), September 2007, pp. 272-280. (pdf)
  • S. Pervez, G. Gopalakrishnan, R. M. Kirby, R. Palmer, R. Thakur, and W. Gropp, "Practical Model Checking Method for Verifying Correctness of MPI Programs," Proc. of the 14th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2007), September 2007, pp. 344-353. (pdf)
  • R. Latham, W. Gropp, R. Ross, and R. Thakur, "Extending the MPI-2 Generalized Request Interface," in Proc. of the 14th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2007), September 2007, pp. 223-232. (pdf)
  • D. Buntinas, G. Mercier and W. Gropp, "Implementation and Evaluation of Shared-Memory Communication and Synchronization Operations in MPICH2 using the Nemesis Communication Subsystem", Parallel Computing, Volume 33, Issue 9, September 2007, Pages 634-644. (pdf)
  • R. Thakur and W. Gropp, "Open Issues in MPI Implementation," in Proc. of the 12th Asia-Pacific Computer Systems Architecture Conference (ACSAC 2007), August 2007, pp. 327-338. (pdf)
  • W. Gropp and R. Thakur, "Thread Safety in an MPI Implementation: Requirements and Analysis," Parallel Computing, (33)9:595-604, September 2007. (pdf)
  • P. Balaji, S. Bhagvat, D. Panda, R. Thakur, and W. Gropp, "Advanced Flow-control Mechanisms for the Sockets Direct Protocol over InfiniBand," in Proc. of the 2007 Int'l Conference on Parallel Processing, September 2007. (pdf)
  • R. Latham, R. Ross, and R. Thakur, "Implementing MPI-IO Atomic Mode and Shared File Pointers Using MPI One-Sided Communication," Int'l Journal of High Performance Computing Applications, (21)2:132-143, Summer 2007. (pdf)
  • P. Balaji, D. Buntinas, S. Balay, B. Smith, R. Thakur, and W. Gropp, "Nonuniformly Communicating Noncontiguous Data: A Case Study with PETSc and MPI," in Proc. of the 21st IEEE Int'l Parallel and Distributed Processing Symposium (IPDPS 2007), March 2007. (pdf)
  • K. Coloma, A. Ching, A. Choudhary, W. Liao, R. Ross, R. Thakur, and L. Ward, "A New Flexible MPI Collective I/O Implementation," in Proc. of the IEEE Int'l Conference on Cluster Computing (Cluster 2006), September 2006. (pdf)
  • D. Buntinas, G. Mercier and W. Gropp, "Implementation and Shared-Memory Evaluation of MPICH2 over the Nemesis Communication Subsystem", in Proc. of the 13th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2006), September 2006. (pdf)
  • W. Gropp and R. Thakur, "Issues in Developing a Thread-Safe MPI Implementation," in Proc. of the 13th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2006), September 2006, pp. 12-21. (pdf) (selected as 1 of 3 outstanding papers at the conference)
  • S. Pervez, G. Gopalakrishnan, R. M. Kirby, R. Thakur, and W. Gropp, "Formal Verification of Programs That Use MPI One-Sided Communication," in Proc. of the 13th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2006), September 2006, pp. 30-39. (pdf) (selected as 1 of 3 outstanding papers at the conference)
  • R. Latham, R. Ross, and R. Thakur, "Can MPI Be Used for Persistent Parallel Services?," in Proc. of the 13th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2006), September 2006, pp. 275-284. (pdf)
  • S. Byna, X. Sun, R. Thakur, and W. Gropp, "Automatic Memory Optimizations for Improving MPI Derived Datatype Performance," in Proc. of the 13th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2006), September 2006, pp. 238-246. (pdf)
  • D. Buntinas, G. Mercier and W. Gropp, "Data Transfers Between Processes in an SMP System: Performance Study and Application to MPI", in Proceedings of the International Conference on Parallel Processing 2006 (ICPP 06), August 2006. (pdf)
  • D. Buntinas, G. Mercier and W. Gropp, "Design and Evaluation of Nemesis, a Scalable, Low-Latency, Message-Passing Communication Subsystem", in Proceedings of the International Symposium on Cluster Computing and the Grid 2006 (CCGRID '06), May 2006. (pdf)
  • E. Chan, R. Geijn, W. Gropp, and R. Thakur, "Collective Communication on Architectures that Support Simultaneous Communication over Multiple Links," in Proc. of the ACM SIGPLAN 2006 Symposium on Principles and Practice of Parallel Programming (PPoPP 2006), March 2006. (pdf)
  • J. Lee, R. Ross, S. Atchley, M. Beck, and R. Thakur, "MPI-IO/L: Efficient Remote I/O for MPI-IO via Logistical Networking," in Proc. of the 20th IEEE Int'l Parallel and Distributed Processing Symposium (IPDPS 2006), April 2006. (pdf)
  • H. Yu, R. K. Sahoo, C. Howson, G. Almasi, J. G. Castanos, M. Gupta, J. E. Moreira, J. J. Parker, T. E. Engelsiepen, R. Ross, R. Thakur, R. Latham, and W. D. Gropp, "High Performance File I/O for the BlueGene/L Supercomputer," in Proc. of the 12th International Symposium on High-Performance Computer Architecture (HPCA-12), February 2006. (pdf)
  • D. Buntinas and W. Gropp, "Designing a Common Communication Subsystem", in Proc. of the 12th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2005), Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science, LNCS 3666, Springer, September 2005. (pdf)
  • W. Gropp and R. Thakur, "An Evaluation of Implementation Options for MPI One-Sided Communication," in Proc. of the 12th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2005), Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science, LNCS 3666, Springer, September 2005, pp. 415-424. (ps, pdf)
  • R. Thakur, R. Ross, and R. Latham, "Implementing Byte-Range Locks Using MPI One-Sided Communication," in Proc. of the 12th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2005), Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science, LNCS 3666, Springer, September 2005, pp. 119-128. (ps, pdf) (Note: We recently discovered a bug in this algorithm that can lead to deadlock. See this paper published in Euro PVM/MPI 2006 for details and proposed fixes.)
  • R. Latham, R. Ross, R. Thakur, and B. Toonen, "Implementing MPI-IO Shared File Pointers without File System Support," in Proc. of the 12th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2005), Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science, LNCS 3666, Springer, September 2005, pp. 84-93. (ps, pdf)
  • R. Thakur, W. Gropp, and B. Toonen, "Optimizing the Synchronization Operations in MPI One-Sided Communication," Int'l Journal of High Performance Computing Applications, (19)2:119-128, Summer 2005. (ps, pdf)
  • R. Ross, R. Latham, W. Gropp, R. Thakur, and B. Toonen, "Implementing MPI-IO Atomic Mode Without File System Support," in Proc. of the 5th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2005), May 2005. (pdf)
  • R. Thakur, R. Rabenseifner, and W. Gropp, "Optimization of Collective Communication Operations in MPICH," Int'l Journal of High Performance Computing Applications, (19)1:49-66, Spring 2005. (ps, pdf)
  • R. Latham, R. Ross, and R. Thakur, "The Impact of File Systems on MPI-IO Scalability," in Proc. of the 11th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2004), Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science, LNCS 3241, Springer, September 2004, pp. 87-96. (pdf)
  • W. Jiang, J. Liu, H. Jin, D. K. Panda, D. Buntinas, R. Thakur, and W. Gropp, "Efficient Implementation of MPI-2 Passive One-Sided Communication on InfiniBand Clusters," in Proc. of the 11th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2004), Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science, LNCS 3241, Springer, September 2004, pp. 68-76. (pdf)
  • R. Thakur, W. Gropp, and B. Toonen, "Minimizing Synchronization Overhead in the Implementation of MPI One-Sided Communication," in Proc. of the 11th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2004), Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science, LNCS 3241, Springer, September 2004, pp. 57-67. (ps, pdf)
  • J. Lee, X. Ma, R. Ross, R. Thakur, and M. Winslett, "RFS: Efficient and Flexible Remote File Access for MPI-IO," in Proc. of the IEEE Int'l Conference on Cluster Computing (Cluster 2004), September 2004. (pdf)
  • S. Byna, X. Sun, W. Gropp, and R. Thakur, "Predicting Memory-Access Cost Based on Data-Access Patterns," in Proc. of the IEEE Int'l Conference on Cluster Computing (Cluster 2004), September 2004. (pdf)
  • W. Jiang, J. Liu, H. Jin, D. K. Panda, W. Gropp, and R. Thakur, "High Performance MPI-2 One-Sided Communication over InfiniBand," in Proc. of the 4th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2004), April 2004. (pdf)
  • R. Thakur and W. Gropp, "Improving the Performance of Collective Operations in MPICH," in Proc. of the 10th European PVM/MPI Users' Group Meeting (Euro PVM/MPI 2003), Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science, LNCS 2840, Springer, September 2003, pp. 257-267. (ps, pdf)
  • S. Byna, W. Gropp, X. Sun, and R. Thakur, "Improving the Performance of MPI Derived Datatypes by Optimizing Memory-Access Cost," in Proc. of the IEEE Int'l Conference on Cluster Computing (Cluster 2003), December 2003, pp. 412-419. (ps, pdf)


  • R. Thakur, W. Gropp, and E. Lusk, "Optimizing Noncontiguous Accesses in MPI-IO," Parallel Computing, (28)1:83-105, January 2002. (ps, pdf)
  • R. Thakur, W. Gropp, and E. Lusk, "On Implementing MPI-IO Portably and with High Performance," in Proc. of the Sixth Workshop on I/O in Parallel and Distributed Systems, May 1999, pp. 23-32. (ps, pdf)
  • R. Thakur, W. Gropp, and E. Lusk, "Data Sieving and Collective I/O in ROMIO," in Proc. of the 7th Symposium on the Frontiers of Massively Parallel Computation, February 1999, pp. 182-189. (ps, pdf)
  • R. Thakur, W. Gropp, and E. Lusk, "A Case for Using MPI's Derived Datatypes to Improve I/O Performance," in Proc. of SC98: High Performance Networking and Computing, November 1998. (html)
  • R. Thakur, W. Gropp, and E. Lusk, "An Abstract-Device Interface for Implementing Portable Parallel-I/O Interfaces," in Proc. of the 6th Symposium on the Frontiers of Massively Parallel Computation, October 1996, pp. 180-187. (ps, pdf)
  • R. Thakur, R. Ross, E. Lusk, and W. Gropp, "Users Guide for ROMIO: A High-Performance, Portable MPI-IO Implementation," Technical Memorandum ANL/MCS-TM-234, Mathematics and Computer Science Division, Argonne National Laboratory, Revised May 2004. (ps, pdf)