

Parallel Systems Software

The Parallel Systems Software research capability of the Scalable Computing Systems department in CCIM supports the needs of the ASC program by providing the software foundation that enables Massively Parallel Processors (MPPs) to scale to thousands of processors. In collaboration with the University of New Mexico, Sandia personnel have developed operating systems, communications technology, and run-time system software to provide these capabilities. Sandia National Laboratories has also performed pioneering work on the software needed to enable large-scale cluster computers. We continue this tradition with research in communications technology, operating systems, and system software that will enable novel architectures and system scaling to unprecedented levels.



Sandia's open-source software used as communication middleware for high performance computing and scalable parallel file systems.


Areas of Research:

  • Alternative Programming Models: Although MPI is the most commonly used programming model at Sandia, some within the broader research community believe that it could ultimately impose barriers to higher productivity and greater scalability. Researchers at Sandia are beginning to investigate alternative programming models such as Unified Parallel C (UPC). (Contact: Zhaofang Wen)
  • Configurable Operating/Runtime Systems: This is a collaborative research project with the University of New Mexico and the California Institute of Technology to design and implement a framework for configuring, building, and deploying application-specific operating and runtime systems for petascale scientific computing environments.
    (Contact: Ron B. Brightwell)
  • Georgia Tech Collaboration: Sandia is collaborating with Georgia Tech to enable high performance simulation of large scale supercomputer networks on large scale supercomputers. The goal is to leverage a variety of simulation techniques to provide both high fidelity simulations and accurate simulations using high-level network models.
    (Contact: Keith D. Underwood)
  • Light Weight Kernel Development:  The success of Cougar on ASCI Red led to its selection as the OS for ASCI Red Storm.  The Scalable Computing Systems department is contributing to the effort to implement the next version of Cougar (called Catamount) and develop the full system software environment for ASCI Red Storm.
    (Contacts: John P. Vandyke and Kevin Pedretti)
  • Light Weight Kernel Research: The light weight kernel (LWK) known as Cougar is a fundamental component of the success of ASCI Red. Indeed, the LWK approach has prevented the "Rogue OS" effects seen on other large-scale systems. LWK research seeks to extend this technology to new architectures and to system scales of 100,000 processors or more. (Contact: Ron B. Brightwell)
  • Network Simulation: In an effort to validate future generations of supercomputers before buying them, this project seeks to build a simulation infrastructure to allow for the ready exploration of the supercomputer architecture space. The goal is for real applications to run on high fidelity simulations of both the processor and the network. (Contact: Rolf E. Riesen)
  • Network Usage Analysis: A key aspect of application performance is the way in which the application uses the network. The network usage model is also a key parameter for optimizing the network and communications libraries. This work centers on MPI-level network usage analysis.
    (Contacts: Ron B. Brightwell and Keith D. Underwood)
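
As a rough illustration of the kind of summary that MPI-level network usage analysis might produce, the sketch below groups message records into power-of-two size buckets and totals the bytes sent between each pair of ranks. It is not Sandia's tooling: the record format (source rank, destination rank, size in bytes), the function name summarize_messages, and the sample trace are all assumptions made for this example.

```python
from collections import Counter


def summarize_messages(records):
    """Summarize hypothetical (source, dest, nbytes) message records.

    Returns (size_histogram, bytes_per_pair). size_histogram maps the
    upper bound of a power-of-two size bucket to a message count, and
    bytes_per_pair maps (source, dest) to the total bytes sent.
    """
    size_histogram = Counter()
    bytes_per_pair = Counter()
    for source, dest, nbytes in records:
        # Smallest power of two >= nbytes; 0- and 1-byte messages share bucket 1.
        bucket = 1 if nbytes <= 1 else 1 << (nbytes - 1).bit_length()
        size_histogram[bucket] += 1
        bytes_per_pair[(source, dest)] += nbytes
    return size_histogram, bytes_per_pair


# Made-up trace for illustration: (source rank, destination rank, bytes).
trace = [(0, 1, 8), (0, 1, 1000), (1, 0, 1024), (2, 0, 0)]
hist, pairs = summarize_messages(trace)
```

A size histogram of this kind indicates whether an application is dominated by small, latency-sensitive messages or large, bandwidth-bound ones, and the per-pair totals give a coarse view of its communication pattern.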


Program Contact: Neil D. Pundit



Maintained by: Bernadette M. Watts
Modified on: May 6, 2008