Advanced Development

In the field of high-performance computing, there is a saying that if you are standing still, you are really falling behind. On the other hand, just grabbing the newest technology in the hope of keeping ahead is not necessarily a prudent course. Knowing what is looming just over the horizon—and helping shape it before it comes completely into view—has long been a hallmark of the NERSC staff.

By working in close partnership with leading developers of high-performance computing systems and software, NERSC staff share their collective expertise and experience to help ensure that new products can meet the demands of computational scientists. Vendors benefit from this relationship in that NERSC serves as a real-world testbed for innovative ideas and technologies. This section presents current examples of how NERSC is working in partnership to advance the state of scientific computing.

Creating Science-Driven Computer Architecture

In 2002 Lawrence Berkeley and Argonne national laboratories, in close collaboration with IBM, proposed a strategy to create a new class of computational capability that is optimal for science. The white paper “Creating Science-Driven Computer Architecture: A New Path to Scientific Leadership” envisioned development partnerships in which teams of scientific applications specialists and computer scientists would work with computer architects from major U.S. vendors to create hardware and software environments that would allow scientists to extract the maximum performance and capability from the hardware. The paper included a conceptual design for a system called Blue Planet that would enhance IBM’s Power series architecture for scientific computing.

The first implementation of this strategy in 2003 involved a series of meetings between NERSC, IBM, and Lawrence Livermore National Laboratory (LLNL). IBM had won a contract to build the ASCI Purple system at LLNL, the world’s first supercomputer capable of up to 100 teraflop/s, powered by 12,544 Power 5 processors. LLNL and IBM scientists thought that they might enhance the performance of ASCI Purple by incorporating key elements of the Blue Planet system architecture, particularly the node configuration, and so NERSC staff joined in discussions to work out the details. The resulting architecture enhancements are expected to benefit all scientific users of future IBM Power series systems.

Making Archives More Accessible

As a long-time collaborator in the development of the High Performance Storage System (HPSS), NERSC made two major contributions this year. The first was the Grid authentication code for the Parallel FTP (PFTP) file transfer protocol; this enhancement to PFTP is now being distributed by IBM. The second contribution, made in collaboration with Argonne National Laboratory, is a testbed for the Globus/Grid interface with HPSS.

NERSC is also collaborating with DOE’s Joint Genome Institute (JGI) on a project to optimize genomic data storage for wide accessibility. This project will distribute and enhance access to data generated at JGI’s Production Genome Facility (PGF) and will archive the data on NERSC’s HPSS. The PGF is one of the world’s largest public DNA sequencing facilities, producing 2 million trace data files each month (25 KB each), 100 assembled projects per month (50–250 MB each), and several very large assembled projects per year (~50 GB each). NERSC storage will make these data available to genome researchers more quickly and easily. Storage will be configured to hold more data online and to move the data intelligently to enable high-speed access.
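As a rough sense of scale, the figures above work out to roughly 65 GB of new data per month, before the occasional ~50 GB projects are counted. The short Python sketch below simply carries out that arithmetic from the quoted numbers.

    # Rough estimate of the PGF's monthly archive growth, using the figures above.
    # The "assembled project" size is taken at the midpoint of the quoted 50-250 MB range.

    KB, MB, GB = 1e3, 1e6, 1e9

    trace_files_per_month = 2_000_000
    trace_file_size       = 25 * KB          # ~25 KB per trace file
    assembled_per_month   = 100
    assembled_size        = 150 * MB         # midpoint of 50-250 MB

    monthly_bytes = (trace_files_per_month * trace_file_size
                     + assembled_per_month * assembled_size)

    print(f"~{monthly_bytes / GB:.0f} GB per month")   # roughly 65 GB/month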

NERSC and JGI staff will work closely to improve the organization and clustering of DNA trace data so that portions of an assembled dataset can be downloaded instead of the entire project, which could contain millions of trace files. The Web interface being developed will for the first time correlate the DNA trace data with portions of the assembled sequence, simplifying the selection of trace data, reducing the amount of data that must be handled, and enhancing researchers’ ability to generate partial comparisons. Cached Web servers and file systems will extend the archive to the wide area network, distributing data across multiple platforms based on data characteristics, then unifying data access through one interface. When this project is completed, its storage and data movement techniques, together with its data interfaces, could serve as a model for other read-only data repositories.
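The details of the interface are still being worked out, but the underlying idea can be sketched as follows. The index structure and function names below are purely illustrative assumptions, not the project's actual design; they only show how a region of an assembled sequence might be mapped back to the trace files that overlap it, so that researchers download those traces rather than the entire project.

    # Illustrative sketch only: a hypothetical index that maps regions of an
    # assembled sequence back to the trace files that contributed to them.

    from collections import defaultdict

    # hypothetical index: contig name -> list of (start, end, trace_file_id)
    trace_index = defaultdict(list)

    def register_trace(contig, start, end, trace_id):
        """Record that trace_id aligns to contig[start:end] in the assembly."""
        trace_index[contig].append((start, end, trace_id))

    def traces_for_region(contig, start, end):
        """Return the trace file IDs overlapping the requested assembly region."""
        return [tid for (s, e, tid) in trace_index[contig] if s < end and e > start]

    # Example: select only the traces covering part of one contig, then hand
    # the resulting IDs to the archive for retrieval.
    register_trace("contig_1", 0, 800, "trace_000001")
    register_trace("contig_1", 9_500, 10_300, "trace_000002")
    print(traces_for_region("contig_1", 9_000, 19_000))   # -> ['trace_000002']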

Streamlining File Storage

In 2001, NERSC began a project to streamline its file storage. In a typical HPC environment, each large computational system has its own local disk along with access to additional network-attached storage and archival storage servers. Such an environment prevents the consolidation of storage between systems, limiting the working storage available on each system to its local disk capacity. The result is unnecessary replication of files on multiple systems, an increased workload on users to manage their files, and a burden on the infrastructure to support file transfers between the various systems.

NERSC is developing a solution using existing and emerging technologies to overcome these inefficiencies. The Global Unified Parallel File System (GUPFS) project aims to provide a scalable, high-performance, high-bandwidth, shared-disk file system for use by all of NERSC’s high-performance production computational systems. GUPFS will provide a unified file namespace for these systems and will be integrated with HPSS. Storage servers, accessing the consolidated storage through the GUPFS shared-disk file systems, will provide hierarchical storage management, backup, and archival services. An additional goal is to distribute GUPFS-based file systems to geographically remote facilities as native file systems over the DOE Science Grid.
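A minimal sketch of what a unified namespace means in practice appears below. The mount point and helper functions are hypothetical; the point is only that every system sees the same path, so data written by one machine can be read directly by another with no staging step.

    # Illustrative sketch: with a shared file system, the same path is visible
    # from every computational system. The path below is a hypothetical mount
    # point, not an actual NERSC file system.

    from pathlib import Path

    shared = Path("/gupfs/projects/myproject")      # hypothetical shared mount

    def write_result(name, data):
        (shared / name).write_bytes(data)           # e.g. from the compute system

    def read_result(name):
        return (shared / name).read_bytes()         # e.g. from the analysis host

    # Without a shared file system, the same workflow requires an explicit
    # copy (scp, pftp, ...) between each system's local disk.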

When fully implemented, GUPFS will eliminate unnecessary data replication, simplify the user environment, provide better distribution of storage resources, and permit the management of storage as a separate entity while minimizing impacts on the computational systems.

The major enabling components of this envisioned environment are a high-performance shared-disk file system and a cost-effective, high-performance storage area network (SAN). These emerging technologies, while evolving rapidly, are not targeted at the needs of high-performance scientific computing. The GUPFS project intends to encourage the development of these technologies to support HPC needs through collaborations with other institutions and vendors, and through active development.

During the first three years of the project, the GUPFS team (Figure 1) tested and evaluated shared-disk file systems, SAN technologies, and other components of the GUPFS environment. This investigation included open source and commercial shared-disk file systems, new SAN fabric technologies as they became available, SAN and file system distribution over the wide-area network, HPSS integration, and file system performance and scaling.

Figure 1
Rei Lee, Cary Whitney, Will Baird, and Greg Butler, along with (not shown) Mike Welcome (Computational Research Division) and Craig Tull, are working to develop a high-performance file storage system that is easy to use and independent of computing platforms.

In 2003, following extended trials with multiple storage vendors, NERSC selected the YottaYotta NetStorager system to address the scalability and performance demands of the GUPFS project. A major goal is to give thousands of heterogeneous hosts predictable, scalable access to a common pool of storage resources with no increase in operational costs or complexity.

Enabling Grid-Based Visualization

The Berkeley Lab/NERSC visualization group is developing a Web portal interface called VisPortal that will deliver Grid-based visualization and data analysis capabilities from an easy-to-use, centralized point of access. Using standard Globus/Grid middleware and off-the-shelf Web automation, VisPortal hides the underlying complexity of resource selection and application management among heterogeneous distributed computing and storage resources. From a single access point, the user can browse through the data, wherever it is located, access computing resources, and launch all the components of a distributed visualization application without having to know all the details of how the various systems are configured. The portal can also automate complex workflows like the distributed generation of MPEG movies or the scheduling of file transfers. The VisPortal developers are currently working with NERSC users to improve the workflow management and robustness of the prototype system.
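The sketch below illustrates the kind of workflow the portal automates, using distributed MPEG generation as the example. Every function, host name, and file path here is a hypothetical placeholder standing in for the Globus/Grid machinery the portal actually uses; it is a sketch of the workflow shape, not VisPortal's real API.

    # Rough sketch of a portal-style distributed visualization workflow.
    # All names below are illustrative placeholders.

    def select_resource(requirements):
        """Placeholder: pick a compute resource that satisfies the requirements."""
        return "render-cluster.example.org"

    def launch_render_job(resource, dataset, frame_range):
        """Placeholder: start a remote rendering job and return a handle."""
        return {"resource": resource, "dataset": dataset, "frames": frame_range}

    def gather_frames(job_handles):
        """Placeholder: collect rendered frames from each remote job."""
        return [f for job in job_handles for f in range(*job["frames"])]

    def encode_mpeg(frames, output):
        """Placeholder: hand the frames to an encoder and archive the movie."""
        print(f"encoding {len(frames)} frames into {output}")

    # Split the frame range across resources, render in parallel, then encode.
    resources = [select_resource({"cpus": 32}) for _ in range(2)]
    jobs = [launch_render_job(r, "/data/run42.h5", (i * 100, (i + 1) * 100))
            for i, r in enumerate(resources)]
    encode_mpeg(gather_frames(jobs), "run42.mpg")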

A Testbed for Cluster Software

Three years ago, Berkeley Lab purchased a 174-processor Linux cluster, named Alvarez in honor of Berkeley Lab Nobel Laureate Luis Alvarez, to provide a system testbed for NERSC and other staff to evaluate cluster architectures as an option for procurement of large production computing systems. Recently the system has also proven useful for vendors looking to test and develop software for large-scale systems, such as Unlimited Linux and operating system components for Red Storm.

Unlimited Scale, Inc. (USI) used Alvarez to test a beta version of Unlimited Linux, an operating system designed to provide, on commodity processors and interconnects, production capabilities that match or exceed those of the Cray T3E supercomputer, once the workhorse of the NERSC Center. USI asked NERSC to assess Unlimited Linux’s operational and administrative functionality, performance, user functionality, and desired features and enhancements. A few minor problems were discovered in adapting Unlimited Linux to the Alvarez network addressing scheme, and USI will change its configuration tools to make the system easier to install on existing hardware configurations. Performance benchmarks showed that Unlimited Linux compares favorably with the IBM system software that runs on Alvarez.

Cray Inc. was given access to Alvarez to test the operating system it is developing, together with DOE’s Sandia National Laboratories, for the Red Storm supercomputer, part of the National Nuclear Security Administration’s Advanced Simulation and Computing (ASCI) program. The first round of testing covered system software, system administration, and scalability. Because the Linux-based Red Storm software can simulate multiple virtual processors per physical processor, simulations of up to 1,000 processors were run on Alvarez. A set of scripts was developed for system configuration and easier job launch, and experiments provided initial data on launch times for programs with small and large executables. The team found and fixed approximately 20 system bugs, from the Portals layer up through MPI and system launch.
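The launch-time experiments can be illustrated with a minimal timing harness of the following sort; the launcher command and executable names are placeholders, not the actual Alvarez or Red Storm tooling.

    # Illustrative harness: time how long a parallel launcher takes to start
    # and complete a trivial run for executables of different sizes.
    # 'mpirun' and the executable paths are placeholders.

    import subprocess, time

    def time_launch(launcher, nprocs, executable):
        """Return wall-clock seconds to launch and complete a trivial run."""
        start = time.time()
        subprocess.run([*launcher, "-n", str(nprocs), executable], check=True)
        return time.time() - start

    for exe in ["./hello_small", "./hello_large"]:   # e.g. small vs. statically linked
        for nprocs in [16, 64, 256, 1000]:
            print(exe, nprocs, f"{time_launch(['mpirun'], nprocs, exe):.2f}s")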

In the second stage of testing, Cray’s I/O team is working with NERSC to use the GUPFS testbed platform, both on its own and joined with Alvarez, for scalability testing of two candidate Red Storm file systems, PVFS and Lustre. The team will also collaborate with NERSC staff who are working on global file systems that will eventually be deployed within the NERSC Center.
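A minimal example of the kind of scalability measurement involved is sketched below, assuming an MPI environment with mpi4py: each rank writes its own disjoint slice of a common file, and the slowest rank's time bounds the aggregate bandwidth. The file path and transfer size are illustrative choices, not the parameters of the actual tests.

    # Minimal parallel-write bandwidth sketch for a shared file system
    # (e.g. PVFS or Lustre), using MPI-IO through mpi4py.

    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    block = 64 * 1024 * 1024                     # 64 MB per rank (illustrative)
    buf   = bytearray(block)

    fh = MPI.File.Open(comm, "/scratch/iotest.dat",
                       MPI.MODE_CREATE | MPI.MODE_WRONLY)
    comm.Barrier()
    t0 = MPI.Wtime()
    fh.Write_at(rank * block, buf)               # each rank writes a disjoint slice
    fh.Close()
    elapsed = comm.reduce(MPI.Wtime() - t0, op=MPI.MAX, root=0)

    if rank == 0:
        gb = size * block / 1e9
        print(f"{size} ranks wrote {gb:.1f} GB at {gb / elapsed:.2f} GB/s")

Repeating such a run at increasing rank counts gives the scaling curve that distinguishes one candidate file system from another.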
