Franklin I/O Upgrade and System Stabilization

NERSC and Cray are making improvements to Franklin through April 2009 to address stability and I/O performance issues. Upgrades include configuration changes, the addition of I/O hardware, and operating system bug fixes.

1. Latest Status

1.1 New scratch File Systems

An additional scratch file system was added on April 1, 2009. Users now have access to file systems defined by the environment variables $SCRATCH and $SCRATCH2. During the upgrade on March 31 - April 1, all files in the existing scratch file system were erased when the disks were reformatted. No backups are available from NERSC.

Jobs that were in the queue prior to the upgrade have been placed in a "User Hold" status. Users can release the jobs with the "qrls" command, but all required files in /scratch must be recreated in place or the jobs will fail.
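For example, a held job can be inspected and then released from the command line once its input files have been restored (the job ID below is a placeholder):

    # List your jobs; held jobs appear with state "H"
    qstat -u $USER

    # After recreating the required files in /scratch, release the job
    qrls 123456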

1.2 Batch Wallclock Limit Reduced to 24 Hours

The maximum wallclock limit was set to 24 hours on March 17; the previous limit was 36 hours. This change was made to reduce the system idle time associated with draining the machine before a scheduled maintenance. The 36-hour limit is expected to return in the summer of 2009.
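For illustration, batch scripts must now request a wallclock time of 24 hours or less. The sketch below uses typical Cray XT/PBS directives; the queue name, core count, and executable name are placeholders:

    #!/bin/bash
    #PBS -q regular
    #PBS -l walltime=24:00:00
    #PBS -l mppwidth=256

    cd $PBS_O_WORKDIR
    # Launch on 256 cores; adjust -n to match mppwidth
    aprun -n 256 ./my_executable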

2. Schedule

Note: The dates in the table below are subject to change. Each maintenance period is expected to start at 7:00 AM Pacific Time; exact durations will be announced closer to each date. System status and maintenance progress can be monitored at the NERSC MOTD.

Event                                                            Dates
Instability analysis and stabilization plans                     February 26 - March 16
OS upgrade and stability patches                                 March 17, 7:00 AM - 11:00 PM PT
I/O hardware upgrades, hardware repairs, and stability testing   March 24, 7:00 AM - 6:00 PM PT
Final I/O upgrades and reformat of the scratch file system       March 31, 7:00 AM - April 1
Stabilization/preparation for quad-core system acceptance        April 1 - May 4; a few scheduled outages TBD (first outage: April 7, 7:00 AM - 3:00 PM PT)
System acceptance                                                May 4 - July 4 (or earlier); scheduled outages TBD

3. Changes to Franklin I/O

3.1 Summary

  • Upgrade the interactive network adapters (PCI to PCI-e) to improve network performance between the interactive nodes and other NERSC systems, including NGF (/project).
  • Double the number of I/O service nodes and upgrade their networking cards (PCI to PCI-e) to improve scratch I/O performance.
  • Install service nodes for Data Virtualization Services (DVS) to be able to export NGF (/project) directly to compute nodes later this year.

3.2 Scratch file system upgrade and reconfiguration

  • New hardware was installed to add bandwidth to Franklin's I/O subsystem and to improve metadata operations (such as creating new files and using the "ls" command). It also has the potential to improve stability and reduce I/O contention among jobs.
  • The existing /scratch file system was reformatted on March 31, 2009 as part of the new hardware installation. All existing files were permanently deleted as a result.
  • Batch jobs that were queued prior to the March 31 maintenance were placed into a "user hold" status. If those jobs require files in /scratch to run successfully, the files need to be recreated in place before the jobs are released from hold. Jobs that have not been released will be deleted on April 14.
  • In anticipation of intensive HPSS access, the number of simultaneous HPSS connections per user was reduced to 5 on April 1, 2009. The previous limit of 15 will be reinstated based on observed demand.
  • A new scratch file system, named /scratch2, was added on April 1, 2009:
    • Both /scratch and /scratch2 have the same peak bandwidth (~16 GB/sec) and storage (~200 TB).
    • Users may run jobs out of either scratch file system, but are encouraged to choose one for their primary work.
    • The environment variables $SCRATCH and $SCRATCH2 are set to the paths of a user's scratch directories (see the example after this list).
    • Both scratch file systems have a default stripe count of 2. (Previously, /scratch used a default of 4.)
    • User disk quotas remain the same. A user will not be able to run jobs if the total combined disk usage in $SCRATCH and $SCRATCH2 is over his or her quota.
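As a sketch of how the two scratch file systems might be used from the command line (the directory name and stripe count are illustrative, and the lfs option syntax may differ slightly with the installed Lustre version):

    # Work out of the new scratch file system
    cd $SCRATCH2
    mkdir -p my_run

    # Check usage against quota on both scratch file systems
    lfs quota -u $USER /scratch
    lfs quota -u $USER /scratch2

    # Inspect the striping of the new directory (default stripe count is 2)
    lfs getstripe my_run

    # Optionally raise the stripe count for large-file I/O
    lfs setstripe -c 4 my_run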

4. Node Configuration Changes

The following table lists the number of service nodes and compute nodes before and after the Franklin I/O upgrades.

Franklin Nodes Configuration

Node Type                              Before Upgrade                        After Upgrade
Compute Nodes                          9,660                                 9,532
Spare Compute Nodes                    20                                    60
Login Nodes                            10 (also served as MOM nodes)         10 (distinct login nodes)
Batch Management Nodes (MOM nodes)     16 (10 also served as login nodes)    6 (distinct MOM nodes)
I/O Server Nodes                       32                                    56
DVS Server Nodes                       0                                     20

5. Configuration Improvements

Note: Configuration changes will be added to the list below once they are scheduled for implementation.
  • Move the PBS batch management functionality off the interactive nodes to separate nodes to improve job reliability and interactive responsiveness. This work was completed on March 3, 2009. Franklin now has 10 login nodes and 6 batch management nodes.
  • Increase Lustre timeouts in order to allow more jobs to complete their I/O rather than be terminated. This change was made on March 10 (see the sketch after this list).
  • Plan to convert the scratch disks from RAID5 to RAID6, which tolerates two concurrent disk failures instead of one and will improve data reliability.
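As background on the timeout change above, the Lustre RPC timeout is a per-node setting. A minimal sketch of how it can be inspected and raised on a Lustre 1.6-era client follows (requires root; the value shown is illustrative rather than the setting chosen for Franklin):

    # Show the current Lustre RPC timeout in seconds
    cat /proc/sys/lustre/timeout

    # Raise the timeout so that slow I/O is retried rather than aborted
    echo 300 > /proc/sys/lustre/timeout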

Page last modified: Thu, 02 Apr 2009 21:46:49 GMT
Page URL: http://www.nersc.gov/nusers/systems/franklin/IO_upgrade.php
Web contact: webmaster@nersc.gov
Computing questions: consult@nersc.gov
