Bassi Detailed Information
IBM Publications
Bassi System Specifications
- IBM Cluster 1600
- p5-575 nodes = POWER 5 nodes of a specific (575) configuration
- 8-way single core SMP nodes
- 1.9 GHz single-core POWER 5 64-bit processors (DCM: Dual Chip Module with one active core)
- 7.6 GFlops/sec theoretical peak per processor
- 2 MB on-board L2 & 36 MB L3 cache
- 200 GB/s max cumulative peak theoretical memory bandwidth
- 32 GB memory per node via 64 x 512 MB pluggable DIMMs
- 48 GB/s theoretical peak I/O bandwidth via 6 GX+ buses
- One 2-link High Performance Switch adapter
- Four integrated 10/100/1000Mb Ethernet ports
- Production System
- 111 Compute Nodes (888 Compute Processors)
- 2 Interactive Nodes
- 3 Spare Nodes
- 6 Storage Nodes
- 1 p550 Cluster Systems Management (CSM) node
- 12 Hardware Management Console (HMC) systems with 3.06 GHz Xeon processors
- ~100 TB formatted usable RAID GPFS global file storage
- 111*8*7.6 GFlops/sec = 6.7 TFlops/sec theoretical system peak
- I/O Subsystem Configuration
- 6 Virtual Shared Disk servers to support GPFS, each with 2 High Performance Switch (HPS) links. (See next section)
- Each Virtual Shared Disk server has 16 x 2 Gb Fibre Channel links.
- Disk subsystem consists of 24 IBM DS4300 storage systems, each with 42 x 146 GB drives configured as 8 RAID 5 4+P arrays, with 2 hot spares per DS4300. Each DS4300 has dual controllers, and each controller has dual Fibre Channel ports.
- Servers and controllers are configured for failover, so each server acts as primary for 8 controller ports and as backup for another 8 controller ports.
- Each array is a separate LUN. The scratch filesystem has 118 LUNs, of which 12 are used for 2 metadata replicas. Data and metadata on scratch are separated for performance.
- High Performance Switch (HPS) (Also known as "Federation")
- Each node: One 2-link High Performance Switch adapter
- Each node: attaches to the interconnect with 2 links, one to each of 2 separate planes
- LAPI and MPI communication
- GPFS control uses IP over HPS; data uses LAPI over HPS
- Peak HPS bandwidth - 2 GB/s per link each direction
- MPI latency less than 5 µs
- The network uses 123 of 128 available connections on each plane.
- Racks and Frames
- p5-575 nodes are housed in a 2U by 24-inch wide by 48-inch deep node package
- Up to 12 p5-575 modules and 2 High Performance Switches can be contained in a single 24-inch IBM system frame
- There are 12 system frames housing production system p575 nodes, node switches and I/O drawers.
- Software (original version in parentheses)
- IBM AIX Version 5.3 ML 4 (5.2 ML 5)
- IBM Parallel Environment (PE) V. 4.2
- IBM C/C++ Enterprise Edition V 7.0
- IBM Fortran Enterprise Edition V 9.1
- IBM LoadLeveler 3.3
- IBM Cluster Systems Management (CSM) V. 1.4
- IBM General Parallel Filesystem (GPFS) V. 2.3
- IBM Parallel Engineering Scientific Subroutines Library (PESSL) V 3.2
- IBM Engineering Scientific Subroutines Library (ESSL) V 4.2
- HPC Toolkit for POWER5 (no later than 4Q 2005)
- AIX Toolbox for Linux Applications
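
Several of the figures in the list above follow from simple arithmetic, which the short Python sketch below reproduces. It is only an illustration: the variable names are made up for this example, and the unformatted disk capacity it derives is an estimate computed from the listed drive counts, not a number stated in the specifications.

```python
# Figures taken from the specification list above; all names are illustrative.
CLOCK_GHZ = 1.9            # POWER5 clock rate
FLOPS_PER_CYCLE = 4        # maximum flops per cycle per processor
PROCS_PER_NODE = 8         # 8-way SMP nodes
COMPUTE_NODES = 111

# Theoretical peak per processor and for the whole production system.
peak_per_proc_gflops = CLOCK_GHZ * FLOPS_PER_CYCLE
compute_procs = COMPUTE_NODES * PROCS_PER_NODE
system_peak_tflops = compute_procs * peak_per_proc_gflops / 1000

# Disk subsystem: 24 DS4300 units, each with 8 RAID 5 (4+P) arrays
# and 2 hot spares.
DS4300_UNITS = 24
ARRAYS_PER_UNIT = 8
DATA_DRIVES_PER_ARRAY = 4
PARITY_DRIVES_PER_ARRAY = 1
HOT_SPARES_PER_UNIT = 2
DRIVE_GB = 146

drives_per_unit = (
    ARRAYS_PER_UNIT * (DATA_DRIVES_PER_ARRAY + PARITY_DRIVES_PER_ARRAY)
    + HOT_SPARES_PER_UNIT
)
total_luns = DS4300_UNITS * ARRAYS_PER_UNIT  # each array is a separate LUN

# Assumption: decimal units (1 TB = 1000 GB) and no file system overhead,
# so this is an upper bound rather than the formatted usable capacity.
raw_data_tb = total_luns * DATA_DRIVES_PER_ARRAY * DRIVE_GB / 1000

print(f"Compute processors:    {compute_procs}")
print(f"Peak per processor:    {peak_per_proc_gflops:.1f} GFlops/sec")
print(f"System peak:           {system_peak_tflops:.1f} TFlops/sec")
print(f"Drives per DS4300:     {drives_per_unit}")
print(f"Total LUNs:            {total_luns}")
print(f"Unformatted data size: {raw_data_tb:.0f} TB")
```

The processor count, per-processor peak, system peak, and drives per DS4300 match the listed values of 888, 7.6 GFlops/sec, 6.7 TFlops/sec, and 42. The unformatted data size comes out somewhat above the ~100 TB formatted figure, as would be expected before formatting overhead.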
Bassi p575 Power 5 Processors
The following image depicts dual-core chips; each Bassi chip has a single active core.
- 1.9 GHz single core Power 5 chips
- 1.92 MB Level 2 Cache per CPU (10-way associative, 3x640 KB)
- 36 MB Level 3 Cache per CPU
- Processor-to-memory bandwidth (peak): ~12 GB/s (per chip?)
- L3 to L2 peak bandwidth: 243.2 GB/s (per node?)
- 120 rename registers for integer and floating point
- maximum 5 instructions per cycle
- maximum 4 flops per cycle
- 64 KB 2-way associative Instruction cache
- 32 KB 4-way associative Data cache
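
The cache figures in the list above can be cross-checked in the same spirit. The Python sketch below is a rough illustration with made-up variable names. It assumes the 243.2 GB/s L3-to-L2 figure is a per-node total, which the list itself marks as uncertain, and it assumes 8 processors per node as stated in the system specifications.

```python
# Figures from the processor list above; all names are illustrative.
CLOCK_GHZ = 1.9
L2_SLICES = 3
L2_SLICE_KB = 640
L3_MB_PER_CPU = 36
CPUS_PER_NODE = 8          # from the system specifications (8-way SMP nodes)
L3_TO_L2_GBPS = 243.2      # listed with "(per node?)"; treated as per node here

# L2 size per CPU: three 640 KB slices.
l2_kb_per_cpu = L2_SLICES * L2_SLICE_KB

# Aggregate L3 across the 8 processors in a node.
l3_mb_per_node = L3_MB_PER_CPU * CPUS_PER_NODE

# Assumption: if 243.2 GB/s is a per-node total, each CPU gets one eighth,
# which can then be expressed as bytes moved per clock cycle.
l3_to_l2_per_cpu_gbps = L3_TO_L2_GBPS / CPUS_PER_NODE
bytes_per_cycle = l3_to_l2_per_cpu_gbps / CLOCK_GHZ

print(f"L2 per CPU:           {l2_kb_per_cpu} KB")
print(f"L3 per node:          {l3_mb_per_node} MB")
print(f"L3 to L2 per CPU:     {l3_to_l2_per_cpu_gbps:.1f} GB/s")
print(f"L3 to L2 bytes/cycle: {bytes_per_cycle:.1f}")
```

Under these assumptions the three slices give 1920 KB, consistent with the listed 1.92 MB, and the per-node reading of the L3-to-L2 figure corresponds to about 16 bytes per cycle per CPU. That round number makes the per-node reading plausible but does not confirm it.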