NASA Center for Computational Sciences
palm Software and Tools

[Adapted from the NASA Advanced Supercomputing (NAS) Division's SGI Altix 3000 Software documentation, accessed 1/26/05.]

Linux System Utilities
+ hinv
+ topology
+ uptime
+ ps
+ top and gtop
+ free
+ w

PD Software Modules
+ pd-hdf.5-1.6.4
+ pd-ncarg.4.4.1
+ pd-petsc.2.3.0-complex.le
+ pd-sun-java-sdk.1.4.2_06
+ pd-xdiff-3.4
+ pd-grads.1.9b4
+ pd-nedit-5.5-x86

Links to Other Useful Documents

Debuggers
+ Etnus Totalview
+ Intel idb
+ GNU gdb
+ ddd

Performance Analysis Tools
+ pfmon
+ profile.pl
+ Histx


Debuggers

  • Etnus TotalView

    Etnus TotalView is awaiting licensing on palm and is not yet available.
  • Intel idb

    This debugger comes bundled with the Intel compilers and supports MPI and OpenMP parallel programming models.

    The Intel Debugger is a symbolic source code debugger for programs compiled by the Intel(R) C/C++ Compiler, the Intel(R) Fortran Compiler, and the GNU compilers (gcc, g++). For full source-level debugging, compile the source code with the compiler option that includes symbol table information in the executable file. Intel idb supports DBX and GDB modes. In GDB mode, the Intel Debugger operates like the GNU Debugger, GDB.

    For more information, check out man idb and the 'Intel Debugger (IDB) Manual' available under /opt/intel/comp/7.1.032/compiler70/docs.

  • GNU gdb (with Fortran Extension)

    The GNU Project Debugger is available from the Free Software Foundation. GNU gdb with Fortran extensions supports Fortran 95.

    Stock GDB (without the Fortran extensions) can debug programs written in C, C++, and Modula-2. Fortran support will be added to it when a GNU Fortran compiler is ready.

    For more information, check out man gdb or the gdb web page on the Red Hat website.

  • ddd - The Data Display Debugger

    DDD is a graphical front-end for GDB and other command-line debuggers. DDD works best with GDB, but can work with idb in dbx mode.

    Ex:
    ddd --debugger idb --dbx ./a.out

    For more information, read man ddd or the ddd web page on the Red Hat website.



Performance Analysis Tools

  • pfmon

    pfmon is a performance tuning tool written by S. Eranian at HP Labs. pfmon allows users to collect performance data at the command line. It uses the Itanium Performance Monitoring Unit (PMU) to do counting and sampling on unmodified binaries. The Itanium and Itanium 2 define approximately 230 and 470 different events, respectively. Since there are only four counters, at most 4 of the 230 events (or 470 for Itanium 2) can be measured at any given time. It is therefore often necessary to run an executable several times to get complete coverage of all the relevant performance aspects.
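
Because only four events fit in one run, a longer list of events has to be spread over several runs. The following is a minimal sketch of that bookkeeping in Python, assuming the `-e` comma-separated event syntax and the `--` separator described in this section. The helper names `batch_events` and `pfmon_commands` are hypothetical, and the grouping does not check whether the events in a batch may be counted together (pfmon's --check-events-only option does that).

```python
def batch_events(events, counters=4):
    """Split a list of event names into groups of at most `counters` events."""
    return [events[i:i + counters] for i in range(0, len(events), counters)]


def pfmon_commands(events, program, counters=4):
    """Build one pfmon argument list per batch of events.

    Event names are joined with commas and no spaces, as pfmon requires.
    """
    return [
        ["pfmon", "-e" + ",".join(batch), "--", program]
        for batch in batch_events(events, counters)
    ]


if __name__ == "__main__":
    # Five events need two runs on a PMU with four counters.
    wanted = [
        "cpu_cycles", "ia64_inst_retired_this", "nops_retired",
        "back_end_bubble_all", "fp_ops_retired",
    ]
    for cmd in pfmon_commands(wanted, "./a.out"):
        print(" ".join(cmd))
```

Each printed line is one pfmon invocation; the runs can then be executed one after another and their counts collected separately.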

    The following lists a few useful pfmon commands:

    • pfmon -help or pfmon -h
      A built-in help facility to show what pfmon options are available and what they do.

    • pfmon -l
      Lists the events supported by the host PMU.

    • pfmon -i substring
      Finds information about an event.

    • Example:
      %pfmon -i cpu_cycles
      Name   : CPU_CYCLES
      VCode  : 0x12
      Code   : 0x12
      PMD/PMC: [ 4 5 6 7 ]
      Umask  : 0000
      EAR    : No (N/A)
      BTB    : No
      MaxIncr: 1  (Threshold 0)
      Qual   : None
      Group  : None
      Set    : None
      Desc   : CPU Cycles

    • pfmon --outfile=filename ./a.out
      Prints counts in a file called "filename"

    • pfmon --with-header ./a.out
      Generates a machine description header with results

    • pfmon -eCPU_CYCLES --us-counter-format ./a.out
      or
      pfmon -ecpu_cycles --us-counter-format ./a.out
      Reports the total number of CPU cycles needed by the program a.out, and reports the result in a US number format (123,456,789). As shown, the counter name (ex: CPU_CYCLES) is case insensitive.

    • pfmon -ecpu_cycles,ia64_inst_retired_this,nops_retired,back_end_bubble_all ./a.out
      This is recommended as the first step in using pfmon: it counts cycles, instructions, and no-ops (no-operation instructions) retired, as well as back-end stall cycles.

      Note: Do not leave a space between the commas and the event names.
      The event NOPS_RETIRED reports the number of retired nop.i, nop.m, nop.b, and nop.f instructions. These no-ops are added to the instruction bundles when no other useful instructions can be found to execute.

    • pfmon -ecpu_cycles,fp_ops_retired a.out

      The event fp_ops_retired tracks retired floating-point operations. To calculate the FLOPS (floating point operations per second), use:

      FLOPS = (count from fp_ops_retired / count from cpu_cycles) * cpu clock speed

    • pfmon -e data_ear_events a.out

      This event helps identify cache-related performance problems: whether float and int data share a cache line, whether active data are grouped in the same cache line, and how pre-fetches behave.

    • pfmon --check-events-only -e....
      The --check-events-only option checks whether the specified events can be measured simultaneously.

      Ex:
      pfmon --check-events-only -e loads_retired,stores_retired
      event LOADS_RETIRED and STORES_RETIRED
      cannot be measured at the same time

    • If the command you want to run takes options, you can clearly distinguish the options of pfmon from the options of your command using the '--' symbol:

      % pfmon -e ia64_inst_retired -- ls -ial /dev/null
      210135 crw-rw-rw- 1 root root 1, 3 Mar 24 2001 /dev/null
      2709704 IA64_INST_RETIRED
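
The FLOPS formula given above for the fp_ops_retired event can be written as a small Python function. This is only a sketch: the function name `flops` is hypothetical, and the counts and the 1.5 GHz clock speed in the example are made-up values, not measurements from palm.

```python
def flops(fp_ops_retired, cpu_cycles, clock_hz):
    """Return floating-point operations per second.

    fp_ops_retired / cpu_cycles is the number of operations per cycle;
    multiplying by the clock speed (cycles per second) gives FLOPS.
    """
    if cpu_cycles <= 0:
        raise ValueError("cpu_cycles must be positive")
    return fp_ops_retired / cpu_cycles * clock_hz


if __name__ == "__main__":
    # Hypothetical counts from one pfmon run and an assumed 1.5 GHz clock.
    rate = flops(fp_ops_retired=2_000_000, cpu_cycles=40_000_000, clock_hz=1.5e9)
    print(f"{rate / 1e6:.1f} MFLOPS")
```

Both counts must come from the same run, so that the ratio describes the same stretch of execution.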

    For more information on pfmon,

  • profile.pl

    profile.pl is free software written by SGI. It is a Perl script that runs profiling experiments using pfmon. It provides a simple way to do procedure-level profiling of a program on a Linux IA64 system. Symbol information must be present in the program text -- profile.pl does not produce useful output on programs that have been processed by strip.

    While profile.pl is designed to work with SGI ProPack for Linux, the only real dependency profile.pl has on that system is that profile.pl will use dplace to control binding of processes to processors.

    A few important options:
    • -c <cpulist> : This option is used by dplace to bind the processes to processors. The CPU numbers in the list are absolute CPU numbers, not relative to an enclosing cpuset.
    • -E event : This is the event name as specified to pfmon. The default is CPU_CYCLES.
    • -N number : This option controls how often sampling (profile ticks per second) is done. N gives the number of events per profile tick and is proportional to the reciprocal of the sampling rate. For event CPU_CYCLES, N defaults to 10000*CPU_MHZ. For other events, you need to provide your own N.
    • -O profile : This gives the file name for the analyzed profile. Defaults to profile.out.
    • -K : Keeps the separate per-CPU sample files around and produces a separate profile report for each CPU.
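
The relationship between -N and the sampling rate can be worked through with a little arithmetic. The sketch below, in Python, assumes only what the -N description states: one profile tick is taken every N events, and N defaults to 10000*CPU_MHZ for CPU_CYCLES. The function names and the 1500 MHz clock in the example are hypothetical.

```python
def default_n(cpu_mhz):
    """Default events-per-tick for CPU_CYCLES, as described for -N."""
    return 10000 * cpu_mhz


def ticks_per_second(events_per_second, n):
    """Sampling rate when one profile tick is taken every n events."""
    if n <= 0:
        raise ValueError("n must be positive")
    return events_per_second / n


if __name__ == "__main__":
    cpu_mhz = 1500  # assumed clock speed, for illustration only
    cycles_per_second = cpu_mhz * 1_000_000
    rate = ticks_per_second(cycles_per_second, default_n(cpu_mhz))
    print(f"default sampling rate: {rate:.0f} ticks/second")
```

Under these assumptions the clock speed cancels out, so the default for CPU_CYCLES works out to the same number of profile ticks per second on any processor. For other events, choose N from the expected event rate and the sampling rate you want.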

    The simplest way to use profile.pl is as follows:

    profile.pl -c0-3 test_program

    profile.pl -c0-3 'test_program < input > output'

    For OpenMP programs,
    profile.pl -c0-3 -x 6 test_program
    or
    profile.pl -c0-3 -x 6 -K test_program

    For MPI programs,
    mpirun -np 4 profile.pl -s1 -c0-4 a.out

    See man profile.pl for more information.


  • Histx

    You need to 'module load' this software before using it. Currently, histx+.1.2a loads SGI histx software version 1.2a, located under /opt/sgi/histx+/1.2a/. Use the following command to load:

    %module load histx+.1.2a

    The following was copied from SGI's Histx 1.0 documentation:

    SGI Histx is a performance analysis tool designed to complement pfmon, which is included with the SGI ProPack software for the Altix family of servers. The software is designed to run on Altix machines only. Used internally by SGI developers and benchmarkers, this product is offered as a service to SGI customers with a no-fee end-user proprietary license via the SGI Download Cool Software (DCS) Web site. Customers wishing to use SGI Histx should be aware that there is no support planned for this product and customers who use it accept it "as is."

    The product can produce separate reports for individual pthreads, OpenMP API threads, fork()ed processes, and MPI processes. This complements pfmon's ability to trace the initial thread or the system as a whole, which is useful only in a strictly controlled single-user benchmark environment.

    SGI Histx consists of a group of tools:

    • lipfpm (Linux IPF Performance Monitor) (a "perfex"-like tool) -- This is the tool that supports individual pthreads, OpenMP threads, fork()ed processes, and MPI processes. It reports counts of desired events for the entire run of a program.

      Syntax:

      lipfpm [-c name] [-e name]* [-f] [-i] [-h] [-k]
      [-l] [-o path] [-p] command args

      Unfortunately, there is no man page for lipfpm. To learn more about lipfpm, use:

      %module load histx+.1.2a
      %lipfpm -h

      An example of using lipfpm to obtain FLOPS per cycle:

      %lipfpm -e CPU_CYCLES -e FP_OPS_RETIRED ./a.out
      lipfpm summary
      ====== =======
      Retired FP Operations.................................. 18878283
      CPU Cycles............................................ 301004631
      Average FLOPS per Cycle............................... 0.0627176

      Note: As with pfmon, a maximum of 4 events can be monitored in each lipfpm analysis.
      However, lipfpm is stricter than pfmon about the syntax of event names. (1) Event names in lipfpm must be in uppercase. Use 'lipfpm -l' to find acceptable event names. (2) pfmon accepts both x.ALL and x_ALL; lipfpm accepts only x.ALL.

      For OpenMP programs:

      %setenv OMP_NUM_THREADS 4 (default is 16)
      %lipfpm -e FP_OPS_RETIRED -o out ./a.out

      This will generate 4+2 files (out.a.out.pid1, out.a.out.pid2, etc). Each contains the counts for the CPU cycles used, the retired FP operations, and the average FLOPS per cycle.

      For MPI programs:

      mpirun -np 4 /opt/sgi/histx+/1.1/bin/lipfpm
      -f -e FP_OPS_RETIRED -o fp ./a.out

      This will generate 4+1 fp.a.out.pid files. Each contains the counts for the CPU cycles used, the retired FP operations, and the average FLOPS per cycle.

    • samppm : This tool allows the user to examine the time-varying behavior of selected performance counters.

      Syntax:

      samppm [-e name]* [-f] [-h] [-k]
      -o path [-r n] command args

      Example:

      samppm -e FP_OPS_RETIRED -e L2_MISSES -e BACK_END_BUBBLE.ALL -e IA64_INST_RETIRED.THIS -o output a.out

      Note: This tool is similar to lipfpm and also allows 4 events to be monitored at the same time. The difference is that lipfpm gives a total count for each specified event at the end of the run, while samppm tracks the counts as a function of time. In addition, samppm writes binary output that must be processed by dumppm to produce human-readable output.

    • dumppm : This tool displays data collected by samppm in human readable form.

      Syntax:

      dumppm [-c] [-d] [-h] [-l n1,n2,...] [<] infile [>] outfile

      Example:

      dumppm output.xxxx.pid

      Displays the counts for each event collected as a function of time. The first column in the output shows "time of day," and the other columns show the counts of each event at that time.

      dumppm -d -l FP_OPS_RETIRED output.xxxxx.18571

      Displays the event differences and relative times for the specified event (FP_OPS_RETIRED in this example).

    • histx (a profiling tool) -- This tool can sample the instruction pointer (that is, the program counter) or the call stack on either timer interrupts or performance monitor counter overflows. It does not require compilation with special flags.

      Syntax:

      histx [-b width] [-f] [-e source] [-h] [-k]
      -o file [-s type] command args

      Unfortunately, there is no man page for histx. To learn more about histx, use:

      %histx -h

      An example of using histx to do profiling:

      %histx -e timer@1 -o out ./a.out

      This will generate an output file, out.xxx.pid, which reports the number of timer ticks (1 timer tick is ~ 0.977 milliseconds) spent in each subroutine and function.

    • iprep -- This tool formats one or more raw histx ip sampling reports into a useful and usable report.

      Syntax:

      %iprep output_from_histx > iprep.output

    • csrep -- This tool formats one or more raw histx callstack sampling reports into a format resembling an SGI(R) SpeedShop "butterfly" report.

      %csrep output_from_histx > csrep.output
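
The timer-tick counts that histx reports in the profiling example above can be turned into approximate times using the stated tick length of about 0.977 milliseconds. The Python sketch below assumes the tick counts have already been read into a dictionary; parsing the histx or iprep report is not shown, since its layout is not described here. The routine names and counts are hypothetical.

```python
TICK_SECONDS = 0.977e-3  # approximate length of one timer tick, from the text


def ticks_to_seconds(ticks):
    """Convert a timer-tick count to approximate seconds."""
    return ticks * TICK_SECONDS


def time_shares(tick_counts):
    """Map each routine to (approximate seconds, percent of all ticks)."""
    total = sum(tick_counts.values())
    if total == 0:
        return {}
    return {
        name: (ticks_to_seconds(ticks), 100.0 * ticks / total)
        for name, ticks in tick_counts.items()
    }


if __name__ == "__main__":
    # Hypothetical tick counts for two routines.
    counts = {"solve": 750, "setup": 250}
    for name, (seconds, percent) in time_shares(counts).items():
        print(f"{name:10s} {seconds:8.3f} s {percent:6.1f} %")
```

Because the tick length is approximate and sampling is statistical, the resulting times are estimates; routines with only a handful of ticks should be read with caution.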



Linux System Utilities

  • hinv

    hinv displays the contents of the system hardware inventory, including brick configuration, processor type, main memory size, and disk drives. hinv -v -c processor provides information about each processor. hinv -v -c memory gives the amount of memory in each node, in units of pages.

  • topology

    topology is a Bourne shell script that uses information in /dev/hw to display the system topology and configuration. This command is useful to make sure that the user's view of the system hardware matches that of the firmware and operating system.

  • uptime

    uptime gives a one-line display of the following information: the current time, how long the system has been running, how many users are currently logged on, and the system load averages for the past 1, 5, and 15 minutes.

  • ps

    ps gives a snapshot of the current processes. If you want a repetitive update of this status, use the top command.

  • top

    top provides an ongoing look at processor activity in real time.

  • free

    free displays the total amount of free and used physical and swap memory in the system, as well as the shared memory and buffers used by the kernel.

  • w

    w displays who is on the system and what they are doing.



PD Software Modules

Software for the Altix that we compile ourselves, or that does not require a for-pay license, is installed under:

/local/LinuxIA64/

Each of these packages may have multiple versions. Users access the packages through modules: loading a module sets up the user's environment to use the software.

  • pd-hdf.5-1.6.4

    HDF5 version 1.6.4 - installed from binary packages from: ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/altix/

  • pd-ncarg.4.4.1

    To use this module, you also need an Intel 8.1 or 9.x compiler module loaded; otherwise the software may not work correctly.

    You need to 'module load' this software before using it.

    %module load pd-ncarg.4.4.1

  • pd-petsc.2.3.0-complex.le

    PETSc version 2.3.0, compiled with complex number support and little-endian byte order - compiled from source found here: http://www-unix.mcs.anl.gov/petsc/petsc-2/

  • pd-sun-java-sdk.1.4.2_06

    Sun's Java SDK 1.4.2 - from Sun

  • pd-xdiff-3.4

    xdiff version 3.4 graphical diff - binary package installed from http://reality.sgiweb.org/rudy/xdiff/

  • pd-grads.1.9b4

    GrADS can be loaded via:

    %module load pd-grads.1.9b4

  • pd-nedit-5.5-x86

    Nedit can be loaded via:

    %module load pd-nedit-5.5-x86



Links to Other Useful Documents


NASA Curator: Mason Chang,
NCCS User Services Group (301-286-9120)
NASA Official: Phil Webster, High-Performance
Computing Lead, GSFC Code 606.2