National Energy Research Scientific Computing Center
  A DOE Office of Science User Facility
  at Lawrence Berkeley National Laboratory
 

Profiling and Performance Tools on Franklin

IPM

IPM is the NERSC-developed Integrated Performance Monitoring tool for MPI programs. More information can be found here. It has been implemented on all other major NERSC platforms. IPM on Franklin is currently available on a trial basis (not officially supported). Please contact us with feedback and problem reports.

Because the compute nodes lack shared memory support, IPM is available only as a static library, and users must link the IPM library manually. To use:

% module load ipm
% ftn ... $IPM
or % cc ... $IPM
or % CC ... $IPM
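
A minimal sketch of guarding the link step, assuming (as the commands above imply) that loading the ipm module defines the IPM environment variable with the required link flags. The function name ipm_link_cmd and the file names are hypothetical, and the function only prints the command it would run:

```shell
# Hypothetical helper: print the link command with $IPM appended,
# or fail if the ipm module has not been loaded in this shell.
ipm_link_cmd() {
    if [ -z "${IPM:-}" ]; then
        echo "IPM is not set; run 'module load ipm' first" >&2
        return 1
    fi
    # $IPM is left unquoted on purpose: it may hold several linker flags.
    echo "$@" $IPM
}

# Example (prints the command instead of running it):
#   ipm_link_cmd ftn -o myexecutable mycode.f90
```

Checking the variable first turns a confusing link error into a clear message when the module was not loaded.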

Some of the caveats with IPM on Franklin include:

  • Writing IPM output for high-concurrency jobs can take a long time and may cause the job to hang or to exit for exceeding its wall clock limit.
  • PAPI 3.5 is known to have counter overflow issues on CNL 2.x, especially for long runs. If you see odd HPM data, this may be the cause.
  • IPM does not work with all of our test applications yet.

CrayPat

CrayPat is a performance analysis tool offered by Cray for the XT platform. Cray's website includes detailed documentation about CrayPat. Additionally, once the CrayPat module is loaded, the relevant man pages become available. Production codes should NOT use CrayPat. CrayPat has a large feature set; here we highlight the most basic features and point to other relevant documentation.

Finding FLOPS (Floating Point Operations Per Second) of an Application

The simplest way to get FLOPS is to use pat_hwpc. There are a number of other ways to get hardware counters using CrayPat.

  • Load the craypat module
    
    % module load xt-craypat/4.1
    
    
  • Build application as usual
  • Run the code with pat_hwpc by augmenting the aprun command in a batch script as follows:
    
    % pat_hwpc -E  aprun -n  myexecutable 
    
    
  • Performance counter output will be appended to standard output.
  • This example counts only the default hardware counters. To collect other hardware counter information, see the pat_hwpc man page for how to set the PAT_RT_HWPC environment variable.
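
The rate itself is simple arithmetic: floating point operations divided by elapsed seconds. A minimal sketch, assuming you have read an operation count and a wall clock time from the counter output by hand; the function name mflops and the sample numbers are hypothetical, and no particular pat_hwpc output format is assumed:

```shell
# Hypothetical helper: convert an operation count and a time in seconds
# into millions of floating point operations per second (MFLOPS).
mflops() {
    awk -v ops="$1" -v secs="$2" 'BEGIN {
        if (secs <= 0) {
            print "elapsed time must be positive" > "/dev/stderr"
            exit 1
        }
        printf "%.2f\n", ops / secs / 1e6
    }'
}

# 2e9 operations in 4 seconds
mflops 2000000000 4    # prints 500.00
```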

Application Profiling

Use CrayPat to analyze MPI communication time, memory usage, and user functions.

  • Load the craypat module
    
    % module load xt-craypat
    
    
  • Build application as usual
  • The next step is to instrument the code with the pat_build command. See the pat_build man page for all options. The basic ones are listed below.
    
    % pat_build -g  [group]  myexecutable
    
    
    • Group options include mpi, io, and heap. Automatic instrumentation is included at the function level.
    • The -u option instruments user functions.
    • For mpi, io, heap and user function instrumentation:
      
      % pat_build -g mpi,io,heap -u myexecutable
      
      
      (notice there are no spaces between mpi,io,heap)
    • For instrumentation of user functions only:
      
      % pat_build -u myexecutable
      
      
    • Running the pat_build command creates a file called myexecutable+pat. Note: the .o files must be available to pat_build.
  • Run the instrumented executable. To collect hardware counter information, set the environment variable PAT_RT_HWPC to a value from 1 to 9. See the pat_hwpc man page for all options. When the code runs, a file ending in .xf will be created.
  • Then a performance file can be generated using the pat_report command.
    
    % pat_report -f ap2 [options] <.xf file>
    
    
    • The basic option is -o output_file. See the pat_report man page for more options.
    • Running pat_report will generate a file ending in .ap2
  • pat_report or Cray Apprentice2 can be used to analyze the results.
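
The sequence above can be sketched as a small script. This is a dry run under stated assumptions: the helper names join_groups and newest_xf and the output name myexecutable.ap2 are hypothetical, the script only prints the pat_build and pat_report commands rather than running them, and it assumes the experiment file is the most recently modified .xf file in the current directory:

```shell
# Join group names with commas and no spaces, as pat_build -g expects.
join_groups() {
    ( IFS=,; echo "$*" )
}

# Print the most recently modified .xf file in a directory (default: .).
newest_xf() {
    ls -t "${1:-.}"/*.xf 2>/dev/null | head -n 1
}

exe=myexecutable

# Step 1: instrument (printed only, not executed).
echo pat_build -g "$(join_groups mpi io heap)" -u "$exe"

# Step 2: run the instrumented executable "$exe+pat" in a batch job.

# Step 3: if a .xf file exists, print the report command for it.
xf=$(newest_xf .)
if [ -n "$xf" ]; then
    echo pat_report -f ap2 -o "$exe.ap2" "$xf"
fi
```

Removing the leading echo from each command turns the dry run into the real sequence once the xt-craypat module is loaded.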

Cray Apprentice2

Cray Apprentice2 is a tool used to visualize performance data collected with CrayPat. There are many options for viewing results. Please refer to the app2 man page or Cray's Apprentice2 documentation for more details.

  • First load the apprentice2 module
    
    % module load  apprentice2 
    
    
  • Visualize data
    
    % app2 myExecutable.ap2 
    
    
  • Apprentice2 can also be used to view an XML report created by pat_report. For example:
    
    % module load xt-craypat
    % module load apprentice2 
    % pat_report -f xml -c records myExecutable+patoptions.xf > myExecutable.xml
    % app2 myExecutable.xml
    
    

Page last modified: Tue, 08 Apr 2008 22:10:47 GMT
Page URL: http://www.nersc.gov/nusers/systems/franklin/tools.php
Web contact: webmaster@nersc.gov
Computing questions: consult@nersc.gov