NOAA

Geophysical Fluid
Dynamics Laboratory


gfdl's home page > people > John Dunne >

Totalview Guide

Description

This document describes how to get started with the Totalview debugger. Totalview lets you explore the model code as it runs, with full access to the constants and variables within it.

Altering the runscript

There is one essential edit to make in the runscript: wrap the run command in a totalview execution and exit immediately afterwards, rather than redirecting the executable's output to a file and then doing post-processing. Replace:

mpirun -np $npes $executable:t > fms.out

with:

totalview mpirun -a -np $npes $executable:t
exit


Also, make sure that you are running with MPI as opposed to SHMEM.
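
The runscript edit above can also be applied mechanically to a copy, leaving the original runscript untouched for production runs. The following is a minimal sketch in POSIX sh, kept separate from the runscript itself. The function name make_debug_runscript and the file names in the usage note are hypothetical, and the match assumes the run command appears exactly as shown above (apart from leading whitespace).

```shell
#!/bin/sh
# Hypothetical helper (not part of the model code): write a debugging copy
# of a runscript with the mpirun line wrapped in totalview and followed by
# "exit". All other lines are copied through unchanged.
make_debug_runscript() {
    src=$1   # original runscript, left unchanged
    dst=$2   # debugging copy to create
    awk '
        {
            # Compare against the run command with leading whitespace removed.
            line = $0
            sub(/^[ \t]+/, "", line)
        }
        line == "mpirun -np $npes $executable:t > fms.out" {
            print "totalview mpirun -a -np $npes $executable:t"
            print "exit"
            next
        }
        { print }
    ' "$src" > "$dst"
}
```

For example, make_debug_runscript runscript runscript.totalview would write the wrapped version to runscript.totalview; the copy would still need to be made executable before it is run.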


Re-compiling the executable

The executable will have to be recompiled with the debugging options turned on and optimization turned off (note: this will slow down the code by a factor of 5). The easiest way to do this is to edit the Makefile (which resides in the exec directory) so that it points to an mkmf template that allows command line options. This can be done by changing the line:

include /home/fms/bin/mkmf.template.sgi

to:

include /home/jpd/jakarta/mkmf.template.sgi
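
As an illustration, the include line can be switched with sed while keeping a backup of the original Makefile for optimized builds. This is a sketch under assumptions: the helper name use_debug_template and the ".orig" backup suffix are hypothetical, and only the two template paths come from this guide.

```shell
#!/bin/sh
# Hypothetical helper: back up a Makefile, then rewrite its include line to
# point at the mkmf template that accepts command line options.
use_debug_template() {
    makefile=$1
    # Keep the original so the optimized build can be restored later.
    cp "$makefile" "$makefile.orig" || return 1
    sed 's|^include /home/fms/bin/mkmf\.template\.sgi$|include /home/jpd/jakarta/mkmf.template.sgi|' \
        "$makefile.orig" > "$makefile"
}
```

Run from the exec directory, this would be called as use_debug_template Makefile; copying Makefile.orig back over Makefile undoes the change.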

When recompiling the executable, make sure of two things. First, all of the intermediate ".o" files must be removed so that the Makefile re-creates them with the correct compilation options. Second, all of the Fortran code must be copied into the directory in which the Makefile resides so that totalview can find the source code. Both are achieved by changing to the Makefile directory and running the commands:
    make clean
    make localize
    gmake DEBUG=1
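
The three commands can be chained so that a failure in one step stops the rebuild instead of continuing with stale object files. The following is a minimal sketch in POSIX sh. The function name rebuild_debug and the TV_MAKE and TV_GMAKE override variables are hypothetical conveniences; the targets "clean" and "localize" and the DEBUG=1 option are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper: run the rebuild steps in order inside the Makefile
# directory, stopping at the first step that fails.
rebuild_debug() {
    dir=$1                          # directory containing the Makefile
    make_cmd=${TV_MAKE:-make}       # override only if make lives elsewhere
    gmake_cmd=${TV_GMAKE:-gmake}
    (
        # The subshell keeps the caller's working directory unchanged.
        cd "$dir" || exit 1
        "$make_cmd" clean    || exit 1   # remove the old ".o" files
        "$make_cmd" localize || exit 1   # copy the source next to the Makefile
        "$gmake_cmd" DEBUG=1             # recompile with debugging options
    )
}
```

Calling rebuild_debug with the path to the exec directory returns a non-zero status if any step fails, which makes it easy to see that the debugging executable was not rebuilt.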

Running totalview

To start totalview, follow these steps:
  • Log in for an interactive session on the AC.
  • If they exist, remove and the initialized file in the model output directory.
  • Change to the directory with the runscript.
  • Execute the runscript. This will start the totalview program.

At this point, totalview is waiting for you to start executing the model within it. The active window is a driver window, which you will have to close to exit totalview but otherwise need not touch. The other, larger window that pops up is the main program window.

  • Begin execution by making the larger, main program window the active one, then either press "G" on the keyboard or hold down the middle mouse button and choose the "Go/Halt/Next/Step/Hold" --> "Go Group" menu option.
    Totalview will then begin executing the runscript. When it reaches the command that runs the executable (which requires multiple processors), it will ask whether you wish to halt the process.

  • Clicking "NO" will let the program run until it either crashes (in which case it stops and offers you a traceback of the hierarchy of lines of model source code, along with the active constants and variables at that point) or runs to completion (in which case it exits automatically).

  • The Traceback hierarchy is in the upper left part of the main window. This allows you to step through the hierarchy of subroutine calls within the code.

  • The currently active constants and variables are shown in the upper right part of the main window. You can click on these parameters either in this list or directly in the code.

  • Clicking "YES" will halt the program and allow you to open up source files and place "STOP" commands in the code, before running the code with "G". Add a stop by clicking the left mouse button on the code line number on the left of the screen. Remove them by clicking on them again.

  • Sometimes the code will halt before it reaches the stop you placed. Frankly, I haven't figured out why this happens, other than that the program may be timing out as some processors jump ahead of others. When this happens, just resume execution with "G".




last modified: June 08 2004.