Getting a Refund of Allocated Time

Users are entitled to allocation refunds when their jobs die because of failures of MSC hardware, system software, or MSC-supported application software. This does not apply to jobs that die because the user made a mistake in the input deck. It also does not apply to problems with the system or application software if those problems are documented either on the MSC web site or in the software manuals. It is the user's responsibility to read these materials periodically to stay informed of known bugs. In addition, users will not be given refunds for job failures associated with application software development.

Users will be refunded only the amount of node-wall time that they would have to rerun in a subsequent restart calculation (e.g., a geometry optimization would not have its full node-wall time refunded, only the part after the last completed structure in the optimization). Users are expected to checkpoint (or save appropriate intermediate output) in their calculations.
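As a rough illustration of the rule above, the refundable amount can be estimated as the number of nodes multiplied by the wall time elapsed since the last usable checkpoint. The sketch below assumes exactly that formula; the function name `lost_node_seconds` and its arguments are hypothetical and are not part of any MSC tool, and the actual refund is determined by the MSC Consulting Group.

```shell
#!/bin/sh
# Hypothetical helper: estimate the node-wall time lost since the last checkpoint.
# Arguments (all illustrative):
#   $1 = number of nodes used
#   $2 = wall-clock seconds elapsed when the job died
#   $3 = wall-clock seconds elapsed at the last usable checkpoint
lost_node_seconds() {
    nodes=$1
    died_at=$2
    checkpoint_at=$3
    # Only the work done after the last checkpoint has to be rerun.
    echo $(( nodes * (died_at - checkpoint_at) ))
}

# Example: 16 nodes, job died after 10 h (36000 s), last checkpoint at 9 h (32400 s).
lost_node_seconds 16 36000 32400   # prints 57600 (node-seconds, i.e. 16 node-hours)
```

In this example only 16 of the 160 node-hours consumed would need to be rerun, which is why frequent checkpointing matters.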

To request a refund, send mail to mscf-consulting@emsl.pnl.gov. Refunds will not be given for verbal requests.

When requesting a refund, the user MUST provide the following information:

Users are encouraged to provide any other information they think may be useful in determining the outcome of the refund decision. The MSC Consulting Group will then notify the user of the action taken on their refund request and address any questions that may arise.

To facilitate refund requests, the submission scripts have been modified so that users can tell what their job number is after the job has run. The Batch Job ID is the job number that your job ran under and is a six-digit integer. Users are encouraged to include the following lines in their submission scripts so that the refund information is more readily accessible:

    #
    # This is to help in getting the refund information
    #
    echo "refund:UserID = (your userid)"
    echo "refund:Account name = (your account name)"
    echo "refund:Job ID = ${SLURM_JOBID}"
    echo "refund:Number of nodes = (no. of nodes you used)"
    echo "refund:Number of cores per node = (cores per node you used)"
    echo "refund:Number of cores = (no. of cores you used)"
    echo "refund:Amount of time requested = (time you requested)"
    # 
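Some of these fields could be filled in automatically rather than typed by hand. The following is a minimal sketch, assuming the batch system exports the standard Slurm variables `SLURM_JOB_ACCOUNT`, `SLURM_JOB_NUM_NODES`, and `SLURM_CPUS_ON_NODE` in addition to `SLURM_JOBID` (only the last one appears in the lines above), and that all nodes in the job have the same core count. The function name `print_refund_info` is hypothetical. Any variable that is not set is reported as `unknown`, and the requested time is left as a placeholder.

```shell
#!/bin/sh
# Sketch: print the refund information from the Slurm job environment.
# Assumes the standard Slurm output variables are exported on this system.
print_refund_info() {
    nodes=${SLURM_JOB_NUM_NODES:-}
    cores_per_node=${SLURM_CPUS_ON_NODE:-}
    # Total cores = nodes x cores per node (assumes identical nodes).
    if [ -n "$nodes" ] && [ -n "$cores_per_node" ]; then
        total_cores=$(( nodes * cores_per_node ))
    else
        total_cores=unknown
    fi
    echo "refund:UserID = ${USER:-unknown}"
    echo "refund:Account name = ${SLURM_JOB_ACCOUNT:-unknown}"
    echo "refund:Job ID = ${SLURM_JOBID:-unknown}"
    echo "refund:Number of nodes = ${nodes:-unknown}"
    echo "refund:Number of cores per node = ${cores_per_node:-unknown}"
    echo "refund:Number of cores = ${total_cores}"
    echo "refund:Amount of time requested = (time you requested)"
}

print_refund_info
```

Because every line starts with `refund:`, the information can be pulled out of the job output afterwards with something like `grep '^refund:' slurm-123456.out`, where the file name assumes Slurm's default output naming and a made-up job number.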

Note for Ecce users: It is possible for a job to remain in the submitted or running state within Ecce even though the job has actually run to completion. This happens when communication is lost between the machine where Ecce is running and the machine where the job was launched. In this case, once the job has completed, it can be imported as a new Ecce calculation using the Calculation Manager. Because no results are lost when a calculation launched from within Ecce is imported this way, allocation refunds will not be given. Minimizing the likelihood of this situation is a priority of continued Ecce development.

Note for Ecce users: It is not advisable to restart jobs that terminated because of a hardware error. The file often contains errors or is missing data, and these problems do not always show up right away. As a result, a restarted job could run for a long time, perhaps to completion, and still give the wrong final answer.