CDC netCDF Conventions: Gridded Data

The following format description indicates the minimum requirements for creating netCDF files using the CDC netCDF standard format.

The conventions in this document are compatible with, but more restrictive than, the standards developed jointly by the institutions participating in the now-defunct NOAA Cooperative Ocean-Atmosphere Research Data Service (COARDS). In general, CDC netCDF software can read any netCDF file that complies with the more general joint standards document created by COARDS.

This basic format may be enhanced with additional dimensions, variables, and attributes as long as the standard elements are included.

FILE NAME

All netCDF files must have '.nc' as the final suffix of the file name.

DIMENSIONS/COORDINATE VARIABLES

One or more dimensions can be used with each data variable. There are four standard dimensions: time, level, lat, lon. When present, these dimensions must appear in that order in data variable definitions, as written in CDL. In Fortran function calls, the order is reversed.

When "extra" dimensions are used, such as with model runs, they should appear to the left of the standard dimensions in a variable definition (in CDL order). The dimension names should begin with a letter and be composed of letters, digits, and underscores.
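
The ordering rules above can be checked mechanically. The following is a minimal Python sketch, assuming the dimension names of a variable are available as a list in CDL order. The function name dims_in_standard_order is hypothetical and is not part of any CDC software.

```python
STANDARD_DIMS = ["time", "level", "lat", "lon"]

def dims_in_standard_order(dims):
    """Return True if extra dimensions come first and the standard
    dimensions follow in time, level, lat, lon order (CDL order)."""
    # Positions of the standard dimensions within the variable's dimension list.
    standard_positions = [i for i, name in enumerate(dims) if name in STANDARD_DIMS]
    if not standard_positions:
        return True
    # Everything from the first standard dimension onward must be standard,
    # because "extra" dimensions belong to the left.
    tail = dims[standard_positions[0]:]
    if any(name not in STANDARD_DIMS for name in tail):
        return False
    # The standard dimensions must appear in the prescribed order, without repeats.
    ranks = [STANDARD_DIMS.index(name) for name in tail]
    return all(a < b for a, b in zip(ranks, ranks[1:]))
```

For instance, ["ensemble", "time", "lat", "lon"] passes this check, while ["time", "ensemble", "lat"] does not. The name "ensemble" is only an illustrative extra dimension.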

time is defined as the Unlimited (or record) dimension, except in those cases where "extra" dimensions are used.

Coordinate variables that correspond to the dimensions must have the same names as the dimensions. Coordinate values of a coordinate variable must be either monotonically increasing or monotonically decreasing. However, the coordinate values need not be evenly spaced. Missing values are not allowed in coordinate variables.
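
As an illustration of these requirements, the sketch below tests a list of coordinate values in Python. It rests on two assumptions that the conventions do not state: that monotonic means strictly monotonic (no repeated values), and that a missing value would show up as NaN once the data are read. The function name is hypothetical.

```python
import math

def is_valid_coordinate(values):
    """Check a list of coordinate values: no NaN, and strictly
    increasing or strictly decreasing. Even spacing is not required."""
    # Assumption: missing values appear as NaN after reading.
    if any(isinstance(v, float) and math.isnan(v) for v in values):
        return False
    pairs = list(zip(values, values[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    return increasing or decreasing
```

Unevenly spaced values such as [0, 2.5, 10, 11] are accepted, as is a north-to-south latitude axis such as [90.0, 87.5, 0.0, -90.0].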

DATA VARIABLES

Each file contains one or more data variables.

Variable names should begin with a letter and be composed of letters, digits, and underscores. The data type should be byte, short, long, float, or double.
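
The naming rule, which applies to dimension names as well as variable names, maps directly onto a regular expression. This is a small Python sketch under the assumption that "letters" means ASCII letters. The names NAME_PATTERN and is_valid_name are hypothetical.

```python
import re

# A letter first, then any number of letters, digits, or underscores.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

def is_valid_name(name):
    """Return True if name follows the naming rule for variables and dimensions."""
    return NAME_PATTERN.fullmatch(name) is not None
```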

CDC has standard variable abbreviations for most climate-related variables that should be used where possible.

ATTRIBUTES

The type for attributes is character except as noted.

Data variable attributes:

Dictionary attributes (dataset, var_desc, level_desc, statistic, parent_stat) -- for use by the CRDtools dictionary browse application. Select each value from the list of valid values available, or use "Other" or "-". If these attributes are not specified, the default is "Other".
valid_range -- expected "reasonable" range for variable. Same type as unpacked values.
actual_range -- actual data range for variable. Same type as unpacked values.
least_significant_digit -- power of ten of the smallest decimal place in unpacked data that is a reliable value. Type is short.
precision -- number of places to right of decimal point that are significant, based on packing used. Type is short.
units -- units the variable is recorded in. Where possible, the units should follow the Unidata udunits standard.
missing_value -- the value that signifies grid points for which there is no data available. This value should be outside of the valid_range of the data and should not equal the netCDF standard initial data value for the data type (nor should it equal _FillValue if used). missing_value has the (possibly packed) data value data type.
long_name -- a long descriptive name. This could be used for labelling plots, for example. If a variable has no long_name attribute, the variable name will be used as a default.
add_offset -- If present for a variable, this number is to be added to the data after it is read by the application that accesses the data. add_offset has the unpacked value data type. Where the data is not packed, add_offset = 0. If add_offset is omitted, the default is 0.
scale_factor -- If present for a variable, the data are to be multiplied by this factor after the data are read by the application that accesses the data. scale_factor has the unpacked value data type. Where the data is not packed, scale_factor = 1. If scale_factor is omitted, the default is 1.
The attributes scale_factor and add_offset can be used together to provide simple data compression to store low-resolution floating-point data as small integers in a netCDF file.
The unpacking algorithm is:
unpacked value = add_offset + ((packed value) * scale_factor)
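
The unpacking algorithm can be written as a one-line function. The Python sketch below does that and adds a matching pack function. The packing direction is not spelled out in this document, so pack is an assumption derived by inverting the formula and rounding to the nearest integer. Both function names are hypothetical.

```python
def unpack(packed, scale_factor=1.0, add_offset=0.0):
    """Apply the unpacking formula; the defaults match the documented
    defaults for omitted attributes (scale_factor 1, add_offset 0)."""
    return add_offset + packed * scale_factor

def pack(value, scale_factor=1.0, add_offset=0.0):
    """Assumed inverse of unpack: subtract the offset, divide by the
    scale factor, and round to the nearest integer for storage."""
    return int(round((value - add_offset) / scale_factor))
```

With scale_factor = 0.01 and add_offset = 273.15, a value of 287.5 is stored as the integer 1435 and unpacks to 287.5 again, up to floating-point rounding. Values are only recoverable to within half a scale_factor, which is the sense in which the stored data are low-resolution.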

Time coordinate variable

long_name -- "Time"
units -- a character string formatted as recommended in the Unidata udunits package. The string contains multiple parts:
- a time unit -- The valid units for time are listed in the Unidata udunits standard. The most commonly used of these strings (and their abbreviations) include day (d), hour (hr, h), minute (min), second (sec, s), and year (yr). A year is defined as being exactly 3.1536e7 seconds, or 365 days (i.e. no leap years). Plural forms are also acceptable.
- the string "since"
- a base date in the form "year-month-day"
- an optional base time in the form "hours:minutes:seconds"
- an optional base time zone offset from GMT
The following example shows an implementation of the time unit string:
"hours since 1900-01-01 06:00:00 -6:00"
indicates the number of hours since January 1st, 1900 at 6:00 in the morning in the Mountain Daylight Time zone.
NOTE: The normally used CDC base date/time is: "0001-01-01 00:00:00".
Time coordinate variables representing climatological time (an axis of 12 months, 4 seasons, etc. that is located in no particular year) should be encoded like other time axes but with the added restriction that they be encoded to begin in the year 0000.
NOTE: There are udunits functions that can interpret and manipulate the time units string.
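
To make the structure of the units string concrete, here is a rough Python sketch that splits it into its parts with a regular expression. It is not the udunits parser and accepts only the layout described above. The pattern, the function name parse_time_units, and the returned dictionary keys are all hypothetical. The base date is returned as plain integers rather than a datetime object because Python's datetime type cannot represent the year 0000 used for climatological axes.

```python
import re

TIME_UNITS_PATTERN = re.compile(
    r"^\s*(?P<unit>[A-Za-z]+)\s+since\s+"
    r"(?P<year>\d{1,4})-(?P<month>\d{1,2})-(?P<day>\d{1,2})"
    r"(?:\s+(?P<hour>\d{1,2}):(?P<minute>\d{1,2}):(?P<second>\d{1,2}(?:\.\d+)?))?"
    r"(?:\s+(?P<tz>[+-]?\d{1,2}:\d{2}))?\s*$"
)

def parse_time_units(units):
    """Split a "<unit> since <date> [<time>] [<offset>]" string into parts."""
    match = TIME_UNITS_PATTERN.match(units)
    if match is None:
        raise ValueError(f"not a recognised time units string: {units!r}")
    parts = match.groupdict()
    return {
        "unit": parts["unit"],
        "base_date": (int(parts["year"]), int(parts["month"]), int(parts["day"])),
        # The base time is optional; a missing time is treated as 00:00:00.
        "base_time": (
            int(parts["hour"] or 0),
            int(parts["minute"] or 0),
            float(parts["second"] or 0),
        ),
        # The offset is kept as text; None means no offset was given.
        "tz_offset": parts["tz"],
    }
```

Applied to "hours since 1900-01-01 06:00:00 -6:00", this yields the unit "hours", the base date (1900, 1, 1), the base time (6, 0, 0.0), and the offset "-6:00".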
actual_range -- start and end times in the same time units and base as in the units attribute. Type is double.
delta_t -- The amount of time between time coordinate values, in the format "yyyy-mm-dd hh:mm:ss". Smaller (unused) time elements are zero-filled (e.g., if the delta_t is one month, "0000-01-00 00:00:00" signifies one month between time values).
If there is no regular time increment, all the elements should be zero-filled. If delta_t is omitted, no regular time increment is implied.
avg_period -- Required only for time-averaged data. The period of time over which the data was averaged, in the format "yyyy-mm-dd hh:mm:ss". Smaller (unused) time elements are zero-filled (e.g., if the averaging period is one month, "0000-01-00 00:00:00" signifies that each value is an average of one month's values).
prev_avg_period -- Required only for time-averaged data. The average period represented in the source variable before taking the average. Format is "yyyy-mm-dd hh:mm:ss". Smaller (unused) time elements are zero-filled.
ltm_range -- Required only for time-averaged data. The begin and end values of the time period used to create the averaged data using the same time units and base as in the units attribute. Type is double.
subset_begin, subset_end -- Required only for time-averaged data not averaged over a full time unit. The portion of a time unit actually used. Format is "yyyy-mm-dd hh:mm:ss". Smaller (unused) time elements are zero-filled.
For example, if a long term mean was created using only the months March through June, subset_begin would be "0000-03-00 00:00:00" and subset_end would be "0000-06-00 00:00:00".
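
The attributes delta_t, avg_period, prev_avg_period, subset_begin, and subset_end all share the "yyyy-mm-dd hh:mm:ss" layout, so one small parser can serve them all. The Python sketch below assumes the fields are always zero-padded to the widths shown. The names parse_period and has_regular_increment are hypothetical.

```python
import re

PERIOD_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})")

def parse_period(text):
    """Split a "yyyy-mm-dd hh:mm:ss" period string into six integers:
    (years, months, days, hours, minutes, seconds)."""
    match = PERIOD_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"expected 'yyyy-mm-dd hh:mm:ss', got {text!r}")
    return tuple(int(field) for field in match.groups())

def has_regular_increment(delta_t):
    """An all-zero delta_t, or a missing attribute (None here),
    means that no regular time increment is implied."""
    return delta_t is not None and any(parse_period(delta_t))
```

For example, parse_period("0000-01-00 00:00:00") gives (0, 1, 0, 0, 0, 0), a period of one month, and has_regular_increment("0000-00-00 00:00:00") is False.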

Level coordinate variable

Level is used for the vertical or 'Z' dimension in a variable.

long_name -- Standard choices are: "Level" for pressure levels, "Sigma" for sigma levels, "Isentropic" for theta levels and "Depth" for depth below a datum (usually sea level). If the level is not one of these, an arbitrary name may be chosen.
units -- corresponds to long_name choices above. Standard choices are: "millibar", "sigma_level", "degree_K", and "meter". If the level does not use one of these, the units should follow the Unidata udunits standard, where possible.
actual_range -- level range in same units as the units attribute. Type is float.
positive -- indicates the direction of positive for the level axis. Valid values are "up" and "down". Normally, pressure levels, sigma levels, and depths are "down"; theta levels and heights are "up". The positive attribute is not required for pressure levels, where it defaults to "down".

Latitude coordinate variable

long_name -- "Latitude"
units -- "degrees_north"
actual_range -- latitude range in degrees. The range values are used to indicate order of storage (e.g., 90,-90 would indicate the latitudes started with 90 and ended with -90). Type is float.

Longitude coordinate variable

long_name -- "Longitude"
units -- "degrees_east"
actual_range -- longitude range in degrees. The range values are used to indicate order of storage (e.g., 0,360 would indicate the longitudes started with 0 and ended with 360). Type is float.
Longitudes may be represented modulo 360, meaning that -180 and 180 are both valid representations of the International Dateline and 0 and 360 are both valid representations of the Prime Meridian. Note, however, that the sequence of numerical longitude values stored in the netCDF file must be monotonic (in a non-modulo sense).
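
The distinction between modulo equivalence and stored order can be shown with two short Python functions. This is an illustrative sketch: the function names are hypothetical, monotonic is assumed to mean strictly monotonic, and the exact comparison in same_meridian is reliable only for values that are exactly representable, such as whole degrees.

```python
def same_meridian(lon_a, lon_b):
    """Two longitudes name the same meridian if they differ by a multiple of 360."""
    return (lon_a - lon_b) % 360 == 0

def is_monotonic(values):
    """Check the stored sequence as plain numbers, with no modulo arithmetic."""
    pairs = list(zip(values, values[1:]))
    return all(a < b for a, b in pairs) or all(a > b for a, b in pairs)
```

An axis that crosses the dateline can be stored as [170, 175, 180, 185, 190], which is monotonic. Storing the same meridians as [170, 175, 180, -175, -170] is not acceptable, even though 185 and -175 name the same meridian.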

Global attributes:

title -- data set title (not specific to any one variable).
history -- a brief, multi-line description of the procedures used to generate the file. The description should include:
- an attribution for CDC and the initials (in reverse order if privacy is desired) of the individual who generated the file.
- the source dataset and name(s) or range of the input file(s)
- the date the file was generated
- the name of the software package or a description of the custom source code used to create the file.
Here's an example history attribute:
Created by NOAA-CIRES Climate Diagnostics Center Data Management Group (SAC, cdcdata@noaa.gov) from the NCEP Reanalysis data set on 1997/07/03 by ltmmaker using /Datasets/ncep.reanalysis/surface_gauss/air.sfc.73.nc thru /Datasets/ncep.reanalysis/surface_gauss/air.sfc.96.nc
When a file is updated, the original history attribute contents should be retained and prefixed with information about the changes, including:
- an attribution for CDC and the initials (in reverse order if privacy is desired) of the individual who updated the file, if the update procedure isn't part of an automated process
- update date
- the name of the software package or a description of the custom source code used to update the file
- source(s) of new data
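
The update rule amounts to prepending new text to the existing attribute value. A minimal Python sketch follows, assuming the history is handled as an ordinary string and that entries are separated by a newline. The separator is an assumption, since the conventions only say that the description spans multiple lines. The function name prepend_history is hypothetical.

```python
def prepend_history(existing, update_entry):
    """Return a new history string with update_entry placed before
    the original contents, which are retained unchanged."""
    if not existing:
        return update_entry
    # Assumption: entries are separated by a newline.
    return update_entry + "\n" + existing
```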

Physical Sciences Division
