Please note: Due to browser FTP deprecation, users will no longer be able to access NCEI data via browser FTP clients. Users may access data via NCEI Web Accessible Folders and/or FTP client supported applications. We apologize for any inconvenience. See this document as reference. 

Introduction

Since 1987, the National Oceanic and Atmospheric Administration's National Centers for Environmental Information (NCEI) has used observations from the U.S. Historical Climatology Network (USHCN) to quantify national- and regional-scale temperature changes in the conterminous United States (CONUS). To that end, USHCN temperature records have been corrected to account for various historical changes in station location, instrumentation, and observing practice. The USHCN is a designated subset of the NOAA Cooperative Observer Program (COOP) Network, with sites selected according to their spatial coverage, record length, data completeness, and historical stability. The USHCN, therefore, consists primarily of long-term COOP stations whose monthly temperature records have been adjusted for systematic, nonclimatic changes that bias temperature trends.

Map of US COOP Stations
Caption: Distribution of U.S. Cooperative Observer Network stations in the CONUS. U.S. HCN version 2 sites are indicated as red triangles.

USHCN datasets were first developed at NOAA's NCEI in collaboration with the Department of Energy's Carbon Dioxide Information Analysis Center (CDIAC) in a project that dates to the mid-1980s (Quinlan et al. 1987). At that time, in response to the need for an accurate, unbiased, modern historical climate record for the United States, personnel at the Global Change Research Program of the U.S. Department of Energy and at NCEI defined a network of 1219 stations in the contiguous United States whose observations would comprise a key baseline dataset for monitoring U.S. climate. Since then, the USHCN dataset has been revised several times (e.g., Karl et al. 1990; Easterling et al. 1996; Menne et al. 2009). The three dataset releases described in Quinlan et al. (1987), Karl et al. (1990), and Easterling et al. (1996) are now referred to as the USHCN version 1 datasets.

These version 1 datasets contained adjustments to the monthly mean maximum, minimum, and average temperature data that addressed potential changes in biases (inhomogeneities) in data from USHCN stations documented in NCEI’s station history archives.

The documented changes that were addressed include changes in the time of observation (Karl et al. 1986), station moves, and instrument changes (Karl and Williams 1987; Quayle et al. 1991). Apparent urbanization effects were also addressed in version 1 with a specific urban bias correction (Karl et al. 1988).

In 2007, USHCN version 2 serial monthly temperature data were released and updates to the version 1 datasets were discontinued.

Relative to the version 1 releases, version 2 monthly temperature data were produced using an expanded database of raw temperature values from COOP stations, a new set of quality control checks, and a more comprehensive homogenization algorithm. The version 2 temperature dataset and processing steps are described in detail in Menne et al. (2009) and more briefly below.

In October 2012, a revision to the version 2.0 dataset was released as version 2.5. The version 2.5 processing steps are essentially the same as in version 2.0, but the version number change reflects modifications to the underlying database as well as coding changes to the pairwise homogenization algorithm (PHA) that improve its overall efficiency. Table 1 below lists these modifications. NCEI Technical Reports GHCNM-12-01R (Williams et al. 2012a) and GHCNM-12-02 (Williams et al. 2012b) provide details regarding the PHA modifications.

Table 1. Differences between USHCN version 2.0 and version 2.5

Database construction and quality control

Version 2.0: Monthly mean maximum and minimum temperatures (and total precipitation) were calculated using three daily datasets archived at NCDC (DSI-3200, DSI-3206 and DSI-3210). The daily values were first subject to the quality control checks described in Menne et al. (2009), and only those values that passed the evaluation checks were used to compute monthly average temperatures. Monthly averages were computed only when no more than 9 daily values were missing or flagged by the quality checks. Monthly values calculated from the three daily data sources were then merged with two additional sources of monthly data (DSI-3220 and the USHCN version 1) to form a more comprehensive dataset of serial monthly temperature and precipitation values for each HCN station. Duplicate records between data sources were eliminated and values from the daily sources were used in favor of values from the two monthly sources. DSI-3200 was used in favor of the USHCN v1 database. Monthly values were subject to a separate suite of checks as described in Menne et al. (2009).

Version 2.5: Monthly mean maximum and minimum temperatures (and total precipitation) are calculated using GHCN-Daily (Menne et al. 2012). The daily values are first subject to the quality control checks described in Durre et al. (2010). Only those values that pass the GHCN-Daily QC checks are used to compute the monthly values. Further, a monthly mean is calculated only when nine or fewer daily values are missing or flagged. Monthly values calculated from GHCN-Daily are merged with the USHCN version 1 monthly data to form a more comprehensive dataset of serial monthly temperature and precipitation values for each HCN station. Duplicate records between data sources are eliminated and values from GHCN-Daily are used in favor of values from the USHCN version 1 raw database. USHCN version 1 data comprise about 5% of station months, generally in the earliest years of the station records. Monthly mean temperature values are then subject to an additional set of monthly QC tests as described in Lawrimore et al. (2011).

Pairwise Homogenization Algorithm (PHA) Version Number

Version 2.0: 52d (source code)

Version 2.5: 52i (source code)

Re-processing frequency

Version 2.0: The raw database was constructed in 2006 using the sources described above (and in Menne et al. 2009) and updated thereafter with monthly values computed from GHCN-Daily. The temperature data were last homogenized by the PHA algorithm in May 2008. Since May 2008, more recent data have been added to the homogenized database using the monthly values computed from GHCN-Daily (but without re-homogenizing the dataset).

Version 2.5: The raw database is routinely reconstructed using the latest version of GHCN-Daily, usually each day. The full period of record monthly values are re-homogenized whenever the raw database is reconstructed (usually once per day).

Data format

Version 2.0: Six-digit station identification number. One data value flag (see the version 2 readme.txt file for details).

Version 2.5: Eleven-digit station identification number similar to that used in GHCN-Daily. A network code of ‘H’ is used and the last six characters of the ID are the COOP identification number. Three flags accompany each monthly value (data measure flag, data quality flag, data source flag) as in GHCN-Monthly version 3 (see the version 2.5 readme.txt file for details).

Version Control/Time Stamping

Version 2.0: Data files are labeled with the latest available data month (e.g., 9641C_yyyymm.dataset.element.gz, where yyyy=year and mm=month; dataset=raw, tob, or F52; and element=max, min, avg, or pcp).

Version 2.5: Data files are all of the format ushcn.element.latest.dataset.tar.gz, where element=tmax, tmin, tavg, or prcp, and dataset=raw, tob, or FLs.52i. The data will untar/uncompress into a directory called ushcn.version.date, where version=2.5.0 and date=the year, month and day (yyyymmdd) that the data were last reprocessed and updated.
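
The completeness rule in the first row of Table 1 (a monthly mean is calculated only when nine or fewer daily values are missing or flagged) can be illustrated with a short Python sketch. The function name `monthly_mean`, the use of `None` to mark a missing or QC-flagged day, and the sample values are assumptions made for illustration; this is not the USHCN processing code.

```python
from typing import Optional, Sequence

# From Table 1: a monthly mean needs nine or fewer missing/flagged daily values.
MAX_MISSING_DAYS = 9


def monthly_mean(daily_values: Sequence[Optional[float]]) -> Optional[float]:
    """Return the mean of the valid daily values, or None if too many are missing.

    None stands for a day that is missing or was flagged by quality control
    (an assumed convention for this sketch).
    """
    valid = [v for v in daily_values if v is not None]
    missing = len(daily_values) - len(valid)
    if missing > MAX_MISSING_DAYS or not valid:
        return None
    return sum(valid) / len(valid)


# Synthetic 30-day months, not real station data.
print(monthly_mean([20.0] * 21 + [None] * 9))   # 9 missing days: prints 20.0
print(monthly_mean([20.0] * 20 + [None] * 10))  # 10 missing days: prints None
```

The threshold is kept as a named constant so that the rule stated in the table stays visible in one place.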
 

A brief summary of version 2 processing steps is provided below. A more comprehensive summary, including discussions of the sources and magnitude of bias in the raw (unadjusted) data, is provided in Menne et al. (2009). An assessment specifically addressing the reliability of the USHCN temperature trends in light of station siting concerns is also provided below and in more detail by Menne et al. (2010). Details of the pairwise homogenization algorithm and its evaluation against synthetic benchmark datasets with realistic bias-error scenarios are provided in Williams et al. (2012c). A comparison of the USHCN v2 homogenization algorithm (version 52d) and other approaches to homogenization is provided in Venema et al. (2012). A comparison of adjusted and unadjusted USHCN temperature trends with a number of atmospheric reanalysis datasets is described in Vose et al. (2012).

Station Siting and U.S. Surface Temperature Trends

Photographic documentation of poor siting conditions at stations in the USHCN has led to questions regarding the reliability of surface temperature trends over the conterminous U.S. (CONUS). To evaluate the potential impact of poor siting/instrument exposure on CONUS temperatures, Menne et al. (2010) compared trends derived from poorly sited and well-sited USHCN stations using both unadjusted and bias-adjusted data. Results indicate that there is a mean bias associated with poor exposure sites relative to good exposure sites in the unadjusted USHCN version 2 data; however, this bias is consistent with previously documented changes associated with the widespread conversion to electronic sensors in the USHCN during the last 25 years (Menne et al. 2009). Moreover, the sign of the bias is counterintuitive to photographic documentation of poor exposure because associated instrument changes have led to an artificial negative ("cool") bias in maximum temperatures and only a slight positive ("warm") bias in minimum temperatures.

Adjustments applied to USHCN version 2 data largely account for the impact of instrument and siting changes, although a small overall residual negative (“cool”) bias appears to remain in the adjusted USHCN version 2 CONUS average maximum temperature (see also Fall et al. 2011). Nevertheless, the adjusted USHCN CONUS temperatures are well aligned with recent measurements from the U.S. Climate Reference Network (USCRN). This network was designed with the highest standards for climate monitoring and has none of the siting and instrument exposure problems present in USHCN. The close correspondence in nationally averaged temperature from these two networks is further evidence that the adjusted USHCN data provide an accurate measure of the U.S. temperature.

The Menne et al. (2010) results underscore the need to consider all changes in observation practice when determining the impacts of siting irregularities. Further, the influence of non-standard siting on temperature trends can only be quantified through an analysis of the data, and the data do not indicate that the CONUS average temperature trends are inflated due to poor station siting.
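
As a rough illustration of what comparing trends between station groups involves, the sketch below fits an ordinary least-squares trend to two annual mean series and takes the difference. It is a simplified stand-in, not the procedure used by Menne et al. (2010); the function names and the four-year synthetic series are assumptions made for this example.

```python
from typing import Sequence


def linear_trend(years: Sequence[float], temps: Sequence[float]) -> float:
    """Ordinary least-squares slope of temps against years (degrees per year)."""
    n = len(years)
    if n < 2 or n != len(temps):
        raise ValueError("need at least two paired (year, temperature) values")
    mean_x = sum(years) / n
    mean_y = sum(temps) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, temps))
    return sxy / sxx


def trend_difference(
    years: Sequence[float],
    poorly_sited: Sequence[float],
    well_sited: Sequence[float],
) -> float:
    """Trend of the poorly sited series minus trend of the well-sited series."""
    return linear_trend(years, poorly_sited) - linear_trend(years, well_sited)


# Synthetic annual means for illustration only (not USHCN data).
years = [2000, 2001, 2002, 2003]
well_sited = [10.0, 10.1, 10.2, 10.3]
poorly_sited = [10.0, 10.05, 10.1, 10.15]
print(round(trend_difference(years, poorly_sited, well_sited), 3))
```

A negative difference in this synthetic case means the poorly sited series warms more slowly than the well-sited one. The sign convention (poor minus good) is a choice made for the sketch.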

Scientists used four sets of USHCN stations in the Menne et al. (2010) analysis. Set 1 includes stations identified as having good siting by the volunteers at surfacestations.org. Set 2 is a subset of Set 1 consisting of the Set 1 stations whose ratings are in general agreement with an independent assessment by NOAA’s National Weather Service. Set 3 comprises those stations with moderate to poor siting ratings according to surfacestations.org. Set 4 is a subset of Set 3 consisting of the Set 3 stations whose ratings are in agreement with an independent assessment by NOAA’s National Weather Service. For further information, please see Menne et al. (2010). The sets of Maximum Minimum Temperature Sensor (MMTS) stations and Cotton Region Shelter (Stevenson Screen) sites used in Menne et al. (2010) are also available (see the "readme.txt" file as described below for a description of the station list format). Access to the unadjusted, time of observation adjusted, and fully adjusted USHCN version 2 temperature data is described below.
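
The relationship between the four station sets can be expressed with set operations, as in the sketch below. The rating labels `"good"` and `"poor"`, the boolean treatment of agreement with the National Weather Service assessment, and the station identifiers are all hypothetical; the actual station lists and their format are described in the readme.txt file.

```python
from typing import Dict, Set


def build_station_sets(
    siting: Dict[str, str], nws_agrees: Set[str]
) -> Dict[str, Set[str]]:
    """Derive Sets 1 to 4 from siting ratings and an agreement list.

    siting maps a station ID to "good" or "poor" (assumed labels standing in
    for the surfacestations.org ratings); nws_agrees holds the IDs whose
    rating agrees with the independent NWS assessment.
    """
    set1 = {sid for sid, rating in siting.items() if rating == "good"}
    set3 = {sid for sid, rating in siting.items() if rating == "poor"}
    return {
        "set1": set1,
        "set2": set1 & nws_agrees,  # subset of Set 1
        "set3": set3,
        "set4": set3 & nws_agrees,  # subset of Set 3
    }


# Hypothetical station IDs for illustration only.
ratings = {"stn_a": "good", "stn_b": "good", "stn_c": "poor", "stn_d": "poor"}
sets = build_station_sets(ratings, nws_agrees={"stn_a", "stn_c"})
print(sorted(sets["set2"]), sorted(sets["set4"]))  # prints ['stn_a'] ['stn_c']
```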

Data Access

U.S. HCN version 2.5 monthly data are available via ftp. Users are requested to cite the version number and timestamp of the dataset when USHCN v2.5 data are used for analysis (as well as Menne et al. 2009). Please see the "readme.txt" file in the v2.5 directory for information on downloading and reading the v2.5 data. Information about the processing of the USHCN v2.5 data is provided in the status.txt file. U.S. HCN Version 2 monthly data will continue to be updated through 2012 and will be available in static form thereafter; however, users are encouraged to make the transition to Version 2.5 as soon as possible.
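
Because users are asked to cite the version number and timestamp, it can help to derive both from the names of the downloaded files. The sketch below follows the naming pattern given in Table 1 (ushcn.element.latest.dataset.tar.gz, unpacking into ushcn.version.date). The function names are assumptions, and the example directory name is constructed from that pattern rather than copied from the server, so the readme.txt file remains the authority on the exact layout.

```python
from datetime import date, datetime
from typing import Tuple

# Allowed values as listed in Table 1 for version 2.5.
ELEMENTS = ("tmax", "tmin", "tavg", "prcp")
DATASETS = ("raw", "tob", "FLs.52i")


def archive_name(element: str, dataset: str) -> str:
    """Build the archive file name for one element and dataset."""
    if element not in ELEMENTS:
        raise ValueError(f"unknown element: {element!r}")
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return f"ushcn.{element}.latest.{dataset}.tar.gz"


def parse_release_dir(dirname: str) -> Tuple[str, date]:
    """Split an unpacked directory name into (version, reprocessing date).

    Assumes the name is 'ushcn.' + version + '.' + yyyymmdd.
    """
    prefix = "ushcn."
    if not dirname.startswith(prefix):
        raise ValueError(f"unexpected directory name: {dirname!r}")
    version, stamp = dirname[len(prefix):].rsplit(".", 1)
    return version, datetime.strptime(stamp, "%Y%m%d").date()


print(archive_name("tavg", "FLs.52i"))  # prints ushcn.tavg.latest.FLs.52i.tar.gz
# Hypothetical directory name built from the documented pattern.
print(parse_release_dir("ushcn.2.5.0.20121001"))
```

Splitting on the last dot keeps the version string intact even though it contains dots itself.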

Pairwise Homogeneity Adjustment Software

The automated pairwise homogenization algorithm (PHA) software (Menne and Williams 2009), version 52i, used to detect and adjust for documented and undocumented inhomogeneities in the U.S. HCN version 2.5 monthly temperature dataset, is available via ftp. PHA version 52d, used to adjust the v2 dataset, is also available to users. Please refer to the README text file in these directories for guidance on how to download, uncompress, compile and run the pairwise homogenization software. The "tar/gzipped" file contains all of the necessary software to run the pairwise homogenization procedure. A simulated test dataset is included with the software along with a file of the expected output that can be used to verify proper execution of the code.
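
One way to check a run against the supplied expected output is a line-by-line text comparison, sketched below. The function name and the file paths passed to it are assumptions, and an exact text match may be stricter than intended if output differs slightly between compilers; the README in the software directory should be treated as the authoritative procedure.

```python
from itertools import zip_longest
from pathlib import Path
from typing import List


def differing_lines(produced: Path, expected: Path) -> List[int]:
    """Return the 1-based line numbers at which two text files differ.

    Trailing whitespace is ignored. A line present in only one file
    counts as a difference.
    """
    produced_lines = produced.read_text().splitlines()
    expected_lines = expected.read_text().splitlines()
    pairs = zip_longest(produced_lines, expected_lines)
    return [
        number
        for number, (got, want) in enumerate(pairs, start=1)
        if got is None or want is None or got.rstrip() != want.rstrip()
    ]
```

An empty list from `differing_lines(Path("my_output.txt"), Path("expected_output.txt"))` would indicate that the run reproduced the reference output; both file names here are placeholders.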