Frequently Asked Questions - Data FAQ

IMERG

IMERG provides a data field that estimates the probability that the retrieved precipitation amount is “liquid”, which is defined to include “mixed” (liquid and solid) precipitation.  In retrospect the field name should have been “ice”, but “liquid” had already been set.  The rationale is that mixed precipitation is very rare and transient, so it should be lumped with either “liquid” or “ice”.  Furthermore, the primary effects of “ice” are 1) to prevent the falling precipitation from immediately entering the hydrological system (until it melts), and 2) to create (potentially) dangerous travel conditions.  “Mixed” typically ends up not creating either of these effects, so lumping it with “liquid” seems appropriate.

Even given this basic definition, there are numerous forms of precipitation, and it might not be obvious how they end up being classified in IMERG.  The key fact is that the phase is computed diagnostically at present, based on work by Guosheng Liu (Florida State University) and students.  The Liu scheme uses data from a numerical model or model analysis to compute a “specification”, without reference to the satellite data, including whether or not IMERG estimates that precipitation is occurring, or even possible to estimate.  Thus, probabilityLiquidPrecipitation (pLP) is a globally complete field whenever the relevant model data exist.  An additional factor is that there is a conceptual difference between how the half-hourly phase is computed and how phase is defined in this probability framework for the monthly data.  We will handle the half-hourly first, for which the Liu specification equation is directly relevant.

Liu determined that the primary factor for phase is the surface wet bulb temperature (Tw), a combination of temperature and humidity, with small contributions from the low-altitude Tw lapse rate and the surface pressure, and with systematic differences between ocean and land areas.  In practice, the fitted probability as a function of Tw is converted to separate look-up tables for ocean and land.

Typical results for different forms of precipitation are:

  • Rain:  Ordinary falling liquid typically happens for Tw>0°C, so pLP is high.
  • Freezing Rain:  Liquid that freezes upon contact with the Earth's surface typically falls in Tw<0°C, so pLP is low.
  • Snow, ice pellets, snow pellets:  These frozen hydrometeors occur around or below Tw=0°C, so pLP varies from around 50% to very low.
  • Sleet:  Frozen droplets (U.S. definition) typically fall in Tw<0°C, so pLP is usually below 50%.
  • Mixed snow and rain; falling slush:  The mixed category is likely to occur around the pLP=50% mark.  If one uses 50% as a liquid/solid threshold, that implies that mixed cases will end up in both categories, depending on the details.
  • Hail:  Hail typically occurs when the surface air temperature is well above freezing (i.e., on summer afternoons).  Thus, pLP is very high.  However, hail is even rarer than mixed precipitation and is unlikely to be correctly specified in this scheme; in any case, in such conditions it rapidly melts and so is properly “mixed”.
  • Dew and frost:  These phenomena are not forms of precipitation.  They are liquid or solid water that condenses directly at the Earth's surface.  For this reason, any amount of surface accumulation due to dew or frost is not included in the IMERG precipitation estimate.
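
The 50% liquid/solid threshold mentioned for the mixed category can be sketched as follows. This is only an illustration, not part of IMERG: the function name classify_phase and the default threshold are assumptions, and pLP is taken to be a percentage from 0 to 100.

```python
def classify_phase(plp_percent, threshold=50.0):
    """Label a pLP value (in percent) as 'liquid' or 'solid'.

    Hypothetical helper: 'liquid' here includes mixed precipitation,
    following the definition of pLP given in this FAQ.
    """
    if not 0.0 <= plp_percent <= 100.0:
        raise ValueError("pLP must be between 0 and 100 percent")
    # Values at or above the threshold are treated as liquid (or mixed).
    return "liquid" if plp_percent >= threshold else "solid"


for plp in (95.0, 50.0, 10.0):
    print(plp, classify_phase(plp))
```

Whether a value of exactly 50% counts as liquid is a choice made here for the example; a user could equally well place it in the solid category.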

As the time interval for the data values lengthens, it becomes increasingly likely that both liquid and solid might have fallen, at which point the meaning of pLP should change to “what fraction of the estimated precipitation amount fell as liquid or mixed?”  This is the definition of pLP for both the monthly IMERG Final Run pLP and the set of GIS IMERG files (TIFF+WorldFile) providing estimated accumulations longer than three hours.
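
One plausible reading of "what fraction of the estimated precipitation amount fell as liquid or mixed" is an accumulation-weighted average of the shorter-interval pLP values. The sketch below follows that reading; the function name and the handling of zero accumulation are assumptions, and the actual IMERG computation may differ in detail.

```python
def accumulation_weighted_plp(precip_mm, plp_percent):
    """Return the percentage of total accumulation that fell as liquid or mixed.

    precip_mm   -- per-interval precipitation accumulations (mm)
    plp_percent -- per-interval pLP values (percent), same length
    Returns None when no precipitation fell, since the fraction is undefined.
    """
    if len(precip_mm) != len(plp_percent):
        raise ValueError("inputs must have the same length")
    total = sum(precip_mm)
    if total == 0:
        return None
    # Weight each interval's pLP by the amount that fell in that interval.
    liquid = sum(p * plp / 100.0 for p, plp in zip(precip_mm, plp_percent))
    return 100.0 * liquid / total


# Three intervals: 2 mm at 100%, 1 mm at 0%, 1 mm at 50%.
print(accumulation_weighted_plp([2.0, 1.0, 1.0], [100.0, 0.0, 50.0]))
```

In this example 2.5 mm of the 4 mm total is counted as liquid or mixed, so the result is 62.5%.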

The following table provides a quick reference for the IMERG variables that can be visualized using Giovanni.

Product

Variable and Description

GPM_3IMERGHHE

30-min averaged data

Merged microwave-only precipitation estimate [Final]

Precipitation estimates from combining microwave data from the GMI, TMI, and other partner instruments.

Random error for gauge-calibrated multi-satellite precipitation [Final, Early, Late]

This is an estimate of the non-systematic component of the error. The exact variable name depends on the product, but all begin with "Random error..."

Microwave satellite observation time [Final]

Observation time of the microwave precipitation estimates given as minutes from the beginning of the current half-hour.

Microwave satellite source identifier [Final]

This is an integer between 0 and 24 that corresponds to the instrument from which the microwave precipitation estimate was taken.

Weighting of IR-only precipitation relative to the morphed merged microwave-only precipitation estimate [Final]

This is the weighting of the infrared data in the final merged estimate, given in percent. Zero means either no IR weighting or no precipitation.

IR-only precipitation estimate [Final]

This is the microwave-calibrated infrared precipitation estimate.

Multi-satellite precipitation estimate with climatological gauge calibration [Final, Early, Late]

This is the precipitation estimate that has been calibrated with gauge data. This variable is recommended for most users. Note: Climatological gauge calibration is used in Early and Late.

Multi-satellite precipitation estimate [Final]

This is the precipitation estimate that has not been calibrated with gauge data.

Accumulation-weighted probability of liquid precipitation phase [Final]

This is the probability of liquid precipitation. The probabilities are calculated globally regardless of whether precipitation is actually present.

   

GPM_3IMERGM

1 month averaged data

Weighting of observed gauge precipitation relative to the multi-satellite precipitation estimate

This is the percent weighting of the surface gauge data.

Merged satellite-gauge precipitation estimate

This is the precipitation estimate that has been calibrated with gauge data. This variable is recommended for most users.

Accumulation-weighted probability of liquid precipitation phase

This is the probability of liquid precipitation. The probabilities are calculated globally regardless of whether precipitation is actually present.

Random error for merged satellite-gauge precipitation

This is an estimate of the non-systematic component of the error.

GPM began a phased release of Version 4 products in March 2016, at which point the Version 3 IMERG Final Run no longer had consistent input.  In fact, it was necessary to stop it at the end of January.  Version 4 IMERG is currently scheduled to be ready around November or December 2016, at which point the products will be retrospectively processed for the GPM era (starting in March 2014).

The main difference between the IMERG Early and Late Runs is that the Early Run only has forward propagation (which basically amounts to extrapolation), while the Late Run has both forward and backward propagation (allowing interpolation).  As well, the additional 10 hours of latency allows lagging data transmissions to make it into the Late Run, even if they were not available for the Early Run (see below).

There are two possible factors which contribute to differences in the IMERG Late Run and Final Run datasets:

  1. The Late Run uses a climatological adjustment that incorporates gauge data.  In Version 4 and later (scheduled to be available in November-December 2016), this will be a climatological adjustment to the Final Run, which includes gauge data at the monthly scale.  For Version 3 (which is the currently available data) the TRMM V7 climatological adjustment of the TMPA-RT to the production TMPA is used (which includes gauge data at the monthly scale) because this at-launch algorithm didn't yet have any Late and Final data from which to build the climatological adjustment.  The Final Run uses a month-to-month adjustment to the monthly Final Run product, which combines the multi-satellite data for the month with GPCC gauge data.  Its influence in each half hour is a ratio multiplier that's fixed for the month, but spatially varying.
  2. The Late Run is computed about 15 hours after observation time, so sometimes a microwave overpass is not delivered in time for the Late Run, but subsequently comes in and can be used in the Final.  This would affect both the half hour in which the overpass occurs, and (potentially) morphed values in nearby half hours.

The difference over the oceans has to be the second, while the difference over many land areas could be either.  The satellite sensor difference could be examined by comparing the satellite sensor data field in the Late and Final Run datasets for each half hour.  Since the gauge adjustment is a constant multiplier, a time series should show a constant ratio between the Late and the Final Runs for the entire month (except for cases where the satellite sensor is changing, just as for the ocean).
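
The constant-ratio check described above can be sketched for a single grid box as follows. This is a minimal illustration under stated assumptions: the function names and the 1% tolerance are hypothetical, the inputs are plain lists of half-hourly values for one month, and half hours with zero precipitation in either run are skipped because the ratio is undefined there.

```python
def late_final_ratios(late, final):
    """Return Final/Late ratios for half hours where both values are positive."""
    return [f / l for l, f in zip(late, final) if l > 0 and f > 0]


def is_constant_ratio(ratios, rel_tol=0.01):
    """True if every ratio lies within rel_tol of the first one."""
    if not ratios:
        return True
    ref = ratios[0]
    return all(abs(r - ref) <= rel_tol * abs(ref) for r in ratios)


# Made-up values: the Final Run is a fixed multiple of the Late Run.
late = [1.0, 0.0, 2.0, 4.0]
final = [1.2, 0.0, 2.4, 4.8]
print(is_constant_ratio(late_final_ratios(late, final)))
```

Half hours that break the constant ratio are the ones to compare against the satellite sensor data field, since a change of sensor is the other expected source of difference.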

We always advise people to use the Final Run for research, but, to be realistic, with such a short record the extra months of Late Run might outweigh the risk of using less-accurate data.  The vast majority of grid boxes have fairly similar Late and Final values, so it makes sense to stick to metrics that are more resistant to occasional data disturbances than others.  Extreme values are more sensitive to these details; medians, means, and root-mean-square difference are less sensitive.
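
To see how these metrics respond to a single disturbed value, a small comparison can be sketched as below. The function name and the sample numbers are made up for illustration; only the Python standard library is used.

```python
import math
import statistics


def comparison_metrics(late, final):
    """Summarize Late-minus-Final differences for matching data values."""
    diffs = [l - f for l, f in zip(late, final)]
    return {
        "mean_diff": statistics.mean(diffs),
        "median_diff": statistics.median(diffs),
        "rmsd": math.sqrt(statistics.mean([d * d for d in diffs])),
        "max_abs_diff": max(abs(d) for d in diffs),
    }


# Made-up values: identical except for one disturbed data value.
late = [1.0, 2.0, 3.0, 10.0]
final = [1.0, 2.0, 3.0, 2.0]
print(comparison_metrics(late, final))
```

Here the one disturbed value sets the maximum difference (8), while the root-mean-square difference (4) and mean difference (2) are damped and the median difference (0) is unaffected.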

GPM project data sets, including the Core Observatory and constellation partner sensor data sets and national data sets (including multi-satellite data sets), have been released to the public and are available for download now (see the table of GPM data products). These initial releases are being computed for the GPM era (February 2014 to present) using pre-launch calibrations.

Subsequently, a general reprocessing will upgrade the algorithms to fully GPM-based calibrations.  This is scheduled to occur in September 2015 for Core Observatory and partner data sets, and in January 2016 for the U.S. multi-satellite algorithm (Integrated Multi-Satellite Retrievals for GPM; IMERG). After about a year of additional development work, the data sets will be retrospectively processed back to the start of TRMM (January 1998).

The resolution of Level 0, 1, and 2 data is determined by the footprint size and observation interval of the sensors involved.  Level 3 products are given a grid spacing that is driven by the typical footprint size of the input data sets. See the table of GPM & TRMM Data Downloads for details on the resolution of each specific product.

There are several sources for downloading and viewing data which allow you to subset the data by longitude and latitude. These include the Simple Subset Wizard, Giovanni, and STORM. In the new Giovanni 4 you can also now obtain data for a specific country, U.S. state, or watershed by using the "Show Shapes" option in the "Select Region" pane.

Browse our tables of GPM & TRMM data downloads to locate your desired algorithm, then click on the link in the algorithm description that says "Full Documentation".

The transition from the Tropical Rainfall Measuring Mission (TRMM) data products to the Global Precipitation Measurement (GPM) mission products has begun. The TMPA products will be replaced by the Integrated Multi-satellitE Retrievals for GPM (IMERG) products.  It is tentatively planned to continue computing the TMPA products throughout the transition, into Spring 2017. Click here for more details on this transition.

GPM data products can be divided into two groups (real-time and production) depending on how soon they are created after the satellite collects the observations. For applications such as weather, flood, and crop forecasting that need precipitation estimates as soon as possible, real-time data products are most appropriate.  GPM real-time products are generally available within a few hours of observation.  For all other applications, production data products are generally the best data sets to use because additional or improved inputs are used to increase accuracy.  These other inputs are only made available several days, or in some cases, several months, after the satellite observations are taken, and the production data sets are computed after all data have arrived, making possible a more careful analysis.

The TRMM FTP has a Climatology directory which contains files in the TRMM Composite Climatology developed by Wang, Adler, Huffman, and Bolvin.  A journal article on this topic is available here: http://journals.ametsoc.org/doi/abs/10.1175/JCLI-D-13-00331.1. Pre-generated world maps of TRMM climatology data are also available here.

Yes, in line with NASA's general data policy. Please refer to the GPM Data Policy for further details.

The data set source should be acknowledged when the data are used.  One standard format for a formal reference is:

Dataset authors/producers, data release date: Dataset title, version. Data archive/distributor, access date in standard AMS format, data locator/identifier (doi or URL).

For example:

G. Huffman, D. Bolvin, D. Braithwaite, K. Hsu, R. Joyce, P. Xie, 2014: Integrated Multi-satellitE Retrievals for GPM (IMERG), version 4.4. NASA's Precipitation Processing Center, accessed 31 March, 2015, ftp://arthurhou.pps.eosdis.nasa.gov/gpmdata/

For more details on citation format, please refer to the American Meteorological Society Data Archiving and Citation guidelines: http://www2.ametsoc.org/ams/index.cfm/publications/authors/journal-and-bams-authors/journal-and-bams-authors-guide/data-archiving-and-citation/

In the case of data sets that have not been given DOIs, the most persistent "landing page" should be named as the "data locator", for example,

http://disc.gsfc.nasa.gov/datacollection/3B42.V7.html 

As an “Acknowledgment”, one possible wording is:

"The <dataset name> data were provided by the NASA/Goddard Space Flight Center's <team's organization> and PPS, which develop and compute the <dataset name> as a contribution to <project (TRMM or GPM)>, and archived at the NASA GES DISC."

For any given TMPA data set, each data value provides a precipitation rate based on one (or perhaps two) satellite snapshots during the TMPA’s 3-hour analysis period. IMERG values are based on a single snapshot during its half-hour analysis period, or a morphed interpolation if no microwave values are available.  The values are expressed in the intensive units mm/hr; it is usually best to assume that this rate applies for the entire 3- or half-hour period.  If you wish to regrid to a finer time and/or space grid, note that many interpolation schemes have the property of suppressing maxima in precipitation and expanding rain events into neighboring zero-amount periods.
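
Assuming the rate applies for the whole analysis period, as suggested above, converting rates in mm/hr to accumulations in mm is a matter of multiplying by the period length. The sketch below uses made-up rates; the function and constant names are illustrative only.

```python
IMERG_PERIOD_HOURS = 0.5  # half-hour IMERG analysis period
TMPA_PERIOD_HOURS = 3.0   # 3-hour TMPA analysis period


def accumulation_mm(rate_mm_per_hr, period_hours):
    """Accumulation (mm), assuming the rate holds for the entire period."""
    return rate_mm_per_hr * period_hours


# Four consecutive half-hourly IMERG rates (mm/hr), made up for illustration.
half_hourly_rates = [0.0, 2.0, 4.0, 0.0]
total = sum(accumulation_mm(r, IMERG_PERIOD_HOURS) for r in half_hourly_rates)
print(total)  # two-hour accumulation in mm
```

The same 2.0 mm/hr rate corresponds to 1 mm in an IMERG half hour but 6 mm in a TMPA 3-hour period, so the period length must match the product.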

Yes, you can download a subset of the parameters of a given data product using the Simple Subset Wizard. Several other data sources also provide spatial subsetting, including Giovanni and STORM.

The GPM satellite constellation observes precipitation as it is falling, and maintains a database of precipitation records dating back to 1998. GPM is primarily focused on obtaining the highest quality precipitation measurements and studying fundamental atmospheric processes, and thus we do not focus on forecasting or predicting the weather. However, the near-real-time data collected by GPM are ingested into computer models by operational agencies such as the NWS and the ECMWF, who use them to improve their weather forecasts. Please visit the NWS and ECMWF websites for further information:

http://www.weather.gov/

http://www.ecmwf.int/

Although the TRMM satellite is no longer in service, the 3B42 series of algorithms will continue to be run using other satellites in the constellation to produce data products that are consistent with the long-term records.  The current plan is to continue production into mid-2017 to give users time to transition to the newer IMERG multi-satellite data products. For more details about the status of 3B42 and the transition to IMERG, please refer to this document: http://pmm.nasa.gov/sites/default/files/document_files/TMPA-to-IMERG_transition.pdf 

Please refer to our tables of TRMM and GPM data downloads:

First locate the data product that meets your needs, then look to the “Format” column to find the appropriate link to download data in your desired format. 

In general, GPM data products are named using the following format: 

[algorithm level].[satellite].[instrument].[algorithm name].[year / month / date]-S[data start time hr/min/sec UTC]-E[data end time UTC].[sequence indicator showing orbit # (L2) or day/month (L3)].[algorithm version].[data format]

For example:

2A.GPM.GMI.GPROF2008.20131101-S235152-E012400.000352.V03C.HDF5

This is a Level 2A product, using the GPM satellite's GMI sensor, using the "GPROF 2008" algorithm, showing data from November 1st 2013 starting at 23:51:52 UTC and ending at 01:24:00 UTC, orbit number 352, using version 03C of the algorithm, in HDF5 format.
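
The example above can be taken apart programmatically. The following is a minimal sketch based only on that one example: the function name, the dictionary keys, and the rule that an end time earlier than the start time falls on the next day are assumptions, and the full naming convention document remains the authority.

```python
from datetime import datetime, timedelta


def parse_gpm_filename(name):
    """Split a GPM product file name into its labeled fields."""
    parts = name.split(".")
    if len(parts) != 8:
        raise ValueError(f"expected 8 dot-separated fields, got {len(parts)}")
    level, satellite, instrument, algorithm, when, sequence, version, fmt = parts

    # The date/time field looks like 20131101-S235152-E012400.
    pieces = when.split("-")
    if len(pieces) != 3 or pieces[1][:1] != "S" or pieces[2][:1] != "E":
        raise ValueError(f"unexpected date/time field: {when!r}")
    date_str, start_str, end_str = pieces
    start = datetime.strptime(date_str + start_str[1:], "%Y%m%d%H%M%S")
    end = datetime.strptime(date_str + end_str[1:], "%Y%m%d%H%M%S")
    # Assumption: an end time earlier than the start time means the
    # granule crossed midnight UTC, so the end falls on the next day.
    if end < start:
        end += timedelta(days=1)

    return {
        "level": level,
        "satellite": satellite,
        "instrument": instrument,
        "algorithm": algorithm,
        "start": start,
        "end": end,
        "sequence": sequence,
        "version": version,
        "format": fmt,
    }


info = parse_gpm_filename(
    "2A.GPM.GMI.GPROF2008.20131101-S235152-E012400.000352.V03C.HDF5"
)
print(info["start"], "to", info["end"], "orbit", int(info["sequence"]))
```

The sequence field is kept as a string because its meaning differs between Level 2 (orbit number) and Level 3 (day or month) products.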

For a more detailed explanation of GPM file naming conventions, please refer to the following document: File Naming Convention for Precipitation Products For the Global Precipitation Measurement (GPM) Mission

The GPROF retrieval uses all the GMI channels, but these channels are recorded by multiple feed horns on the instrument, which produce data with slightly different geolocations that are systematically offset from each other.  Thus, only the region with overlapping data can support GPROF retrievals.  The Core Observatory data are downlinked via the NASA Tracking and Data Relay Satellite System (TDRSS) communications satellite system to the NASA White Sands Test Facility in New Mexico, and networked to PPS at NASA/Goddard as 5-minute packets.  So, for GPROF to create retrievals across the entire granule, the previous and following granules are required to give all the channels over a packet's entire area of coverage.  And, to satisfy the need for "real time" production, the retrieval is run no more than 11 minutes after the last observation time.  (The maximum delay was recently adjusted to accommodate changes in GPROF run times.)

Episodically, the Core Observatory is out of sight of the TDRSS satellites.  Depending on orbital details, the gap can be 20 minutes, or as long as an hour.  When this happens, the last packet before a gap will lack timely access to the following packet and the last several scans will not have all the necessary channel data.  The result is several scans of "missing" retrievals at the end of the granule.  There are, of course, other ways in which scans of "missing" retrievals can occur, but the issue described is the most common.

First, ensure you have registered your email address with the PPS using this webpage: http://registration.pps.eosdis.nasa.gov/registration/

Once registered, your email address will serve as both your username AND password for logging into the FTP site. Email addresses are converted to lower case when registering, so please enter your username and password in lowercase as well.
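
For scripted access, the login rule above can be sketched with Python's standard ftplib module. This is an illustration only: the helper names are hypothetical, the host and directory are taken from the example citation elsewhere in this FAQ and may not match the server you need, and the listing function is defined but not called here because it requires a registered address and a network connection.

```python
from ftplib import FTP


def pps_credentials(email):
    """Return (username, password) for PPS: the lowercased email for both."""
    address = email.strip().lower()
    return address, address


def list_pps_directory(email, host="arthurhou.pps.eosdis.nasa.gov", path="gpmdata"):
    """List one directory on the PPS FTP server (host and path are assumptions)."""
    user, password = pps_credentials(email)
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        ftp.cwd(path)
        return ftp.nlst()


# Only the credential handling is demonstrated; no connection is made.
print(pps_credentials("Jane.Doe@Example.org"))
```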

If you need access to the near-realtime (NRT) GPM files on ftp://jsimpson.pps.eosdis.nasa.gov, please be sure to check the box labelled "Near-Realtime Products". Otherwise you will be unable to log in to the NRT FTP server.

If you have already registered but would like to change your account details (such as adding access to NRT products) please visit this page and click "Verify Email or Update Info": http://registration.pps.eosdis.nasa.gov/registration/

The PPS FTP does not work with the Safari web browser due to the way it handles FTP authentication. It is recommended you use another web browser such as Chrome or Firefox, or use the command line or a dedicated FTP client application (click here for a list of possible FTP clients; NASA does not endorse any of these applications, which are listed merely as suggestions).

Please visit the "PPS Satellite-Ground Coincidence Finder" website from NASA's Precipitation Processing System: https://storm.pps.eosdis.nasa.gov/storm/data/Service.jsp?serviceName=OverflightFinder#events

This tool allows you to select a location, date range, and satellite to determine when the satellite has passed over that location within those dates.