NHC Tropical Cyclone Forecast Verification
Updated 1 March 2012
Contents
- Introduction
- Forecast verification procedures
- Annual NHC verification reports
- Official five-year mean errors and distributions
- Official error trends
- Model error trends
- NHC official forecast error database
- Performance measures and goals
- References
6. Model error trends
Numerous objective forecast aids (guidance models) are available to help
the NHC Hurricane Specialists in the preparation of their official track and intensity forecasts. Guidance models are
characterized as either early or late, depending on whether or not they are available to the Hurricane Specialist
during the forecast cycle. For example, consider the 1200 UTC (12Z) forecast cycle, which begins with the 12Z
synoptic time and ends with the release of the official forecast at 15Z. The 12Z run of the NWS/Global Forecast
System (GFS) model is not complete and available to the forecaster until about 16Z, or about an hour after the
forecast is released - thus the 12Z GFS would be considered a late model since it could not be used to prepare the
12Z official forecast.
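The early/late distinction in the example above can be sketched in a few lines of Python. This is only an illustration: the function name `timeliness`, the fixed 3-hour release offset (taken from the 12Z/15Z example), and the sample dates are assumptions, not part of any NHC system.

```python
from datetime import datetime, timedelta

# Assumption: the official forecast is released 3 h after synoptic time,
# as in the 12Z example above (release at 15Z).
RELEASE_OFFSET = timedelta(hours=3)


def timeliness(synoptic_time: datetime, available_time: datetime) -> str:
    """Return 'E' (early) if guidance arrives before the release time, else 'L'."""
    release_time = synoptic_time + RELEASE_OFFSET
    return "E" if available_time < release_time else "L"


cycle = datetime(2011, 8, 25, 12)        # hypothetical 12Z forecast cycle
gfs_ready = cycle + timedelta(hours=4)   # about 16Z, as in the example above
print(timeliness(cycle, gfs_ready))      # L
```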
Multi-layer dynamical models are generally, if not always, late models.
Fortunately, a technique can be used to take the latest available run of a late model and adjust its forecast to
apply to the current synoptic time and initial conditions. In the example above, forecast data for hours 6-126 from
the previous (06Z) run of the GFS would be adjusted, or shifted, so that the 6-h forecast (valid at 12Z) would
exactly match the observed 12Z position and intensity of the tropical cyclone. The adjustment process creates an
"early" version of the GFS model for the 12Z forecast cycle that is based on the most current available guidance. The
adjusted versions of the late models are known, for historical reasons, as interpolated models.
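The adjustment can be illustrated with a minimal Python sketch. It assumes the simplest possible scheme: the difference between the observation and the 6-h forecast is applied unchanged at every lead time. The operational procedure may smooth the track or reduce the offset with lead time; the function name `interpolate_late_model` and all sample values are hypothetical.

```python
def interpolate_late_model(prev_run, observed, lag_hours=6):
    """Shift a previous-cycle forecast so it starts from the current observation.

    prev_run: dict mapping forecast hour -> (lat, lon, intensity_kt) from the
              earlier run (for example, hours 6..126 of the 06Z run).
    observed: (lat, lon, intensity_kt) at the current synoptic time.
    Returns a dict keyed by forecast hour relative to the current cycle.
    """
    if lag_hours not in prev_run:
        raise ValueError("previous run has no forecast valid at the current time")
    # Offset that makes the lag-hour forecast match the observation exactly.
    # Assumption: the same offset is applied at every later lead time.
    offset = tuple(o - f for o, f in zip(observed, prev_run[lag_hours]))
    adjusted = {}
    for hour, values in sorted(prev_run.items()):
        if hour < lag_hours:
            continue  # already in the past relative to the current cycle
        adjusted[hour - lag_hours] = tuple(v + d for v, d in zip(values, offset))
    return adjusted


# Hypothetical 06Z run (hours 6, 18, 30) and observed 12Z position and intensity.
prev_run = {6: (25.0, -70.0, 60.0), 18: (26.0, -72.0, 65.0), 30: (27.5, -74.0, 70.0)}
observed = (25.5, -70.5, 65.0)
early = interpolate_late_model(prev_run, observed)
print(early[0])   # (25.5, -70.5, 65.0), identical to the observation
print(early[12])  # (26.5, -72.5, 70.0)
```

The 0-h entry of the result reproduces the observation, and the later entries carry the same shift forward, which is what makes the adjusted forecast usable as early guidance.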
A list of models is given in Table 4. In addition
to their timeliness, models are characterized by their complexity or structure; this information is contained in the
table for reference, but a complete description of the various model types is beyond the scope of this report.
Additional information is given by DeMaria and Gross (2003). Briefly, dynamical models solve the physical equations
governing motions in the atmosphere. These may treat the atmosphere either as a single layer in the vertical
(two-dimensional) or as having many layers (three-dimensional), and their domains may cover the entire globe or be
limited to specific regions. Statistical models, in contrast, do not consider the physics of the atmosphere but
instead are based on historical relationships between storm behavior and various other parameters. There is a class
of statistical models that use output from dynamical models as well as historical data, and these hybrids are known
as statistical-dynamical models. Similarly, statistical-synoptic hybrids use data from a current synoptic analysis.
Consensus models are not true forecast models per se, but are merely combinations of results from other models. One
way to form a consensus model is to simply average the results from a sample of models, but other, more complex
techniques can give better results. For example, individual component models can be weighted or combined on the basis
of past performance, or statistical corrections to historical biases can be made.
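The simple and weighted averaging described above can be sketched as follows. This is an illustration only, not the operational consensus code: the function name `consensus_track`, the `min_members` parameter (modeled on the "at least 2 of" rule listed for TVCN in Table 4), and the sample positions and weights are assumptions. Latitudes and longitudes are averaged arithmetically, which is a simplification that ignores date-line crossings.

```python
def consensus_track(forecasts, min_members=2, weights=None):
    """Average (lat, lon) positions from several models at one forecast hour.

    forecasts: dict mapping model ID -> (lat, lon), or None if unavailable.
    weights:   optional dict mapping model ID -> non-negative weight
               (for example, derived from past performance).
    Returns (lat, lon), or None if fewer than min_members are available.
    """
    available = {m: p for m, p in forecasts.items() if p is not None}
    if len(available) < min_members:
        return None
    if weights is None:
        weights = {m: 1.0 for m in available}  # equal weights: a plain average
    total = sum(weights[m] for m in available)
    if total <= 0:
        raise ValueError("weights of the available models must sum to a positive value")
    lat = sum(weights[m] * p[0] for m, p in available.items()) / total
    lon = sum(weights[m] * p[1] for m, p in available.items()) / total
    return lat, lon


# Hypothetical 48-h positions; NGPI is missing for this cycle.
positions = {
    "GFSI": (26.0, -72.0),
    "UKMI": (27.0, -73.0),
    "NGPI": None,
    "GFDI": (26.5, -74.0),
}
print(consensus_track(positions))  # (26.5, -73.0)
```

Passing a `weights` dictionary turns the same function into a performance-weighted consensus; bias correction, as used by the corrected consensus models, would be a separate step applied to each member before averaging.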
A fairly complete digital record of NHC model guidance exists for the
period beginning in 1989. Digital records for earlier years have also been reconstructed for many guidance models.
Two verifications from this record are presented here, a homogeneous comparison of early models beginning in 1994,
and a non-homogeneous comparison of both early and late models dating back to 1970. A much more extensive discussion
on this topic is given by DeMaria and Gross (2003). Additional model verifications can also be found in the annual NHC verification reports.
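A homogeneous comparison uses only those forecast cases for which every model being compared has a verifying forecast, so that all models are scored on the same sample. The sketch below shows one way to build such a sample; the function name `homogeneous_mean_errors`, the case identifiers, and the error values are hypothetical.

```python
def homogeneous_mean_errors(errors):
    """Compute mean errors over the cases shared by every model.

    errors: dict mapping model ID -> {case_id: track error in n mi}.
    Returns (dict of mean error per model, number of common cases).
    """
    if not errors:
        raise ValueError("no models supplied")
    # Keep only the cases for which every model has a verified forecast.
    common = set.intersection(*(set(cases) for cases in errors.values()))
    if not common:
        raise ValueError("no cases are shared by all models")
    means = {
        model: sum(cases[c] for c in common) / len(common)
        for model, cases in errors.items()
    }
    return means, len(common)


# Hypothetical 48-h track errors (n mi); GFSI has no forecast for case "c3".
errors = {
    "OFCL": {"c1": 50.0, "c2": 70.0, "c3": 90.0},
    "GFSI": {"c1": 60.0, "c2": 80.0},
    "CLP5": {"c1": 100.0, "c2": 120.0, "c3": 200.0},
}
means, n_cases = homogeneous_mean_errors(errors)
print(n_cases)  # 2
print(means)    # {'OFCL': 60.0, 'GFSI': 70.0, 'CLP5': 110.0}
```

In a non-homogeneous comparison each model would instead be averaged over all of its own cases (here OFCL would score 70.0 over three cases), which is why such errors cannot be compared fairly between models.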
This figure shows a homogeneous comparison of early
models for the period 1994-2009 (the practice of creating early versions of the late dynamical models began in 1994).
A downward trend in model forecast error is evident in the figure. The best performing model
changes almost every year, and the accuracy of the official forecast is generally close to that
of the best performing model in any given year. This is not because the forecasters are particularly skilled at
identifying the best performing model in advance, but because the forecasters have long relied on a consensus or
ensemble forecast philosophy, which tends to outperform even the best performing individual component of the
ensemble. This philosophy was formalized in 2001 with the introduction of explicit consensus models into the
operational guidance suite. Since their introduction, the consensus models have been the best performing track
models, and they are expected to continue to excel in the future.
A longer history of track model guidance errors is shown
in the second figure. This verification is non-homogeneous and includes both early and late models, meaning that these errors
cannot be fairly compared either with each other or with the official forecast. However, such a verification does
provide a general sense of the trends in model performance over time. It is interesting to note how the primary model
types have changed over the years. In the early 1970's, statistical (and statistical hybrid) models, along with the
simple barotropic SANBAR model, were the primary form of guidance. The 1980's saw the development of tropical cyclone
forecast skill in the multi-level global dynamical models, although such guidance normally arrived too late to be
directly used by forecasters. By the 1990's, the global models had improved and were providing consistently better
guidance than either the statistical or the simpler dynamical models. And, as noted above, consensus models have been
particularly strong performers in recent years.
It can also be seen from the figure that the variability in model
accuracy (the spread between the best and worst performing models in a given year) has substantially decreased over
the past 35 years. This is due in part to the retirement of poor-performing models (and model types) over time, but
it may also reflect the changing nature of tropical cyclone activity during this period. In the 1970's and 80's,
there was relatively little activity, and much of that activity was at higher latitudes, where forecasts generally
have higher errors and more variability. During the past 10 years or so there have been more storms, particularly at
low latitudes, leading to both more representative and more consistent error statistics.
Table 4. National Hurricane Center forecasts and models.
| ID | Name/Description | Type | Timeliness (E/L) | Parameters forecast |
|---|---|---|---|---|
| OFCL | Official NHC forecast | | | Trk, Int |
| CLP5 | CLIPER5 (Climatology and Persistence model) | Statistical baseline | E | Trk |
| SHF5 | SHIFOR5 (Climatology and Persistence model) | Statistical baseline | E | Int |
| DSF5 | Decay-SHIFOR5 (Climatology and Persistence model) | Statistical baseline | E | Int |
| OCD5 | CLP5 (track) and DSF5 (intensity) models merged | Statistical baseline | E | Trk, Int |
| HURN | HURRAN (retired) | Analog | E | Trk |
| A67 | NHC-67 (retired) | Statistical-synoptic | E | Trk |
| A72 | NHC-72 (retired) | Statistical-dynamical | E | Trk |
| A83 | NHC-83 (retired) | Statistical-dynamical | E | Trk |
| A90E | NHC90 (Atlantic) | Statistical-dynamical | E | Trk |
| A98E | NHC98 (Atlantic) | Statistical-dynamical | E | Trk |
| P91E | NHC91 (Pacific) | Statistical-dynamical | E | Trk |
| BAMS | Beta and advection model (shallow layer) | Single-layer trajectory | E | Trk |
| BAMM | Beta and advection model (medium layer) | Single-layer trajectory | E | Trk |
| BAMD | Beta and advection model (deep layer) | Single-layer trajectory | E | Trk |
| SBAR | SANBAR (retired) | Single-layer regional dynamical | L | Trk |
| VBAR | VICBAR (retired) | Single-layer regional dynamical | L | Trk |
| LBAR | Limited area barotropic model | Single-layer regional dynamical | E | Trk |
| QLM | Quasi-Lagrangian model (retired) | Multi-layer regional dynamical | L | Trk |
| MFM | Movable Fine Mesh model (retired) | Multi-level global dynamical | L | Trk |
| GFDL | NWS/Geophysical Fluid Dynamics Laboratory model | Multi-layer regional dynamical | L | Trk, Int |
| GFSO | NWS/Global Forecast System (formerly Aviation) | Multi-layer global dynamical | L | Trk, Int |
| UKM | United Kingdom Met Service model | Multi-layer global dynamical | L | Trk, Int |
| NGPS | Navy Operational Global Prediction System | Multi-layer global dynamical | L | Trk, Int |
| GFDN | Navy version of GFDL | Multi-layer regional dynamical | L | Trk, Int |
| CMC | Environment Canada global model | Multi-level global dynamical | L | Trk, Int |
| EMX | ECMWF global model | Multi-level global dynamical | L | Trk, Int |
| ETA | NWS/Eta | Multi-level regional dynamical | L | Trk, Int |
| NAM | NWS/NAM | Multi-level regional dynamical | L | Trk, Int |
| AFW1 | Air Force MM5 | Multi-layer regional dynamical | L | Trk, Int |
| OFCI | Previous cycle OFCL, adjusted | Interpolated | E | Trk, Int |
| GFDI | Previous cycle GFDL, adjusted | Interpolated-dynamical | E | Trk, Int |
| GHMI | Previous cycle GFDL, adjusted using modified intensity algorithm | Interpolated-dynamical | E | Trk, Int |
| GFSI | Previous cycle GFS, adjusted | Interpolated-dynamical | E | Trk, Int |
| UKMI | Previous cycle UKM, adjusted | Interpolated-dynamical | E | Trk, Int |
| NGPI | Previous cycle NGPS, adjusted | Interpolated-dynamical | E | Trk, Int |
| GFNI | Previous cycle GFDN, adjusted | Interpolated-dynamical | E | Trk, Int |
| EMXI | Previous cycle EMX, adjusted | Interpolated-dynamical | E | Trk, Int |
| SHIP | Statistical Hurricane Intensity Prediction Scheme (SHIPS) | Statistical-dynamical | E | Int |
| DSHP | SHIPS with inland decay | Statistical-dynamical | E | Int |
| AEMN | GFS ensemble mean | Consensus | L | Trk, Int |
| GUNA | Average of GFDI, UKMI, NGPI, and GFSI | Consensus | E | Trk |
| TVCN | Average of at least 2 of GFDI, UKMI, NGPI, GFSI, EMXI and GFNI | Consensus | E | Trk |
| CGUN | Version of GUNA corrected for model biases | Corrected Consensus | E | Trk |
| TVCC | Version of TVCN corrected for model biases | Corrected Consensus | E | Trk |
| FSSE | FSU Super-ensemble | Corrected consensus | E | Trk, Int |
Annual average model track errors for
Atlantic basin tropical cyclones for the period 1994-2011, for a homogeneous selection of "early" models. NHC official and operational CLIPER5 errors for these homogeneous samples are
shown for comparison.
A non-homogeneous comparison of annual average model
track errors for Atlantic basin tropical storms and hurricanes for the period 1970-2011.
This verification includes both early and late models (late models are
those not available to the forecaster during the forecast cycle).
Readers are cautioned that directly comparing errors from a
non-homogeneous verification can lead to incorrect conclusions about
the relative performance of the various models.