2006 Progress Report: Integrating Numerical Models and Monitoring Data

EPA Grant Number: R829402C002
Subproject: this is subproject number 002, established and managed by the Center Director under grant R829402
(EPA does not fund or establish subprojects; EPA awards and manages the overall grant for this center).

Center: Center for Integrating Statistical and Environmental Science
Center Director: Stein, Michael
Title: Integrating Numerical Models and Monitoring Data
Investigators: Stein, Michael; Amit, Yali; Beletsky, Dmitry; Kotamarthi, V. Rao; Lesht, Barry; Nakamura, Noboru; Schwab, David; Stroud, Jonathan
Current Investigators: Stein, Michael; Amit, Yali; Beletsky, Dmitry; Chen, Li; Kotamarthi, V. Rao; Lesht, Barry; Nakamura, Noboru; Schwab, David; Stroud, Jonathan; Zhang, Zepu
Institution: University of Chicago; Argonne National Laboratory; National Oceanic and Atmospheric Administration; University of Michigan; University of Pennsylvania
Current Institution: Argonne National Laboratory; National Oceanic and Atmospheric Administration; University of Chicago; University of Michigan; University of Pennsylvania
EPA Project Officer: Smith, Bernice
Project Period: March 12, 2002 through March 11, 2007
Project Period Covered by this Report: March 12, 2006 through March 11, 2007
RFA: Environmental Statistics Center (2001)
Research Category: Environmental Statistics, Ecological Indicators/Assessment/Restoration

Description:

Objective:

This project addresses statistical approaches to using both monitoring data and output from a physical model to assess the state of the physical environment. This work can be organized into seven subprojects, and several sections of this report are divided into parts for each of these. These projects cover a broad range of environmental applications, including air pollution monitoring, evaluation of the Community Multiscale Air Quality (CMAQ) model, statistical models for environmental processes on a global scale, assessing the effect of spatial resolution on model output, and data assimilation for CMAQ and for hydrodynamic and sediment transport models in Lake Michigan. One subproject from last year, “Using Simplified Versions of CMAQ to Explore Many Emission Scenarios,” is no longer active and is not reported on here. Another, “Estimating Deformations of Isotropic Random Processes,” is nearly completed and should wrap up in the next month or so. There are some new directions, including investigations of statistical methodologies and their properties for spatial and spatial-temporal processes (included in subprojects A and F; see lettered sections below) and data assimilation for CMAQ using the Data Assimilation Research Testbed (DART) system developed at the National Center for Atmospheric Research (NCAR) for meteorological models (included in subproject D). However, as the funding period for the Center for Integrating Statistical and Environmental Science (CISES) nears its end, the focus is increasingly on completing work in existing areas.

Progress Summary:

A. Space-Time Covariance Functions

Models for processes on spheres continue to be one of the major focuses of this subproject. In particular, we have been investigating modeling and estimation for spatial processes on the sphere possessing axial symmetry (the process model is unchanged by rotations about a given axis) using spherical polynomials with random coefficients. Total column ozone, as measured by the satellite-based Total Ozone Mapping Spectrometer (TOMS) instrument, remains the test case we are considering. The other main area of investigation in this subtopic is periodically correlated models for spatial-temporal processes, which allow for diurnal and/or seasonal cycles in the covariance structure. These models are being investigated in the context of total column ozone on a global scale (for which seasonal cycles are the main issue) and ground-level ozone on a regional level (focusing on diurnal cycles). The following is an example of the kind of issue one faces. In many regions, there may be a stronger spatial correlation in ground-level ozone at night than during the day, a feature that is not easily captured by existing spatial-temporal statistical models. PI: Michael Stein. Postdoctoral Research Associate: Li Chen.

B. Comparing CMAQ Output to Monitoring Data

Li Chen has nearly completed work on comparing ozone levels from CMAQ runs at various resolutions to observed values. The focus has been on developing fairly simple numerical and graphical methods that allow easy comparisons across models and help identify the sources of discrepancy between output and observations. PI: Michael Stein. Postdoctoral research associate: Li Chen. U.S. Environmental Protection Agency (EPA) collaborator: Robin Dennis.

C. Statistical Issues Arising in the Study of High-Resolution Versions of CMAQ

Using specialized CMAQ runs done at EPA, we are writing up work on statistical descriptions of model errors, that is, errors that occur in deterministic numerical models even when the initial and boundary conditions are specified without error. A good statistical description of these errors is a critical component of proper data assimilation that is often ignored, in part because it is so difficult to do. Our tool for getting at this question is to run versions of CMAQ at different resolutions using, to the extent feasible, identical initial and boundary conditions and then to use the high-resolution run as a surrogate for the truth. We have analyzed these “model errors” for multiple pollutants using a number of different approaches, including spatial periodograms, which led to one of the theoretical investigations described in subproject F. PI: Michael Stein. Graduate student: Chae Young Lim.
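As a rough illustration of the resolution-comparison idea above, the following sketch block-averages a high-resolution field onto a coarse grid, subtracts it from a low-resolution field, and computes a raw spatial periodogram of the difference. It is only a minimal sketch under stated assumptions: the fields are synthetic random numbers rather than CMAQ output, and the function names `block_average` and `periodogram_2d`, the grid sizes, and the noise level are all hypothetical choices made for this example, not the project's actual procedure.

```python
import numpy as np


def block_average(field, factor):
    """Coarsen a 2-D field by averaging non-overlapping factor x factor blocks."""
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(1, 3))


def periodogram_2d(field):
    """Raw 2-D periodogram: |DFT|^2 of the mean-removed field over the cell count."""
    centered = field - field.mean()
    return np.abs(np.fft.fft2(centered)) ** 2 / centered.size


# Synthetic stand-ins for model output (assumption: NOT real CMAQ fields).
rng = np.random.default_rng(0)
high_res = rng.normal(size=(48, 48))      # surrogate for "truth"
coarsened_truth = block_average(high_res, 4)
low_res = coarsened_truth + 0.1 * rng.normal(size=coarsened_truth.shape)

# "Model error" in the sense used above: low-resolution run minus the
# high-resolution run expressed on the coarse grid.
model_error = low_res - coarsened_truth
spectrum = periodogram_2d(model_error)
```

With this normalization the periodogram ordinates sum to the sum of squared deviations of the error field, so the spectrum can be read as a decomposition of the error variability by spatial frequency.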

D. Data Assimilation

We have submitted a paper describing a detailed forecasting study of Lake Michigan sediment by combining a two-dimensional sediment transport model with Sea-Viewing Wide Field-of-View Sensor (SeaWiFS) satellite images. We have developed new algorithms for combined state and parameter estimation in the Ensemble Kalman Filter (EnKF) framework, and we are continuing work on these algorithms, including extensions to smoothing algorithms and their applications to sediment transport. We are nearing completion on data assimilation for a hydrodynamics program in Lake Michigan. The measurements available here are hourly current observations at 11 moorings, 10 of which are in the southeast corner of the lake. We have developed an approach to incorporating observational data into model output that naturally yields currents respecting the boundaries of the lake and the near divergence-free character of the advections.

We have developed a simplified version of CMAQ to make a single tracer version to model carbon monoxide. For both computational and conceptual reasons, we consider this simplified CMAQ as a more appropriate test case for data assimilation with an air quality model than the full CMAQ. We have combined this simplified CMAQ with DART, developed at NCAR, to create an environment to test various data assimilation techniques. Presently, we are running CMAQ in ensemble adjustment Kalman filter mode to assimilate synthetic observations for the period June 2001. We have also experimented with assimilating real observations from the Air Quality System (AQS) network of monitors. Alexis Zubrow has traveled to NCAR to work with the developers of DART to integrate the DART-CMAQ interface into future releases. PIs: Dmitry Beletsky, Rao Kotamarthi, Barry Lesht, Dave Schwab, Michael Stein, Jon Stroud. Postdoctoral research associates: Zepu Zhang, Li Chen. Staff: Alexis Zubrow.
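For readers unfamiliar with ensemble Kalman filtering, the sketch below shows one generic analysis step in which an unknown parameter is estimated jointly with the state by appending it to the state vector. This is a textbook perturbed-observation EnKF with state augmentation, offered only as an illustration; it is not the new algorithms developed in this subproject, nor the ensemble adjustment filter used in DART. The function name `enkf_update`, the toy model (state equals forcing times parameter), and every number below are assumptions made for the example.

```python
import numpy as np


def enkf_update(ensemble, obs, obs_operator, obs_var, rng):
    """One perturbed-observation EnKF analysis step.

    ensemble:     (n_members, n_state) forecast ensemble
    obs:          (n_obs,) observation vector
    obs_operator: (n_obs, n_state) linear observation matrix H
    obs_var:      scalar observation-error variance
    """
    n_members = ensemble.shape[0]
    predicted = ensemble @ obs_operator.T                 # H x for each member
    state_anom = ensemble - ensemble.mean(axis=0)
    pred_anom = predicted - predicted.mean(axis=0)
    cov_state_pred = state_anom.T @ pred_anom / (n_members - 1)
    cov_pred = pred_anom.T @ pred_anom / (n_members - 1)
    cov_pred = cov_pred + obs_var * np.eye(obs.size)
    # Kalman gain K = cov_state_pred @ inv(cov_pred), computed via a solve.
    gain = np.linalg.solve(cov_pred, cov_state_pred.T).T
    perturbed_obs = obs + rng.normal(scale=np.sqrt(obs_var), size=predicted.shape)
    return ensemble + (perturbed_obs - predicted) @ gain.T


# Toy problem (all values hypothetical): state = forcing * theta, theta unknown.
rng = np.random.default_rng(1)
n_members, forcing, true_theta = 200, 3.0, 2.0
theta_prior = rng.normal(loc=1.0, scale=0.5, size=n_members)
state_forecast = forcing * theta_prior

# Augmented ensemble: column 0 is the state, column 1 is the parameter.
augmented = np.column_stack([state_forecast, theta_prior])
obs_operator = np.array([[1.0, 0.0]])        # only the state is observed
obs = np.array([forcing * true_theta])       # synthetic observation

posterior = enkf_update(augmented, obs, obs_operator, obs_var=0.01, rng=rng)
theta_posterior_mean = posterior[:, 1].mean()
```

Because the parameter is correlated with the observed state across the ensemble, the update moves the parameter ensemble toward the value consistent with the observation even though the parameter itself is never observed.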

E. Estimating Deformations of Isotropic Random Processes

Except for finishing a revision of the main paper on this topic, this subproject is completed. PI: Michael Stein. Graduate student (former): Ethan Anderes.

F. Statistical Inference for Spatial and Spatial-Temporal Processes

This subproject addresses some theoretical issues that have arisen in the course of our work on this project. We have completed a substantial paper on the nonparametric estimation of the spectral density of an isotropic spatial process based on scattered observations, taken from the completed doctoral thesis of Haky Im. We are continuing our investigation of simple and computationally efficient methods for obtaining accurate predictive inferences for Gaussian processes with unknown parameters in the covariance function. We are presently exploring the use of Edgeworth expansions for comparing various approaches to this problem as well as applying the methods to large space-time datasets to assess their effectiveness.

Some new efforts focus on fixed-domain asymptotics (in which the number of observations in a fixed and bounded region tends to infinity), which is the natural asymptotic approach for high density spatial data. With Ji Meng Loh (who is not connected to CISES), Michael Stein has written a paper on asymptotic properties of bootstrap estimates of variance for spatial variograms. The paper contains the first theoretical results we are aware of for bootstrapping under fixed-domain asymptotics. Chae Young Lim has recently obtained a number of fixed-domain asymptotic results on the behavior of spatial periodograms and cross-periodograms, including central limit theorems for smoothed spatial cross-periodograms. PI: Michael Stein. Graduate students: Chae Young Lim, Darongsae Kwon.

G. Statistical Approaches to Numerical Model Errors

A highly idealized advection-diffusion system with random point sources has been studied as a testbed for errors in the air quality model. Capitalizing on the linearity of the problem, we have evaluated the relative importance of truncation errors, errors in the advecting winds, and multiplicative effects of errors in the presence of nonlinear chemistry. Distribution of errors has been shown to depend on model resolution, numerical diffusion, magnitude of wind speed, and the density of source distribution. The result is being written up in a manuscript to be submitted to the Journal of Geophysical Research. We are also pursuing theoretical studies of the statistical nature of model errors, including limit theorems as the density of point sources increases. PIs: Noboru Nakamura, Michael Stein. Graduate student: Chae Young Lim.
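The dependence of truncation error and numerical diffusion on resolution can be seen in a much simpler setting than the testbed described above. The following minimal sketch advects a single smooth pulse once around a periodic one-dimensional domain with a first-order upwind scheme, so that the exact solution equals the initial condition, and compares the error on a coarse and a fine grid. The scheme, the pulse shape, the grid sizes, and the names `advect_upwind` and `one_revolution` are assumptions chosen for illustration; this is not the subproject's idealized model, which also involves diffusion and random point sources.

```python
import numpy as np


def advect_upwind(initial, courant, n_steps):
    """First-order upwind advection on a periodic 1-D grid (positive wind)."""
    if not 0.0 < courant <= 1.0:
        raise ValueError("Courant number must lie in (0, 1] for stability")
    conc = np.array(initial, dtype=float)
    for _ in range(n_steps):
        # np.roll(conc, 1)[i] is conc[i - 1], the upwind neighbour.
        conc = conc - courant * (conc - np.roll(conc, 1))
    return conc


def one_revolution(n_cells, courant=0.5):
    """Advect a Gaussian pulse once around a unit-length periodic domain."""
    dx = 1.0 / n_cells
    x = (np.arange(n_cells) + 0.5) * dx
    initial = np.exp(-0.5 * ((x - 0.5) / 0.05) ** 2)
    n_steps = int(round(n_cells / courant))   # time for one full revolution
    final = advect_upwind(initial, courant, n_steps)
    return initial, final


coarse_initial, coarse_final = one_revolution(50)
fine_initial, fine_final = one_revolution(400)

# After one revolution the exact solution is the initial pulse, so the
# difference is pure numerical (truncation) error.
coarse_error = np.max(np.abs(coarse_final - coarse_initial))
fine_error = np.max(np.abs(fine_final - fine_initial))
```

The scheme conserves total mass on both grids, but the coarse grid smears the pulse far more strongly, which is the numerical diffusion effect referred to above.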

Results to Date

A. Space-Time Covariance Functions. We have completed work on space-time models for total column ozone at a single latitude that respect the space-time structure of the observations and provide a means for capturing the seasonality in the dependence structure. Moving next to modeling the spatial variation in total column ozone at all latitudes simultaneously, we have had some success in fitting axially symmetric models that capture the large differences in covariance structure at different latitudes. One of our postdoctoral research associates, Zepu Zhang, following up on work in his doctoral thesis, has completed a paper on space-time models for rain gauge data, which has been accepted for publication in Water Resources Research.

B. Comparing CMAQ Output to Monitoring Data. We have refined our numerical and graphical approaches to describing the discrepancies between CMAQ output and observations of hourly ozone levels. In addition, we have applied our methods to CMAQ runs done by EPA for the Atlanta area; our methods show that the entire Atlanta area behaves more like the nonurban region in Chicago in terms of model/data comparisons. A draft manuscript of this work is available, which should be ready for submission in the near future.

C. Statistical Issues Arising in the Study of High-Resolution Versions of CMAQ. Our results show substantial discrepancies between high and low resolution CMAQ outputs, even when the boundary and initial conditions for the chemistry of the runs are identical. While the discrepancies tend to increase the longer the models are allowed to evolve from their initial conditions, the discrepancies are quite noticeable after even 1 hour, indicating that lack of resolution can be a major source of model error in CMAQ and suggesting that even if data assimilation occurs at an hourly time step, accounting for model error will still be important.

D. Data Assimilation. A paper on data assimilation for sediment transport modeling has been submitted for publication. Our data assimilations for hydrodynamics in Lake Michigan show substantially improved advections for roughly 12 hours after the assimilation time, even at monitoring sites not used in the assimilation.

In addition to successfully incorporating our simplified version of CMAQ into the DART framework, we have been gaining experience on how assimilating air quality data affects pollution forecasts. For example, one needs to give careful consideration as to how to update pollution levels above the surface level, given that nearly all monitoring data are taken near the surface.

E. Estimating Deformations of Isotropic Random Processes. We have submitted a paper describing our proposed solution for estimating a deformation of an isotropic Gaussian process, based on data taken on a dense grid. This paper is now being revised for the Annals of Statistics, and we hope to have this work accepted in the coming months.

F. Statistical Inference for Spatial and Spatial-Temporal Processes. In our study of bootstrapping for spatial variograms, we have demonstrated theoretically under fixed-domain asymptotics that it is possible to get asymptotically valid confidence intervals for the spatial variogram at short lags with observations on a grid. An important requirement is that the process not be too smooth, although we show that for smoother processes, looking at higher order variograms again allows valid inferences for some aspects of the local behavior of the process. A large-scale simulation study supports these conclusions.
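To make the quantities in this paragraph concrete, the sketch below computes an empirical variogram at a short lag for data on a regular grid, together with the higher-order increments that underlie a higher order variogram. It is a minimal illustration only: it does not implement the bootstrap procedure studied in the paper, and the function names `increments` and `empirical_variogram`, the choice to difference along a single grid axis, and the synthetic white-noise field are all assumptions made for the example.

```python
import numpy as np


def increments(field, lag, order=1):
    """Differences of the given order along axis 1 at an integer grid lag.

    order=1 gives Z(s + lag) - Z(s); order=2 gives the second difference
    Z(s + 2*lag) - 2*Z(s + lag) + Z(s), and so on.
    """
    out = np.asarray(field, dtype=float)
    for _ in range(order):
        out = out[:, lag:] - out[:, :-lag]
    return out


def empirical_variogram(field, lag):
    """Half the mean squared first-order increment at the given lag."""
    return 0.5 * np.mean(increments(field, lag, order=1) ** 2)


# Deterministic check: a linear ramp has constant first differences and
# vanishing second differences.
ramp = np.tile(np.arange(10.0), (4, 1))
ramp_variogram = empirical_variogram(ramp, 2)
ramp_second = increments(ramp, 2, order=2)

# Synthetic white noise with unit variance (assumption: not project data);
# its variogram at any nonzero lag should be close to 1.
rng = np.random.default_rng(2)
noise = rng.normal(size=(200, 200))
noise_variogram = empirical_variogram(noise, 1)
```

Higher-order differencing removes smooth trends before squaring, which is why it remains informative about local behavior for processes too smooth for the ordinary variogram to characterize well.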

In our study of periodograms for spatial data on a lattice, we have very recently obtained results on the asymptotic normality of smoothed periodograms of appropriately differenced Gaussian random fields under fixed-domain asymptotics. These are the first results of this type, and they demonstrate that, at least in some circumstances, spatial periodograms can be expected to be useful even when there are nontrivial spatial dependencies across the observation domain.

In addition to the simulations and applications to meteorological data, the work on predictive inferences is now being addressed from a theoretical perspective, as we seek to use Edgeworth expansions in some special cases to compare the asymptotic properties of various approaches to account for parameter uncertainty in predictive inferences.

G. Statistical Approaches to Numerical Model Errors. Error distributions in our idealized models have been shown to depend on model resolution, numerical diffusion, magnitude of wind speed, and the density of source distribution. In addition, we have obtained some central limit theorems for pollution levels as the density of sources increases.

Future Activities:

A. Space-Time Covariance Functions. We plan to complete work on axially symmetric models on spheres and their application to total column ozone data this fall. One of our recommendations will be for the National Aeronautics and Space Administration (NASA) to produce a “Level 2.5” version of these data that would be gridded like Level 3 daily data but would retain information from individual orbits of the satellite, thus enabling more refined models of the spatial-temporal variation of ozone. With regard to periodically correlated spatial-temporal processes, we will seek to find suitable space-time models for hourly ozone data in the Chicago area that take account of the differences in spatial dependence between day and night.

B. Comparing CMAQ Output to Monitoring Data. We will complete the write-up of this work and submit it for publication.

C. Statistical Issues Arising in the Study of High-Resolution Versions of CMAQ. We will complete the write-up of this work and, if feasible, make some further specialized CMAQ runs on the CISES computers to investigate how varying the meteorological and emissions inputs affects CMAQ output.

D. Data Assimilation. For assimilation into the sediment transport model, we largely have the makings of two substantial papers and a third related paper on computational methods for parameter estimation related to the ensemble Kalman filter. Thus, the focus in the coming year needs to be on writing up these results.

For assimilation into the hydrodynamic model, we hope to have a completed paper ready in the next few months. Further efforts in this direction may focus on vertical variations in advections and how best to exploit the available data at the 11 monitoring sites, in which measurements are taken at many depths at some sites and at only two depths at the other sites.

For data assimilation into CMAQ, goals for the coming year include the assimilation of satellite data using Measurements of Pollution in the Troposphere (MOPITT) and developing methods for improving emissions via these same assimilation techniques.

E. Estimating Deformations of Isotropic Random Processes. Once the major paper on this topic is accepted, no further work will be done on this project.

F. Statistical Inference for Spatial and Spatial-Temporal Processes. We will write up the work on limit theorems for spatial periodograms, obtain Edgeworth expansions for studentized predictions in some simple special cases, and derive higher order results for coverage probabilities of resulting prediction intervals that take into account that the degrees of freedom in the various approximations are random.

G. Statistical Approaches to Numerical Model Errors. The coming year’s effort will aim at using the error statistics we have obtained in our studies to date to come up with a sensible subgrid parameterization and an input for data assimilation.


Journal Articles on this Report: 4 Displayed

Im HK, Stein ML, Kotamarthi VR. A new approach to scenario analysis using simplified chemical transport models. Journal of Geophysical Research 2005;110(D24205), doi:10.1029/2005JD006417. Reports: R829402C002 (2006); R829402C002 (Final).

Shao X, Stein ML. Statistical conditional simulation of a multiresolution numerical air quality model. Journal of Geophysical Research 2006;111(D15211), doi:10.1029/2005JD007037. Reports: R829402 (2006); R829402C002 (2006); R829402C002 (Final).

Stein ML. Statistical methods for regular monitoring data. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 2005;67(5):667-687. Reports: R829402C002 (2004); R829402C002 (2006); R829402C002 (Final).

Stein ML. Seasonal variations in the spatial-temporal dependence of total column ozone. Environmetrics 2007;18(1):71-86. Reports: R829402C002 (2006); R829402C002 (Final).

Supplemental Keywords:

Ecosystem Protection/Environmental Exposure & Risk, Economic, Social, & Behavioral Science Research Program, Air, Geographic Area, Scientific Discipline, Health, RFA, PHYSICAL ASPECTS, Ecosystem/Assessment/Indicators, Engineering, Chemistry, & Physics, Risk Assessments, Environmental Statistics, Great Lakes, Applied Math & Statistics, Health Risk Assessment, Physical Processes, Ecological Risk Assessment, Environmental Engineering, EPA Region, particulate matter, Ecological Effects - Environmental Exposure & Risk, Ecosystem Protection, Monitoring/Modeling, Environmental Monitoring, risk assessment, trend monitoring, ozone, chemical transport models, particulate, stochastic models, statistical methodology, air quality, computer models, ecological risk, ecosystem health, environmental indicators, chemical transport, health risk analysis, human health risk, monitoring, statistical models, particulates, statistical methods, watersheds, Region 5, air pollution, sediment transport, stratospheric ozone, emissions monitoring, data models, exposure, water, chemical transport modeling, ecological models, ecological effects, ecological health, human exposure

Relevant Websites:

http://galton.uchicago.edu/~cises/

Progress and Final Reports:
2002 Progress Report
2004 Progress Report
Original Abstract
Final Report

Main Center Abstract and Reports:
R829402 Center for Integrating Statistical and Environmental Science

Subprojects under this Center: (EPA does not fund or establish subprojects; EPA awards and manages the overall grant for this center).
R829402C001 Detection of a Recovery in Stratospheric and Total Ozone
R829402C002 Integrating Numerical Models and Monitoring Data
R829402C003 Air Quality and Reported Asthma Incidence in Illinois
R829402C004 Quasi-Experimental Evidence on How Airborne Particulates Affect Human Health
R829402C005 Model Choice Stochasticity, and Ecological Complexity
R829402C006 Statistical Approaches to Detection and Downscaling of Climate Variability and Change

The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Conclusions drawn by the principal investigators have not been reviewed by the Agency.