Hydrologic Monitoring Network: Data Mining and Modeling to Separate Human and Natural Hydrologic Dynamics

Metadata:


Identification_Information:
Citation:
Citation_Information:
Originator: Paul Conrads
Publication_Date: Unpublished Material
Title:
Hydrologic Monitoring Network: Data Mining and Modeling to Separate Human and Natural Hydrologic Dynamics
Online_Linkage:
<http://sofia.usgs.gov/projects/index.php?project_url=hydro_monnet/>
Description:
Abstract:
The objectives of the study are: (1) integration of hydrologic analysis and synthesis with biological studies; (2) separation of water level, streamflow, and salinity time series into natural (tidal, climate) and anthropogenic components; and (3) identification of additional areas where application of data mining techniques can address the DOI science needs in South Florida.
Purpose:
New technologies in environmental monitoring have made it cost effective to acquire tremendous amounts of hydrologic and water-quality data. Although these data are a valuable resource for understanding environmental systems, they are seldom analyzed thoroughly. The monitoring networks supported by the Comprehensive Everglades Restoration Plan (CERP) record tremendous amounts of data each day, and the data base incorporates millions of data points describing the environmental response of the system to changing conditions. To enhance the evaluation of the CERP data base, there is an immediate need to apply new methodologies to systematically analyze the data set to answer critical questions such as the relative impacts of controlled freshwater releases, tidal dynamics, and meteorological forcing on streamflow, water level, and salinity. There also is a need to integrate longer-term hydrologic data with shorter-term hydrologic data collected for biological resource studies. This study will be undertaken as a series of pilot studies to demonstrate the efficacy of data mining techniques to evaluate CERP data and address hydrologic issues important to DOI's efforts in South Florida. In addition, a preliminary assessment of the complete set of hydrologic data networks for further integration and analysis using data mining techniques will be conducted.
Time_Period_of_Content:
Time_Period_Information:
Range_of_Dates/Times:
Beginning_Date: 20041001
Ending_Date: 20080930
Currentness_Reference: ground condition
Status:
Progress: In Work
Maintenance_and_Update_Frequency: As needed
Spatial_Domain:
Bounding_Coordinates:
West_Bounding_Coordinate: -81
East_Bounding_Coordinate: -80.25
North_Bounding_Coordinate: 26.25
South_Bounding_Coordinate: 25.75
Keywords:
Theme:
Theme_Keyword_Thesaurus: none
Theme_Keyword: hydrology
Theme_Keyword: water quality
Theme_Keyword: data mining
Theme_Keyword: ANN
Theme_Keyword: artificial neural network
Theme_Keyword: EDEN
Theme_Keyword: Everglades Depth Estimation Network
Theme_Keyword: decision support tool
Theme_Keyword: DSS
Theme:
Theme_Keyword_Thesaurus: ISO 19115 Topic Category
Theme_Keyword: environment
Theme_Keyword: inlandWaters
Theme_Keyword: 007
Theme_Keyword: 012
Place:
Place_Keyword_Thesaurus:
Department of Commerce, 1995, Countries, Dependencies, Areas of Special Sovereignty, and Their Principal Administrative Divisions, Federal Information Processing Standard (FIPS) 10-4, Washington, DC, National Institute of Standards and Technology
Place_Keyword: United States
Place_Keyword: US
Place:
Place_Keyword_Thesaurus:
U.S. Department of Commerce, 1987, Codes for the identification of the States, the District of Columbia and the outlying areas of the United States, and associated areas (Federal Information Processing Standard 5-2): Washington, DC, NIST
Place_Keyword: Florida
Place_Keyword: FL
Place:
Place_Keyword_Thesaurus:
Department of Commerce, 1990, Counties and Equivalent Entities of the United States, Its Possessions, and Associated Areas, FIPS 6-3, Washington, DC, National Institute of Standards and Technology
Place_Keyword: Broward County
Place_Keyword: Miami-Dade County
Place:
Place_Keyword_Thesaurus: USGS Geographic Names Information System
Place_Keyword: Florida Bay
Place_Keyword: Everglades National Park
Place_Keyword: Loxahatchee National Wildlife Refuge
Place_Keyword: Arthur R. Marshall- Loxahatchee National Wildlife Refuge
Place:
Place_Keyword_Thesaurus: none
Place_Keyword: Central Everglades
Place_Keyword: Water Conservation Area 3
Place_Keyword: WCA3
Place_Keyword: Water Conservation Area 3A
Place_Keyword: WCA3A
Access_Constraints: none
Use_Constraints: none
Point_of_Contact:
Contact_Information:
Contact_Person_Primary:
Contact_Person: Paul Conrads
Contact_Organization: U.S. Geological Survey
Contact_Address:
Address_Type: mailing and physical address
Address: 720 Gracern Road
City: Columbia
State_or_Province: SC
Postal_Code: 29210-7651
Country: USA
Contact_Voice_Telephone: 803 750-6140
Contact_Facsimile_Telephone: 803 750-6181
Contact_Electronic_Mail_Address: pconrads@usgs.gov
Data_Set_Credit:
Project personnel include Mark Lowery, Ed Roehl, Ruby Daamen, and Matt Petkewich
Cross_Reference:
Citation_Information:
Originator:
Conrads, Paul A.

Roehl, Edwin A., Jr.

Publication_Date: 2007
Title:
Hydrologic Record Extension of Water-Level Data in the Everglades Depth Estimation Network (EDEN) Using Artificial Neural Network Models 2000-2006
Geospatial_Data_Presentation_Form: report
Series_Information:
Series_Name: USGS Open-File Report
Issue_Identification: 2007-1350
Publication_Information:
Publication_Place: Reston, VA
Publisher: U.S. Geological Survey
Online_Linkage: <http://pubs.usgs.gov/of/2007/1350>
Cross_Reference:
Citation_Information:
Originator:
Conrads, P. A.

Roehl, E. A., Jr.

Publication_Date: 200609
Title: Estimating Water Depths Using Artificial Neural Networks
Geospatial_Data_Presentation_Form: report
Series_Information:
Series_Name: Hydroinformatics 2006
Issue_Identification: v. 3
Publication_Information:
Publication_Place: Nice, France
Publisher: 7th International Conference on Hydroinformatics, HIC 2006
Other_Citation_Details:
This paper was presented at the 7th International Conference on Hydroinformatics, Nice, France, 04-08 September 2006

Publication edited by Philippe Gourbesville, Jean Cunge, Vincent Guinot, and Shie-Yui Liong

Online_Linkage: <http://sc.water.usgs.gov/publications/pdfs/HIC2006_EDEN.pdf>
Cross_Reference:
Citation_Information:
Originator:
Conrads, P. A.

Roehl, E. A. Jr.; Daamen, R.; Kitchens, W. M.

Publication_Date: 200609
Title:
Using Artificial Neural Network Models to Integrate Hydrologic and Ecological Studies of the Snail Kite in the Everglades, USA
Geospatial_Data_Presentation_Form: report
Publication_Information:
Publication_Place: Nice, France
Publisher: 7th International Conference on Hydroinformatics, HIC 2006
Other_Citation_Details:
This paper was presented at the 7th International Conference on Hydroinformatics, Nice, France, 04-08 September 2006
Online_Linkage:
<http://sc.water.usgs.gov/publications/pdfs/HIC2006_SnailKite.pdf>

Data_Quality_Information:
Logical_Consistency_Report: not available
Completeness_Report: not available
Lineage:
Process_Step:
Process_Description:
The monitoring network for the snail kite study has established an array of 20 continuous water-level monitors to understand differences in hydrology in the study area. To maximize the information content, empirical models using data mining techniques will be developed to (1) predict the response of water levels at the long-term water-level stations to changing hydrologic inputs, and (2) predict the water level at the 20 short-term monitoring stations. After completing these models, the period of record of the short-term monitoring stations can be extended to be concurrent with the three long-term stations.

To simulate the water level response at the 20 short-term snail kite monitoring sites, artificial neural network (ANN) models will be developed using long-term water-level data in the study area. The ANNs will then be used to extend the period of record of the short-term monitoring sites. The steps to be taken are described below.

Step 1. Data Compilation and Merging: Hydrologic, meteorological, and operational data from the USGS, the National Weather Service, and other databases will be merged and time synchronized. Variables of interest include river flows, freshwater releases, water levels, specific conductance, wind direction and speed, and rainfall.

Step 2. Data Preparation: Methods will be used to maximize the information content in the raw data, while diminishing the influence of poor or missing measurements. Signal (time series) processing methods include clustering, filtering, spectral decomposition, estimation of data characteristics and time delays, and synthesizing missing data. Signal processing transforms the "raw" data into "pre-processed" data for analysis and modeling. The data collected from the agencies have different sampling frequencies, ranging from every 15 minutes to once per month. The variables must be "time-merged" by either interpolating between less frequent measurements, or by averaging frequent samples to obtain fewer values.
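
The time-merging described above can be sketched as follows. This is a minimal illustration in Python, not the project's actual tooling: the function names (average_to_hourly, interpolate_at), the hourly target time step, and the sample values are all hypothetical.

```python
from collections import defaultdict

def average_to_hourly(samples):
    """Average frequent samples into one value per hour.

    samples: list of (time_in_hours, value) pairs, e.g. 15-minute data.
    Returns a dict mapping each integer hour to the mean of its samples.
    """
    buckets = defaultdict(list)
    for t, v in samples:
        buckets[int(t // 1)].append(v)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}

def interpolate_at(samples, t):
    """Linearly interpolate sparse (time, value) samples at time t.

    Returns None outside the sampled range instead of extrapolating.
    """
    samples = sorted(samples)
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return None

# Hypothetical 15-minute water levels and sparse specific-conductance readings.
water_level = [(0.0, 1.0), (0.25, 1.2), (0.5, 1.4), (0.75, 1.6), (1.0, 2.0), (1.25, 2.2)]
conductance = [(0.0, 100.0), (4.0, 140.0)]

# Frequent data are averaged down; infrequent data are interpolated up.
hourly_level = average_to_hourly(water_level)
merged = {
    hour: (level, interpolate_at(conductance, float(hour)))
    for hour, level in hourly_level.items()
}
```

Each entry of merged pairs an hourly mean water level with a conductance value interpolated to the same hour, so both variables share one time base.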

Another signal processing task is "signal decomposition". The complex behaviors of the variables of a natural system result from interactions between multiple physical forces. Signal decomposition involves digital filtering to split a signal into sub-signals, called "components", that are independently attributable to different physical forces. Some components are periodic and some are chaotic. The filtering method of choice is frequency-domain filtering. It is applied to a signal after it has been converted into a frequency distribution by Fast Fourier Transform. This allows a signal component that lies within a window of frequencies (for example, the 12.4-hour tidal cycle, which lies between periods of 12.0 and 13.0 hours) to be excised, analyzed, and modeled independently of other components. Digital filtering also can diminish the effect of noise in a signal, increasing the amount of useful information that it contains. Working from filtered signals makes the modeling process more efficient, precise, and accurate.
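
A minimal sketch of frequency-domain band-pass filtering follows. It assumes an hourly, evenly sampled series whose length is a power of two, and it uses a small pure-Python radix-2 FFT only to stay self-contained; the function names and the synthetic test signal are hypothetical, and production work would normally rely on a library FFT.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT (length must be a power of two)."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out

def ifft(spectrum):
    """Inverse FFT computed through the conjugate identity."""
    n = len(spectrum)
    conj = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in conj]

def band_pass(signal, dt_hours, min_period, max_period):
    """Keep only components whose period (hours) lies in the given window."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    spectrum = fft(signal)
    for k in range(n):
        kk = min(k, n - k)  # fold negative frequencies onto positive ones
        if kk == 0:
            spectrum[k] = 0j  # drop the mean (zero frequency)
            continue
        period = n * dt_hours / kk
        if not (min_period <= period <= max_period):
            spectrum[k] = 0j
    return [v.real for v in ifft(spectrum)]

# Synthetic hourly signal: a 12.8-hour "tidal" component plus a slow trend.
n = 1024
tide = [math.sin(2 * math.pi * t / 12.8) for t in range(n)]
slow = [2.0 * math.sin(2 * math.pi * t / 256.0) for t in range(n)]
mixed = [a + b for a, b in zip(tide, slow)]

# Excise the component lying between periods of 12.0 and 13.0 hours.
tidal_component = band_pass(mixed, 1.0, 12.0, 13.0)
```

The 12.8-hour period is chosen so the test component falls exactly on an FFT bin; real tidal signals will leak into neighboring bins, which is one reason a window of periods is used rather than a single frequency.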

Step 3. Correlation Analysis and Sensitivity Estimation: Correlation analysis quantifies the relationships between many variables and provides deeper understanding of the data. The computer systematically correlates parameters of interest, such as water level, with combinations of controlled and uncontrolled variables that influence them, such as river flows, controlled releases, and meteorological conditions. Correlation methods based on statistics and machine learning are applied in combination. Results found by the computer are validated by comparing them to known patterns of behavior.
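
One simple statistical form of this analysis is to scan candidate time delays and report the delay at which a driver correlates most strongly with a response. The sketch below assumes evenly sampled, gap-free series; the helper names (pearson, best_lag) and the synthetic release and water-level series are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def best_lag(driver, response, max_lag):
    """Return (lag, r) with the largest |r|, pairing response[t] with driver[t - lag]."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        x = driver[: len(driver) - lag]
        y = response[lag:]
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best

# Synthetic example: water level responds to releases three time steps later.
releases = [float((7 * t) % 11) for t in range(40)]
water_level = [2.0] * 3 + [0.5 * q + 2.0 for q in releases[:-3]]

lag, r = best_lag(releases, water_level, max_lag=8)
```

In this constructed case the scan recovers the three-step delay built into the data; with field data the peak is usually broader and should be checked against known travel times.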

Step 4. Predictive Modeling: Using machine learning, a predictive model is developed directly from the data and correlations determined in Steps 2 and 3. To maximize accuracy, the model is constructed from sub-models, which independently correlate periodic and chaotic components. Their outputs are combined to obtain an overall prediction that manifests all of the different forcing functions, represented by input variables, which affect the output variables.

Step 5. Develop Long-Term Water Level Data: Many of the water-level gages used to connect hydrologic response with the snail kite were operated for only a limited number of years. Long-term hindcasts of water level covering the period of record of the long-term monitors will be produced using the data and correlations of Steps 2 and 3 and the predictive modeling of Step 4.
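
The record-extension idea in Step 5 can be sketched as below. Note the substitution: ordinary least-squares regression on a single long-term station stands in for the ANN models, purely to keep the example short. The station values and variable names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx

# Long-term station: complete record (hypothetical water levels).
long_term = [2.0, 2.4, 3.1, 2.8, 2.2, 1.9, 2.5, 3.0, 3.4, 2.9]

# Short-term station: only the most recent time steps were measured.
short_term_recent = {6: 1.75, 7: 2.0, 8: 2.2, 9: 1.95}

# Train on the overlap period, where both stations have data.
overlap = sorted(short_term_recent)
slope, intercept = fit_line(
    [long_term[i] for i in overlap],
    [short_term_recent[i] for i in overlap],
)

# Hindcast the short-term station for the time steps it did not record.
hindcast = {
    i: slope * long_term[i] + intercept
    for i in range(len(long_term))
    if i not in short_term_recent
}
```

The hindcast values fill the earlier part of the short-term record so that it becomes concurrent with the long-term station.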

Preliminary ANN models of salinity response for Trout Creek have been built using the 1996-2000 USGS data for the five gaging stations of creeks entering Florida Bay. The database used to build the preliminary model has been updated to include the recently available data for the period 2001 to 2004. We have been working with developers of the SICS model on how results from the ANN models can be used to assist in the calibration and confirmation of the SICS model. Short-term water-level data (12 months) at sites instrumented for the Snail Kite study in WCA-3 have been hindcasted to create a 14-year water-level record for analysis. We are developing potential methodologies for estimating water levels and water depths at ungaged areas using ANN models. The approach utilizes static variables of location and percent vegetation and dynamic variables of water levels at known locations.

Process_Date: 2006
Process_Step:
Process_Description:
Estimating water depths and water levels at ungaged locations in the EDEN network:

The approach taken in FY05 will be expanded from a small sub-domain of WCA3a to the domain of the Everglades. A three-step modeling approach will be used to predict water depths. The first step will be to develop a group assignment model. The model will use static variables of an ungaged site as input variables to determine the group (from the clustering analysis) to which the site should be assigned. The second model will predict the water depth using only the static variables of location and vegetation types. This model (also called the "static" model) is not able to predict the dynamic variability of the water depth, but it is able to discriminate general differences in the water-depth variable based on differences in location and vegetation. The static model is used to calculate the residual error (difference between the predicted and measured water depth), which is then modeled by the third model. The third model (also called the "dynamic" model) will use time series of water depths and static variables to predict the variability in water depth at each site as characterized by the residual in the static model. The final prediction of water depth at each site is the sum of the water-depth prediction from the static model and the prediction of the water-depth residual from the dynamic model.
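
The three-model structure above can be sketched as follows. This is a sketch under strong simplifying assumptions, not the project's models: a nearest-centroid rule stands in for the group assignment model, and least-squares lines stand in for the static and dynamic ANN models. All site values, centroids, and names are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx

def assign_group(xy, centroids):
    """Model 1 stand-in: assign a site to the nearest cluster centroid."""
    return min(centroids, key=lambda name: math.dist(xy, centroids[name]))

centroids = {"A": (0.0, 0.0), "B": (10.0, 10.0)}
index_level = [3.0, 3.5, 4.0, 3.5, 3.0]  # water level at a known location
mean_index = sum(index_level) / len(index_level)

def synthetic_depth(veg):
    """Made-up depth series: static offset from vegetation plus a dynamic part."""
    return [1.0 - 0.01 * veg + 0.8 * (w - mean_index) for w in index_level]

gaged = [
    {"xy": (1.0, 0.0), "veg": 20.0, "depth": synthetic_depth(20.0)},
    {"xy": (0.0, 2.0), "veg": 40.0, "depth": synthetic_depth(40.0)},
    {"xy": (2.0, 2.0), "veg": 60.0, "depth": synthetic_depth(60.0)},
]

# Model 2 (static): mean depth from a static variable (percent vegetation).
static_slope, static_icpt = fit_line(
    [s["veg"] for s in gaged],
    [sum(s["depth"]) / len(s["depth"]) for s in gaged],
)

# Model 3 (dynamic): residual of the static model from the index water level.
res_x, res_y = [], []
for s in gaged:
    static_pred = static_slope * s["veg"] + static_icpt
    for w, d in zip(index_level, s["depth"]):
        res_x.append(w)
        res_y.append(d - static_pred)
dyn_slope, dyn_icpt = fit_line(res_x, res_y)

def predict_depth(xy, veg_pct):
    """Final prediction = static prediction + predicted residual."""
    group = assign_group(xy, centroids)  # a fuller version keeps models per group
    static_part = static_slope * veg_pct + static_icpt
    return group, [static_part + dyn_slope * w + dyn_icpt for w in index_level]

group, predicted = predict_depth((2.0, 1.0), 50.0)
```

The final line predicts a depth series at an ungaged site from its location, its percent vegetation, and the water level at a known location.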

Prediction models were developed for 25 recently added marsh gages in EDEN. The models were used to hindcast the water-level records at these sites to be concurrent with the EDEN database from January 1, 2000 to the present. The hindcasted water levels were used to augment the available data for surfacing the water levels of the Everglades.

Process_Date: 2008
Process_Step:
Process_Description:
A Synthesis of Hydrology and Water-Quality Data of A.R.M. Loxahatchee NWR

The Arthur R. Marshall Loxahatchee National Wildlife Refuge is the last of the soft-water ecological systems in the Everglades. Historically, the ecosystem was driven by precipitation inputs that were low in conductance and nutrients. With controlled releases into the canal that surrounds the Refuge, the transport of water with higher conductance and nutrient concentrations could potentially alter critical ecosystem functions. With potential alteration of flow patterns to accommodate the restoration of the Everglades, the Refuge could be affected not only by changes in the timing and frequency of hydroperiods but also by the quality of the water that inundates the Refuge.

Steps 1-3 described for FY 2005 will be followed in this study. However, the predictive modeling (step 4) will be limited to three selected water-level sites (1-7, 1-8c, 1-9) and three water-quality sites (LOX4, LOX5, LOX13). The water-level sites are critical to the operation of the regulation schedule, and the water-quality sites are critical to the water-quality compliance consent decree. The anticipated results for FY 2006 are the compiled, time-synchronized database and the predictive ANN models of the selected water-level and water-quality stations. The models will provide powerful analysis tools for understanding the dynamics of the system. In particular, 3-dimensional response surfaces showing the combined effect of two explanatory variables (such as canal inflow, outflow, canal water level, and rainfall) on a response variable (interior water level, conductance, or phosphorus) will be generated.
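
A response surface of the kind described above is produced by evaluating a model over a grid of two explanatory variables while the remaining inputs are held fixed. In this minimal sketch, toy_model is a made-up function standing in for a trained ANN, and the variable ranges and units are hypothetical.

```python
def linspace(start, stop, count):
    """Evenly spaced values from start to stop, inclusive (count >= 2)."""
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]

def response_surface(model, x_values, y_values):
    """Evaluate model(x, y) on a grid; rows follow x_values, columns y_values."""
    return [[model(x, y) for y in y_values] for x in x_values]

def toy_model(canal_inflow, rainfall):
    """Hypothetical stand-in for a trained model of interior water level."""
    return 15.0 + 0.002 * canal_inflow + 0.1 * rainfall

inflows = linspace(0.0, 1000.0, 3)   # 0, 500, 1000
rainfalls = linspace(0.0, 2.0, 3)    # 0, 1, 2
surface = response_surface(toy_model, inflows, rainfalls)
```

The nested list surface can be passed to any 3-dimensional plotting routine, with inflow and rainfall on the horizontal axes and the predicted response on the vertical axis.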

Flow and water-level data for WCA-1A were analyzed. A spreadsheet application was developed that allows users to analyze the dynamic interaction between flow, water-level, and rainfall signals.

Process_Date: 2007
Process_Step:
Process_Description:
Integration of Long-term Hydrologic Data with Snail Kite Study

One of the objectives for FY 2005 was to integrate short- and long-term hydrologic and ecological data for the study of Snail Kites in Water Conservation Area 3a (WCA3a). The monitoring network for the snail kite study has established 17 continuous water-depth monitors to understand differences in hydrology in the study area. In addition to the 17 water-depth gages (short-term, < 2 years of record), there are 3 long-term gages (> 13 years of record) in the study area. To maximize the information content, ANN models were developed to predict the water depths at the 17 monitoring stations. These models are used to extend the period of record of the short-term monitoring stations to be concurrent with the three long-term stations.

The Decision Support System (DSS) will allow users to interrogate the historical database and model simulations to better understand the water-depth dynamics of the system. The DSS will read and write files for the various run-time options that can be selected by the user through the system's graphical user interface (GUI). The historical database contains thirteen years of hydrodynamic data that will be used to generate water-level simulations using the 17 ANN models. Using GUI controls, the user can evaluate alternative flow and water-level scenarios. The outputs generated by the ANN models will be written to files for post-processing in MS Excel. The DSS will provide streaming graphics during model simulations, visually representing historical and predicted behaviors side by side.

The following steps will be taken for the development of the Snail Kite Hydrology DSS.

Build DSS shell to include:
1. GUI for setting all simulation run parameters
a. Start and stop time of simulation run
b. Source input for simulation run (historical data, percentage of actual, user-defined, external file)
2. Streaming graphics to display model outputs
3. Hydrologic parameter/statistic
a. Calculate statistic
b. Provide graphical display
c. Tabular output values
4. Output model predictions and simulation parameters

Integrate 17 ANN models into DSS shell:
1. Define models and inputs within the DSS
2. Calculate model inputs using user-selected data source
3. Run models using iQuest Runtime module
4. Write and display output within DSS application

The USGS-SCWSC continues to provide ecologists at the Florida Cooperative Unit with technical support for the Snail Kite Hydrology Decision Support System (DSS). Substantial enhancements to the DSS were undertaken in FY07 to better meet the needs of the plant ecologists. At each continuous monitoring location, vegetation samples are collected twice a year. For each sampling site (over 6,000 sites), ecologists need to know the water-depth hydrograph. The DSS enhancements include the generation of these hydrographs in addition to reading and writing to external databases. Other enhancements include additional statistics (hydro-ecological indices), updating the application with retrained ANN models, generation of elevation transects, and writing of a user’s manual for the DSS. The majority of the enhancements have been completed. The ANN models are currently (September 2007) being retrained. The enhancements should be completed in the fall of 2007.

Process_Date: 2008
Process_Step:
Process_Description:
Analysis of Water Level, Streamflow, and Salinity Signals

The salinity dynamics of the five tributary creeks are currently being analyzed. Response surfaces for the five tributary creeks, generated for various combinations of explanatory variables, are used to evaluate system behavior at the five sites. Similarities and differences between tributaries in the process physics, as manifested by the response surface for each tributary, will be documented.

The analysis has been completed for the EDEN coastal water-level gages. The approach will be expanded in FY08 to include all the USGS coastal water-level and water-quality gages maintained by the Ft. Lauderdale and Ft. Myers Offices.

Process_Date: Not complete
Process_Step:
Process_Description:
Hydrology monitoring network: Data mining and modeling to separate human and natural hydrologic dynamics

Work planned for FY2008 includes:

1. The initial approach for developing soft sensors for the EDEN network will be to develop a prototype for one site in the network. An ANN model of the water level for the site will be developed. An automated process will be developed that will pull the necessary input data from the real-time data base, run the ANN model, statistically compare the soft-sensor and real-time data, and generate status reports. After completion of the automated process, all the EDEN marsh sites (approximately 150 sites) will be modeled and incorporated into the automated process.

2. The enhancements to the Snail Kite Hydrology DSS will be completed and tested. The development of the DSS application will be documented in a USGS Open-File Report.
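
The comparison step of the soft-sensor process in item 1 can be sketched as follows. The record does not specify which statistics or thresholds are used, so the root-mean-square error, the fixed tolerance, the status labels, and the sample values below are all assumptions made for illustration.

```python
import math

def soft_sensor_report(measured, predicted, tolerance):
    """Compare real-time gage readings with model (soft-sensor) estimates.

    Flags every time step where the two differ by more than `tolerance`
    and summarizes overall agreement as a root-mean-square error.
    """
    residuals = [m - p for m, p in zip(measured, predicted)]
    rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    flagged = [i for i, r in enumerate(residuals) if abs(r) > tolerance]
    status = "CHECK GAGE" if flagged else "OK"
    return {"rmse": rmse, "flagged": flagged, "status": status}

# Hypothetical real-time readings and model estimates for one site.
measured = [2.10, 2.12, 2.15, 2.90, 2.18]
predicted = [2.08, 2.13, 2.14, 2.16, 2.17]

report = soft_sensor_report(measured, predicted, tolerance=0.25)
```

Here the fourth reading departs sharply from the model estimate, so the report flags that time step for review.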

To analyze the linkage between changes in hydrology and vegetation, historical hydrologic and vegetation datasets will be obtained. Understanding the historic changes in the patterning of the ridge and slough and tree island topology requires an understanding of long-term change in hydrology. Currently there are various monitoring networks throughout the Everglades. As one moves back in time to the 1950s, the temporal and spatial extent of water-level gages diminishes greatly. An approach for enhancing the information content of historical databases of monitoring networks is to estimate (or “hindcast”) historical time series using a combination of short-term (for example, 1 to 2 years) and long-term (for example, greater than 5 years) time-series datasets. By developing accurate water-level estimates, contemporary and historical databases can be integrated and used to enhance scientific investigations that seek to link hydrologic and ecological change.

Process_Date: Not complete
Process_Step:
Process_Description:
A synthesis of water-quality data of A. R. M. Loxahatchee NWR

Work planned for FY2008 includes:

Development of the ANN hydrologic and water-quality process models will continue. In FY2007, a preliminary "model" (tool) was developed to analyze system dynamics by adjusting time delays of inflow, outflow, and precipitation time series. The tool allows an analyst to change time delays of major inputs and outputs to the system and determine how the correlation to water levels changes. The tool is a critical first step in developing an accurate empirical model of the system because it helps determine the optimal combination of inputs, outputs, and time delays of the data prior to training ANN models. The following steps will be undertaken during the continued development of the water-level and water-quality process model of the Refuge.

1. Development of a Temporal and Spatial Data Viewer: A three-dimensional hydrologic and water-quality data viewer will be developed to visualize the historical data over time and by location. The general approach involves overlaying a rectangular grid onto the Refuge, aggregating data collection sites into the grid cells, and aggregating measurements from sites within each cell into time steps. The viewer will provide an integrated, interactive environment for exploring and analyzing the data.

2. Correlation Analysis and Sensitivity Estimation: Continued correlation analysis quantifies the relationships between many variables and provides deeper understanding of the data. The computer systematically correlates parameters of interest, such as water level, conductance, and phosphorus, with combinations of controlled and uncontrolled variables that influence them, such as inflows, outflows, and rainfall. Correlation methods based on statistics and machine learning are applied in combination. Promising results found by the computer are validated by comparing them to known patterns of behavior. Correlation analysis identifies:
a. Relative impact. For example: What variables cause the increased conductance and phosphorus, and to what degree?
b. Relationships between controlled variables (inflows and outflows) and uncontrolled variables (meteorological forcing).
c. Quantifiable answers to complex questions such as: What are the critical temporal and spatial relationships between the controlled releases and the water level, conductance, and phosphorus response in the interior of the Refuge? What are the relative impacts of the inflow/outflow locations on these responses?

3. Predictive Modeling: Using machine learning, predictive models are developed directly from the data and correlations. To maximize accuracy, the model is constructed from sub-models, which independently correlate periodic and chaotic components. Their outputs are combined to obtain an overall prediction that reflects all of the different forcing functions, represented by input variables, that affect the output variables. The models of the Refuge will predict water level, conductance, and phosphorus at multiple locations from inputs such as inflow, outflow, rainfall, and wind direction and speed. The models will provide powerful analysis tools for understanding the dynamics of the system. In particular, 3-dimensional response surfaces showing the combined effect of two explanatory variables (such as canal inflow, outflow, canal water level, and rainfall) on a response variable (interior water level, conductance, or phosphorus) will be generated.

4. Develop the DSS with Water-Level, Conductance, and Phosphorus Optimizer and Write User’s Manual: The models will be integrated in a user-friendly Excel/Visual Basic program called a Decision Support System (DSS). The DSS requires no typing to operate or to obtain output. It will contain a historical database (including hindcasted values) to allow for running long-term simulations. Run-time monitoring and DSS output will be in the form of supporting graphics and Excel worksheets. A constrained optimization routine will be integrated into the DSS that will determine the necessary inflows/outflows to meet specified water level, conductance, and phosphorus levels. A user’s manual will describe installation, operation, and features of the Simulator.
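
The grid aggregation behind the data viewer in item 1 can be sketched as follows. The cell size, time-step length, coordinate origin, and observation values are hypothetical, and averaging is assumed as the aggregation rule because the record does not name one.

```python
from collections import defaultdict

def aggregate_to_grid(observations, x0, y0, cell_size, step_hours):
    """Average point measurements into grid cells and time steps.

    observations: list of (x, y, time_in_hours, value) tuples.
    Returns a dict mapping (column, row, time_step) to the mean value
    of all measurements falling in that cell during that time step.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for x, y, t, value in observations:
        key = (
            int((x - x0) // cell_size),
            int((y - y0) // cell_size),
            int(t // step_hours),
        )
        sums[key][0] += value
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

# Hypothetical conductance measurements: (x, y, hours since start, value).
observations = [
    (100.0, 200.0, 1.0, 10.0),
    (400.0, 300.0, 5.0, 14.0),   # same cell and day as the first site
    (1200.0, 200.0, 2.0, 20.0),  # neighboring cell
    (150.0, 250.0, 30.0, 8.0),   # first cell, second day
]

grid = aggregate_to_grid(observations, x0=0.0, y0=0.0, cell_size=1000.0, step_hours=24.0)
```

Each key of grid identifies one cell at one time step, which is the form a viewer needs to draw the data by location and animate it over time.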

Process_Date: Not complete
Process_Contact:
Contact_Information:
Contact_Person_Primary:
Contact_Person: Paul Conrads
Contact_Organization: U.S. Geological Survey
Contact_Address:
Address_Type: mailing and physical address
Address: 720 Gracern Road
City: Columbia
State_or_Province: SC
Postal_Code: 29210-7651
Country: USA
Contact_Voice_Telephone: 803 750-6140
Contact_Facsimile_Telephone: 803 750-6181
Contact_Electronic_Mail_Address: pconrads@usgs.gov

Metadata_Reference_Information:
Metadata_Date: 20081031
Metadata_Contact:
Contact_Information:
Contact_Person_Primary:
Contact_Person: Heather Henkel
Contact_Organization: U.S. Geological Survey
Contact_Address:
Address_Type: mailing and physical address
Address: 600 Fourth Street South
City: St. Petersburg
State_or_Province: FL
Postal_Code: 33701
Country: USA
Contact_Voice_Telephone: 727 803-8747 ext 3028
Contact_Facsimile_Telephone: 727 803-2030
Contact_Electronic_Mail_Address: sofia-metadata@usgs.gov
Metadata_Standard_Name: Content Standard for Digital Geospatial Metadata
Metadata_Standard_Version: FGDC-STD-001-1998
Metadata_Access_Constraints: none
Metadata_Use_Constraints:
This metadata record may have been copied from the SOFIA website and may not be the most recent version. Please check <http://sofia.usgs.gov/metadata> to be sure you have the most recent version.

This page is <http://sofia.usgs.gov/metadata/sflwww/hydro_mon_net.html>

U.S. Department of the Interior, U.S. Geological Survey
Generated by mp version 2.8.18 on Fri Oct 31 08:56:43 2008