Role in Data Quality

Quality has two aspects for Substance Registry Services (SRS): improving the quality of substance identification information at EPA and ensuring high-quality data in SRS itself.


How does SRS Contribute to Quality Information?

Substance identification is an ongoing issue at EPA. SRS supports EPA’s efforts to improve the quality of substance identification information in three areas:

Standardizing Substance Identification
SRS supports the Chemical Identification Data Standard and the Biological Taxonomy Data Standard. These standards require EPA to establish a set of key fields for each substance to ensure unique identification. Some EPA programs have begun adopting the SRS standardized names, thus reducing the use of synonyms across EPA.

Promoting Accurate Substance Identification
Standardization also promotes improved accuracy. For a subset of substances monitored in the environment, standardization can be difficult. Chemical nomenclature workgroups have striven to identify duplicate substance identities and select valid names for each substance. The result is that, when submitting information about these substances, states and other parties can use the standard identification, thus improving the quality of the incoming data. Additionally, states have begun using SRS to validate substance identification in their own data systems, moving EPA and states toward greater consistency regarding substance information.

Mapping Substance Identification
Over time, Congress has passed different statutes to address various environmental concerns, and EPA has established programs and data systems to implement them. Because each statute and data system was developed independently of the others, substance names were not consistent; different statutes and data systems often used different synonyms for the same chemical. Mapping substance identification across these statutes and EPA data systems is a key quality concern that SRS resolves.
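The mapping idea can be illustrated with a minimal Python sketch. It is not the SRS implementation: the source labels (`statute_a`, `system_b`), the table rows, and the function names are hypothetical and chosen only to show how synonyms from independent sources might resolve to one standard name.

```python
from collections import defaultdict

# Hypothetical synonym table: (source, name as used by that source) -> standard name.
# The sources and rows are illustrative assumptions, not actual SRS content.
SYNONYM_TO_REGISTRY = {
    ("statute_a", "benzol"): "Benzene",
    ("system_b", "benzene"): "Benzene",
    ("system_b", "methylbenzene"): "Toluene",
    ("statute_a", "toluene"): "Toluene",
}


def normalize(name):
    """Collapse whitespace and ignore case so trivial variants compare equal."""
    return " ".join(name.split()).casefold()


def to_registry_name(source, name):
    """Return the standard name for a source-specific synonym, or None if unmapped."""
    return SYNONYM_TO_REGISTRY.get((source, normalize(name)))


def sources_by_substance():
    """Group the sources that mention each substance under its standard name."""
    grouped = defaultdict(set)
    for (source, _synonym), registry_name in SYNONYM_TO_REGISTRY.items():
        grouped[registry_name].add(source)
    return dict(grouped)
```

With such a table, a lookup like `to_registry_name("statute_a", "Benzol")` resolves to the same standard name that `system_b` uses, and `sources_by_substance()` shows which sources refer to a given substance.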



How is the Quality of the Information within SRS Ensured?

Maintaining the quality of SRS data is an ongoing effort. With approximately 100,000 records in SRS as of June 2008, there are many opportunities for error. EPA focuses on quality in three areas for SRS:

Quality of Core Substance Information
Core information is the fundamental metadata about a substance. These data items remain static regardless of environmental statute or EPA system. Examples of core metadata are the SRS Registry Name, the molecular weight, and the EPA Identifier.

The SRS Registry Name (the standard name EPA adopts for each chemical and biological organism) must be accurate. To determine these names and ensure their accuracy, workgroups with representatives from EPA programs and state agencies meet monthly. The participants include chemists, staff with extensive knowledge of laboratory analyses, and other staff with long experience. This mix of skills is necessary for making sound decisions.

Other core data intrinsic to the substance (e.g., molecular weight and molecular formula) are not generated by EPA but are maintained in SRS. To keep this information current and accurate, EPA employs various processes for quality checking and updating of the information.

Maintaining the substance lists in SRS is another area that demands quality assurance. Discovering which substances are named in a particular statute or which substances are tracked by a certain EPA database is a principal use of SRS. EPA has determined that the best approach to managing the substance lists is through stewardship. Each substance list has at least one steward who manages a specific list within SRS. Normally, the steward is from the organization that is responsible for, or has the best information about, the substance list.

Assessing Quality of EPA Synonyms
Future plans include evaluating the quality of the synonyms in SRS. These synonyms, whether found in environmental statutes or in EPA data systems, are not always correct. A name in an EPA data system, submitted by a facility or other organization, may be misspelled. An environmental law may have used an ambiguous or inaccurate synonym. Reviewing each synonym and assigning it a quality rating (e.g., valid, misspelled, ambiguous) will help SRS users decide whether to adopt it.
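One way such a rating might be recorded is sketched below in Python. The three rating values come from the examples in the text; the class names, fields, source label, and sample rows (including the ratings assigned to them) are hypothetical and are not SRS data.

```python
from dataclasses import dataclass
from enum import Enum


class SynonymRating(Enum):
    """Rating values taken from the examples named in the text."""
    VALID = "valid"
    MISSPELLED = "misspelled"
    AMBIGUOUS = "ambiguous"


@dataclass(frozen=True)
class Synonym:
    text: str              # the name as it appears in a statute or data system
    source: str            # hypothetical label for where the name was found
    rating: SynonymRating  # outcome of the quality review


def adoptable(synonyms):
    """Return the names a user could reasonably adopt: only those rated valid."""
    return [s.text for s in synonyms if s.rating is SynonymRating.VALID]


# Illustrative rows only; the ratings here are assumptions for the example.
examples = [
    Synonym("Benzol", "system_b", SynonymRating.VALID),
    Synonym("Benzeen", "system_b", SynonymRating.MISSPELLED),
    Synonym("Xylene", "system_b", SynonymRating.AMBIGUOUS),
]
```

Filtering with `adoptable(examples)` keeps only the synonym rated valid, which is the decision the rating is meant to support.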

Value of Information in SRS
Quality also means ensuring that the information in SRS is of value to users. Because the fact sheets about substances are among the most widely used features of SRS, SRS will link to additional internal and external sources that provide fact sheets or other documentation about substances.

SRS will also either store other federal agencies’ substance identification information or create links to their substance registries. The result will be the ability to go to SRS as a one-stop registry to discover substance information for the entire Federal government.


