Integrated Taxonomic Information System
National Museum of Natural History
Washington, D.C.

Data Development History and Data Quality

The Integrated Taxonomic Information System (ITIS) was established in the mid-1990s as a cooperative project among several federal agencies to improve and expand upon taxonomic data (known as the NODC Taxonomic Code[1]) maintained by the National Oceanographic Data Center (NODC), National Oceanic and Atmospheric Administration (NOAA). ITIS inherited approximately 210,000 scientific names with varying levels of data quality from the NODC data set. While many important taxonomic groups were not well represented (e.g., terrestrial insects), the rate of errors and omissions within represented taxonomic groups ranged from relatively low (e.g., few misspellings or occasional typographical errors) to rather high (e.g., many species names without authors or dates, or species assigned to wrong groups).

The ITIS mission is to create a scientifically credible database of taxonomic information, placing primary focus on taxa of interest to North America, with world treatments included, as available. Within this framework, the initial data content development and quality assurance strategy was to begin with the NODC data and proceed on two tracks: (1) adding new names or checklists with a high level of taxonomic credibility, and (2) reviewing and verifying the legacy NODC data, thereby bringing it to a minimal, or higher, standard of data quality. Pending review and improvement, the unverified legacy data have been retained in the ITIS database to meet the needs of ITIS partners and cooperators who use the names and their associated unique identifiers (Taxonomic Serial Numbers, or TSNs) in specific applications. Since the 1996 import of the legacy dataset, ITIS has grown to nearly 852,000 scientific names, about 90% of which have been verified in the literature, leaving about 89,000 names as unverified legacy data.[2]

Although the ITIS database was initially populated with names derived from the NODC Taxonomic Code, it is being expanded to link individual names to one or more credible references (e.g., print publications, recognized experts, databases). (Those references may or may not also be linked to other names contained within the group; for example, a family name may be linked to a publication that was used to verify its status and position, but that publication might not reference the subordinate genera or species within the family.) This process of verification based on credible references is at the core of ITIS activities.

Depending on the rank of the scientific name (e.g., kingdom, family, subspecies), each ITIS name record has one to three data quality indicators[3] associated with it:

  • Record Credibility Rating
  • Latest Record Review
  • Global Species Completeness

Every scientific name record in ITIS, regardless of the name's rank, has the data quality indicator Record Credibility Rating, which denotes whether the record has undergone internal review. Because the NODC records originally imported into the ITIS database were of unknown quality, each was assigned a Record Credibility Rating of "unverified". As these records are reviewed, credibility ratings are changed either to the highest value, "verified standards met", or to "verified minimum standards met". A rating of "verified standards met" indicates to the user that all elements in the record and the position of the scientific name in the hierarchy are perceived to be accurate and supported by one or more credible references. If data in the record have been reviewed but are incomplete and/or contain accuracy, placement, or nomenclatural issues, or are from a non-peer-reviewed source, a rating of "verified minimum standards met" is assigned. During the process of adding new names to ITIS, some of the unverified legacy records in the same taxonomic group are vetted (i.e., unverified records are verified and the Record Credibility Rating is adjusted). As a result, more than 57% of the original NODC records have now been verified, and efforts to improve the quality of ITIS legacy data continue.
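The "more than 57%" figure can be checked against the approximate counts quoted earlier in this document. The following is a minimal sketch, assuming that all of the roughly 89,000 unverified names belong to the original set of about 210,000 NODC names and that no legacy records have been removed; the variable names are illustrative only.

```python
# Approximate counts quoted in the text (as of Dec 2020).
legacy_total = 210_000       # names inherited from the NODC data set
legacy_unverified = 89_000   # legacy names still rated "unverified"

# Assumption: every unverified name is an original legacy record,
# and no legacy records have been dropped since the 1996 import.
legacy_verified = legacy_total - legacy_unverified
share = legacy_verified / legacy_total

print(f"{legacy_verified:,} of {legacy_total:,} legacy names verified ({share:.1%})")
# prints: 121,000 of 210,000 legacy names verified (57.6%)
```

Because the input counts are rounded, the result is only an estimate, but it is consistent with the stated figure.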

Latest Record Review, a second ITIS data quality indicator, is assigned to records with names at ranks above species (e.g., genera, families, orders), and represents the year that the record was last reviewed by ITIS. For example, if a family name is listed as "verified standards met", with a Latest Record Review rating of 1997, a user can assume that the record was reviewed in that year. Users should be aware, however, that taxonomic changes might have been published since that review and not yet incorporated into the ITIS files. (Additionally, users can refer to the dates of cited publications, which provide another indication of the currency of the record.) For original NODC data, all records were initially rated as "unknown" for this data quality indicator, but these ratings are being adjusted as records are reviewed.

The third ITIS data quality indicator, Global Species Completeness, appears on all valid/accepted taxa at the rank of genus or higher. It indicates whether all known, named, modern species (extant or recently extinct) for that taxon were incorporated into ITIS at the time of review. Possible values are "complete", "partial", or "unknown". Note that completeness ratings of "unknown" were given to the NODC taxa that initialized the ITIS database, and have been adjusted over time to the appropriate level. Both the Latest Record Review and Global Species Completeness indicators are used by ITIS in making decisions about the timeliness of peer review of a group.
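The rules above for which indicators apply to a record can be summarized in a short sketch. This is an illustration only, not ITIS code: the function name `applicable_indicators`, the `is_valid` flag, and the simplified rank list are all assumptions made for the example, and ITIS recognizes more ranks than are listed here.

```python
# Simplified rank ordering, highest rank first. This list is an assumption
# for illustration; it is not the full set of ranks used by ITIS.
RANKS = [
    "kingdom", "phylum", "class", "order", "family",
    "genus", "subgenus", "species", "subspecies",
]


def applicable_indicators(rank: str, is_valid: bool) -> list:
    """Return the data quality indicators that apply to a name record.

    rank     -- rank of the scientific name (must appear in RANKS)
    is_valid -- True if the taxon is valid/accepted (hypothetical flag)
    """
    pos = RANKS.index(rank)

    # Every record carries a Record Credibility Rating.
    indicators = ["Record Credibility Rating"]

    # Latest Record Review applies to ranks above species.
    if pos < RANKS.index("species"):
        indicators.append("Latest Record Review")

    # Global Species Completeness applies to valid/accepted taxa
    # at the rank of genus or higher.
    if is_valid and pos <= RANKS.index("genus"):
        indicators.append("Global Species Completeness")

    return indicators


print(applicable_indicators("family", is_valid=True))
print(applicable_indicators("species", is_valid=True))
```

Under these assumptions, a valid family record carries all three indicators, while a species record carries only the Record Credibility Rating, which matches the "one to three" range described earlier.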

There are two major web interfaces to the ITIS data set:

The Canadian dataset is synchronized once a month (usually), whereas the main ITIS database is updated as chunks of data pass through an editing and proofing process.

The degree of review and completion in ITIS varies greatly by group. Some groups have had relatively little review at the genus and species level (e.g., Pseudoscorpionida), and others have had more (e.g., Sepiolida). Some taxonomic groups have expanded in coverage by the addition of regional species or lists, but without review of NODC-derived names in the same group; others have had complete world lists added, coupled with complete reconciliation of the unverified legacy records. A summary of updated data and work in progress is available at the main ITIS website's Data Status Page.

ITIS is making steady progress both in expanding coverage and in improving data quality. We welcome offers of assistance from the taxonomic community as we continue to improve the database, which, by necessity, still contains a large amount of unverified legacy data. Over the next few years we hope to add data development personnel and expand cooperation with specialists. While we are currently expanding the database group by group and at the request of users and partners, we hope to develop technological solutions that will facilitate and accelerate this process, as well as expedite the task of improving the quality of legacy records. We continue to consider changes that will enrich the display and use of data at the ITIS websites and promote better understanding of data quality.



[1] For an account of the history of the NODC Taxonomic Code, see http://www.nodc.noaa.gov/General/CDR-detdesc/taxonomic-v8.html.

[2] Counts as of Dec 2020.

[3] For complete definitions of ITIS data quality indicators, see the Glossary at http://www.itis.gov/glossary.html.
