
SDF Download Page

NTPHTS: National Toxicology Program High Throughput Screening Project
Structure-Index File

Version 2b DSSTox Structure-Index File, updated 15 February 2008
• Additional QA
• Includes new InChIKey Standard Chemical Field

Update (21 April 2008): Instructions for accessing PubChem Bioassay Data associated with the NCGC NTPHTS 1408 structure-index file.

Quick & Easy File Downloads: FTP Download Instructions

• Description
• Source Website & Contact
• Retrieving PubChem Bioassay Data
• SDF Fields
• SDF Content Summary
• Version 2b Update
• SDF Download Table
• Acknowledgements, DSSTox Citation & Disclaimer

New Users: For general information, see DSSTox Project Goals and About DSSTox. For additional information on DSSTox SDF (Structure Data Format) files and their use in Chemical Relational Databases, see More on SDF and More on CRDs.

Description: The National Toxicology Program (NTP) has initiated a High Throughput Screening (HTS) Project to explore new approaches to evaluating chemicals across a spectrum of high-throughput biological assays. Assays are being selected based on their potential to be informative of animal bioassay results and relevant to human health risk assessments. As an initial phase of this project, the NTP has provided a set of 1408 chemicals from NTP inventories to the NIH Chemical Genomics Center (NCGC), part of the NIH Molecular Libraries & Imaging Roadmap (MLR) Initiative, for HTS in bioassays relevant to toxicology. Assays are described and assay results reported in PubChem for this NTPHTS chemical data set in the same manner as for compounds from the Molecular Libraries Small Molecule Repository. For an early interview on this NTP HTS project, see: Newsletter of the Society for Biomolecular Sciences - Chris Portier: HTS Takes US National Toxicology Program to Next Level.

The DSSTox project is collaborating with the NTP HTS project to provide structure-annotation and chemoinformatics support for this effort. Many of the 1408 NTP HTS chemicals have been used in historical NTP toxicity studies, so the DSSTox NTPHTS Structure-Index File draws largely from the contents of the existing NTPBSI Structure-Index Locator File to provide the full complement of DSSTox Standard Chemical Fields for the NTP HTS chemical set. Additionally, the DSSTox NTPHTS file includes the NCGC Source PubChem_CID (Chemical Identifier) and PubChem_SID (Substance Identifier) code for each chemical record listed (NCGC is the Source Substance depositor for the Bioassay data associated with the NTPHTS structures). These PubChem_SID codes correspond directly to the relevant PubChem SID record that contains the corresponding NTP HTS assay results deposited by the NCGC.

Maintenance of this DSSTox NTPHTS file will be coordinated with expansion of the NTP HTS program. Inclusion of the DSSTox NTPHTS structure-index content in the DSSTox Master Structure-Index File additionally allows linkages to be made to other DSSTox Structure-Data Files, including the NTPBSI Structure-Index Locator File which links to the chemical-specific content of the NTP Bioassay On-line Database. See Coordinating Public Efforts and Work in Progress for additional information on DSSTox NTP collaborations.

Note: The DSSTox NTPHTS (1408) substance/structure inventory is also deposited in PubChem as part of the DSSTox Source Substance listing (and provides links to the NTP website and to this webpage); however, DSSTox Substances are not directly linked to NCGC Bioassay data since the latter were deposited by the NCGC Source.



Source Website: NIEHS’s National Toxicology Program, NTP High Throughput Screening Project

Source Contact: National Institute of Environmental Health Sciences, National Toxicology Program: Ray Tice, email: tice@niehs.nih.gov; Cynthia Smith, email: smith19@niehs.nih.gov



Retrieving PubChem Bioassay Data (updated 21 April 2008)

A full list of NTP HTS bioassays for which data are available in PubChem can be obtained from the following keyword search:

• Go to the PubChem Home Page.
• Under the header PubChem Text Search, select PubChem Bioassay and, in the box to the right, enter the keywords: ntp ncgc
• Listed will be all PubChem Assay IDs (AIDs) in which the DSSTox NTPHTS chemical inventory has been tested by the NIH Chemical Genomics Center (NCGC) exclusively for this 1408 set (as of 21 April 2008, there are 65 assays listed).
• To download data for a single assay, click on any "AID link".
• Bioassay Summary: Bioassay Results: Data Table - All Substances: Select
• Bioactivity Analysis: Data Table: Select Tab
• Result Display Option -- Group Results By: Substance
• Compound Duplicate Test Option: Test Results: Flag Discrepancies
• Apply
• Choose Tab: Data Table, Concise or Data Table, Complete
• At the bottom of the page, select Results Exports: Bioassay CSV (save the CSV file as an xls file to view in Excel).
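Once a CSV file has been exported as described above, the "Flag Discrepancies" idea can also be reproduced offline. The following is a minimal sketch only: the column names (PUBCHEM_SID, PUBCHEM_CID, PUBCHEM_ACTIVITY_OUTCOME) and the sample rows are assumptions for illustration and should be checked against the header of the file actually downloaded.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of an exported bioassay CSV. The column names and
# values are assumptions; compare them with the header of the real export.
SAMPLE_CSV = """PUBCHEM_SID,PUBCHEM_CID,PUBCHEM_ACTIVITY_OUTCOME
1001,501,active
1002,502,inactive
1003,501,inactive
"""


def outcomes_by_cid(csv_text, cid_col="PUBCHEM_CID", sid_col="PUBCHEM_SID",
                    outcome_col="PUBCHEM_ACTIVITY_OUTCOME"):
    """Group (SID, outcome) pairs under the compound ID they share."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row[cid_col]].append((row[sid_col], row[outcome_col]))
    return dict(grouped)


def discrepant_cids(grouped):
    """Return the CIDs whose substances do not all report the same outcome."""
    return sorted(cid for cid, pairs in grouped.items()
                  if len({outcome for _, outcome in pairs}) > 1)


grouped = outcomes_by_cid(SAMPLE_CSV)
print(discrepant_cids(grouped))  # ['501']
```

To use this with a real export, read the file contents into a string and pass it to outcomes_by_cid, adjusting the column-name arguments if the header differs.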

HTS bioassays used to screen this NTPHTS set are further described in PubChem on these pages and some assays are being used to screen larger sets of chemicals, with data also retrievable within PubChem using the appropriate Bioassay search criteria.

For example, HTS data for NTPHTS have been generated by the NCGC for 62,237 substances in the NCGC Small Molecule Inventory:

Cell viability assay (AID 559): http://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?aid=559


Alternatively, to batch retrieve PubChem NCGC-NTPHTS substances and corresponding Assay data:

Download the Substance ID list file NTPHTS_NCGC_1408sids.txt and save it to a folder on your desktop (the file contains PubChem SIDs registered to NCGC for retrieving bioassay data for current NCGC NTP assays).

• Go to Entrez Batch.
• Select "Pubchem Substance" in the "Database" drop down list.
• Browse to your desktop to retrieve file NTPHTS_NCGC_1408sids.txt.
• Click "Retrieve" to retrieve records.
• When the page is returned, click on "Retrieve records for 1408 UID(s)".

All 1408 NCGC Substances corresponding to the NTP HTS set will be listed:

• Click on Links: ... "Bioassays" (and in the pull-down menu, select "PubChem Bioassays") to retrieve a full list of the bioassays in which these PubChem SIDs have been tested.

• Select some or all of the listed bioassays, then click on Bioassay Data Table to retrieve results.

• At the bottom of the Bioassay Results Page, select "Result Export: Bioassay" with export option "CSV" to export all the data to a CSV file, which can be easily imported into or saved as an MS Excel file.
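Before uploading the SID list for batch retrieval, it can be useful to confirm that the file holds the expected number of identifiers. The sketch below assumes the file is plain text with one numeric SID per line; that layout is an assumption, not something stated on this page, and the sample SIDs are made up.

```python
def parse_sid_list(text):
    """Return unique SIDs in first-seen order, skipping blank lines.

    Assumes one numeric SID per line (an assumption about the file layout).
    """
    seen = set()
    sids = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        token = line.strip()
        if not token:
            continue
        if not token.isdigit():
            raise ValueError(f"line {line_no}: not a numeric SID: {token!r}")
        if token not in seen:
            seen.add(token)
            sids.append(int(token))
    return sids


def load_sid_file(path):
    """Read a SID list file from disk and parse it."""
    with open(path, encoding="utf-8") as handle:
        return parse_sid_list(handle.read())


# Made-up SIDs for illustration; the repeated entry is dropped.
example = "1001\n1002\n\n1002\n1003\n"
print(parse_sid_list(example))  # [1001, 1002, 1003]
```

If the downloaded file matches its description, len(load_sid_file("NTPHTS_NCGC_1408sids.txt")) would be expected to equal 1408.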



NTPHTS SDF Fields (22 total)

DSSTox Standard Chemical Fields (19) * STRUCTURE_InChIKey field added in v2b


NTPHTS SDF Content Summary - 15 February 2008

Totals_v1a Totals_v2a Totals_v2b
# Records
DSSTox Standard Chemical Fields
NTPHTS Source Fields
Total # Fields
Chemical Content
Counts_v1a Counts_v2a Counts_v2b
defined organic
no structure
salt complex
single chemical compound
defined mixture or formulation: * (NA), * (NA)
undefined mixture: * (NA), * (NA)
unspecified or multiple forms
mixture or formulation: * (NA)

* (NA) = field entry not applicable for DSSTox file version indicated

Note: NTPHTS_v1a contains 55 chemical records with duplicate structures (duplicate DSSTox_CID); 1 pair of these consists of a monomer and polymer with same STRUCTURE represented, whereas 54 of these pairs are ostensibly the same Test Substances (same TestSubstance_CASRN and TestSubstance_Description), but were obtained from different lots or chemical manufacturers (different Lot and Vial numbers are maintained in internal NTP tracking identification). Because these chemical substances are provided as distinct chemical samples for HTS evaluation, they must be differentiated as separate chemical records in the DSSTox SDF, as well as by distinct DSSTox_RID and PubChem_SID assignments. The HTS data for these duplicates will be presented as separate PubChem SID listings, but will be grouped for easy viewing and comparison by common DSSTox_Generic_SID and DSSTox_CID values. For more information on accessing NTP HTS data within PubChem, see Searching DSSTox Files in PubChem.
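The grouping described in this note can be sketched in a few lines. The field names DSSTox_RID, DSSTox_CID and PubChem_SID come from this page; the record values below are invented placeholders, not entries from the actual file.

```python
from collections import defaultdict

# Placeholder records for illustration only; the values are not from NTPHTS.
records = [
    {"DSSTox_RID": "R1", "DSSTox_CID": "C10", "PubChem_SID": "1001"},
    {"DSSTox_RID": "R2", "DSSTox_CID": "C10", "PubChem_SID": "1002"},
    {"DSSTox_RID": "R3", "DSSTox_CID": "C11", "PubChem_SID": "1003"},
]


def duplicate_structure_groups(records, key="DSSTox_CID"):
    """Map each shared structure ID to the record IDs that carry it.

    Only structure IDs that occur in more than one record are returned.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec["DSSTox_RID"])
    return {cid: rids for cid, rids in groups.items() if len(rids) > 1}


print(duplicate_structure_groups(records))  # {'C10': ['R1', 'R2']}
```

Each record keeps its own DSSTox_RID and PubChem_SID, so separate samples stay distinguishable while the shared DSSTox_CID brings them together for comparison.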



SDF Version History: An initial list of NTP HTS chemical names and CASRN was provided by NTP Sources Brad Collins and Cynthia Smith. Based initially on CASRN matching, DSSTox Standard Chemical Fields were populated largely from the existing DSSTox Master File content (see also Chemical Information Quality Review Procedures), with approximately 200 substances not contained in NTPBSI. Some corrections were applied to the original NTP HTS Source information (and communicated back to the NTP Source), and information on new substances not already occurring in the DSSTox Master File was entered into the Master File and NTPHTS after thorough Chemical Information Quality Review Procedures.

Version 2a Update: NTPHTS_v2a has no additional substance records, but includes minor QA corrections and two additional fields, PubChem_CID and Note_NTPHTS. Changes to DSSTox Standard Chemical Fields include new ID fields: DSSTox_RID, DSSTox_Generic_SID and DSSTox_FileID (replacing DSSTox_SID and DSSTox_ID_FileName - see More on Standard Chemical Fields). Entries in the TestSubstance_Description field also have been simplified.

Version 2b Update: NTPHTS_v2b includes 1 structure modification. In addition, the new STRUCTURE_InChIKey field (25 character abbreviated InChI for use in structure-indexing applications) has been added as a DSSTox Standard Chemical Field to all DSSTox files.

File Download Notes:
The following files are offered in the Download table below:

Structure Data File (SDF) is the main DSSTox product, providing the complete inventory of chemical structures, DSSTox Standard Chemical Fields, and all Source-specific data fields [Note: the structure field is blank for all records containing mixtures or undefined substances];
Data Table MS Excel (MS Office 2003) file contains the full SDF data contents in spreadsheet table form, minus the chemical structure field [file created with CambridgeSoft ChemFinder plug-in to MS Excel 2003];
Structures Table (PDF) file contains a tiled format graphical view of all chemical structures contained in the SDF file, annotated with TestSubstance_CASRN and truncated TestSubstance_ChemicalName field entries for the tested form of the chemical [file created with ACD ChemFolder, ver. 10.01, ACD Labs].
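For readers who want to read the SDF data fields without a chemistry toolkit, the general SDF layout (a structure block, then "> <FIELD_NAME>" items each ended by a blank line, with "$$$$" closing each record) can be parsed with plain Python. This is a minimal sketch, not a full SDF parser: it ignores the structure block entirely, and the sample record uses placeholder values rather than data from the NTPHTS file.

```python
import re

# Matches a data-item header line such as "> <DSSTox_RID>".
FIELD_HEADER = re.compile(r"^>[^<]*<([^>]+)>")

# Placeholder record with an empty structure block (0 atoms), as would be
# the case for a mixture or undefined substance. Values are invented.
SAMPLE_SDF = """\

  placeholder

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <DSSTox_RID>
R1

> <TestSubstance_CASRN>
0000-00-0

$$$$
"""


def parse_sdf_fields(sdf_text):
    """Return one dict of data fields per SDF record."""
    records, fields, current = [], {}, None
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":  # end of the current record
            records.append({name: "\n".join(vals)
                            for name, vals in fields.items()})
            fields, current = {}, None
            continue
        match = FIELD_HEADER.match(line)
        if match:  # start of a new data item
            current = match.group(1)
            fields[current] = []
        elif current is not None:
            if line.strip():
                fields[current].append(line)
            else:  # a blank line ends the data item
                current = None
    return records


print(parse_sdf_fields(SAMPLE_SDF))
# [{'DSSTox_RID': 'R1', 'TestSubstance_CASRN': '0000-00-0'}]
```

For work that needs the structures themselves, a cheminformatics toolkit is the better choice; this sketch is only meant to show how the field data are laid out.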

You will need Adobe Acrobat Reader, available as a free download, to view the Adobe PDF files on this page. See EPA's PDF page to learn more about PDF, and for a link to the free Acrobat Reader.

Zip files may be decompressed using a utility such as WinZip.

File Types, Description, File Size, Format

Data Files: NTPHTS
• SDF: Structure Data File. Included in Zip file.
• Data Table (no structures). Included in Zip file.
• Structures Table. Format: PDF.

These files constitute the main DSSTox products. DSSTox Structure Data Files and DSSTox File Names adhere to strict formatting standards and conventions. For additional information, see More on DSSTox Standard Chemical Fields, Known Problems & Fixes, Chemical Information Quality Review Procedures, and How to Use DSSTox Files.

Quick & Easy File Downloads: FTP Download

Acknowledgements: The DSSTox SDF file of the NTP HTS chemical list was originally provided by Cynthia Smith and Brad Collins of the NTP. Initial structure annotation, QA review, problem resolution, and subsequent QA review, field additions, and structure modifications to NTBSI_v2 were carried out by Maritja Wolf (Lockheed Martin, Contractor for EPA). In addition, we thank Ajit Jadhav (NIH, NCGC) for assistance in implementing the NTPHTS structures file with the NTP HTS assay data, and for providing instructions on batch retrieval of bioassay data in PubChem.

DSSTox Citation: Smith, C., B. Collins, R. Tice, M.A. Wolf, and A.M. Richard (2006) DSSTox National Toxicology Program High Throughput Screening Structure-Index File: SDF File and Documentation, Updated version: NTPHTS_v2b_1408_15Feb2008, www.epa.gov/ncct/dsstox/sdf_ntphts.html

Disclaimer: Every effort is made to ensure that DSSTox SDF files and associated documentation are error-free, but neither the DSSTox Source collaborators nor the EPA DSSTox project team make guarantees of accuracy, nor are any of these persons to be held liable for any subsequent use of these public data. The contents of this webpage and supporting documents have been subjected to review by the EPA National Center for Computational Toxicology and approved for publication. Approval does not signify that the contents reflect the views of the Agency, nor does mention of trade names or commercial products constitute endorsement or recommendation for use.


