SDF Download Page
TOXCST: Research Chemical Inventory for EPA's ToxCast Program
Structure-Index File
** Version 3a, revised 12 February 2009
Newly added fields include Source_ChemicalName, ChemicalReplicateCount, and Relationship_CID.
v2b: Included the new InChIKey Standard Chemical Field and a few structure modifications made after additional quality review of the chemical inventory against Certificates of Analysis (COAs) of purchased chemicals.
STRUCTURE_InChI and STRUCTURE_InChIKey field entries have been updated to conform to the newly released InChI 1.02 Standards.
Quick & Easy File Downloads: FTP Download Instructions
Description
Source Website & Contact
Main Citation
SDF Fields
SDF Content Summary
Version 3 Update
New Users: For general information, see DSSTox Project Goals and About DSSTox. For additional information on DSSTox SDF (Structure Data Format) files and their use in Chemical Relational Databases, see More on SDF and More on CRDs.
Description: EPA's National Center for Computational Toxicology is conducting research to develop the ability to forecast toxicity based on bioactivity profiling: the ToxCast Program for Prioritizing Toxicity Testing of Environmental Chemicals. Ultimately, the goal is to develop methods of prioritizing chemicals for further screening and testing to assist EPA in the management and regulation of environmental contaminants. For more information on the purpose, goals, and parameters of the program, see the Main Citation or ToxCast Publications.
Within the ToxCast Program, data is being generated on a selected chemical library of environmentally relevant chemicals using various high- and medium-throughput screening assays (e.g., cell-based, biochemical, and genomics) that evaluate a broad spectrum of bioactivities potentially relevant to toxicity (see ToxCast Assays). The essential first step is to conduct an initial testing phase using reference chemicals that have an existing, rich toxicological database (i.e., registered chemical pesticide active ingredients). Thus, ToxCast Phase I includes several hundred reference chemicals, representative of differing structural classes and phenotypic outcomes (e.g., tumorigens, reproductive toxicants, neurotoxicants). The objective of Phase I is to develop a set of predictive signatures, derived from ToxCast bioactivity profiles, that identify toxicity potential.
The DSSTox TOXCST Structure-Index File offers the full complement of DSSTox Standard Chemical Fields for the ToxCast Phase I testing chemical inventory. A field entitled ToxCast_TestingStatus will track the status of these Phase I chemicals as they move into testing and beyond, and will indicate the testing status of new chemicals added in subsequent Phases of the ToxCast research program. The DSSTox TOXCST chemical structure inventory will also be directly associated with any published bioassay information generated for this set and made available in public resources, such as PubChem. The DSSTox TOXCST file enables structure-searching (structure/substructure/similarity) capability from the DSSTox Structure-Browser across all DSSTox structure-data files. Linkage of this structure-browser to PubChem and deposition of the DSSTox TOXCST SDF into PubChem will enable broad structure and analog-searching capabilities across the very large bioassay data inventory maintained within that public resource. Future enhancements to the DSSTox TOXCST file or related files will include publication of summary toxicity reference data and/or bioassay information as these data become publicly available.
** Announcement: The DSSTox TOXCST Structure-Index File will accompany high-throughput screening and in vivo bioassay data being made available for this chemical inventory to participants of the first ToxCast™ Data Analysis Summit to be held May 14-15, 2009, hosted by EPA's National Center for Computational Toxicology.
Source Contacts: For questions concerning the EPA ToxCast Program, contact David Dix, email: dix.david@epa.gov or Keith Houck, email: houck.keith@epa.gov.
Source Website: EPA National Center for Computational Toxicology ToxCast Program
Main Citation: Publications reporting use of the DSSTox TOXCST data files are asked to list the full DSSTox file name, including date stamp, and to cite as primary reference the following:
Dix, D.J., Houck, K.A., Martin, M.T., Richard, A.M., Setzer, W., Kavlock, R.J. (2007) The ToxCast program for prioritizing toxicity testing of environmental chemicals. Tox. Sci., 95:5-12.
See also: http://nheerlpub.rtord.epa.gov/ncct/toxcast/publications.html
DSSTox Standard Chemical Fields (20)
* Substance_modify_yyyymmdd: new field, added Oct 2008
TOXCST Source Fields (5)
Source_ChemicalName: new field, added Jan 2009
ChemicalReplicateCount: new field
Relationship_CID: new field
ToxCast_TestingStatus
Note_TOXCST
TOXCST SDF Content Summary - 12 February 2009
| TOXCST SDF Content | Totals_v2b* | Totals_v2c* | Totals_v3a* |
|---|---|---|---|
| # Records | 320 | 320 | 320 |
| DSSTox Standard Chemical Fields | 19 | 19 | 20 |
| TOXCST Source Fields | 2 | 2 | 5 |
| Total # Fields | 21 | 21 | 25 |
| Chemical Content | Counts_v2b | Counts_v2c | Counts_v3a |
| defined organic | 313 | 313 | 313 |
| inorganic | 1 | 1 | 1 |
| organometallic | 6 | 6 | 6 |
| no structure | 0 | 0 | 0 |
| STRUCTURE_TestedForm_DefinedOrganic: | | | |
| parent | 299 | 300 | 299 |
| complex | 8 | 8 | 6 |
| salt | 6 | 5 | 7 |
| salt complex | 0 | 0 | 1 |
| TestSubstance_Description: | | | |
| single chemical compound | 308 | 308 | 309 |
| macromolecule | 0 | 0 | 0 |
| unspecified or multiple forms | 0 | 0 | 0 |
| mixture or formulation | 12 | 12 | 11 |
* contains 309 unique substances (distinct DSSTox_Generic_SIDs and CIDs), 5 sets of duplicate compounds, and 3 sets of triplicate compounds.
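The footnote and the v3a column of the summary can be cross-checked with simple arithmetic. The sketch below is illustrative only: the numbers are copied from the content summary above, while the dictionary layout and variable names are assumptions made for this example. It also assumes that each duplicate set adds one extra record and each triplicate set adds two.

```python
# Counts copied from the Totals_v3a / Counts_v3a column of the content summary.
v3a = {
    "records": 320,
    "standard_fields": 20,
    "source_fields": 5,
    "total_fields": 25,
    "chemical_content": {
        "defined organic": 313, "inorganic": 1,
        "organometallic": 6, "no structure": 0,
    },
    "tested_form_defined_organic": {
        "parent": 299, "complex": 6, "salt": 7, "salt complex": 1,
    },
    "test_substance_description": {
        "single chemical compound": 309, "macromolecule": 0,
        "unspecified or multiple forms": 0, "mixture or formulation": 11,
    },
}

# Footnote values: unique substances plus replicate sets.
unique_substances = 309
duplicate_sets = 5    # assumed: each set contributes 1 extra record
triplicate_sets = 3   # assumed: each set contributes 2 extra records
records_from_footnote = unique_substances + duplicate_sets * 1 + triplicate_sets * 2

checks = {
    "fields add up": v3a["standard_fields"] + v3a["source_fields"] == v3a["total_fields"],
    "chemical content covers all records":
        sum(v3a["chemical_content"].values()) == v3a["records"],
    "tested forms cover defined organics":
        sum(v3a["tested_form_defined_organic"].values())
        == v3a["chemical_content"]["defined organic"],
    "descriptions cover all records":
        sum(v3a["test_substance_description"].values()) == v3a["records"],
    "footnote matches record count": records_from_footnote == v3a["records"],
}

for label, ok in checks.items():
    print(f"{label}: {ok}")
```

Under these assumptions every check holds, so the footnote and the v3a totals are mutually consistent.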
Version 3a Update: Newly added Source-specific fields include Source_ChemicalName, ChemicalReplicateCount, and Relationship_CID. The latter two fields are intended to aid identification of replicates within the data set and to map important relationships between substances, such as parent-metabolite pairs. Minor corrections were made to structure-annotation fields as a result of further QC based on review and Certificates of Analysis (COAs); these changes affect 6 records and are documented in the Note_TOXCST field. STRUCTURE_InChI and STRUCTURE_InChIKey field entries have been updated to conform to the newly released InChI 1.02 Standards.
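One way to picture how replicate sets might be tallied is to group records by substance identifier. This is a minimal sketch, not the procedure used to build the file. The records and identifier values are invented placeholders, and it assumes only that each record carries a DSSTox_Generic_SID value, as mentioned in the content summary footnote. It does not rely on ChemicalReplicateCount, whose exact encoding is not described on this page.

```python
from collections import Counter


def replicate_sets(records, key="DSSTox_Generic_SID"):
    """Return {identifier: count} for identifiers that occur more than once."""
    counts = Counter(record[key] for record in records)
    return {sid: n for sid, n in counts.items() if n > 1}


# Hypothetical records: one singleton, one duplicate pair, one triplicate.
records = [
    {"DSSTox_Generic_SID": "SID-A"},
    {"DSSTox_Generic_SID": "SID-B"},
    {"DSSTox_Generic_SID": "SID-B"},
    {"DSSTox_Generic_SID": "SID-C"},
    {"DSSTox_Generic_SID": "SID-C"},
    {"DSSTox_Generic_SID": "SID-C"},
]

sets_found = replicate_sets(records)

# Histogram of set sizes: how many duplicate sets, how many triplicate sets.
size_histogram = Counter(sets_found.values())
print(sets_found)
print(dict(size_histogram))
```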
File Download Notes: The following files are offered in the download table below:
Log File (PDF) provides SDF data file version history and summary information (field, chemical counts, etc.), and a description of procedures and quality assurance checks used in SDF file creation;
Structure Data File (SDF) is the main DSSTox product, providing the complete inventory of chemical structures, DSSTox Standard Chemical Fields, and all Source-specific data fields [Note: the structure field is blank for all records containing mixtures or undefined substances];
Data Table MS Excel (MS Office 2003) file contains the full SDF data contents in spreadsheet table form, minus the chemical structure field [file created with CambridgeSoft ChemFinder plug-in to MS Excel 2003];
Structures Table (PDF) file contains a tiled format graphical view of all chemical structures contained in the SDF file, annotated with TestSubstance_CASRN and truncated TestSubstance_ChemicalName field entries for the tested form of the chemical [file created with ACD ChemFolder, ver. 10.01, ACD Labs].
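For readers who want to inspect the data fields of an SDF without a cheminformatics toolkit, the general SDF layout can be read with plain text processing: each record ends with a `$$$$` line, and each data item is a header line beginning with `>` that carries the field name in angle brackets, followed by its value lines and a blank line. The sketch below is a minimal illustration under those assumptions. The sample record, its field values, and the function name `parse_sdf_fields` are hypothetical and are not taken from the TOXCST file.

```python
import re

# Data-item header, e.g. "> <DSSTox_Generic_SID>" or ">  <Note_TOXCST>".
FIELD_HEADER = re.compile(r"^>.*<([^<>]+)>")


def parse_sdf_fields(text):
    """Return one {field name: value} dict per SDF record (structure block ignored)."""
    records, fields, current = [], {}, None
    for line in text.splitlines():
        if line.strip() == "$$$$":          # end of record
            records.append({k: "\n".join(v) for k, v in fields.items()})
            fields, current = {}, None
            continue
        match = FIELD_HEADER.match(line)
        if match:                           # start of a new data item
            current = match.group(1)
            fields[current] = []
        elif current is not None:
            if line.strip() == "":          # blank line closes the data item
                current = None
            else:
                fields[current].append(line)
    return records


# Hypothetical record with an empty structure block and placeholder values.
SAMPLE = """hypothetical record


  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <DSSTox_Generic_SID>
00000

> <ToxCast_TestingStatus>
placeholder status

$$$$
"""

parsed = parse_sdf_fields(SAMPLE)
print(parsed)
```

A dedicated toolkit is the better choice when the structures themselves are needed. This sketch only recovers the text fields.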
You will need Adobe Acrobat Reader, available as a free download, to view the Adobe PDF files on this page. See EPA's PDF page to learn more about PDF, and for a link to the free Acrobat Reader.
These files constitute the main DSSTox products. Documentation Files use standard templates, and DSSTox Structure Data Files and DSSTox File Names adhere to strict formatting standards and conventions. For additional information, see More on DSSTox Standard Chemical Fields, Known Problems & Fixes, Chemical Information Quality Review Procedures, and How to Use DSSTox Files.
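As an illustration of working with DSSTox file names, the name TOXCST_v3a_320_12Feb2009 given in the DSSTox Citation on this page appears to combine a source abbreviation, a version label, a record count, and a date stamp. That breakdown is inferred from this single name and from the matching values elsewhere on the page (version 3a, 320 records, 12 February 2009). It is not taken from the formal naming standard, so the pattern and the function name `parse_dsstox_name` below are assumptions.

```python
import re
from datetime import datetime

# Assumed layout: SOURCE_vVERSION_RECORDS_DDMonYYYY
NAME_PATTERN = re.compile(
    r"^(?P<source>[A-Za-z0-9]+)"
    r"_v(?P<version>[0-9]+[a-z]?)"
    r"_(?P<records>[0-9]+)"
    r"_(?P<date>[0-9]{1,2}[A-Za-z]{3}[0-9]{4})$"
)


def parse_dsstox_name(name):
    """Split a DSSTox-style file name into its assumed components."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"name does not follow the assumed pattern: {name!r}")
    return {
        "source": match["source"],
        "version": match["version"],
        "records": int(match["records"]),
        # %b expects English month abbreviations under the default locale.
        "date": datetime.strptime(match["date"], "%d%b%Y").date(),
    }


info = parse_dsstox_name("TOXCST_v3a_320_12Feb2009")
print(info)
```

Parsing the record count and date out of the name gives a quick way to confirm that a downloaded file matches the version described in a citation.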
Quick & Easy File Downloads: FTP Download
Acknowledgements: The DSSTox SDF file of the TOXCST Phase I chemical list was constructed from larger lists of candidate chemicals drawn from the DSSTox inventory of files, EPA pesticide lists, and other candidate chemical lists for which existing toxicological data are expected to be available. Many persons aided in the compilation of such lists, including EPA Office of Pesticide Programs (OPP) personnel. Chemical structure annotation, QA review, and problem resolution were carried out by Maritja Wolf (Lockheed Martin, Contractor for EPA). Stephen Little provided valuable assistance in reviewing and indexing all available Certificates of Analysis for the purchased chemicals in this inventory that underwent testing in Phase I of the ToxCast Research Project.
DSSTox Citation: Houck, K., D. Dix, R. Judson, M. Martin, M. Wolf, R. Kavlock, and A.M. Richard (2008) DSSTox EPA ToxCast High Throughput Screening Testing Chemicals Structure-Index File: SDF File and Documentation, Updated version: TOXCST_v3a_320_12Feb2009, www.epa.gov/ncct/dsstox/sdf_toxcst.html
Disclaimer: Every effort is made to ensure that DSSTox SDF files and associated documentation are error-free, but neither the DSSTox Source collaborators nor the EPA DSSTox project team make guarantees of accuracy, nor are any of these persons to be held liable for any subsequent use of these public data. The contents of this webpage and supporting documents have been subjected to review by the EPA National Center for Computational Toxicology and approved for publication. Approval does not signify that the contents reflect the views of the Agency, nor does mention of trade names or commercial products constitute endorsement or recommendation for use. See additional disclaimers.
EPA/600/C-06/012