Quantitative Structure Activity Relationship
Introduction
Quantitative Structure Activity Relationships (QSARs) are mathematical models that are used to predict measures of toxicity from physical characteristics of the structure of chemicals (known as molecular descriptors). Acute toxicities (such as the concentration which causes half of fish to die) are one example of toxicity measures which may be predicted from QSARs. Simple QSAR models calculate the toxicity of chemicals using a simple linear function of molecular descriptors:
Toxicity = ax1 + bx2 + c
where x1 and x2 are the independent descriptor variables and a, b, and c are fitted parameters. Examples of molecular descriptors include the molecular weight and the octanol-water partition coefficient. Additional examples are provided in our Molecular Descriptors Guide (PDF) (47 pp, 279 KB).
Uses of QSAR toxicity models
- QSAR toxicity predictions may be used to screen untested compounds in order to establish priorities for traditional bioassays, which are expensive and time-consuming.
- QSAR models are useful for estimating toxicities needed for green process design algorithms such as the Waste Reduction Algorithm.
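As an illustration of the simple linear form Toxicity = ax1 + bx2 + c and of its use for screening, the following Python sketch evaluates such a model for a few compounds and ranks them. The parameter values, descriptor values, and compound names are hypothetical and chosen only for illustration; the sketch also assumes that a larger predicted value means a higher testing priority, which depends on how the toxicity measure is expressed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearQsar:
    """Simple linear QSAR: Toxicity = a*x1 + b*x2 + c."""
    a: float
    b: float
    c: float

    def predict(self, x1: float, x2: float) -> float:
        # x1 and x2 are molecular descriptor values for one compound.
        return self.a * x1 + self.b * x2 + self.c


# Hypothetical fitted parameters (not taken from any real model).
model = LinearQsar(a=0.5, b=0.25, c=1.0)

# Hypothetical descriptor values (x1, x2) for three untested compounds.
compounds = {
    "compound_A": (2.0, 4.0),
    "compound_B": (4.0, 8.0),
    "compound_C": (0.0, 2.0),
}

predictions = {
    name: model.predict(x1, x2) for name, (x1, x2) in compounds.items()
}

# Assumption: a larger predicted value means higher priority for a bioassay.
ranked = sorted(predictions, key=predictions.get, reverse=True)

for name in ranked:
    print(f"{name}: {predictions[name]:.2f}")
```

In practice the parameters a, b, and c would come from fitting the model to a training set of measured toxicities rather than being set by hand.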
Objectives
- Develop quantitative structure activity relationship (QSAR) methodologies to estimate toxicity from molecular structure
- Develop software, the Toxicity Estimation Software Tool (T.E.S.T.), that will enable users to easily estimate toxicity from molecular structure
QSAR Methodologies
Several QSAR methodologies have been developed:
- Hierarchical method - The toxicity for a given query compound is estimated using the weighted average of the predictions from several different models. The different models are obtained by using Ward’s method to divide the training set into a series of structurally similar clusters. A genetic algorithm based technique is used to generate models for each cluster. The models are generated prior to runtime.
- FDA method - The prediction for each test chemical is made using a new model that is fit to the chemicals that are most similar to the test compound. Each model is generated at runtime.
- Single-model method - Predictions are made using a multilinear regression model that is fit to the training set (using molecular descriptors as independent variables) using a genetic algorithm based approach. The regression model is generated prior to runtime.
- Group contribution method - Predictions are made using a multilinear regression model that is fit to the training set (using molecular fragment counts as independent variables). The regression model is generated prior to runtime.
- Nearest neighbor method - The predicted toxicity is estimated by taking an average of the toxicities of the 3 chemicals in the training set that are most similar to the test chemical.
These methodologies are explained in detail in the publications listed below.
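The nearest neighbor method described above can be sketched in a few lines of Python. This is a minimal illustration, not the T.E.S.T. implementation: it assumes each chemical is represented by a vector of descriptor values and that similarity is measured with the cosine similarity of those vectors, which is an assumption made here because the text above does not name the similarity measure. The function names and the training data are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two descriptor vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def nearest_neighbor_prediction(query, training_set, k=3):
    """Average the toxicities of the k training chemicals most similar to query.

    training_set is a list of (descriptor_vector, toxicity) pairs.
    """
    if len(training_set) < k:
        raise ValueError("training set has fewer than k chemicals")
    ranked = sorted(
        training_set,
        key=lambda item: cosine_similarity(query, item[0]),
        reverse=True,
    )
    neighbors = ranked[:k]
    return sum(toxicity for _, toxicity in neighbors) / k


# Hypothetical training data: (descriptor vector, measured toxicity).
training_set = [
    ((1.0, 0.0), 2.0),
    ((2.0, 0.0), 4.0),
    ((1.0, 1.0), 6.0),
    ((0.0, 1.0), 10.0),
    ((-1.0, 0.0), 20.0),
]
query = (1.0, 0.0)

predicted = nearest_neighbor_prediction(query, training_set)
print(f"Predicted toxicity: {predicted:.2f}")
```

The other methodologies differ mainly in what is fit and when: the FDA method would fit a new regression model to the most similar chemicals at runtime instead of averaging them, while the hierarchical, single-model, and group contribution methods rely on models generated beforehand.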
Toxicity Estimation Software Tool (T.E.S.T.)
T.E.S.T. enables users to easily estimate acute toxicity using the QSAR methodologies described above. The software is now available for download.
The software is described in further detail in the User's Guide (PDF) (46 pp, 317 KB). The software is based on the Chemistry Development Kit, an open-source Java library for computational chemistry.
Currently, the software includes models for the following endpoints:
- 96-hour fathead minnow 50% lethal concentration (LC50)
- Tetrahymena pyriformis 50% growth inhibition concentration (IGC50) - The University of Tennessee Tetratox database
- Oral rat 50% lethal dose (LD50) - US National Library of Medicine - ChemIDplus Advanced database
Models for additional endpoints will be added as they are completed.
Download T.E.S.T (version 2.0):
- Windows EXE (1 file, 68 MB)
- MacOSX ZIP (12 items, 67 MB)
- Linux BIN (71 MB)
The training and prediction sets (ZIP) (6 items, 1.8 MB) used in the software are available.
Sample structure data files (ZIP) (4 items, 3 KB) (such as an MDL SD file) are available.
System requirements
- Java version 1.5 or higher
- Memory:
- For Windows XP®, 1 GB of RAM is recommended.
- For Windows Vista®, 2 GB of RAM is recommended.
Installation Instructions
- Save the appropriate installation file to your hard drive. Due to the large size of the file, the download may take 15 minutes or longer depending on the speed of the connection.
- Double-click the installation file (for Linux users: open a shell, cd to the directory where you downloaded the installer and at the prompt type: sh ./install.bin).
Publications
Martin, T.M., P. Harten, R. Venkatapathy, S. Das and D.M. Young. (2008) "A Hierarchical Clustering Methodology for the Estimation of Toxicity." Toxicology Mechanisms and Methods, 18:251–266.
Martin, T.M., and D.M. Young. (2001) "Prediction of the Acute Toxicity (96-h LC50) of Organic Compounds in the Fathead Minnow (Pimephales promelas) Using a Group Contribution Method." Chemical Research in Toxicology, 14:1378–1385.
Contact
Todd Martin, PhD.
Research Chemical Engineer