National Library of Medicine
ChemIDplus Advanced Application
 Help
 


Index

Search Input
Spell Checker
Right Truncation
Classification Code
Registry Number
Molecular Formula
Locator
Toxicity Data
Chemical Properties
Structure Search
Molecular Weight Search
Multiple Term Queries

Chemical Identification and Search

This database allows users to search the NLM ChemIDplus database of over 370,000 chemicals. A user may enter compound identifiers such as Chemical Name, CAS Registry Number, Molecular Formula, Classification Code, Locator Code, and Structure or Substructure. New searchable features include search and display by Toxicity indicators such as Median Lethal Dose (LD50), by Physical/Chemical Properties such as LogP, and by Molecular Weight.

The Advanced search page is available at http://chem.sis.nlm.nih.gov/chemidplus/chemidheavy.jsp. This URL brings up a search page similar to other TOXNET search pages. For simple searches, just type the name of a substance of interest, or its CAS Registry Number, into the search box. Right truncation ("starts with") is available by placing an asterisk (*) at the end of a search term. The simpler "Lite" version of this interface is available at the ChemIDplus Lite page.

Search Input

The Search Page is divided into 6 areas of input:

  1. Substance Identification -- for textual chemical identifiers, such as Name/Synonym, CAS RN, Molecular Formula, Classification Code, and Locator
  2. Toxicity -- for numeric toxicity data such as LD50
  3. Chemical Properties -- for numeric property data such as Melting Point
  4. Locator Codes -- to qualify a search by one or more databases which contain pertinent data
  5. Structure -- to search by chemical structure
  6. Molecular Weight -- to range on the MW calculated from the structure
Each box has an info button to obtain more information, and a reset button to clear that section.

For instance, if you are interested in the drug "Valium", type this name into the search box and press Return or click Search. The search page with this input is shown below:



ChemIDplus Heavy input page

The system will search the ChemIDplus database of over 1.3 million names and identifiers, and return the answer if available. For most chemicals this is one record and the system automatically retrieves that record. If multiple records were retrieved, a list of names and structures would be shown. You could then click on the name to select a detailed record.

Here is what would be retrieved for Valium:

ChemIDplus Advanced record for Valium

The buttons on the left allow the display of categories of detailed information, such as the Full Record, the Structure, Names & Synonyms, Formulas, Classification Codes, Registry Numbers, and Notes, which are also available in ChemIDplus Lite. This new Advanced version also retrieves the 2D and 3D depictions of chemical structure, as well as toxicity and chemical/physical property data when available. Structures on this page are formatted as PNG images for ease of display, and do not require a plug-in.

In the center of the page, Locators for various resources are hyperlinked by an acronym, with a longer name on the right. The information (i) button may be clicked to give more information about a resource. There are three types of Locators:

  • File Locator(s) that point to a set of NLM associated databases of interest
  • Internet Locator(s) that point to a set of external resources with biomedical data that are relevant for this chemical, and
  • Superlist Locator(s) that point to a set of regulatory and scientific lists that contain information about this chemical.

Clicking on the Locator hyperlink will open a new window with data from the given resource, as close to the level of the substance as possible. An example for the File Locator HSDB (Hazardous Substances Data Bank) follows:

Valium Output


This "slave window" for external resources shows, in this case, the Valium record in HSDB. You can navigate within this HSDB record by clicking on its right-hand menu. The slave window is reused each time you click on a new Locator, or on a data type such as Names & Synonyms. It will normally open in the foreground on the top left portion of your screen. As you can see, the ChemIDplus buttons on the right give more options than ChemIDplus Lite.

If you click on "Full Record" you retrieve a list of all data for Valium carried by ChemIDplus. This includes the image of the structure when available. The data seen in this display follows; some data was edited out for this example:

ChemIDplus Advanced Full Record

Some names edited out

ChemIDplus Advanced Full Record

Spell Checker

Chemical names are often hard to spell and tend to be phoneticized by users into an incorrect name. If you had mistakenly input the name of the compound above as "valeum", the system would not have been able to find it. The system would then have tried to find some near matches for you. As you can see, in this case this includes the correct "valium" name. Just click on one of these names to go to the appropriate record.


ChemIDplus Lite Spell Checker output


Right Truncation

If you pull down the "Starts with" tab after an input name, the system will search for all names that start with that name or fragment of a name. For instance, if the beginning of the systematic name of Valium was input as 2H-1,4-Benzodiaz, the system would retrieve multiple answers that fit this criterion and display them as follows:

ChemIDplus Lite right truncation results


You may then select one or more records of interest to you of the 200 retrieved, using the navigation buttons on the left to browse between answers. Each compound contains a name that starts with 2H-1,4-Benzodiaz, even though this may not be the name shown on the retrieval screen. You may also achieve truncation by appending an asterisk to a string as in ChemIDplus Lite. Thus the string 2H-1,4-Benzodiaz* used in the Name Equals setting would also search for right truncation.

You could also use the "Contains" pull down after a name or name fragment. This search takes longer, but would find embedded instances of that input string.
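The three name-matching behaviors described above (Equals, Starts with, Contains) and the trailing-asterisk shorthand can be sketched as follows. This is only an illustration: the function name, the mode labels, and the case-insensitive comparison are assumptions, not a description of how ChemIDplus is implemented.

```python
def name_matches(query: str, name: str, mode: str = "equals") -> bool:
    """Return True if `name` matches `query` under the given mode.

    `mode` is "equals", "starts_with", or "contains" (hypothetical labels).
    In "equals" mode a trailing asterisk requests right truncation.
    Case-insensitive comparison is an assumption made for this sketch.
    """
    q = query.strip().lower()
    n = name.strip().lower()

    # "2H-1,4-Benzodiaz*" in Equals mode behaves like "Starts with".
    if mode == "equals" and q.endswith("*"):
        q = q[:-1]
        mode = "starts_with"

    if mode == "equals":
        return n == q
    if mode == "starts_with":
        return n.startswith(q)
    if mode == "contains":
        return q in n
    raise ValueError(f"unknown mode: {mode!r}")


systematic = "2H-1,4-Benzodiazepin-2-one, 7-chloro-1,3-dihydro-1-methyl-5-phenyl-"
print(name_matches("2H-1,4-Benzodiaz*", systematic))           # right truncation
print(name_matches("benzodiazepin", systematic, "contains"))   # embedded string
```

A "Contains" match has to examine the interior of every name rather than only its beginning, which is consistent with the note above that this search takes longer.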

Registry Number

This field searches across several CAS Registry Number fields in the ChemIDplus database. Entering the Registry Number of Valium, "439-14-5", and using the "Registry Number" pull down would retrieve the Valium record just as the name did. The zero-filled, non-hyphenated form "000439145" would also work. There are also superseded RNs under Other Registry Numbers, and Related Registry Numbers.
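As a rough sketch of how the hyphenated and zero-filled forms relate, the following converts one to the other and verifies the CAS check digit (the last digit equals the weighted sum of the preceding digits, counted from the right, modulo 10). The function names and the nine-character padding width (taken from the "000439145" example) are assumptions for illustration.

```python
import re


def _rn_digits(rn: str) -> str:
    """Strip hyphens from a CAS RN given as '439-14-5' or as plain digits."""
    rn = rn.strip()
    if not re.fullmatch(r"\d{2,7}-\d{2}-\d|\d{5,10}", rn):
        raise ValueError(f"not a CAS Registry Number: {rn!r}")
    return rn.replace("-", "")


def normalize_cas_rn(rn: str) -> str:
    """Return the zero-filled, non-hyphenated form, e.g. '000439145'."""
    return _rn_digits(rn).zfill(9)


def cas_check_digit_ok(rn: str) -> bool:
    """Validate the final check digit of a CAS RN."""
    digits = _rn_digits(rn)
    body, check = digits[:-1], int(digits[-1])
    # Weights 1, 2, 3, ... are applied from the rightmost body digit leftward;
    # leading zeros contribute nothing, so both input forms give the same sum.
    total = sum(int(d) * w for w, d in enumerate(reversed(body), start=1))
    return total % 10 == check


print(normalize_cas_rn("439-14-5"))      # 000439145
print(cas_check_digit_ok("439-14-5"))    # True
```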

Classification Code

This field contains category data from several sources for a given chemical. To search it, just pull down the Classification Code tab on the search page, enter your data, and hit enter. For instance, a search for "antineoplastic agents", a MeSH term carried in the Classification Code field, gives the following results:


ChemIDplus Classification Code results

Molecular Formula

This field contains the molecular formula of compounds, that is, a summary of the types of atoms contained in the structure and their counts. Formulas are stored and searched in the Hill convention (carbon first, then hydrogen, then all other elements alphabetically). ChemIDplus also uses hyphens between elements and their counts. To search this field, pull down the "Formula (hyphenated)" tab, and enter the formula desired. A molecular formula is not a unique piece of data, so you may get multiple results from a search. Here is the result for the formula of TCDD (C12-H4-Cl4-O2).

ChemIDplus Advanced MF results
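A minimal sketch of building a hyphenated Hill-order formula from element counts is shown below. The function name is hypothetical, and omitting a count of 1 (writing "Cl" rather than "Cl1") is an assumption about the stored format.

```python
def hill_formula(counts: dict) -> str:
    """Build a hyphenated Hill-order formula from {element: count}.

    Hill order: carbon first, hydrogen second, then the remaining elements
    alphabetically. When no carbon is present, all elements (including
    hydrogen) are listed alphabetically.
    """
    elements = sorted(counts)
    if "C" in counts:
        ordered = ["C"]
        if "H" in counts:
            ordered.append("H")
        ordered += [e for e in elements if e not in ("C", "H")]
    else:
        ordered = elements

    # Assumption: a count of 1 is left implicit.
    parts = [f"{e}{counts[e]}" if counts[e] != 1 else e for e in ordered]
    return "-".join(parts)


# TCDD, as in the example above
print(hill_formula({"Cl": 4, "O": 2, "H": 4, "C": 12}))  # C12-H4-Cl4-O2
```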


Locator

This field allows qualification of a search by the presence of data in a given resource. The locator can be entered free form in the first Data Search Type box using the Locator Code pull down, or in the Locator Codes box at the bottom of the page. This qualification can be used with any other search box as a limiter. In the bottom box, a user may apply AND, OR, or AND NOT logic to any two locators. Thus the following search would find compounds in CCRIS which are not in HSDB:

ChemIDplus Advanced Locator Search

This gives a retrieval of over 6,000 records which are in the CCRIS file and are not in the HSDB file. These compounds can now be reviewed as we saw above.
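The AND, OR, and AND NOT logic between two locators corresponds to set intersection, union, and difference. The sketch below uses made-up placeholder identifiers ("rn-001" and so on) rather than real ChemIDplus data, and the function name is hypothetical.

```python
def combine_locators(a: set, b: set, op: str) -> set:
    """Combine two sets of record identifiers with AND, OR, or AND NOT."""
    if op == "AND":
        return a & b        # records present in both resources
    if op == "OR":
        return a | b        # records present in either resource
    if op == "AND NOT":
        return a - b        # records in the first but not the second
    raise ValueError(f"unsupported operator: {op!r}")


# Placeholder identifiers, for illustration only.
ccris = {"rn-001", "rn-002", "rn-003"}
hsdb = {"rn-002", "rn-004"}

print(sorted(combine_locators(ccris, hsdb, "AND NOT")))  # ['rn-001', 'rn-003']
```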



Toxicity Data

There is now a table of toxicity values available in ChemIDplus for search and display. A user may search individually on values and ranges throughout this data, and use it to qualify other searches. In most cases a lower dose in mg/kg in a given test regimen means more toxicity in this regimen for this chemical. This data comes from an older subset of the RTECS database from NIOSH. Here are some of the basic definitions:

Toxicity Terms *

Note: These abbreviations indicate whether the dose caused death (LD) or other toxic non-lethal effect (TD), or whether it was administered as a lethal concentration (LC) or toxic concentration (TC) in the inhaled air. In general, the term “Lo” is used where the number of subjects studied was not a significant number from the population or the calculated percentage of subjects showing an effect was 100. The doses and concentrations are defined as follows:

  • TDLo – Toxic Dose Low – the lowest dose of a substance introduced by any route, other than inhalation, over any given period of time and reported to produce any toxic effect in humans or to produce tumorigenic, reproductive, or multiple dose effects in animals.

  • TCLo – Toxic Concentration Low – the lowest concentration of a substance in air to which humans or animals have been exposed for any given period of time that has produced any toxic effect in humans or caused death in humans or animals.

  • LDLo – Lethal Dose Low – the lowest dose (other than LD50) of a substance introduced by any route, other than inhalation, over any given period of time in one or more divided portions and reported to have caused death in humans or animals.

  • LD50 – Lethal Dose Fifty – a calculated dose of a substance which is expected to cause the death of 50% of an entire defined experimental animal population. It is determined from the exposure to the substance by any route other than inhalation of a significant number from that population. Other lethal dose percentages, such as LD1, LD10, LD30, and LD99, may be published in the scientific literature for the specific purposes of the author. Such data would be published in the Registry if these figures, in the absence of a calculated lethal dose (LD50), were the lowest found in the literature.

  • LCLo – Lethal Concentration Low – the lowest concentration of a substance in air, other than LC50, which has been reported to have caused death in humans or animals. The reported concentrations may be entered for periods of exposure that are less than 24 hours (acute) or greater than 24 hours (subacute and chronic).

  • LC50 – Lethal Concentration Fifty – a calculated concentration of substance in air, exposure to which for a specified length of time is expected to cause the death of 50% of an entire defined experimental animal population. It is determined from the exposure to the substance of a significant number from that population.


*Source: US Department of Health and Human Services, Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health. (1997). Comprehensive guide to the registry of toxic effects of chemical substances. DHHS (NIOSH) Publication No. 97-119.

Table 1. Test Results

Lethality         Extreme       High               Moderate              Low
Oral LD50         <50 mg/kg     50-500 mg/kg       500-5,000 mg/kg       >5,000 mg/kg
Dermal LD50       <200 mg/kg    200-2,000 mg/kg    2,000-20,000 mg/kg    >20,000 mg/kg
Inhalation LC50   <200 mg/m3    200-2,000 mg/m3    2,000-20,000 mg/m3    >20,000 mg/m3

Source: U.S. EPA. Office of Pesticide Programs, Registration and Classification Procedures, Part II. Federal Register 40:28279.
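
The ranges in Table 1 can be applied programmatically, as in the sketch below. The function and dictionary names are hypothetical. The table does not say which category a boundary value such as exactly 500 mg/kg belongs to, so assigning it to the more toxic category is an assumption made here.

```python
# Upper bounds of the Extreme, High, and Moderate categories from Table 1.
# Oral and dermal values are in mg/kg; inhalation values are in mg/m3.
TABLE_1_BOUNDS = {
    "oral": (50, 500, 5_000),
    "dermal": (200, 2_000, 20_000),
    "inhalation": (200, 2_000, 20_000),
}


def lethality_category(value: float, route: str = "oral") -> str:
    """Map an LD50 or LC50 value to its Table 1 lethality category."""
    extreme_below, high_max, moderate_max = TABLE_1_BOUNDS[route]
    if value < extreme_below:
        return "Extreme"
    if value <= high_max:       # assumption: shared boundary -> more toxic class
        return "High"
    if value <= moderate_max:   # same assumption for the next boundary
        return "Moderate"
    return "Low"


print(lethality_category(30))              # Extreme
print(lethality_category(1_000, "dermal")) # High
```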

Numerical Toxicity Rating Definitions*
(Human Oral Lethal Dose)

Toxicity Rating or Class    Dose              For 70 kg Person (150 lb)
6  Supertoxic               <5 mg/kg          A taste (less than 7 drops)
5  Extremely toxic          5-50 mg/kg        Between 7 drops and 1 tsp
4  Very toxic               50-500 mg/kg      Between 1 tsp and 1 oz
3  Moderately toxic         0.5-5 g/kg        Between 1 oz and 1 pint (1 lb)
2  Slightly toxic           5-15 g/kg         Between 1 pint and 1 quart
1  Practically nontoxic     Above 15 g/kg     More than 1 quart (2.2 lb)

*Source: Gosselin, Robert E., Smith, Roger P., & Hodge, Harold C. (1984). Clinical toxicology of commercial products. Fifth Edition. Baltimore: Williams & Wilkins.

There has been some effort at international harmonization of the meaning of LD50 values. Here is a reference to one publication. http://www.unece.org/trans/danger/publi/ghs/ghs_text-pdf/GHS-PART-3e.pdf

The user may range over the values of these tests in combination with other parts of a search. Here is a search for LD50 Values in Mice given Intraperitoneally, with values between 0 and 30:

ChemIDplus Advanced Toxicity Search

Note that the actual count of each Test, Route, and Species is given in parentheses for guidance. Below is the toxicity data for the first answer of over 3,500 retrieved by this query. It is for the chemical Mitomycin C, and was retrieved by clicking on the Toxicity button. Note that a reference line which meets ANY criteria has that value highlighted in red. Any line that meets ALL criteria is highlighted in yellow. Some references contain a link to a Medline citation, if available.

ChemIDplus Advanced Toxicity Search Results



Chemical Properties

There is searchable Chemical Property Data available for over 25,000 compounds. Some values have been measured and some calculated. Here are definitions for these values:

  • MELTING POINT
    • The temperature at which the solid and liquid phases are in equilibrium at one atmosphere (Boethling, 2000)
    • Used for estimating vapor pressure when the chemical is a solid (Boethling, 2000)
  • BOILING POINT
    • The temperature at which a liquid’s vapor pressure equals the pressure of the atmosphere on the liquid. (Boethling, 2000)
    • Pure chemicals have a unique boiling point which can be used in identification of material. (Boethling, 2000)
    • Used for estimating vapor pressure (Boethling, 2000)
  • WATER SOLUBILITY
    • Affects chemical's distribution between environmental compartments
    • if LOW: Dissolve more slowly and have a stronger tendency to partition out of aqueous solution into other phases (suspended solids and sediment). (Boethling, 2000)
    • if HIGH: Dissolve freely in water if accidentally spilled and will tend to remain in aqueous solution until degraded. (Boethling, 2000)
    • There is a relationship between Water solubility and the ability to biodegrade.
  • OCTANOL/WATER PARTITION COEFFICIENT
    • Mimics partitioning in biota
    • Ratio of the concentration of a solute in octanol to its concentration in water (Boethling, 2000)
    • Used for estimating soil adsorption, bioconcentration, Henry’s Law constant (air/water partition coefficient)
    • log Koc = 0.903 log Kow + 0.094 (Boethling, 2000)
    • Used for estimating bioconcentration, bioaccumulation, and bioavailability
    • log BCF = 0.79 log Kow - 0.4 (64FR604)
    • PBT Criteria: BCF values of 100 to 1,000 medium concern; >1,000 high concern (EPA). (64FR604)
    • Other High Concern Values: BCF of 5,000 (NAFTA-CEC); BCF of 5,000 (UNECE-LRTAP); BCF of 5,000 (CMA) (64FR604)
  • VAPOR PRESSURE
    • Affects whether a chemical in the atmosphere will occur as a vapor or adsorbed to particulate matter.
    • Physical State of Compound: (Bidleman, 1988)
    • ≤ 10^-8 mm Hg: Solely in Particulate Phase
    • 10^-5, 10^-6, 10^-7 mm Hg: Vapor and Particulate Phases
    • ≥ 10^-4 mm Hg: Solely in Vapor Phase
  • ACID DISSOCIATION CONSTANT
    • Indicates whether a chemical is undissociated or occurs as an anion ("acid") or cation ("base").
    • Values of pKa<4 for bases and pKa>8 for acids are not relevant in an environmental context. (Tomlin, 1997)
  • HENRY’S LAW CONSTANT
    • Ratio of a chemical's concentration in the gas phase to that in the liquid phase
    • Volatilization from moist soil and water surfaces: (Lyman et al 1990)
    • <10^-7 atm m3/mole: Essentially non-volatile
    • 10^-7 to 10^-3 atm m3/mole: May volatilize
    • >10^-3 atm m3/mole: Will volatilize rapidly
    • Atmospheric removal of particulate-phase compounds: (Lyman et al 1990)
    • Wet deposition: removal from atmosphere by precipitation of rain, fog, snow
    • Dry deposition: removal from atmosphere of particle associated chemical by diffusion/ sedimentation
  • OH RADICAL REACTION RATE CONSTANT
    • The OH radical is the key reactive species in the troposphere. All organics react with OH except CFCs and non-hydrogenated halogens (Boethling, 2000)
    • Used to calculate the half-life of chemicals in the atmosphere (Boethling, 2000)
    • PBT Criteria: Persistence half-life in air = 2 days. (64FR604)
  • REFERENCES:
  • 64FR608. Federal Register. Persistent Bioaccumulative Toxic (PBT) Chemicals; Proposed Rule. January 5, 2002.
  • Bidleman, T.F. Atmospheric processes. Wet and dry deposition of organic compounds are controlled by their vapor particle partitioning. Environ Sci Technol 22: 361-367 (1988)
  • Boethling R.S., and D. Mackay. Handbook of Chemical Property Estimation Methods. Boca Raton, FL: Lewis (2000)
  • Lyman, W.J., et al. Handbook of Chemical Property Estimation Methods. Washington, DC: Amer Chem Soc (1990)
  • Tomlin, C.D.S. (ed.). The Pesticide Manual World Compendium. 11th ed., Surrey, England: British Crop Protection Council. (1997)
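
The two regression equations listed under the octanol/water partition coefficient, together with the PBT bioconcentration criteria, can be worked through as in the sketch below. The function names are hypothetical, and treating a BCF of exactly 1,000 as "medium concern" follows the ">1,000 high concern" wording above.

```python
def log_koc_from_log_kow(log_kow: float) -> float:
    """Soil adsorption estimate: log Koc = 0.903 log Kow + 0.094."""
    return 0.903 * log_kow + 0.094


def log_bcf_from_log_kow(log_kow: float) -> float:
    """Bioconcentration estimate: log BCF = 0.79 log Kow - 0.4."""
    return 0.79 * log_kow - 0.4


def pbt_bcf_concern(bcf: float) -> str:
    """Apply the PBT criteria quoted above to a BCF value."""
    if bcf > 1_000:
        return "high concern"
    if bcf >= 100:
        return "medium concern"
    return "below the listed criteria"


log_kow = 5.0
log_koc = log_koc_from_log_kow(log_kow)   # 0.903 * 5 + 0.094 = 4.609
log_bcf = log_bcf_from_log_kow(log_kow)   # 0.79 * 5 - 0.4 = 3.55
bcf = 10 ** log_bcf                       # roughly 3,500

print(round(log_koc, 3), round(log_bcf, 2), round(bcf), pbt_bcf_concern(bcf))
```

For a compound with log Kow = 5, the estimated BCF is on the order of 3,500, which falls in the high concern range of the EPA criteria but below the 5,000 threshold used by the other listed bodies.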

Following is a search for compounds with a LogP greater than 5, that is, compounds that partition into fat at a concentration more than 10^5 times higher than in water:


ChemIDplus Advanced LogP Search

Here are the results of that search. Note that DDT is retrieved, a substance which is known to be stored in the fatty tissues of animals. Thioridazine is also retrieved; it is an antipsychotic agent with Central Nervous System (CNS) activity, which is often associated with lipid solubility:


ChemIDplus Advanced LogP results


For some ranges of small values such as those found in Henry's law, the exponential input format may be used. Thus to find the range shown in the Help of 10^-7 to 10^-3 (may volatilize), the following input format is appropriate.

ChemIDplus Henry's Law Input
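
To illustrate exponential notation and the Henry's Law ranges defined earlier, the sketch below parses values written in E notation and classifies them. The exact input format accepted by the search page is the one shown in the screenshot; the Python-style strings used here ("1e-7") are an assumption for illustration, as is the function name.

```python
def volatilization_class(h: float) -> str:
    """Classify a Henry's Law constant (atm m3/mole) using the ranges above."""
    if h < 1e-7:
        return "Essentially non-volatile"
    if h <= 1e-3:
        return "May volatilize"
    return "Will volatilize rapidly"


# Range bounds entered in exponential (E) notation.
low, high = float("1e-7"), float("1e-3")

for text in ("2.5e-9", "5e-5", "1.2e-2"):
    value = float(text)
    in_range = low <= value <= high
    print(text, volatilization_class(value), in_range)
```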

Structure Search

ChemIDplus allows a user to either draw their own structure for searching, or to pull in a structure that has already been retrieved. To transfer the structure of DDT, the transfer icon can be clicked to bring this to the search page for editing.


ChemIDplus Advanced Structure transfer


These are the structure buttons with their functions:

  1. Transfer button: Transfer the current structure to the search page.
  2. Similarity button: Find compounds whose structures are 80% similar to the current compound.
  3. Enlarge button: Enlarge the display, allowing both 2D and 3D display.
A user may also draw a structure using either the Marvin Java applet, or MDL's free ISISDraw to transfer to Chime. For Marvin, it is recommended that the latest version of the Java Virtual Machine be downloaded and installed from Sun. You may be prompted to do this, and also to accept Marvin as an applet.

Here is an example of the use of Marvin. The user clicked on the Structure box to invoke the applet:
ChemIDplus Advanced Marvin Input Search

The user has drawn this structure using the tools provided by Marvin (Chime and ISISDraw would be another option). They may then choose a structure search option.

Structure Search Options

     1.   Substructure Search, which will look for the drawn (or transferred) structure embedded in the structures of other compounds. This is useful for finding a set of compounds that share a common "Sub Structure" which may cause good or bad biological activity. An example would be the organophosphate class of pesticides which can be represented by a substructure. Remember, Marvin and ISISDraw both allow indeterminate atoms to be used in substructures.
 
     2.   Similarity Search, which will look for other compounds with structural features that are similar to those of the drawn (or transferred) structure. The default is to find compounds that are 80% similar. This similarity percentage may be changed to a value between 50% and 100% in the pull down box. You may find compounds that are related and are relatively close in overall size using this method. For instance, the structure of the drug "Rabeprazole" is 91% similar to the drug "Omeprazole", which has similar anti-ulcer activity. In other cases retrieved compounds may share features but arrange these structural features differently than the parent and thus give different biological activity.
 
     3.   Exact (parent only), which will look for the structure that is drawn (or transferred) as a complete entity, with all the structure's atoms and bonds identical in the retrieved compound.
 
     4.   Flex (parent, salts, mixture), which will look for the structure that is drawn (or transferred), with all bonds identical in the retrieved compound including stereochemical and tautomeric bonds. Flex will also find salts, mixtures, hydrates, and polymers of the parent that is drawn. A Flex search of the structure of Penicillin G will find 28 salts and mixtures including Penicillin G Potassium. If you start with a complex salt or mixture with two or more components, you may only retrieve salts and mixtures that have two or more components. The substance "Clopidogrel bisulfate" will not retrieve the parent "Clopidogrel" in this mode, while the structure of the simpler compound named "Chlordiazepoxide hydrochloride" will find the parent "Chlordiazepoxide" in Flex mode. SUGGESTION: If your input structure has two or more components, delete all components but the structure you are interested in before using Flex.
 
     5.   Flexplus (parent, all variations), which will look for the structure that is drawn (or transferred), including compounds that have stereochemical and tautomeric variations in their bonds. It will also find salts, mixtures, hydrates, and polymers of the parent that is drawn, plus compounds containing metal atoms bonded to the parent. Flexplus may find compounds with different biological activity from the parent because of differences in stereochemistry and metal bonding. For instance a search for the structure of "Adenine arabinoside" (Vidarabine), which is an antiviral, will retrieve the record for "Adenine riboside" (Adenosine) which has no such activity. You should use Flexplus with caution if this possibility is a problem for you, and perhaps use Flex instead. SUGGESTION: If your input structure has more than one component, delete all components but the structure you are interested in before using Flexplus.
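
This page does not state how the similarity percentage in the Similarity Search option is computed. As an illustration only, the sketch below uses the Tanimoto coefficient on sets of structural features, which is one common measure in chemical similarity searching and is not necessarily the one ChemIDplus uses. The feature sets are arbitrary integers standing in for structural features, and the function names are hypothetical.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient: shared features divided by all distinct features."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_search(query: frozenset, library: dict, threshold: float = 0.80) -> list:
    """Return names of library entries at or above the similarity threshold.

    The 0.50 to 1.00 limits mirror the 50% to 100% pull down described above.
    """
    if not 0.50 <= threshold <= 1.00:
        raise ValueError("threshold must be between 0.50 and 1.00")
    return [name for name, feats in library.items()
            if tanimoto(query, feats) >= threshold]


# Arbitrary placeholder feature sets, not derived from real structures.
query = frozenset({1, 2, 3, 4})
library = {
    "close analog": frozenset({1, 2, 3, 4, 5}),   # 4 shared / 5 total = 0.80
    "distant compound": frozenset({1, 9, 10}),    # 1 shared / 6 total
}
print(similarity_search(query, library))          # ['close analog']
```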


Molecular Weight Search

For every structure in ChemIDplus, a molecular weight (MW) is calculated and stored. The user should be aware that because of differing drawing conventions, this calculated value may differ from the true MW. For instance, if a salt with a ratio of two organic ions to one inorganic ion is drawn, the MW will sum the weights of both organic ions, effectively doubling the MW. The user may range over two values (preferred method), or use the greater than, less than, or equals operator. Here is an example of a query to find MWs between 100 and 200. Note that the results show the MW on each record.


ChemIDplus Advanced MW Search

Here are the results of this search showing the first 4 answers:


ChemIDplus Advanced MW Out
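
A molecular weight is the sum of atomic weights over every atom drawn, which is why drawing two organic ions in a salt counts both of them. The sketch below computes an MW from element counts and applies a range test. The atomic weights are rounded standard values for a handful of elements, and the function names are hypothetical; this is not the ChemIDplus calculation itself.

```python
# Rounded standard atomic weights for a few common elements.
ATOMIC_WEIGHT = {
    "H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999,
    "Na": 22.990, "S": 32.06, "Cl": 35.45,
}


def molecular_weight(counts: dict) -> float:
    """Sum atomic weights over every atom in {element: count}."""
    return sum(ATOMIC_WEIGHT[element] * n for element, n in counts.items())


def mw_in_range(mw: float, low: float, high: float) -> bool:
    """Range test corresponding to a search between two MW values."""
    return low <= mw <= high


naphthalene = {"C": 10, "H": 8}
tcdd = {"C": 12, "H": 4, "Cl": 4, "O": 2}

for name, counts in (("naphthalene", naphthalene), ("TCDD", tcdd)):
    mw = molecular_weight(counts)
    print(name, round(mw, 2), mw_in_range(mw, 100, 200))
```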


Multiple Term Queries

A user may combine query terms from any of the fields in a search. The query term(s) from each section are ANDed together to get a resultant set of answers. Thus a query for compounds with a mouse LD50 less than 100 given intraperitoneally, with the substructure of pyridine, and with data contained in the HSDB file is input as follows:


ChemIDplus Advanced Multi Term In Search

It gives the following 20 results:




ChemIDplus Advanced Multi Term In Search


Remember when searching multiple types of data that compounds for which we do not have data in a given area will not be retrieved, since all fields requested are searched. This is especially true of Physical Properties, which occur in the fewest records.
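
The ANDing of terms, and the effect of missing data, can be sketched as a list of conditions that must all hold for a record to be retrieved. The record class, field names, and sample records below are hypothetical placeholders, not ChemIDplus data or its internal design.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Record:
    name: str
    mouse_ip_ld50: Optional[float] = None   # mg/kg; None means no data held
    contains_pyridine: bool = False
    locators: frozenset = frozenset()


Predicate = Callable[[Record], bool]


def ld50_below(limit: float) -> Predicate:
    # A record with no toxicity value can never satisfy this term.
    return lambda r: r.mouse_ip_ld50 is not None and r.mouse_ip_ld50 < limit


def has_pyridine() -> Predicate:
    return lambda r: r.contains_pyridine


def in_locator(code: str) -> Predicate:
    return lambda r: code in r.locators


def search(records: list, terms: list) -> list:
    """Return records that satisfy every term (terms are ANDed)."""
    return [r for r in records if all(term(r) for term in terms)]


records = [
    Record("compound A", 40.0, True, frozenset({"HSDB", "CCRIS"})),
    Record("compound B", None, True, frozenset({"HSDB"})),    # no LD50 data
    Record("compound C", 250.0, True, frozenset({"HSDB"})),   # LD50 too high
    Record("compound D", 12.0, True, frozenset({"CCRIS"})),   # not in HSDB
]

hits = search(records, [ld50_below(100), has_pyridine(), in_locator("HSDB")])
print([r.name for r in hits])   # ['compound A']
```

Note that "compound B" is excluded only because it has no toxicity value, even though it meets the other two conditions.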

The query logic can be reviewed by clicking on either the "Show Query" or "Search History" button. The latter lists all searches with their result counts, and lets you rerun them exactly or modify them.


ChemIDplus Advanced History


ChemIDplus Lite

If you do not need all the features of Advanced ChemIDplus, there is a simpler version of ChemIDplus that only allows Name and Registry Number searching at: http://chem.sis.nlm.nih.gov/chemidplus/chemidlite.jsp

Structure display is by image only, and the structure cannot be manipulated. It is designed to be easy to use to find individual chemicals and be led to other data by the Locator field.


If you have any comments, please send us email at tehip@teh.nlm.nih.gov, or call the following customer service number.

Customer Service
National Library of Medicine
8600 Rockville Pike
Bethesda, MD 20894
Telephone: 1-888-FINDNLM (1-888-346-3656)
e-mail: custserv@nlm.nih.gov