Hodes model for ranking small-molecule structures

In 1981, the selection of compounds of interest for DTP’s repository was greatly enhanced by the use of a clustering model developed by Louis Hodes. The model is capable of measuring the novelty of a chemical structure by comparing it to known compounds in DTP’s structural database.1

Clustering involves four steps: generating appropriate descriptors for each compound in the data set, selecting a similarity measure, applying a clustering method, and analyzing the results. DTP’s clustering model used a nonhierarchical, single-pass method to assign each compound to a cluster. Each descriptor, a structural fragment, was weighted by its number of occurrences in the compound, the size of the fragment, and its frequency of occurrence in the data set. Unlike most clustering models, the Hodes model used an asymmetric coefficient to determine similarity.
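The steps above can be illustrated with a minimal Python sketch. It is not Hodes's published method: the weighting formula, the form of the asymmetric coefficient, the threshold, and all function and fragment names are assumptions chosen only to show how weighted fragments, an asymmetric similarity, and single-pass assignment fit together.

```python
import math
from collections import Counter


def fragment_weights(compounds, fragment_sizes):
    """Weight every fragment of every compound.

    Assumed formula (illustrative only): occurrences in the compound
    * fragment size * a log term that grows as the fragment gets rarer
    in the data set.
    """
    n = len(compounds)
    # Number of compounds in the data set that contain each fragment.
    doc_freq = Counter(frag for frags in compounds.values() for frag in frags)
    return {
        cid: {
            frag: count * fragment_sizes[frag] * math.log(1 + n / doc_freq[frag])
            for frag, count in frags.items()
        }
        for cid, frags in compounds.items()
    }


def asymmetric_similarity(query, reference):
    """Fraction of the query's total weight that also appears in the reference.

    Asymmetric by construction: swapping the arguments changes the denominator.
    """
    total = sum(query.values())
    if total == 0:
        return 0.0
    shared = sum(min(w, reference.get(frag, 0.0)) for frag, w in query.items())
    return shared / total


def single_pass_cluster(weights, threshold):
    """Visit each compound once; join the first similar leader or start a new cluster."""
    clusters = {}  # leader id -> list of member ids
    for cid, w in weights.items():
        for leader in clusters:
            if asymmetric_similarity(w, weights[leader]) >= threshold:
                clusters[leader].append(cid)
                break
        else:
            clusters[cid] = [cid]  # nothing similar enough: cid becomes a leader
    return clusters


# Hypothetical toy data: fragment name -> number of occurrences in the compound.
compounds = {
    "A": {"ring": 2, "amine": 1},
    "B": {"ring": 2, "amine": 1, "nitro": 1},
    "C": {"halide": 3},
}
fragment_sizes = {"ring": 6, "amine": 2, "nitro": 3, "halide": 1}

weights = fragment_weights(compounds, fragment_sizes)
print(asymmetric_similarity(weights["A"], weights["B"]))  # A is fully covered by B
print(asymmetric_similarity(weights["B"], weights["A"]))  # B is only partly covered by A
print(single_pass_cluster(weights, threshold=0.7))
```

In this sketch a compound's novelty could be read as one minus its highest similarity to any known compound. Note that the result of a single-pass method depends on the order in which compounds are visited.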

The Hodes model was used not only to determine the novelty of a compound offered for evaluation, but also to determine a representative sample of the entire database, which, at the time, contained structures for more than 350,000 compounds. Representative samples are useful for multiple purposes, including the selection of samples for small screening sets representing the "chemical space" of the entire compound set.
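One simple way to draw a representative sample from a clustered database is to take one compound per cluster. The sketch below assumes this approach; the source does not say how DTP actually chose representatives, and the function name, the choice of the cluster leader as representative, and the size cap are all hypothetical.

```python
def representative_sample(clusters, max_size=None):
    """Return one representative per cluster (here, the cluster's leader id).

    `clusters` maps a leader id to the list of its member ids.
    Larger clusters come first, so a truncated sample still covers
    the most populated regions of the compound set.
    """
    ordered = sorted(clusters.items(), key=lambda item: len(item[1]), reverse=True)
    reps = [leader for leader, _members in ordered]
    return reps if max_size is None else reps[:max_size]


# Hypothetical cluster assignments.
clusters = {"A": ["A", "B", "D"], "C": ["C"], "E": ["E", "F"]}
print(representative_sample(clusters))     # ['A', 'E', 'C']
print(representative_sample(clusters, 2))  # ['A', 'E']
```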

1 Hodes L. Clustering a large number of compounds. 1. Establishing the method on an initial sample. J Chem Inf Comput Sci 1989;29:66–71.

Clustering a large number of compounds. 1. Establishing the method on an initial sample.

Clustering a large number of compounds. 2. Using the Connection Machine.

Clustering a large number of compounds. 3. The limits of classification.

 

The Hodes clustering model revolutionized the selection of compounds of interest by measuring the novelty of a chemical structure through comparison with known compounds.
