SMILES strings - 237,771 structures in SMILES format.
This database contains essentially all open structures in the NCI database up until about June 1995. It
includes metal-containing compounds and other 'weird stuff'. It is therefore up to the user to ascertain the
usefulness of any of these SMILES strings for the intended purpose. Because different conversion programs produce
different output, two versions of the SMILES database are provided. We'd like to hear comments (send to
Marc C. Nicklaus) on how useful either one of these versions turns out to be.
- Babel-converted version. 4.2 MB compressed using standard Unix compress; uncompresses to ca. 15 MB.
The program Babel v. 1.6 was used to convert 3D coordinates to SMILES; the coordinates had been generated from the
connection tables by the program Corina v. 1.7. (Babel needs 3D coordinates when reading SD files.) The resulting Babel output was
modified by simple string substitution to solve the problem of nitro groups lacking formal charges, which
leads many SMILES readers to create an -N-O-H group. Thus, N(=O)O was replaced by [N+](=O)[O-], and N(=O)(O)
was replaced by [N+](=O)([O-]).
- CACTVS-converted version. 4.4 MB compressed using standard Unix compress; uncompresses to ca. 15 MB.
The program CACTVS v. 3.2 was
used to convert the connection tables to SMILES strings. Thanks to Wolf-Dietrich Ihlenfeldt for providing us with the conversion
scripts handling the formal charge problem and other 'unusual stuff' in the NCI database.
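The nitro-group substitution described for the Babel-converted version can be sketched as follows. This is only an illustration of the two literal replacements named above, not the script that was actually used; the function name fix_nitro is hypothetical, and the replacement is purely textual, so it will also rewrite any other fragment that happens to be spelled N(=O)O.

```python
# Minimal sketch of the nitro-group fix as plain string substitution.
# The function name is an assumption for illustration; only the two
# pattern/replacement pairs come from the description above.

NITRO_SUBSTITUTIONS = [
    ("N(=O)(O)", "[N+](=O)([O-])"),
    ("N(=O)O", "[N+](=O)[O-]"),
]


def fix_nitro(smiles: str) -> str:
    """Replace uncharged nitro spellings with their charge-separated form."""
    for old, new in NITRO_SUBSTITUTIONS:
        smiles = smiles.replace(old, new)
    return smiles


if __name__ == "__main__":
    for s in ("c1ccccc1N(=O)O", "N(=O)(O)c1ccccc1"):
        print(s, "->", fix_nitro(s))
```

Strings that already carry the charged form are left unchanged, because neither pattern occurs inside [N+](=O)[O-] or [N+](=O)([O-]).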
Other information available on these compounds.