U.S. Department of Energy

Human Genome 1993 Program Report: Sequencing Technologies

Date Published: March 1994


For a printed copy of this document, contact
Human Genome Management Information System
Oak Ridge National Laboratory
1060 Commerce Park, MS 6480
Oak Ridge, TN 37830
423-576-6669, Fax: 423-574-9888
Internet: bkq@ornl.gov


Sequencing Technologies

Projects New in FY 1993

Sequencing by Hybridization: Development of an Efficient Large-Scale Methodology


Radomir Crkvenjakov
Center for Mechanistic Biology and Biotechnology; Argonne National Laboratory; Argonne, IL 60439-4833
708/252-3161, Fax: -3387, Internet: crk@everest.anl.gov

We proposed DNA sequencing by hybridization (SBH) in 1987. Steady progress in research and theory, including the sequencing of an unknown short (343-bp) DNA by this method, opens the way for rapid development and laboratory-scale implementation of the SBH approach. To achieve our research objective of developing potential daily SBH rates of up to 1 Mb per laboratory, we are exploiting SBH Format 1, in which DNA samples arrayed on a surface are sequentially interrogated by oligonucleotide probes.

This strategy is based on the development of a high-throughput line for simultaneous production of hybridization scores on hundreds of thousands of 1- to 2-kb clones. DNA sample preparation and dense offprinting on filters, hybridization, and imaging are highly parallelized and streamlined for easy automation. A throughput capacity of 1 million scores/d is projected for the next year, increasing to 10 million/d in the near future.

Three levels of sequencing information can be obtained depending on the number of probes scored per clone in an experiment. Mapping and identification using clone sequence signatures can be achieved with relatively few (50 to 200) probes. Positioning and identifying genome structural elements (partial sequencing) requires more-extensive hybridizations. Complete sequencing by SBH requires data from several thousand probes, either on three to five related genomes or, in the case of single genomes, in combination with single-pass gel sequencing of one genome equivalent.

In an orderly progression toward complete sequencing, we have almost completed the development of SBH for the first group of applications. Typing of 20,000 cDNA clones from human brain with 60 to 110 probes led to grouping them into at least 5000 gene clusters, revealing the abundance structure of the libraries used. A model experiment on known clones simulated the cosmid-sized DNA subclone library of ten equivalents. This experiment demonstrated that SBH data from 110 probes can lead to clone clustering so that the entire DNA is represented in a one- to two-equivalent set of clones drawn from the clusters. This can reduce the redundancy of gel sequencing by five- to tenfold. The principle of partial sequencing was demonstrated by identifying the gamma-actin cDNA cluster only on the basis of its hybridization scores.
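
The grouping of clones by their hybridization scores can be illustrated with a minimal sketch. The report does not describe the clustering program actually used, so the binary scoring, the Hamming-distance measure, the greedy assignment rule, and all names and data below are assumptions chosen only to show the idea.

```python
from typing import Dict, List


def hamming(a: str, b: str) -> int:
    """Count the probe positions at which two signatures disagree."""
    return sum(x != y for x, y in zip(a, b))


def cluster_clones(signatures: Dict[str, str], max_mismatch: int) -> List[List[str]]:
    """Greedy clustering (an illustrative assumption, not the authors' method).

    A clone joins the first cluster whose representative (its first member)
    differs from it at no more than max_mismatch probes; otherwise it
    founds a new cluster.
    """
    clusters: List[List[str]] = []
    for clone, sig in signatures.items():
        for cluster in clusters:
            if hamming(signatures[cluster[0]], sig) <= max_mismatch:
                cluster.append(clone)
                break
        else:
            clusters.append([clone])
    return clusters


# Hypothetical toy data: '1' = probe scored positive on the clone, '0' = negative.
signatures = {
    "cloneA": "1100101100",
    "cloneB": "1100101000",  # disagrees with cloneA at one probe
    "cloneC": "0011010011",
    "cloneD": "0011010111",  # disagrees with cloneC at one probe
    "cloneE": "1111111111",
}
clusters = cluster_clones(signatures, max_mismatch=1)
print(clusters)
```

Real hybridization scores are continuous and noisy, so a production program would presumably need normalized intensities and a more tolerant similarity measure than this exact-mismatch count.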

Intermediate-term goals are to (1) prepare sequence-ready maps of 1- to 2-kb subclones of human cosmids or bacterial artificial chromosomes and of several related bacterial genomes; (2) identify partially sequenced cDNAs in previously sequenced libraries to avoid redundancy in gene discovery and efficiently provide cDNAs from as-yet-unknown genes for complete sequencing; and (3) starting from the above maps, combine hybridization data from 3000 probes and single-pass gel sequencing to obtain very accurate finished sequence at a scale of 5 to 20 Mb/year.

Coupling Sequencing by Hybridization with Gel Sequencing for Inexpensive Analysis of Genes and Genomes


Radoje Drmanac, Snezana Drmanac, and Ivan Labat
Integral Genetics Group; Center for Mechanistic Biology and Biotechnology; Argonne National Laboratory; Argonne, IL 60439
708/252-3175, Fax: -3387, Internet: rade@everest.bim.anl.gov

Since 1987 when we conceived sequencing by hybridization (SBH), we have developed several procedures and concepts that enable immediate use of the method as well as future "chip"-based technologies. In particular, hybridization conditions were defined and proved by correct sequencing of 343 bp in a blind test. Solutions for inexpensive, large-scale genome analysis with state-of-the-art technologies are represented by (1) partial sequencing or fine structural (and sequence-ready) mapping with 100 to 1000 probes and (2) full sequencing by integrating the incomplete gel and SBH data from single or several similar genomes. A basis for genome sequencing without subcloning is provided by Format 1 (an array of DNA samples) or Format 2 (an array of probes) sequencing chips based on microbeads, and by a recently proposed combination of the two formats. Format 3 (combinatorial chip) involves ligation of arrayed probes and probes in solution.

To implement Format 1, we have developed a data-production line with a present capacity of 1 million clone-probe measurements/d. A high-throughput polymerase chain reaction (PCR) procedure has been established using BioOvens. A Biomek 1000 has been adapted to spot 31,000 DNA samples on a 6- by 9-in. filter. This dot density provides 50 Mb of DNA per membrane, ready for fine mapping and sequencing. Development of a hybridization machine with a capacity of 24 filters is in progress. A PhosphorImager collects data from 33P-labeled probes, and our image-analysis program reports dot intensities. Priorities for upgrading current facilities toward a capacity of 10 million scores/d are automated setup of 10,000 PCR reactions/d, labeling of 100 probes/d, and robotized retrieval of selected subsets of clones.
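
The figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only numbers from the text, plus one labeled assumption: that each filter is hybridized with a single probe per run, so that one run of the planned 24-filter machine yields one score per dot.

```python
import math

# Figures taken from the text.
dots_per_filter = 31_000           # DNA samples spotted per 6- by 9-in. filter
dna_per_filter_bp = 50_000_000     # 50 Mb of DNA per membrane
filters_per_run = 24               # capacity of the planned hybridization machine
target_scores_per_day = 10_000_000

# Average insert length implied by 50 Mb spread over 31,000 dots.
implied_insert_bp = dna_per_filter_bp / dots_per_filter

# Assumption (not stated in the report): one probe per filter per run.
scores_per_run = dots_per_filter * filters_per_run
runs_per_day_needed = math.ceil(target_scores_per_day / scores_per_run)

print(f"implied average insert: {implied_insert_bp:.0f} bp")
print(f"scores per 24-filter run: {scores_per_run:,}")
print(f"runs per day for 10 million scores: {runs_per_day_needed}")
```

The implied average insert of about 1.6 kb is consistent with clones in the 1- to 2-kb range.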

With this setup, 20,000 cDNA clones from a brain library (M. B. Soares, Columbia University) have been hybridized with 256 probes. About 13,000 groups or single clones have been recognized by our clustering program. Screening provides a rational choice of clones for gene mapping and full sequencing. The method's simplicity allows inexpensive screening of millions of cDNAs from dozens of tissues. Our first target is 100,000 clones from the brain library. To demonstrate sequence-ready mapping (ordering of shotgun clones), 1100 M13 subclones from a cosmid (B. Koop, University of Victoria, Canada) have been hybridized with 250 probes, and screening of a shotgun library of the 2-Mb genome of the archaebacterium Pyrococcus furiosus (F. Robb, University of Maryland, Baltimore) has been started.

The next target is a proof of the proposed inexpensive sequencing scheme, which requires 3000 probes and targeted single-pass gel sequences with as much as 20% errors. A further advancement would be comparative sequencing of 4 similar bacterial genomes. Megabase sequencing based on reading 14-mers through ligation of back-to-back hybridized 7-mers will be investigated in parallel.

Sequencing By Hybridization With Oligonucleotide Matrices (SHOM)


Andrei Mirzabekov, Yuri Lysov, Eduard Kraindlin, Gennadi M. Ershov, and Vladimir Florentiev
Engelhardt Institute of Molecular Biology; 117984 Moscow, Russia
+7-095/135-2311, Fax -1405, Internet: amir@imb.msk.su

Sequencing by hybridization with oligonucleotide matrix (SHOM) by this research group has led to the development of sequencing "microchips." These microchips consist of a glass plate covered with polyacrylamide gel squares (about 30 × 30 µm) that are 20 µm thick and contain chemically immobilized octanucleotides. Hybridization with DNA fragments can discriminate among perfect duplexes, duplexes containing single internal mismatches, and major parts of duplexes containing terminal mismatches. Developed procedures have been used successfully in model experiments to sequence a heptadecanucleotide and localize a single base change in three other heptadecanucleotides. A theory has been developed to describe DNA hybridization with gel-immobilized oligonucleotides. The theory predicts the apparent thermostability of duplexes and the thermostability dependence on concentration of immobilized oligonucleotides, gel thickness, and washing time [K. Khrapko et al., DNA Sequence 1, 375-88 (1991); M. Livshits et al., "Dissociation of DNA Duplexes with Gel-Immobilized Oligonucleotides" (in preparation)].

Prototype automatic sequencing equipment has been created and tested. This equipment consists of a thermostated plate and a fluorescence microscope with a charge-coupled-device (CCD) camera connected to a computer for measuring hybridization of fluorescently labeled DNA with immobilized oligonucleotides. Software has also been developed for image analysis of the pattern of hybridized octamers and for DNA sequence reconstitution.
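
The principle of reconstituting a sequence from the set of octamers that hybridize can be sketched as follows. This is not the software described above. It is a minimal illustration that assumes error-free data and a target in which every 7-base overlap is unique. The function names and the 17-base test sequence are hypothetical.

```python
from typing import Dict, Set


def spectrum(seq: str, k: int) -> Set[str]:
    """Return the set of all k-mers occurring in seq (an idealized hybridization pattern)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reconstruct(kmers: Set[str]) -> str:
    """Rebuild a sequence by chaining k-mers that overlap by k-1 bases.

    Assumes no hybridization errors and no repeated (k-1)-mer in the target;
    raises ValueError when those assumptions do not hold.
    """
    k = len(next(iter(kmers)))
    by_prefix: Dict[str, str] = {}
    for kmer in kmers:
        prefix = kmer[:k - 1]
        if prefix in by_prefix:
            raise ValueError("ambiguous branch: repeated (k-1)-mer")
        by_prefix[prefix] = kmer

    # The first k-mer is the only one whose prefix is not the suffix of another.
    suffixes = {kmer[1:] for kmer in kmers}
    starts = [kmer for kmer in kmers if kmer[:k - 1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("no unique starting k-mer")

    seq = starts[0]
    remaining = set(kmers) - {seq}
    while remaining:
        nxt = by_prefix.get(seq[-(k - 1):])
        if nxt is None or nxt not in remaining:
            raise ValueError("chain broken before all k-mers were used")
        seq += nxt[-1]          # each step extends the sequence by one base
        remaining.remove(nxt)
    return seq


# Hypothetical heptadecanucleotide (17 bases) read with octamers (k = 8).
target = "ATGCGTACCTGAAGTCA"
octamers = spectrum(target, 8)
rebuilt = reconstruct(octamers)
print(rebuilt)
```

For longer targets, repeated 7-mers create branch points that this simple chaining cannot resolve, which limits the length of DNA that an octamer matrix alone can sequence.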

A "continuous stacking hybridization" approach, which makes the efficiency of a matrix of immobilized octanucleotides as high as the efficiency of a tridecanucleotide matrix, has been suggested. The procedure is based on additional rounds of hybridizing the octamer matrix with chosen fluorescently labeled pentanucleotides and unlabeled DNA. Computer simulations have shown that several rounds of continuous stacking hybridization of DNA with an octanucleotide matrix in the presence of a mixture of preselected pentanucleotides imitate hybridization with a tridecanucleotide matrix and thus can be effectively used to sequence DNA that is several thousand nucleotides long [Yu. Lysov et al., "DNA Sequencing by Hybridization to Oligonucleotide Matrix: Calculation of Continuous Stacking Hybridization Efficiency" (in preparation)]. The use of gel to immobilize oligonucleotides provides the important possibility of increasing the capacity of matrices for immobilized oligos and equalizing the thermostability of G+C- and A+T-rich duplexes. Our future efforts will be concentrated on optimizing conditions, materials, equipment, and software so that SHOM can internally sequence millions of bases per day of near-nonrepetitive DNA several thousand nucleotides long.
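
A short calculation shows why reading 13 bases with an 8-mer matrix plus stacked 5-mers is attractive. The counts below follow directly from the four-letter DNA alphabet. The function name is an illustrative assumption.

```python
def complete_set_size(length: int) -> int:
    """Number of distinct oligonucleotides of a given length (4 bases per position)."""
    return 4 ** length


octamer_elements = complete_set_size(8)        # full octanucleotide matrix
pentamer_probes = complete_set_size(5)         # all possible pentanucleotides
tridecamer_elements = complete_set_size(13)    # full tridecanucleotide matrix

# An octamer extended by a stacked pentamer reads 8 + 5 = 13 contiguous bases.
assert octamer_elements * pentamer_probes == tridecamer_elements

print(f"octamer matrix:     {octamer_elements:,} elements")
print(f"pentamers:          {pentamer_probes:,} probes")
print(f"tridecamer matrix:  {tridecamer_elements:,} elements")
```

A complete tridecanucleotide matrix would need roughly 67 million elements, about a thousand times more than the 65,536 elements of a complete octanucleotide matrix.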

*Development of a Simple and Rapid Technique for Sequencing DNA Fragments


Oleg I. Serpinsky, Galina F. Sivolobova, Galina V. Kochneva, Ilnur H. Urmanov, Victor N. Krasnikh, and Yura A. Gorbunov
Institute of Molecular Biology; Koltsovo, Novosibirsk Region 633159, Russia
+7-3832/647-887, Fax: /328-831

A limiting factor in Sanger sequencing is the preparation of DNA templates for carrying out polymerase chain reaction. The goal of this project is to develop a simple and timesaving technique for preparing DNA samples. Our project includes the following steps:

Construct a specialized transposable genetic element [Tn5s2 (on Tn5 basis)] containing the (1) NPTII gene for selecting tagged DNA plasmids after insertion mutagenesis, (2) IS1 gene for generating the deletion variants of DNA plasmids in which Tn5s2 will be inserted, (3) original ribosomal S12 gene of Escherichia coli as a genetic marker to allow selection of E. coli clones bearing deletion DNA plasmids, and (4) fragment of M13 bacteriophage DNA allowing single-stranded DNA (ssDNA) to be obtained.

Choose or create E. coli strains necessary for transposition and prepare deletion variants of plasmid DNA and their single-stranded forms.

Improve ssDNA-isolation methods and determine sequences of some DNA fragments (using the transposon Tn5s2 as an example).

We believe this technique will not require DNA subcloning in specialized vectors if a plasmid without kanamycin resistance is used. This technique may be easily automated and thus increase the efficiency of Sanger sequencing.

Large-Scale DNA Sequencing with a Primer Library


F. William Studier and John J. Dunn
Biology Department; Brookhaven National Laboratory; Upton, NY 11973
516/282-3390 or -3012, Fax: -3407, Internet: studier@genome1.bio.bnl.gov or dunn@genome1.bio.bnl.gov

Our aim is to develop a DNA sequencing capacity that can contribute significantly to the goal of sequencing the human genome within the next 10 years. We found that strings of three contiguous hexanucleotides (hexamers) can prime sequencing reactions specifically on templates at least as large as cosmid DNAs (40 kb) if the template DNA is saturated with a single-stranded DNA-binding protein. Most of the 4096 possible hexamers seem to participate effectively in such priming reactions, and the initial success rate of 60 to 90% compares favorably with conventional priming. The ability to prime sequencing reactions from a hexamer library may allow sequencing by primer walking on multiple templates as fast as sequencing reactions can be assembled. As the next steps toward realizing this potential, our immediate goals are to (1) integrate triple-hexamer priming chemistry with four-color fluorescent labeling and detection, (2) implement capillary electrophoresis with a replaceable matrix for rapid readout of many sequencing reactions in parallel, and (3) maximize priming effectiveness by learning more about the factors that affect it. Our ultimate aim is to develop a fully automated machine capable of producing hundreds of thousands of base pairs of finished DNA sequence per day.
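
The idea of assembling a primer from a fixed hexamer library can be illustrated with a minimal sketch. It assumes the desired 18-base primer sequence is already known and simply splits it into three contiguous hexamers. The function name and the example sequence are hypothetical, and nothing here models the single-stranded DNA-binding protein or priming efficiency.

```python
from itertools import product
from typing import List

# The complete library: every possible hexamer over A, C, G, T (4**6 = 4096).
HEXAMER_LIBRARY = {"".join(bases) for bases in product("ACGT", repeat=6)}


def triple_hexamers(primer: str) -> List[str]:
    """Split an 18-base primer into the three contiguous hexamers that compose it."""
    if len(primer) != 18:
        raise ValueError("expected an 18-base primer sequence")
    hexamers = [primer[i:i + 6] for i in (0, 6, 12)]
    for hexamer in hexamers:
        if hexamer not in HEXAMER_LIBRARY:
            raise ValueError(f"not a valid DNA hexamer: {hexamer}")
    return hexamers


# Hypothetical primer chosen from the end of an already sequenced region.
picked = triple_hexamers("ACGTACGTTGCAGGCTAA")
print(len(HEXAMER_LIBRARY), picked)
```

Because any 18-base primer decomposes into members of the same 4096-hexamer library, each step of a primer walk can draw on presynthesized stock rather than a custom synthesis.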

Project Renewed in FY 1993

Novel Separation and Detection Methods for Gene Mapping and DNA Sequencing

Edward S. Yeung
Department of Chemistry; Iowa State University; Ames, IA 50011
515/294-8062, Fax: -0266, Internet: yeung@ameslab.gov, BITNET: yeung@alisuvax

Electrophoresis is one of the most powerful proven techniques available for gene mapping and sequencing. The number of possible resolution elements indicates that separation efficiencies and information content in two-dimensional gels easily outperform other techniques. Recently, electrophoresis in capillary tubes has shown potential for extended size range in sequencing runs and for substantially increased speed. The major problem is in detecting the separated components.

In conventional electrophoresis, a tag is introduced to allow measurement by absorption, fluorescence, or radiography. At best, only semiquantitative results are obtained because of unreliable chemistry and difficulties in probing a two-dimensional spot, which can be distorted. Staining can also affect component migration and lead to sequencing errors.

We propose to develop novel separation, detection, and imaging techniques for real-time monitoring in electrophoresis. Emphasis will be on schemes that allow multiplexing and on methods that do not require specialized fluorescent or radioactive tags. These techniques will be used for substantially increasing the speed, reliability, and sensitivity in gene mapping and DNA sequencing applications, both in slab gels and in capillary gels.

Projects Continuing into FY 1993

Sequencing by Hybridization: Methods to Generate Large Arrays of Oligonucleotides
Thomas M. Brennan
Engineering Division; Lawrence Berkeley Laboratory; Berkeley, CA 94720
On site at Stanford University; Palo Alto, CA 94301
415/725-7423, Fax: -1534, Internet: brennan@sumex-aim.stanford.edu

Detection of Luminescence from Lanthanide Ions as Labels for DNA Sequencing
Gilbert M. Brown, Robert S. Foote, K. Bruce Jacobson, Frank W. Larimer, Roswitha S. Ramsey, Richard A. Sachleben, and Richard P. Woychik
Oak Ridge National Laboratory; Oak Ridge, TN 37831-6119
615/576-2756, Fax: -5235

Vacuum Ultraviolet Ionizer Mass Spectrometer for Genome Sequencing
C. H. Winston Chen, Marvin G. Payne,(1) and K. Bruce Jacobson
Health and Safety Research Division; Oak Ridge National Laboratory; Oak Ridge, TN 37831
615/574-5895, Fax: /576-2115
(1)Department of Physics; Georgia Southern University; Statesboro, GA 30460

Development of a Fully Integrated Technology to Facilitate Sequencing the Human Genome
George Church
Department of Genetics; Harvard University; Boston, MA 02115
617/732-7562, Fax: -7663, Internet: church@rascal.bwh.harvard.edu

Sequencing by Hybridization
Radomir Crkvenjakov and Radoje Drmanac
Biological and Medical Research Division; Argonne National Laboratory; Argonne, IL 60439-4833
708/252-3161 or -3175, Fax: -3387, Internet: crkve@mcs.anl.gov

Genomic Instrumentation Development: Detection Systems for Film and High-Speed Gel-Less Methods
Jack B. Davidson and Robert S. Foote(1)
Instrumentation and Controls Division; (1)University of Tennessee Graduate School of Biomedical Sciences and Biology Division; Oak Ridge National Laboratory; Oak Ridge, TN 37831-6010
615/574-5599, Fax: -4058

Single-Molecule Detection Using Charge-Coupled Device Array Technology
M. Bonner Denton and Richard Keller(1)
Department of Chemistry; University of Arizona; Tucson, AZ 85721
602/621-8246, Fax: -8272, Internet: mbdenton@ccit.arizona.edu
(1)Chemical and Laser Sciences Division; Los Alamos National Laboratory; Los Alamos, NM 87545

Multicolumn Gel Electrophoresis and Laser-Induced Fluorescence Detection for DNA Sequencing at 64,000 Bases/Hour
Norman J. Dovichi
Department of Chemistry; University of Alberta; Edmonton, Alberta, Canada T6G 2G2
403/492-2845, Fax: -8231, Internet: norm_dovichi@dept.chem.ualberta.ca

Rapid Preparation of DNA for Automated Sequencing
John J. Dunn and F. William Studier
Biology Department; Brookhaven National Laboratory; Upton, NY 11973
516/282-3012 or -3390, Fax: -3407, Internet: dunn@genome1.bio.bnl.gov

Using Scanning Tunneling Microscopy to Sequence the Human Genome
Thomas L. Ferrell, Robert J. Warmack, David P. Allison, K. Bruce Jacobson, Gilbert M. Brown, and Thomas G. Thundat
Oak Ridge National Laboratory; Oak Ridge, TN 37831-6123
Warmack: 615/574-6215, Fax: -6210, BITNET: rjw@ornlstc

DNA Sequence Analysis by Solid-Phase Hybridization
Robert S. Foote,(1) Richard A. Sachleben,(2) and K. Bruce Jacobson(1)
University of Tennessee Graduate School of Biomedical Sciences; (1)Biology Division and (2)Chemistry Division; Oak Ridge National Laboratory; Oak Ridge, TN 37831-8077
615/574-0801, Fax: -1274

Advanced Sequencing Technology
Raymond F. Gesteland and Robert Weiss
Department of Human Genetics; University of Utah; Salt Lake City, UT 84112
801/581-5190, Fax: /585-3910, Internet: rayg@genetcs.med.utah.edu

Megabase Sequencing of Human Immune Receptor Loci
Leroy E. Hood
Department of Molecular Biotechnology; University of Washington; Seattle, WA 98195
206/685-7367, Fax: -7301

DNA Sequencing Using Stable Isotopes
K. Bruce Jacobson, Heinrich F. Arlinghaus,(1) Gilbert M. Brown,(2) Robert S. Foote, Frank W. Larimer, Richard A. Sachleben,(2) Norbert Thonnard,(1) and Richard P. Woychik
Biology Division and (2)Chemistry Division; Oak Ridge National Laboratory; Oak Ridge, TN 37831-8077
615/574-1204, Fax: -1274, BITNET: bru@ornl.stc
(1)Atom Sciences, Inc.; Oak Ridge, TN 37830

Advanced Detectors for Mass Spectrometry
Joseph M. Jaklevic, W. Henry Benner, and Joseph Katz
Human Genome Center and Engineering Division; Lawrence Berkeley Laboratory; Berkeley, CA 94720
510/486-5647, Fax: -5857, Internet: jmjaklevic@lbl.gov, BITNET: jmj@lbl

Rapid DNA Sequencing Based on Fluorescence Detection of Single Molecules
James H. Jett, Richard A. Keller, John C. Martin, and E. Brooks Shera
Center for Human Genome Studies; Los Alamos National Laboratory; Los Alamos, NM 87545
Keller: 505/667-3018, Fax: /665-3024

Transposon-Based Genomic Sequencing
Christopher H. Martin, Michael Strathmann, Carol A. Mayeda, and Michael J. Palazzolo
Human Genome Center; Cell and Molecular Biology Division; Lawrence Berkeley Laboratory; Berkeley, CA 94720
Martin and Palazzolo: 510/486-5909, Fax: -6816, Internet: chrism@genome.lbl.gov or mjpalazzolo@lbl.gov

Ultrasensitive Fluorescence Detection of DNA
Richard A. Mathies, Mark A. Quesada, Hays S. Rye,(1) Xiaohua Huang, Jiun W. Chen, and Alexander N. Glazer(1)
Departments of Chemistry and (1)Molecular and Cell Biology; University of California; Berkeley, CA 94720
510/642-4192, Fax: -3599

Preparation of Oligonucleotide Arrays for Hybridization Studies
Michael C. Pirrung
Department of Chemistry; Duke University; Durham, NC 27708-0346
919/660-1556, Fax: -1591

Thioredoxin-Gene 5 Protein Interactions: Processivity of Bacteriophage T7 DNA Polymerase
Jeff Himawan, Stanley Tabor, and Charles C. Richardson
Department of Biological Chemistry and Molecular Pharmacology; Harvard Medical School; Boston, MA 02115
617/432-3129, Fax: -3362

Improvement and Automation of Ligation-Mediated Genomic Sequencing
Arthur D. Riggs and Gerd P. Pfeifer
Department of Biology; Beckman Research Institute of the City of Hope; Duarte, CA 91010
818/301-8352, Fax: /358-7703

A High-Speed Automated DNA Sequencer
Lloyd M. Smith
Department of Chemistry; University of Wisconsin; Madison, WI 53706
608/263-2594, Fax: /262-0453, Internet: smith@bert.wisc.edu

High-Speed DNA Sequence Analysis by Matrix-Assisted Laser Desorption Mass Spectrometry
Lloyd M. Smith and Brian Chait(1)
Department of Chemistry; University of Wisconsin; Madison, WI 53706
608/263-2594, Fax: /262-0453, Internet: smith@bert.wisc.edu
(1)Rockefeller University; New York, NY 10021

Automation of the Front End of DNA Sequencing
Lloyd M. Smith and David Mead(1)
Department of Chemistry; University of Wisconsin; Madison, WI 53706
608/263-2594, Fax: /262-0453, Internet: smith@bert.wisc.edu
(1)Chimerx; Madison, WI 53704

Ion Cyclotron Resonance-Mass Spectroscopy of DNA Molecular Ions
Richard D. Smith, Charles G. Edmonds, and Joseph A. Loo
Chemical Sciences Department; Pacific Northwest Laboratory; Richland, WA 99352
509/376-0723 or -5665, Fax: -0418

Large-Scale DNA Sequencing with a Primer Library
F. William Studier and John J. Dunn
Biology Department; Brookhaven National Laboratory; Upton, NY 11973
516/282-3390, Fax: -3407

Time-of-Flight Mass Spectrometry of DNA for Rapid Sequence Determination
Peter Williams and Neal Woodbury
Department of Chemistry; Arizona State University; Tempe, AZ 85287-1604
602/965-4107, Fax: -2747