Research Abstracts from the
DOE Genome Contractor-Grantee Workshop IX

January 27-31, 2002 Oakland, CA

 


Sequencing Abstracts


1. The US DOE Joint Genome Institute’s High Throughput Production Sequencing Program

Susan Lucas, Tijana Glavina, Jamie Jett, Lyle Probst, Andrea Aerts, Nathan Bunker, Sanjay Israni, Astrid Terry, John C. Detter, Sam Pitluck, Heather Kimball, Yunian Lou, Martin Pollard, Anne Olsen, Chris Elkin, Paul Richardson, Dan Rokhsar, Paul Predki, Elbert Branscomb, Trevor Hawkins, and the JGI Sequencing Team

U.S. DOE Joint Genome Institute, Walnut Creek, CA 94598

lucas11@llnl.gov

In May 2001, the Department of Energy’s Joint Genome Institute (JGI) Production Genomics Facility (PGF) automated the use of rolling circle amplification (RCA) to amplify plasmids for high-throughput sequencing. With this approach we produce uniform amounts of template DNA, which yields high-quality sequencing results. The new process also requires fewer template-production steps than our previous magnetic bead plasmid preparation (SPRI). These processes were automated using various liquid transfer robots, and a series of quality controls was put in place for each process to track quality at various stages. The changes have resulted in a simple process that allows careful monitoring of quality and significant cost savings in the number of steps, time, and people needed to produce high-quality results. Since May, the JGI has concentrated on using this new production line to complete the sequencing of human chromosomes 5, 16, and 19 as well as several other large genomes such as Fugu rubripes and Ciona intestinalis. The PGF is currently building a microbial sequencing program that will sequence several microbial genomes throughout FY02. In December, the PGF will install new technology in the form of 21 Molecular Dynamics MegaBACE 4000 instruments. This 384-well capillary electrophoresis sequencer will increase sequencing throughput by 40% and enable the JGI to ramp from 150 to 250 384-well plates.

This work was performed under the auspices of the U.S. Department of Energy, Office of Biological and Environmental Research, by the University of California, Lawrence Livermore National Laboratory under Contract No. W-7405-Eng-48, the Lawrence Berkeley National Laboratory under contract No. DE-AC03-76SF00098, and the Los Alamos National Laboratory under contract No. W-7405-ENG-36.


2. Leveraging Comparative Sequencing Information to Generate a Complete Functional Map of Human Chromosome 19

Lisa Stubbs, Xiaochen Lu, Sha Hammond, Eddie Wehri, Anne Bergmann, Robin Deis, Angela Kolhoff, and Joomyeong Kim

Genomics Division, Lawrence Livermore National Laboratory

stubbs5@llnl.gov

As part of the Joint Genome Institute comparative sequencing team, we recently reported initial analysis of comparative alignments between the draft sequence of human chromosome 19 (HSA19) and related regions in the mouse. These initial studies identified a large number of conserved sequence elements, totaling ~5% of HSA19 DNA, which represent a rich source of candidate genes and exons in addition to the promoters, enhancers and regulatory sequences that control their tissue-specific expression.

We are presently carrying HSA19 annotation to the next stage by providing a complete catalog of functionally verified genes and other biologically active sequence elements along the length of this small and extraordinarily gene-dense human chromosome. To do this, we have focused on tying expressed sequences together to define the full set of HSA19 and related mouse genes, confirming predicted genes and defining their 5' and 3' borders. To gain clues to the biological function of each gene, we are also systematically determining cell-type-specific expression patterns through in situ hybridization of sectioned mouse and human tissues, and we are testing candidate promoters and enhancers for function using a high-throughput reporter assay system in cultured cells. These studies are designed to leverage DOE’s long-term investment in HSA19 sequencing and to provide a publicly accessible guide to the function of all 1200 genes and the location of associated regulatory sequences throughout the chromosome. The collected data will also permit us to associate regulatory sequence structure with function in both species, providing an unprecedented look at the composition and evolution of promoters, enhancers, and other regulatory sequences in mammals.


3. The Finishing of Human Chromosomes 19 and 5

Jane Grimwood, Jeremy Schmutz, Mark Dickson, Richard M. Myers, and all members of the Sequencing Group at the Stanford Human Genome Center

The Stanford Human Genome Center and the Department of Genetics, Stanford University School of Medicine, Palo Alto, CA 94304

jane@shgc.stanford.edu

For the last two years, the Stanford Human Genome Center has been collaborating with the Joint Genome Institute to generate high-quality finished sequence from “draft” sequences produced by the JGI. To date, we have submitted 176 Mb of finished sequence with an estimated error rate of 1 in 342,000 basepairs. This current collaboration continues through December 2002, by which time we will have finished both human chromosomes 19 and 5. We will discuss the current status of the chromosomes and the procedures we are using to obtain closure.
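As a rough illustration of what the quoted accuracy implies, the sketch below converts the error rate stated above into an expected number of errors and a Phred-scale quality value (Q = -10 log10 of the error probability). It assumes 1 Mb = 10^6 base pairs and a uniform error rate across the finished sequence; both are simplifying assumptions, not statements from the abstract.

```python
import math

# Figures quoted in the abstract
finished_bp = 176_000_000   # 176 Mb of finished sequence (assuming 1 Mb = 1e6 bp)
bases_per_error = 342_000   # estimated 1 error per 342,000 base pairs

# Per-base error probability, assuming errors are spread uniformly
error_prob = 1 / bases_per_error

# Expected number of erroneous bases in the submitted sequence
expected_errors = finished_bp * error_prob

# Phred-scale equivalent of the error probability
phred_equivalent = -10 * math.log10(error_prob)

print(f"expected errors in finished sequence: about {expected_errors:.0f}")
print(f"Phred-scale equivalent: about Q{phred_equivalent:.1f}")
```

Under these assumptions the finished sequence would carry on the order of a few hundred erroneous bases in total.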

Chromosome 19, estimated to be around 58 Mb in length, comprises slightly more than 2% of the human genome. It is extremely gene rich, containing perhaps twice as many genes per unit length of DNA sequence as the rest of the human genome. It is also extremely repetitive, has a very high GC content, and has a skewed distribution of CpG islands. For these reasons, the finishing of this chromosome has been, and continues to be, a great challenge. Currently, 61% of the sequence is in finished form and the remainder is in active finishing. Many small sequence gaps exist between clones, and these are being filled by walking directly from spanning clones. The finishing of five highly repetitive areas of the chromosome is being attempted by breaking the repeats down into smaller cosmid units to isolate copies of the repeats.

Chromosome 5 is estimated to be 184 Mb in length. Currently, 77 Mb of the chromosome is in finished sequence form. An additional 50 Mb of the chromosome is in active finishing, with the remainder of the clones being brought up to full draft coverage by the Joint Genome Institute. Mapping efforts are continuing at the JGI to obtain complete clone coverage of this chromosome.
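The chromosome 5 figures above can be turned into a simple status breakdown. The sketch below uses only the numbers quoted in this paragraph; the "draft" category is inferred as the remainder and the variable names are illustrative.

```python
# Figures quoted in the abstract (all in Mb)
total_mb = 184.0      # estimated length of chromosome 5
finished_mb = 77.0    # in finished sequence form
active_mb = 50.0      # in active finishing

# The remainder is being brought up to full draft coverage
draft_mb = total_mb - finished_mb - active_mb


def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


status = {
    "finished": finished_mb,
    "active finishing": active_mb,
    "draft (remainder)": draft_mb,
}

for label, size in status.items():
    print(f"{label:>18}: {size:6.1f} Mb ({percent(size, total_mb):4.1f}%)")
```

On these numbers, somewhat over 40% of the chromosome is finished and a little under a third remains to be brought up to full draft coverage.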


4. Assembly and Analysis of Finished Sequence for Human Chromosome 19

Anne Olsen1, Susan Lucas1 and the JGI Production Sequencing Group; Jane Grimwood2, Jeremy Schmutz2 and the Stanford Finishing Group; Laurie Gordon3 and the LLNL Mapping Group; Paramvir Dehal1, Art Kobayashi1, Sam Pitluck1 and the JGI Informatics Group; and Trevor Hawkins1.

1DOE Joint Genome Institute, Walnut Creek, CA
2Stanford Human Genome Center, Palo Alto, CA
3Lawrence Livermore National Laboratory, Livermore, CA

olsen2@llnl.gov

Chromosome 19 has an estimated size of ~65 Mb and is the most GC-rich human chromosome. It also stands out as the chromosome with the highest content, relative to size, of repetitive sequences, CpG islands, and genes. The BAC/cosmid map of ch19 constructed at LLNL consists of seven contigs spanning ~98% of the estimated 58 Mb comprising the p- and q-arms. Map coverage extends to within 25 kb of the p-telomere (Riethman, http://www.wistar.upenn.edu/Riethman/) and into subtelomeric repeats on the q-arm. The most proximal several hundred kb on both the p- and q-arms have a high content of alpha satellite sequence, indicating proximity to the centromere. Mapping effort continues to close the few remaining map gaps, with TAR cloning (Kouprina et al.) in progress for four gaps that have been resistant to closure by other methods. A tiling path of clones spanning the chromosome has been sequenced, with 43.3 Mb (75% of the p- and q-arms) currently in a finished state. Finished sequence assembles into 154 contigs with an average contig size of 280 kb. Analysis of the sequence indicates over 1200 known genes, including a large number of clustered gene families, as well as several hundred predicted genes. The distribution of repeats differs markedly from that of the genome as a whole, with chromosome 19 exhibiting a much higher density of Alu repeats and lower content of LINE sequences than the genomic average. A comparison of genetic and physical distances across the chromosome indicates several regions of sex-specific enhanced recombination, with an especially high male recombination rate towards the telomeres. Comparative studies with mouse and Fugu (Dehal et al.) are providing further insights into the genomic organization and evolution of this chromosome.
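The assembly statistics quoted above can be cross-checked against one another. The sketch below is only an arithmetic consistency check on the figures in this abstract; it assumes 1 Mb = 1000 kb and makes no claim about how the JGI computed its numbers.

```python
# Figures quoted in the abstract
arms_mb = 58.0          # estimated size of the p- and q-arms
finished_mb = 43.3      # sequence currently in a finished state
n_contigs = 154         # number of finished-sequence contigs
avg_contig_kb = 280.0   # average contig size

# Fraction of the arms that is finished (the abstract reports 75%)
finished_pct = 100.0 * finished_mb / arms_mb

# Total finished sequence implied by contig count times average size
implied_finished_mb = n_contigs * avg_contig_kb / 1000.0

print(f"finished fraction of arms: {finished_pct:.1f}%")
print(f"contig count x average size: {implied_finished_mb:.2f} Mb")
print(f"difference from reported total: {abs(implied_finished_mb - finished_mb):.2f} Mb")
```

The two routes to the finished total agree to within rounding of the reported average contig size.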

This work was performed under the auspices of the U.S. Department of Energy, Office of Biological and Environmental Research, by the University of California, Lawrence Livermore National Laboratory under Contract No. W-7405-Eng-48, the Lawrence Berkeley National Laboratory under contract No. DE-AC03-76SF00098, and the Los Alamos National Laboratory under contract No. W-7405-ENG-36.


5. Finishing of Human Chromosome 16

Norman Doggett, Mark Mundt, David Bruce, Cliff Han, Levy Ulanovsky, Larry Deaven, Susan Lucas, Trevor Hawkins, and JGI Staff

DOE Joint Genome Institute and Center for Human Genome Studies, Los Alamos National Laboratory

doggett@lanl.gov

Finishing of human chromosome 16 is being coordinated and conducted by the JGI's Los Alamos Center for Human Genome Studies. Coordination and progress toward finishing are monitored with a minimal tiling path clone map that provides the current active status of each clone and gap. The minimal tiling path map consists of 738 clones, 91 of which are cosmids and the remainder predominantly BACs. These provide close to complete coverage of the 89 Mb of euchromatin. There are 17 clone gaps, which are being closed by a combination of BAC end sequencing analysis and library screening. Most of the draft sequence for the chromosome was generated by the JGI, with some draft contributions from WIBR (44 BACs) and WUGSC (26 BACs). As of November 1, 2001, the centers that have contributed significantly toward the finished sequence of clones in the tiling path include LANL (19.8 Mb), SHGC (12.3 Mb), TIGR (9.9 Mb), and SC (1.5 Mb). The unique total of the chromosome completed as of November 1, 2001 is approximately 40 Mb. LANL will finish most of the remaining 49 Mb, and we anticipate that a total of 50-60 Mb of the chromosome will have been completed by the time of this DOE conference and that the full euchromatin arms will be completed by this spring. We are finishing by managing the whole chromosome at once and integrating sequencing strategies, robotics, and an information management system into a highly automated process (see abstracts by Mundt et al. and Bruce et al.). Current status of chromosome finishing will be presented.
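To illustrate the idea of a minimal tiling path with tracked gaps, the sketch below selects a small set of overlapping clones covering a region and reports any uncovered stretches. It is a generic greedy interval-cover algorithm, not the procedure used at LANL; the clone names and coordinates are hypothetical.

```python
from typing import List, Tuple

# A clone is (name, start, end) with a half-open interval [start, end) in bp
Clone = Tuple[str, int, int]


def minimal_tiling_path(
    clones: List[Clone], region_start: int, region_end: int
) -> Tuple[List[Clone], List[Tuple[int, int]]]:
    """Greedy interval cover: return (tiling path, uncovered gaps)."""
    ordered = sorted(clones, key=lambda c: c[1])
    path: List[Clone] = []
    gaps: List[Tuple[int, int]] = []
    pos = region_start
    i = 0
    while pos < region_end:
        best = None
        # Among clones starting at or before pos, keep the one reaching furthest
        while i < len(ordered) and ordered[i][1] <= pos:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is not None and best[2] > pos:
            path.append(best)
            pos = best[2]
        else:
            # Nothing covers pos: record a gap up to the next clone (or region end)
            next_start = ordered[i][1] if i < len(ordered) else region_end
            next_start = min(next_start, region_end)
            gaps.append((pos, next_start))
            pos = next_start
    return path, gaps


# Hypothetical clones on a 400 bp toy region
clones = [("A", 0, 100), ("B", 50, 180), ("C", 90, 150),
          ("D", 170, 260), ("E", 300, 400)]
path, gaps = minimal_tiling_path(clones, 0, 400)
print("tiling path:", [name for name, _, _ in path])
print("gaps:", gaps)
```

In this toy example clone C is redundant and is left out of the path, and the stretch between clones D and E is reported as a gap, analogous to the clone gaps tracked on the real map.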

Supported by the US DOE, OBER under contract W-7405-ENG-36.


6. An Overview of the Finish Sequencing Process at LANL: Design, Automation, and Organization

David C. Bruce, Mark O. Mundt, Levy E. Ulanovsky, Heather A. Blumer, Judy M. Buckingham, Connie S. Campbell, Mary L. Campbell, Olga Chertkov, J. Joe Fawcett, Valentina M. Leyba, Kim K. McMurry, Linda J. Meincke, A. Christine Munk, Beverly A. Parson-Quintana, Donna L. Robinson, Elizabeth H. Saunders, Judith G. Tesmer, Linda S. Thompson, Patti L. Wills, Norman A. Doggett, and Larry L. Deaven

DOE Joint Genome Institute and Center for Human Genome Studies, Los Alamos National Laboratory

dbruce@lanl.gov

The challenge of high-throughput finishing is being addressed at Los Alamos National Laboratory (LANL) by integrating sequencing strategies, an information management system, automation, and personnel organization. The personnel are organized into specialized teams: Informatics, Subclone Re-Array, Template Preparation, Template Labeling, Oligonucleotide Synthesis, DENS, Sequencing, Subcloning, End Sequencing, and Gap Closure. All samples are handled in 96- or 384-well format. Library re-array is done using Genetix Q-Bot or Packard MultiProbe robots. Template purification by a solid-phase reversible immobilization (SPRI) method uses Robbins Hydra and TiterTek MultiDrop automation. Thermal cycling is done in 384-well format using MJ Research Tetrads. Primer synthesis is done in 96-well format using Mermaid oligonucleotide synthesizers (see abstract of Thompson et al.) or by differential extension with nucleotide subsets (DENS; see abstract of Ulanovsky et al.). Labeling reaction strategies include Big Dye terminator, Big Dye primer, and Big Dye dGTP terminator chemistries. Labeled template is run on capillary ABI PRISM 3700 DNA Analyzers in 384-well format. The teams are coordinated by instructions generated by an information management system (see abstract of Mundt et al.).

Supported by the US DOE, OBER under contract W-7405-ENG-36.


217. The Populus Genome Project

Toby Bradshaw1, Jerry Tuskan2

1University of Washington, 2Oak Ridge National Laboratory

toby@u.washington.edu

The complete 550 Mbp genome sequence of Populus (poplar, cottonwood, aspen) is to be determined by the Joint Genome Institute in 2002-2003, in collaboration with an international scientific team with a longstanding interest in tree biology. Populus will be only the third higher plant genome to be sequenced in the public sector, and the first tree of any kind. Hybrid poplar is the fastest-growing tree in the temperate zone, and Populus is widely used as a model genus for research on biomass fuels, carbon sequestration, and bioremediation. The completed genome sequence will be of tremendous value for identifying the function of individual genes affecting growth, physiology, adaptation, and ecological function.


The online presentation of this publication is a special feature of the Human Genome Project Information Web site.