Research Abstracts from the
DOE Genome Contractor-Grantee Workshop IX

January 27-31, 2002 Oakland, CA

 


 

 

Functional Analysis and Resources Abstracts


47. Comparative and Functional Genomics Technologies

Robi Mitra, Vasudeo Badarinarayana, John Aach, Wayne P. Rindone, and George C. Church

Lipper Center for Computational Genetics, Department of Genetics, Harvard Medical School

church@arep.med.harvard.edu

Our project focuses on developing cost-effective technologies for determining and computationally comparing data on gene expression and selectable phenotypes generally applicable to microbial and vertebrate genomes relevant to the overall DOE goals. In particular we have developed very high-resolution genome-based arrays capable of sub-genic dissection of phenotypes, detection of alternative RNAs, and DNA-protein-binding sites. We have developed a method for in situ amplification, sequencing, and long range (multi-kilobase) RNA-splice-typing and DNA haplotyping.

For more information see: http://arep.med.harvard.edu


48. On Telomeres, Linkage Disequilibrium, and Human Personality

R. K. Moyzis1,2, D. L. Grady1, Y.-C. Ding1, E. Wang1, S. Schuck2, P. Flodman2, M. A. Spence2, and J. M. Swanson2

1Department of Biological Chemistry, 2Department of Pediatrics and the Child Development Center, College of Medicine, University of California, Irvine, Irvine, CA 92715 USA

rmoyzis@uci.edu

Human telomeres end with a stretch of the conserved simple repeat sequence (TTAGGG)n. To capture single-copy human DNA regions linked to telomeres, large telomere-terminal fragments of human chromosomes were cloned using specialized yeast artificial chromosome (YAC) vectors. By contrast, bacterial artificial chromosome (BAC) libraries are not expected to contain sequences extending to the telomere, owing to the absence of restriction sites in (TTAGGG)n, the effects of length associated with the construction of size-selected DNA recombinant clones, and the genomic instability of these regions. By DNA sequencing of cosmid subclones derived from telomere YACs, connection to the working draft human sequence has now been accomplished (Riethman et al., Nature 409, 948-951, 2001; www.genome.uci.edu). Integration with the working draft sequence was confirmed for 32 telomeres (out of the 46 distinct ends), with framework sequence extending to within 250kb-50kb of the physical end of these chromosomes. Subtelomeric sequence structure appears to vary widely, mainly as a result of large differences in subtelomeric repeat sequence abundance and organization at individual telomeres. Many subtelomeric regions appear to be gene-rich, matching both known and unknown expressed genes.

The great variability in subtelomeric regions between individuals has potential biological significance. It is unclear, therefore, if finishing a “single” sequence in these regions has biological meaning. We suggest that extensive population/species sampling will be needed to characterize this variability. We have begun targeting a number of subtelomeric regions for such “high-depth” DNA resequencing/haplotyping. One of our first targets, the dopamine receptor D4 (DRD4) gene, located at the telomere of 11p, yielded surprising results. Associations have been reported of the 7-repeat (7R) allele of the DRD4 gene with both attention deficit/hyperactivity disorder (ADHD) and the personality trait of novelty seeking. This polymorphism occurs in a 48 bp tandem repeat (VNTR) in the coding region of DRD4, with the most common allele containing four repeats (4R), and rarer variants containing two (2R) to eleven (11R) repeats. By DNA resequencing/haplotyping of over 1000 DRD4 alleles, representing a worldwide population sample, we found that the origin of 2R- through 6R-alleles can be explained by simple one-step recombination/mutation events. In contrast, the 7R-allele is not simply related to the other common alleles, differing by more than six recombinations/mutations. Strong linkage disequilibrium (LD) was found between the 7R-allele and surrounding DRD4 polymorphisms, suggesting this allele is at least 5-10 fold “younger” than the common 4R-allele. Based on an observed bias towards nonsynonymous amino acid changes, the unusual DNA sequence organization, and the strong LD surrounding the DRD4 7R-allele, we propose that this allele originated as a rare mutational event that nevertheless increased to high frequency in human populations by positive selection (Ding et al., PNAS, in press, 2001).
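As an illustration of what "strong linkage disequilibrium" means quantitatively, the sketch below computes the standard pairwise LD statistics (D, D' and r²) for two biallelic sites from counts of the four two-locus haplotypes. It is a minimal textbook calculation, not the authors' analysis; the function name and the haplotype counts in the usage note are hypothetical.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D_prime, r_squared) for two biallelic loci.

    Arguments are counts of the four haplotypes; the first letter is the
    allele at locus 1 (A or a), the second the allele at locus 2 (B or b).
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total  # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total  # frequency of allele B at locus 2

    # D: excess of the AB haplotype over its expectation under independence.
    d = p_AB - p_A * p_B

    # D' rescales D by the largest magnitude it could take at these allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the alleles at the two loci.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared
```

With hypothetical counts in which allele A always travels with allele B (for example `linkage_disequilibrium(50, 0, 0, 50)`), D' and r² both reach their maximum of 1; counts of 25 for every haplotype give D = 0, meaning no association.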


49. Strategies for Construction of Subtracted Libraries Enriched for Full-Length cDNAs and for Preferential Cloning of Rare mRNAs

Brian Berger, Sergey Malchenko, Irina Koroleva, Einat Snir, Tammy Kucaba, Maria de Fatima Bonaldo, and Marcelo Bento Soares

The University of Iowa, Departments of Pediatrics, Biochemistry, Physiology and Biophysics

bento-soares@uiowa.edu

Subtracted libraries enriched for full-length cDNAs.
A major challenge of the ongoing NIH Mammalian Gene Collection Program is the identification of sufficient novel full-length cDNAs to meet the project's yearly full-length sequencing goals. To assist in the identification of novel full-length cDNAs, we have constructed full-length-enriched libraries and developed a novel method for generating subtracted libraries enriched for full-length cDNAs. Conventional subtractive hybridization procedures cannot be applied to full-length-enriched libraries because a truncated clone in the driver population has the potential to subtract its full-length counterpart from the library. Briefly, 100-150 bp single-stranded overhangs are generated at the 5' end of all clones in the library (tracer), for hybridization with a biotinylated driver population comprising representative clones of every sequence contig identified in the starting full-length-enriched library. The subtracted population is purified from the hybrids using streptavidin-coated magnetic beads, repaired, and electroporated into bacteria for propagation of a subtracted full-length-enriched library. We have used this method successfully to generate a subtracted full-length-enriched library derived from germinal center B cells.

Preferential cloning of rare mRNAs.
Discovery of rare mRNAs in large-scale EST projects remains difficult and inefficient because such transcripts are poorly represented in cDNA libraries. To expedite the identification of rare mRNAs, we developed a novel method for preferential cloning of rare mRNAs. Briefly, mRNA is hybridized with a driver comprising most/all already identified cDNAs, and the hybridized mRNA is subsequently destroyed with RNase H. The remaining intact mRNA is linearly amplified and cloned to produce a library enriched for rare mRNAs. We have used this method to construct a mouse cDNA library enriched for rare mRNAs from hippocampus. The efficacy of our method was demonstrated by sequencing and by microarray hybridization analyses.


50. The IMAGE Consortium: Moving Toward a Complete Set of Full-Length Mammalian Genes

P. Folta, N. Ghaus, N. Groves, T. Harsch, A. Johnston, P. Kale, C. Sanders, K. Schreiber, and C. Prange

Biology and Biotechnology Research Program, Lawrence Livermore National Laboratory

prange1@llnl.gov

The I.M.A.G.E. Consortium comprises the largest publicly available collection of cDNAs, currently encompassing over 5.5 million clones from six species. These clones are arrayed at Lawrence Livermore National Laboratory and sequenced at various centers, and the resulting ESTs are immediately deposited into GenBank. The clones themselves are made available through a network of distributors worldwide. Rearrayed clone sets representing unique genes of interest are also developed and distributed through the I.M.A.G.E. pipeline.

Over the last 18 months, I.M.A.G.E. has been involved in arraying and rearraying clones in support of the Mammalian Gene Collection (MGC) project, an NIH-sponsored effort to generate full-length cDNA resources (http://mgc.nci.nih.gov/). Both EST and full-insert sequences are generated from full-length-enriched cDNA libraries. As of November 2001, clones from 100 enriched libraries have been arrayed, resulting in over 1 million ESTs submitted to dbEST. Sequence analysis predicts more than 30,000 of these clones to be unique and full-length. These clones are rearrayed at LLNL and sent to various sequencing centers for full-length sequencing. All sequences generated from the MGC clones are deposited into GenBank, and all clones and rearrayed clone sets are available royalty-free through the I.M.A.G.E. distributors. At this time over 10,000 full-length high-quality human and mouse sequences have been submitted to GenBank.

Another main focus of the I.M.A.G.E. Consortium has been the development of database query tools to aid in the tracking and analysis of clone-related data. These tools offer web-based query capabilities interconnecting many areas of interest, including clones, libraries, tissues, sequences, rearrays, EST and gene clusters, and quality control information. These tools have contributed to the ease of use of this collection and we are continuing to add additional query capabilities.

Further information about the I.M.A.G.E. Consortium is available by email (info@image.llnl.gov) or through the WWW (http://image.llnl.gov).

This work was partially funded by the NIH and was performed under the auspices of the U.S. Department of Energy by the University of California, Lawrence Livermore National Laboratory under contract no. W-7405-Eng-48.


51. Functional Genomics Research in AIST-JBIRC

Naoki Goshima, Tohru Natsume, Kousaku Okubo, and Nobuo Nomura

Japan Biological Information Research Center (JBIRC), National Institute of Advanced Industrial Science and Technology (AIST), Japan

nnomura@jbirc.aist.go.jp

The functional genomics group of JBIRC carries out functional analysis of genes and proteins based on 30,000 human full-length cDNA clones that have been collected by collaborators since 1998. The group has four teams, whose goals are as follows:

i. Protein Expression Team. The complete ORF regions of human full-length cDNAs will be cloned in Gateway entry vectors, which are versatile clones for transferring DNA segments to various expression vectors in high throughput.

ii. Protein Network Team. The primary objective is to discover potential interacting partners and to identify the members of functional protein machinery complexes using mass spectrometry. Post-translational modifications regulating protein-protein interactions will also be studied.

iii. Expression Profiling Team. Expression profiles of human genes, and changes in those profiles, in cells under both normal and disordered conditions will be recorded quantitatively by the iAFLP method using human full-length cDNA sequence information.

iv. Cellular Function Team. Gene function will be studied by introduction of expression cDNA clones into cells.

A high-throughput system that quantitatively detects morphological changes of cells, including processing, bowing, enlargement, and others, will be developed.


52. The Drosophila Gene Collection

Mark Stapleton1, Peter Brokstein2, Guochun Liao2, Ling Hong2, Mark Champe1, Brent Kronmiller1, Joanne Pacleb1, Ken Wan1, Charles Yu1, Joe Carlson1, Reed George1, Susan Celniker1, and Gerald M. Rubin2

1Lawrence Berkeley National Laboratory, BDGP, Berkeley, CA
2Howard Hughes Medical Institute, University of California Berkeley, Berkeley, CA

staple@bdgp.lbl.gov

The Berkeley Drosophila Genome Project’s future goals are in functional genomics. Taking advantage of the Drosophila genome sequence, we intend to develop tools and technologies for answering biological questions in a high-throughput environment. Our first step in this direction is to create a publicly available collection of Drosophila cDNAs, sequence them to high quality, and begin converting them into universal Gateway (Life Technologies) clones. Using an in vitro recombination reaction based on phage lambda, Gateway clones can be subcloned en masse into a variety of expression vectors. In a pilot experiment using Gateway technology, we have created 72 Baculovirus expression constructs representing 36 Drosophila transcription factors. Release 1.0 of the Drosophila Gene Collection (DGC) has been described (Rubin et al. Science 2000). It was produced by sequencing some 80,000 5' ESTs from cDNA libraries derived from various tissues and stages. The DGC Release 1 consists of a non-redundant set of nearly 6,000 clones – 42% of all predicted genes. We are currently full-insert sequencing Release 1 using a commercially available in vitro transposition system. Data will be presented for full-insert sequencing utilizing this transposon-based methodology. Since the DGC Release 1 comprises only a fraction of the predicted genes in Drosophila, we have generated an additional 160,000 5' ESTs from existing and newly constructed libraries. The new libraries were generated in collaboration with Piero Carninci at RIKEN in Japan. Given the availability of a highly annotated genome sequence, we have computationally selected over 5,000 clones to generate DGC Release 2, which contains over 11,000 clones. BDGP now has clones representing almost 75% of all predicted genes in D. melanogaster. Release 2 has been added to our full-insert sequencing pipeline and should be completed in early 2002.
We are now focusing on identifying and sequencing major splice forms as well as developing directed approaches to obtain full-length cDNAs for the remaining genes.


53. Identification of the Complete Regulon of a Master Transcriptional Regulator

Michael Laub, Swaine Chen, Lucy Shapiro, and Harley McAdams

Department of Developmental Biology, Stanford University

slchen@stanford.edu

The objective of the Stanford Microbial Cell project is to identify the complete transcriptional regulatory network of the aquatic bacterium, Caulobacter crescentus. In this poster, we describe how we have applied a combination of experimental and bioinformatic techniques to determine the complete regulon controlled by CtrA, a master transcriptional regulator that controls many Caulobacter cell cycle processes including DNA replication, polar morphogenesis, and cell division. We used an in vivo technique involving cross-linking bound CtrA to its binding site followed by fragmenting the DNA and using immunoprecipitation to enrich the segments with linked CtrA proteins. We then reversed the crosslinks and used a microarray assay to identify the enriched DNA segments. We combine this binding site assay with data on RNA expression patterns in wild type and mutant cells to determine the complete CtrA regulon.
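The combination step described above, intersecting genes whose upstream DNA is enriched in the immunoprecipitation with genes whose RNA levels change in mutant cells, can be sketched as a simple set operation. This is an illustrative sketch only, not the authors' pipeline: the function name, the gene identifiers, and both cutoffs are hypothetical assumptions.

```python
def call_regulon(chip_enrichment, expr_log2_change,
                 chip_cutoff=2.0, expr_cutoff=1.0):
    """Return genes that are both bound and expression-responsive.

    chip_enrichment:  gene -> fold enrichment of its promoter segment on the
                      array (immunoprecipitated DNA vs. control DNA).
    expr_log2_change: gene -> log2(mutant / wild type) RNA level.
    Both cutoffs are arbitrary illustrative thresholds.
    """
    # Genes whose promoter DNA was enriched by the immunoprecipitation.
    bound = {g for g, ratio in chip_enrichment.items() if ratio >= chip_cutoff}
    # Genes whose expression goes up or down when the regulator is perturbed.
    responsive = {g for g, change in expr_log2_change.items()
                  if abs(change) >= expr_cutoff}
    # Candidate direct targets satisfy both criteria.
    return sorted(bound & responsive)


# Hypothetical example values, not measured data.
chip = {"geneA": 4.0, "geneB": 1.1, "geneC": 3.0}
expr = {"geneA": -2.0, "geneB": 1.5, "geneC": 0.2}
candidates = call_regulon(chip, expr)
```

Requiring both lines of evidence separates direct targets (bound and responsive) from genes that change expression only indirectly and from binding events with no detectable regulatory effect.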


54. Deciphering the Gene Regulatory Network of a Simple Chordate

Byung-in Lee1, David Keys2, Andrae R. Arellano1, Chris J. Detter1, Paul Richardson1, Michael Levine1,2, Mei Wang1, Orsalem J. Kahsai1, David K. Engle1, Irma Rapier1, Sylvia Ahn1, and Trevor Hawkins1

1DOE Joint Genome Institute, Walnut Creek, CA 94598
2Department of Molecular and Cellular Biology, University of California at Berkeley, Berkeley, CA 94720

lee110@llnl.gov

Regulatory DNA elements such as promoters and enhancers work by serving as docking sites for specific protein complexes. These complexes are composed of cooperative groups of transcription factor proteins that recognize their target DNA sequences quite specifically, and their presence or absence governs the on or off status of their target regulatory sites. To understand DNA regulatory elements is therefore to understand the composition and function of the biochemical networks and pathways that carry out the essential processes of living organisms.

To characterize gene regulatory networks, we used electroporation assays to screen genomic DNA fragments for tissue-specific enhancer activities in Ciona intestinalis. The Ciona genome is one of the smallest and most compact of all chordate genomes, and the Ciona tadpole represents the most simplified chordate body plan (the Ciona notochord contains only 40 cells). Because transgenic DNA can be introduced into synchronously developing Ciona embryos by simple electroporation, we can determine the cis-regulatory DNA modules that lead to the specification of each of the key chordate tissues.

Using a shotgun approach, we screened ~300 kb of BAC DNA containing HOX gene clusters for tissue-specific enhancer elements and found 37 clones (80 kb) with tissue-specific enhancer activity; 13 different tissue types were recognized in the screen. We are currently defining minimal enhancer elements from random genomic fragments and performing whole-mount in situ hybridization with the genes within them.

This work was performed under the auspices of the U.S. Department of Energy, Office of Biological and Environmental Research, by the University of California, Lawrence Livermore National Laboratory under Contract No. W-7405-Eng-48, the Lawrence Berkeley National Laboratory under contract No. DE-AC03-76SF00098, and the Los Alamos National Laboratory under contract No. W-7405-ENG-36.


55. Functional Analysis of Gene Regulatory Networks Underlying Skin Biology and Environmental Susceptibility

Brynn H. Jones1, Jay R. Snoddy1, Cymbeline T. Culiat1,3, Mitchel J. Doktycz1,3, Peter R. Hoyt1, Denise D. Schmoyer2, Erich J. Baker1, Douglas P. Hyatt1, Line C. Pouchard2, Michael R. Leuze2, Eugene M. Rinchik1,4, and Edward J. Michaud1,3

1Life Sciences Division, and 2Computer Science and Mathematics Division, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831
3The University of Tennessee–Oak Ridge National Laboratory Graduate School of Genome Science and Technology, Oak Ridge, TN 37830
4Department of Biochemistry, Cellular, and Molecular Biology, The University of Tennessee, Knoxville, TN 37996

jonesbh@ornl.gov

Deciphering the complex biological systems that underlie human health and susceptibility to the environmental consequences of energy production is an important mission of the DOE’s Biological and Environmental Research Program. For many years, ORNL has focused on annotating human DNA sequence information with gene function information based on the genetic analysis of induced single gene mutations in the mouse (see abstracts by Michaud et al., and Culiat et al.). This single-gene, functional-genomics approach leads naturally to a parallel dissection of complex gene regulatory networks, and of the role of individual genetic variation in susceptibility to environmental agents. The recent availability of the complete genomic sequences from humans and mice, and new technologies to assess gene regulation in a high-throughput manner has dramatically increased our ability to elucidate complex biological systems. Here we describe a new project that combines three areas of expertise at ORNL (mouse molecular genetics, analytical technologies and instrumentation, and bioinformatics and computational biology), designed to develop an integrated-systems approach for defining gene function in skin biology and environmental susceptibility. Our initial efforts focus on a novel Oak Ridge mutation (Hrn) in a transcription factor encoded by the hairless (hr) gene. Hairless mutants are characterized by early and persistent loss of body hair, and by increased susceptibility to UV- and chemical-induced carcinogenesis, and to dioxin toxicity. Using skin-specific cDNA microarrays we have identified numerous genes that are differentially expressed in the skin of Hrn mutants, thus identifying some of the components of the regulatory network associated with the hairless transcription factor. In parallel we are applying the concept of phylogenetic footprinting to the task of elucidating this gene regulatory network. 
Data obtained experimentally with hairless mutants will be used as the empirical basis for understanding co-regulation of gene expression using bioinformatics tools. Ultimately we will be able to predict membership of genes in networks based on the presence of shared binding site motifs in regulatory regions, and these hypotheses may be tested by examining the whole-animal consequences of induced mutations in the regulatory sequences and coding regions of each network component using ORNL’s Cryopreserved Mutant Mouse Bank (CMMB).
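The idea of predicting network membership from shared binding site motifs can be sketched as a scan of each gene's regulatory sequence for a motif on either strand. This is a deliberately simplified illustration that assumes an exact-match consensus motif; real binding site models are usually degenerate, and the sequences, gene names, and motif below are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def genes_sharing_motif(regulatory_seqs, motif):
    """Return genes whose regulatory region contains the motif on either strand.

    regulatory_seqs: gene -> regulatory-region sequence (hypothetical input).
    motif:           exact-match consensus sequence (a simplifying assumption).
    """
    forward = motif.upper()
    reverse = reverse_complement(motif)
    return sorted(
        gene
        for gene, seq in regulatory_seqs.items()
        if forward in seq.upper() or reverse in seq.upper()
    )


# Hypothetical sequences for illustration only.
regions = {
    "gene1": "AAGGTCATT",   # motif on the forward strand
    "gene2": "CCTGACCAA",   # motif on the reverse strand
    "gene3": "ACACACAC",    # no motif
}
predicted_members = genes_sharing_motif(regions, "GGTCA")
```

Genes returned by such a scan would only be candidates; as the abstract notes, each prediction is a hypothesis to be tested experimentally.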

[Research sponsored by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, and by the Office of Biological and Environmental Research, U.S. Department of Energy under contract DE-AC05-00OR22725 with UT-Battelle, LLC.]


56. Genomic Identification and Analysis of Shared cis-Regulatory Elements in a Developmentally Critical Homeobox Cluster

Tsutomu Miyake, Mark Dickson, Jane Grimwood, Steve Irvine, Andrew Brady Stuart, Jeremy Schmutz, Kenta Sumiyama, Richard M. Myers, Frank H. Ruddle, and Chris T. Amemiya

Virginia Mason Research Center, Stanford University School of Medicine, Yale University

camemiya@vmresearch.org

A major problem in biology is the delineation of how a one-dimensional sequence of nucleotides can specify a three-dimensional organism. Central to this process is the assurance that the information hardwired into the DNA sequence directs the regulation of genes in their proper temporal and spatial milieu. This coordinated regulation of genes is very complex, and underlies all biological processes, including development, differentiation, evolution, speciation, and the onset of disease. Numerous studies have been performed using the relatively laborious method of site-directed mutagenesis and subsequent expression analysis, in order to deduce the identity and nature of cis-regulatory elements (enhancers and repressors) at a fundamental level. However, alternative experimental approaches are clearly required to detect and characterize these sequences in order to better understand the systematic and interactive roles they play, particularly in a more global context. This is especially true of developmentally important “gene complexes” which are regulated in a programmatic fashion during development. The fact that the structure, organization, and developmental expression patterns of these genes have been so strikingly conserved throughout metazoan evolution suggests that there exist sequence-encoded mechanisms ensuring their evolution and deployment in concert. We propose using a combination of genomic, molecular, cellular and morphologic tools in order to make inroads into our understanding of this problem. We will focus our attention on the Distalless (Dlx) homeobox clusters, whose developmental significance is well established with respect to pattern formation. These relatively small gene clusters serve as regulatory models for other developmentally critical gene clusters in complex vertebrate genomes (such as the Hox clusters, olfactory receptors, and genes of the anticipatory immune system).
We will incorporate a highly integrated approach for the comparative analysis of these clusters among selected mammalian taxa. This pilot project will necessarily implement aspects of genomics (BAC analysis, long-range DNA sequencing, bioinformatics/ computation) and developmental biology (transgenic and knockout/knock-in technologies). The broad goals of this project are to identify and understand the genomic basis for the cooperative regulation of the Distalless (Dlx7/Dlx3) genes, and to further develop this experimental paradigm for future, larger-scale studies of genomic control. In addition to the practical biological/medical implications of this pilot study, the dataset should prove highly useful for evaluation of novel computational methods for multiple sequence alignments.


57. A Sequence-Ready Comparative Map of Chicken Genomic Segments Syntenically Homologous to Human Chromosome 19

Laurie Gordon, Joomyeong Kim, Hummy Badri, Mari Christensen, Matthew Groza, Mary Tran, and Lisa Stubbs

DOE Joint Genome Institute, 2800 Mitchell Drive, Building 100, Walnut Creek, CA 94598-1604 and Genomics Division, Biology and Biotechnology Research Program, Lawrence Livermore National Laboratory, 7000 East Avenue, L-441, Livermore CA 94550

gordon2@llnl.gov

Having recently completed mapping and sequencing mouse genomic segments syntenically homologous to human chromosome 19 (HSA19), we are generating a parallel set of sequence-ready chicken BAC clone contigs. The locations of some HSA19-homologous genes are known in chicken, but homology segments are not well characterized. Preliminary comparisons of human, mouse and chicken conserved segments by other groups suggest that the organization of the chicken genome is more like that of human than mouse. Comparative sequencing of a third vertebrate with greater evolutionary distance from the two mammalian species will test and expand these findings, facilitating a better understanding of ancestral chromosomal organization and gene evolution.

Protein-translated HSA19 gene sequences identified well conserved (60-95%) chicken ESTs for =160 gene loci. Overgo and PCR probes were developed wherever conserved sequences were identified to facilitate detection of gene synteny and homology segment breakpoints. Probes were hybridized to two BAC libraries, one each from Gallus domesticus and Gallus gallus (5x, respectively). Clones identified by hybridization were restriction digested and assembled into maps, generating important information on clonal integrity, length and overlap; restriction maps also facilitate contig extension, identification of potential joins between neighboring contigs and sequencing tiling path selection. To date we have successfully identified at least one BAC for =80 gene loci, generating forty-five contigs covering 7 Mb of the chicken genome. Given the extraordinary compactness of the chicken genome relative to human, we estimate current coverage of approximately 40% of euchromatic HSA19-related territory. We have confirmed the presence of syntenic homology segments while detecting significant rearrangements relative to human and mouse, including differential organization of clustered gene families. Representative BACs are being sequenced by the Joint Genome Institute in Walnut Creek; comparative data will facilitate the characterization of HSA19-related homology segments and shed light on chromosomal evolution in vertebrates.
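Overlap between two clones is typically inferred from restriction digests by counting fragments of matching size. The sketch below shows one simple way to do this, a greedy one-to-one match within a relative size tolerance; it is an assumption-laden illustration rather than the mapping software used in this work, and the fragment sizes and the 2% tolerance are hypothetical.

```python
def shared_fragments(frags_a, frags_b, tolerance=0.02):
    """Count restriction fragments shared by two clones.

    frags_a, frags_b: fragment sizes in bp from the same restriction digest.
    tolerance:        allowed relative size difference (hypothetical 2%).
    Each fragment of clone B can be matched at most once.
    """
    remaining = sorted(frags_b)
    shared = 0
    for size in sorted(frags_a):
        for i, other in enumerate(remaining):
            if abs(size - other) <= tolerance * size:
                shared += 1
                del remaining[i]  # consume the matched fragment
                break
    return shared


# Hypothetical digests of two clones identified by the same probe.
clone_1 = [1200, 3400, 5100, 800]
clone_2 = [3410, 5000, 790, 9000]
n_shared = shared_fragments(clone_1, clone_2)
```

A high shared-fragment count relative to the number of fragments in each clone suggests the clones overlap and belong in the same contig; clones sharing few or no fragments are candidates for contig ends or separate contigs.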


58. Characterization of a New Imprinted Domain Located in Human Chromosome 19q13.4/ Proximal Mouse Chromosome 7

Joomyeong Kim, Anne Bergmann, Edward Wehri, Xiaochen Lu, and Lisa Stubbs

Genomics Division, Biology and Biotechnology Research Program, Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, CA 94550

Kim16@llnl.gov

For a subset of mammalian autosomal genes, the two parental alleles are not functionally equivalent due to genomic imprinting. Imprinting involves inactivation of one allele, depending upon the parental origin. More than 40 imprinted genes have been identified from human and mouse in the past 10 years. Most imprinted genes are thought to be involved in either fetal growth or animal behavior, and most imprinted genes are found clustered in specific regions of chromosomes, suggesting the presence of long-range regulatory mechanisms for genomic imprinting. In early studies, we located one imprinted gene, Peg3 (paternally expressed gene 3), to human chromosome 19q13.4. Due to the clustering of imprinted genes in specific chromosomal regions, it seemed likely that other imprinted genes would be found in the interval surrounding PEG3. We have since isolated and characterized most genes located in the 1-Mb genomic intervals surrounding human and mouse PEG3. Our studies have identified six imprinted genes in this new domain: Peg3, Zim1 (imprinted Zinc-finger gene 1), Zim2, Zim3, Usp29 (Ubiquitin-specific processing protease 29), and Znf264. Most of these imprinted genes are predicted to function as transcription factors based on the zinc-finger motifs detected in their predicted proteins. In contrast to most imprinted regions, the HSA19q and Mmu7 imprinted domains have changed considerably in terms of the content, coding capacity and transcriptional activities of resident genes.

Our lab is currently pursuing two directions for future study. First, we are working to characterize the physiological functions of these new imprinted genes using mouse genetic approaches. Second, we are using comparative genomics approaches to study the regulatory mechanism controlling the imprinting and expression of these six genes. Based on our preliminary results, it is likely that one region, surrounding the first exon of Peg3, is responsible for the imprinting of the whole domain. We are presently testing this hypothesis and identifying regulatory regions associated with all six imprinted genes.


59. A New Apolipoprotein Influencing Plasma Triglyceride Levels in Humans and Mice Revealed by Comparative Sequence Analysis

Len A. Pennacchio1, Michael Olivier3, Jaroslav A. Hubacek2, Jonathan C. Cohen2, Ronald M. Krauss1, and Edward M. Rubin1

1Genome Sciences Department, Lawrence Berkeley National Laboratory, Berkeley, CA 94720 USA
2Center for Human Nutrition and McDermott Center for Human Growth and Development, UT Southwestern Medical Center, Dallas, TX 75390-9052 USA
3Human and Molecular Genetics Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226 USA

LAPennacchio@lbl.gov

The apolipoprotein gene cluster on human chromosome 11q23 (ApoAI/CIII/AIV) is a well-studied genomic interval that influences a variety of plasma lipid parameters and atherosclerosis susceptibility in humans. To facilitate the identification of evolutionarily conserved sequences with potential function near this cluster, we determined the sequence of ~200 kilobasepairs (kbp) of orthologous mouse DNA and compared the mouse and human sequences. The presence of a stretch of inter-species sequence conservation approximately 30 kbp proximal to the ApoAI/CIII/AIV gene cluster led us to an interval that, upon further analysis, was shown to encode a new member (ApoAV) of the chromosome 11 apolipoprotein gene cluster. We find that the ApoAV gene is expressed primarily in liver tissue and encodes a secreted protein that dramatically affects plasma triglyceride levels in humans and mice. Specifically, mice over-expressing a human ApoAV transgene display a 70% decrease in plasma triglyceride concentrations, whereas mice lacking ApoAV have a 400% increase in this lipid parameter. These findings in mice suggested that alterations in ApoAV could also influence human plasma lipid levels. To explore this possibility, we identified several single nucleotide polymorphisms (SNPs) in the human ApoAV gene and determined their distribution in two independent patient populations. Through this analysis, we found a significant association between several polymorphisms and abnormal triglyceride levels in both independent studies. Heterozygous individuals had on average a 32% increase in plasma triglyceride levels when compared to individuals homozygous for the common allele. We determined that approximately 20% of the Caucasian population carries an ancestral chromosomal fragment representing a definable susceptibility haplotype.
These findings in humans and mice illustrate the utility of comparative sequence analysis to prioritize regions of the genome for further study and suggest an important physiological role for ApoAV in affecting plasma levels of triglyceride, a major risk factor for heart disease in humans.
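
As a rough illustration of how a conserved stretch can be flagged in a mouse/human comparison, the sketch below scores percent identity in sliding windows over a pairwise alignment that has already been computed. The window size, step, and 70% threshold are illustrative assumptions, not values from the abstract, and the function names are hypothetical.

```python
def window_identity(aligned_a, aligned_b, window=100, step=50):
    """Return (start, percent_identity) for each window of a pairwise alignment.

    Both inputs are aligned sequences of equal length, with '-' marking gaps.
    Gap columns never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    results = []
    for start in range(0, len(aligned_a) - window + 1, step):
        a = aligned_a[start:start + window]
        b = aligned_b[start:start + window]
        matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        results.append((start, 100.0 * matches / window))
    return results


def conserved_windows(aligned_a, aligned_b, window=100, step=50, threshold=70.0):
    """Keep only windows whose identity meets the (assumed) threshold."""
    scored = window_identity(aligned_a, aligned_b, window, step)
    return [(start, pid) for start, pid in scored if pid >= threshold]


if __name__ == "__main__":
    # Toy alignment: first half identical, second half mostly divergent.
    human = "ACGT" * 5
    mouse = human[:10] + "T" * 10
    print(conserved_windows(human, mouse, window=10, step=10))
```

Windows that pass the threshold would be candidates for closer inspection; real analyses generally also account for alignment quality and background conservation, which this sketch ignores.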


60. Nell1: A Candidate Gene for ENU-Induced Recessive Lethal Mutations at the l7R6 Locus and Potential Mouse Models for Human Neonatal Unilateral Coronal Synostosis (UCS)

Cymbeline T. Culiat1, Jennifer Millsaps2, Jaya Desai3, Beverly Stanford1, Lori Hughes1, Marilyn Kerley1, Don Carpenter1, and Eugene M. Rinchik1,3

1Life Sciences Division, Oak Ridge National Laboratory, P.O. Box 2009, Oak Ridge, TN 37831-8077
2Genome Science and Technology Graduate School, The University of Tennessee, Knoxville, TN 37996
3Department of Biochemistry, Cellular and Molecular Biology, The University of Tennessee, Knoxville, TN 37996

A gene (l7R6) critical for late embryonic development and survival has been mapped proximal to the pink-eyed dilution (p) gene in mouse chromosome 7. Six independent ENU-induced alleles, designated 88SJ, 335SJ, 2038SJ, 102DSJ, 11DSJ, and 45DSJ, all result in late-gestation/neonatal lethality. l7R6 maps to a region homologous to human 11p15.1, which contains a very large gene for a protein kinase C binding protein, called NELL1. Human NELL1 has a 2433-bp coding region with at least 20 exons spread over ~800 kb of genomic distance. Because the human gene is so large, and because we recovered so many l7R6 alleles in a relatively small number of gametes, the mouse counterpart seemed a logical candidate for l7R6. To determine if l7R6 is Nell1, a near full-length (1920 bp) cDNA was used as a probe for Northern analysis of both wild-type and mutant animals. Nell1 expression was detected from E10-E18, increasing as fetal development progresses and concentrating particularly in the head at E18. In wild-type adults, expression was predominantly in brain. Notably, a severely reduced level of Nell1 expression was detected in one allele (102DSJ). Abnormal expression of human NELL1 is associated with unilateral coronal synostosis (UCS) in newborns, a condition in which coronal sutures fuse early, resulting in abnormal head development and limb defects. Mouse hemizygotes recovered at either E18 or two hours after birth also exhibit both gross cranial and limb defects. Cloning and sequencing of RT-PCR-derived cDNA clones from mutant and wild-type alleles have shown that the mutation in the 335SJ allele is an AT/GC transition resulting in a cysteine to arginine substitution in the Nell1 protein. The phenotype data, RNA analysis and mutation scanning experiments indicate that l7R6 is Nell1. Studying the spectrum of mutations in this allelic series will be valuable in understanding the structure and function of the Nell1 protein.

[Research sponsored by the Office of Biological and Environmental Research, U.S. Department of Energy under contract DE-AC05-00OR22725 with UT-Battelle, LLC.]


61. Functional Annotation of Human Genes with Phenotype-Driven and Gene-Driven Mutagenesis Strategies in Mice

Edward J. Michaud1,2, Carmen M. Foster1, Rosalynn J. Miltenberger1,2, Miriam L. Land1, Dabney K. Johnson1,2, and Eugene M. Rinchik1,3

1Life Sciences Division, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831-6445
2The University of Tennessee-Oak Ridge National Laboratory Graduate School of Genome Science and Technology, Oak Ridge, TN 37830-8026
3Department of Biochemistry, Cellular, and Molecular Biology, The University of Tennessee, Knoxville, TN 37996

michaudejiii@ornl.gov

One focus of the Mouse Genetics and Genomics Program at ORNL is to determine the whole-organism biological functions of human genes by inducing mutations in the homologous genes in mice. Our program currently uses a phenotype-driven chromosome-region mutagenesis strategy in the mouse to identify and map gene function in pre-selected segments of the genome, which together total approximately 8% of the mouse genome. The genetic reagents that are necessary to perform these chromosome-region mutagenesis screens, however, are not currently available for most of the mouse chromosome regions that are homologous to the human chromosomes (5, 16, and 19) sequenced by the DOE. In this project, we are exploiting newly developed techniques for engineering chromosomes in mouse embryonic stem cells that will permit phenotype-driven mutagenesis screens and functional-genomics analyses to be performed in any region of the genome. Specifically, we are generating radiation-induced deletions and Cre-loxP-mediated inversions in large, gene-rich regions of mouse chromosomes that show conserved synteny with human chromosomes. Our initial focus is the proximal two-thirds of mouse chromosome 7, which has homology to all of human chromosome 19q and to portions of chromosomes 11p, 15q, and 11q. Although chromosome-region phenotype-driven mutagenesis in mice is currently the state of the art for identifying and mapping the biological functions of human genes, engineering the appropriate genetic reagents for a new chromosome region and performing the mutagenesis screens are still time-consuming endeavors. To augment our phenotype-driven mutagenesis strategy, we are taking advantage of the recent availability of the draft sequence of the mouse genome to develop a new DNA sequence-driven or gene-driven mutagenesis strategy (ORNL’s Cryopreserved Mutant Mouse Bank, CMMB; see other abstract by Michaud et al.) that will allow us to determine the biological functions of any pre-selected genes in the genome, regardless of their chromosomal locations. The ongoing development of the CMMB offers us a new and unprecedented opportunity to apply our expertise in chemical germ-cell mutagenesis in mice specifically to understanding the biological functions of those genes on human chromosomes 5, 16, and 19.

[Research sponsored by the Joint Genome Institute, Office of Biological and Environmental Research, U.S. Department of Energy under contract DE-AC05-00OR22725 with UT-Battelle, LLC.]


62. Resource Archiving and Distribution via the Mutant Mouse Database and the Cryopreservation Program at the Oak Ridge National Laboratory

D. K. Johnson1, E. M. Rinchik1,2, P. R. Hunsicker1, S. G. Shinpock1, K. J. Houser1, D. J. Carpenter1, G. D. Shaw1, W. Pachan1, E. J. Michaud1, B. L. Alspaugh3, and L. B. Russell1

1Life Sciences Division, Oak Ridge National Laboratory, P.O. Box 2009, Oak Ridge, TN 37831-8077
2Department of Biochemistry, Cellular and Molecular Biology, The University of Tennessee, Knoxville, TN 37996
3Science Applications International Corporation, Oak Ridge, TN 37830

v71@ornl.gov

The Mouse Genetics and Genomics Program at Oak Ridge National Laboratory (ORNL) currently curates eight hundred standard or mutant strains of laboratory mice in a conventional colony co-located with laboratory space equipped for molecular biology, broad-based phenotype screening, and genetic engineering. Of these 800 strains, 300 are in live maintenance and 670 are banked as cryopreserved embryos, sperm, and/or ovaries. Detailed information about actively propagated or cryopreserved stocks is listed in ORNL’s searchable Mutant Mouse Database (http://bio.lsd.ornl.gov/mouse/). Mutant stocks may be obtained as breeding pairs, frozen tissues, or frozen embryos, sperm, or ovaries for a cost-recovery fee. Mouse stocks are cryopreserved as embryos or germ cells in order to archive stocks not in current active use, to preserve the necessary materials for the rederivation of all stocks into our new barrier facility, and to provide a means for distribution of requested stocks to the international research community. Protocols and progress of the cryopreservation effort may be viewed at the Mammalian Genetics and Genomics Program website (http://bio.lsd.ornl.gov/mgd), which includes a further link (http://tnmouse.org/) to information on ORNL’s participation in the Tennessee Mouse Genome Consortium, one of the national mouse-mutagenesis centers funded by the National Institutes of Health. ORNL has recently begun construction of the William L. and Liane B. Russell Laboratory of Comparative and Functional Genomics, a 30,000 square-foot colony built to house 60,000 mice in specific pathogen-free (SPF) conditions. This new SPF mouse-breeding facility will be important for the future research activities of ORNL’s Mouse Genetics and Genomics Program.

[Research sponsored by the Office of Biological and Environmental Research, U.S. Department of Energy under contract DE-AC05-00OR22725 with UT-Battelle, LLC.]


63. Mutation Scanning and Candidate-Gene Verification in the ORNL Regional ENU-Mutagenesis Program

Cymbeline Culiat1, Qingbo Li2, Mitchell Klebig3, Dabney Johnson1, Zhaowei Liu2, Heidi Monroe2, Beverly Stanford1, Tse-Yuan Lu1, Lori Hughes1, Marilyn Kerley1, Don Carpenter1, Lisa Webb4, and Eugene M. Rinchik1,3

1Life Sciences Division, Oak Ridge National Laboratory, P.O. Box 2009, Oak Ridge, TN 37831-8077
2SpectruMedix Corporation, 2124 Old Gatesburg Rd, State College, PA 16803
3Department of Biochemistry, Cellular and Molecular Biology, The University of Tennessee, Knoxville, TN 37996
4Graduate School for Genome Sciences and Technology, The University of Tennessee, Knoxville, TN 37996

9c9@ornl.gov

Identifying the gene alterations in mouse mutations and understanding the resulting perturbed pathways contribute to the functional annotation of the corresponding genes in the human genome. Regional mutagenesis efforts at Oak Ridge National Laboratory have generated and fine-mapped 15 recessive-lethal N-ethyl-N-nitrosourea (ENU)-induced mutations to a small genomic region proximal to the pink-eyed dilution (p) gene in mouse chromosome 7. These mutations represent six genes important for early mammalian development and survival. Candidate genes were assigned to these mutations by integrating data from genetic and physical mapping, phenotype characterization (e.g., time-of-death studies), expression profiling (regional transcriptomics), and publicly available bioinformatics data. Three mutation-scanning techniques (dHPLC/TMHA, TGCE, and DNA sequencing) were used to identify potential mutations in the candidate genes assigned to the recessive lethal mutations. The application of temperature gradient capillary electrophoresis (TGCE), a new high-throughput heteroduplex analysis technique, permitted rapid assignment of positions where ENU-induced base pair changes were located and is the first demonstration of the effectiveness of using TGCE to find ENU-induced mutations (i.e., SNPs) in the mouse genome. Mutation-scanning data for identification of mutations in the Ldh1 (1 allele), Saa3 (2 alleles), Prmt3 (2 alleles), and Nell1 (8 alleles) genes will be presented. Our data demonstrate that TGCE is an excellent approach for examining several candidate genes for a single mutation or a single gene for a cluster of mutations. In addition, optimization of TGCE protocols for mutation scanning and its success in the regional mutagenesis program have led to its further application in a gene-driven approach for finding mouse mutations genome-wide in the CMMB (Cryopreserved Mutant Mouse Bank) (see abstract by Michaud et al.).

[Research sponsored by the Office of Biological and Environmental Research, U.S. Department of Energy under contract DE-AC05-00OR22725 with UT-Battelle, LLC.]


64. Genome-Wide, Gene-Driven Chemical Mutagenesis for Functional Genomics: The ORNL Cryopreserved Mutant Mouse Bank

E. J. Michaud1,2, J. R. Snoddy1, E. J. Baker1, Y. Aydin-Son2, D. J. Carpenter1, L. L. Easter1, C. M. Foster1, A. W. Gardner1, K. S. Hamby1, K. J. Houser1, K. T. Kain1, T.-Y. S. Lu1, R. E. Olszewski1, I. Pinn1, G. D. Shaw1, S. G. Shinpock1, A. M. Wymore1, D. K. Johnson1,2, C. T. Culiat1,2, and E. M. Rinchik1,3

1Life Sciences Division, Oak Ridge National Laboratory, P.O. Box 2008, Oak Ridge, TN 37831
2The University of Tennessee–Oak Ridge National Laboratory Graduate School of Genome Science and Technology, Oak Ridge, TN 37830
3Department of Biochemistry, Cellular, and Molecular Biology, The University of Tennessee, Knoxville, TN 37996

michaudejiii@ornl.gov

A major challenge following the sequencing of the human genome is to determine the biological functions of the estimated 40,000-70,000 genes, and the manner in which these genes are coordinately regulated and affected by environmental factors. By inducing mutations in mouse genes and determining the consequences of the mutations in the whole animal, we gain insight into the functions, regulatory networks, and gene-environment interactions of the homologous human genes. The recent availability of the complete DNA sequence of the mouse genome and high-throughput methods for rapid detection of single-nucleotide polymorphisms (SNPs) has facilitated genome-wide, gene-driven approaches to germline mutagenesis. Gene-driven mutagenesis strategies allow one to perform whole-genome mutagenesis, and then screen for alterations in any pre-selected gene(s) in the genome. To augment embryonic stem-cell-based gene-driven mutagenesis resources, such as gene-trap libraries and banks of N-ethyl-N-nitrosourea (ENU)-mutagenized cells, we are generating a bank of DNA, tissues (for RNAs and proteins), and sperm from 5000 individual C57BL/6JRn mice that carry a load of paternally induced ENU mutations. This ORNL Cryopreserved Mutant Mouse Bank (CMMB) will be a source of induced, heritable SNPs in the regulatory regions and coding sequences of virtually every gene in the genome. High-throughput Temperature Gradient Capillary Electrophoresis (see abstract by Culiat et al.) is being used to identify mutations in pre-selected genes in the DNAs and RNAs from the CMMB, and mutant mice will be recovered from frozen sperm to determine the biological functions of the homologous human genes. Thus, ORNL’s CMMB will provide mouse models of a wide range of altered proteins for phenotypic, gene/protein-network, and structural biology-type analyses. 
We envision the CMMB as a core component in the integration of mouse mutagenesis, gene-expression microarrays, proteomics, and computational biosciences at ORNL for the purpose of (1) determining the function of every gene on the three human chromosomes sequenced by the DOE (see other abstract by Michaud et al.), (2) deciphering complex biological systems underlying human health and susceptibility to the environmental consequences of energy production (see abstract by Jones et al.), and (3) strategically positioning ORNL to respond to DOE’s Genomes to Life program.

[Research sponsored by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, and by the Office of Biological and Environmental Research, managed by UT-Battelle, LLC for the U. S. Department of Energy under Contract No. DE-AC05-00OR22725.]


65. Filtering Out Functional Open Reading Frame Fragments from DNA

P. Zacchi1, D. Sblattero2, R. Marzari2, and A. Bradbury3

1CIB, Area di Ricerca, Padriciano 99, Trieste, Italy
2Dipartimento di Biologia, Universita' di Trieste, Trieste, Italy
3Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM 87545

amb@lanl.gov

In any functional analysis of the protein products of a genome, some method is required to physically isolate open reading frames, as opposed to merely identifying them. There are two general approaches to this. In the first method, open reading frames are identified by applying informatic methods to cDNA, EST, or genomic sequences, and specific primers are designed to amplify the open reading frame from a suitable source (e.g. cDNA). This can then be cloned into a vector of interest. In the second method, suitable DNA is fragmented and sampled at random, and open reading frames are selected. The first method provides full-length open reading frames, while the second provides open reading frames that are fragments of full genes. The first method is relatively time-consuming but is more useful for the study of protein function, while the second method can be carried out more easily and is more applicable to the study of immunological epitopes within gene products. In a model system, we have applied an example of the second method to the analysis of a monoclonal antibody epitope found in tissue transglutaminase (tTG). The gene for tTG was cloned into an expression plasmid and the plasmid was broken into fragments of 300 bp. This represents a model system in which four genes (tTG, rop, lacI and kanamycin resistance) are present with an approximately equal amount of noncoding sequence. The fragments were cloned into a vector designed to select open reading frames, and a number of clones were identified that expressed the known mAb epitope. Furthermore, sequencing of random fragments revealed that the selection vector had a strong bias for real open reading frames of known function and selected few open reading frames of no known biological function.
This system is likely to be applicable to the efficient selection of random open reading frames representing the immunological coding potential of single genes, whole micro-organisms, normalized cDNA libraries or collections of cloned open reading frames.
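
The abstract does not describe how the selection vector distinguishes open reading frames, so the following is only a computational analogue under stated assumptions: a fragment is taken to "pass" if it contains no in-frame stop codon and its length is a multiple of three, so that a hypothetical downstream reporter would stay in frame. The function names and the criterion itself are illustrative, not the authors' method.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_open(fragment, frame=0):
    """True if reading `fragment` from offset `frame` meets no stop codon.

    A trailing partial codon is ignored.
    """
    seq = fragment.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return all(codon not in STOP_CODONS for codon in codons)


def passes_selection(fragment):
    """Assumed criterion: frame 0 is open and the length is a multiple of 3."""
    return len(fragment) % 3 == 0 and is_open(fragment, 0)


if __name__ == "__main__":
    for frag in ("ATGGCTGCA", "ATGTAAGCA", "ATGGCTGC"):
        print(frag, passes_selection(frag))
```

Random noncoding fragments of a few hundred base pairs usually contain a stop codon in any given frame, which is consistent with the reported bias toward real open reading frames.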

Presentation of this poster is subject to completion and submission of the corresponding patent.


66. Towards High Throughput Antibody Selection

Jianlong Lou1, Roberto Marzari2, Peter Pavlik3, Milan Ovecka3, Nileena Velappan3, Leslie Chasteen3, Vittorio Verzillo3, Federica Ferrero6, Daniel Pak4, Morgan Sheng4, Chonglin Yang5, Daniele Sblattero2, and Andrew Bradbury3

1Dept of Anesthesia 3s50, San Francisco General Hospital, UCSF, San Francisco, California
2Department of Biology, Universita' di Trieste, Trieste, Italy
3Biosciences Division, Mail Stop M888, Los Alamos National Laboratory, Los Alamos, New Mexico
4Department of Neurobiology, Massachusetts General Hospital, Boston, Massachusetts
5Department of Biochemistry and Molecular Biology, University of Maryland at Baltimore, Baltimore, Maryland
6SISSA, Trieste, Italy

amb@lanl.gov

Phage antibody libraries represent a relatively easy way to generate antibodies against a vast number of different ligands. Although, in principle, phage antibody selection should be amenable to automation, this has not yet been described, and present selection protocols are far from high throughput. We have reduced phage antibody selection to a microtitre format, and compared selection using this format to traditional selection.

Antibodies were selected against eleven different antigens using either a microtitre plate selection method (using pins rather than wells) or the “traditional” immunotube method. We find that the two methods tend to select different antibodies, with only 10% of antibodies in common, even if the plastic, the antigen and the library used are identical. This is in contrast to the use of the same method to select antibodies, when over 30% of antibodies selected are in common.

We are presently working on automating the phage antibody selection and screening method using a Tecan Genesis workstation and a Qbot picking robot. Results will be presented.


67. A Pilot Project for Identifying and Characterizing Protein Complexes

Edward C. Uberbacher, Frank Larimer, Bob Hettich, Greg Hurst, Michelle Buchanan, Dong Xu, and Ying Xu

Oak Ridge National Laboratory

ube@ornl.gov

This pilot project is developing the central technologies needed to build a Protein Complex Factory (PCF) designed to meet the needs described in Goal 1 of the Genomes to Life program for large-scale identification and characterization of protein machines. This facility will eventually be able to identify the protein components of protein machines from microbial and eukaryotic genomes at high throughput and from whole cells. The identification of complexed proteins and the interrogation of protein complex organization will be conducted using a combination of mass spectrometry, protein crosslinking, and computing. The goals of this pilot project are (i) to develop a combined experimental and computational capability for the identification and interrogation of protein-protein interactions and (ii) to apply the developed methodology to several isolated protein complexes and protein complexes within cell extracts, to demonstrate that the methodology is sufficiently accurate, informative and scalable to whole microbial and eukaryotic cells. The approach utilizes specialized affinity-tagged crosslinking reagents that crosslink proteins in complexes and then allow crosslinked proteins in a cell extract to be separated from non-crosslinked products. A subsequent liquid chromatography and tandem mass spectrometry step can then resolve this mix into distinguishable fragments, and rapidly and selectively interrogate each fragment to obtain a unique sequence mass fragment fingerprint. This fingerprint can be used to identify the proteins involved in interprotein crosslinking through a database search methodology that also provides the identity of the specific amino acids involved in each crosslink. Once identified, the crosslinked positions and other information, such as crosslinker length, can be used as constraints to derive information about the geometry of each protein complex.
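
To make the last step concrete, the sketch below shows one simple way crosslinks can act as geometric constraints: given a candidate model of a complex, each crosslinked residue pair should lie within the maximum span of the crosslinker. The coordinate layout, the function name, and the 12 Å span in the demo are hypothetical choices for illustration and do not come from the abstract.

```python
import math


def crosslink_violations(coords, crosslinks, max_span):
    """List crosslinks whose residue-to-residue distance exceeds max_span.

    coords:     {(protein, residue_number): (x, y, z)} in angstroms
    crosslinks: iterable of ((protein, residue), (protein, residue)) pairs
    max_span:   assumed maximum distance the crosslinker can bridge
    """
    violations = []
    for site_a, site_b in crosslinks:
        distance = math.dist(coords[site_a], coords[site_b])
        if distance > max_span:
            violations.append((site_a, site_b, distance))
    return violations


if __name__ == "__main__":
    # Hypothetical candidate model with three crosslinkable residues.
    model = {
        ("A", 10): (0.0, 0.0, 0.0),
        ("B", 42): (3.0, 4.0, 0.0),
        ("B", 77): (0.0, 0.0, 30.0),
    }
    observed = [(("A", 10), ("B", 42)), (("A", 10), ("B", 77))]
    print(crosslink_violations(model, observed, max_span=12.0))
```

A model with no violations is compatible with the crosslinking data; a model with many violations can be rejected or refined.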

Several demonstrations are being developed as proof of principle based on well characterized complexes: (1) mouse wild-type hemoglobin and mutant forms, (2) the GroELS complex, and (3) the pmf ATPase molecular engine.

For these targeted complexes, the pilot project will (a) generate the necessary crosslinked complexes, (b) detect and interrogate significant numbers of intermolecular crosslinked fragments by tandem mass spectrometry, and (c) deconvolute and interpret the data in terms of complex identification and organization. As part of the pilot project, estimates will be obtained that directly address issues of scale, including what data collection will cost and how long it would take to comprehensively examine protein machines in a whole cell. If successful, this pilot will set the stage for a Phase II production capability.


68. High-Throughput Protein Expression and Purification for Proteomics Research

Sharon Doyle, Jennifer Primus, Michael Murphy, Paul Richardson, and Trevor Hawkins

DOE Joint Genome Institute, Walnut Creek, CA 94598

sadoyle@lbl.gov

Full insight into the control of genomic sequences over many biological processes requires the analysis of the protein products. Only through the analysis of proteins on a genomic scale can we begin to understand the complexities encoded in the genome. Methods that allow for the production of proteins in a high-throughput manner are vital to achieve this goal. We have developed a system for high-throughput subcloning, protein expression and purification that is simple, fast and inexpensive. We utilize ligation-independent subcloning to create an expression vector encoding an N-terminal histidine tag. A dot blot expression screen was developed to analyze protein levels following expression in bacterial cultures, which facilitates the testing of multiple expression parameters if necessary. Protein purification in a 96-well format using Ni-NTA resin yields highly purified proteins. Using this system, we have optimized conditions to achieve a first-pass success rate of up to 70% in prokaryotic systems, and are currently utilizing the expression screen to increase the efficiency of protein production from eukaryotic systems.

This work was performed under the auspices of the U.S. Department of Energy, Office of Biological and Environmental Research, by the University of California, Lawrence Livermore National Laboratory under Contract No. W-7405-Eng-48, the Lawrence Berkeley National Laboratory under contract No. DE-AC03-76SF00098, and the Los Alamos National Laboratory under contract No. W-7405-ENG-36.


69. Visualization and Analysis of Protein DNA Complexes

William McLaughlin1, Xiang-jun Lu1, Susan Jones2, Janet Thornton2, and Helen M. Berman1

1Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, 610 Taylor Road, Piscataway, NJ 08854-8087
2Biomolecular Structure and Modelling Unit, Department of Biochemistry and Molecular Biology, University College, Gower Street, London WC1E 6BT England

mclaugwi@rutchem.rutgers.edu

Simple and quantitative rules were created that can be used to discern DNA-binding protein structures in the Protein Data Bank and the Nucleic Acid Database. The rules are based on conserved structural characteristics analyzed by machine learning techniques. Where possible, a functional role has been assigned to each of the structural characteristics found.

We have also developed a computer program that depicts the interactions between DNA and proteins. This application creates schematic diagrams such as those seen in Figure 4 of Jones et al. (J. Mol. Biol. v287, pp877-896, 1999).

This work is funded by the Department of Energy (DE-FG02-96ER62166).


70. Structure/Function Analysis of Protein/Protein Interactions and Role of Dynamic Motions in Mercuric Ion Reductase

Susan M. Miller1, Aiping Dong2, Emil Pai2, Matthew J. Falkowski1, Richard Ledwidge1, Anne O. Summers3, and Jane Zelikova3

1Department of Pharmaceutical Chemistry, University of California - San Francisco
2Departments of Biochemistry, Medical Biophysics, Molecular and Medical Genetics, University of Toronto
3Department of Microbiology, University of Georgia, Athens

smiller@cgl.ucsf.edu

Mercuric ion reductase (MerA) is the key enzyme involved in bacterial pathways for detoxification of Hg(II) and organomercurials that result in the two-electron reduction of Hg(II) to elemental mercury. Extensive studies of this metal ion reductase have advanced our knowledge to the stage where detailed structure/function questions can be asked to gain deeper insight into how the structural components of the protein contribute to the efficient handling of the toxic metal ion. As these pathways are being incorporated into radiation resistant and other durable species for bioremediation purposes, these insights may prove invaluable for enhancing the activity of the protein and the whole pathway in the alternative organisms. In addition, with further insight into what features of the protein are critical for handling one metal ion, a second goal is to incorporate alternative ligands and properties to allow the protein to bind and reduce other toxic metals. Sequences of MerAs indicate the conservation of a multidomain catalytic core, in which a four-cysteine ligand exchange pathway for binding and reduction of Hg(II) has been identified. Two of the cysteines are found on the C-terminal segment of the protein that evidence suggests may require mobility for efficient catalysis. As one aspect of our studies, we are evaluating thermodynamic and kinetic properties of wild type and mutant MerAs with site-directed mutations in the ligand binding pathway and site of reduction, along with crystal structure analysis in order to evaluate the significance of mobility and other physicochemical properties on the efficiency of catalysis. In addition to the catalytic core, all but one reported MerA sequence also contain 1 or 2 N-terminal repeats of a domain (NmerA) with a conserved GMTCXXC metal-binding motif, the function of which has yet to be determined. 
A second aspect of our studies involves characterization of the NmerA structure, metal-binding properties and interactions with the catalytic core. To facilitate these studies, we have cloned and expressed the catalytic core and its single NmerA domain as separate proteins. As a third aspect of our studies, we are evaluating the effectiveness of the separate domains as participants in the mercury resistance operon in vivo. Results to date of these ongoing studies will be presented.


71. Investigating Protein Complexes by Crosslinking and Mass Spectrometry

Gregory B. Hurst1, Robert L. Hettich1, James L. Stephenson1, Phillip F. Britt1, Matthew Sega3, Jana Lewis1, Patricia K. Lankford2, Michelle V. Buchanan1, Edward C. Uberbacher2, Ying Xu2, Dong Xu2, Jane Razumovskaya3, and Victor N. Olman2

1Chemical Sciences Division and 2Life Sciences Division, Oak Ridge National Laboratory, Oak Ridge TN
3Genome Science and Technology Graduate School, University of Tennessee-Knoxville and Oak Ridge National Laboratory, Knoxville, TN

hurstgb@ornl.gov

In response to the Genomes to Life (GTL) Initiative, one component of the proposed Protein Complex Factory (PCF) at ORNL is a mass-spectrometry-based capability for high-throughput identification of protein complexes. The proposed strategy includes chemical crosslinking of interacting protein pairs, enzymatic digestion, affinity purification, and mass spectrometric analysis of the resulting crosslinked peptides to yield information on interacting protein pairs. Progress has been achieved on several elements of the proposed strategy. A special biotinylated family of crosslinkers will allow affinity isolation of either the crosslinked proteins or proteolytic peptides from these proteins. A simple method for preparing these biotinylated crosslinking reagents has been tested. High throughput will be achieved by performing crosslinking reactions on cell lysates, or fractions thereof, using a large array of reaction conditions. Some elements of this array of reaction conditions will yield crosslinking of a small subset of the protein complexes in the lysate or fraction. Crosslinking reactions have been performed under a variety of conditions in a 96-well format using model proteins. Identifying interacting proteins from their crosslinked peptides will require tandem mass spectrometry to yield partial amino acid sequence information from each of the two crosslinked peptides of the pair. Tandem mass spectrometry (MS-MS and MS-MS-MS) of a crosslinked peptide pair from bovine ribonuclease A shows a fragmentation pattern that allows confirmation of the identities of the two crosslinked peptides. Computational methods for interpreting the resulting mass spectra are under development.
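
The computational methods are described as still under development, so the sketch below is not the authors' algorithm. It illustrates only the simplest first step such a method might take: finding pairs of candidate peptides whose combined mass, plus the mass added by the crosslinker, matches an observed precursor mass within a tolerance. The peptide names, masses, linker mass, and 10 ppm tolerance are all hypothetical.

```python
from itertools import combinations_with_replacement


def match_crosslinked_pairs(observed_mass, peptide_masses, linker_delta, tol_ppm=10.0):
    """Return peptide pairs whose summed mass plus linker_delta fits observed_mass.

    peptide_masses: {peptide_name: neutral mass in Da}
    linker_delta:   net mass added by the crosslinker bridging two peptides
    tol_ppm:        allowed relative error in parts per million
    """
    hits = []
    candidates = sorted(peptide_masses.items())
    for (name_a, mass_a), (name_b, mass_b) in combinations_with_replacement(candidates, 2):
        theoretical = mass_a + mass_b + linker_delta
        error_ppm = 1e6 * (observed_mass - theoretical) / theoretical
        if abs(error_ppm) <= tol_ppm:
            hits.append((name_a, name_b, round(error_ppm, 2)))
    return hits


if __name__ == "__main__":
    # Hypothetical digest and linker mass, for illustration only.
    digest = {"pepA": 1000.0, "pepB": 1500.0, "pepC": 2000.0}
    print(match_crosslinked_pairs(2600.0, digest, linker_delta=100.0))
```

A precursor-mass match of this kind only narrows the candidates; confirming the pair still requires fragment-ion evidence, as in the MS-MS and MS-MS-MS spectra mentioned above.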


72. High-Density Protein Microarrays

Judith Maples, Joseph Spangler, Yanhong Wang, and Rajan Kumar

Genome Data Systems, Inc.

rkumar@genomedatasystems.com

There is increasing interest in whole-proteome analysis using high-density microarrays of proteins. Protein microarrays would form the basis of new diagnostics and research tools in the future. For diagnostic applications, protein microarrays can rapidly detect the presence or absence of biomarkers associated with particular diseases. In genomic and proteomic research, arrays of antibodies have been used to investigate how much of a given protein is expressed at a given time and place. However, the challenges associated with protein microarrays are more difficult to address than those of DNA microarrays. Since the proteins are larger than DNA molecules, the individual protein molecules have to be deposited further apart, resulting in lower sensitivity. Cross-reactivity of proteins is a major concern for protein microarrays. Genome Data Systems, Inc. has developed an innovative and highly flexible technology, called GeneCube, for fabrication, use and analysis of three-dimensional protein microarrays. The method allows mass-production of microarrays as well as stringent quality control. It also provides better accuracy and precision in comparison with conventional microarrays. The detection of signal from the microarrays is performed using a proprietary detection approach that extracts the signal from individual elements of the array without cross talk. During the current program, GeneCube microarrays were used to investigate interactions between proteins and antibodies, and to perform functional screening for potential substrates.


73. Advantages of Multi Photon Detectors in Protein Quantitation

A.K. Drukier

BioTraces Inc.

akd@biotraces.com

We are developing an integrated protein detection system that is at least a hundred times more sensitive than current techniques. This protein quantification system capitalizes on the exquisite instrumental sensitivity of the multiphoton detector (MPD) to enable the highest sensitivity detection and high throughput. The ultra-high sensitivity and very large dynamic range combine to make MPD instruments far superior to other methods of protein detection. Because of its ability to specifically detect co‑resident labels, MPD technology in combination with prior-art protein microarrays (P-chips) permits about a hundred-fold sensitivity improvement, mostly through new methods of non-specific biological background rejection.

MPD techniques: MPD is a proprietary detection system for the measurement of ultra-low amounts of selected radioisotopes [see www.biotraces.com]. MPD enhanced biomedical methods have several advantages over existing methods: 1,000-fold improvement in sensitivity, enabling measurement of previously undetectable amounts of target substances; high dynamic range (8-9 decades), eliminating the need for sample concentration or dilution; use of extremely low levels of radioisotope, avoiding the classification of test samples as radioactive; and cost savings due to decreased amounts of reagents and time for testing. With sensitivity better than a thousand atoms of 125I, MPD marks a new milestone in detection where quantitation of sub-zeptomole amounts of biomaterial is possible. MPD techniques require less than one pCi of isotope, which is about 100 times less activity than in a glass of water.
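The atom-count and activity figures quoted above can be related by a short calculation. The sketch below is only an illustration: the 125I half-life of about 59.4 days and the helper names are assumptions introduced here, not values or methods given in the abstract.

```python
import math

AVOGADRO = 6.02214076e23      # atoms per mole (exact SI value)
I125_HALF_LIFE_DAYS = 59.4    # assumed half-life of 125I; not stated in the abstract
BQ_PER_PCI = 0.037            # 1 pCi corresponds to 0.037 decays per second

def atoms_to_zeptomoles(atoms):
    """Convert a number of atoms to zeptomoles (1 zmol = 1e-21 mol)."""
    return atoms / AVOGADRO / 1e-21

def activity_pci(atoms, half_life_days=I125_HALF_LIFE_DAYS):
    """Activity, in picocuries, of a sample holding `atoms` radioactive atoms."""
    decay_constant = math.log(2) / (half_life_days * 86400.0)  # per second
    return atoms * decay_constant / BQ_PER_PCI

if __name__ == "__main__":
    for n in (600, 1000):
        print(n, "atoms:", round(atoms_to_zeptomoles(n), 2), "zmol,",
              round(activity_pci(n), 4), "pCi")
```

Under these assumptions a thousand atoms corresponds to roughly 1.7 zeptomoles, the sub-zeptomole regime begins at roughly 600 atoms, and the activity of a thousand 125I atoms is far below one picocurie.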

MPD enhanced immunoassays: A new, super-sensitive immunoassay (IA/MPD) provides quantitative measurement of biological substances at levels as low as a femtogram/ml, i.e. sub-attomole sensitivities, for several cytokines as well as the HIV-1 p24 antigen. Pilot studies compared the IA/MPD to prior-art immunoassay methods. The unprecedented sensitivity of a family of IA/MPDs for interleukins (IL-1 beta, IL-4, IL-6, IL-10, IL-11, IL-12), as well as the HIV-1 p24 antigen, has been documented. These quantitatively accurate MPD immunoassays have a landmark sensitivity of about 1 fg/ml, i.e. better than 0.1 attomole/ml, using a total specific activity that is below the natural radioactive background. Essential to the success of each IA/MPD has been our work on developing protocols and proprietary reagents for the reduction of nonspecific biological binding.
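The relation between a mass concentration of 1 fg/ml and a molar concentration below 0.1 attomole/ml depends on the molar mass of the analyte. The sketch below shows the conversion; the molar masses used (a nominal 24 kDa for p24 and a 10 kDa threshold case) are illustrative assumptions, not figures from the abstract, and the function name is hypothetical.

```python
def fg_per_ml_to_amol_per_ml(conc_fg_per_ml, molar_mass_da):
    """Convert a mass concentration (fg/ml) to a molar concentration (amol/ml)."""
    grams_per_ml = conc_fg_per_ml * 1e-15       # 1 fg = 1e-15 g
    moles_per_ml = grams_per_ml / molar_mass_da # molar mass in g/mol (Da)
    return moles_per_ml / 1e-18                 # 1 amol = 1e-18 mol

if __name__ == "__main__":
    # Assumed, illustrative molar masses in daltons.
    for label, mass_da in (("10 kDa protein", 10_000), ("24 kDa protein", 24_000)):
        print(label, fg_per_ml_to_amol_per_ml(1.0, mass_da), "amol/ml")
```

With these assumed masses, 1 fg/ml corresponds to about 0.1 attomole/ml for a 10 kDa protein and about 0.04 attomole/ml for a 24 kDa protein, so the "better than 0.1 attomole/ml" figure holds for analytes of roughly 10 kDa and above.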

P-chips/MPD: We are extending IA/MPD to the creation of supersensitive P-chips with MPD read-out (P-chip/MPD) targeting up to 256 different proteins. The pattern of activities is measured by the MPD-Imager with sensitivity better than 10 zeptomole/pixel. The sensitivity of such a P-chip/MPD is limited by the non-specific biological backgrounds. Preliminary results suggest that a sensitivity of 10-50 fg/ml can be achieved; this result was obtained when quantitating several cytokines concurrently.

Detection of BW agents: MPD technology is applicable to the full spectrum of BW agents, including viruses, bacteria, algae and biotoxins. Initially, we propose the use of MPD enhanced detection methods for individual targets. We expect to complete the development of supersensitive P-chip/MPD for BW agents within a year. This universal system, able to detect all major groups of biological warfare agents, represents the most sensitive and pragmatic solution to the detection of BW threats. The proposed applications of MPD for the detection of biological warfare agents can be divided into two synergistic projects: use of a panel of MPD enhanced immunoassays for detection of biotoxins, and development of a universal biological warfare agent detector using supersensitive P-chips/MPD.


205. Developing Genome Scale Models of Metabolism

Christophe Schilling, Bernhard Palsson

Genomatica, Inc., 5405 Morehouse Drive, Ste. 210, San Diego, CA 92121

cschilling@genomatica.com

Genome-scale metabolic networks can now be reconstructed based on annotated genomic data augmented with biochemical and physiological information about an organism. Mathematical analysis can be performed to assess the capabilities of these reconstructed networks. The constraints-based framework for cellular modeling and simulation has been successfully implemented at the genome scale to predict the time course of growth and by-product secretion, the effects of mutations and knock-outs, and gene expression profiles. In this talk we discuss this approach, its future course of technology development, and how it is being implemented to develop genome-scale models of organisms of interest to the DOE.
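As a rough illustration of the constraints-based idea, the sketch below applies steady-state mass balance and flux bounds to a four-reaction toy network and picks the feasible flux vector with the highest growth flux. Everything in it is hypothetical: the network, the bounds, and the names are invented for illustration, and the brute-force search over integer fluxes stands in for the linear programming that genome-scale models would normally require. It is not the authors' implementation.

```python
from itertools import product

# Hypothetical toy network (not from the abstract); columns of S in this order:
#   0: uptake   (-> A)
#   1: convert  (A -> B)
#   2: growth   (B -> biomass, leaves the system)
#   3: secrete  (B -> by-product, leaves the system)
S = [
    [1, -1, 0, 0],    # mass balance for metabolite A
    [0, 1, -1, -1],   # mass balance for metabolite B
]

def is_steady_state(v):
    """True if S . v = 0, i.e. no metabolite accumulates or is depleted."""
    return all(sum(s * x for s, x in zip(row, v)) == 0 for row in S)

def best_flux(lower, upper, objective_index=2):
    """Enumerate integer flux vectors within the bounds and return the
    steady-state vector with the largest objective flux, or None."""
    candidates = product(*(range(lo, hi + 1) for lo, hi in zip(lower, upper)))
    feasible = [v for v in candidates if is_steady_state(v)]
    if not feasible:
        return None
    return max(feasible, key=lambda v: v[objective_index])

if __name__ == "__main__":
    # Wild type: uptake capped at 10, by-product secretion forced to be >= 2.
    print(best_flux([0, 0, 0, 2], [10, 10, 10, 10]))
    # Knock-out of the conversion reaction: its upper bound is set to 0.
    print(best_flux([0, 0, 0, 0], [10, 0, 10, 10]))
```

In this toy case the effect of a knock-out is modeled simply by forcing the bounds of the affected reaction to zero and searching again, which mirrors how constraints, rather than kinetic parameters, define the allowed behavior.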


The online presentation of this publication is a special feature of the Human Genome Project Information Web site.