Researchers Expand Efforts to Explore Functional Landscape of the Human Genome
Full-Scale ENCODE Project Will Survey Entire Human Instruction Book
The National Human Genome Research Institute (NHGRI), part of
the National Institutes of Health (NIH), today announced grants
totaling more than $80 million over the next four years to expand
the ENCyclopedia Of DNA Elements (ENCODE) project, which in its
pilot phase yielded provocative new insights into the organization
and function of the human genome.
"Based on ENCODE's early success, we are moving forward with
a full-scale initiative to build a parts list of biologically functional
elements in the human genome," said NHGRI Director Francis
S. Collins, M.D., Ph.D. "The ENCODE pilot, which looked at
just 1 percent of the human genetic blueprint, produced findings
that are reshaping many long-held views about our genome. ENCODE's
effort to survey the entire genome will uncover even more exciting
surprises, providing us with a more complete picture of the biological
roots of human health and disease."
While the sequencing of the human genome was a major scientific
achievement, it was just the first step toward the ultimate goal
of using genomic information to diagnose, treat and prevent disease.
In recent years, researchers have made major strides in using DNA
sequence data to help find genes, which are the parts of the genome
that code for proteins. The protein-coding component of these genes,
however, makes up just a small fraction of the human genome — about
1.5 percent. There is strong evidence that other parts of the genome
have important functions, but very little information exists about
where these other functional elements are located and how they
work. The ENCODE project aims to fill this critical gap in
genomics research.
In June, the ENCODE research consortium published a set of landmark
papers in the journals Nature and Genome Research that found the
organization, function and evolution of the genome to be far more
complicated than most had suspected. For example, while researchers
have traditionally focused on studying genes and their associated
proteins, the ENCODE data indicate the genome is a very complex,
interwoven network in which genes are just one of many types of
DNA sequences with functional impact.
"We learned many valuable lessons from the ENCODE pilot project.
Among them was the importance of scientific teamwork," said
Elise A. Feingold, Ph.D., program director for ENCODE in NHGRI's
Division of Extramural Research. "Following the pilot's strong
example of multi-disciplinary collaboration, we are confident that
the scaled-up ENCODE team will succeed in its quest to build a
comprehensive catalog of the components of the human genome that
are crucial to biological function."
In addition to the research grants to support expansion of the
ENCODE project, NHGRI also announced awards today for two pilot-scale
projects, an ENCODE data coordination center, and six projects
to develop novel methods and technologies aimed at helping the
ENCODE project achieve its goals.
"As was the case for the Human Genome Project and the ENCODE
pilot, all of the data generated by the full-scale ENCODE project
will be deposited into public databases as soon as they are experimentally
verified," said Peter Good, Ph.D., program director for genome
informatics in NHGRI's Division of Extramural Research. "Free
and rapid access to this data will enable researchers around the
world to pose new questions and gain new insights into how the
human genome functions."
The principal investigators chosen to receive the ENCODE scale-up
grants are:
- Bradley Bernstein, M.D., Ph.D.; Broad Institute of MIT and
Harvard, Cambridge, Mass.; $4.8 million (four years); High-Throughput
Sequencing of Chromatin Regulatory Elements. Utilizing the technique
of chromatin immunoprecipitation followed by high-throughput
DNA sequencing, this team will map modifications of histones
in various types of human cells. Histones are proteins that play
a key role in DNA packaging.
- Gregory Crawford, Ph.D.; Duke University Institute for Genome
Sciences & Policy, Durham, N.C.; $6.5 million (four years); Comprehensive
Identification of Active Functional Elements in Human Chromatin.
These researchers will seek to identify and characterize regions
of open chromatin through DNase I hypersensitivity assays, formaldehyde-assisted
isolation of regulatory elements and chromatin immunoprecipitation
for a few key DNA-binding factors. Chromatin is the complex of
DNA and proteins that makes up chromosomes.
- Thomas Gingeras, Ph.D.; Affymetrix Inc., Santa Clara, Calif.;
$10.2 million (four years); Comprehensive Characterization and
Classification of the Human Transcriptome. This group will identify
protein-coding and non-protein-coding ribonucleic acid (RNA)
transcripts using microarrays, high-throughput sequencing, sequenced
paired-end ditags and sequenced cap analysis of gene expression
tags. RNA is an information molecule vital to a number of biological
functions, including protein production.
- Tim Hubbard, Ph.D.; Wellcome Trust Sanger Institute, Hinxton,
England; $8.5 million (four years); Integrated Human Genome Annotation:
Generation of a Reference Gene Set. Using computational methods,
manual annotation and targeted experiments, this team will annotate
gene features in the human genome. Such features include genes
that code for proteins; genes that are transcribed, but do not
code for proteins; and pseudogenes, which are DNA sequences similar
to normal genes, but which have been altered slightly so they
are not functional.
- Richard Myers, Ph.D.; Stanford University, Stanford, Calif.;
$14.6 million (four years); Global Annotation of Regulatory Elements
in the Human Genome. This group has two goals: to identify transcription
factor binding sites by using chromatin immunoprecipitation followed
by high-throughput sequencing, and to pilot the use of high-throughput
sequencing to determine the methylation status of CpG-rich regions
of the human genome. Transcription factors are proteins and enzymes
that initiate the transcription of a gene's DNA sequence into
RNA. Methylation refers to a specific chemical modification of
DNA, which can silence or reduce the activity of the affected
region of DNA.
- Michael Snyder, Ph.D.; Yale University, New Haven, Conn.; $11.5
million (four years); Production Center for Global Mapping of
Regulatory Elements. These researchers will identify transcription
factor binding sites in the human genome using chromatin immunoprecipitation,
followed by high-throughput sequencing.
- John Stamatoyannopoulos, M.D.; University of Washington, Seattle;
$9.7 million (four years); A Comprehensive Catalog of Human DNase
I Hypersensitive Sites. This team will map and functionally classify
DNase I hypersensitive sites across major human cell lineages.
It will do this using digital DNase I and histone modification
mapping by high-throughput sequencing. DNase I is an enzyme that
cleaves DNA at sites where it is exposed by regulatory proteins.
DNase I hypersensitive sites mark the location of regulatory
elements in the human genome.
The principal investigators chosen to receive the ENCODE pilot-scale
grants are:
- Scott Tenenbaum, Ph.D.; University at Albany-State University
of New York; $2.2 million (three years); Comprehensive Identification
of ENCODE RNA-based, Cis-regulatory Elements. In this pilot project,
researchers will strive to identify sites that are targets for
RNA-binding proteins through immunoprecipitation coupled with
microarrays and high-throughput sequencing.
- Zhiping Weng, Ph.D.; Boston University; $1.5 million (three
years); Identification of Transcriptional Factor-Binding Sites
in Human Promoters. This pilot project will aim to computationally
predict transcription factor binding sites that determine the
activities of promoters. Promoters are regions of DNA that serve
as binding sites for proteins that guide the initiation of transcription
of genes.
The Data Coordination Center for ENCODE will be led by:
- W. James Kent, Ph.D.; University of California, Santa Cruz;
$5 million (four years); The UCSC ENCODE Data Coordination Center.
This group will collect, organize, store, manage and provide
access to data from ENCODE and related projects.
The principal investigators chosen to receive technology development
grants are:
- Howard Chang, M.D., Ph.D.; Stanford University, Stanford, Calif.;
$1.3 million (three years); Structural Motifs in RNA. These researchers
will develop high-throughput methods to predict functional motifs
in RNA, to map RNA structure and to assign biological functions
to RNA motifs.
- Michael Dorschner, Ph.D.; University of Washington, Seattle;
$1.1 million (three years); High-Definition In Vivo Footprinting
via Single Molecule Sequencing. This group's goal is to develop
an in vivo method that utilizes single molecule sequencing to
identify sites of protein-DNA interaction by differential cleavage
sensitivity with ultraviolet light, dimethylsulfate and DNase
I.
- John Greally, Ph.D.; Albert Einstein College of Medicine, Bronx,
N.Y.; $1.5 million (three years); Massively Parallel Sequencing
Technology for the Epigenome. This team will work to develop
high-throughput sequencing methods to analyze methylation of
cytosine and to map histone modifications.
- Xiaoman Li, Ph.D.; Indiana University, Indianapolis; $870,000
(three years); Discovery of Cis-Regulatory Modules in the Human
Genome. This team will strive to develop computational methods
for identifying conserved cis-regulatory modules in non-protein-coding
regions of the human genome.
- Marcelo Nobrega, M.D., Ph.D.; University of Chicago; $1.5 million
(three years); Generation and In Vivo Validation of Cis-regulatory
Maps in Eukaryotic Genomes. The two goals of this group are:
to develop tagged DNA binding proteins that are recognizable
by tag-specific antibodies for use in mapping binding sites for
a wide range of proteins, and to develop platforms to test predicted
enhancers, silencers and insulators in the human genome.
- Yijun Ruan, Ph.D.; Genome Institute of Singapore; $990,000
(three years); Whole Genome Chromatin Interaction Analysis Using
Paired-End diTagging. This team will develop methods to characterize
long-range chromatin interactions involved in transcription using
high-throughput sequencing.
NHGRI is one of 27 institutes and centers at NIH, an agency of
the Department of Health and Human Services. NHGRI's Division of
Extramural Research supports grants for research and for training
and career development at sites nationwide. For more information
about NHGRI, visit www.genome.gov.
The National Institutes of Health (NIH) — The Nation's
Medical Research Agency — includes 27 Institutes and
Centers and is a component of the U.S. Department of Health and
Human Services. It is the primary federal agency for conducting
and supporting basic, clinical and translational medical research,
and it investigates the causes, treatments, and cures for both
common and rare diseases. For more information about NIH and
its programs, visit www.nih.gov.