ENCODE Consortium Publishes Scientific Strategy
New Technology Development Grants Will Aid Quest To Find All Functional Elements in Human DNA
Bethesda, Maryland - A research consortium organized by the National
Human Genome Research Institute (NHGRI), part of the National Institutes
of Health (NIH), today published a paper in the journal Science
detailing the scientific rationale and strategy behind its quest
to produce a comprehensive catalog of all parts of the human genome
crucial to biological function. Also today, NHGRI announced the
award of $5.5 million in technology development grants to provide
new tools for the pioneering effort.
In a peer-reviewed article published in the Oct. 22 issue of Science,
the ENCyclopedia Of DNA Elements (ENCODE) consortium outlines its
plans for achieving its ambitious goal of building a “parts
list” of all sequence-based functional elements in the human
DNA sequence. The list will include: protein-coding genes; non-protein-coding
genes; regulatory elements involved in the control of gene transcription;
and DNA sequences that mediate chromosomal structure and dynamics.
The ENCODE researchers also anticipate they may uncover additional
functional elements that have yet to be recognized.
“Creating this monumental reference work will help us mine
and fully utilize the human genome sequence. Such knowledge will
lead to a far deeper understanding of human biology and stimulate
the development of new strategies for improving human health,”
said NHGRI Director Francis S. Collins, M.D., Ph.D.
While the completion of the Human Genome Project in April 2003,
and the publication of the finished human genome sequence in Nature
just this week, marked significant scientific achievements, these
are only the first steps toward the ultimate goal of using information
about the human genome sequence to diagnose, treat and prevent disease.
Over the past several years, researchers have made major strides
in using DNA sequence data to help find genes, which are the parts
of the genome that code for proteins. The protein-coding component
of these genes, however, makes up just a small fraction of the human
genome, about 1.5 percent. There is strong evidence that other parts
of the genome have important functions, but very little information
exists about where these other “functional elements”
are located and how they work. The ENCODE project aims to close
this critical gap in genomics research.
Launched in September 2003, the ENCODE project is being implemented
in three phases: a pilot phase, a technology development phase and
a production phase. In the pilot phase, which is expected to last
three years, ENCODE researchers are devising and testing high-throughput
ways of efficiently applying known approaches to identify functional
elements. Their collaborative efforts are centered on 44 DNA targets,
which together cover about 1 percent of the human genome, or about
30 million base pairs. The target regions were strategically selected
to provide a representative cross section of the entire human genome
sequence. Simultaneously, in the second phase of the ENCODE project,
the technology development component, other research groups are
striving to develop new methods and technologies that can be applied
to the ENCODE project. Guided by the results of the first two phases, NHGRI
will decide how to initiate the production phase and expand the
ENCODE project to analyze the remaining 99 percent of the human
genome.
“Major challenges lie ahead on the road to a complete encyclopedia
of DNA elements,” said Elise A. Feingold, Ph.D., NHGRI’s
program director in charge of the ENCODE project. “Such work
is well beyond the scope of any single group. However, by bringing
together researchers with a broad range of interests and expertise
to work in a highly collaborative setting, we expect that the ENCODE
consortium will have the power to achieve a goal of this magnitude.”
Among the many hurdles facing the ENCODE consortium is the complexity
of the problem. No single experimental approach can be used to identify
all functional elements, and many current methods may not provide
a cost-effective means of finding functional elements in a target
as large as the human genome. Furthermore, many functional elements
are only active in certain types of cells or at certain stages of
development, which means it may be necessary to analyze many different
types of human cells. In addition, if a truly comprehensive inventory
is to be created, more work needs to be done to learn about functional
elements not surveyed in the pilot project, including centromeres
(the middles of chromosomes) and telomeres (the ends of chromosomes).
In their Science article, ENCODE researchers set forth their plans
for addressing these and other challenges.
NHGRI has designated the ENCODE project as a community resource
project, which means that all data generated for this project will
be deposited in free, public databases as soon as they are experimentally
verified. “During the Human Genome Project, our policy of
rapid data release enabled researchers to take advantage of human
genomic sequence data as soon as they were produced. Similarly,
the ENCODE consortium will make valuable data rapidly available
for use by scientists around the world,” said Mark S. Guyer,
Ph.D., director of NHGRI’s Division of Extramural Research.
Also today, NHGRI announced the award of a second set of ENCODE
technology development grants, which are intended to complement
the first set of technology development grants made in 2003 by adding
more novel methods and technologies to the consortium’s “tool
box.” “These grants are aimed at broadening the types
of functional elements that we are studying under ENCODE and also
expanding the portfolio of technologies that we can apply to them,”
said Peter Good, Ph.D., NHGRI’s program director for genome
informatics.
Recipients of the 2004 ENCODE Technology Development Grants and
their total approximate funding are:
Joseph R. Ecker, Ph.D., The Salk Institute, La Jolla, Calif.; “Genome Wide Analysis of DNA Methylation”; $1.5 million (3 years)
Vishwanath Iyer, Ph.D., University of Texas, Austin; “Sequence Tag Analysis of Genomic Enrichment (STAGE) and Formaldehyde-Assisted Isolation of Regulatory Elements (FAIRE) for Regulatory Element Identification”; $1.3 million (3 years)
Yijun Ruan, Ph.D., Genome Institute of Singapore; “Di-tag Technologies for Complete Transcriptome Annotation”; $1 million (3 years)
Thomas Tullius, Ph.D., Boston University; “Structure of Genomic DNA at Single-Nucleotide Resolution”; $870,000 (3 years)
Madaiah Puttaraju, Ph.D., Intronn Inc., Gaithersburg, Md.; “Use of RNA Trans-splicing to Identify Splice Sites”; $420,000 (2 years)
Scott Tenenbaum, Ph.D., University at Albany, State University of New York; “Identifying Functional Regulatory Elements in RNA”; $410,000 (2 years)
The ENCODE consortium currently comprises several research
teams in the United States, as well as groups in Canada, Singapore,
Spain and the United Kingdom. The collaborative effort is open to
all interested researchers in academia, government and industry
who agree to abide by the consortium’s guidelines.
For more detailed information on the ENCODE project, including
a complete list of participants and the consortium’s data
release and accessibility policies, go to: www.genome.gov/ENCODE.
ENCODE data that can be directly linked to genomic sequence will
be made available at the University of California, Santa Cruz ENCODE
Genome Browser (www.genome.ucsc.edu/ENCODE)
and the ENSEMBL Browser (www.ensembl.org).
NHGRI is one of 27 institutes and centers at NIH, an agency of
the Department of Health and Human Services. The NHGRI Division
of Extramural Research supports grants for research and for training
and career development at sites nationwide. Additional information
about NHGRI can be found at: www.genome.gov.