ENCODE Consortium Publishes Scientific Strategy
New Technology Development Grants Will Aid Quest
To Find All Functional Elements in Human DNA
BETHESDA, Md., Thurs., Oct. 21, 2004 - A research consortium organized by the
National Human Genome Research Institute (NHGRI), part of the National Institutes
of Health (NIH), today published a paper in the journal Science detailing the
scientific rationale and strategy behind its quest to produce a comprehensive
catalog of all parts of the human genome crucial to biological function. Also
today, NHGRI announced the award of $5.5 million in technology development grants
to provide new tools for the pioneering effort.
In a peer-reviewed article published in the Oct. 22 issue of Science, the
ENCyclopedia Of DNA Elements (ENCODE) consortium outlines its plans for achieving
its ambitious goal of building a "parts list" of all sequence-based
functional elements in the human DNA sequence. The list will include: protein-coding
genes; non-protein-coding genes; regulatory elements involved in the control
of gene transcription; and DNA sequences that mediate chromosomal structure
and dynamics. The ENCODE researchers also anticipate they may uncover additional
functional elements that have yet to be recognized.
"Creating this monumental reference work will help us mine and fully utilize
the human genome sequence. Such knowledge will lead to a far deeper understanding
of human biology and stimulate the development of new strategies for improving
human health," said NHGRI Director Francis S. Collins, M.D., Ph.D.
While the completion of the Human Genome Project in April 2003, and the publication
of the finished human genome sequence in Nature just this week, marked significant
scientific achievements, these are only the first steps toward the ultimate
goal of using information about the human genome sequence to diagnose, treat
and prevent disease. Over the past several years, researchers have made major
strides in using DNA sequence data to help find genes, which are the parts of
the genome that code for proteins. The protein-coding component of these genes,
however, makes up just a small fraction of the human genome - about 1.5 percent.
There is strong evidence that other parts of the genome have important functions,
but very little information exists about where these other "functional
elements" are located and how they work. The ENCODE project aims to fill
this critical gap in genomics research.
Launched in September 2003, the ENCODE project is being implemented in three
phases: a pilot phase, a technology development phase and a production phase.
In the pilot phase, which is expected to last three years, ENCODE researchers
are devising and testing high-throughput ways of efficiently applying known
approaches to identify functional elements. Their collaborative efforts are
centered on 44 DNA targets, which together cover about 1 percent of the human
genome, or about 30 million base pairs. The target regions were strategically
selected to provide a representative cross section of the entire human genome
sequence. Simultaneously, in the second phase of the ENCODE project, the technology
development component, other research groups are working to widen the array of
methods and technologies that can be applied to the ENCODE project. Guided by
the results of the first two phases,
NHGRI will decide how to initiate the production phase and expand the ENCODE
project to analyze the remaining 99 percent of the human genome.
"Major challenges lie ahead on the road to a complete encyclopedia of
DNA elements," said Elise A. Feingold, Ph.D., NHGRI's program director
in charge of the ENCODE project. "Such work is well beyond the scope of
any single group. However, by bringing together researchers with a broad range
of interests and expertise to work in a highly collaborative setting, we expect
that the ENCODE consortium will have the power to achieve a goal of this magnitude."
Among the many hurdles facing the ENCODE consortium is the complexity of the
problem. No single experimental approach can be used to identify all functional
elements, and many current methods may not provide a cost-effective means of
finding functional elements in a target as large as the human genome. Furthermore,
many functional elements are only active in certain types of cells or at certain
stages of development, which means it may be necessary to analyze many different
types of human cells. In addition, if a truly comprehensive inventory is to
be created, more work needs to be done to learn about functional elements not
surveyed in the pilot project, including centromeres (the middles of chromosomes)
and telomeres (the ends of chromosomes). In their Science article, ENCODE researchers
set forth their plans for addressing these and other challenges.
NHGRI has designated the ENCODE project as a community resource project, which
means that all data generated for this project will be deposited in free, public
databases as soon as they are experimentally verified. "During the Human
Genome Project, our policy of rapid data release enabled researchers to take
advantage of human genomic sequence data as soon as they were produced. Similarly,
the ENCODE consortium will make valuable data rapidly available for use by scientists
around the world," said Mark S. Guyer, Ph.D., director of NHGRI's Division
of Extramural Research.
Also today, NHGRI announced the award of a second set of ENCODE technology
development grants, which are intended to complement the first set of technology
development grants made in 2003 by adding more novel methods and technologies
to the consortium's "tool box." "These grants are aimed at broadening
the types of functional elements that we are studying under ENCODE and also
expanding the portfolio of technologies that we can apply to them," said
Peter Good, Ph.D., NHGRI's program director for genome informatics.
Recipients of the 2004 ENCODE Technology Development Grants and their total
approximate funding are:
Joseph R. Ecker, Ph.D., The Salk Institute, La Jolla, Calif. "Genome
Wide Analysis of DNA Methylation" - $1.5 million (3 years)
Vishwanath Iyer, Ph.D., University of Texas, Austin "Sequence Tag Analysis
of Genomic Enrichment (STAGE) and Formaldehyde-Assisted Isolation of Regulatory
Elements (FAIRE) for Regulatory Element Identification" - $1.3 million
(3 years)
Yijun Ruan, Ph.D., Genome Institute of Singapore "Di-tag Technologies
for Complete Transcriptome Annotation" - $1 million (3 years)
Thomas Tullius, Ph.D., Boston University "Structure of Genomic DNA at
Single-Nucleotide Resolution" - $870,000 (3 years)
Madaiah Puttaraju, Ph.D., Intronn Inc., Gaithersburg, Md. "Use of RNA
Trans-splicing to Identify Splice Sites" - $420,000 (2 years)
Scott Tenenbaum, Ph.D., University at Albany, State University of New York
"Identifying Functional Regulatory Elements in RNA" - $410,000 (2 years)
The ENCODE consortium currently comprises several research teams in the
United States, as well as groups in Canada, Singapore, Spain and the United
Kingdom. The collaborative effort is open to all interested researchers in academia,
government and industry who agree to abide by the consortium's guidelines.
For more detailed information on the ENCODE project, including a complete list
of participants and the consortium's data release and accessibility policies,
go to: www.genome.gov/ENCODE. ENCODE data that can be directly linked to genomic
sequence will be made available at the University of California, Santa Cruz
ENCODE Genome Browser (www.genome.ucsc.edu/ENCODE) and the Ensembl Browser (www.ensembl.org).
NHGRI is one of 27 institutes and centers at NIH, an agency of the Department
of Health and Human Services. The NHGRI Division of Extramural Research supports
grants for research and for training and career development at sites nationwide.
Additional information about NHGRI can be found at: www.genome.gov.
Contact:
Geoff Spencer
NHGRI
(301) 402-0911
spencerg@mail.nih.gov
Last Reviewed: September 2006