LARGE SCALE SEQUENCING CAPACITY

RELEASE DATE:  January 16, 2003                

RFA:  HG-03-002
 
National Human Genome Research Institute (NHGRI)
 (http://www.nhgri.nih.gov/)

LETTER OF INTENT RECEIPT DATE:  February 24, 2003

APPLICATION RECEIPT DATE:  April 7, 2003

THIS RFA CONTAINS THE FOLLOWING INFORMATION

o   Purpose of this RFA
o   Background
o   Research Objectives
o   Mechanism(s) of Support 
o   Funds Available
o   Eligible Institutions
o   Individuals Eligible to Become Principal Investigators
o   Where to Send Inquiries
o   Letter of Intent
o   Submitting an Application
o   Peer Review Process
o   Review Criteria
o   Receipt and Review Schedule
o   Award Criteria
o   Required Federal Citations

PURPOSE OF THIS RFA 

The NHGRI invites applications for cooperative agreements to support 
research centers capable of providing large-scale capacity to sequence 
the genomes of a variety of organisms of high biomedical interest.   It 
is expected that these large-scale sequencing centers will initially 
operate at the state of the art in terms of throughput, quality, and 
cost, and that they will continually improve that state of the art over 
the period of the award.
 
BACKGROUND

The NHGRI initiated a pilot project program to develop the capability 
for sequencing the human genome in 1995.  Based on the success of that 
program, several large-scale sequencing centers were funded in early 
1999 to determine the complete DNA sequence of the human genome by 
2005.  That effort is now on track to achieve this historic milestone 
by April 2003.  The success of the program was, in significant part, 
due to improvements in sequencing technology and strategy, and in the 
organization of the large-scale sequence production efforts.
  
Technology.  In 1995, commercially available sequencing machines used 
polyacrylamide slab gel technology to analyze products of four-color 
fluorescent cycle sequencing. Those instruments were succeeded in 1999 
by 96-capillary sequencers.  Use of the new instruments led to major 
increases in the output of sequence data and reductions in cost because 
they allowed significant reductions in the amount of labor and reagents 
required, and increases in the number of runs per day.  Their use also 
resulted in important increases in data quality by eliminating the 
lane-tracking problem that had plagued slab gel-based sequencing.  
Other important technological advances, including improved robotics for 
DNA preparation and sequencing reactions, new purification methods, 
better project tracking, and incremental improvements in chemistry, 
also contributed to the increased efficiency and output of large-scale 
genomic DNA sequencing in this period.

Strategy.  Important advances were also made in sequencing strategy 
during this time.  The public effort to sequence the human genome 
initially employed a hierarchical shotgun strategy based on shotgun 
sequencing of mapped large-insert clones.  A different approach, the 
whole-genome shotgun (WGS) strategy, originally developed for bacterial 
genomes and proposed for large genomes by Weber and Myers in 1997, was 
implemented to produce a high quality draft sequence of the Drosophila 
melanogaster genome in 1999 and was used in assembling a draft version 
of the human genome in 2000.  The development of whole-genome assembly 
software was critical to the success of whole genome shotgun 
sequencing.   With the publication of two draft versions of the human 
genome sequence, NHGRI concluded that a hybrid strategy employing both 
whole genome and hierarchical shotgun components is actually the best 
way currently available to efficiently achieve an accurate and useful 
finished assembly of a large genome with a complex repeat structure.  
Accordingly, a hybrid strategy was used for the sequencing of the mouse 
genome by the public consortium, which initiated the project with a WGS 
approach that quickly (15 months) led to a draft version of the genome 
sequence assembled and refined with one of two publicly available WGS 
assembly programs. At the same time, a BAC physical map of the mouse 
genome was developed.  Information from this map was used to anchor the 
sequence scaffolds and improve the whole genome assembly, and 
individual BAC clones from that map are now being used in the finishing 
of the mouse genomic sequence.  The current state of the art for 
sequencing whole genomes thus uses both hierarchical and WGS elements 
in a complementary fashion, varying the proportion of each and 
customizing the read types, insert sizes, and the use of map 
information based on the size and other characteristics of the target 
genome and the end-point being sought (e.g., finished or draft).  
 
Organization.  Yet another major contributor to the success of large-
scale genome sequencing was the centralization of the effort into a few 
large sequencing laboratories, as predicted by many at the outset of 
the HGP.  In the case of the NHGRI program, a number of centers were 
initially funded during a pilot program to develop and test different 
approaches to large-scale genomic sequencing.  Through subsequent 
rounds of competition, several of the approaches and groups that were 
most successful in implementing large-scale sequence production were 
expanded, and three very large-scale, highly efficient NHGRI-funded 
centers emerged. Similarly, a small number of other very large centers 
supported by other funding sources have established themselves around 
the world.

Programmatic Conclusions. Several lessons emerged during the sequencing 
of the human genome.  First, by implementing new technologies, 
sequencing costs in all large-scale centers dropped approximately ten-
fold between 1995 and 2002, and it is likely that further significant 
cost reductions can be gained by additional incremental technology or 
process improvements.  Second, a small number of truly large-scale 
centers have demonstrated that they can sequence mammalian-sized 
genomes rapidly and cost-effectively, simplifying the administrative 
and coordination issues involved in implementing a worldwide program 
for genomic sequencing.  Third, it does not appear that large-scale 
sequencing centers have yet reached a capacity limit where the 
efficiencies of scale are no longer realized.  On the basis of such 
experience, the NHGRI has concluded that sufficient capacity is now 
available so that one, or at most two, large centers can sequence a 
mammalian genome in one to three years, and there are few compelling 
reasons to divide a genome project among many centers.  NHGRI plans to 
continue to support genomic sequencing capacity at levels sufficient to 
meet the demand for sequencing high priority targets.  Because 
experience has shown that programmatic efficiency is obtained by 
concentrating capacity in a few large centers, the Institute's sequence 
production program will focus on a small number (three to four) of 
high-throughput, state-of-the-art centers.
 
The Research Network.  The NHGRI large-scale sequencing program has 
consisted of a set of cooperative agreements organized as a 
Research Network.  This was necessary at the start of the program when 
the centers were relatively small and the efforts of several sequencing 
groups were necessary to complete the sequence of the genome of a 
single organism.  Although that rationale is no longer applicable and 
it is anticipated that the awards made under the renewal of this 
program will be large enough to support a single center to sequence a 
genome of moderate to large size within the period of the award, NHGRI 
has determined that it will be useful to maintain the Research Network 
organization (see Terms and Conditions).  It is likely that there will 
still be benefits to close collaboration between two or more centers on 
certain large genomes. Most importantly, there is significant 
scientific benefit to close coordination between several large-scale 
sequencing centers on issues such as technology development, quality 
standards, and sequencing strategies.

RESEARCH OBJECTIVES

The features of a state-of-the-art large-scale sequencing center are:

o   An automated production pipeline with a minimum throughput of 20 
million attempted reads per year and a success rate of at least 80% 
(successful reads defined as having a minimum of 100 high quality 
bases, i.e., bases with quality scores of at least 20 using the phred 
base-calling program, or the equivalent);
o   A capacity to finish at least 15Mb of sequence per month;
o   An overall average read length for successful reads of at least 550 
high quality bases;
o   A high read pairing rate, if a strategy using paired end reads is 
employed; 
o   A cost per attempted production read of $1.50 or less. (Production 
read costs considered here are inclusive of amortized equipment, 
technology development, informatics, and indirect costs, as detailed 
below); 
o   The ability to produce both high quality draft genome assemblies and 
finished genome sequences: 
    o   The ability to use multiple genome sequencing strategies, 
    including clone-based and whole genome shotgun approaches, or 
    even more effective strategies; and the ability to adapt and 
    modify approaches as the technology and demands on the center 
    change;
    o   The ability to assemble genomic sequence at all scales 
    (large-insert clone through whole genome); 
    o   Expertise in efficiently closing and finishing large genomic 
    regions (of the size of mammalian chromosomes), with a total 
    cost per finished base of $0.05 or less above the cost of 
    draft sequence;
o   Integrated bioinformatics capabilities to support production, 
including systems and database administration, laboratory 
information management, and data handling and deposition; 
o   Ability to perform primary annotation of genomic sequence to 
maximize its utility for the biological community;
o   A technology development capability focused on developing and 
integrating technology improvements (e.g., robotics, chemistries, 
protocols) that lead to increased efficiency and decreased 
sequencing cost;
o   A proven track record in rapidly and efficiently releasing both 
sequence and trace data to public repositories; 
o   A high degree of flexibility and ability to interact with other 
publicly funded large-scale sequencing centers, public databases, 
genome analysis experts, physical mapping groups, and other entities 
with whom it may be necessary to integrate to be maximally 
productive in taking on very large projects.
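
As an illustration only, the read-level criteria in the list above can 
be expressed as a short calculation.  The sketch below is not an NHGRI 
tool or reporting format.  It assumes that a quality score "of 20" is 
applied as a threshold of 20 or greater, it uses the standard phred 
relation Q = -10 * log10(P) between a quality score Q and the 
estimated base-call error probability P, and all function names in it 
are hypothetical.

```python
# Illustrative sketch only; names and data layout are assumptions,
# not part of any NHGRI reporting format.

Q_THRESHOLD = 20     # assumed cutoff: a base is "high quality" at phred >= 20
MIN_HQ_BASES = 100   # minimum high quality bases for a successful read


def phred_error_probability(q):
    """Estimated error probability for phred score q (Q = -10 * log10(P))."""
    return 10 ** (-q / 10)


def high_quality_bases(qualities):
    """Count the bases in one read whose phred score meets the cutoff."""
    return sum(1 for q in qualities if q >= Q_THRESHOLD)


def summarize_reads(reads):
    """Summarize attempted reads.

    reads: list of per-base phred score lists, one list per attempted read.
    Returns (success_rate, mean_hq_length), where mean_hq_length is the
    average number of high quality bases over successful reads only.
    """
    hq_counts = [high_quality_bases(r) for r in reads]
    successful = [n for n in hq_counts if n >= MIN_HQ_BASES]
    success_rate = len(successful) / len(reads) if reads else 0.0
    mean_hq_length = sum(successful) / len(successful) if successful else 0.0
    return success_rate, mean_hq_length


def minimum_annual_yield(attempted_reads=20_000_000,
                         success_rate=0.80,
                         mean_hq_length=550):
    """High quality bases per year implied by the minimum figures above."""
    return attempted_reads * success_rate * mean_hq_length


# Four made-up reads: three clear the 100-base minimum, one does not.
example_reads = [
    [30] * 600,
    [25] * 520 + [10] * 80,
    [15] * 400,
    [40] * 150 + [5] * 300,
]
rate, mean_length = summarize_reads(example_reads)
print(f"success rate: {rate:.0%}, mean high quality length: {mean_length:.0f}")
print(f"minimum annual yield: {minimum_annual_yield():.3g} bases")
```

With the minimum figures listed above (20 million attempted reads, an 
80% success rate, and 550 high quality bases per successful read), the 
implied yield is about 8.8 billion high quality bases per year.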

The purpose of this RFA is to solicit applications for projects for 
continued large-scale, state-of-the-art production of genomic sequence, 
coupled with the likelihood of further improvement in cost, quality, 
and efficiency of large-scale sequencing over the term of the award.  

Target choice.  The NHGRI large-scale sequencing program has separated 
the process for selecting organisms to be sequenced from the process 
for funding the large-scale sequencing centers (see a description of 
the white paper process for identifying new target genomes at 
http://www.genome.gov/page.cfm?pageID=10002189).  Funded sequencing 
centers will choose sequencing targets from the list developed through 
the white-paper process, in consultation with NHGRI and the Scientific 
Advisory Panel of the large-scale sequencing Research Network (see 
Definitions below). Therefore, applications submitted in response to 
this RFA should focus on the technical aspects of production sequencing 
and should not propose specific sequencing targets. However, the 
applicant should discuss the center's internal process for choosing 
sequencing targets based on the white-paper process.  The applicant 
may, if desired, discuss areas of likely interest, including rationales 
for potential choices; this discussion could illuminate the applicant's 
motivation for sequencing or approach to using genomic sequence data 
(e.g., comparative genomics).

Current approaches to large-scale genome sequence production can be 
conceptualized in four general areas: shotgun sequencing read 
production (including the associated laboratory management and other 
informatics infrastructure), sequence finishing, analytical 
informatics, and technology development.  The application should be 
constructed so that each of these components is broken out and 
discussed separately.  Additionally, the budget requested for each 
component must be broken out and presented separately.  Applicants may 
propose appropriate collaborations and/or subcontracts for specific 
elements such as sequence read production, physical map production, 
assembly, annotation, etc.  Specific instructions on how to respond to 
these sequence production considerations are detailed below in the 
section SPECIAL APPLICATION GUIDANCE.

Shotgun sequence read production.  The application should describe the 
shotgun sequence production component of the proposed center in terms 
of the role that it will play in the overall sequencing objectives of 
the center, the strategy or strategies the center will take to generate 
shotgun sequence, the projected throughput, expected or needed read 
characteristics and quality, assembly strategy and all other pertinent 
factors.  The discussion should take into account the fact that NHGRI 
objectives will require that different genomic sequences will need to 
be finished to different degrees and that, therefore, it cannot be 
known until a genomic sequencing project is initiated what the 
requirements for finishing that particular sequence will be.  

In the presentation of the shotgun-sequencing component, the applicant 
should discuss all pertinent informatics issues.  These include, but 
are not limited to, the informatics infrastructure proposed for the 
center, such as the basic IT infrastructure/system administration, lab 
information management, and data handling/deposition, as well as the 
informatics required for the assembly of the shotgun sequence data up 
to the point at which it is handed off to the finishing component.  In 
all cases, software development should be described in detail.

Finished sequence production.  The applicant should describe all 
pertinent aspects of sequence finishing, defined as starting at handoff 
from the draft-sequencing component and proceeding through finishing 
projects and data deposition.  The applicant should also discuss map 
closure. Informatics and software development integral to this 
component should be discussed.

NHGRI appreciates that the needs for production of draft and finished 
sequence are no longer coupled; many genomes will be sequenced to 
draft quality, but will not need to be finished, or could even be 
partly or completely finished by a center other than the one that 
produced the draft. In addition, we anticipate that the quality of a 
draft genome is likely to become sufficiently high that there will be 
even less demand for finished sequence. However, for the foreseeable 
future, there is a clear need for the Institute's program to maintain 
some amount of finishing capacity and to maintain an appropriate 
balance of finishing and production activities among the funded 
centers as a group.

If an applicant chooses to propose only shotgun sequence production, 
then the question of how the necessary finishing needs can be met must 
be addressed, including how the draft product and any substrates for 
finishing will be archived or transferred.  Applications proposing a 
substantial finishing component in addition to shotgun sequence 
production should include a brief discussion of their ability to finish 
draft data generated by other groups, if desired. Applications 
proposing finishing without a substantial and efficient draft sequence 
production component will not be considered responsive to this RFA; 
while the demands for draft sequence are clear, there is currently no 
separate, well-defined backlog or pipeline of draft genomes that could 
be used by a stand-alone finishing center. 

Analytical informatics.  Most large-scale sequencing centers currently 
perform automated annotation of the genome sequences they produce. The 
appropriate level of annotation in sequencing centers strikes a balance 
between, on the one hand, providing a minimal level of annotation 
necessary to provide a useful product for the community and, on the 
other, rapidly releasing that sequence to the community to make it 
available for more extensive and complete annotations and analyses. The 
appropriate types of annotation carried out by sequencing centers 
usually include annotation of gaps, low quality bases, and other 
sequence quality measures; automated annotation of repeat sequences; 
and automated annotation of genes. Additionally, a modest amount of 
analysis of annotation features to perform quality assessment of 
sequence may be appropriate. 

Applicants to this RFA may propose appropriate automated annotation 
and, if so, should provide details in the application about the extent 
of annotation to be done, justifying choices based on the utility of 
the annotated sequence to the user community, and defining the point at 
which primary annotation (of projects, genome assemblies, finished 
genomes, etc.) will be considered to be complete enough for hand-off to 
the community.  More extensive analyses of biological features of 
sequence beyond those needed for assembly and appropriate primary 
annotation should not be requested in responses to this RFA.

Technology development.  Incremental technology improvements within 
centers have played an important role in increasing the efficiency and 
decreasing the cost of large-scale sequencing.  NHGRI encourages such 
technology development activities in this RFA. Plans and costs for 
technology improvement within the center must be well described and 
justified in terms of leading to a reduction in sequencing costs.   The 
cost of such technology development should be clearly described so that 
its contribution to the overall sequence production costs can both be 
reflected in per-read or per-base costs, and be separately identified 
and evaluated.

Crosscutting issues to be explicitly discussed in the application.  In 
addition to the four components outlined above, there are a number of 
other issues important to the successful operation of a state-of-the-
art large-scale sequencing center that should be discussed separately 
in the application:

1) Physical maps. If the proposed approach to genomic sequencing 
requires a physical map, the application must describe how the maps 
will be generated or acquired, and made available. The cost of 
production or acquisition of such maps must be clearly described so 
that its contribution to the overall sequence production costs can both 
be reflected in per-read or per-base costs, and be separately identified 
and evaluated.

2) Sequence quality.  The applicant should describe how s/he will 
ensure the quality of, or otherwise validate, the genomic sequence that 
will be produced, and all intermediate products (maps, libraries, 
reads, paired ends, assemblies, and finished or other end products). 
Evidence of the effectiveness of such quality assessment programs 
should be included. 

3) Sequence cost.  The applicant should describe plans for achieving 
continued reduction in sequencing cost.  Proposed cost analyses should 
be described in the same terms used in the format for reporting prior 
costs (see SPECIAL APPLICATION GUIDANCE), that is, in terms of the 
total costs, the fully loaded costs of shotgun reads, and the 
incremental costs per base of finishing.  The calculated costs of 
sequencing must take into account all of the expenses associated with 
sequence production (that is, the total costs for the grant).  In 
addition, the portion of shotgun read costs and finishing costs that 
are attributable to informatics infrastructure, assembly, annotation, 
and technology development should be identified.  
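
As an illustration only, the three cost figures named above (total 
costs, the fully loaded cost per shotgun read, and the incremental cost 
per base of finishing) and the share attributable to each cost category 
could be derived along the following lines.  This is a sketch, not the 
reporting format referred to in this RFA: the function name, the 
category names, and every dollar amount are hypothetical, and the 
finished-base count simply assumes 15 Mb per month for twelve months.

```python
# Illustrative sketch only; all names and dollar figures are hypothetical.

def cost_summary(shotgun_costs, finishing_costs, attempted_reads, finished_bases):
    """Derive per-read and per-base costs from category totals.

    shotgun_costs, finishing_costs: dicts mapping a cost category to the
    dollars charged to that activity over the reporting period.
    """
    shotgun_total = sum(shotgun_costs.values())
    finishing_total = sum(finishing_costs.values())
    return {
        # All expenses associated with sequence production.
        "total_cost": shotgun_total + finishing_total,
        # Fully loaded cost of one attempted shotgun read.
        "cost_per_attempted_read": shotgun_total / attempted_reads,
        # Incremental finishing cost per base, above the cost of draft.
        "finishing_cost_per_base": finishing_total / finished_bases,
        # Fraction of each activity's cost attributable to each category.
        "shotgun_share": {k: v / shotgun_total for k, v in shotgun_costs.items()},
        "finishing_share": {k: v / finishing_total for k, v in finishing_costs.items()},
    }


# Hypothetical one-year figures, in dollars.
shotgun = {
    "production": 18_000_000,
    "informatics_infrastructure": 4_000_000,
    "assembly": 1_000_000,
    "annotation": 500_000,
    "technology_development": 2_500_000,
}
finishing = {
    "production": 6_000_000,
    "informatics_infrastructure": 1_000_000,
    "technology_development": 200_000,
}

report = cost_summary(shotgun, finishing,
                      attempted_reads=20_000_000,
                      finished_bases=15_000_000 * 12)
print(f"total cost: ${report['total_cost']:,}")
print(f"cost per attempted read: ${report['cost_per_attempted_read']:.2f}")
print(f"finishing cost per base: ${report['finishing_cost_per_base']:.3f}")
```

Keeping the categories separate in the inputs allows the same figures 
to be reported both as part of the per-read or per-base cost and as 
separately identified amounts.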

4) Management Plan.  The management of a sequencing center requires a 
significant commitment by the Principal Investigator (P.I.) of the 
project.  The P.I. of a large-scale project funded under this RFA is 
expected to devote at least 30% effort to the project.   The 
application should describe the management plan for the proposed 
center, and how it will support the goals proposed.  It should describe 
the organization of the proposed center and its management structure, 
including integration of the separate components to form an efficient 
pipeline, key personnel, section leaders, and reporting relationships.  
Recruitment and training of personnel should be discussed. The plan 
should describe how the various components of the proposed center will 
be integrated, and how collaborations or subcontracts, if proposed, 
will be managed.  The issue of how any other, ongoing large-scale 
sequencing projects would be integrated with the one to be funded under 
this RFA should be discussed.

5) Data release.  NHGRI strongly endorses rapid release of genomic data 
and materials.  The NHGRI policy on release of sequence data is 
available at http://www.genome.gov/page.cfm?pageID=10000910.
Applicants should also be familiar with the NIH statements regarding
intellectual property of resources developed with Federal 
funds (http://www.ott.nih.gov/policy/rt_guide_final.html).
Responses to this RFA should propose a plan for data release, as the quality of
the data release plan will be a criterion in the review of the application. 
Appropriate data release plans will be made a condition of the awards made as a
result of this RFA.  Each of the following items should be discussed separately:

o Release of sequence read and trace data
o Release of assembled projects (large-insert clone assemblies)
o Release of whole genome shotgun assemblies
o Release of annotation performed by the center
o Availability of map information
o Availability of software developed with funds from this award
o Availability of resources developed (e.g., 10 kb libraries, fosmids)
o Availability of technologies developed with funds from this award

6) Training.  The applicant must propose a plan in response to NHGRI's 
Action Plan for increasing the number of underrepresented minorities in 
genome research.  Please see http://www.genome.gov/page.cfm?pageID=10003996
for a description of the Action Plan.  An application that is rated highly
for its scientific program will not be funded until its response to the
Action Plan is deemed to be adequate by peer review.  

NHGRI recognizes that it is possible that by the time applications are 
submitted to this RFA there may be superior alternatives to the current 
state-of-the-art of technology platforms; if so, NHGRI encourages 
applicants to propose incorporating such technology as the core of a 
large-scale sequencing center.  In that case, the applicant must 
provide justification, including any preliminary data supporting the new
technology, that such new technology would result in a sequence production
process that would equal, or exceed, the current process in terms of
throughput, success rate, quality, and cost per read (or cost per base).  

In summary, applicants for awards under this RFA:

o   should focus primarily on all aspects of genome sequence production, 
explicitly describing how the proposed center will attain or exceed 
the current state-of-the-art in throughput, quality, and cost using 
the information requested under the SPECIAL GUIDANCE FOR APPLICANTS 
(below) as a format;
o   should describe how the center will improve on the state-of-the-art, 
with the goal of reducing costs; 
o   should use explicit milestones and timelines when describing 
production goals;
o   should discuss major production goals in relation to the state-of-
the-art outlined above; 
o   should use the web-based progress and cost reporting spreadsheets as 
described in the SPECIAL GUIDANCE FOR APPLICANTS;
o   should propose appropriate technology development that will improve 
efficiency;
o   should explicitly discuss informatics infrastructure needed for 
sequence production;
o   should propose assembly and/or primary annotation, but not extensive 
biological analyses;  
o   should explicitly discuss center management and how various 
components of the center will be integrated; 
o   should provide information about the applicant's prior experience in 
large-scale genomic sequencing and available resources;
o   should propose a plan for assessing the quality of sequence and 
sequence assemblies;
o   should discuss the general approaches or strategies to be used for 
sequencing large genomes, and also rationales or motivations that 
will be used to select genomes identified by the NHGRI white paper 
process (see http://www.genome.gov/page.cfm?pageID=10002189), 
without choosing in advance a specific target genome;
o   should discuss interactions, collaborations, or subcontracts that 
may be appropriate (e.g., in support of sequencing large genomes, for 
interactions with interested model organism communities, for 
physical mapping, or for assembly);
o   should propose a comprehensive data release plan;
o   should propose a plan for training, especially of under-represented 
minorities.

MECHANISM OF SUPPORT
 
This RFA will use the NIH Specialized Center -- Cooperative Agreement 
(U54) award mechanism.  This RFA is a one-time solicitation.  The 
anticipated award date is November 2003.
 
The U54 is a cooperative agreement award mechanism in which the 
Principal Investigator retains the primary responsibility and dominant 
role for planning, directing, and executing the proposed project, with 
NIH staff being substantially involved as a partner with the Principal 
Investigator, as described under the section "Cooperative Agreement 
Terms and Conditions of Award."

FUNDS AVAILABLE

Three to four large-scale sequencing centers will be funded for a 
three-year term.  Toward the end of this period, the NHGRI large-scale 
sequencing effort will be evaluated to determine the priority of 
genomic sequence and improvements in sequencing technology relative to 
other research priorities. 

The NHGRI large-scale sequencing effort was funded at a level of about 
$190M in fiscal year 2002.  The planning process that NHGRI is 
currently undergoing has highlighted many other vital and exciting 
research opportunities that have been afforded by the availability of 
genomic sequence data.  As the planning process has not been completed 
at the time of the publication of this RFA, it is not yet possible to 
definitively state the proportion of the NHGRI budget that will be used 
for large-scale sequencing.  However, it is not expected to increase, 
and indeed may decrease somewhat as new research priorities are 
defined.  In spite of this, the expectation is that the output of the 
NHGRI sequencing program will not decrease and it is hoped it will 
increase because of reduced sequencing costs.    If more precise 
information about the funds available to support awards made in 
response to this RFA is available before the applications are due, an 
updated notice will be published in the NIH Guide for Grants and Contracts.
 
Awards pursuant to this RFA are contingent on the availability of funds 
appropriated to the NHGRI and the receipt of a sufficient number of 
meritorious applications.
 
ELIGIBLE INSTITUTIONS
 
You may submit (an) application(s) if your institution has any of the 
following characteristics:
        
o   For-profit or non-profit organizations 
o   Public or private institutions, such as universities, colleges, 
hospitals, and laboratories 
o   Units of state and local governments
o   Eligible agencies of the Federal government  
o   Domestic 

Foreign institutions are not eligible.
 
INDIVIDUALS ELIGIBLE TO BECOME PRINCIPAL INVESTIGATORS   

Any individual with the skills, knowledge, and resources necessary to 
carry out the proposed research is invited to work with his/her 
institution to develop an application for support.  Individuals from 
underrepresented racial and ethnic groups as well as individuals with 
disabilities are always encouraged to apply for NIH programs.   
 
SPECIAL REQUIREMENTS 

In describing the research plan, the applicant must address the issues 
and questions in the format described in the Special Guidance to 
Applicants section below.  

Cooperative Agreement Terms and Conditions of Award 

The following terms and conditions will be incorporated into the award 
statement and will be provided to the Principal Investigator, as well 
as the appropriate institutional official, at the time of award.  The 
following special terms of award are in addition to, and not in lieu 
of, otherwise applicable OMB administrative guidelines, HHS grant 
administration regulations at 45 CFR Parts 74 and 92 [Part 92 is 
applicable when State and local Governments are eligible to apply], and 
other HHS, PHS, and NIH grant administration policies: 

1.  The administrative and funding instrument used for this program 
will be the Specialized Center -- Cooperative Agreement (U54).  The 
cooperative agreement is an "assistance" mechanism (rather than an 
"acquisition" mechanism), in which substantial NIH scientific and/or 
programmatic involvement with the awardee is anticipated during the 
performance of the activity. Under the Cooperative Agreement, the NIH 
purpose is to support and/or stimulate the recipient's activity by 
involvement in and otherwise working jointly with the award recipient 
in a partner role, but it is not to assume direction, prime 
responsibility, or a dominant role in the activity. Consistent with 
this concept, the dominant role and prime responsibility for the 
activity resides with the awardee(s) for the project as a whole, 
although specific tasks and activities in carrying out the study will 
be shared among the awardee(s) and the NHGRI Program Director.

2. P.I. Rights and Responsibilities: 

The P.I. will have the primary responsibility for defining the details 
for the sequencing production center within the guidelines of RFA HG-
03-002 and for performing the scientific activities. The P.I. will 
agree to accept close coordination, cooperation, and participation of 
NHGRI staff in those aspects of scientific and technical management of 
the project as described under "NIH Program Staff Responsibilities."
 
The P.I. of a genome sequencing production center will: 

o   Determine experimental approaches, design protocols, set project 
milestones and conduct experiments;
o   Provide goals for throughput, quality, and cost to the NHGRI as 
requested (usually at the outset of the award and in six-month 
progress reports, but also at other times as requested by NHGRI 
program staff); 
o   Ensure that the genomic sequence produced meets the quality 
standards and costs agreed upon at the time of award; 
o   Ensure that the sequence data (reads, assemblies) are deposited in 
the appropriate public database (e.g., GenBank or other, as 
specified by NHGRI program staff), that resources developed as part 
of this project are made publicly available according to NHGRI 
policies, and that results are published in a timely manner;
o   Adhere to the NHGRI policies regarding intellectual property, data 
release and other policies that might be established during the 
course of this activity;
o   Integrate with the NHGRI large-scale sequencing white paper process 
for selecting target genomes (see 
http://www.genome.gov/page.cfm?pageID=10002189);
o   Submit data for quality assessment in any manner specified by the 
Steering Committee or the Scientific Advisory Panel; 
o   Submit periodic progress reports in a standard format, as agreed 
upon by the Steering Committee and the Scientific Advisory Panel;
o   Accept and implement any other common guidelines and procedures 
developed for the NHGRI large-scale sequencing program and approved 
by the Steering Committee and the Scientific Advisory Panel;
o   Accept and participate in the cooperative nature of the Genome 
Sequencing Research Network;
o   Coordinate and collaborate with other U.S. and international groups 
sequencing large genomes; 
o   Inform the Program Director of all major interactions of members of 
the Steering Committee;
o   Attend Steering Committee meetings;
o   Lead the center's efforts to respond to the NHGRI Action Plan for 
increasing the representation of under-represented minorities in 
genome research.

3. NHGRI Program Staff Responsibilities: 

The NHGRI Program Director will have substantial 
scientific/programmatic involvement during the conduct of this activity 
through technical assistance, advice and coordination.  However, the 
role of NHGRI will be to facilitate and not to direct the activities. 
It is anticipated that decisions in all activities will be reached by 
consensus of the Genome Sequencing Research Network and that NHGRI 
staff will be given the opportunity to offer input to this process.  
One NHGRI Program Director shall participate as a member of the 
Steering Committee and will have one vote.

The Program Director will: 

o   Participate (with the other Steering Committee members) in the group 
process of setting research priorities, deciding optimal research 
approaches and protocol designs, and contributing to the adjustment 
of research protocols or approaches as warranted. The Program 
Director will assist and facilitate the group process and not direct it; 
o   Negotiate throughput, quality, and cost goals with the awardees as 
necessary;
o   Serve as a liaison between the awardees and the Scientific Advisory 
Panel, the National Advisory Council for Human Genome Research, and 
the larger community in helping the awardee(s) select targets from 
the list developed by the white paper process;
o   Coordinate the efforts of the awardees with other participants in 
the NHGRI large-scale sequencing program, including other awardees 
under this RFA and those awardees involved in the NHGRI BAC library 
programs (library production, characterization, and construction of 
physical maps; see URL http://www.genome.gov/page.cfm?pageID=10001691);
with other U.S. large-scale sequencing efforts, and with the international 
sequencing community; as well as with the larger biological community;
o   Attend all Steering Committee meetings as a voting member and assist 
in developing operating guidelines, quality control procedures, and 
consistent policies for dealing with recurrent situations that 
require coordinated action;
o   Schedule the time for, and prepare concise (3 to 4 pages) summaries 
of, the Steering Committee meetings, which will be delivered to 
members of the group within 30 days after each meeting;  
o   Periodically report progress to the Director, NHGRI; 
o   Lend relevant expertise and overall knowledge of NHGRI-sponsored 
research to facilitate the selection of scientists not affiliated 
with the awardee institutions who are to serve on the Scientific 
Advisory Panel and the Steering Committee; 
o   Serve as liaison between the Steering Committee and the Scientific 
Advisory Panel;
o   Serve on subcommittees of the Steering Committee and the Scientific 
Advisory Panel, as appropriate; 
o   Provide advice in the management and technical performance of the 
investigation; 
o   Assist in promoting the availability of the genome sequence and 
related resources developed in the course of this project to the 
scientific community at large;  
o   Participate in data analyses, interpretations, and, where warranted, 
co-authorship of the publication of results of studies conducted 
through the Genome Sequencing Research Network; 
o   Assist awardees in the development, if needed, of policies for 
dealing with situations that require coordinated action; 
o   Retain the option to recommend, with the advice of the Scientific 
Advisory Panel, the withholding or reduction of support from any 
cooperative agreement that substantially fails to achieve its goals 
according to the milestones agreed to at the time of award, fails to 
maintain state-of-the-art capabilities, or fails to comply with the 
Terms and Conditions of the award. 

An NHGRI Program Director will be responsible for the normal 
stewardship of this award; this same Program Director may, in addition, 
be substantially involved as described above.

4. Collaborative Responsibilities: Steering Committee

The Steering Committee will serve as the main governing board of the 
Genome Sequencing Research Network.  The Steering Committee membership 
will include the NHGRI Program Director and the P.I. of each awarded 
cooperative agreement.  Additional members may be added by action of 
the Steering Committee. Other government staff may attend the Steering 
Committee meetings, if their expertise is required for specific discussions. 

The Steering Committee will:
 
o   Discuss progress in meeting the research community's need for 
genomic sequence.
o   Help to develop uniform procedures and policies, for example for 
data quality measures and assessment, nomenclature and annotation 
conventions for data depositions, and so forth. Members of the 
Steering Committee will be required to accept and implement the 
common guidelines and procedures approved by the Steering Committee, 
Program Director, and Scientific Advisory Panel.
o   Serve as a venue for coordination on improving the state of the art, 
for example by reporting progress, disseminating best practices and 
collectively evaluating new procedures, resources, and technologies.
o   Serve, in appropriate subgroups, as a coordinating body in cases 
where two or more centers are collaborating closely on sequencing a 
single large genome, where for example common policies and data 
exchange are critical to the success of the effort.

5. Scientific Advisory Panel 

The Scientific Advisory Panel (SAP) will be responsible for reviewing 
and evaluating the progress of the members of the Genome Sequencing 
Research Network toward meeting their individual and collective goals. 
The SAP will provide recommendations to the Director, NHGRI, about 
continued support of the components of the Genome Sequencing Research 
Network. The Advisory Panel is composed of four to six senior 
scientists with relevant expertise who are not P.I.s of a cooperative 
agreement involved in the Genome Sequencing Research Network. The 
membership of the Scientific Advisory Panel may be enlarged 
permanently, or on an ad hoc basis, as needed.  

The Scientific Advisory Panel will meet at least once a year. During 
part of this meeting, there will be a joint meeting with the Steering 
Committee to allow the Scientific Advisory Panel members to interact 
directly with the awardees. Annually, the Scientific Advisory Panel 
will make recommendations to the Director, NHGRI, regarding the 
progress of the Genome Sequencing Research Network and will advise on 
any changes that may be necessary in the program. 

6. Arbitration Process 

Any disagreement that may arise on scientific/programmatic matters 
(within the scope of the award) between award recipients and the NHGRI 
may be brought to arbitration.  An Arbitration Panel will be composed 
of (i) a designee of the Steering Committee chosen without the NHGRI 
staff voting, (ii) one NHGRI designee, and (iii) a third designee with 
relevant expertise who is chosen by the other two (in the case of an 
individual disagreement, the first member may be chosen by the 
individual awardee). The Arbitration Panel will help resolve both 
scientific and programmatic issues that develop during the course of 
work and that restrict progress.  This special arbitration procedure 
in no way affects the awardee's right to appeal an adverse action that 
is otherwise appealable in accordance with NIH regulations at 42 CFR 
Part 50, Subpart D, and HHS regulations at 45 CFR Part 16. 
 
7. Yearly Milestones 

Each awardee will be asked to define a set of yearly milestones at the 
time of the award and to update these milestones annually at the 
anniversary date. These will be made a condition of the award.  In 
accord with the procedures described above, NHGRI may withhold or 
reduce funds for a project that substantially fails to meet its 
milestones or to maintain the state of the art. 

WHERE TO SEND INQUIRIES

We encourage inquiries concerning this RFA and welcome the opportunity 
to answer questions from potential applicants.  Inquiries may fall into 
three areas – programmatic/scientific, peer review, and financial or 
grants management issues:

Direct inquiries regarding programmatic issues to:

Dr. Jane L. Peterson or Dr. Adam Felsenfeld
Division of Extramural Research 
National Human Genome Research Institute 
National Institutes of Health 
Building 31, Room B2B07 MSC 2033
Bethesda, MD 20892-2033
Telephone:  (301) 496-7531 
FAX:  (301) 480-2770 
E-mail:  Jane_Peterson@nih.gov; Adam_Felsenfeld@nih.gov 

Direct inquiries regarding peer review issues to:

Dr. Rudy Pozzatti
Scientific Review Administrator 
Office of Scientific Review 
National Human Genome Research Institute 
National Institutes of Health 
Building 31, Room B2B37, MSC 2032
Bethesda, MD 20892-2032
Telephone:  (301) 402-0838
Fax:  (301) 435-1580
E-mail:  Rudy_Pozzatti@nih.gov 

Direct inquiries regarding fiscal matters to:

Ms. Jean Cahill 
Grants Management Officer
Grants Administration Branch 
National Human Genome Research Institute 
Building 31, Room B2B34, MSC 2031
Bethesda, MD 20892-2031
Telephone:  (301) 435-7858
FAX:  (301) 402-1951 
E-mail:  Jean_Cahill@nih.gov 

LETTER OF INTENT
 
Prospective applicants are asked to submit a letter of intent that 
includes the following information:

o Descriptive title of the proposed research
o Name, address, and telephone number of the Principal Investigator
o Names of other key personnel 
o Participating institutions
o Number and title of this RFA 

Although a letter of intent is not required, is not binding, and does 
not enter into the review of a subsequent application, the information 
that it contains allows IC staff to estimate the potential review 
workload and plan the review.
 
The letter of intent is to be sent by the date listed at the beginning 
of this document.  The letter of intent should be sent to:

Dr. Jane L. Peterson 
Division of Extramural Research 
National Human Genome Research Institute 
National Institutes of Health 
Building 31, Room B2B07 MSC 2033
Bethesda, MD 20892-2033
Telephone:  (301) 496-7531 
FAX:  (301) 480-2770 
E-mail:  Jane_Peterson@nih.gov 

SUBMITTING AN APPLICATION

Applications must be prepared using the PHS 398 research grant 
application instructions and forms (rev. 5/2001).  The PHS 398 is 
available at http://grants.nih.gov/grants/funding/phs398/phs398.html in 
an interactive format.  For further assistance contact GrantsInfo, 
Telephone (301) 435-0714, Email: GrantsInfo@nih.gov.

Applicants must use the web-based cost reporting format and are 
encouraged to use the progress report format as described in the 
SPECIAL GUIDANCE FOR APPLICANTS as part of the Progress Report and 
Research Proposal sections. 
 
USING THE RFA LABEL: The RFA label available in the PHS 398 (rev. 
5/2001) application form must be affixed to the bottom of the face page 
of the application.  Type the RFA number on the label.  Failure to use 
this label could result in delayed processing of the application such 
that it may not reach the review committee in time for review.  In 
addition, the RFA title and number must be typed on line 2 of the face 
page of the application form and the YES box must be marked. The RFA 
label is also available at: 
http://grants.nih.gov/grants/funding/phs398/label-bk.pdf.
 
SENDING AN APPLICATION TO THE NIH: Submit a signed, typewritten 
original of the application, including the Checklist, and three signed 
photocopies in one package to:
 
Center for Scientific Review
National Institutes of Health
6701 Rockledge Drive, Room 1040, MSC 7710
Bethesda, MD  20892-7710
Bethesda, MD  20817 (for express/courier service)
 
At the time of submission, two additional copies of the application 
must be sent to:

Dr. Rudy Pozzatti
Scientific Review Administrator 
Office of Scientific Review 
National Human Genome Research Institute 
National Institutes of Health 
Building 31, Room B2B37, MSC 2032
Bethesda, MD 20892-2032
Telephone:  (301) 402-0838
Fax:  (301) 435-1580
E-mail:  Rudy_Pozzatti@nih.gov

APPLICATION PROCESSING: Applications must be received by the 
application receipt date listed in the heading of this RFA.  If an 
application is received after that date, it will be returned to the 
applicant without review.
 
The Center for Scientific Review (CSR) will not accept any application 
in response to this RFA that is essentially the same as one currently 
pending initial review, unless the applicant withdraws the pending 
application.  The CSR will not accept any application that is 
essentially the same as one already reviewed. This does not preclude 
the submission of substantial revisions of applications already 
reviewed, but such applications must include an Introduction addressing 
the previous critique.

PEER REVIEW PROCESS  
 
Upon receipt, applications will be reviewed for completeness by the CSR 
and for responsiveness by the NHGRI.  

Incomplete applications will be returned to the applicant without 
further consideration.  If the application is not responsive to 
the RFA, CSR staff may contact the applicant to determine whether to 
return the application to the applicant or submit it for review in 
competition with unsolicited applications at the next appropriate NIH 
review cycle.

Applications that are complete and responsive to the RFA will be 
evaluated for scientific and technical merit by an appropriate peer 
review group convened by the NHGRI in accordance with the review 
criteria stated below.  As part of the initial merit review, all 
applications will:

o   Receive a written critique
o   Undergo a process in which only those applications deemed to have 
the highest scientific merit, generally the top half of the 
applications under review, will be discussed and assigned a priority score
o   Receive a second level review by the National Advisory Council for 
Human Genome Research. 

REVIEW CRITERIA

The goals of NIH-supported research are to advance our understanding of 
biological systems, improve the control of disease, and enhance health.  
In the written comments, reviewers will be asked to discuss the 
following aspects of the application in order to judge the likelihood 
that the proposed research will have a substantial impact on the 
pursuit of these goals: 

o   Significance 
o   Approach 
o   Innovation
o   Investigator
o   Environment
  
The scientific review group will address and consider each of these 
criteria in assigning the application's overall score, weighting them 
as appropriate for each application.  The application does not need to 
be strong in all categories to be judged likely to have major scientific
impact and thus deserve a high priority score.  For example, it may propose
to carry out important work that by its nature is not innovative but is
essential to move a field forward.

(1) SIGNIFICANCE:  Does the proposal address the Research Objectives 
outlined in this RFA? How much will the proposed center contribute to 
the NHGRI-supported large-scale genomic sequencing program? What is the 
potential for further increases in the efficiency of large-scale 
sequencing beyond current practices?  Will the output of the proposed 
center make a significant contribution to the availability of genome 
sequences to the community?

(2) APPROACH:  Are the conceptual framework, design, methods, and 
analyses adequately developed, well integrated, and appropriate to the 
aims of the project?  Are they likely to lead to successful attainment 
of the stated goals? Does the applicant acknowledge potential problem 
areas and consider alternative tactics? Are the proposed milestones 
reasonable?  Is the quality assessment/validation plan adequate?

(3) INNOVATION:  Does the project employ novel concepts, approaches or 
methods to improve technology for genomic sequencing, aimed at reducing 
costs and/or increasing throughput or quality? 

(4) INVESTIGATOR:  Is the applicant appropriately trained and well 
suited to carry out this work?  

(5) ENVIRONMENT:  Does the scientific environment in which the proposed 
work will be done contribute to the probability of success?  Does the 
proposed center take advantage of unique features of the scientific 
environment or employ useful collaborative arrangements?  Is there 
evidence of institutional support?

ADDITIONAL REVIEW CRITERIA: In addition to the above criteria, 
applications received in response to RFA HG-03-002 will also be 
reviewed with respect to the following:

o   The likelihood that the proposed center can produce high-quality 
genome sequence, based on past experience and future plans for 
generating high quality read data, accurate subassemblies and whole 
genome assemblies, and finished sequence at and beyond state-of-the-
art levels of throughput, quality, and cost. 
o   The quality of the plan to continue increasing throughput while 
lowering costs.
o   The quality of the plan for technology development and identifying 
and solving critical integration problems.
o   The quality of the plans for bioinformatics, including 
infrastructure/laboratory information management, assembly, and 
primary annotation.
o   The quality of the applicant's approach to sequencing, including 
considerations for choosing target genomes, sequencing 
approach/strategy, degree of completion, and considerations that may 
arise due to the variety of potential target genomes available.
o   The quality of the plan for release of sequence data, including 
evidence that the systems are in place to support data release, and 
the plans for release or distribution of other resources, software, 
or technologies developed under this award.
o   The quality of the plans to coordinate efforts with other large-
scale sequencing centers in the U.S. and abroad, and with 
appropriate subcontractors or collaborators that may be needed.  
o   The track record of the Principal Investigator and other key 
personnel in large-scale genomic sequencing.
o   The reasonableness of the proposed budget, milestones, timelines, 
and goals in relation to the proposed research.

RECEIPT AND REVIEW SCHEDULE

Letter of Intent Receipt Date:  February 24, 2003
Application Receipt Date:  April 7, 2003
Peer Review Date:  May/June 2003
Council Review:  September 2003
Earliest Anticipated Start Date:  November 2003

AWARD CRITERIA

Award criteria that will be used to make award decisions include:

o   Scientific merit (as determined by peer review);
o   Availability of funds;
o   Programmatic priorities, including balance within the sequencing 
program, so that it can meet variable sequencing target goals, and 
balance between the sequencing program and other NHGRI activities;
o   The likelihood that the proposed center will make a significant 
contribution to the availability of high-quality sequenced genomes 
to the community; 
o   The prospect for attaining and improving the state of the art in 
genome sequencing with regard to throughput, quality, and cost.

REQUIRED FEDERAL CITATIONS 

INCLUSION OF WOMEN AND MINORITIES IN CLINICAL RESEARCH: It is the 
policy of the NIH that women and members of minority groups and their 
sub-populations must be included in all NIH-supported clinical research 
projects unless a clear and compelling justification is provided 
indicating that inclusion is inappropriate with respect to the health 
of the subjects or the purpose of the research. This policy results 
from the NIH Revitalization Act of 1993 (Section 492B of Public Law 103-43).

All investigators proposing clinical research should read the AMENDMENT 
"NIH Guidelines for Inclusion of Women and Minorities as Subjects in 
Clinical Research - Amended, October, 2001," published in the NIH Guide 
for Grants and Contracts on October 9, 2001 
(http://grants.nih.gov/grants/guide/notice-files/NOT-OD-02-001.html);
a complete copy of the updated Guidelines is available at 
http://grants.nih.gov/grants/funding/women_min/guidelines_amended_10_2001.htm.
The amended policy incorporates the use of an NIH definition 
of clinical research; updated racial and ethnic categories in 
compliance with the new OMB standards; clarification of language 
governing NIH-defined Phase III clinical trials consistent with the new 
PHS Form 398; and updated roles and responsibilities of NIH staff and 
the extramural community.  The policy continues to require for all NIH-
defined Phase III clinical trials that: a) all applications or 
proposals and/or protocols must provide a description of plans to 
conduct analyses, as appropriate, to address differences by sex/gender 
and/or racial/ethnic groups, including subgroups if applicable; and b) 
investigators must report annual accrual and progress in conducting 
analyses, as appropriate, by sex/gender and/or racial/ethnic group 
differences.

INCLUSION OF CHILDREN AS PARTICIPANTS IN RESEARCH INVOLVING HUMAN 
SUBJECTS: The NIH maintains a policy that children (i.e., individuals 
under the age of 21) must be included in all human subjects research, 
conducted or supported by the NIH, unless there are scientific and 
ethical reasons not to include them. This policy applies to all initial 
(Type 1) applications submitted for receipt dates after October 1, 1998.

All investigators proposing research involving human subjects should 
read the "NIH Policy and Guidelines" on the inclusion of children as 
participants in research involving human subjects that is available at 
http://grants.nih.gov/grants/funding/children/children.htm.

REQUIRED EDUCATION ON THE PROTECTION OF HUMAN SUBJECT PARTICIPANTS: NIH 
policy requires education on the protection of human subject participants
for all investigators submitting NIH proposals for research involving human
subjects.  This policy announcement can be found in the NIH Guide for Grants
and Contracts Announcement, dated June 5, 2000, at
http://grants.nih.gov/grants/guide/notice-files/NOT-OD-00-039.html.

PUBLIC ACCESS TO RESEARCH DATA THROUGH THE FREEDOM OF INFORMATION ACT: 
The Office of Management and Budget (OMB) Circular A-110 has been 
revised to provide public access to research data through the Freedom 
of Information Act (FOIA) under some circumstances.  Data that are (1) 
first produced in a project that is supported in whole or in part with 
Federal funds and (2) cited publicly and officially by a Federal agency 
in support of an action that has the force and effect of law (i.e., a 
regulation) may be accessed through FOIA.  It is important for applicants to
understand the basic scope of this amendment.  NIH has provided guidance at 
http://grants.nih.gov/grants/policy/a110/a110_guidance_dec1999.htm.

Applicants may wish to place data collected under this RFA in a public 
archive, which can provide protections for the data and manage the 
distribution for an indefinite period of time.  If so, the application 
should include a description of the archiving plan in the study design 
and include information about this in the budget justification section 
of the application. In addition, applicants should think about how to 
structure informed consent statements and other human subjects 
procedures given the potential for wider use of data collected under 
this award.

URLs IN NIH GRANT APPLICATIONS OR APPENDICES: All applications and 
proposals for NIH funding must be self-contained within specified page 
limitations. Unless otherwise specified in an NIH solicitation, 
Internet addresses (URLs) should not be used to provide information 
necessary to the review because reviewers are under no obligation to 
view the Internet sites.   Furthermore, we caution reviewers that their 
anonymity may be compromised when they directly access an Internet site.

HEALTHY PEOPLE 2010: The Public Health Service (PHS) is committed to 
achieving the health promotion and disease prevention objectives of 
"Healthy People 2010," a PHS-led national activity for setting priority 
areas. This RFA is related to one or more of the priority areas. 
Potential applicants may obtain a copy of "Healthy People 2010" at 
http://www.health.gov/healthypeople.

AUTHORITY AND REGULATIONS: This program is described in the Catalog of 
Federal Domestic Assistance No. 93.172, and is not subject to the 
intergovernmental review requirements of Executive Order 12372 or 
Health Systems Agency review.  Awards are made under authorization of 
Sections 301 and 405 of the Public Health Service Act as amended (42 
USC 241 and 284) and administered under NIH grants policies described 
at http://grants.nih.gov/grants/policy/policy.htm and under
Federal Regulations 42 CFR 52 and 45 CFR Parts 74 and 92. 

The PHS strongly encourages all grant recipients to provide a smoke-
free workplace and discourage the use of all tobacco products.  In 
addition, Public Law 103-227, the Pro-Children Act of 1994, prohibits 
smoking in certain facilities (or in some cases, any portion of a 
facility) in which regular or routine education, library, day care, 
health care, or early childhood development services are provided to 
children.  This is consistent with the PHS mission to protect and 
advance the physical and mental health of the American people.

SPECIAL GUIDANCE FOR APPLICANTS 

The NHGRI has conducted several competitions for large-scale sequencing 
projects during the course of the human sequencing effort.  It has been 
our experience that there are specific information items and 
presentation formats that the reviewers have found to be critical to 
their assessment of large-scale sequencing proposals. The following 
guidance distills that experience into a format that the applicant 
must use to provide that information.  If there is additional 
information, not addressed in this Guidance, that the applicant wishes 
to present, the applicant is encouraged to provide it concisely in 
addition to the information requested here.  Please note that, in 
addition to the textual description requested below, the applicant 
should also complete the indicated standardized formats to report 
recent sequence production and costs in a consistent manner; these are 
available at URL: http://www.genome.gov/Pages/Grants/RFAHG-03-002-Format.

I.  The Progress Report.  
The progress report section should adequately describe the applicant's 
past experience in large-scale sequence production.  This section of 
the application should include both textual and graphic information, as 
follows:

Section A. Text.  The total length for this section must not exceed 15 
pages (5000-7500 words).  Brief, concise summaries are encouraged.  
Please base the report on the center's past accomplishments, rather 
than on future plans.  In addition to the discussion in this section, 
please complete the Progress Report and Cost Format spreadsheet 
available at URL: http://www.genome.gov/Pages/Grants/RFAHG-03-002-Format.

1.  Shotgun sequence production.  The applicant should describe the 
center's current shotgun sequencing pipeline, starting from large-
insert clones (hierarchical approach) or genomic DNA (whole genome 
approach).  This part of the discussion is limited to genomic 
sequencing capacity, but the center's total capacity (i.e., human 
and other organisms, whether funded by NHGRI or other sources) 
should be included.   The discussion must address data throughput, 
data quality, and cost, and should include, but not be limited to, 
the following: 

a.  Data Generation:  
i.  The amount of genomic shotgun sequence produced in the 
past twelve months in terms of reads, both the number of 
attempted reads and the number of successful reads, as well 
as the frequency of successfully paired end reads from 
sequencing double-ended inserts (if applicable);  
ii.  The proportions of the production sequencing that are 
whole genome shotgun reads and BAC-based shotgun reads;
iii.  The average length of production sequencing reads (in 
bases of phred 20 – or equivalent – quality) and the 
average useable read length;
iv.  The amount of that data deposited in a public database 
(bases deposited in a public nucleotide sequence database 
and reads deposited in a trace archive).  N.B.  All sequence 
claimed as evidence of past production must be available to 
the reviewers;
v.  The center's total current monthly production capacity.  
This number should be based on an average of the last six 
months of sequencing and should include number of attempted 
and successful reads, the number of base pairs per read of 
at least phred 20 – or equivalent – quality, and the 
frequency of double-ended reads (if appropriate);
vi.  The internal metrics (e.g., reads per month, failed 
lanes, base pairs per lane, etc.) that the center has found 
to be most useful in evaluating and managing progress in 
sequence production.
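
As an illustration only, the read-level quantities requested above 
(attempted and successful reads, success rate, and bases per read of 
at least phred 20 quality) could be tallied as in the following Python 
sketch.  It relies on the standard phred definition, Q = -10 log10(P), 
under which phred 20 corresponds to an estimated error probability of 
1 in 100.  The function names, the input layout (one list of per-base 
quality values per attempted read), and the 100-base cutoff used to 
call a read successful are assumptions made for this example; this RFA 
does not prescribe them, and each center is expected to state its own 
definition of a successful read.

```python
def phred_to_error_probability(q):
    """Convert a phred quality value Q to an error probability P,
    using the standard relation Q = -10 * log10(P)."""
    return 10 ** (-q / 10)


def count_q20_bases(qualities, threshold=20):
    """Count the bases in one read whose quality is at least `threshold`."""
    return sum(1 for q in qualities if q >= threshold)


def summarize_reads(reads, min_q20_bases=100):
    """Summarize a batch of attempted reads.

    `reads` holds one list of per-base phred values per attempted read.
    `min_q20_bases` is a HYPOTHETICAL cutoff for calling a read
    successful; it is not taken from the RFA.
    """
    q20_counts = [count_q20_bases(r) for r in reads]
    successful = [c for c in q20_counts if c >= min_q20_bases]
    attempted = len(reads)
    return {
        "attempted_reads": attempted,
        "successful_reads": len(successful),
        "success_rate": len(successful) / attempted if attempted else 0.0,
        "mean_q20_bases_per_successful_read":
            sum(successful) / len(successful) if successful else 0.0,
    }


if __name__ == "__main__":
    # Three made-up reads: one good, one failed, one that degrades at the end.
    example = [[30] * 150, [10] * 150, [25] * 120 + [5] * 30]
    print(summarize_reads(example))
```

The same summary, computed month by month and averaged over the last 
six months, would give the kind of monthly capacity figure requested in 
item v.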

b.  Sequence assembly:
The applicant should discuss the center's experience in assembling
genome sequence at all scales (e.g., large-insert clones through
genomes).  The progress report should include any draft genome
assemblies that have been deposited to a public database or otherwise
made available (including any conditions on the use of pre-publication data). 

2.  Finishing.  The applicant should describe the center's finishing 
process starting from draft-level sequence or whole genome shotgun 
assemblies.  The discussion should include any experience in closing 
gaps and building contiguous finished sequence.

a.  The report should include the amount of finished genomic 
sequence (in finished base pairs) that the center produced in the 
last twelve months and how much of that, if any, was deposited in 
a public database.  N.B.  All sequence claimed as evidence of 
past production must be available to the reviewers.

b.  The report should include the center's current monthly 
finishing capacity.  This number should be based on an average of 
the last six months of sequencing. 

3.  Quality.  The applicant should report the center's procedures for 
maintaining and checking the quality of the sequence and sequence 
assemblies it produces, at all scales (reads, shotgun assemblies, 
finished sequence, and finished genomes).  

In the event that NHGRI and the reviewers wish to assess the data 
quality in more detail, the applicant must be prepared to submit 
sequence data produced in the last six months, including sequence 
traces, success rates, and information about data tracking, during 
the review process.

4.  Technology Development.  The progress report should describe any 
experience the center has in developing and improving production-
sequencing technology.  The discussion should describe, in 
quantitative terms, the effect that such technology development has 
had in decreasing the center's sequencing costs and improving its efficiency.

5.  Prior experience in attaining milestones.  The applicant should 
discuss the center's experience in defining and meeting useful 
milestones for a sequencing project.  

6.  Cost analysis.  Using the cost portion of the Progress Report 
and Cost Format spreadsheet provided at URL: 
http://www.genome.gov/Pages/Grants/RFAHG-03-002-Format, the 
applicant must report the current cost per attempted lane for 
shotgun sequencing, for finished sequence, and for other aspects of 
sequencing, as well as breaking out the specific costs for 
technology development and informatics.

The progress report should also include a description of the process 
by which the center's production effort is monitored internally with 
respect to costs.

Section B. Graphical and Tabular Material.  Please provide the 
following: 

1.  Graphs showing, for the past six months, the number of lanes 
attempted per week, the number of successful lanes per week, and the 
weekly success rate. 

2.  If the proposal includes a component to produce finished 
sequence, please also provide a graph indicating monthly depositions 
of finished sequence for the past six months (non-cumulative).
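
As an illustration only, the weekly success rate requested in item 1 
can be computed as successful lanes divided by attempted lanes.  The 
sketch below assumes that definition; the record layout, field names, 
and week labels are hypothetical and are not prescribed by this RFA.

```python
from dataclasses import dataclass


@dataclass
class WeeklyLanes:
    """One week of production data (hypothetical record layout)."""
    week: str         # label for the week, e.g. "2003-W01" (assumed format)
    attempted: int    # lanes attempted during the week
    successful: int   # lanes meeting the center's own definition of success


def success_rate(w: WeeklyLanes) -> float:
    """Return successful / attempted, or 0.0 for a week with no attempts."""
    if w.successful > w.attempted:
        raise ValueError("successful lanes cannot exceed attempted lanes")
    if w.attempted == 0:
        return 0.0
    return w.successful / w.attempted


def summarize(weeks):
    """Return (week, attempted, successful, rate) rows ready for graphing."""
    return [
        (w.week, w.attempted, w.successful, round(success_rate(w), 3))
        for w in weeks
    ]
```

Each row returned by summarize() supplies the three series requested 
for the graph: lanes attempted, successful lanes, and success rate.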

II.  The Research Proposal.  
This section (a maximum of 50 pages) comprises the applicant's proposal 
for operating and further developing the sequencing center during the 
next funding period.  The organization suggested below for this section 
of the application is based on the NHGRI staff's current understanding 
that the most efficient strategy for generating a finished genomic 
sequence involves a shotgun phase to generate draft sequence, followed 
by a finishing phase.  The applicant is free to propose an alternative 
strategy, but in doing so, must address all of the issues raised below.

A.  Shotgun sequence production.
  
1.  Data generation.  The applicant must present a clear plan, 
including concrete milestones, for (1) achieving the proposed 
level of sequence production, and (2) increasing the efficiency 
of production.  The following must be addressed:

a.  All phases of the sequence production pipeline, starting 
with sub-cloning of large clones and/or generation of a 
whole genome shotgun library through release of the trace 
data to a Trace Archive and sequence to a public sequence 
database.  The following should be included:
i.  the number of attempted and successful sequencing reads 
and the definition of a successful read;  
ii.  the overall projected throughput of the proposed 
center and how it will be attained, increased or maintained; 
iii.  potential bottlenecks or other problems that can be 
anticipated as well as proposals as to how they will be addressed; 
iv.  timelines and quantitative milestones where appropriate; 
v.  plans for improving efficiency; the discussion of 
expected costs should be expressed in the same way as in 
the progress report cost format, i.e., as fully loaded 
costs per read and per finished base. It is imperative that 
projections of cost reduction be fully justified and, to the 
extent possible, based on data that are provided in the application. 

To report overall proposed production costs, applicants 
should use the Projected Cost Format spreadsheet available 
at URL http://www.genome.gov/Pages/Grants/RFAHG-03-002-Format.  

b.  Sequencing targets.  The applicant should assume that the 
specific sequencing targets will be determined on the basis of 
the white paper process described above and subject to 
negotiation with the sequencing centers, and that the NHGRI 
genomic sequencing program will be interested in sequencing 
different target genomes to different levels of completion 
(draft through finished).  Thus, the applicant should not 
propose specific sequencing targets.  However, the application 
should include:
i.  a discussion of strategies and approaches that the 
center will employ to achieve the goal identified for any 
particular genome; 
ii.  a discussion of the expected characteristics of the 
assembled (or finished) products with regard to contiguity, 
order/orientation, and completeness; 
iii.  a discussion of how issues such as genome size, 
repeat content, polymorphism, etc. could affect the 
strategy, production pipeline, and costs.

c.  Informatics.  In this section, four aspects of the overall 
informatics component of the center should be discussed:
i.  The center's informatics infrastructure. The applicant 
must include a description of the basic informatics 
infrastructure (including database management, laboratory 
information management, data handling, and data submission) 
of the sequencing center as part of the shotgun-sequencing 
component of the Research Plan.  
ii.  Assembly.  The applicant must describe how assemblies 
of all types will be done at the different sequencing depths 
proposed, and must describe any additional genome assembly 
software or capability, if proposed.
iii.  Automated annotation. The primary, automated 
annotation process should be described in detail, including 
what will be done, and where the point of completion and 
handoff to the community will occur.
iv.  The development of new informatics systems for the 
three components listed above should be discussed, if 
appropriate. 

Proposed informatics costs should be broken out of the total 
costs in the Projected Cost Format spreadsheet available at URL 
http://www.genome.gov/Pages/Grants/RFAHG-03-002-Format.
 
B.  Finished sequence production. The NHGRI considers the finishing 
component to include all of the activities that are required by the 
center to improve draft-quality sequence to the point at which the 
center will not work on the sequencing project any longer and will 
deposit the information as "finished" in a public database (current 
definition: "finished" sequence has a frequency of no more than one 
error in 10,000 bases and no gaps that can be closed by 
state-of-the-art technology).
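
As an illustration only, the error-rate portion of this definition can 
be checked arithmetically.  The sketch below also converts an error 
rate to the widely used Phred quality scale (Q = -10 log10 of the error 
probability), a convention that this RFA does not itself invoke.  The 
function names are hypothetical, and the check does not address the 
requirement concerning gaps.

```python
import math

# Threshold taken from the definition above: no more than one error
# in 10,000 bases.
MAX_ERRORS_PER_BASES = (1, 10_000)


def meets_error_standard(errors: int, bases: int) -> bool:
    """True if errors/bases is no more than 1 in 10,000 (integer arithmetic)."""
    if bases <= 0:
        raise ValueError("bases must be positive")
    max_err, per_bases = MAX_ERRORS_PER_BASES
    return errors * per_bases <= bases * max_err


def phred_quality(error_probability: float) -> float:
    """Phred-scaled quality for a per-base error probability (assumed scale)."""
    if not 0.0 < error_probability <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(error_probability)
```

Under these assumptions, an error rate of exactly one in 10,000 bases 
corresponds to a Phred quality of approximately 40.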

1.  Data generation.  The Research Plan must include a discussion of:
a.  How the finishing capacity of the center will be 
maintained, increased and made more efficient.
b.  The decision-making process that the center will use to 
determine the degree to which different projects will be 
finished, and the proposed incremental cost per base of finishing 
a draft genome.

2.  Informatics.  The Research Plan must discuss the informatics 
issues and requirements associated with the finishing process.

Proposed finishing costs should be indicated in the Projected 
Cost Format spreadsheet available at URL 
http://www.genome.gov/Pages/Grants/RFAHG-03-002-Format. 

C.  Additional informatics.  Any informatics activities beyond those 
addressed in the shotgun sequence production and finished sequence 
production components should be described in a separate section of 
the Research Plan.  

D.  Technology development. The Research Plan should include a 
separate section describing plans for technology development for the 
purpose of continuing to advance the state of the art by reducing 
costs and increasing throughput. 

Proposed technology development costs should be broken out of the 
total costs in the Projected Cost Format spreadsheet available at 
URL http://www.genome.gov/Pages/Grants/RFAHG-03-002-Format.  

III.  Budget Request.
The budget requested must be described clearly and be well justified.  
Applicants should submit the Detailed Budget for the Initial Budget 
Period (page 4 of PHS-398) and the Budget for Entire Proposed Period of 
Support (page 5 of PHS-398).  Based on extensive experience with review 
of large-scale sequencing grant proposals, NHGRI believes it is very 
important that reviewers understand both the requested overall costs 
on a per-read basis and the portions of those costs that are due to 
various typical components of a large-scale sequencing center.  
Therefore, NHGRI strongly suggests that applicants use the Projected 
Cost Format spreadsheet available at URL 
http://www.genome.gov/Pages/Grants/RFAHG-03-002-Format. This format is 
intended to provide a view of all-inclusive total production costs per 
read, and incremental finishing costs per base. The format also breaks 
out from those totals the amount requested for technology development 
and informatics activities.

