Data Release and Usage Plan
Guiding Principles
NIAID recognizes that large-scale pre-publication DNA sequence information is a unique research resource for scientists and that rapid and unrestricted sharing of microbial genome sequence data is essential for advancing research on infectious agents responsible for human disease. Data release plans for NIAID-funded genome sequencing projects should be based on the guiding principle that pre-publication genome sequence data should be released to the scientific community as rapidly as possible via deposition into a searchable, public international database. Therefore, it is anticipated that pre-publication genome sequence data generated at the NIAID Microbial Sequencing Centers (MSCs) will be made freely and publicly available via deposition to GenBank, a publicly searchable international database, as rapidly as possible.
This principle is based on an expectation that users of the data will act responsibly to promote the highest standards of respect for the quality and the priority of the scientific contribution of others and that normal standards of scientific etiquette and "fair use" will be respected within the broad scientific community using the pre-publication data.
Data Release
A data release plan for each NIAID-funded sequencing project is required, and final details will be negotiated between NIAID, the sequencing group, and collaborators to ensure that genome sequence data release will support the guiding principles stated above. Final approval for the data release plan will be given by NIAID.
The data release plan will cover the release of chromatogram files, genome assemblies, and annotation. The range of projects and the size of the genomes being sequenced strongly suggest that one set of data release requirements does not fit all sequencing projects; the approach to data release should be sensitive to the aims of the activity and the overall guiding principles.
Chromatogram Files
All sequences and trace files (chromatograms) generated under this proposal will be submitted to the Trace Archive at NCBI/NLM/NIH on a weekly basis. These data will also include information on templates, vectors, and quality values for each sequence.
Genome Assemblies
Genome assemblies will be made available via GenBank, The Broad Institute, the J. Craig Venter Institute, and a NIAID-funded database/Web site (for example, the NIAID Bioinformatics Resource Center), as specified by the Program Officer after internal and community validation. Assuming no significant errors are detected during the validation process, assemblies will be released to GenBank within 45 calendar days of being generated, followed by release to other Web sites, if appropriate.
Genome Annotation
Annotation data will be made available via GenBank, The Broad Institute, the J. Craig Venter Institute, and a NIAID-funded database/Web site (for example, the NIAID-funded Bioinformatics Resource Center), as specified by the Program Officer after internal and community validation. Assuming no significant errors are detected during the validation process, annotation data will be released to GenBank within 45 calendar days of being generated, followed by release to other Web sites, if appropriate.