What is GenBank?
GenBank® is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences (Nucleic Acids Research, 2008 Jan;36(Database issue):D25-30). As of February 2008, the traditional GenBank divisions hold 85,759,586,764 bases in 82,853,685 sequence records, and the WGS division holds 108,635,736,141 bases in 27,439,206 sequence records.
The complete release notes for the current version of GenBank are available on the NCBI FTP site. A new release is made every two months. GenBank is part of the International Nucleotide Sequence Database Collaboration, which comprises the DNA DataBank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL), and GenBank at NCBI. These three organizations exchange data daily.
An example of a GenBank record may be viewed for a Saccharomyces cerevisiae gene.
In The News: Platypus Genome
Explore Platypus Genome resources.
- Platypus Genome Project
- Platypus Taxonomic and Sequence Resources
- Platypus Genome Resource Guide
- Duck-Billed Platypus Genome Sequence Published (NIH Press Release)
- Duck-billed Platypus Genome Sequencing (NIH Extramural Research)
Submissions to GenBank
Many journals require submission of sequence information to a database prior to publication so that an accession number may appear in the paper. There are several options for submitting data to GenBank:
- BankIt, a WWW-based submission tool for convenient and quick submission of sequence data
- Sequin, NCBI's stand-alone submission software for Mac, PC, and UNIX platforms, available by FTP. When using Sequin, send the output files for direct submission to GenBank by e-mail.
- tbl2asn, a command-line program, automates the creation of sequence records for submission to GenBank using many of the same functions as Sequin. It is used primarily for submission of complete genomes and large batches of sequences.
- Barcode Submission Tool, a WWW-based tool for the submission of GenBank sequences and trace data for Barcode of Life projects.
There are specialized, streamlined procedures for batch submissions of sequences, such as EST, STS, and GSS sequences.
Submissions of Sequence Reads
- Reads of Sanger-style sequencing can be submitted to the Trace Archive.
- Runs of next-generation sequencing, for example 454 or Solexa, can be submitted to the Short Read Archive (SRA).
Updating or Revising a GenBank Sequence
Submitters can revise or update their GenBank entries at any time. Updates are accepted through the Update option on the BankIt page, in the text of an e-mail message, or as a Sequin file. Send updates and revisions to gb-admin@ncbi.nlm.nih.gov, and be sure to give the accession number of the sequence to be updated in the subject line.
Information about the correct format for different types of updates can be found at: http://www.ncbi.nlm.nih.gov/Genbank/update.html
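As a rough illustration of the e-mail route, the sketch below composes an update message with the accession number in the subject line, using Python's standard `email` library. The function name `build_update_message`, the subject wording, and the placeholder accession and sender are assumptions made for this example; the body text must still follow the update format described at the URL above, which this sketch does not attempt to reproduce.

```python
from email.message import EmailMessage

# Address given above for updates and revisions.
GENBANK_UPDATE_ADDRESS = "gb-admin@ncbi.nlm.nih.gov"


def build_update_message(accession, sender, body):
    """Compose (but do not send) an update e-mail for one accession.

    The subject wording is an assumption; the only stated requirement
    is that the accession number appears in the subject line.
    """
    message = EmailMessage()
    message["From"] = sender
    message["To"] = GENBANK_UPDATE_ADDRESS
    message["Subject"] = f"Update to GenBank accession {accession}"
    message.set_content(body)
    return message


if __name__ == "__main__":
    # "U00000" and the sender address are placeholders, not real values.
    draft = build_update_message(
        accession="U00000",
        sender="submitter@example.org",
        body="Update text in the format described on the update page.",
    )
    print(draft)
```

The message is only constructed here; sending it (for example with `smtplib`) depends on the submitter's own mail setup.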
Access to GenBank
There are several ways to search and retrieve data from GenBank.
- Search GenBank for sequence identifiers and annotations with Entrez Nucleotide, which comprises three divisions: CoreNucleotide (the main collection), dbEST (Expressed Sequence Tags), and dbGSS (Genome Survey Sequences).
- Search and align GenBank sequences to a query sequence using BLAST (Basic Local Alignment Search Tool). BLAST searches CoreNucleotide, dbEST, and dbGSS independently; see BLAST info for more information about the numerous BLAST databases.
- Search, link, and download sequences programmatically using NCBI e-utilities.
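For the programmatic route, a minimal sketch of how e-utilities requests might be built in Python is shown here. It assumes the `esearch.fcgi` and `efetch.fcgi` endpoints under `https://eutils.ncbi.nlm.nih.gov/entrez/eutils`, the `nuccore` database name, and the accession `U49845` as an illustrative example; none of these appear in the text above, so check them against the current e-utilities documentation before relying on them. The helper names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL for NCBI e-utilities (not stated in this page).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, retmax=20):
    """Build a search URL returning identifiers that match an Entrez query."""
    params = {"db": "nuccore", "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def build_efetch_url(accession, rettype="gb", retmode="text"):
    """Build a download URL for one record in GenBank flat-file format."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": rettype,
        "retmode": retmode,
    }
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


def fetch_record(accession):
    """Download one record as text. Requires network access."""
    with urlopen(build_efetch_url(accession), timeout=30) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Print the URLs only; call fetch_record() to retrieve data.
    print(build_esearch_url("Saccharomyces cerevisiae[Organism]"))
    print(build_efetch_url("U49845"))
```

Building the URL separately from fetching it keeps the request easy to inspect and lets scripts throttle or batch their downloads as NCBI's usage guidelines require.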
GenBank Data Usage
The GenBank database is designed to provide and encourage access within the scientific community to the most up-to-date and comprehensive DNA sequence information. Therefore, NCBI places no restrictions on the use or distribution of the GenBank data. However, some submitters may claim patent, copyright, or other intellectual property rights in all or a portion of the data they have submitted. NCBI is not in a position to assess the validity of such claims, and therefore cannot provide comment or unrestricted permission concerning the use, copying, or distribution of the information contained in GenBank.
New Developments
NCBI is continuously developing new tools and enhancing existing ones to improve both submission to and access to GenBank. The easiest way to keep abreast of these and other developments is to sign up for the NCBI Announce e-mail list, read the NCBI News (available on the web and by free subscription), and check the "What's New" section of the NCBI Web page.
Last revised: April 2, 2008.