For each class of eukaryotic histone, we used sequence- and text-based queries to extract all relevant entries from NCBI's Entrez protein sequence database, which is compiled from several independent databases and is therefore redundant with respect to sequence (the same sequence can appear in more than one entry). Sequences are formatted as FASTA records: a definition line containing the NCBI unique identifier (i.e., the gi number), a database-specific accession number, and a sequence description, followed by the sequence itself.
In each FASTA record, the gi number doubles as a link to its complete Entrez Protein record, which, in turn, can contain links to nucleotide, taxonomy, PubMed reference, structure, and other records.
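As an illustration of the record layout described above, the following minimal Python sketch splits one FASTA record into its gi number, accession number, description, and sequence. It assumes the pipe-delimited NCBI definition-line form `>gi|<gi number>|<database tag>|<accession>| <description>`; the exact layout can vary with the source database, and the function name, the `FastaRecord` type, and the example record (identifiers and residues alike) are hypothetical placeholders, not entries from the actual data set.

```python
from typing import NamedTuple


class FastaRecord(NamedTuple):
    """Fields of one FASTA record (hypothetical container for illustration)."""
    gi: str           # NCBI unique identifier (gi number)
    database: str     # source-database tag, e.g. "ref" or "gb"
    accession: str    # database-specific accession number
    description: str  # free-text sequence description
    sequence: str     # residues joined into one string


def parse_fasta_record(text: str) -> FastaRecord:
    """Parse a single FASTA record with an assumed 'gi|...|db|acc| desc' definition line."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("record does not start with a '>' definition line")

    # Split into at most five fields so that any '|' inside the
    # description is left untouched.
    fields = lines[0][1:].split("|", 4)
    if len(fields) < 5 or fields[0] != "gi":
        raise ValueError("definition line is not in the assumed gi|...| form")

    _, gi, database, accession, description = fields
    return FastaRecord(gi, database, accession, description.strip(), "".join(lines[1:]))


# Invented example record; identifiers and residues are placeholders.
example = """>gi|1234567|ref|XP_0000001.1| hypothetical histone H4 [Example organism]
MSGRGKGGKG
LGKGGAKRHR
"""

record = parse_fasta_record(example)
print(record.gi, record.accession, len(record.sequence))
```

Keeping the gi number as a string rather than an integer preserves it exactly as it appears in the definition line, which is convenient when it is later used as a key to look up the corresponding Entrez Protein record.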