UniGene: An Organized View of the Transcriptome.
Each UniGene entry is a set of transcript sequences that appear to come from the same transcription locus (gene or expressed pseudogene), together with information on protein similarities, gene expression, cDNA clone reagents, and genomic location.

Species UniGene Entries
Chordata
Mammalia
Bos taurus (cow) 43,448
Canis lupus familiaris (dog) 27,853
Equus caballus (horse) 8,133
Homo sapiens (human) 122,726
Macaca fascicularis (crab-eating macaque) 11,951
Macaca mulatta (rhesus monkey) 15,359
Monodelphis domestica (gray short-tailed opossum) 966
Mus musculus (mouse) 79,541
Ornithorhynchus anatinus (platypus) 1,827
Oryctolagus cuniculus (rabbit) 6,576
Ovis aries (sheep) 14,659
Papio anubis (olive baboon) 5,673
Pongo abelii (Sumatran orangutan) 6,996
Rattus norvegicus (Norway rat) 64,563
Sus scrofa (pig) 51,027
Trichosurus vulpecula (silver-gray brushtail possum) 11,771
Actinopterygii
Danio rerio (zebrafish) 56,944
Fundulus heteroclitus (killifish) 4,618
Gadus morhua (Atlantic cod) 14,542
Gasterosteus aculeatus (three-spined stickleback) 18,938
Oncorhynchus mykiss (rainbow trout) 25,025
Oryzias latipes (Japanese medaka) 22,486
Pimephales promelas (fathead minnow) 21,765
Salmo salar (Atlantic salmon) 31,957
Takifugu rubripes (pufferfish) 3,809
Amphibia
Xenopus laevis (African clawed frog) 35,003
Xenopus tropicalis (western clawed frog) 42,665
Ascidiacea
Ciona intestinalis 30,774
Ciona savignyi 7,639
Molgula tectiformis 8,526
Aves
Gallus gallus (chicken) 33,376
Meleagris gallopavo (turkey) 1,167
Taeniopygia guttata (zebra finch) 14,432
Cephalochordata
Branchiostoma floridae (Florida lancelet) 14,645
Hyperoartia
Petromyzon marinus (sea lamprey) 11,069
Echinodermata
Echinoidea
Paracentrotus lividus (common urchin) 8,684
Strongylocentrotus purpuratus (purple sea urchin) 19,620
Arthropoda
Branchiopoda
Daphnia pulex (common water flea) 14,190
Insecta
Acyrthosiphon pisum (pea aphid) 12,891
Aedes aegypti (yellow fever mosquito) 19,345
Anopheles gambiae (African malaria mosquito) 21,387
Apis mellifera (honey bee) 9,758
Bombyx mori (domestic silkworm) 11,198
Culex pipiens (house mosquito) 4,957
Drosophila melanogaster (fruit fly) 17,197
Ixodes scapularis (black-legged tick) 18,161
Tribolium castaneum (red flour beetle) 9,053
Nematoda
Chromadorea
Ancylostoma caninum (dog hookworm) 7,394
Caenorhabditis elegans (nematode) 21,662
Platyhelminthes
Trematoda
Schistosoma japonicum 9,395
Schistosoma mansoni 10,219
Turbellaria
Schmidtea mediterranea 9,930
Cnidaria
Anthozoa
Nematostella vectensis (starlet sea anemone) 19,167
Hydrozoa
Hydra magnipapillata 10,656
Streptophyta
Bryopsida
Physcomitrella patens 20,137
Coniferopsida
Picea glauca (white spruce) 17,809
Picea sitchensis (Sitka spruce) 16,755
Pinus taeda (loblolly pine) 18,921
Eudicotyledons
Aquilegia formosa x Aquilegia pubescens 8,063
Arabidopsis thaliana (thale cress) 30,383
Artemisia annua (sweet wormwood) 9,462
Brassica napus (rape) 26,733
Brassica oleracea 5,617
Brassica rapa (field mustard) 14,491
Citrus clementina 6,107
Citrus sinensis (Valencia orange) 15,815
Glycine max (soybean) 30,248
Gossypium hirsutum (upland cotton) 21,743
Gossypium raimondii 3,297
Helianthus annuus (sunflower) 7,846
Lactuca sativa (garden lettuce) 7,940
Lotus japonicus 14,493
Malus x domestica (apple) 16,932
Medicago truncatula (barrel medic) 17,781
Nicotiana tabacum (tobacco) 19,753
Populus tremula x Populus tremuloides (hybrid aspen) 9,652
Populus trichocarpa (western balsam poplar) 14,958
Prunus persica (peach) 7,078
Raphanus raphanistrum (wild radish) 16,940
Raphanus sativus (radish) 17,649
Solanum lycopersicum (tomato) 17,784
Solanum tuberosum (potato) 19,645
Vigna unguiculata (cowpea) 16,494
Vitis vinifera (wine grape) 23,152
Liliopsida
Hordeum vulgare (barley) 23,045
Oryza sativa (rice) 40,762
Saccharum officinarum (sugarcane) 15,594
Sorghum bicolor (sorghum) 13,895
Triticum aestivum (wheat) 41,288
Zea mays (maize) 72,632
Chlorophyta
Chlorophyceae
Chlamydomonas reinhardtii 11,303
Volvox carteri 5,638
Dictyosteliida
Dictyostelium
Dictyostelium discoideum (slime mold) 5,957
Apicomplexa
Coccidia
Toxoplasma gondii 6,623
Ascomycota
Eurotiomycetes
Coccidioides posadasii 3,994
Sordariomycetes
Gibberella moniliformis 5,259
Magnaporthe grisea 5,380
Neurospora crassa 2,216
Basidiomycota
Heterobasidiomycetes
Filobasidiella neoformans 4,838
Oomycetes
Peronosporales
Phytophthora infestans (potato late blight agent) 7,257
Bacillariophyta
Bacillariophyceae
Phaeodactylum tricornutum 6,778
Mollusca
Gastropoda
Aplysia californica (California sea hare) 24,994
Lottia gigantea 15,602
Ciliophora
Oligohymenophorea
Paramecium tetraurelia 14,074
Tetrahymena thermophila 7,840

In addition to sequences of well-characterized genes, hundreds of thousands of novel expressed sequence tag (EST) sequences have been included. Consequently, the collection may be useful to the community as a resource for gene discovery. Experimentalists have also used UniGene to select reagents for gene mapping projects and large-scale expression analysis.

However, the procedures for automated sequence clustering are still under development, and the results may change from time to time as improvements are made. Feedback from users has been especially useful in identifying problems, and we encourage you to report any that you encounter.

It should also be noted that no attempt has been made to produce contigs or consensus sequences. There are several reasons why the sequences of a set may not actually form a single contig. For example, all of the splicing variants for a gene are put into the same set. Moreover, EST-containing sets often contain 5' and 3' reads from the same cDNA clone, but these sequences do not always overlap.
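
To illustrate why the members of a set need not assemble into one contig, here is a minimal sketch. It is not part of the UniGene build procedure. It assumes each sequence can be described by a start and end position on a hypothetical transcript axis; the coordinates, variable names, and the `count_contigs` helper are all invented for the example.

```python
from typing import List, Tuple

# Half-open interval [start, end) on a hypothetical transcript axis.
# These coordinates are illustrative only; UniGene does not publish them.
Interval = Tuple[int, int]


def count_contigs(reads: List[Interval]) -> int:
    """Count the groups of reads that are connected by shared bases."""
    contigs = 0
    current_end = None
    for start, end in sorted(reads):
        if current_end is None or start >= current_end:
            # No overlap with the group being built: start a new contig.
            contigs += 1
            current_end = end
        else:
            # Overlaps the current group: extend it if the read reaches further.
            current_end = max(current_end, end)
    return contigs


# Hypothetical set: 5' and 3' reads from the same 2,000-base cDNA clone.
five_prime_read = (0, 450)
three_prime_read = (1600, 2000)
print(count_contigs([five_prime_read, three_prime_read]))  # 2

# A hypothetical full-length mRNA in the same set bridges the gap.
full_length_mrna = (0, 2000)
print(count_contigs([five_prime_read, three_prime_read, full_length_mrna]))  # 1
```

In the first case the two reads belong to the same set because they share a clone, yet they have no bases in common, so any assembly would yield two separate contigs.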

Currently, sequences have been processed from the following animals: human, rat, mouse, cow, zebrafish, clawed frog, fruit fly, and mosquito. The plants processed are wheat, rice, barley, maize, and cress. These species were chosen because they have the greatest amounts of EST data available and represent a variety of organisms. Additional organisms may be added in the future.

A representation of the UniGene datasets is available by FTP.

Descriptions of the UniGene transcript-based and genome-based build procedures are available.

UniGene References

Pontius JU, Wagner L, Schuler GD. UniGene: a unified view of the transcriptome. In: The NCBI Handbook. Bethesda (MD): National Center for Biotechnology Information; 2003.

Wheeler DL, et al. Database Resources of the National Center for Biotechnology. Nucl Acids Res 31:28-33; 2003.

Schuler GD. Pieces of the puzzle: expressed sequence tags and the catalog of human genes. J Mol Med 75:694-698; 1997.

Schuler GD, et al. A gene map of the human genome. Science 274:540-546; 1996.

Boguski MS, Schuler GD. ESTablishing a human transcript map. Nature Genetics 10:369-371; 1995.