UniGene: An Organized View of the Transcriptome

Each UniGene entry is a set of transcript sequences that appear to come from the same transcription locus (gene or expressed pseudogene), together with information on protein similarities, gene expression, cDNA clone reagents, and genomic location.
Species | UniGene Entries
Chordata
Mammalia
Bos taurus (cow) | 43,448
Canis lupus familiaris (dog) | 27,853
Equus caballus (horse) | 8,133
Homo sapiens (human) | 122,726
Macaca fascicularis (crab-eating macaque) | 11,951
Macaca mulatta (rhesus monkey) | 15,359
Monodelphis domestica (gray short-tailed opossum) | 966
Mus musculus (mouse) | 79,541
Ornithorhynchus anatinus (platypus) | 1,827
Oryctolagus cuniculus (rabbit) | 6,576
Ovis aries (sheep) | 14,659
Papio anubis (olive baboon) | 5,673
Pongo abelii (Sumatran orangutan) | 6,996
Rattus norvegicus (Norway rat) | 64,563
Sus scrofa (pig) | 51,027
Trichosurus vulpecula (silver-gray brushtail possum) | 11,771
Actinopterygii
Danio rerio (zebrafish) | 56,944
Fundulus heteroclitus (killifish) | 4,618
Gadus morhua (Atlantic cod) | 14,542
Gasterosteus aculeatus (three-spined stickleback) | 18,938
Oncorhynchus mykiss (rainbow trout) | 25,025
Oryzias latipes (Japanese medaka) | 22,486
Pimephales promelas (fathead minnow) | 21,765
Salmo salar (Atlantic salmon) | 31,957
Takifugu rubripes (pufferfish) | 3,809
Amphibia
Xenopus laevis (African clawed frog) | 35,003
Xenopus tropicalis (western clawed frog) | 42,665
Ascidiacea
Ciona intestinalis | 30,774
Ciona savignyi | 7,639
Molgula tectiformis | 8,526
Aves
Gallus gallus (chicken) | 33,376
Meleagris gallopavo (turkey) | 1,167
Taeniopygia guttata (zebra finch) | 14,432
Cephalochordata
Branchiostoma floridae (Florida lancelet) | 14,645
Hyperoartia
Petromyzon marinus (sea lamprey) | 11,069
Echinodermata
Echinoidea
Paracentrotus lividus (common urchin) | 8,684
Strongylocentrotus purpuratus (purple sea urchin) | 19,620
Arthropoda
Branchiopoda
Daphnia pulex (common water flea) | 14,190
Insecta
Acyrthosiphon pisum (pea aphid) | 12,891
Aedes aegypti (yellow fever mosquito) | 19,345
Anopheles gambiae (African malaria mosquito) | 21,387
Apis mellifera (honey bee) | 9,758
Bombyx mori (domestic silkworm) | 11,198
Culex pipiens (house mosquito) | 4,957
Drosophila melanogaster (fruit fly) | 17,197
Ixodes scapularis (black-legged tick) | 18,161
Tribolium castaneum (red flour beetle) | 9,053
Nematoda
Chromadorea
Ancylostoma caninum (dog hookworm) | 7,394
Caenorhabditis elegans (nematode) | 21,662
Platyhelminthes
Trematoda
Schistosoma japonicum | 9,395
Schistosoma mansoni | 10,219
Turbellaria
Schmidtea mediterranea | 9,930
Cnidaria
Anthozoa
Nematostella vectensis (starlet sea anemone) | 19,167
Hydrozoa
Hydra magnipapillata | 10,656
Streptophyta
Bryopsida
Physcomitrella patens | 20,137
Coniferopsida
Picea glauca (white spruce) | 17,809
Picea sitchensis (Sitka spruce) | 16,755
Pinus taeda (loblolly pine) | 18,921
Eudicotyledons
Aquilegia formosa x Aquilegia pubescens | 8,063
Arabidopsis thaliana (thale cress) | 30,383
Artemisia annua (sweet wormwood) | 9,462
Brassica napus (rape) | 26,733
Brassica oleracea | 5,617
Brassica rapa (field mustard) | 14,491
Citrus clementina | 6,107
Citrus sinensis (Valencia orange) | 15,815
Glycine max (soybean) | 30,248
Gossypium hirsutum (upland cotton) | 21,743
Gossypium raimondii | 3,297
Helianthus annuus (sunflower) | 7,846
Lactuca sativa (garden lettuce) | 7,940
Lotus japonicus | 14,493
Malus x domestica (apple) | 16,932
Medicago truncatula (barrel medic) | 17,781
Nicotiana tabacum (tobacco) | 19,753
Populus tremula x Populus tremuloides (hybrid aspen) | 9,652
Populus trichocarpa (western balsam poplar) | 14,958
Prunus persica (peach) | 7,078
Raphanus raphanistrum (wild radish) | 16,940
Raphanus sativus (radish) | 17,649
Solanum lycopersicum (tomato) | 17,784
Solanum tuberosum (potato) | 19,645
Vigna unguiculata (cowpea) | 16,494
Vitis vinifera (wine grape) | 23,152
Liliopsida
Hordeum vulgare (barley) | 23,045
Oryza sativa (rice) | 40,762
Saccharum officinarum (sugarcane) | 15,594
Sorghum bicolor (sorghum) | 13,895
Triticum aestivum (wheat) | 41,288
Zea mays (maize) | 72,632
Chlorophyta
Chlorophyceae
Chlamydomonas reinhardtii | 11,303
Volvox carteri | 5,638
Dictyosteliida
Dictyostelium
Dictyostelium discoideum (slime mold) | 5,957
Apicomplexa
Coccidia
Toxoplasma gondii | 6,623
Ascomycota
Eurotiomycetes
Coccidioides posadasii | 3,994
Sordariomycetes
Gibberella moniliformis | 5,259
Magnaporthe grisea | 5,380
Neurospora crassa | 2,216
Basidiomycota
Heterobasidiomycetes
Filobasidiella neoformans | 4,838
Oomycetes
Peronosporales
Phytophthora infestans (potato late blight agent) | 7,257
Bacillariophyta
Bacillariophyceae
Phaeodactylum tricornutum | 6,778
Mollusca
Gastropoda
Aplysia californica (California sea hare) | 24,994
Lottia gigantea | 15,602
Ciliophora
Oligohymenophorea
Paramecium tetraurelia | 14,074
Tetrahymena thermophila | 7,840

In addition to sequences of well-characterized genes, hundreds of thousands of novel expressed sequence tag (EST) sequences have been included. Consequently, the collection may be useful to the community as a resource for gene discovery. Experimentalists have also used UniGene to select reagents for gene mapping projects and large-scale expression analysis.
Note, however, that the procedures for automated sequence clustering are still under development, and the results may change as improvements are made. Feedback from users has been especially useful in identifying problems, and we encourage you to report any that you encounter.
It should also be noted that no attempt has been made to produce contigs or consensus sequences. There are several reasons why the sequences of a set may not actually form a single contig. For example, all of the splicing variants for a gene are put into the same set. Moreover, EST-containing sets often contain 5' and 3' reads from the same cDNA clone, but these sequences do not always overlap.
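The point that a set need not form a single contig can be illustrated with a minimal sketch. It is not the UniGene build procedure; it only assumes, for illustration, that reads are grouped when their sequences overlap or when they come from the same cDNA clone. The read IDs, clone IDs, and the `cluster_reads` function are hypothetical.

```python
from collections import defaultdict


def cluster_reads(reads, overlaps):
    """Group reads into sets with a simple union-find.

    reads:    dict mapping read ID -> cDNA clone ID (hypothetical IDs).
    overlaps: iterable of (read_a, read_b) pairs whose sequences overlap.
    Returns a sorted list of sorted read-ID lists, one per set.
    """
    parent = {read_id: read_id for read_id in reads}

    def find(x):
        # Walk to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Assumption 1: overlapping sequences belong to the same set.
    for a, b in overlaps:
        union(a, b)

    # Assumption 2: 5' and 3' reads of one clone belong to the same set,
    # even when the two reads share no sequence.
    by_clone = defaultdict(list)
    for read_id, clone_id in reads.items():
        by_clone[clone_id].append(read_id)
    for members in by_clone.values():
        for other in members[1:]:
            union(members[0], other)

    sets = defaultdict(set)
    for read_id in reads:
        sets[find(read_id)].add(read_id)
    return sorted(sorted(s) for s in sets.values())


# Hypothetical input: est1_3p overlaps nothing, but shares a clone with est1_5p.
reads = {
    "est1_5p": "cloneA",
    "est1_3p": "cloneA",
    "est2": "cloneB",
    "est3": "cloneC",
}
overlaps = [("est1_5p", "est2")]

clusters = cluster_reads(reads, overlaps)
print(clusters)
```

In this toy input, `est1_3p` joins the same set as `est1_5p` and `est2` purely through the shared clone, so the set could not be assembled into one contig from overlaps alone.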
Currently, sequences from the following animals have been processed: human, rat, mouse, cow, zebrafish, clawed frog, fruit fly, and mosquito. The plants processed are wheat, rice, barley, maize, and cress. These organisms were chosen because they have the greatest amounts of EST data available and span a range of taxonomic groups. Additional organisms may be added in the future.
A representation of the UniGene datasets is available by FTP. Descriptions of the UniGene transcript-based and genome-based build procedures are available.

UniGene References

Pontius JU, Wagner L, Schuler GD. UniGene: a unified view of the transcriptome. In: The NCBI Handbook. Bethesda (MD): National Center for Biotechnology Information; 2003.
Wheeler DL, et al. Database Resources of the National Center for Biotechnology. Nucl Acids Res 31:28-33; 2003.
Schuler GD. Pieces of the puzzle: expressed sequence tags and the catalog of human genes. J Mol Med 75:694-698; 1997.
Schuler GD, et al. A gene map of the human genome. Science 274:540-546; 1996.
Boguski MS, Schuler GD. ESTablishing a human transcript map. Nature Genetics 10:369-371; 1995.