COGs
Phylogenetic classification of proteins encoded in complete genomes
Clusters of Orthologous Groups of proteins (COGs) were delineated by comparing protein sequences encoded in complete genomes that represent major phylogenetic lineages. Each COG consists of individual proteins or groups of paralogs from at least three lineages and thus corresponds to an ancient conserved domain.
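The membership rule stated above (members from at least 3 lineages) can be illustrated with a minimal sketch. It only checks that one criterion and is not the clustering procedure itself, which is described in the cited papers. The mapping of genome codes to lineages uses the orders listed on this page as a stand-in for "lineage", which is an assumption made for illustration. The protein identifiers and function names are hypothetical.

```python
# Hypothetical lookup: genome code -> lineage.
# Assumption: a "lineage" is approximated here by the order listed on this page.
GENOME_TO_LINEAGE = {
    "Mth": "Methanobacteriales",
    "Mja": "Methanococcales",
    "Pho": "Thermococcales",
    "Pab": "Thermococcales",
    "Sce": "Saccharomycetales",
    "Eco": "Enterobacteriales",
    "EcZ": "Enterobacteriales",
    "Bsu": "Bacillales",
}

MIN_LINEAGES = 3  # threshold taken from the description above


def lineages_covered(cluster):
    """Return the set of lineages represented in a candidate cluster.

    cluster: iterable of (genome_code, protein_id) pairs.
    """
    return {GENOME_TO_LINEAGE[genome] for genome, _protein in cluster}


def meets_lineage_criterion(cluster, min_lineages=MIN_LINEAGES):
    """True if the cluster has members from at least min_lineages lineages."""
    return len(lineages_covered(cluster)) >= min_lineages


# Four proteins, but only two lineages (Enterobacteriales, Thermococcales).
candidate_a = [("Eco", "protA1"), ("EcZ", "protA2"),
               ("Pho", "protA3"), ("Pab", "protA4")]

# Three proteins from three different lineages.
candidate_b = [("Eco", "protB1"), ("Bsu", "protB2"), ("Mja", "protB3")]

print(meets_lineage_criterion(candidate_a))
print(meets_lineage_criterion(candidate_b))
```

Counting lineages rather than genomes means that paralogs, or several closely related genomes, do not by themselves satisfy the criterion.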
66 genomes
38 orders
28 classes
14 phyla
Unicellular clusters     FTP     Initial version
Science 1997 Oct 24;278(5338):631-7,
BMC Bioinformatics 2003 Sep 11;4(1):41.
Euryarchaeota
Methanobacteriales   Mth
Methanococcales   Mja
Halobacteriales   Hbs
Thermoplasmatales   Tac Tvo
Thermococcales   Pho Pab
Archaeoglobales   Afu
Methanopyrales   Mka
Methanosarcinales   Mac

Crenarchaeota
Thermoproteales   Pya
Sulfolobales   Sso
Desulfurococcales   Ape

Ascomycota
Saccharomycetales   Sce
Schizosaccharomycetales   Spo

Microsporidia
Apansporoblastina   Ecu

Aquificae
Aquificales   Aae

Thermotogae
Thermotogales   Tma

Cyanobacteria
Nostocales   Nos
Chroococcales   Syn

Deinococcus-Thermus
Deinococcales   Dra

Fusobacteria
Fusobacteriales   Fnu

Spirochaetes
Spirochaetales   Tpa Bbu

Chlamydiae
Chlamydiales   Ctr Cpn

Actinobacteria
Actinomycetales   Cgl Mtu MtC Mle

Firmicutes
Clostridiales   Cac
Bacillales   Sau Lin Bsu Bha
Lactobacillales   Lla Spy Spn
Mycoplasmatales   Uur Mpu Mpn Mge

Proteobacteria
Pseudomonadales   Pae
Enterobacteriales   Eco EcZ Ecs Ype Sty Buc
Xanthomonadales   Xfa
Vibrionales   Vch
Pasteurellales   Hin Pmu
Burkholderiales   Rso
Neisseriales   Nme NmA
Campylobacterales   Hpy jHp Cje
Caulobacterales   Ccr
Rhizobiales   Atu Sme Bme Mlo
Rickettsiales   Rpr Rco

Upcoming microbial genomes
genomes genera orders classes phyla
261 126 63 33 17
[N]   Nano
[A]   Euryarchaeota (8)

*   Methanobacteria *   Methanococci
*   Methanomicrobia *   Halobacteria
*   Thermoplasmata *   Thermococci
*   Archaeoglobi *   Methanopyri
[R]   Crenarchaeota (3)
[D]   Deinococcus (2)
[T]   Actinobacteria (3)
[P]    Proteobacteria (26)

α (6)
β (5)
γ (10)
δ (4)
ε (1)
[O]   Other (9)
*  Bacteroidetes
*  Chlorobi
*  Fusobacteria
*  Aquificae
*  Chloroflexi
*  Thermotogae
*  Planctomycetes
*  Spirochaetes
*  Chlamydiae
[F]   Firmicutes (7)

Mollicutes (3)
Bacilli (2)
Clostridia (2)
[C]   Cyanobacteria (4)

*   Gloeobacteria
*   Nostocales
*   Prochlorales
*   Chroococcales

Eukaryotic Clusters    FTP
Code   Name                                        Abbreviation
A      Arabidopsis thaliana (thale cress)          ath
C      Caenorhabditis elegans (worm)               cel
D      Drosophila melanogaster (fruit fly)         dme
H      Homo sapiens (human)                        hsa
Y      Saccharomyces cerevisiae (baker's yeast)    sce
P      Schizosaccharomyces pombe (fission yeast)   spo
E      Encephalitozoon cuniculi (Microsporidia)    ecu
Upcoming eukaryotic genomes
O      Oryza sativa (rice)                         osa
Q      Anopheles gambiae (mosquito)                aga
Z      Pan troglodytes (chimpanzee)                ptr
W      Canis familiaris (dog)                      cfa
M      Mus musculus (mouse)                        mmu
R      Rattus norvegicus (rat)                     rno
Ascomycota genomes including
L      Magnaporthe grisea                          mgr
N      Neurospora crassa                           ncr


Comments and questions to info@ncbi.nlm.nih.gov or to the Author