National Center for Biotechnology Information

Collaborating on
public cancer data

National Cancer Institute
NCBI Projects with CGAP Data

Clone Registry
dbEST
Entrez GEO DataSets
Gene Expression Omnibus
HomoloGene
Human BAC Resource
Gene
Map Viewer
SKY/CGH
Cancer Chromosomes
UniGene
UniSTS

Cancer Genome Anatomy Project


   The Cancer Genome Anatomy Project (CGAP) is an interdisciplinary program established and administered by the National Cancer Institute (NCI) to generate the information and technological tools needed to decipher the molecular anatomy of the cancer cell.
   From CGAP's inception in 1996, NCBI has been a key participant in bioinformatics planning, data tracking, data archiving, and analytical tool development and implementation. NCBI's role in CGAP has been constantly evolving to meet the demands of changing and expanding raw data sources, as well as the needs of the scientific community.
   The information below highlights NCBI's contributions to CGAP.

"characterizing genes"


   NCBI collaborates closely with NCI as the CGAP pipeline generates large numbers of expressed sequence tags (ESTs). All of these are deposited into dbEST and subsequently incorporated into UniGene and HomoloGene. Manual, gene-based annotations may also be added via Feedback for RefSeq and Entrez Gene.

"designing analytic tools"


   NCBI created the first public CGAP website and designed all analytical tools for the CGAP Tumor Gene Index during the first four years of the project. NCBI also offers analytic tools to infer gene expression information from its extensive EST libraries. Gene expression profiles are linked to many of the UniGene clusters (see examples). In addition, a standalone tool, Digital Differential Display, compares computed gene expression profiles between selected cDNA libraries. The CGAP website provides other similar tools under Tissues.
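   As a rough illustration of how EST counts from two library pools might be compared, the sketch below applies a two-sided Fisher exact test to the counts for a single UniGene cluster. The choice of test, the function name, and the counts are assumptions made for this example; they are not taken from the Digital Differential Display implementation.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a: ESTs assigned to the cluster in pool A
    b: all other ESTs in pool A
    c: ESTs assigned to the cluster in pool B
    d: all other ESTs in pool B
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that x of the cluster's ESTs fall in pool A.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts: the cluster accounts for 12 of 5000 ESTs in a tumor
# library pool and 2 of 6000 ESTs in a normal library pool.
p_value = fisher_exact_two_sided(12, 5000 - 12, 2, 6000 - 2)
print(f"p = {p_value:.4g}")
```

   A small p-value suggests that the cluster's share of ESTs differs between the two pools. Because EST libraries are often normalized or subtracted, any such result should be read as a hint rather than a measurement.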

"profiling gene expression"


   Serial analysis of gene expression (SAGE) is a more cost-efficient method of producing gene expression data than EST sequencing. CGAP supports the production and sequencing of SAGE libraries, while NCBI archives the SAGE libraries in the Gene Expression Omnibus (GEO) database. These data are searchable using Entrez GEO DataSets.
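   Entrez databases can also be queried programmatically through NCBI's E-utilities. The sketch below builds an ESearch request against the GEO DataSets database (`gds`) and, optionally, fetches the matching record IDs. The query term and the helper names are hypothetical and chosen only for illustration; the fetch step assumes network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_gds_search_url(term, retmax=20):
    """Return an ESearch URL that queries the GEO DataSets (gds) database."""
    params = {"db": "gds", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


def search_gds(term, retmax=20):
    """Run the search and return the list of matching GEO DataSets IDs."""
    with urlopen(build_gds_search_url(term, retmax)) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]


# Hypothetical query term; adjust the field tags to the records of interest.
example_url = build_gds_search_url("SAGE AND Homo sapiens[Organism]", retmax=5)
print(example_url)
```

   Calling `search_gds("SAGE AND Homo sapiens[Organism]", retmax=5)` would send the request and return up to five record IDs as strings.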

"tracking biological reagents"


   NCBI designed and continues to maintain an internal database that tracks CGAP samples, libraries and clones used in the generation of EST and SAGE data.
   The wealth of data produced by CGAP has led NCBI and NCI staff to develop an ad hoc classification system of hierarchically related keywords. Staff at both institutions routinely classify all new human and mouse EST and SAGE libraries.

"mapping chromosome aberrations"


   NCBI is collaborating closely with a component of CGAP called the Cancer Chromosome Aberration Project (CCAP).
   Using fluorescence in situ hybridization (FISH), CCAP is generating clones that are spaced at 1-2 Mb intervals across the human genome. Once mapped, these sequence-ready BAC clones are made available to the research community. NCBI is involved in identifying candidate BAC clones to be FISH-mapped, archiving the results, and localizing these clones onto draft sequence contigs. These data can be viewed through NCBI's Map Viewer, are linked to NCBI's Clone Registry and UniSTS sites, and can also be displayed synoptically on NCI's CGAP Chromosome and NCBI's Human BAC Resource websites.
   CCAP is also characterizing chromosome aberrations in selected tumor types through the use of spectral karyotyping (SKY) and comparative genomic hybridization (CGH). SKY facilitates identification of chromosomal aberrations, and CGH can be used to generate a map of DNA copy number changes in tumor genomes. The SKY/CGH database has been designed to house these data and is publicly accessible.
   Drs. Mitelman, Mertens and Johansson have been systematically summarizing recurrent neoplasia-associated chromosomal aberrations from the Mitelman Database of Chromosome Aberrations in Cancer. This work, which originally appeared in the April 1997 Special Issue of Nature Genetics under the title "A breakpoint map of recurrent chromosomal rearrangements in human neoplasia", is continuously updated, and a summary of the latest data can be viewed with NCBI's Map Viewer. All recurrent aberrations can be interactively queried on NCI's CGAP website under Recurrent Chromosome Aberrations in Cancer.
   The Cancer Chromosomes database integrates the SKY/M-FISH & CGH Database with the Mitelman Database of Chromosome Aberrations in Cancer and the Recurrent Chromosome Aberrations in Cancer database. These three data sets can now be searched seamlessly by use of the Entrez search and retrieval system for chromosome aberrations, clinical data, and reference citations. Common diagnoses, anatomic sites, chromosome breakpoints, junctions, numerical and structural abnormalities, and bands gained and lost among selected cases can be compared by use of the "similarity" report. Because the model used for CGH data is a subset of the karyotype data, it is now possible to examine the similarities between CGH results and karyotypes directly.


