Posted: March 28, 2012
TCGA Data and Cancer Systems Biology: Ex pluribus unum
Dr. Andrea Califano
The Cancer Genome Atlas (TCGA) is providing an extraordinarily comprehensive set of complementary resources to help researchers elucidate the complete repertoire of mechanisms contributing to tumor initiation and progression. Yet, perhaps surprisingly, no research field has benefited more from TCGA’s comprehensive profiling efforts within specific tumor types than cancer systems biology. Indeed, the success of this discipline is fundamentally rooted in the ability to generate accurate and informative models of cell regulation, including the transcriptional, post-transcriptional, and post-translational processes that determine normal cell physiology and whose dysregulation may lead to tumorigenesis. The data necessary to reverse engineer and interrogate these distinct, yet complementary, regulatory layers in an integrative fashion were essentially non-existent in the pre-TCGA world.
Pre-TCGA, efforts to understand global regulation had been stifled by the complexity of capturing multiple data modalities and by the difficulties of studying even simple relationships such as those between microRNA, gene copy number alteration, promoter methylation, and gene expression. Indeed, with few exceptions in non-cancer-related fields,1,2 large-scale expression, microRNA, or copy number alteration profiles of human malignancies collected by individual labs or by a few consortia had been mostly studied in isolation, thus providing a useful, yet highly fragmentary, picture of the underlying processes.3,4
The TCGA program has dramatically changed this situation, allowing for the first time the integration of these complementary and highly interdependent layers. This has helped produce a better, more comprehensive picture of the dysregulated processes that contribute to oncogenesis and even to discover entirely new layers of regulation that could not have been glimpsed without these data. For instance, the availability of microRNA and gene expression profiles for a large number of matched samples in glioblastoma has allowed the inference of hundreds of thousands of mRNA-mRNA regulatory interactions that are mediated by microRNAs and yet do not depend on microRNA variability.5 Experimental validation of these interactions has helped address some of the missing variability associated with oncogenesis. In particular, 13 genes were identified whose deletions in glioma contribute to PTEN inactivation through microRNA-mediated interactions.5 Similarly, coordinated study of glioblastoma samples across gene expression, copy number alteration, methylation, and mutational data from TCGA has allowed the characterization of several glioblastoma subtypes, as well as a repertoire of genetic and epigenetic alterations that contribute to some of them,6,7 including specific temporal event patterns leading to gliomagenesis.8 Finally, analysis of TCGA data in combination with other datasets has helped dissect genome-wide regulatory mechanisms that can be interrogated to identify synergistic regulation of tumor subtypes, candidate biomarkers for aggressive tumors, and potential therapeutic targets.9
We are far from being done, as the TCGA data modalities available to the research community have been heavily biased towards those that can be most effectively and economically profiled via microarray and sequencing technologies. This, unfortunately, does not encompass proteomics and, in particular, the recently established ability to profile the phospho-proteome.10,11 Large-scale profiling of other omics layers, such as glycomics, metabolomics, and lipidomics, is even further on the horizon. Yet TCGA has created an extensible and highly scalable model to pursue the complete molecular characterization of large-scale repositories of high-quality, clinically annotated tumor samples. Thus, we have no doubt that the collection of omics data under the banner of the TCGA program is destined only to grow and that these more esoteric data modalities will eventually become available to the research community.
Taken together, these contributions have had a tremendous impact in establishing the ability of cancer systems biology to elucidate critical mechanisms of oncogenesis and tumor progression that would have escaped more traditional approaches, thus helping this new discipline move from theoretical promise to tangible value in the study of human malignancies.