NATIONAL CANCER INSTITUTE

Childhood Cancer TARGET Initiative - Data Access Policy

Through the NCI’s Therapeutically Applicable Research to Generate Effective Treatments (TARGET) Initiative (http://target.cancer.gov/), the Institute seeks to accelerate the discovery of new therapeutic targets so that researchers can develop effective new drugs and treatments, with the ultimate goal of significantly improving outcomes for children with cancer.

Scientific progress in identifying novel therapeutic targets from the TARGET Initiative will be accelerated if its data are made available as broadly as possible to the research community. Analyses by many researchers will increase the likelihood of discovering potential therapeutic leads that might be missed were the data not available to the larger research community. The data access policy for the TARGET Initiative is therefore focused on providing the broadest possible access consistent with the privacy protections required when comprehensive genetic data are derived from research participants who are children.

While the data access policy for the TARGET Initiative is modeled after the policies used for The Cancer Genome Atlas (TCGA) Pilot Project and Cancer Genetic Markers of Susceptibility (CGEMS), the distinctive issues related to research involving children require some differences from the policies of those projects. The first key issue is that the relatively small number of children diagnosed with cancer, compared with the much larger number of adults, implies a need for greater aggregation of demographic data to minimize the risk of individual identification. The second is that surrogate consent, which is required for the participation of minors in research, argues for greater caution in protecting children from potential research risks.
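To illustrate what greater aggregation of demographic data might look like in practice, the following Python sketch replaces exact ages with coarse age bands before counting participants. The band boundaries and the function names are assumptions made for illustration only; the policy does not specify them.

```python
from collections import Counter

# Hypothetical age bands in years as (inclusive lower, exclusive upper).
# These cut points are illustrative assumptions, not taken from the policy.
AGE_BANDS = [(0, 1), (1, 5), (5, 10), (10, 15), (15, 21)]


def age_band(age_years):
    """Map an exact age in years to a coarse band label."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    for low, high in AGE_BANDS:
        if low <= age_years < high:
            return f"{low} to <{high}"
    # Anything at or above the last upper bound falls into one open-ended band.
    return f"{AGE_BANDS[-1][1]}+"


def aggregate_ages(ages):
    """Count participants per age band instead of reporting exact ages."""
    return Counter(age_band(a) for a in ages)
```

Wider bands place more participants in each group, which is the intent of aggregation: the fewer the cases, the coarser the bands need to be.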

The research conducted by TARGET has three distinct yet tightly integrated components for selecting new molecular targets for the development of novel therapies for these childhood cancers:

Data generated by the TARGET Initiative will be deposited into several databases, including project-specific NCI databases, that will be accessible through the Cancer Molecular Analysis (CMA) portal, a set of software applications developed by the NCI Center for Biomedical Informatics and Information Technology (CBIIT).


TARGET Data Access Policy

The data access policy will provide investigators access to TARGET Initiative data in two tiers: an open-access tier, comprising information that is deemed to present minimal risk of participant re-identification; and a controlled-access tier that will include broader (yet still not directly identifying) demographic, clinical and genotypic information. These two tiers for data access are summarized below, with a detailed description provided in Table 1.

Open-Access Data tier

The Open-Access Data tier includes publicly accessible data that cannot be aggregated to generate a dataset unique to an individual using TARGET data and other publicly available data. The Open-Access Data tier does not require user certification for data access. This tier gives researchers open access to all transcriptome data, to mutations identified in the tumor tissue, and to processed data describing cancer-specific copy number alterations and loss-of-heterozygosity, as described in detail below. Example data elements are provided in Listing 1 and Listing 2.
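One simple way to look for records that could be unique to an individual is to count how often each combination of released attributes occurs. The Python sketch below does this for a single table; the field names and sample records are hypothetical, and the check does not cover linkage with other publicly available data, which the policy also requires to be considered.

```python
from collections import Counter


def unique_combinations(records, fields):
    """Return the value combinations of `fields` that occur exactly once."""
    counts = Counter(tuple(r[f] for f in fields) for r in records)
    return [combo for combo, n in counts.items() if n == 1]


# Hypothetical records for illustration only; not TARGET data.
records = [
    {"age_band": "1 to <5", "sex": "F"},
    {"age_band": "1 to <5", "sex": "F"},
    {"age_band": "10 to <15", "sex": "M"},
]

# A non-empty result flags attribute combinations shared by only one record.
flagged = unique_combinations(records, ["age_band", "sex"])
```

Here `flagged` contains the single combination held by only one record, which suggests that those attributes would need further aggregation before open release.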

Data within the Open-Access Data Tier are available in several public databases, such as the TARGET Initiative Data Portal, caArray, GEO and the NCBI Trace Repository. These data types may include:

Controlled-Access Data tier

The Controlled-Access Data tier contains broader clinical data and molecular signatures which, while stripped of direct identifiers, are individually unique. The Controlled-Access Data tier is for research projects related to target identification and therapeutics development for which the Open-Access data are not sufficient. Access to this tier requires user certification. Requests for access to this tier of data should be for research projects that can only be conducted using pediatric data (i.e., the research objectives cannot be accomplished using data from adults) and that have likely relevance to developing more effective treatments, diagnostic tests, or prognostic markers for childhood cancers. An example data set is provided in Listing 2.

Access to this data tier is available to researchers who:

Data within the Controlled-Access tier are currently available only via the TARGET Initiative Data Portal. These data types may include:


TARGET resources

TARGET main website:

Related information resources

The Cancer Genome Atlas (TCGA) pilot project: Key human subjects protection and data access resources:

NIH policy on genome-wide association studies


Table 1 – Listing of TARGET data types and assigned access tier

| Dataset | Content | Access Policy |
|---|---|---|
| Core Clinical (Open Access) | Demographic and clinical data, aggregated to minimize risk of individual identification (see example below for ALL dataset) | Open/Public |
| Core Clinical (Controlled Access) | More granular clinical data compared to Open Access dataset (see example below for ALL dataset) | DAC approval required |
| Gene expression | Gene expression (raw and normalized) | Open/Public |
| Gene methylation | DNA methylation | Open/Public |
| SNP array | Raw genotype calls | DAC approval required |
| Copy number alterations (from SNP array) | Processed SNP array data describing changes in copy number for chromosomal regions in each cancer case | Open/Public |
| Loss-of-heterozygosity (LOH) | Processed SNP array data listing regions of LOH for each cancer case | Open/Public |
| Mutations | Somatic variants (i.e., identified in cancer specimen DNA but not in germline DNA) as well as known cancer gene mutations (whether somatic or germline) | Open/Public |
| Sequence traces | Trace files with NCBI-required annotations. Traces from the same amplicon (forward-reverse reads from cancer and germline DNAs) will be identified. The ability to aggregate all traces from a single sample across amplicons, however, will not be supported in the open/public data set. | Open/Public |
| Sequence linking table | Information that links all released sequence data to an individual case | DAC approval required |
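
The tier assignments in Table 1 can be expressed as a small lookup, which may help when checking whether a given dataset needs approval. The Python sketch below is illustrative only: the `Tier` enum, the `ACCESS_TIER` mapping and the `can_access` function are hypothetical names, not part of any TARGET or NCI software, and real access decisions are made through the approval process described in this policy.

```python
from enum import Enum


class Tier(Enum):
    OPEN = "Open/Public"
    CONTROLLED = "DAC approval required"


# Dataset names and tiers copied from Table 1.
ACCESS_TIER = {
    "Core Clinical (Open Access)": Tier.OPEN,
    "Core Clinical (Controlled Access)": Tier.CONTROLLED,
    "Gene expression": Tier.OPEN,
    "Gene methylation": Tier.OPEN,
    "SNP array": Tier.CONTROLLED,
    "Copy number alterations (from SNP array)": Tier.OPEN,
    "Loss-of-heterozygosity (LOH)": Tier.OPEN,
    "Mutations": Tier.OPEN,
    "Sequence traces": Tier.OPEN,
    "Sequence linking table": Tier.CONTROLLED,
}


def can_access(dataset, has_dac_approval=False):
    """Return True if the dataset is open or the requester holds DAC approval."""
    tier = ACCESS_TIER[dataset]  # raises KeyError for datasets not in Table 1
    return tier is Tier.OPEN or has_dac_approval
```

For example, `can_access("Gene expression")` is true without any approval, while `can_access("SNP array")` is true only when `has_dac_approval=True` is passed.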

Listing 1 - Example of Core Open Access Clinical Dataset (Proposed for ALL Project)


Listing 2 - Example of Core Controlled Access Clinical Dataset (Proposed for ALL Project) (Additional Items beyond Those in Core Open Access Clinical Dataset)