How to Access Multiple Datasets

What OCG Data is Open vs. Controlled Access?

OCG employs stringent human subjects’ protection and data access policies to protect the privacy and confidentiality of research participants. Depending on the risk of patient identification, OCG programs’ data are available to the scientific community in two tiers: open access and controlled access. Both types of data can be accessed through the corresponding OCG program-specific data matrix or portal.

OCG Program: CTD2

Open Access: All CTD2 data are open access. Examples include data from:

  • Small molecule screenings
  • Small molecule profiling
  • Comparative genomic hybridization

Controlled Access: n/a

Access OCG Program Data: CTD2 Data Portal

OCG Program: CGCI & TARGET

Open Access: Most CGCI and TARGET data are open access. OCG strives to make all data open access. Examples include:

  • Clinical information that could not be used to identify the patient
  • Tissue pathology data
  • Gene expression data
  • Tumor-specific copy number alterations and loss of heterozygosity
  • Sequence data of single amplicons
  • Tumor-associated somatic mutations

Controlled Access: Certain CGCI and TARGET data are controlled access. Examples include:

  • Specific demographic and clinical data; genome-wide genotypes for each case
  • Information linking all sequence traces to an individual
  • Whole genome, exome or transcriptome sequences for an individual case

Access OCG Program Data: CGCI Data Matrix; TARGET Data Matrix

 

Open-access Data

Data within this category present minimal risk of participant identification. Most OCG program data, excluding patient identifiers, are open access. OCG provides the scientific community with the maximum amount of open-access data allowable under HIPAA guidelines. Access to these data does not require user certification, and researchers may explore data content without restriction.

Controlled-access Data

Data within this category present a higher risk of patient identification. While stripped of direct patient identifiers as defined by HIPAA, controlled-access data contain specific demographic, clinical, and genotypic information that is excluded from open-access data. Controlled-access data are unique and valuable to research projects for which the open-access data are insufficient. Access to protected data requires user certification, which can be obtained through NCBI’s dbGaP (National Center for Biotechnology Information’s database of Genotypes and Phenotypes).

 

How Do I Apply for Data Use Certifications for CGCI and TARGET Controlled-Access Data?

Researchers must apply for Data Use Certifications by submitting Data Access Request forms on NCBI’s dbGaP website.

Visit Using CGCI Data and/or Using TARGET Data for more detailed instructions.

 

Do I Need Multiple Data Use Certifications to Access All CGCI and TARGET Controlled-Access Data?

Yes. Researchers must obtain a separate Data Use Certification for each consent group within CGCI and TARGET to gain access to its corresponding controlled data. TARGET has only one consent group (Pediatric Cancer Research), while CGCI has two separate consent groups: one specifically for pediatric data (Pediatric Cancer Research) and another for all other data (Cancer Research and General Methods). To obtain Data Use Certifications for all or some combination of CGCI and/or TARGET data, researchers must submit an electronic Data Access Request through dbGaP for each consent group (details can be found at https://www.ncbi.nlm.nih.gov/gap?db=gap). On each form, the requestor must agree to the corresponding Data Use Limitations of that consent group.

The Data Use Limitations for each consent group are outlined below.

Program: TARGET

Consent Group: Pediatric Cancer Research

Types of Data: All TARGET data

Data Use Limitations: Requests for controlled-access data will be considered for research projects that can only be conducted using pediatric data (i.e., the research objectives cannot be accomplished using data from adults) and that focus on the development of more effective treatments, diagnostic tests, or prognostic markers for childhood cancers. Moreover, TARGET data can be used for research relevant to the biology, causes, treatment and late complications of treatment of pediatric cancers. Applications proposing methods, software, or other tool development are not considered acceptable uses of the data.

Program: CGCI

Consent Group: Pediatric Cancer Research

Types of Data: Pediatric Medulloblastoma

Data Use Limitations: Access to protected pediatric data will be granted solely for those research projects that can only be conducted using pediatric data (i.e., the research objectives cannot be accomplished using data from adults) and that focus on the development of more effective treatments, diagnostic tests, or prognostic markers for childhood cancers.

Program: CGCI

Consent Group: Cancer Research and General Methods

Types of Data: Adult HIV-related, lymphoid (including Burkitt lymphoma), and lung cancers

Data Use Limitations: Use of the data is limited to scientific research relevant to the biology, prevention, treatment, and late complications of cancers and for the development of applications proposing analytical methods, software, and other research tools.
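For researchers planning several requests at once, the consent-group rules described above can be sketched as a small lookup. This is only an illustration: the dictionary keys and the function name `required_requests` are hypothetical labels chosen for this example, not identifiers used by dbGaP or OCG, and the sketch assumes one Data Access Request per program and consent group pair.

```python
# Hypothetical lookup: data type -> (program, consent group).
# The labels are illustrative and are not dbGaP or OCG identifiers.
CONSENT_GROUPS = {
    "TARGET (all data)": ("TARGET", "Pediatric Cancer Research"),
    "CGCI pediatric medulloblastoma": ("CGCI", "Pediatric Cancer Research"),
    "CGCI adult HIV-related, lymphoid, and lung cancers": (
        "CGCI",
        "Cancer Research and General Methods",
    ),
}


def required_requests(data_types):
    """Return the distinct (program, consent group) pairs to request.

    Each pair corresponds to one Data Access Request, because a separate
    Data Use Certification is needed for each consent group in each program.
    """
    return sorted({CONSENT_GROUPS[name] for name in data_types})


if __name__ == "__main__":
    wanted = ["TARGET (all data)", "CGCI pediatric medulloblastoma"]
    for program, group in required_requests(wanted):
        print(f"Submit a Data Access Request for {program}: {group}")
```

Although both data types in the usage example fall under a consent group named Pediatric Cancer Research, they belong to different programs, so the sketch lists two separate requests.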

 

 

Where Do I Go if I Still Have More Questions?

Please visit Using CGCI Data and/or Using TARGET Data for more detailed instructions on how to gain access to controlled data from those programs.

If you still have questions, please don’t hesitate to contact us.

Last updated: January 30, 2020