Request for Information: Modifications to the
NHLBI Policy for Distribution of Data from
Clinical Trials and Epidemiology Studies

Executive Summary

National Heart, Lung, and Blood Institute
National Institutes of Health

August 2006

 

Background

From January 1 through March 8, 2006, comments were solicited on a proposal to modify the Limited Access Data policy. In brief, the proposed modifications concerned investigator-initiated or Institute-initiated studies in which high-density, genome-wide genotyping is obtained with NHLBI support.

The proposed modifications would limit the time for quality control of collected data and shorten the time between availability and distribution of a cleaned data set to external investigators from the existing interval of 2-3 years to either: 1) immediate release with a year of protected time for the original investigators, or 2) release after one year of protected time.

The proposed policy was released for public comment in the NIH Guide to Grants and Contracts on January 31, 2006. All comments were due by March 8, 2006.

Brief Summary

A total of 53 comments were received. Of these, 11 were strongly in favor of the proposed modifications, 12 provided equivocal responses, and 28 expressed either a low level of support for or opposition to the proposed policy modifications. Almost half of the responses suggested specific changes in the policy. Multiple investigators responded from the University of Washington, the Southwest Foundation for Biomedical Research, and Wake Forest University. At least five responses were submitted on behalf of major collaborative studies. Six of the comments were anonymous.

Those who were strongly supportive of the proposal argued that more individuals working with the data would produce new findings faster than is currently possible. In addition, they noted that the sheer size of the data sets will make it impossible for any one group to learn all there is to learn from the data. Finally, the availability of the data will aid investigators developing new algorithms by providing a test bed on which to try their new approaches. Two respondents suggested that, to minimize time to availability, either the time for data cleaning should be shortened or the data should be released prior to quality control and re-released following it. Another supportive respondent argued that strong enforcement from NHLBI would be required to make the policy work. Almost by definition, those who favored the policy change offered few additional comments beyond their strong support. Mixed responses were sometimes supportive but with conditions.

Those with low or no support argued that data sharing as proposed could not currently be done for most studies given the level of consent provided by participants and the implicit or explicit social contract made with the participants. Ten or more respondents addressed each of the following concerns: consent, privacy and confidentiality, IRB approval, data quality, timeline, quality of the research, funding, liability for misuse of the data, trust/continued participant cooperation, minority issues, and effects on response rates. Of these, consent, privacy and confidentiality, and IRB approval issues were the strongest concerns. Specific comments for each concern are presented below.

Consent forms and the consent process:

  • Most respondents opined that their study consents do not allow sharing of such extensive genetic data with investigators external to the study, unknown to them, and outside their control.

  • Most investigators felt that participants would have to be re-consented for such extensive genetic data sharing and that many would not provide such consent.

  • In the Jackson Heart Study, one of the few studies providing data, participants were offered a layered consent allowing them to opt out of sharing DNA or data derived from their DNA; 29% opted out of sharing with outside investigators or with industry.

Privacy and Confidentiality:

  • Respondents viewed the genome-wide association (GWA) data tied to phenotypic data as a forensic data set which may be used to identify individuals; it is not clear that such data can be de-identified.

  • Respondents suggested that the wider the distribution of the data the more likely the chance for misuse or abuse.

  • GWA data may contain very sensitive information; for example, Huntington's disease status could be inferred or paternity could be tested.

  • Including extended pedigree data makes the data more difficult, if not impossible, to anonymize.

  • Release of data from small geographically defined populations identifies individuals.

  • Modifications should be reviewed with OHRP to determine if they violate participant privacy and confidentiality.

IRB Approval:

  • Respondents expected that their IRBs would not agree that prior informed consent had provided for the extensive genetic data sharing proposed, and that new consent would be required.

  • Many respondents suggested IRBs will not allow such identifiable data to be shared; at least 3 confirmed with their IRB that this policy change would be problematic.

Data Quality:

  • Respondents suggested data quality would suffer with the short timeline (6 months) for a finished product, particularly for GWA data, for which data cleaning experience is limited.

  • The 6-month window does not allow time for investigation and corrective action if quality issues are identified in the process.

  • Respondents suggested that data documentation could not convey all the relevant information to external analysts and that familiarity with the actual data collection was necessary to avoid naïve misuse and misinterpretation of the data.

Timeline:

  • Of the 18 respondents commenting on the timeline, 15 recommended increasing the time for data cleaning and/or data analysis, 2 suggested it could be shortened, and 1 suggested that protected time needed to be reasonable for publication.

  • Many felt that the short term for data cleaning would lead to rushed output, lower quality data, wasted effort in data analysis due to corrections, and lower quality research.

  • Of those with low support for the modifications, many felt the timeline was extremely short for exclusivity by the original investigators and that the 2-3 years in the current LADS policy was more appropriate.

  • The short timeline for data analysis was seen as reducing the original investigators to “data collectors,” a devaluation of their contribution that would ultimately do harm to the field.

Quality of the Research:

  • Respondents addressing this issue opined that lack of knowledge of data collection and of the context of the community and study could lead to misuse and abuse of the data.

  • Opportunity for conflicting results would be high with multiple independent analysts working in overlapping areas of the data.

  • Public confusion would result from conflicting conclusions and original investigators would spend much time refuting erroneous results of external users.

  • Links between data collection and data analysis create the best science.

Funding:

  • Additional funds would be needed for studies to re-consent participants for the extensive data sharing requirements of this policy.

  • Most original investigators felt additional funds would be needed to create the data sets and documentation for LADS and to respond to inquiries from the external users.

  • If the timeline remained short, investigators suggested additional funding could provide more staff and allow faster progress in addressing specific study aims within the time provided.

  • External investigators requesting LADS data sets will need funding to pursue analyses and perhaps more funding than might otherwise be needed due to redundancy in effort.

Liability for Misuse of the Data:

  • Many respondents suggested the data agreement without sanctions or penalties was a rather weak instrument for assuring data would not be misused by data requestors.

  • Respondents assumed NHLBI would take responsibility for overseeing ethical and scientific conduct of data users and would assume liability for breaches of confidentiality and misuse since the original study investigators would play no role in data distribution.

Trust/continued participant cooperation:

  • Multiple respondents made reference to the social compact that is established between investigators and community participants, and the trust that would be lost, particularly with minority communities, if data sharing led to misuse.

Minority issues:

  • Investigators from the Strong Heart Study firmly made the point that Tribal consultation must occur before decisions are made about what can be done with data from American Indians.

  • Several respondents noted the particular suspicion and sensitivity of minority populations and said that community consultation was needed before expanding data access.

  • Changing the plans for data sharing after receiving consent from a minority community could be seen as a “bait and switch” tactic by the community.

Effects on response rates:

  • Most respondents opined that breaches of confidentiality and misuse of data by external data users would greatly reduce participation in future epidemiologic studies.

  • Misuse or abuse of shared data would lead to more requests to drop out of research studies for which data were already available.

  • Truly informed consent describing plans for data sharing, as would happen in re-consenting to meet data distribution requirements, would reduce new study participation.

Remaining Assorted Comments and Concerns:

  • Development of analytic methods would make greater progress with the availability of real data, large data sets, and multiple data sets on which to perfect methods.

  • Since there are inadequate NHLBI grant mechanisms to support data analysis-only projects, it seems that only industry and people with existing resources could use the data.

  • There is no emphasis on collaboration in this policy, though collaboration and multi-disciplinary teams are the cornerstone of future research based on the NIH Roadmap.

  • Requirements to submit data in SAS format and documentation in PDF files are restrictive.

  • Data distribution promotes duplication and inefficiency and wastes money in tight times.

Many minor solutions are offered by the respondents to meet the concerns summarized in this document. Two major solutions are presented below because they appear to get beyond the primary issues of concern related to informed consent, privacy and confidentiality, IRB approval and collaboration:

  • Allow investigators to release summaries of SNPs associated with HLBS phenotypes within a prescribed timeline. This would push gene finding forward while avoiding the human subjects issues described above. It would allow sharing of association results that other investigators could confirm in their own studies. It would also promote collaboration without duplication of effort within the existing data set.

  • Allow investigators to proactively promote data sharing and new investigator opportunities and be evaluated on their progress, as has been done in the CHS Study. This would promote collaboration, interdisciplinary research, and cross-training while keeping a project management system in place to avoid duplication and share expertise. Projects failing to generate sufficient collaborations would have to submit data to NHLBI.

The two approaches offered above keep the hand of the data collectors in the analysis and interpretation of the data, offer the community a link to the investigators using their data, establish a level playing field for all data analysts, offer opportunities for the development of analytic methods, and increase the speed of gene finding. All this is possible without compromising informed consent and the rights to privacy/confidentiality of study participants.
