Version 2.5.2.0 CRISP

Abstract

Grant Number: 1R01LM007273-01
Project Title: Preserving Privacy in Medical Data Sets
PI Information:
Name: VINTERBO, STAAL A.
Email: staal@dsg.harvard.edu
Title:

Abstract: Privacy is a fundamental right and needs to be protected. For health care related information, there are regulations for disclosure. These regulations were motivated by the public's concern about breaches of confidentiality that might result in discrimination. The recent progress in electronic medical record technology, the Internet, and the genetic revolution, together with media reports on violations of privacy, have generated increasing interest in this topic. A common belief is that sensitive information is more easily available with the use of networked computers. Since total lack of disclosure is not realistic, current regulations require that the "minimal amount" of information be given to a certain party. A thorough study on what constitutes "minimal" for particular types of applications and a "usefulness index" is lacking. An exact quantification of the potential for privacy breach in de-identified or anonymized databases is also lacking. Definition and quantification of these indices are important for decision-making. As we demonstrate, de-identified data sets can still be used for inference and therefore may disclose sensitive information. The use of machine learning methods to verify the remaining functional dependencies in a de-identified data set leads to better understanding of the possible inferences. Anonymization techniques based on logic, statistics, database theory, and machine learning methods can help in the protection of privacy. We will formally define and study anonymity in databases, from a theoretical and a practical standpoint. We will develop and implement algorithms to anonymize data sets that will be in accordance with the balance of anonymity and "usefulness" of the disclosed data sets. We will also develop and implement algorithms to verify the anonymity of a given data set and indicate the type of records that are at highest risk for a privacy attack.
We will make our methods and documented tools freely available to researchers via the WWW.
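
As an illustration only, the idea of verifying the anonymity of a data set and flagging its highest-risk records can be sketched as follows. The abstract does not specify the project's actual algorithms; this sketch assumes a simple group-size measure (records that share the same combination of indirectly identifying attributes are indistinguishable from each other), and the function names, field names, and toy records are all hypothetical.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def anonymity_report(records, quasi_identifiers):
    """Return (k, at_risk): the smallest group size and the records in such groups.

    Assumption: a record is considered most exposed when few other records
    share its quasi-identifier values. This is an illustrative measure, not
    the definition of anonymity developed in the project.
    """
    if not records:
        return 0, []
    sizes = group_sizes(records, quasi_identifiers)
    k = min(sizes.values())
    at_risk = [
        r for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] == k
    ]
    return k, at_risk


# Hypothetical de-identified records: names removed, but ZIP code,
# birth year, and sex remain and may single out an individual.
records = [
    {"zip": "02115", "birth_year": 1960, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02115", "birth_year": 1960, "sex": "F", "diagnosis": "diabetes"},
    {"zip": "02115", "birth_year": 1971, "sex": "M", "diagnosis": "asthma"},
]

k, at_risk = anonymity_report(records, ["zip", "birth_year", "sex"])
print("smallest group size:", k)
print("highest-risk records:", at_risk)
```

In this toy data the third record is the only one with its combination of ZIP code, birth year, and sex, so it is the one flagged; coarsening an attribute (for example, reporting a birth decade instead of a birth year) would be one way to trade "usefulness" for a larger group size.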

Public Health Relevance:
This Public Health Relevance is not available.

Thesaurus Terms:
computer system design /evaluation, data management, health care facility information system, health care policy, human rights, information dissemination, information retrieval, mathematical model, medical record, model design /development
Internet, computer program /software, decision making
behavioral /social science research tag, computer simulation, human data, patient oriented research, statistics /biometry

Institution: BRIGHAM AND WOMEN'S HOSPITAL
RESEARCH ADMINISTRATION
BOSTON, MA 02115
Fiscal Year: 2002
Department:
Project Start: 01-FEB-2002
Project End: 31-JAN-2005
ICD: NATIONAL LIBRARY OF MEDICINE
IRG: ZLM1

