National Science Foundation


Award Abstract #0428168
ITR - (ASE+NHS) - (dmc+int): Privacy-Preserving Data Integration and Sharing


NSF Org: IIS
Division of Information & Intelligent Systems
Initial Amendment Date: September 13, 2004
Latest Amendment Date: April 20, 2005
Award Number: 0428168
Award Instrument: Standard Grant
Program Manager: Le Gruenwald
IIS Division of Information & Intelligent Systems
CSE Directorate for Computer & Information Science & Engineering
Start Date: September 15, 2004
Expires: August 31, 2008 (Estimated)
Awarded Amount to Date: $1,012,000
Investigator(s): Christopher Clifton clifton@cs.purdue.edu (Principal Investigator)
Ahmed Elmagarmid (Co-Principal Investigator)
Dan Suciu (Co-Principal Investigator)
AnHai Doan (Co-Principal Investigator)
Gunther Schadow (Co-Principal Investigator)
Sponsor: Purdue University
302 Wood Street
West Lafayette, IN 47907 765/494-4600
NSF Program(s): ITR FOR NATIONAL PRIORITIES,
INFORMATION TECHNOLOGY RESEARC
Field Application(s): 0104000 Information Systems
Program Reference Code(s): smet,OTHR,HPCC,9251,9218,9178,0000
Program Element Code(s): 7314,1640

ABSTRACT

Integrating and sharing data from multiple sources has been a long-standing challenge in the database community. This problem is crucial in numerous contexts, including data integration for enterprises and organizations, data sharing on the Internet, collaboration among government agencies, and the exchange of scientific data. Many applications of national importance, such as emergency preparedness and response, as well as research in many scientific domains, require integrating and sharing data among participants.

Data integration is seriously hampered by an inability to ensure privacy. Without a privacy framework, sources are reluctant to share their data. Problems include fear of disclosing confidential information as well as regulations protecting individual privacy. While there has been progress in computing aggregations of distributed data without disclosing that data (e.g., privacy-preserving distributed data mining), this work assumes that data integration problems (schema matching, record linkage) are already solved. As a consequence, the lack of a privacy-preserving data integration framework has become a key bottleneck to deploying data integration.

This project will develop the technology needed to create and manage federated databases while controlling the disclosure of private data. While the emphasis will be on general techniques for data integration that preserve privacy, the project will work in the context of diverse but particularly relevant problem domains, including scientific research and emergency preparedness. Involvement of domain experts from these fields in developing and testing the techniques will ensure impact on areas of national importance.


PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Showing: 1 - 20 of 23.

Ahmed K. Elmagarmid, Panagiotis G. Ipeirotis, and Vassilios S. Verykios.  "Duplicate Record Detection: A Survey,"  IEEE Transactions on Knowledge and Data Engineering (TKDE),  v.19,  2007,  p. 1.

Chris Clifton, AnHai Doan, Ahmed Elmagarmid, Murat Kantarcioglu, Gunther Schadow, Dan Suciu, and Jaideep Vaidya.  "Privacy Preserving Data Integration and Sharing,"  The 9th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD'2004),  2004,  p. 19.

Christopher Re, Nilesh Dalvi, and Dan Suciu.  "Efficient Top-k Query Evaluation on Probabilistic Data,"  23rd IEEE International Conference on Data Engineering (ICDE07),  2007,  p. 386.

Jaideep Vaidya and Chris Clifton.  "Privacy-Preserving Top-K Queries,"  The 21st International Conference on Data Engineering (ICDE 2005),  2005,  p. 545.

M. Ercan Nergiz and Chris Clifton.  "MultiRelational k-Anonymity,"  The 23rd IEEE International Conference on Data Engineering (ICDE 2007), Istanbul, Turkey,  2007.

M. Sayyadian, Y. Lee, A. Doan, A. Rosenthal.  "eTuner: Tuning Schema Matching Software Using Synthetic Scenarios,"  Proceedings of the International Conference on Very Large Databases,  2005,  p. 1.

Mehmet Ercan Nergiz and Chris Clifton.  "Thoughts on k-Anonymization,"  The Second International Workshop on Privacy Data Management held in conjunction with The 22nd International Conference on Data Engineering, Atlanta, USA April 8, 2006,  2006,  p. 1.

Mehmet Ercan Nergiz, Maurizio Atzori and Chris Clifton.  "Hiding the Presence of Individuals from Shared Databases,"  2007 ACM SIGMOD international conference on Management of data (SIGMOD '07),  2007,  p. 665.

Monica Scannapieco, Ilya Figotin, Elisa Bertino, and Ahmed K. Elmagarmid.  "Privacy preserving schema and data matching,"  2007 ACM SIGMOD international conference on Management of data,  2007,  p. 654.

Mummorthy Murugesan and Wei Jiang.  "Secure Content Validation,"  The Third International Workshop on Privacy Data Management in Conjunction with ICDE 2007,  2007.

Murat Kantarcioglu and Chris Clifton.  "Security Issues in Querying Encrypted Data,"  The 19th Annual IFIP WG 11.3 Working Conference on Data and Applications Security,  2005,  p. 325.

N. Dalvi and D. Suciu.  "Management of Probabilistic Data (invited paper),"  PODS,  v.1,  2007,  p. 1.

N. Dalvi and D. Suciu.  "The Dichotomy of Conjunctive Queries on Probabilistic Structures,"  PODS,  v.1,  2007,  p. 1.

Nathan Bales, James Brinkley, E.S. Lee, Shobhit Mathur, Christopher Re, Dan Suciu.  "A Framework for XML-based Integration of Data, Visualization and Analysis in a Biomedical Domain,"  The 3rd International XML Database Symposium (XSym),  2005,  p. 1.

Nilesh Dalvi, Chris Re and Dan Suciu.  "Query Evaluation on Probabilistic Databases,"  IEEE Data Engineering Bulletin,  v.29,  2006,  p. 25.

Nilesh Dalvi, Dan Suciu.  "Answering Queries from Statistics and Probabilistic Views,"  International Conference on Very Large Databases (VLDB),  2005,  p. 1.

Nilesh N. Dalvi.  "Query Evaluation on a Database Given by a Random Graph,"  ICDT,  v.1,  2007,  p. 149.

R. McCann, B. AlShelbi, Q. Le, H. Nguyen, L. Vu, A. Doan.  "Maveric: Mapping Maintenance for Data Integration Systems,"  Proceedings of the International Conference on Very Large Databases,  2005,  p. 1.

Warren Shen, Pedro DeRose, Long Vu, AnHai Doan, Raghu Ramakrishnan.  "Source-aware Entity Matching: A Compositional Approach,"  23rd International Conference on Data Engineering (ICDE 2007),  2007,  p. 195.

Wei Jiang and Chris Clifton.  "AC-Framework for Privacy-Preserving Collaboration,"  2007 SIAM International Conference on Data Mining (SDM07),  2007.



 

Please report errors in award information by writing to: awardsearch@nsf.gov.

 

 

The National Science Foundation, 4201 Wilson Boulevard, Arlington, Virginia 22230, USA
Tel: (703) 292-5111, FIRS: (800) 877-8339 | TDD: (800) 281-8749
Last Updated: April 2, 2007