Summary

Data Mining: Early Attention to Privacy in Developing a Key DHS Program Could Reduce Risks
GAO-07-293  February 28, 2007

The government's interest in using technology to detect terrorism and other threats has led to increased use of data mining. A technique for extracting useful information from large volumes of data, data mining offers potential benefits but also raises privacy concerns when the data include personal information. GAO was asked to review the development by the Department of Homeland Security (DHS) of a data mining tool known as ADVISE (Analysis, Dissemination, Visualization, Insight, and Semantic Enhancement). Specifically, GAO was asked to determine (1) the tool's planned capabilities, uses, and associated benefits and (2) whether potential privacy issues could arise from using it to process personal information and how DHS has addressed any such issues. GAO reviewed program documentation and discussed these issues with DHS officials.

ADVISE is a data mining tool under development intended to help DHS analyze large amounts of information. It is designed to allow an analyst to search for patterns in data--such as relationships among people, organizations, and events--and to produce visual representations of these patterns, referred to as semantic graphs. None of the three planned DHS implementations of ADVISE that GAO reviewed are fully operational. (GAO did not review uses of the tool by the DHS Office of Intelligence and Analysis.) The intended benefit of the ADVISE tool is to help detect threatening activities by facilitating the analysis of large amounts of data. DHS is currently testing the tool's effectiveness.

Use of the ADVISE tool raises a number of privacy concerns. DHS has added security controls to the tool; however, it has not assessed privacy risks. Privacy risks that could apply to ADVISE include the potential for erroneous association of individuals with crime or terrorism and the misidentification of individuals with similar names. A privacy impact assessment would identify specific privacy risks and help officials determine what controls are needed to mitigate those risks. ADVISE has not undergone such an assessment because DHS officials believe it is not needed given that the tool itself does not contain personal data. However, the tool's intended uses include applications involving personal data, and the E-Government Act and related guidance emphasize the need to assess privacy risks early in systems development. Further, if an assessment were conducted and privacy risks identified, a number of controls could be built into the tool to mitigate those risks. For example, controls could be implemented to ensure that personal information is used only for a specified purpose or compatible purposes, and they could provide the capability to distinguish among individuals who have similar names to address the risk of misidentification. Because privacy has not been assessed and mitigating controls have not been implemented, DHS faces the risk that ADVISE-based system implementations containing personal information may require costly and potentially duplicative retrofitting at a later date to add the needed controls.

Subject Terms

Data collection
Data integrity
Data mining
Government information dissemination
Homeland security
Information disclosure
Internal controls
Privacy law
Privacy policies
Right of privacy
Risk assessment
Policy evaluation
Counterterrorism