Initial Results from a Nationwide BigMatch Matching of 2000 Census Data
Michael Ikeda and Edward Porter
KEY WORDS: Census Unduplication, Across Response Matching, Record Linkage
ABSTRACT
A nationwide unduplication operation is being considered for the 2010 Census. One
potential problem is that such an operation could produce large numbers of false positives,
especially when matching above the county level. To help evaluate the extent of this problem,
the Census Bureau's BigMatch program was used to match person records across all Census
addresses, using data from the 2000 Census.
This report provides an overview of the matching methodology and the results of an
exploratory analysis of the matching output. As expected, most apparent false matches
seem to be concentrated among the most common surnames and the most common
Hispanic surnames, especially for matches outside the state. In contrast, name frequency
does not appear to have a strong effect on false matches for given names.
CITATION: