RECURSIVE ANALYSIS OF LINKED DATA FILES

William E. Winkler, Census Bureau, and F. Scheuren, George Washington University

KEY WORDS: Edit, Imputation, Record Linkage, Regression Analysis, Recursive Processes

ABSTRACT

This paper demonstrates a methodology for analyzing two or more files when the only common information is names and addresses that are subject to significant error. Such a situation might arise with lists of businesses. We assume that a small proportion of records can be accurately matched. With the matched pairs, we build an edit/imputation model and, via a regression analysis, add predicted quantitative values to each file. Matching is then repeated using the common quantitative data together with the name and address information. If necessary, the edit/imputation, regression, and matching steps can be repeated recursively. In large measure, the ideas of Neter, Maynes, and Ramanathan (1965) are revised, but with new tools.
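
The match, regress, impute, and rematch cycle described in the abstract can be illustrated with a small Python sketch. It is a toy illustration under stated assumptions, not the authors' implementation: the string comparator (difflib), the greedy one-to-one assignment, the single-variable least-squares fit, the equal weighting of name and quantitative agreement, the thresholds, and the field names x, y, and y_hat are all hypothetical choices made for the example.

```python
from difflib import SequenceMatcher


def name_sim(a, b):
    """String similarity in [0, 1]; a stand-in for a real name/address comparator."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def name_score(rec_a, rec_b):
    return name_sim(rec_a["name"], rec_b["name"])


def fit_line(pairs):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def match(file_a, file_b, score, threshold):
    """Greedy one-to-one matching: accept pairs in decreasing score order."""
    candidates = sorted(
        ((score(ra, rb), i, j)
         for i, ra in enumerate(file_a)
         for j, rb in enumerate(file_b)),
        reverse=True,
    )
    links, used_b = {}, set()
    for s, i, j in candidates:
        if s < threshold:
            break
        if i not in links and j not in used_b:
            links[i] = j
            used_b.add(j)
    return links


def recursive_link(file_a, file_b, strict=0.9, loose=0.6, max_rounds=5):
    # Step 1: conservative match on names only (assumed to be accurate).
    links = match(file_a, file_b, name_score, strict)
    for _ in range(max_rounds):
        if len({file_a[i]["x"] for i in links}) < 2:
            break  # too few distinct x values to fit a line
        # Step 2: regression of file B's y on file A's x over the matched pairs.
        intercept, slope = fit_line(
            [(file_a[i]["x"], file_b[j]["y"]) for i, j in links.items()]
        )
        # Step 3: add the predicted value to every record of file A (mutates file_a).
        for rec in file_a:
            rec["y_hat"] = intercept + slope * rec["x"]

        # Step 4: rematch on name similarity plus agreement of y_hat with y.
        def combined(ra, rb):
            rel_err = abs(ra["y_hat"] - rb["y"]) / max(abs(rb["y"]), 1.0)
            return 0.5 * name_score(ra, rb) + 0.5 * max(0.0, 1.0 - rel_err)

        new_links = match(file_a, file_b, combined, loose)
        if new_links == links:
            break  # the linkage has stabilized
        links = new_links
    return links


# Hypothetical business lists; the last name is recorded differently in each file.
file_a = [
    {"name": "Acme Hardware Inc", "x": 10.0},
    {"name": "Baker Street Foods", "x": 20.0},
    {"name": "Central Plumbing Co", "x": 30.0},
    {"name": "J. Smith & Sons", "x": 40.0},
]
file_b = [
    {"name": "Acme Hardware Inc", "y": 21.0},
    {"name": "Baker Street Foods", "y": 41.0},
    {"name": "Central Plumbing Co", "y": 61.0},
    {"name": "Smith John and Sons Ltd", "y": 81.0},
]

print(sorted(recursive_link(file_a, file_b).items()))
```

In this toy data the name-only pass links only the three records with identical names. The fourth pair is recovered after the predicted value is added to file A, because its quantitative agreement compensates for the weak name agreement.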