Approximate String Comparison and Its Effect on an Advanced Record Linkage System
Edward H. Porter and William E. Winkler, Bureau of the Census
KEY WORDS: string comparator, bigram, assignment algorithm, EM algorithm, latent class.
ABSTRACT
This paper examines various methods of string comparison for dealing with typographical error, models
their relationship to the main likelihood ratio used in the Fellegi-Sunter decision rule, and shows how they
improve matching performance.