Error Localization and Implied Edit Generation for Ratio and Balancing Edits

Maria Garcia

KEY WORDS: editing, error localization, Fellegi-Holt Model

ABSTRACT

The U.S. Census Bureau has developed SPEER software that applies the Fellegi-Holt editing method to economic establishment surveys under ratio edits and a limited form of balancing. It is known that more than 99% of economic data require only these basic forms of edits. If implicit edits are available, then Fellegi-Holt methods have the advantage that they determine the minimal number of fields to change (error localize) so that a record satisfies all edits in one pass through the data. In most situations, implicit edits are not generated because the generation requires days to months of computation. In some situations when implicit edits are not available, Fellegi-Holt systems use pure integer programming methods to solve the error localization problem directly and slowly (1-100 seconds per record). With only a small subset of the needed implicit edits, the current version of SPEER (Draper and Winkler 1997; upwards of 1000 records per second) applies ad hoc heuristics that find error-localization solutions that are not optimal for as many as five percent of the edit-failing records. To maintain the speed of SPEER and do a better job of error localization, we apply the Fourier-Motzkin method to generate a large subset of the implied edits prior to error localization. In this paper, we describe the theory, computational algorithms, and results from evaluating the feasibility of this approach.
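
As an illustration of how an implied edit can arise from Fourier-Motzkin elimination, the following is a minimal sketch and not the SPEER implementation. It assumes each edit is written as a linear inequality a.x <= b over the record's fields, so that a ratio edit x0/x1 <= 2 with positive fields becomes x0 - 2*x1 <= 0. The function name `eliminate`, the edit representation, and the example bounds are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product


def eliminate(edits, k):
    """One Fourier-Motzkin step: eliminate field k from edits of the form a.x <= b.

    Each edit is a pair (coeffs, bound), where coeffs is a list of Fractions.
    Returns the edits that do not involve field k plus the implied edits
    obtained by pairing every edit with a positive coefficient on field k
    with every edit with a negative coefficient on field k.
    """
    pos = [e for e in edits if e[0][k] > 0]
    neg = [e for e in edits if e[0][k] < 0]
    zero = [e for e in edits if e[0][k] == 0]
    implied = []
    for (a, b), (c, d) in product(pos, neg):
        # Positive multipliers chosen so the coefficients on field k cancel.
        s, t = 1 / a[k], 1 / -c[k]
        coeffs = [s * ai + t * ci for ai, ci in zip(a, c)]
        implied.append((coeffs, s * b + t * d))
    return zero + implied


F = Fraction
# Hypothetical ratio edits on three positive fields x0, x1, x2:
#   x0 / x1 <= 2  ->  x0 - 2*x1 <= 0
#   x1 / x2 <= 3  ->  x1 - 3*x2 <= 0
edits = [
    ([F(1), F(-2), F(0)], F(0)),
    ([F(0), F(1), F(-3)], F(0)),
]

# Eliminating x1 yields (1/2)*x0 - 3*x2 <= 0, i.e. the implied ratio edit x0 / x2 <= 6.
for coeffs, bound in eliminate(edits, 1):
    print(coeffs, "<=", bound)
```

In this sketch, eliminating the shared field x1 combines the two explicit ratio edits into one implied edit relating x0 and x2 directly. Repeating the step for other fields generates further implied edits, and the number of candidate edits can grow quickly, which is consistent with the computational cost noted above.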

Source: U.S. Census Bureau, Statistical Research Division

Created: 02-SEP-2003
Last revised: September 03 2003