2007 Economic Census: Data Processing and Treatment of Nonresponse

Data Processing and Treatment of Nonresponse

To prepare 2007 Economic Census data for release to the public, the data are processed in three primary ways:

Data Edits - to detect reporting errors and other problems
Nonresponse Imputation - to estimate missing data
Tabulation and Analytical Processing - to tabulate and analyze summary data and prevent disclosure of respondents’ identities

Data Edits

Data captured in an economic census must be edited to identify and correct reporting errors and other problems. The data also must be adjusted to account for missing items and for businesses that do not respond. Data edits detect problems and validate data by considering factors such as historical reporting, industry and geographic ratios and averages, and the proper classification for a given record.

Respondents’ answers pass through a series of computerized data edit programs. These programs assign a valid kind-of-business or industry classification code to each establishment. The assignment depends on computer evaluation of the responses to specific items on the census report forms.

These items include:

self-designated kind-of-business check-box classifications,
responses to product lines sold by a retail establishment,
products manufactured by a plant and
entries written in by the respondent explaining the establishment’s activities.

If critical information is missing, the record is flagged and fixed before further processing occurs.

If critical information is available, the edit assigns the correct classification code. After classification codes are assigned, a "verification" operation is performed to validate the industry, geography and ZIP Codes.
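
The classification step can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the source does not say in what order the report-form items are consulted, so the priority order (check box, then largest product line, then write-in text), the function name, the keyword table and the sample codes are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical keyword lookup for write-in descriptions (illustration only).
WRITE_IN_KEYWORDS = {
    "grocery": "445110",
    "bakery": "311811",
}


def assign_classification(checkbox_code: Optional[str],
                          product_line_sales: Dict[str, float],
                          write_in: str) -> Optional[str]:
    """Return a classification code, or None when critical information is missing.

    product_line_sales maps a candidate classification code to the sales
    reported for the product lines associated with that code (assumed layout).
    """
    # 1. Use the self-designated check-box classification when present.
    if checkbox_code:
        return checkbox_code
    # 2. Otherwise use the code tied to the largest reported product line.
    if product_line_sales:
        return max(product_line_sales, key=product_line_sales.get)
    # 3. Otherwise look for a known keyword in the respondent's write-in entry.
    text = write_in.lower()
    for keyword, code in WRITE_IN_KEYWORDS.items():
        if keyword in text:
            return code
    # Nothing usable: the caller should flag the record for correction.
    return None
```

A return value of None corresponds to the case where critical information is missing and the record must be flagged and fixed before further processing.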

The data edits also evaluate the response data for consistency and validity, for example by ensuring that employment data are consistent with payroll or sales/receipts data. Response data are evaluated by industry. Additional checks compare current-year data to data reported in previous censuses or obtained from administrative sources.
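
A consistency edit of this kind might look like the sketch below, which tests whether payroll per employee falls inside an industry-specific range. The function name, the status strings and the bounds are assumptions for illustration; the source does not publish the actual edit rules or tolerances.

```python
def payroll_per_employee_check(employment, annual_payroll, low, high):
    """Classify a record as 'ok', 'inconsistent' or 'missing'.

    low and high are hypothetical industry-level bounds on annual payroll
    per employee, for example derived from historical reporting.
    """
    if employment is None or annual_payroll is None:
        return "missing"
    if employment == 0:
        # No employees should mean no payroll.
        return "ok" if annual_payroll == 0 else "inconsistent"
    ratio = annual_payroll / employment
    return "ok" if low <= ratio <= high else "inconsistent"
```

For example, with assumed bounds of 15,000 and 120,000 dollars, a record reporting 10 employees and 400,000 dollars of payroll passes, while one reporting 10 employees and 4,000,000 dollars is marked inconsistent.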

Nonresponse Imputation

Nonresponse is handled by estimating, or imputing, missing data.
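
The source does not name the imputation method used. As one common possibility, the sketch below fills a missing payroll value by applying the payroll-to-employment ratio of complete records in the same industry. The field names and the choice of ratio imputation are assumptions, not a description of the Census Bureau's procedure.

```python
def impute_payroll(records):
    """Return copies of records with missing payroll filled by an industry ratio.

    Each record is a dict with 'industry', 'employment' and 'payroll' keys
    (hypothetical layout). An 'imputed' key marks values that were estimated.
    """
    # Accumulate payroll and employment totals from complete records.
    sums = {}
    for r in records:
        if r["payroll"] is not None and r["employment"]:
            pay, emp = sums.get(r["industry"], (0.0, 0))
            sums[r["industry"]] = (pay + r["payroll"], emp + r["employment"])

    result = []
    for r in records:
        r = dict(r)  # leave the input untouched
        r["imputed"] = False
        can_impute = (r["payroll"] is None
                      and r["employment"] is not None
                      and r["industry"] in sums)
        if can_impute:
            pay, emp = sums[r["industry"]]
            r["payroll"] = r["employment"] * pay / emp
            r["imputed"] = True
        result.append(r)
    return result
```

Keeping an explicit imputed marker on each record makes it possible to measure later how much of a published total rests on estimated values.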

There are two types of nonresponse:

Unit nonresponse occurs when no data have been collected for the respondent.
Item nonresponse occurs when some but not all data have been collected for the respondent.
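
The distinction between the two types can be expressed as a small check. This is a sketch only; the record layout and the list of required items are hypothetical.

```python
def nonresponse_type(record, required_items):
    """Return 'unit', 'item' or 'complete' for one respondent's record.

    record maps item names to reported values; None or a missing key means
    the item was not reported.
    """
    reported = [item for item in required_items if record.get(item) is not None]
    if not reported:
        return "unit"      # no data collected at all
    if len(reported) < len(required_items):
        return "item"      # some, but not all, data collected
    return "complete"
```

For instance, a record that reports employment but not payroll or sales would be classified as item nonresponse, while an empty record would be unit nonresponse.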

Title 13 of the United States Code states that respondents are required to answer all questions to the best of their ability. Reported data are important because they represent the answers of similar establishments in a given industry. Incomplete forms, unclear data or nonresponse can affect data analyses and the quality of the published data.

Problems that arise from missing data include:

Analyses of data sets with missing data are more problematic than analyses of complete data sets.
Analyses can be inconsistent with one another because analysts compensate for missing data in different ways and may base their analyses on different subsets of data.
The presence of nonresponse is unlikely to be completely random; thus, estimates of imputed parameters may be biased.

Although economic census nonresponse accounts for less than five percent of published figures, it is a significant source of nonsampling error.

Note: If a data cell contains too much imputation, the value will be suppressed with an ‘S’ flag.
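
The suppression rule in the note can be sketched as follows. The source does not state how much imputation counts as "too much", so the 50 percent threshold used here is purely an assumed placeholder, as are the function and parameter names.

```python
def publish_cell(reported_total, imputed_total, max_imputed_share=0.5):
    """Return (value, flag) for a data cell.

    The cell is suppressed with an 'S' flag when the imputed share of the
    total exceeds max_imputed_share (a hypothetical threshold).
    """
    total = reported_total + imputed_total
    if total > 0 and imputed_total / total > max_imputed_share:
        return (None, "S")
    return (total, "")
```

Under this assumed threshold, a cell with 900 reported and 100 imputed is published as 1,000, while a cell with 200 reported and 800 imputed is withheld and shown as S.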

Tabulation and Analytical Processing

Individual establishment records are tabulated in different ways based on data product and analytical needs. Tabulations include data summed by industry, specified geographic areas, establishment-size, products produced, materials used, fuels used and product lines sold.

The tabulations are subject to disclosure analysis prior to macro-analysis.

During macro-analysis:

units of measure are converted from collected to disseminated units,
a variety of data flags and symbols are set and
data fields are renamed for dissemination.
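
The three macro-analysis steps can be sketched together. All specifics here are assumptions for illustration: the conversion from dollars to thousands of dollars, the "Z" flag for a positive value that rounds to zero, and the field names are hypothetical and are not taken from the source.

```python
def prepare_for_dissemination(row, value_fields, rename_map, divisor=1000):
    """Convert units, set flags and rename fields for one tabulated row.

    value_fields: names of numeric fields to convert (collected units are
    divided by divisor to obtain disseminated units).
    rename_map: mapping from internal field names to published names.
    """
    out = {}
    for name, value in row.items():
        new_name = rename_map.get(name, name)
        if name in value_fields:
            converted = round(value / divisor)
            out[new_name] = converted
            # Hypothetical flag: a positive value that rounds to zero.
            out[new_name + "_FLAG"] = "Z" if (converted == 0 and value > 0) else ""
        else:
            out[new_name] = value
    return out
```

Performing the conversion, flagging and renaming in a single pass keeps each published value next to the flag that qualifies it.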
