State and Local Government Employment and Payroll
March 2004

 


Data Processing
  Editing is a process that ensures survey data are accurate, complete, and consistent. Efforts are made at all phases of collection, processing, and tabulation to minimize errors.

Although some edits are built into the Internet data collection instrument and the data entry programs, most edits are performed after the case has been loaded into the Census Bureau’s database. Edits are primarily of two types: consistency edits and current year/prior year edits, which examine the ratio of the current year’s reported value to the prior year’s value.

The consistency edits check the logical relationships among data items reported on the form. For example, if a value exists for employees for a function, then a value must also exist for payroll. If part-time employees and payroll are reported, then part-time hours must be reported, and vice versa.
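The two consistency rules described above can be sketched in a few lines of Python. This is only an illustration: the field names (`employees`, `payroll`, `pt_employees`, `pt_payroll`, `pt_hours`) are hypothetical, and treating a zero or missing value as "not reported" is an assumption, not the Census Bureau's actual edit logic.

```python
def consistency_failures(record):
    """Return the consistency-edit failures for one function-level record.

    Field names are hypothetical; a missing or zero value is treated
    as "not reported" (an assumption made for this sketch).
    """
    failures = []

    # Rule 1: employees reported for a function require a payroll value.
    if record.get("employees", 0) > 0 and record.get("payroll", 0) <= 0:
        failures.append("employees reported without payroll")

    # Rule 2: part-time employees and payroll require part-time hours,
    # and part-time hours require part-time employees and payroll.
    has_pt_emp_and_pay = (
        record.get("pt_employees", 0) > 0 and record.get("pt_payroll", 0) > 0
    )
    has_pt_hours = record.get("pt_hours", 0) > 0
    if has_pt_emp_and_pay != has_pt_hours:
        failures.append("part-time employees/payroll and hours do not agree")

    return failures
```

A record that passes both rules returns an empty list; each failed rule adds one message that an analyst could review.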

The current year/prior year edits compare the number of employees, the function reported for the employees, and the average salary between reporting years. If data fall outside acceptable tolerance levels, the item is flagged for review. Additional checks compare data from the Annual Finance Survey with data reported on the Annual Survey of Government Employment and Payroll to verify that employees reported at a particular function on the Annual Survey of Government Employment and Payroll have a corresponding expenditure on the Finance Survey.
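As a rough illustration of a current year/prior year edit, the sketch below flags an item when the ratio of the current value to the prior value leaves a tolerance band. The tolerance limits (0.75 and 1.25), the field names, and the handling of a zero prior value are all assumptions chosen for the example; the actual tolerance levels are not given here.

```python
def ratio_out_of_tolerance(current, prior, low=0.75, high=1.25):
    """True if current/prior falls outside [low, high] (assumed limits)."""
    if prior == 0:
        # No ratio can be formed; flag only if a value is now reported.
        return current != 0
    ratio = current / prior
    return not (low <= ratio <= high)


def current_prior_flags(current, prior):
    """Compare employees and average salary between two reporting years."""
    flags = []
    if ratio_out_of_tolerance(current["employees"], prior["employees"]):
        flags.append("employees")

    # Average salary is taken here as payroll divided by employees.
    cur_avg = current["payroll"] / current["employees"] if current["employees"] else 0
    pri_avg = prior["payroll"] / prior["employees"] if prior["employees"] else 0
    if ratio_out_of_tolerance(cur_avg, pri_avg):
        flags.append("average salary")
    return flags
```

Flagged items would then go to an analyst for review rather than being changed automatically.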

For both types of edits, analysts review the edit results and adjust the data when needed. When the analyst is unable to resolve or accept the edit failure, the respondent is contacted to verify or correct the reported data.

    Imputation:
  Not all respondents answer every item on the questionnaire. There are also questionnaires that are not returned despite efforts to gain a response. Imputation is the process of filling in missing or invalid data with reasonable values in order to have a complete data set.

For general purpose governments and for schools, the imputations were based on recent historical data, if available, from either a prior year annual survey or the most recent Census of Governments. These data were adjusted by a growth rate determined from the growth of units similar to the nonrespondent in size, geography, and type of government. If no recent historical data were available, the imputations were based on the data from a randomly selected donor that was similar to the nonrespondent. The donor’s data were adjusted by dividing each data item by the population (or enrollment) of the donor and multiplying the result by the nonrespondent’s population (or enrollment). For special districts, if prior year data were available, the data were brought forward with a national level growth rate applied. Otherwise, the data were imputed to be zero. In cases where good secondary data sources existed, the data from those sources were used.
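The two imputation paths described above (historical data carried forward with a growth rate, and donor data scaled by population or enrollment) might look like the following sketch. The function names are hypothetical, and the growth rate is assumed to be the ratio of current to prior totals among similar responding units, which is one plausible reading rather than the documented formula.

```python
def growth_rate(similar_units):
    """Assumed definition: current total / prior total over similar units."""
    current_total = sum(u["current"] for u in similar_units)
    prior_total = sum(u["prior"] for u in similar_units)
    return current_total / prior_total


def impute_from_history(prior_value, rate):
    """Carry a nonrespondent's historical value forward by a growth rate."""
    return prior_value * rate


def impute_from_donor(donor_value, donor_size, recipient_size):
    """Scale a donor's value by relative population (or enrollment)."""
    return donor_value / donor_size * recipient_size


# Example: similar units grew from 300 to 330 employees in total.
similar = [{"prior": 100, "current": 110}, {"prior": 200, "current": 220}]
rate = growth_rate(similar)
imputed_history = impute_from_history(50, rate)

# Example: donor has 1,200 employees and population 3,000;
# the nonrespondent has population 1,500.
imputed_donor = impute_from_donor(1200, 3000, 1500)
```

For special districts, the same `impute_from_history` step would apply with a national level growth rate, and a unit with no prior year data would simply receive zero.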

Beginning with the 2002 Census, individual unit imputed data are no longer available on the data files released to the public.

    Estimation:
  Estimation is the process by which sample data are used to indicate the value of an unknown quantity in a population. The publications for employment statistics present totals and per capita ratios. A simple unbiased estimate of the total for each variable can be obtained by multiplying each unit's value of the variable by the unit's weight and summing the results over all units in the sample.
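A minimal sketch of the simple unbiased estimate follows, assuming each sample unit is a dictionary carrying a `weight` and the variable of interest. The field names and figures are invented for illustration.

```python
def weighted_total(units, variable):
    """Sum of weight * value over all sample units."""
    return sum(u["weight"] * u[variable] for u in units)


# Hypothetical sample: one certainty unit (weight 1) and two sampled units.
sample = [
    {"weight": 1.0, "employees": 100},
    {"weight": 2.5, "employees": 40},
    {"weight": 4.0, "employees": 10},
]
estimated_employees = weighted_total(sample, "employees")
```

Here the three units contribute 100, 100, and 40, for an estimated total of 240 employees.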

Generally, the value of each variable from the 2002 Census of Governments was used to adjust the current year sample estimate by a factor, which accounts for how much the sample under- or over-estimated the census total. This factor may reduce the variability of the estimate. However, there were some exceptions. The simple unbiased estimate was used for all of the variables in four small states (Delaware, Hawaii, Nevada, Rhode Island), Washington, D.C., and any cases where there were fewer than 20 units with a weight greater than 1 contributing to the estimate.
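The exact form of the adjustment factor is not given above. One common approach that matches the description is a ratio adjustment: divide the known census total by the sample's weighted estimate of that same census total, and multiply the current year estimate by the result. The sketch below assumes that form and uses hypothetical field names; only the fallback threshold (fewer than 20 units with a weight greater than 1) comes from the text.

```python
def adjusted_estimate(units, variable, census_variable, census_total, min_units=20):
    """Ratio-adjusted total, falling back to the simple weighted estimate.

    The ratio form of the factor is an assumption made for this sketch.
    """
    simple = sum(u["weight"] * u[variable] for u in units)

    # Count contributing units with a weight greater than 1.
    n_sampled = sum(1 for u in units if u["weight"] > 1)

    # Weighted sample estimate of the census-year total.
    sample_census = sum(u["weight"] * u[census_variable] for u in units)

    if n_sampled < min_units or sample_census == 0:
        return simple

    # Factor > 1 means the sample under-estimated the census total.
    factor = census_total / sample_census
    return simple * factor


# Hypothetical data: 20 units, weight 2, 10 employees now, 8 in the census.
units = [{"weight": 2.0, "employees": 10, "census_employees": 8} for _ in range(20)]
estimate = adjusted_estimate(units, "employees", "census_employees", census_total=400)
```

In this example the sample estimates the census total as 320 against a true 400, so the factor is 1.25 and the simple estimate of 400 is raised to 500. The states and areas listed as exceptions would bypass the adjustment in the same way as the small-count fallback.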

    Variance:
  Data derived from the annual sample survey are subject to sampling error. The statistics in this report that are based wholly or partly on sample data are apt to differ from the results of a census covering all governments. The particular sample used is one of a large number of possible samples of the same size that could have been selected using the same sample design, and each of these samples would yield somewhat different results.

The standard error measures the variation among the estimates from all possible samples. It is therefore a measure of the precision with which an estimate from a particular sample approximates the average result of all possible samples. Each viewable table contains a column giving the coefficients of variation (CVs), also called relative standard errors, computed for these estimates. The coefficient of variation is the estimated standard error expressed as a percent of the estimated total or proportion.

State government employment and payroll data are not subject to sampling. Consequently, state-local aggregates shown here for individual states are more reliable (on a relative standard error basis) than the local government estimates they include.

The CVs presented in the tables can be used to derive the standard error of the estimate. The standard error can then be used to derive interval estimates with prescribed levels of confidence that the interval includes the average results of all samples:

a. intervals defined by one standard error above and below the sample estimate will contain the true value about 68 percent of the time;

b. intervals defined by 1.6 standard errors above and below the sample estimate will contain the true value about 90 percent of the time;

c. intervals defined by two standard errors above and below the sample estimate will contain the true value about 95 percent of the time.

The user can calculate the standard error by multiplying the CV presented in the tables by the corresponding estimate. The CVs presented in the tables are in percentage form and must be divided by 100 before being multiplied by the estimate. To obtain a 90 percent interval estimate, multiply this standard error by 1.6, then add the result to the estimated total for the upper bound and subtract it from the estimated total for the lower bound.
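The calculation described above can be written as a short Python function. The function name and the example figures are illustrative only; the default multiplier of 1.6 follows the 90 percent rule stated in the text.

```python
def interval_from_cv(estimate, cv_percent, multiplier=1.6):
    """Return (lower, upper) bounds from an estimate and its CV in percent."""
    # CVs are published as percentages, so divide by 100 first.
    standard_error = estimate * cv_percent / 100
    margin = multiplier * standard_error
    return estimate - margin, estimate + margin


# Hypothetical table entry: estimated total of 50,000 with a CV of 2.0 percent.
lower, upper = interval_from_cv(50000, 2.0)
```

With these figures the standard error is 1,000, so the 90 percent interval runs from 48,400 to 51,600. Passing `multiplier=1` or `multiplier=2` gives the 68 percent and 95 percent intervals listed above.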


 
Source: U.S. Census Bureau, Governments Division,
Created: 03-25-2005
Last revised: June 21 2005