Federal Employment Statistics

2002 Publication of Employment by Geographic Area


APPENDIX II
DATA SOURCES, SURVEY METHODOLOGY & ERROR ANALYSIS

ACKNOWLEDGMENTS

The project manager for this report was statistician Christine E. Steele. The survey and publication were prepared under the direction of Deputy Associate Director Nancy H. Kichak and Group Manager Gary A. Lukowski. We gratefully acknowledge the contributions of Zoraida V. Arledge, Eulus S. Moore, Randall T. Matke, Matthew S. Walters, and Carol W. Goodroe.

INTRODUCTION

Associated with every statistical survey are different types of errors affecting each stage of the survey, from initial data collection to the printing of the publication. This Appendix covers areas of potential errors arising from various survey operations. When possible, we estimated the size of the error. When no information was available about the sources and potential impact of error, we noted it. We did not attempt to determine how the errors interact. The user of this publication should consider the various data limitations discussed below.

DATA SOURCES

Most of the data in this publication came from the Central Personnel Data File; Federal agencies reported the additional data. These two data sources are described briefly below.

Central Personnel Data File. The Central Personnel Data File (CPDF), started in 1972 and maintained by the U.S. Office of Personnel Management (OPM), is an automated system of individual records for most Federal civilian employees. It is updated quarterly from employment status files submitted by the agencies. We edit all input files for validity and insert functional blanks into data fields that contain errors. In December 2002, CPDF contained 1,843,738 active records on non-Postal Federal civilian employees.

Non-CPDF Data Collection. Non-CPDF data were collected through three procedures. First, the U.S. Postal Service (USPS) and Tennessee Valley Authority submitted individual employee records on magnetic tape; we converted these data to the CPDF record format. Second, most of the other non-CPDF agencies and the CPDF agencies with foreign nationals employed overseas submitted summarized data on OPM Form 1312 with geographic location, pay system category, work schedule (if located in United States), and U.S. citizenship (if located overseas). Third, we generated geographic survey data for the Congress, Architect of the Capitol, Botanic Gardens, Library of Congress, General Accounting Office, Congressional Budget Office, Office of Compliance, U.S. Court of Appeals for Veterans Claims, Commission on Security and Cooperation in Europe, Judicial Branch, White House Office, Office of the Vice President, Office of Policy Development, Board of Governors of the Federal Reserve System, Postal Rate Commission, Panama Canal Commission, Vietnam Education Foundation, and White House Commission on the National Moment of Remembrance, mostly from their December 2002 Monthly Report of Federal Civilian Employment (SF 113-A).


DATA CORRECTIONS

Automated Edits. Records for all active employees were extracted from the CPDF, excluding seasonal employees (except for the 13,589 teachers and staff of the Department of Defense Education Activity, who were included). Special extracts were prepared from CPDF to reflect the seasonal workforces (see pages 12-14).

CPDF records with unspecified geographic location received special processing: we matched them against the CPDF error files (produced by the CPDF edits) and substituted back the original geographic location and U.S. citizenship code. The subsequent geographic survey edits rejected these geographic locations again, but restoring the original values gave us data that could be corrected; unspecified data cannot be corrected.

After the above special processing, CPDF records along with all the other data submitted for the survey were edited by an automated procedure to meet criteria specific to this report. The edits identify and reject all employee records with unspecified or invalid data for geographic location, U.S. citizenship (for overseas locations), work schedule (for USA locations), pay plan, agency, and agency subelement (for Defense Department in the main survey and for other agencies in the special analysis of the seasonal workforce).
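The edit criteria above can be sketched as a per-record check. This is only an illustration: the field names, the `overseas` flag, and the code sets in `valid_codes` are hypothetical, because this appendix does not list the actual CPDF field layouts or edit lookup tables, and the agency subelement check is omitted for brevity.

```python
# Illustrative sketch only: field names and valid-code sets are hypothetical,
# not the actual CPDF layout or edit lookup tables.

def failed_fields(record, valid_codes):
    """Return the fields of one employee record that fail the survey edit."""
    required = ["geographic_location", "pay_plan", "agency"]
    # U.S. citizenship is checked for overseas locations,
    # work schedule for USA locations.
    if record.get("overseas"):
        required.append("us_citizenship")
    else:
        required.append("work_schedule")

    failed = []
    for field in required:
        value = record.get(field)
        # A missing or blank value is "unspecified"; an unknown code is "invalid".
        if value is None or value == "" or value not in valid_codes[field]:
            failed.append(field)
    return failed


# Made-up code sets and records, for demonstration only.
valid_codes = {
    "geographic_location": {"LOC-A", "LOC-B"},
    "pay_plan": {"GS", "WG"},
    "agency": {"AG01"},
    "work_schedule": {"F", "P"},
    "us_citizenship": {"1", "2"},
}
good = {"geographic_location": "LOC-A", "pay_plan": "GS", "agency": "AG01",
        "overseas": False, "work_schedule": "F"}
bad = {"geographic_location": "", "pay_plan": "ZZ", "agency": "AG01",
       "overseas": True, "us_citizenship": "1"}

print(failed_fields(good, valid_codes))  # []
print(failed_fields(bad, valid_codes))   # ['geographic_location', 'pay_plan']
```

A record with an empty result would be accepted; any other record would go to the error listing for possible correction.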

Corrections. We analyzed the error listings and made 15,980 corrections to increase the coverage of the report and ensure the most accurate representation of the status of agencies and employment. No corrections were needed for agency or U.S. citizenship codes. Defense subelement was corrected for three records. The 432 CPDF records with unspecified pay plan were assigned to "other" pay plans. Unspecified work schedule was changed to full-time for three employee records.

The unspecified State code 99 was assigned to 4 records with unspecified or erroneous geographic locations. State code was corrected for 2 records. U.S. Territory codes were corrected for 12 records, and 599 erroneous foreign country codes in agency reports were corrected. County codes were updated for 12,589 records (mostly U.S. Postal Service). City errors were corrected for 88 records whose place name was available in agency reports; the other 2,248 city or place code errors (mostly U.S. Postal Service) were changed to unspecified.
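As a cross-check, the individual correction counts reported in the two paragraphs above can be summed and compared with the stated total of 15,980. The dictionary labels are descriptive shorthand chosen for this sketch, not official category names.

```python
# Correction counts as reported in the text; labels are informal shorthand.
corrections = {
    "Defense subelement corrected": 3,
    "unspecified pay plan assigned to 'other'": 432,
    "unspecified work schedule set to full-time": 3,
    "State code 99 assigned": 4,
    "State code corrected": 2,
    "U.S. Territory code corrected": 12,
    "foreign country code corrected": 599,
    "county code updated": 12_589,
    "city corrected from agency report name": 88,
    "city or place code changed to unspecified": 2_248,
}

total = sum(corrections.values())
print(total)  # 15980
```

The itemized counts add up to the reported total, so the list of corrections in the text appears to be complete.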

BENCHMARKS

Table I below shows the distribution of records used for the December 2002 survey (after making all data corrections) compared to benchmarks (BM).

Table I: Geographic Survey Benchmarks

Source                  Survey       Benchmark    % of BM

CPDF                    1,816,839    1,850,827      98.16
USPS                      881,491      820,749     107.40
Agencies                   75,646       77,725      97.33
Noncitizens Overseas       25,691       25,879      99.27

Total                   2,799,667    2,775,180     100.88

This survey achieved 100.88 percent coverage worldwide when benchmarked against December 2002 data collected on the Monthly Report of Federal Civilian Employment (Standard Form 113-A). Data in this survey were benchmarked against the December 2002 113-A data adjusted to geographic survey coverage by including all intermittent employees.
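The "% of BM" column in Table I is the survey count divided by the benchmark count, expressed as a percentage. A short sketch that recomputes the column from the published counts follows; the function and variable names are chosen for this example only.

```python
# (survey, benchmark) counts copied from Table I.
table_i = {
    "CPDF": (1_816_839, 1_850_827),
    "USPS": (881_491, 820_749),
    "Agencies": (75_646, 77_725),
    "Noncitizens Overseas": (25_691, 25_879),
}

def pct_of_benchmark(survey, benchmark):
    """Survey count as a percentage of the benchmark count."""
    return 100 * survey / benchmark

# Per-source coverage, printed to two decimals as in the table.
for source, (survey, benchmark) in table_i.items():
    print(f"{source}: {pct_of_benchmark(survey, benchmark):.2f}")

# Totals are the column sums; coverage is computed on the sums.
survey_total = sum(s for s, _ in table_i.values())
benchmark_total = sum(b for _, b in table_i.values())
print(survey_total, benchmark_total)                               # 2799667 2775180
print(f"{pct_of_benchmark(survey_total, benchmark_total):.2f}")   # 100.88
```

Note that the worldwide figure is the ratio of the totals, not an average of the four row percentages.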

Some agency survey totals differed from their benchmark totals by more than 5 percent. Most were small agencies, where small absolute differences yielded percentage differences greater than 5 percent. The Treasury, Agriculture, Interior, and Small Business Administration survey totals were more than 5 percent lower than their benchmark totals because the survey excludes their seasonal employees. U.S. Postal Service survey data include inactive employees.

After completion of all the editing and data corrections, we merged the accepted CPDF and non-CPDF data and improved place names for 343 records in five locations. The statistical table outputs were then generated. Checks were made for consistency among the reports for the current survey. The 2002 data were also compared with the 2000 survey data. Differences in category totals more than 5 percent above or below the overall change were investigated to determine the cause of 2000 to 2002 changes.
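The year-to-year comparison can be illustrated with a small sketch. It assumes one reading of the rule, namely that a category is flagged when its percentage change differs from the overall percentage change by more than 5 percentage points; the appendix does not spell out the exact formula. The category names and counts below are invented for demonstration and are not survey data.

```python
def flag_categories(prev_totals, curr_totals, threshold=5.0):
    """Flag categories whose percent change deviates from the overall
    percent change by more than `threshold` percentage points.
    (Assumed interpretation of the 5 percent rule.)"""
    prev_sum = sum(prev_totals.values())
    curr_sum = sum(curr_totals.values())
    overall = 100 * (curr_sum - prev_sum) / prev_sum

    flagged = {}
    for name, prev in prev_totals.items():
        change = 100 * (curr_totals[name] - prev) / prev
        if abs(change - overall) > threshold:
            flagged[name] = round(change, 2)
    return overall, flagged


# Invented example counts, not actual 2000 or 2002 survey data.
totals_2000 = {"Category A": 1000, "Category B": 2000, "Category C": 500}
totals_2002 = {"Category A": 1010, "Category B": 2040, "Category C": 560}

overall, flagged = flag_categories(totals_2000, totals_2002)
print(f"{overall:.2f}")  # 3.14
print(flagged)           # {'Category C': 12.0}
```

Here the overall change is about 3.14 percent, so only Category C, which grew 12 percent, would be investigated.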

SOURCES OF ERROR

Data Collection. Each quarter, all Federal agencies participating in the CPDF system update their agency files to document their employees' status. Timely file submission by the agencies may be prevented by the loss of automated input in transit to our agency or by system problems within the submitting agency.

To measure the completeness of agency employment reflected by the December 2002 Central Personnel Data File, the number of employee records by agency extracted for the survey was compared with the employment counts by agency (including all intermittents and excluding foreign nationals employed overseas) from an independent reporting system, the Monthly Report of Federal Civilian Employment (SF 113-A system). Comparing the 1,816,839 CPDF records available for the survey against the 1,850,827 adjusted 113-A data for the same agencies yielded a 1.8 percent difference. This difference is due to the 26,897 seasonals not in the main survey, definitional differences (such as effective dates) between the CPDF and 113-A populations, and miscellaneous reporting errors and processing errors in both the CPDF and 113-A data collection systems.
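The figures in this paragraph can be worked through directly. The sketch below computes the gap between the adjusted 113-A count and the CPDF survey count, expresses it as a percentage of the benchmark, and subtracts the excluded seasonal employees. Treating the remainder as the net effect of definitional differences and reporting or processing errors is a rough reading for illustration, not a figure stated in the source.

```python
cpdf_survey = 1_816_839   # CPDF records available for the survey
benchmark = 1_850_827     # adjusted 113-A count for the same agencies
seasonals = 26_897        # seasonal employees not in the main survey

gap = benchmark - cpdf_survey
gap_pct = 100 * gap / benchmark

# Remainder after removing the seasonal exclusion (assumed to be the net
# effect of definitional differences and reporting/processing errors).
residual = gap - seasonals

print(gap)               # 33988
print(f"{gap_pct:.1f}")  # 1.8
print(residual)          # 7091
```

On this reading, the seasonal exclusion accounts for most of the 1.8 percent difference.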

Survey Processing Errors. All known errors in the edit programs, edit lookup tables, and report programs were corrected.

Excluded Records. The main survey excluded these employee records: 26,897 seasonal employees as of December 31, 2002 (other than the Department of Defense Education Activity).

Data Quality. A September 1994 CPDF Accuracy Survey estimated the percentage of data in the CPDF that are valid (they pass CPDF system edits) but inaccurate (they differ from the Official Personnel Folder record). Table II below shows Governmentwide estimates for the key data elements included in the Geographic Survey.

Table II: Data Element Error Rates (September 1994 CPDF Accuracy Survey)

Data Element        Percentage Error

Agency                    0.7
Pay Plan                  0.7
Work Schedule             0.7
Location                  0.0
Pay Status                0.0
U.S. Citizenship          0.0

The results of the most recent CPDF accuracy evaluation show the percentage of valid but inaccurate data is about 1 percent for agency (all errors in agency subelement), pay plan (used for assigning pay system category), and work schedule and zero percent for geographic location, pay status (used for extracting CPDF employees in pay status), and U.S. citizenship.

The results of that CPDF data element evaluation cannot be directly applied to this Geographic Survey data because of timeframe and definitional differences in the survey populations; however, these presented results give an overall indication of the quality of the data in this publication. The quality of the non-CPDF data has not been measured.

Data Definition Changes. Appendix I covers the changes in definitions and coverage of agency, geographic location, and pay system since the previous survey; these changes can affect 2000 to 2002 data comparisons.

Redefinitions of a data element are a source of error that can be difficult to measure. New work schedule codes for seasonal employees were effective October 1981, and new work schedule codes for on-call employees were effective November 1985. Agencies had to identify these employees and submit work schedule code corrections for them. Furthermore, the agencies have to process personnel actions at the beginning and end of each work season to activate and de-activate seasonal employees; failure to do so is another potential source of error.
