National Science Foundation Division of Science Resources Statistics

Data Collection and Processing Procedures

 

SESTAT and CPS use different procedures to collect and process data. A detailed comparison of the procedures used for the two survey systems appears in appendix A. This section presents highlights of the comparison, including both differences and similarities in the data collection procedures and data processing procedures, and in the collection of data on academic degree, employment status, occupation, and other respondent categories.

Data Collection Procedures

  • All SESTAT sample members are asked to provide self-reports. CPS relies heavily on proxy reports; for example, a household respondent may provide information on each eligible household member. Interviewers are encouraged to ask individual household members to self-report on labor force participation. However, interviewers work under tight time constraints and are just as strongly encouraged to collect as much information as possible in one contact. Typically, just under one-half of the CPS data collected on labor force participation are provided by proxies.
  • The two survey systems use different modes of data collection. SESTAT data are collected primarily by mail and telephone, with some follow-up by personal interview. In the 1997 SESTAT, 61% of responses were obtained by mail, 37% by telephone, and 2% by personal interview. Most CPS data are collected through personal interviews during the first and fifth months of participation and through telephone interviews during the other months of participation.
  • SESTAT telephone and personal interviews use computer-assisted telephone interviewing (CATI) and computer-assisted personal interviewing (CAPI). All CPS data are collected with computer-assisted interviewing. The CATI and CAPI systems used in both SESTAT and CPS conduct internal consistency checks during survey administration. These computer-generated edit checks produce edit screens that ask the respondent to resolve or clarify discrepancies in the responses.
  • CPS uses "dependent interviewing," in which responses to selected questions collected on each household member during a prior month are used during subsequent rounds of data collection. SESTAT does not use dependent interviewing.
  • Both SESTAT and CPS require respondents to focus on a 1-week period of time (i.e., the reference week) as they answer the survey questions. In 1997, the SESTAT reference week was the week of 15 April 1997. The CPS reference week is always the week of the month that includes the 12th day of the month. This report used CPS data from the week of 12 April 1997.
  • The lengths of the data collection periods scheduled for the two survey systems differ. CPS collects data for about 1 week after the reference week. SESTAT data collection continues for several months after the reference week.
  • Both survey systems have low item nonresponse rates, especially for employment and occupation items. SESTAT has zero item nonresponse for the questions on working during the reference week, looking for work during the reference week, and occupation because SESTAT defines these items as critical completes that must be answered for the questionnaire responses to be included in the final data system. The remaining SESTAT question used to determine labor force status, whether the respondent is on layoff from a job, generally has an item nonresponse rate of 1% or less. The item nonresponse rates for January 1997 CPS data are 0.3% for labor force status and 1.7% for occupation.
  • Both SESTAT and CPS include respondents whose primary language is not English. Most SESTAT respondents are graduates of U.S. colleges and therefore have some English language skills. By comparison, CPS includes some respondents with limited English skills and is likely to capture more individuals without U.S. degrees than SESTAT. Language problems in CPS are expected to be more of an issue for respondents without bachelor's degrees than for respondents with bachelor's or higher degrees. CPS uses interviewers who live in the geographic area in which they interview, and some of these interviewers collect data in languages other than English.
  • Interviewers for both SESTAT and CPS receive extensive hands-on training on questionnaire administration, whereas respondents who self-administer the SESTAT questionnaire receive no training.
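
The reference-week rule described above can be illustrated with a short Python sketch. It assumes calendar weeks that run Sunday through Saturday, which the text does not state, and the function names are hypothetical.

```python
from datetime import date, timedelta
from typing import Tuple


def week_containing(day: date) -> Tuple[date, date]:
    """Return the (Sunday, Saturday) bounds of the week containing `day`.

    Assumption: weeks run Sunday through Saturday.
    """
    # date.weekday() gives Monday=0 ... Sunday=6, so shift to count
    # the days elapsed since the most recent Sunday.
    days_since_sunday = (day.weekday() + 1) % 7
    start = day - timedelta(days=days_since_sunday)
    return start, start + timedelta(days=6)


def cps_reference_week(year: int, month: int) -> Tuple[date, date]:
    """CPS rule from the text: the week that includes the 12th of the month."""
    return week_containing(date(year, month, 12))


# April 1997: the CPS week containing 12 April and the SESTAT week of 15 April.
cps_week = cps_reference_week(1997, 4)
sestat_week = week_containing(date(1997, 4, 15))
```

Under this week convention, the two 1997 reference weeks are adjacent rather than identical, so respondents in the two systems were reporting on slightly different periods.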

Data Processing Procedures

  • SESTAT and CPS implement similar data processing procedures. Data from CATI and CAPI interviews are "examined" during the interviews through the use of programmed range checks and internal consistency checks. Both survey systems conduct postcollection editing using computerized systems.
  • Although SESTAT and CPS follow many of the same steps in data processing, the techniques and rules for resolving problem cases vary. For example, SESTAT counts as a "noninterview" all cases that are missing one or more critical complete items (after attempted telephone follow-up), but CPS has no such rule. Furthermore, the two survey systems have important differences in the coding of occupation, as described in the section "Occupation Data."
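
As a rough illustration of the SESTAT critical-complete rule, the following Python sketch flags a case as a noninterview when any critical item is unanswered and computes an item nonresponse rate. The field names are hypothetical, and the sketch assumes every critical item applies to every respondent, ignoring questionnaire skip patterns and the telephone follow-up step.

```python
from typing import Dict, List, Optional

# Hypothetical field names for the three critical complete items named in
# the text: working, looking for work, and occupation.
CRITICAL_ITEMS = ("worked_ref_week", "looked_for_work", "occupation")

Response = Dict[str, Optional[object]]


def sestat_case_status(response: Response) -> str:
    """Classify a case as 'complete' or 'noninterview'.

    A case missing one or more critical items is a noninterview.
    """
    missing = [item for item in CRITICAL_ITEMS if response.get(item) is None]
    return "noninterview" if missing else "complete"


def item_nonresponse_rate(responses: List[Response], item: str) -> float:
    """Percentage of cases with no answer recorded for `item`."""
    if not responses:
        raise ValueError("at least one response is required")
    missing = sum(1 for r in responses if r.get(item) is None)
    return 100.0 * missing / len(responses)
```

Because cases that fail the rule are dropped before the final data system is built, the critical items have zero item nonresponse by construction among retained cases.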

Academic Degree Data

  • To be eligible for SESTAT, a sample member must have completed a bachelor's or higher degree in any field. CPS includes respondents with and without bachelor's or higher degrees.
  • CPS collects the highest level of school or highest degree completed but does not collect any other information about completed degrees, such as field of study. SESTAT collects the college/university, degree level, date, and field of study for degrees at the bachelor's or higher degree level. SESTAT also collects some information on associate's degrees as well as other educational activities.
  • For degree level, both CPS and SESTAT use the same categories of bachelor's, master's, doctorate, and other professional degree, so these data are expected to be consistent across survey systems.

Employment Status Data

  • Both survey systems gather data on workforce participation, including principal and secondary jobs, during the survey reference week. Although both survey systems ask similar questions about working for pay or profit during the survey reference week, the battery of questions used to determine labor force status is not the same in the two survey systems.
  • The definition of "employed" is similar but not identical in the two survey systems. The main difference is that CPS specifically asks about work on a family business or farm and classifies the individual as employed if he or she is working 15 hours or more per week or receiving profits, whereas SESTAT instructs respondents to include self-employment without any limitation of number of hours worked.
  • The two survey systems differ in the definition of "unemployed." For the SESTAT labor force variable, an individual who is not working is classified as unemployed if (1) the person is on layoff from a job or (2) the person was looking for work during the 4 weeks preceding the reference week. In CPS, an individual who is not working is classified as unemployed if (1) the person is on layoff from a job and has been given a date to return to work or has been given any indication of being recalled to work within the next 6 months or (2) the person has been trying to find work during the last 4 weeks and lists a job search method that could have brought him or her into contact with a potential employer.
  • Both SESTAT and CPS collect information on full-time or part-time employment status during the survey reference week. In both survey systems, the full-time or part-time status can be determined for either principal job alone or all jobs combined.
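
The two definitions of "unemployed" can be compared side by side in a minimal Python sketch. The field names and the "not in labor force" label for the remaining cases are assumptions, and the sketch treats "working" as already determined, so it does not model the differences in the definition of "employed."

```python
from dataclasses import dataclass


@dataclass
class Person:
    # All field names are hypothetical stand-ins for questionnaire items.
    working: bool
    on_layoff: bool = False
    recall_expected: bool = False       # CPS: return date given, or recall indicated within 6 months
    looked_last_4_weeks: bool = False
    active_search_method: bool = False  # CPS: method that could reach a potential employer


def sestat_status(p: Person) -> str:
    """SESTAT rule: on layoff OR looked for work in the preceding 4 weeks."""
    if p.working:
        return "employed"
    if p.on_layoff or p.looked_last_4_weeks:
        return "unemployed"
    return "not in labor force"


def cps_status(p: Person) -> str:
    """CPS rule: each path to 'unemployed' carries an extra condition."""
    if p.working:
        return "employed"
    if p.on_layoff and p.recall_expected:
        return "unemployed"
    if p.looked_last_4_weeks and p.active_search_method:
        return "unemployed"
    return "not in labor force"


# A person on layoff with no indication of recall is classified differently.
laid_off = Person(working=False, on_layoff=True)
```

In this sketch, `sestat_status(laid_off)` returns "unemployed" while `cps_status(laid_off)` returns "not in labor force", which shows how the narrower CPS conditions can shift cases between categories.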

Occupation Data

  • In SESTAT surveys, the respondent is asked to provide both a verbatim description of the occupation and a self-selected occupation code. In CPS, industry and occupation information is collected using open-ended questions and dependent interviewing.
  • SESTAT occupation coders review the respondent's self-selected code and occupation description, along with many other questionnaire items related to the respondent's job and education, to assign the best code for the occupation. Coders are instructed not to change self-selected codes unless sufficient evidence exists to indicate that the respondent has made a mistake and the information provided allows the assignment of a better code. CPS coders do not have a respondent self-selected code; instead, they assign codes based on the occupation description, job duties, and industry.
  • In the SESTAT survey system, occupation data are collected independently during each survey cycle without dependent interviewing. However, SESTAT coders on follow-up surveys are instructed to consider the best occupation code assigned in the previous cycle under certain conditions. In CPS, dependent interviewing for the industry and occupation questions is used for households that were included in the sample the previous month. Respondents who say they have the same employer and job duties as in the previous month are asked to verify the previous month's job description. If the job description is verified as correct, the previous month's occupation code is brought forward and no occupation coding is conducted. Another important difference is that the previously collected data are generally 1 month old in CPS, whereas they are 2 years old in SESTAT.
  • Different occupational taxonomies are used in the two survey systems. Because both taxonomies were developed from the 1980 Standard Occupational Classification maintained by the Bureau of Labor Statistics, the two taxonomies are generally consistent. However, whereas the SESTAT system uses broad categories for non-S&E jobs and more specific categories for S&E jobs, the CPS data are coded in both detailed and broad classifications for all jobs.
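
The CPS carry-forward step for occupation codes can be sketched in Python as follows. This is an illustration of the decision logic described above, not the actual CPS instrument; the function and parameter names, and the example code values, are hypothetical.

```python
from typing import Callable, Optional


def cps_occupation_code(
    in_sample_last_month: bool,
    same_employer_and_duties: bool,
    description_verified: bool,
    previous_code: Optional[str],
    code_description: Callable[[], str],
) -> str:
    """Return the occupation code for the current month.

    The previous month's code is brought forward only when the household
    was in the sample last month, the respondent reports the same employer
    and duties, and the prior job description is verified as correct.
    Otherwise the description is coded afresh via `code_description`.
    """
    can_carry_forward = (
        in_sample_last_month
        and same_employer_and_duties
        and description_verified
        and previous_code is not None
    )
    if can_carry_forward:
        return previous_code
    return code_description()


# Example: a verified, unchanged job keeps last month's (made-up) code.
kept = cps_occupation_code(True, True, True, "044", lambda: "055")
```

Passing the coding step as a callable means no new coding is performed when the prior code is carried forward, which mirrors the text's statement that no occupation coding is conducted in that case.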

Other Respondent Classifications

  • Respondent characteristics that can be used in analysis include sex, age, race, and ethnicity. SESTAT and CPS collect these data using slightly different methods. The main difference is that CPS collects some of these data by proxy, whereas SESTAT does not use the proxy collection method.
  • Both survey systems collect date of birth, which is coded as "age" for analysis. Both survey systems have procedures for resolving inconsistencies between information collected during the current survey cycle and information collected in previous cycles for date of birth.
  • Both SESTAT and CPS ask about race and Hispanic origin in separate questions. The SESTAT data for race and ethnicity come from the sampling frames or the baseline surveys, which include the 1990 decennial census long form for NSCG cases sampled from the 1990 decennial census, the Survey of Earned Doctorates (SED) for the SDR cases (with verification of responses during the 1993 SDR), and the NSRCG survey each cycle. The questions used to collect race and ethnicity differ slightly between these surveys.
  • The CPS race question is very similar to the NSRCG race question. CPS also collects the verbatim race responses provided by the respondents but edits any such responses back into the four main racial groups.
  • The three SESTAT sources of Hispanic origin data (1990 decennial census, SED, and NSRCG) ask directly whether the respondent is of Hispanic origin (with slightly different wording). In CPS, respondents are asked to select their origin (or the origin of some other household member) from a "flash card" that lists 20 ethnic origins. Individuals of Hispanic origin are those who indicated that their origin was Mexican American, Chicano, Mexican (Mexicano), Puerto Rican, Cuban, Central or South American, or other Hispanic.
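
The CPS derivation of Hispanic origin from the flash-card response amounts to a set-membership recode, sketched below in Python. The exact label strings and the non-Hispanic example are assumptions; the text lists the Hispanic categories but not the card's full wording.

```python
# Origins counted as Hispanic, as listed in the text. The exact label
# strings used on the flash card are an assumption.
HISPANIC_ORIGINS = frozenset(
    label.lower()
    for label in (
        "Mexican American",
        "Chicano",
        "Mexican (Mexicano)",
        "Puerto Rican",
        "Cuban",
        "Central or South American",
        "Other Hispanic",
    )
)


def is_hispanic_cps(origin: str) -> bool:
    """Recode a selected origin into a Hispanic / non-Hispanic flag."""
    return origin.strip().lower() in HISPANIC_ORIGINS


def is_hispanic_sestat(answer: bool) -> bool:
    """SESTAT sources ask directly, so the answer is used as given."""
    return answer
```

The contrast is that the CPS flag is derived from a multi-category selection, whereas the SESTAT sources record a direct yes/no answer.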

Conclusion

Differences between the SESTAT and CPS data collection procedures might influence the survey estimates, as discussed in detail in appendix A. However, the reinterview studies conducted on the SESTAT surveys did not find large differences between the initial interview and the second interview. Although this does not indicate that the respondent-reported information is correct, it does indicate consistency and can be used as a measure of data reliability.

SESTAT and CPS have similar data processing steps and procedures. However, the techniques and rules for resolving problem cases are different for the two survey systems. Some of the main differences are in the coding of occupation. Both CPS and SESTAT collect verbatim occupation descriptions using similar questions, and the coding taxonomies used in the two surveys are generally consistent with each other. However, the rest of the occupation coding process is different. These different processes could result in differences in the data for the two survey systems.


 
Comparison of the National Science Foundation's Scientists and Engineers Statistical Data System (SESTAT) with the Bureau of Labor Statistics' Current Population Survey (CPS)
Working Paper | SRS 07-205 | August 2007
National Science Foundation Division of Science Resources Statistics (SRS)
The National Science Foundation, 4201 Wilson Boulevard, Arlington, Virginia 22230, USA
Tel: (703) 292-8780, FIRS: (800) 877-8339 | TDD: (800) 281-8749
Last Updated: Jul 10, 2008