Annual Demographic Survey (March Supplement)

Source and Accuracy of the Data for the March 2001
Current Population Survey Microdata File


Table of Contents

SOURCE OF DATA
Basic CPS
March Supplement
Sample design
Sample redesign
Estimation procedure

ACCURACY OF THE ESTIMATES
Sampling error
Nonsampling error
Nonresponse
Coverage
Comparability of data
A Nonsampling error warning
Standard errors and their use
Estimating standard errors
Generalized Variance Parameters
Standard errors of estimated numbers
Standard errors of estimated percentages
Standard error of a difference
Standard error of an average for grouped data
Standard error of a ratio
Standard error of a median
Standard error of estimated per capita deficit
Accuracy of state estimates
Computation of standard errors for state estimates
Computation of a factor for groups of states
Computation of standard errors for data for combined years

Tables
CPS Coverage Ratios
Parameters for Computation of Standard Errors for Labor Force Characteristics: March 2001
a and b Parameters for Standard Error Estimates for People and Families: March 2001
Factors for State Standard Errors and Parameters and State Populations: 2001


SOURCE OF DATA

The data for this survey came from the March 2001 Current Population Survey (CPS), conducted by the Census Bureau. The March survey uses two sets of questions, the basic CPS and the supplement.

Basic CPS. The monthly CPS collects primarily labor force data about the civilian noninstitutional population. Interviewers ask questions concerning labor force participation about each member 15 years old and over in every sample household.

March Supplement. In March 2001, the interviewers asked additional questions to supplement the basic CPS questions. These additional questions covered the following topics:

Sample design. The present CPS sample was selected from the 1990 Decennial Census files with coverage in all 50 states and the District of Columbia. The sample is continually updated to account for new residential construction. To obtain the sample, the United States was divided into 2,007 geographic areas. In most states, a geographic area consisted of a county or several contiguous counties. In some areas of New England and Hawaii, minor civil divisions are used instead of counties. These 2,007 geographic areas were then grouped into 754 strata, and one geographic area was selected from each stratum. About 50,000 occupied households are eligible for interview every month out of these 754 areas. Interviewers are unable to obtain interviews at about 3,200 of these units. This occurs when the occupants are not found at home after repeated calls or are unavailable for some other reason.

To obtain more reliable data for the Hispanic population (Hispanics may be of any race), the March CPS sample was increased by about 2,500 eligible housing units. These housing units were interviewed the previous November and found to contain at least one sample person of Hispanic ancestry. In addition, the sample included people in the armed forces living off post or with their families on post.

Sample redesign. Since the introduction of the CPS, the Census Bureau has redesigned the CPS sample several times. These redesigns have improved the quality and accuracy of the data and have satisfied changing data needs. The most recent changes were phased in and implementation was completed in July 1995.

Estimation procedure. This survey's estimation procedure adjusts weighted sample results to agree with independent estimates of the civilian noninstitutional population of the United States by age, sex, race, Hispanic/non-Hispanic ancestry, and state of residence. The adjusted estimate is called the post-stratification ratio estimate. The independent estimates are calculated based on information from four primary sources:

The estimation procedure for the March supplement included a further adjustment so husband and wife of a household received the same weight. The independent population estimates include some, but not all, of undocumented immigrants.

ACCURACY OF THE ESTIMATES

A sample survey estimate has two types of error: sampling and nonsampling. The accuracy of an estimate depends on both types of error. The nature of the sampling error is known given the survey design. The full extent of the nonsampling error, however, is unknown.

Sampling error. Since the CPS estimates come from a sample, they may differ from figures from a complete census using the same questionnaires, instructions, and enumerators. This possible variation in the estimates due to sampling error is known as "sampling variability."

Nonsampling error. All other sources of error in the survey estimates are collectively called nonsampling error. Sources of nonsampling error include the following:

Two types of nonsampling error that can be examined to a limited extent are nonresponse and coverage.

Nonresponse. The effect of nonresponse cannot be measured directly, but one indication of its potential effect is the nonresponse rate. For the March 2001 basic CPS, the nonresponse rate was 8.03%. The nonresponse rate for the supplement was an additional 8.5%, for a total supplement nonresponse rate of 15.85%.

Coverage. The concept of coverage in the survey sampling process is the extent to which the total population that could be selected for sample "covers" the survey's target population. CPS undercoverage results from missed housing units and missed people within sample households. Overall CPS undercoverage is estimated to be about 8 percent. CPS undercoverage varies with age, sex, and race. Generally, undercoverage is larger for males than for females and larger for Blacks and other races combined than for Whites. As described previously, ratio estimation to independent age-sex-race-Hispanic population controls partially corrects for the bias due to undercoverage. However, biases exist in the estimates to the extent that missed people in missed households or missed people in interviewed households have different characteristics from those of interviewed people in the same age-sex-race-ancestry-state group.

A common measure of survey coverage is the coverage ratio, the estimated population before post-stratification divided by the independent population control. Table 1 shows CPS coverage ratios for age-sex-race groups for a typical month. The CPS coverage ratios can exhibit some variability from month to month. Other Census Bureau household surveys experience similar coverage.


Table 1. CPS Coverage Ratios

           Non-Black          Black           All People
Age        M       F        M       F        M       F       Total
0-14     0.929   0.964    0.850   0.838    0.916   0.943    0.929
15       0.933   0.895    0.763   0.824    0.905   0.883    0.895
16-19    0.881   0.891    0.711   0.802    0.855   0.877    0.866
20-29    0.847   0.897    0.660   0.811    0.823   0.884    0.854
30-39    0.904   0.931    0.680   0.845    0.877   0.920    0.899
40-49    0.928   0.966    0.816   0.911    0.917   0.959    0.938
50-59    0.953   0.974    0.896   0.927    0.948   0.969    0.959
60-64    0.961   0.941    0.954   0.953    0.960   0.942    0.950
65-69    0.919   0.972    0.982   0.984    0.924   0.973    0.951
70+      0.993   1.004    0.996   0.979    0.993   1.002    0.998
15+      0.914   0.945    0.767   0.874    0.898   0.927    0.918
0+       0.918   0.949    0.793   0.864    0.902   0.931    0.921

Comparability of data. Data obtained from the CPS and other sources are not entirely comparable. This results from differences in interviewer training and experience and in differing survey processes. This is an example of nonsampling variability not reflected in the standard errors. Therefore, caution should be used when comparing results from different sources.

A number of changes were made in data collection and estimation procedures beginning with the January 1994 CPS. The major change was the use of a new questionnaire. The questionnaire was redesigned to measure the official labor force concepts more precisely, to expand the amount of data available, to implement several definitional changes, and to adapt to a computer-assisted interviewing environment. The March supplemental income questions were also modified for adaptation to computer-assisted interviewing, although there were no changes in definitions and concepts. See Appendix C of Report P-60, No. 188, on "Conversion to a Computer Assisted Questionnaire" for a description of these changes and the effect they had on the data. Due to these and other changes, one should use caution when comparing estimates from data collected before 1994 with estimates from data collected in 1994 and later.

Caution should also be used when comparing data from this microdata file, which reflects 1990 census-based population controls, with microdata files from March 1993 and earlier years, which reflect 1980 census-based population controls. Although this change in population controls had relatively little impact on summary measures such as averages, medians, and percentage distributions, it did have a significant impact on levels. For example, use of 1990 based population controls results in about a one percent increase in the civilian noninstitutional population and in the number of families and households. Thus, estimates of levels for data collected in 1994 and later years will differ from those for earlier years by more than what could be attributed to actual changes in the population. These differences could be disproportionately greater for certain subpopulation groups than for the total population.

Caution should also be used when comparing Hispanic estimates over time. No independent population control totals for people of Hispanic ancestry were used before 1985.

Based on the results of each decennial census, the Census Bureau gradually introduces a new sample design for the CPS. (For detailed information on the 1990 sample redesign, see the Department of Labor, Bureau of Labor Statistics report, Employment and Earnings, Volume 41 Number 5, May 1994.) During this phase-in period, CPS data are collected from sample designs based on different censuses. While most CPS estimates were unaffected by this mixed sample, geographic estimates are subject to greater error and variability. Users should exercise caution when comparing estimates across years for metropolitan/nonmetropolitan categories.

A Nonsampling error warning. Since the full extent of the nonsampling error is unknown, one should be particularly careful when interpreting results based on small differences between estimates. Even a small amount of nonsampling error can cause a borderline difference to appear significant or not, thus distorting a seemingly valid hypothesis test. Caution should also be used when interpreting results based on a relatively small number of cases. Summary measures probably do not reveal useful information when computed on a base (subpopulation) smaller than 75,000.

For additional information on nonsampling error including the possible impact on CPS data when known, refer to:

Standard errors and their use. The sample estimate and its standard error enable one to construct a confidence interval. A confidence interval is a range that would include the average result of all possible samples with a known probability. For example, if all possible samples were surveyed under essentially the same general conditions and the same sample design, and if an estimate and its standard error were calculated from each sample, then approximately 90 percent of the intervals from 1.645 standard errors below the estimate to 1.645 standard errors above the estimate would include the average result of all possible samples.

A particular confidence interval may or may not contain the average estimate derived from all possible samples. However, one can say with specified confidence that the interval includes the average estimate calculated from all possible samples.

Standard errors may be used to perform hypothesis testing. This is a procedure for distinguishing between population parameters using sample estimates. The most common type of hypothesis is that the population parameters are different. An example of this would be comparing the percentage of Whites with a college education to the percentage of Blacks with a college education.

Tests may be performed at various levels of significance. A significance level is the probability of concluding that the characteristics are different when, in fact, they are the same. For example, to conclude that two parameters are different at the 0.10 level of significance, the absolute value of the estimated difference between characteristics must be greater than or equal to 1.645 times the standard error of the difference.

The Census Bureau uses 90 percent confidence intervals and 0.10 levels of significance to determine statistical validity. Consult standard statistical texts for alternative criteria.

Estimating standard errors. To estimate the standard error of a CPS estimate, the Census Bureau uses replicated variance estimation methods. These methods primarily measure the magnitude of sampling error. However, they do measure some effects of nonsampling error as well. They do not measure systematic biases in the data due to nonsampling error. (Bias is the average of the differences, over all possible samples, between the sample estimates and the true value.)

Generalized Variance Parameters. Consider all of the possible estimates of characteristics of the population that are of interest to data users. Now consider all of the subpopulations such as racial groups, age ranges, etc. Finally, consider every possible comparison or ratio combination. The list would be completely unmanageable. Similarly, a list of standard errors to go with every estimate would be unmanageable.

Through experimentation, we have found that certain groups of estimates have similar relationships between their variances and expected values. We provide a generalized method for calculating standard errors for any of the characteristics of the population of interest. The generalized method uses parameters for groups of estimates. These parameters are in Table 2, for basic CPS monthly labor force estimates, and Table 3, for March supplement data, including the Hispanic supplement.

Standard errors of estimated numbers. The approximate standard error, $s_x$, of an estimated number from this microdata file can be obtained using this formula:

$$ s_x = \sqrt{a x^2 + b x} \qquad (1) $$

Here x is the size of the estimate and a and b are the parameters in Table 2 or 3 associated with the particular type of characteristic. When calculating standard errors for numbers from cross-tabulations involving different characteristics, use the factor or set of parameters for the characteristic which will give the largest standard error.

For information on calculating standard errors for labor force data from the CPS which involve quarterly or yearly averages, see "Explanatory Notes and Estimates of Error: Household Data" in Employment and Earnings, a monthly report published by the Bureau of Labor Statistics.

Illustration No. 1

Suppose you want to calculate the standard error and a 90 percent confidence interval of the number of unemployed females in the civilian labor force when that number is about 2,835,000. Use Formula (1) and the appropriate parameters from Table 2 to get

Number, x          2,835,000
a parameter        -0.000033
b parameter        2,693
Standard error     86,000
90% conf. int.     2,694,000 to 2,976,000

where the standard error is calculated as

$$ s_x = \sqrt{-0.000033 \times 2{,}835{,}000^2 + 2{,}693 \times 2{,}835{,}000} \approx 86{,}000 $$

and the 90 percent confidence interval is calculated as 2,835,000 ± 1.645 × 86,000.

A conclusion that the average estimate derived from all possible samples lies within a range computed in this way would be correct for roughly 90 percent of all possible samples.

Illustration No. 2

Suppose you want to calculate the standard error and a 90 percent confidence interval for the number of people aged 25 and over who held a bachelor's degree, when they numbered about 29,840,000. Use the appropriate parameters from Table 3 and Formula (1) to get:

Number, x          29,840,000
a parameter        -0.000011
b parameter        2,369
Standard error     247,000
90% conf. int.     29,434,000 to 30,246,000

where the standard error is calculated as

$$ s_x = \sqrt{-0.000011 \times 29{,}840{,}000^2 + 2{,}369 \times 29{,}840{,}000} \approx 247{,}000 $$

and the 90 percent confidence interval is calculated as 29,840,000 ± 1.645 × 247,000.

A conclusion that the average estimate derived from all possible samples lies within a range computed in this way would be correct for roughly 90 percent of all possible samples.
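The arithmetic in Illustrations 1 and 2 can be reproduced with a short script. This is a minimal sketch, assuming the standard error of an estimated number has the form sqrt(a·x² + b·x), which matches the figures quoted in both illustrations. The function names are hypothetical and chosen only for this example.

```python
import math

def se_estimated_number(x, a, b):
    """Approximate standard error of an estimated number x: sqrt(a*x^2 + b*x)."""
    return math.sqrt(a * x**2 + b * x)

def confidence_interval_90(estimate, se):
    """90 percent confidence interval: estimate minus/plus 1.645 standard errors."""
    margin = 1.645 * se
    return estimate - margin, estimate + margin

# Illustration 1: unemployed females, with the a and b values quoted in the text.
se_1 = round(se_estimated_number(2_835_000, -0.000033, 2_693), -3)
ci_1 = confidence_interval_90(2_835_000, se_1)

# Illustration 2: people aged 25 and over with a bachelor's degree.
se_2 = round(se_estimated_number(29_840_000, -0.000011, 2_369), -3)
ci_2 = confidence_interval_90(29_840_000, se_2)

print(se_1, [round(v, -3) for v in ci_1])
print(se_2, [round(v, -3) for v in ci_2])
```

As in the illustrations, the standard error is rounded to the nearest thousand before the interval is formed.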

Standard errors of estimated percentages. The reliability of an estimated percentage, computed using sample data for both numerator and denominator, depends on the size of the percentage and its base. Estimated percentages are relatively more reliable than the corresponding estimates of the numerators of the percentages, particularly if the percentages are 50 percent or more. When the numerator and denominator of the percentage are in different categories, use the factor or parameter from Table 2 or 3 indicated by the numerator.

The approximate standard error $s_{x,p}$ of an estimated percentage can be obtained by using the following formula:

$$ s_{x,p} = \sqrt{\frac{b}{x}\, p\,(100 - p)} \qquad (2) $$

Here x is the total number of people, families, households, or unrelated individuals in the base of the percentage, p is the percentage (0 < p <100) and b is the parameter in Table 2 or 3 associated with the characteristic in the numerator of the percentage.

Illustration No. 3

Suppose you want to calculate the standard error and a 90 percent confidence interval for the percentage of people aged 25 and over with a bachelor's degree who were Black when there were about 29,840,000 people aged 25 and over with a bachelor's degree, of which about 7.6 percent were Black. Use the appropriate parameter from Table 3 and Formula (2) to get:

Percentage, p      7.6
Base, x            29,840,000
b parameter        2,680
Standard error     0.25
90% conf. int.     7.19 to 8.01

where the standard error is calculated as

$$ s_{x,p} = \sqrt{\frac{2{,}680}{29{,}840{,}000} \times 7.6 \times (100 - 7.6)} \approx 0.25 $$

and the 90 percent confidence interval for the percentage of people aged 25 and over with a bachelor's degree who were Black is calculated as 7.6 ± 1.645 × 0.25.
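The percentage calculation in Illustration 3 can be checked the same way. The sketch below assumes the standard error of a percentage is sqrt((b/x)·p·(100 − p)), which reproduces the 0.25 quoted above. The function name is hypothetical.

```python
import math

def se_estimated_percentage(p, x, b):
    """Approximate standard error of a percentage p (0 < p < 100) on a base x."""
    if not 0 < p < 100:
        raise ValueError("p must lie strictly between 0 and 100")
    return math.sqrt((b / x) * p * (100 - p))

# Illustration 3: share of bachelor's degree holders aged 25+ who were Black.
se_pct = round(se_estimated_percentage(7.6, 29_840_000, 2_680), 2)
margin = 1.645 * se_pct
interval = (round(7.6 - margin, 2), round(7.6 + margin, 2))

print(se_pct, interval)
```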

Standard error of a difference. The standard error of the difference between two sample estimates is approximately equal to

$$ s_{x-y} = \sqrt{s_x^2 + s_y^2} \qquad (3) $$

where $s_x$ and $s_y$ are the standard errors of the estimates, x and y. The estimates can be numbers, percentages, ratios, etc. This will represent the actual standard error quite accurately for the difference between estimates of the same characteristic in two different areas, or for the difference between separate and uncorrelated characteristics in the same area. However, if there is a high positive (negative) correlation between the two characteristics, the formula will overestimate (underestimate) the true standard error.

For information on calculating standard errors for labor force data from the CPS which involve differences in consecutive quarterly or yearly averages, consecutive month-to-month differences in estimates, and consecutive year-to-year differences in monthly estimates see "Explanatory Notes and Estimates of Error: Household Data" in Employment and Earnings, a monthly report published by the Bureau of Labor Statistics.

Illustration No. 4

Suppose you want to calculate the standard error and a 90 percent confidence interval for the difference in numbers of females and males living in the West (the West region includes Alaska, Arizona, California, Colorado, Hawaii, Idaho, Montana, Nevada, New Mexico, Oregon, Utah, Washington, and Wyoming) when they numbered about 31,711,000 and 30,815,000, respectively. Use the appropriate parameters from Table 3 and Formulas (1) and (3) to get:

                   x                          y                          difference
Estimate           31,711,000                 30,815,000                 896,000
a parameter        -0.000029                  -0.000029                  -
b parameter        7,791                      7,791                      -
Standard error     467,000                    461,000                    656,000
90% conf. int.     30,943,000 to 32,479,000   30,057,000 to 31,573,000   -183,000 to 1,975,000

where the standard error of the difference is calculated as

$$ s_{x-y} = \sqrt{467{,}000^2 + 461{,}000^2} \approx 656{,}000 $$

and the 90 percent confidence interval around the difference is calculated as 896,000 ± 1.645 × 656,000.

Since the 90 percent confidence interval contains zero, we cannot conclude, at the 10 percent significance level, that the number of females living in the West is different from the number of males.

Illustration No. 5

Suppose you want to calculate the standard error and a 90 percent confidence interval of the difference between the percentages of males and females aged 15 and over employed in agriculture (farming, forestry, and fishing). Suppose 2,515,000 of 71,237,000 employed males aged 15 and over, or 3.5 percent, were employed in agriculture and 729,000 of 63,102,000 employed females aged 15 and over, or 1.2 percent, were employed in agriculture. Use the appropriate parameters from Table 2 and Formulas (2) and (3) to get

                   x               y               difference
Percentage         3.5             1.2             2.3
Number, x          71,237,000      63,102,000      -
b parameter        2,989           2,989           -
Standard error     0.12            0.07            0.14
90% conf. int.     3.30 to 3.70    1.08 to 1.32    2.07 to 2.53

where the standard error of the difference is calculated as

$$ s_{x-y} = \sqrt{0.12^2 + 0.07^2} \approx 0.14 $$

and the 90 percent confidence interval around the difference is calculated as 2.3 ± 1.645 × 0.14.

Since this interval does not include zero, we can conclude with 90 percent confidence that the percentage of agriculturally employed females age 15 and over is less than the percentage of agriculturally employed males age 15 and over.
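Illustrations 4 and 5 combine two standard errors and then compare the difference with 1.645 standard errors. The sketch below assumes the two estimates are uncorrelated, so that the standard error of the difference is sqrt(s_x² + s_y²), and applies the 0.10 significance rule stated earlier in this document. The function names are hypothetical.

```python
import math

def se_difference(se_x, se_y):
    """Standard error of x - y for uncorrelated estimates."""
    return math.sqrt(se_x**2 + se_y**2)

def differs_at_10_percent(estimate_x, estimate_y, se_diff):
    """True when |x - y| is at least 1.645 standard errors of the difference."""
    return abs(estimate_x - estimate_y) >= 1.645 * se_diff

# Illustration 4: females versus males living in the West.
se_west = se_difference(467_000, 461_000)
west_differs = differs_at_10_percent(31_711_000, 30_815_000, se_west)

# Illustration 5: percentage of males versus females employed in agriculture.
se_agri = se_difference(0.12, 0.07)
agri_differs = differs_at_10_percent(3.5, 1.2, se_agri)

print(round(se_west, -3), west_differs)
print(round(se_agri, 2), agri_differs)
```

The first comparison is not significant and the second is, matching the conclusions drawn from the two confidence intervals.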

Standard error of an average for grouped data. The formula used to estimate the standard error of an average for grouped data is

$$ s_{\bar{x}} = \sqrt{\frac{b}{y}\, S^2} \qquad (4) $$

In this formula, y is the size of the base of the distribution and b is a parameter from Table 2 or 3. The variance, $S^2$, is given by the following formula:

$$ S^2 = \sum_{i=1}^{c} p_i \bar{x}_i^2 - \bar{x}^2 \qquad (5) $$

where $\bar{x}$, the average of the distribution, is estimated by

$$ \bar{x} = \sum_{i=1}^{c} p_i \bar{x}_i \qquad (6) $$

where



Formula 7

Standard error of a ratio. Certain estimates may be calculated as the ratio of two numbers. The standard error of a ratio, x/y, may be computed using

$$ s_{x/y} = \frac{x}{y} \sqrt{\left(\frac{s_x}{x}\right)^2 + \left(\frac{s_y}{y}\right)^2 - 2r\,\frac{s_x s_y}{x y}} \qquad (8) $$

The standard error of the numerator, $s_x$, and that of the denominator, $s_y$, may be calculated using formulas described earlier. In Formula (8), r represents the correlation between the numerator and the denominator of the estimate.

For one type of ratio, the denominator is a count of families or households and the numerator is a count of people in those families or households with a certain characteristic. If there is at least one person with the characteristic in every family or household, use 0.7 as an estimate of r. An example of this type is the average number of children per family with children.

For all other types of ratios, r is assumed to be zero. If r is actually positive (negative), then this procedure will provide an overestimate (underestimate) of the standard error of the ratio. Examples of this type are: average number of children per family and the poverty rate.

NOTE: For estimates expressed as the ratio of x per 100 y or x per 1,000 y, multiply Formula (8) by 100 or 1,000, respectively, to obtain the standard error.

Illustration No. 6

Suppose you want to calculate the standard error and a 90 percent confidence interval for the ratio of males, x, to females, y, who make at least $50,000. Suppose about 19,483,000 males and about 8,140,000 females make at least $50,000, giving a ratio of x to y equal to 2.39.

Use the appropriate parameters from Table 3 to get:

                   x                          y                        ratio
Estimate           19,483,000                 8,140,000                2.39
a parameter        -0.000011                  -0.000011                -
b parameter        2,454                      2,454                    -
Standard error     209,000                    139,000                  0.05
90% conf. int.     19,139,000 to 19,827,000   7,911,000 to 8,369,000   2.31 to 2.47

where the estimate of the standard error is calculated using Formula (8) and r = 0:

$$ s_{x/y} = 2.39 \sqrt{\left(\frac{209{,}000}{19{,}483{,}000}\right)^2 + \left(\frac{139{,}000}{8{,}140{,}000}\right)^2} \approx 0.05 $$

and the 90 percent confidence interval is calculated as 2.39 ± 1.645 × 0.05.
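The ratio calculation in Illustration 6 can be sketched as follows. The form of the correlation term, -2r·(s_x/x)·(s_y/y), is an assumption here. It is consistent with the statement that setting r = 0 overestimates the standard error when r is actually positive, and with r = 0 the function reproduces the 0.05 quoted above. The function name is hypothetical.

```python
import math

def se_ratio(x, y, se_x, se_y, r=0.0):
    """Approximate standard error of the ratio x/y, with correlation r
    between numerator and denominator (r = 0 unless stated otherwise)."""
    rel_x = se_x / x          # relative standard error of the numerator
    rel_y = se_y / y          # relative standard error of the denominator
    rel_var = rel_x**2 + rel_y**2 - 2 * r * rel_x * rel_y
    return (x / y) * math.sqrt(rel_var)

# Illustration 6: males versus females making at least $50,000, with r = 0.
se_r = se_ratio(19_483_000, 8_140_000, 209_000, 139_000)
margin = 1.645 * round(se_r, 2)
interval = (round(2.39 - margin, 2), round(2.39 + margin, 2))

print(round(se_r, 2), interval)
```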

Standard error of a median. The sampling variability of an estimated median depends on the form of the distribution and the size of the base. One can approximate the reliability of an estimated median by determining a confidence interval about it. (See Standard errors and their use for a general discussion of confidence intervals.)

Estimate the 68 percent confidence limits of a median based on sample data using the following procedure.

  1. Determine, using Formula (2), the standard error of the estimate of 50 percent from the distribution.

  2. Add to and subtract from 50 percent the standard error determined in step 1. These two numbers are the percentage limits corresponding to the 68 percent confidence about the estimated median.

  3. Using the distribution of the characteristic, determine upper and lower limits of the 68 percent confidence interval by calculating values corresponding to the two points established in step 2.

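    Use the following formula to calculate the upper and lower limits.

    $$ X_{pN} = \frac{pN - N_1}{N_2 - N_1}\,(A_2 - A_1) + A_1 \qquad (9) $$

    where

    $X_{pN}$ = estimated upper or lower limit of the confidence interval
    p = proportion of the distribution at or above the limit, taken from the percentage limits in step 2 (0 ≤ p ≤ 1)
    N = total number of units in the distribution
    $A_1$, $A_2$ = lower and upper bounds of the interval containing $X_{pN}$
    $N_1$, $N_2$ = estimated numbers of units with values greater than or equal to $A_1$ and $A_2$, respectively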

  4. Divide the difference between the two points determined in step 3 by two to obtain the standard error of the median.

Note: Median incomes and their standard errors calculated as below may differ from those in published tables showing income since narrower income intervals were used in those calculations.

Illustration No. 7

Suppose you want to calculate the standard error of the median income for families with the following distribution.

                       Number of     Cumulative number   Cumulative percent
Income level           families      of families         of families
Under $5,000            1,740,000     1,740,000             2.4%
$5,000 to $9,999        2,404,000     4,144,000             5.8%
$10,000 to $14,999      3,485,000     7,629,000            10.6%
$15,000 to $24,999      8,678,000    16,307,000            22.6%
$25,000 to $34,999      8,550,000    24,857,000            34.5%
$35,000 to $49,999     11,861,000    36,718,000            51.0%
$50,000 to $74,999     15,236,000    51,954,000            72.1%
$75,000 and over       20,076,000    72,030,000           100.0%

Total number of families    72,030,000
Median income               $48,950

  1. Using Formula (2) with b = 2,241, the standard error of 50 percent on a base of 72,030,000 is about 0.3 percent.

  2. To obtain a 68 percent confidence interval on an estimated median, add to and subtract from 50 percent the standard error found in step 1. This yields percent limits of 49.7 and 50.3.

  3. The lower and upper limits for the interval in which the median falls are $35,000 and $50,000, respectively.

    Then, by addition, the estimated numbers of families with an income greater than or equal to $35,000 and $50,000 are 47,173,000 and 35,312,000, respectively.

    Using Formula (9) with p = 0.497 (the share of families at or above the limit), the upper limit for the confidence interval of the median is found to be about

    $$ \frac{0.497 \times 72{,}030{,}000 - 47{,}173{,}000}{35{,}312{,}000 - 47{,}173{,}000} \times (50{,}000 - 35{,}000) + 35{,}000 \approx 49{,}400 $$

    Similarly, with p = 0.503, the lower limit is found to be about

    $$ \frac{0.503 \times 72{,}030{,}000 - 47{,}173{,}000}{35{,}312{,}000 - 47{,}173{,}000} \times (50{,}000 - 35{,}000) + 35{,}000 \approx 48{,}800 $$

    Thus, a 68 percent confidence interval for the median income for families is from $48,800 to $49,400.

  4. The standard error of the median is, therefore, about ($49,400 - $48,800) / 2 = $300.
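The four steps of Illustration 7 can be sketched in code. The interpolation function below is an assumed form of Formula (9): linear interpolation inside the income interval containing the median, using counts of families at or above each interval boundary. It reproduces the $48,800 to $49,400 interval quoted above. The function and variable names are hypothetical.

```python
import math

def se_percentage(p, x, b):
    """Standard error of a percentage p on a base x (same form as Formula 2)."""
    return math.sqrt((b / x) * p * (100 - p))

def interpolate_limit(p, total, a1, a2, n1, n2):
    """Value at which a proportion p of the distribution lies at or above it,
    interpolated linearly between interval bounds a1 and a2. n1 and n2 are
    the numbers of units with values >= a1 and >= a2, respectively."""
    return (p * total - n1) / (n2 - n1) * (a2 - a1) + a1

total = 72_030_000
se_50 = round(se_percentage(50, total, 2_241), 1)            # step 1
p_low, p_high = (50 - se_50) / 100, (50 + se_50) / 100       # step 2

# Step 3: families at or above $35,000 and $50,000, from the cumulative counts.
n1 = total - 24_857_000
n2 = total - 36_718_000
upper = interpolate_limit(p_low, total, 35_000, 50_000, n1, n2)
lower = interpolate_limit(p_high, total, 35_000, 50_000, n1, n2)

# Step 4: half the width of the rounded 68 percent interval.
se_median = (round(upper, -2) - round(lower, -2)) / 2

print(se_50, round(lower, -2), round(upper, -2), se_median)
```

The smaller proportion gives the upper limit because the counts are of families at or above each boundary.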

Standard error of estimated per capita deficit. Certain average values in this report represent the per capita deficit for households of a certain class. The average per capita deficit is approximately equal to



Formula 10

where

To approximate standard errors for these averages, use the formula



Formula 11

In Formula (11), r represents the correlation between p and h.

For one type of average, the class represents households containing a fixed number of people. For example, h could be the number of three-person households. In this case, there is an exact correlation between the number of people in households and the number of households. Therefore, r = 1 for such households.

For other types of averages, the class represents households of other demographic types, for example, households in distinct regions, households in which the householder is of a certain age group, and owner-occupied and tenant-occupied households. In this and other cases in which the correlation between p and h is not perfect, use 0.7 as an estimate of r.

Accuracy of state estimates. The redesign of the CPS following the 1980 census provided an opportunity to increase efficiency and accuracy of state data. All strata are now defined within state boundaries. The sample is allocated among the states to produce state and national estimates with the required accuracy while keeping total sample size to a minimum. Improved accuracy of state data was achieved with about the same sample size as in the 1970 design.

Since the CPS is designed to produce both state and national estimates, the proportion of the total population sampled and the sampling rates differ among the states. In general, the smaller the population of the state the larger the sampling proportion. For example, in Vermont approximately 1 in every 400 households was sampled each month. In New York the sample was about 1 in every 2,000 households. Nevertheless, the size of the sample in New York is four times larger than in Vermont because New York has a larger population.

Computation of standard errors for state estimates. Standard errors for a state may be obtained by computing national standard errors, using formulas described earlier, and multiplying these by the appropriate f factor from Table 4. An alternative method for computing standard errors for a state is to multiply the a and b parameters in Table 2 or 3 by $f^2$ and then use these adjusted parameters in the standard error formulas.

Illustration No. 8

Suppose you want to calculate the standard error for the percentage of people 18 years old and over living in the state of New York who had completed a bachelor's degree or more. Suppose about 3,607,300 (26.3 percent) people had completed at least a bachelor's degree when there were about 13,716,000 people aged 18 and over living in New York. Following the first method mentioned above, use the appropriate parameter from Table 3 and Formula (2) to get:

Percentage, p      26.3
Base, x            13,716,000
b parameter        2,369
Standard error     0.58

Table 4 shows the f factor for New York to be 0.94. Thus, the standard error on the estimate of the percentage of people 18 and older in New York state who had completed college is approximately 0.94 × 0.58 = 0.55.

Following the alternative method mentioned above, obtain the needed state parameter by multiplying the parameter in Table 3 by the $f^2$ factor in Table 4 for the state of interest. For example, for educational attainment for Total or White in New York this gives b = 2,369 × 0.89 = 2,108. The standard error of the estimate of the percentage of people 18 and older in New York state who had completed college can then be found by using Formula (2), the base of 13,716,000, and the new b parameter, 2,108. This gives a standard error of 0.55.
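Both state-level methods in Illustration 8 can be compared side by side. This is a minimal sketch using only the figures quoted in the illustration (f = 0.94, f² = 0.89, b = 2,369). The percentage formula is assumed to be sqrt((b/x)·p·(100 − p)), and the function name is hypothetical.

```python
import math

def se_percentage(p, x, b):
    """Approximate standard error of a percentage p on a base x."""
    return math.sqrt((b / x) * p * (100 - p))

p, base, b_national = 26.3, 13_716_000, 2_369
f, f_squared = 0.94, 0.89          # New York factors quoted in the illustration

# Method 1: scale the national standard error by f.
national_se = round(se_percentage(p, base, b_national), 2)
state_se_scaled = f * national_se

# Method 2: scale the b parameter by f squared, then apply the formula.
b_state = round(b_national * f_squared)
state_se_adjusted = se_percentage(p, base, b_state)

print(national_se, round(state_se_scaled, 2), b_state, round(state_se_adjusted, 2))
```

Both routes give about 0.55, with small differences arising only from rounding f and f².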

Computation of a factor for groups of states. The factor adjusting standard errors for a group of states may be obtained by computing a weighted sum of the squared factors for the individual states in the group and taking the square root of the result. Depending on the combination of states, the resulting figure can be an overestimate.

The squared factor for a group of n states is given by

$$ f^2 = \frac{\sum_{i=1}^{n} POP_i \times f_i^2}{\sum_{i=1}^{n} POP_i} \qquad (12) $$

where $POP_i$ is the state population and $f_i^2$ is obtained from Table 4. The 2001 civilian noninstitutionalized population from the CPS for each state is also given in Table 4.

Illustration No. 9

Suppose the $f^2$ factor for the state group Illinois-Indiana-Michigan was required. Applying Formula (12) to the populations and $f_i^2$ values of the three states gives $f^2$ = 1.06 for the group.

Multiply the a and b parameters by $f^2$, 1.06, to obtain parameters for the state group, or use the original parameters and multiply the resulting standard errors by f, 1.03.
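The group factor is a population-weighted average of the individual squared state factors. The sketch below assumes that form. The populations and f² values in it are hypothetical placeholders, not values from Table 4, and the function name is also hypothetical.

```python
import math

def group_f_squared(populations, f_squared_values):
    """Population-weighted average of the squared state factors."""
    weighted = sum(pop * f2 for pop, f2 in zip(populations, f_squared_values))
    return weighted / sum(populations)

# HYPOTHETICAL inputs for two made-up states (NOT taken from Table 4).
populations = [1_000_000, 3_000_000]
f_squared_values = [1.20, 0.80]

group_f2 = group_f_squared(populations, f_squared_values)
group_f = math.sqrt(group_f2)   # multiply national standard errors by this

# The text's group value f^2 = 1.06 corresponds to f of about 1.03.
f_from_text = round(math.sqrt(1.06), 2)

print(round(group_f2, 2), round(group_f, 3), f_from_text)
```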

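Computation of standard errors for data for combined years. Sometimes estimates for multiple years are combined to improve precision. For example, suppose $\bar{x}$ is an average derived from n consecutive years' data, i.e., $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where the $x_i$ are the estimates for the individual years.

Use the formulas described previously to estimate the standard error, $s_{x_i}$, of each year's estimate. Then the standard error of $\bar{x}$ is

$$ s_{\bar{x}} = \frac{s_x}{n} \qquad (13) $$

where $s_x$, the standard error of the sum of the yearly estimates, is

$$ s_x = \sqrt{\sum_{i=1}^{n} s_{x_i}^2 + 2r \sum_{i=1}^{n-1} s_{x_i}\, s_{x_{i+1}}} \qquad (14) $$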

The correlation between consecutive years, r, is 0.35 for non-Hispanic households and 0.55 for Hispanic households. Correlation between nonconsecutive years is zero. The correlations were derived for income estimates but they can be used for other types of estimates where the year-to-year correlation between identical households is high.

Illustration No. 10

Suppose you want to calculate the standard error of the average number of children without health insurance for 1993-1995 when the average is 7,147,000 and the standard errors for the individual years are 213,000, 217,000, and 216,000.

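Using Formula (14) with r = 0.35, the standard error for the three years combined data is

$$ s_x = \sqrt{213{,}000^2 + 217{,}000^2 + 216{,}000^2 + 2(0.35)\left[(213{,}000)(217{,}000) + (217{,}000)(216{,}000)\right]} \approx 452{,}000 $$

Therefore, the standard error of the average, using Formula (13), is

$$ s_{\bar{x}} = \frac{452{,}000}{3} \approx 151{,}000 $$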
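The combined-years calculation in this illustration can be sketched in Python as a rough check. The sketch makes two assumptions. First, the combined standard error is taken as the square root of the sum of the squared yearly standard errors plus 2r times the sum of products of consecutive years' standard errors, and the standard error of the average is that figure divided by n. Second, r = 0.35 (the non-Hispanic value) is used, because the illustration does not say which correlation applies. The function name is illustrative only.

```python
import math

def combined_years_se(yearly_se, r):
    """Return (s_x, s_xbar) for an average of consecutive yearly estimates.

    s_x    = sqrt(sum(s_i^2) + 2 * r * sum(s_i * s_(i+1)))
    s_xbar = s_x / n
    Only consecutive years are treated as correlated.
    """
    n = len(yearly_se)
    sum_sq = sum(s * s for s in yearly_se)
    # Products of each year's standard error with the following year's.
    adjacent = sum(s1 * s2 for s1, s2 in zip(yearly_se, yearly_se[1:]))
    s_x = math.sqrt(sum_sq + 2.0 * r * adjacent)
    return s_x, s_x / n

# Standard errors for 1993, 1994, and 1995; r = 0.35 is an assumption.
s_x, s_xbar = combined_years_se([213_000, 217_000, 216_000], r=0.35)

print(round(s_x), round(s_xbar))
```

Under these assumptions the combined standard error is about 452,000 and the standard error of the three-year average is about 151,000. Setting r = 0 reduces the calculation to the usual formula for independent estimates.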

 

 

Table 2. Parameters for Computation of Standard Errors for Labor Force Characteristics: March 2001

Characteristic                             a       b

Labor Force and Not In Labor Force Data Other than Agricultural Employment and Unemployment

Total or White                     -0.000008   1,586
  Men                              -0.000035   2,927
  Women                            -0.000033   2,693
  Both sexes, 16 to 19 years       -0.000244   3,005
Black                              -0.000154   3,296
  Men                              -0.000336   3,332
  Women                            -0.000282   2,944
  Both sexes, 16 to 19 years       -0.001531   3,296
Hispanic Ancestry                  -0.000187   3,296
  Men                              -0.000363   3,332
  Women                            -0.000380   2,944
  Both sexes, 16 to 19 years       -0.001822   3,296

Unemployment

Total or White                     -0.000017   3,005
  Men                              -0.000035   2,927
  Women                            -0.000033   2,693
  Both sexes, 16 to 19 years       -0.000244   3,005
Black                              -0.000154   3,296
  Men                              -0.000336   3,332
  Women                            -0.000282   2,944
  Both sexes, 16 to 19 years       -0.001531   3,296
Hispanic Ancestry                  -0.000187   3,296
  Men                              -0.000363   3,332
  Women                            -0.000380   2,944
  Both sexes, 16 to 19 years       -0.001822   3,296

Agricultural Employment             0.001345   2,989

NOTE: These parameters are to be applied to basic CPS monthly labor force estimates.

For foreign-born and noncitizen characteristics for Total and White, the a and b parameters should be multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen characteristics for Blacks and Hispanics.

Table 3. a and b Parameters for Standard Error Estimates for People and Families: March 2001

 

                                              Total or White               Black            Hispanic
Characteristics                                    a       b           a       b           a       b

PEOPLE

Educational Attainment                     -0.000011   2,369   -0.000103   2,680   -0.000077   1,811
Employment Characteristics                 -0.000008   1,586   -0.000154   3,296   -0.000187   3,296
People by Family Income                    -0.000023   4,901   -0.000215   5,611   -0.000239   5,611
Income                                     -0.000011   2,454   -0.000108   2,810   -0.000120   2,810
Health Insurance                           -0.000008   2,191   -0.000074   2,661   -0.000058   1,959
Marital Status, Household and Family Characteristics
- Some household members                   -0.000019   5,211   -0.000209   7,486   -0.000221   7,486
- All household members                    -0.000023   6,332   -0.000309  11,039   -0.000327  11,039
Mobility Characteristics (Movers)
- Educational Attainment, Labor Force,
  Marital Status, Household, Family,
  and Income                               -0.000011   2,869   -0.000080   2,869   -0.000085   2,869
- US, County, State, Region or MSA         -0.000029   7,791   -0.000218   7,791   -0.000230   7,791
Below Poverty
- Total                                    -0.000038  10,380   -0.000290  10,380   -0.000307  10,380
- - Male                                   -0.000077  10,380   -0.000623  10,380   -0.000596  10,380
- - Female                                 -0.000074  10,380   -0.000543  10,380   -0.000592  10,380
- Age
- - Under 15                               -0.000132   8,002   -0.000826   8,002   -0.000774   8,002
- - Under 18                               -0.000110   8,002   -0.000691   8,002   -0.000602   8,002
- - 15 and over                            -0.000048  10,380   -0.000398  10,380   -0.000442  10,380
- - 15 to 24                               -0.000101   3,927   -0.000672   3,927   -0.000459   3,927
- - 25 to 44                               -0.000048   3,927   -0.000364   3,927   -0.000257   3,927
- - 45 to 64                               -0.000063   3,927   -0.000589   3,927   -0.000263   3,927
- - 65 and over                            -0.000120   3,927   -0.001407   3,927   -0.000749   3,927
Unemployment                               -0.000017   3,005   -0.000154   3,296   -0.000187   3,296

FAMILIES, HOUSEHOLDS, OR UNRELATED INDIVIDUALS

Income                                     -0.000010   2,241   -0.000094   2,447   -0.000104   2,447
Marital Status, Household and Family
Characteristics, Educational Attainment,
Population by Age and/or Sex               -0.000010   2,068   -0.000072   1,871   -0.000080   1,871
Poverty                                     0.000102   2,442    0.000102   2,442    0.000102   2,442

 

NOTES: These parameters are to be applied to March supplemental data including the Hispanic supplement.

For nonmetropolitan characteristics, multiply the a and b parameters by 1.5. If the characteristic of interest is total state population, not subtotaled by race or ancestry, the a and b parameters are zero.

For foreign-born and noncitizen characteristics for Total and White, the a and b parameters should be multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen characteristics for Blacks and Hispanics.

Table 4. Factors for State Standard Errors and Parameters and State Populations: 2001

State                      f     f^2    Population

Alabama                 1.01    1.01     3,412,000
Alaska                  0.39    0.15       441,000
Arizona                 0.98    0.97     3,671,000
Arkansas                0.77    0.59     1,992,000
California              1.14    1.29    25,801,000
Colorado                0.96    0.93     3,184,000
Connecticut             1.00    1.00     2,542,000
Delaware                0.47    0.22       593,000
District of Columbia    0.40    0.16       412,000
Florida                 0.98    0.97    12,087,000
Georgia                 1.19    1.40     6,043,000
Hawaii                  0.59    0.35       891,000
Idaho                   0.52    0.27       964,000
Illinois                1.00    1.00     9,228,000
Indiana                 1.17    1.38     4,548,000
Iowa                    0.84    0.71     2,197,000
Kansas                  0.81    0.65     2,009,000
Kentucky                0.96    0.92     3,101,000
Louisiana               0.97    0.95     3,295,000
Maine                   0.61    0.37     1,007,000
Maryland                1.17    1.38     4,044,000
Massachusetts           0.90    0.81     4,820,000
Michigan                0.96    0.93     7,579,000
Minnesota               1.05    1.11     3,681,000
Mississippi             0.80    0.64     2,096,000
Missouri                1.17    1.37     4,189,000
Montana                 0.45    0.20       696,000
Nebraska                0.65    0.42     1,259,000
Nevada                  0.66    0.44     1,440,000
New Hampshire           0.62    0.38       949,000
New Jersey              0.91    0.82     6,312,000
New Mexico              0.63    0.40     1,327,000
New York                0.94    0.89    14,192,000
North Carolina          0.97    0.94     5,846,000
North Dakota            0.40    0.16       477,000
Ohio                    1.01    1.02     8,637,000
Oklahoma                0.85    0.73     2,570,000
Oregon                  0.93    0.86     2,625,000
Pennsylvania            0.98    0.96     9,295,000
Rhode Island            0.55    0.30       756,000
South Carolina          1.00    1.01     3,059,000
South Dakota            0.41    0.17       556,000
Tennessee               1.16    1.34     4,309,000
Texas                   1.10    1.21    15,327,000
Utah                    0.66    0.43     1,544,000
Vermont                 0.42    0.18       475,000
Virginia                1.22    1.48     5,351,000
Washington              1.21    1.47     4,470,000
West Virginia           0.62    0.39     1,444,000
Wisconsin               1.11    1.23     4,057,000
Wyoming                 0.35    0.12       374,000

NOTE: For foreign-born and noncitizen characteristics for Total and White, the a and b parameters should be multiplied by 1.3. No adjustment is necessary for foreign-born and noncitizen characteristics for Blacks and Hispanics.



Source: U.S. Census Bureau
Author: Thomas Moore III-Census/DSMD
Contact: (ask.census.gov) CPS Help-Census/DSD/CPSB
Last revised: September 24, 2001
URL: http://www.bls.census.gov/cps/ads/2001/ssrcacc.htm