This appendix covers the mathematical operations that readers most often need when using this book. The first segment involves manipulation of the data. The second segment addresses the reliability of the estimates. For more information on manipulating the data, please consult a mathematics or statistics textbook. For more information on calculating the reliability of the estimates, please consult the technical documentation for the March 2007 Survey at http://www.census.gov/apsd/techdoc/cps/cpsmar07.pdf.
To convert a published percentage into an estimated number, first divide the percentage by 100. Then multiply that decimal by the total population on which the percentage is based.
Note: This procedure cannot be used on medians or some means presented in this publication.
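The two steps above can be sketched in Python. The function name `percent_to_count` is an illustrative assumption, not something defined in this publication; the inputs are the 38.5 percent and 27,421,000 units used in the reliability examples later in this appendix.

```python
def percent_to_count(percent: float, total_population: float) -> float:
    """Convert a published percentage into an estimated count."""
    # Step 1: express the percentage as a decimal.
    proportion = percent / 100.0
    # Step 2: apply the decimal to the base population.
    return proportion * total_population


# 38.5 percent of roughly 27,421,000 units aged 65 or older
units = percent_to_count(38.5, 27_421_000)
print(f"{units:,.0f} units")
```

The result is roughly 10.6 million units. As the note above says, the same arithmetic does not apply to medians or to some means.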
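Estimating a percentile from a frequency distribution is also known as getting a cumulative distribution from a frequency distribution. Add the percentages in the frequency distribution (column) until the running total exceeds the percentile you want. Then interpolate within that last interval to estimate the desired percentile. In the example below, the first two columns show the frequency distribution and the last two show the cumulative distribution derived from it.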
Social Security (dollars) | Percent | Social Security (dollars) | Total percent |
---|---|---|---|
1–499 | 0.1 | < 500 | 0.1 |
500–999 | 0.2 | < 1,000 | 0.3 |
1,000–1,499 | 0.4 | < 1,500 | 0.7 |
1,500–1,999 | 0.3 | < 2,000 | 1.0 |
2,000–2,499 | 0.5 | < 2,500 | 1.5 |
2,500–2,999 | 0.5 | < 3,000 | 2.0 |
3,000–3,499 | 0.6 | < 3,500 | 2.6 |
3,500–3,999 | 0.6 | < 4,000 | 3.2 |
4,000–4,499 | 0.6 | < 4,500 | 3.8 |
4,500–4,999 | 1.1 | < 5,000 | 4.9 |
5,000–5,999 | 2.4 | < 6,000 | 7.3 |
6,000–6,999 | 2.9 | < 7,000 | 10.2 |
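The accumulate-then-interpolate procedure can be sketched as follows, using the frequency column of the table above. This is a minimal sketch that assumes linear interpolation within the final interval and treats each interval as continuous (for example, 500–999 becomes 500 up to 1,000, and the first interval is assumed to start at 0); the text does not say which interpolation method the authors used, and the function name is hypothetical.

```python
def percentile_from_frequency(intervals, target_percent):
    """Estimate the dollar value at `target_percent`.

    `intervals` is an ascending list of (lower, upper, percent) tuples,
    where `upper` is the exclusive top of each interval.
    Linear interpolation within the interval is an assumption.
    """
    cumulative = 0.0
    for lower, upper, percent in intervals:
        if percent > 0 and cumulative + percent >= target_percent:
            # Share of this interval needed to reach the target.
            fraction = (target_percent - cumulative) / percent
            return lower + fraction * (upper - lower)
        cumulative += percent
    raise ValueError("target percentile lies beyond the supplied intervals")


# Frequency distribution from the table (dollars, percent of units).
social_security = [
    (0, 500, 0.1), (500, 1000, 0.2), (1000, 1500, 0.4),
    (1500, 2000, 0.3), (2000, 2500, 0.5), (2500, 3000, 0.5),
    (3000, 3500, 0.6), (3500, 4000, 0.6), (4000, 4500, 0.6),
    (4500, 5000, 1.1), (5000, 6000, 2.4), (6000, 7000, 2.9),
]

p5 = percentile_from_frequency(social_security, 5.0)
p10 = percentile_from_frequency(social_security, 10.0)
print(f"5th percentile: about ${p5:,.0f}")
print(f"10th percentile: about ${p10:,.0f}")
```

For the 5th percentile, the running total reaches 4.9 percent below $5,000, so the remaining 0.1 point is taken from the 2.4 percent in the 5,000–5,999 interval, giving a value a little above $5,000.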
Because the figures in this report are based on a sample of the older population, all reported statistics (counts, percentages, and medians) are only estimates of population parameters and may deviate somewhat from their true values—that is, from the values that would have been obtained from a complete census using the same questionnaires, instructions, and interviewers.
The standard error is primarily a measure of sampling variability—that is, it measures the variations that occur by chance because a sample rather than the entire population is surveyed. As calculated for this report, the standard error also partly measures the effect of response and enumeration errors but does not measure systematic biases in the data. The chances are about 68 out of 100 that an estimate for the sample would differ from a complete census figure by less than the standard error. The chances are about 95 out of 100 that the difference would be less than twice the standard error.
The reliability of an estimated percentage, computed by using sample data for both numerator and denominator, depends on both the size of the percentage and the size of the total on which the percentage is based. The approximate standard error $S_x$ of an estimated percentage can be obtained using the formula

$$S_x = \sqrt{\frac{b}{x}\,p\,(100 - p)}$$
Here x is the total number of persons, families, or households (the base of the percentage), p is the percentage, and b is the parameter from the following table associated with the characteristic in the numerator of the percentage.
Characteristics | Total or white | Black | Asian | Hispanic |
---|---|---|---|---|
Below poverty level | 1,998 | 1,998 | 1,998 | 1,998 |
All income levels | 1,249 | 1,430 | 1,430 | 1,430 |
Use of this formula in calculating the standard error of a single percentage is illustrated as follows:
An estimated 38.5 percent of units aged 65 or older had total money income of $30,000 or more in 2006 (Table 3.A1). Because the base of this percentage is approximately 27,421,000 (the number of units aged 65 or older), the standard error of the estimated 38.5 percent is approximately 0.3 percentage points. The chances are 68 out of 100 that the estimate differs from the figure a complete census would have produced by less than 0.3 percentage points. The chances are 95 out of 100 that it differs from the complete census figure by less than 0.6 percentage points; that is, the 95 percent confidence interval ranges from 37.9 percent to 39.1 percent.
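The calculation in this example can be reproduced with a short Python sketch. It assumes the standard error formula takes the form sqrt((b / x) · p · (100 − p)), which reproduces the 0.3 figure quoted above with b = 1,249 (the "All income levels" parameter for the total population). The function names are illustrative, not part of the publication.

```python
import math


def standard_error_percent(p: float, base: float, b: float) -> float:
    """Approximate standard error of an estimated percentage.

    p    -- the percentage (0 to 100)
    base -- number of persons, families, or households (x in the text)
    b    -- parameter from the table for the characteristic
    """
    return math.sqrt((b / base) * p * (100.0 - p))


def confidence_interval(estimate: float, se: float, multiplier: float = 2.0):
    """Interval of `multiplier` standard errors around the estimate."""
    half_width = multiplier * se
    return estimate - half_width, estimate + half_width


se = standard_error_percent(38.5, 27_421_000, 1_249)
# The text rounds the standard error to one decimal before doubling it.
low, high = confidence_interval(38.5, round(se, 1))
print(f"standard error: {se:.2f}")
print(f"95 percent interval: {low:.1f} to {high:.1f}")
```

Rounding the standard error before doubling, as the text does, gives 37.9 to 39.1; carrying the unrounded value through would widen the interval very slightly.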
For a difference between two sample estimates, the standard error is approximately equal to the square root of the sum of the squares of the standard errors of each estimate considered separately. This formula will represent the actual standard error quite accurately for the difference between separate and uncorrelated characteristics. If, however, there is a high positive correlation between the two characteristics, the formula will overestimate the true standard error.
A comparison of the difference in the percentage of units aged 62 to 64 and 65 or older who had total money income of $30,000 or more in 2006 illustrates how to calculate the standard error of a difference between two percentages:
An estimated 38.5 percent of the 27,421,000 units aged 65 or older and 59 percent of the 5,433,000 units aged 62 to 64 had total money income of $30,000 or more in 2006 (Table 3.A1), a difference of 20.5 percentage points. The standard errors of those percentages are 0.3 and 0.7, respectively. The standard error of the estimated difference of 20.5 percentage points is about

$$\sqrt{(0.3)^2 + (0.7)^2} \approx 0.8$$
The chances are 68 out of 100 that the difference is between 19.7 and 21.3 percentage points and 95 out of 100 that it is between 18.9 and 22.1 percentage points. Because the 95 percent confidence interval around the difference does not include zero, the difference between the proportions of units aged 62 to 64 and units aged 65 or older with income of $30,000 or more is statistically significant.
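The root-sum-of-squares rule and the significance check can be sketched in Python as follows. The function names are illustrative assumptions; the inputs are the two percentages and standard errors quoted in the example, and the sketch assumes the two estimates are uncorrelated, as the rule itself does.

```python
import math


def se_of_difference(se_a: float, se_b: float) -> float:
    """Standard error of a difference between two uncorrelated estimates."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


def difference_interval(est_a, est_b, se_a, se_b, multiplier=2.0):
    """Interval of `multiplier` standard errors around est_a - est_b."""
    diff = est_a - est_b
    half_width = multiplier * se_of_difference(se_a, se_b)
    return diff - half_width, diff + half_width


se_diff = se_of_difference(0.3, 0.7)
low, high = difference_interval(59.0, 38.5, 0.7, 0.3)
# The difference is significant when the interval excludes zero.
significant = not (low <= 0.0 <= high)
print(f"standard error of difference: {se_diff:.2f}")
print(f"two-standard-error interval: {low:.1f} to {high:.1f}")
print(f"significant: {significant}")
```

The unrounded standard error is about 0.76, so this interval is slightly narrower than one built from the rounded value of 0.8; either way it lies well above zero.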
The sampling variability of an estimated median depends on the distribution as well as on the size of the base. Confidence limits of a median based on sample data may be estimated as follows: (1) using the appropriate base, the standard error of a 50 percent characteristic is determined; (2) the standard error determined in step 1 is added to and subtracted from 50 percent; and (3) the confidence interval around the median corresponding to the two points estimated in step 2 is then read from the distribution of the characteristic. A two-standard-error confidence limit may be determined by finding the values corresponding to 50 percent plus and minus twice the standard error. This procedure may be illustrated as follows:
The median total money income of the estimated 27,421,000 units aged 65 or older was $23,194 in 2006 (Table 3.A1). The standard error of a 50 percent characteristic on that base is about 0.34 percentage points. Because interest usually centers on the confidence interval for the median at the two-standard-error level, twice the standard error obtained in step 1 is added to and subtracted from 50 percent. This procedure yields limits of approximately 49.3 percent and 50.7 percent. By interpolation, 49.3 percent of units aged 65 or older had total money income below $22,856, and 50.7 percent had total money income below $23,533. Thus, the chances are about 95 out of 100 that a complete census would have shown the median to be greater than $22,856 but less than $23,533.
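Steps 1 and 2 of the median procedure can be sketched in Python, with a helper for step 3. The sketch assumes the percentage standard error has the form sqrt((b / x) · p · (100 − p)) with b = 1,249, which reproduces the 0.34 figure above. The function names are hypothetical, and `value_at_percent` assumes linear interpolation between published cumulative points. No income distribution is supplied here because step 3 depends on the detailed distribution, which is not reproduced in this appendix.

```python
import math


def median_percent_limits(base: float, b: float, multiplier: float = 2.0):
    """Steps 1 and 2: percent limits around 50 for the median."""
    # Step 1: standard error of a 50 percent characteristic.
    se_50 = math.sqrt((b / base) * 50.0 * 50.0)
    # Step 2: add to and subtract from 50 percent.
    return 50.0 - multiplier * se_50, 50.0 + multiplier * se_50


def value_at_percent(cumulative, target_percent: float) -> float:
    """Step 3: read a dollar value off a cumulative distribution.

    `cumulative` is an ascending list of (dollars, percent_below) pairs.
    Linear interpolation between points is an assumption.
    """
    for (v0, p0), (v1, p1) in zip(cumulative, cumulative[1:]):
        if p0 <= target_percent <= p1:
            if p1 == p0:
                return v0
            return v0 + (target_percent - p0) / (p1 - p0) * (v1 - v0)
    raise ValueError("target percent lies outside the distribution")


lower_pct, upper_pct = median_percent_limits(27_421_000, 1_249)
print(f"percent limits: {lower_pct:.1f} to {upper_pct:.1f}")
```

Passing each limit and the published cumulative income distribution to `value_at_percent` would then give the dollar bounds of the interval.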