U.S. Census Bureau

 Small Area Income & Poverty Estimates

 Model-based Estimates for States, Counties, & School Districts


1998 State-Level Estimation Details

The 1998 state estimates of poverty and income were released in August 2001. The 1998 county estimates of poverty and income were released in November 2001. For an overview of the changes in methodology between this release and the previous release, see Estimation Procedure Changes.

Several features of the 1998 state estimates are discussed briefly below. The models are then presented.

Bayesian Estimation Techniques. The models SAIPE used to estimate 1998 income and poverty at the state level employ both direct survey-based estimates of 1998 income and poverty from the March 1999 CPS and regression predictions of income and poverty based on administrative records and prior (1990) census data. We combine the regression predictions with the direct sample estimates using Bayesian techniques. The Bayesian techniques weight the contribution of the two components (regression predictions and direct estimates) on the basis of their relative precision.

The regression model used to develop the regression predictions is postulated for the true, unobserved poverty ratios or median income, but it is fitted to the CPS direct estimates allowing for the sampling error in the data. If the variance of the error term in this regression model (the model error variance) were known, then the Bayesian estimate for each state would be a weighted average (shrinkage estimate) of the state's regression prediction and direct CPS estimate. The two weights in this average add to 1.0, with the weight on the direct estimate computed as the model error variance divided by the total variance (model error variance plus sampling error variance). In this average, the larger the sampling variance of a direct sample estimate, the smaller its contribution to the shrinkage estimate, and the larger the contribution from the regression prediction. Since the model error variance is unknown, the Bayesian approach averages the shrinkage estimates computed over a plausible range of values of the model error variance, weighting the results for each of these values according to the posterior (conditional on the data) probability distribution of the model error variance developed from the Bayesian calculations. The result is generally very close to what one gets by estimating the model error variance by the mean of its posterior distribution, and computing the corresponding shrinkage estimate. Technical details of the Bayesian approach are discussed in the paper, "Accounting for Uncertainty About Variances In Small Area Estimation," (Bell 1999) in the Published Papers section of this web site.
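The weighting described above can be illustrated with a minimal sketch. The function names, the grid of model error variance values, and the posterior probabilities are hypothetical inputs chosen for illustration; they are not taken from the SAIPE production system, which derives the posterior distribution from the Bayesian calculations described in Bell (1999).

```python
def shrinkage_estimate(direct, prediction, model_var, sampling_var):
    """Weighted average of a direct estimate and a regression prediction.

    The weight on the direct estimate is the model error variance divided
    by the total variance (model error variance plus sampling variance).
    """
    weight_direct = model_var / (model_var + sampling_var)
    return weight_direct * direct + (1.0 - weight_direct) * prediction


def posterior_averaged_estimate(direct, prediction, sampling_var,
                                model_var_grid, posterior_probs):
    """Average shrinkage estimates over a grid of model error variances.

    model_var_grid and posterior_probs are assumed inputs: a set of
    plausible variance values and their posterior probabilities.
    The probabilities are normalized here so they sum to 1.
    """
    total = sum(posterior_probs)
    return sum(
        (p / total) * shrinkage_estimate(direct, prediction, v, sampling_var)
        for v, p in zip(model_var_grid, posterior_probs)
    )
```

With a direct estimate of 0.20, a regression prediction of 0.10, and equal model error and sampling variances, the two components receive equal weight. Tripling the sampling variance lowers the weight on the direct estimate to 0.25 and pulls the result toward the regression prediction.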

Poverty Ratios and Numbers of Poor People. Deriving state-level estimates of the numbers of poor people of various ages involves two steps. The first step is the use of the Bayesian estimation techniques just discussed, which are applied to CPS direct state estimates of "poverty ratios." The second step is to multiply the resulting model-based poverty ratio estimates by corresponding demographic population estimates to convert the results to estimates of the numbers of poor people of various ages.

The poverty ratios used as the dependent variables in the regression models have the CPS direct estimate of the number of poor people of the given age in the numerator and the CPS direct estimate of the noninstitutional population of the same age in the denominator. These "poverty ratios" differ from official poverty rates, which would use the CPS estimated poverty universes of the given age as the denominators. (For a discussion of the differences between the noninstitutional population and the poverty universe, see Denominators for Model-Based State and County Poverty Rates.) We use these poverty ratios instead of poverty rates because of the difficulty of constructing demographic estimates of the poverty universes.

We use CPS estimated numbers in both the numerator and denominator of the poverty ratios because positive correlation between the two estimates generally makes the resulting poverty ratio estimate more precise than one obtained with a CPS estimated numerator and a demographic population estimate in the denominator. We multiply the model-based poverty ratio estimates by demographic population estimates, however, because the demographic estimates are deemed more reliable than CPS direct population estimates, which contain substantial sampling error for most states. The CPS controls survey weights only to estimates of the population age 16 and over at the state level, and we are making estimates for more specific age groups.
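The two steps can be sketched as follows. The function names and the numbers in the usage note are hypothetical and serve only to show which quantity comes from the CPS and which comes from the demographic population estimates.

```python
def direct_poverty_ratio(cps_poor, cps_noninstitutional_pop):
    """CPS direct poverty ratio for one state and age group.

    Both numerator and denominator are CPS direct estimates.
    """
    return cps_poor / cps_noninstitutional_pop


def number_poor(model_based_ratio, demographic_pop):
    """Convert a model-based poverty ratio to a number of poor people.

    The multiplier is the demographic population estimate, not the
    CPS direct population estimate.
    """
    return model_based_ratio * demographic_pop
```

For example, a CPS direct estimate of 50,000 poor people out of 400,000 gives a direct ratio of 0.125. If the model-based ratio for the same group were also 0.125 and the demographic population estimate were 800,000, the estimated number of poor people would be 100,000.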

While we have multiplied model-based poverty ratio estimates by population estimates at the state level, we have not addressed the county-level estimation in the same way, because the estimates of the populations of counties by age are likely to be much less stable than the state population estimates, and little is known about their error structure. Thus, for counties, we directly model (logarithms of) CPS estimates of the number of poor people.

Controlling to the National Estimates. After converting the Bayesian estimates of poverty ratios to state estimates of numbers of poor, we control these estimates to the direct national estimate of number poor based on the CPS. We do not control estimates of state median household income to the national median because the estimation model does not produce the entire household income distribution, which would be required to do so.
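One common way to control a set of estimates to a total is a proportional (ratio) adjustment. The text above does not state which adjustment SAIPE applies, so the sketch below is an assumption for illustration only, and the function name and state labels are hypothetical.

```python
def control_to_national(state_counts, national_total):
    """Scale state counts so they sum to a national control total.

    Assumes a simple proportional adjustment: every state is multiplied
    by the same factor (national total divided by the sum of the states).
    """
    model_total = sum(state_counts.values())
    factor = national_total / model_total
    return {state: count * factor for state, count in state_counts.items()}
```

Under this assumption, each state's share of the total is unchanged; only the overall level moves to match the national direct estimate.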

Using Estimates from the Prior Census in the Models. The prior census estimates are used to construct regression predictor variables for each of the age-specific poverty ratio models and the median income model. For the 1998 poverty ratio models (for all age groups) and the median income model, the prior census estimates determine census residuals that are used as regression predictor variables. These census residuals are obtained by fitting a cross-sectional model for 1989, using the 1990 census estimated age group poverty ratio (or median household income) as the dependent variable and the 1989 values of the regression predictors from the administrative data, plus an intercept term, as the independent variables. The residuals from these cross-sectional regressions identify states in which the selected predictors tend to either overestimate or underestimate poverty, as measured by the census.
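The construction of census residuals can be sketched with an ordinary least squares fit. To stay dependency-free, the sketch uses a single predictor plus an intercept; the actual models use several administrative-data predictors, and the function and argument names here are hypothetical.

```python
def census_residuals(census_values, predictor_values):
    """Residuals from a one-predictor least squares fit with intercept.

    census_values: census-based values for each state (for example, a
        1990 census poverty ratio), used as the dependent variable.
    predictor_values: the 1989 value of one predictor for each state.
    A positive residual means the census value exceeds the fitted value.
    """
    n = len(census_values)
    mean_x = sum(predictor_values) / n
    mean_y = sum(census_values) / n
    sxx = sum((x - mean_x) ** 2 for x in predictor_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(predictor_values, census_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x)
            for x, y in zip(predictor_values, census_values)]
```

A state with a positive residual is one where the predictor understates the census measure, and a state with a negative residual is one where it overstates it. Because the fit includes an intercept, the residuals sum to zero across states.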

The Poverty Ratio Models

The model of 1998 state poverty ratios employs the following predictors.

For further information on these variables, go to Information about Data Inputs.

The dependent variable is the 1998 state estimate of the ratio of the number of poor people in the relevant age group to the noninstitutional population of that age, both estimated from the March 1999 CPS.

Estimating the Total Number of Poor People.

We derive the estimate of the total number of poor people in a state by summing the separate model-based estimates of the number of poor people by age (not limited to related children). The age groups with separate models were 1) people under 5 years of age, 2) people age 5 to 17 years, 3) people age 18 to 64 years, and 4) people age 65 years and over.

The Model For Median Household Income

The regression model for the 1998 median household income for states has the following predictor variables:

The dependent variable is the direct estimate of median household income in 1998 from the March 1999 CPS.


Source: U.S. Census Bureau, Data Integration Division, Small Area Estimates Branch