State and County Literacy Estimates

General Cautions

Model dependence
Comparisons between states, counties, or years
Uncertainty in estimates

Model dependence

Unlike standard direct survey estimates, the indirect estimates of the percentages of adults lacking Basic Prose Literacy Skills (BPLS) are generated by a statistical model and therefore depend on the appropriateness of the modeling assumptions. To check the effects of these assumptions, alternative models were constructed using different prior distributions and different sets of auxiliary variables. These analyses supported the choice of the final model and indicated that the indirect estimates were not sensitive to the model variants that were investigated. However, users should still be aware that indirect estimates always depend on the particular model used to produce them. For details, see the SAE Technical Report.

Comparisons between states, counties, or years

Ability to detect differences

Users should keep in mind that the credible intervals for differences between county indirect estimates are often wide, which limits the ability to detect statistically both between-county differences and between-year differences for a given county. Only large differences are likely to be detectable. In 43 states, at least one difference between counties can be detected.¹ Because of the limited precision of the individual county estimates, only about 1 percent of the county-level differences between 1992 and 2003 are statistically detectable, so for most counties it is not possible to state that the percentage of adults lacking BPLS changed over time.

Multiple comparisons

When comparing estimates of the percentage of adults lacking BPLS for different states, counties, or years, users need to pay close attention to the number of comparisons made. Even when the credible interval for a difference does not include 0, there is a statistical risk that no true difference exists. As the number of comparisons increases, so does the risk of a Type I error, that is, of falsely concluding that there is a significant difference for one or more of the comparisons. To focus users on specific comparisons, the pairwise comparison tool is constructed to allow only one comparison at a time.
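As a rough illustration of why the number of comparisons matters, the sketch below computes the chance of at least one false detection across several comparisons. It assumes, purely for illustration, that each comparison independently carries a 5 percent risk of a false detection; that independence assumption, the 5 percent figure, and the function name are hypothetical and are not part of the estimation methodology.

```python
def chance_of_any_false_detection(num_comparisons, per_comparison_risk=0.05):
    """Chance of at least one false detection across the comparisons.

    Assumes the comparisons are independent and share the same risk,
    which is a simplification made only for this illustration.
    """
    if num_comparisons < 0:
        raise ValueError("num_comparisons must be non-negative")
    # Chance that every comparison avoids a false detection, then its complement.
    return 1.0 - (1.0 - per_comparison_risk) ** num_comparisons


# The risk grows quickly as more comparisons are made.
for m in (1, 5, 20, 100):
    print(m, round(chance_of_any_false_detection(m), 3))
```

Under these assumptions, a single comparison carries the stated 5 percent risk, while a user who examines many pairs of counties is far more likely to see at least one spurious difference.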

Exact credible intervals versus approximate credible intervals

Due to the enormous number of possible pairwise comparisons between counties across the nation (about 5 million comparisons involving more than 3,000 counties), exact credible intervals have been produced only for differences between pairs of counties that are within the same state for the 2003 NAAL. For all other comparisons, an approximation was used. A description of the approximation and its performance is included in the Estimation Approach.
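The figure of about 5 million comparisons follows from counting unordered pairs of counties. The sketch below shows the arithmetic; the value 3,141 is an assumed county count used only for illustration, since the text states only "more than 3,000", and the function name is hypothetical.

```python
import math


def number_of_pairs(num_areas):
    """Number of distinct unordered pairs that can be formed from num_areas areas."""
    return math.comb(num_areas, 2)


lower_bound_counties = 3000  # "more than 3,000 counties" in the text
assumed_counties = 3141      # assumed count, for illustration only

print(number_of_pairs(lower_bound_counties))
print(number_of_pairs(assumed_counties))
```

Either count yields several million pairs, which is why exact intervals were limited to within-state county pairs.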

Comparability with other published NALS and NAAL literacy estimates

Several changes were made to the 1992 data after their public release in order to improve their comparability with the 2003 data. For example, changes were made to the 1992 measurement scales to enable valid comparisons to the 2003 scales: Below Basic, Basic, Intermediate, and Proficient. In addition, the literacy scales used in the main NAAL and NALS analyses excluded adults who were not able to take the assessment because of a language barrier. However, these adults are included in the indirect estimates and are classified as lacking BPLS on the grounds that they can be considered to be at the lowest level of English literacy. For these reasons, the indirect estimates of the percentages of adults lacking BPLS are not comparable to the percentages of adults Below Basic in prose literacy in other NAAL or NALS published results.

Uncertainty in estimates

Level of uncertainty

Considerable care was taken to identify a wide range of auxiliary variables that could be used in the modeling, and extensive efforts were made to select the sets of auxiliary variables that best predicted the percentages of adults lacking Basic Prose Literacy Skills (BPLS) for the 2003 and 1992 models. Although the sets of variables did contribute significantly to the models, their predictive ability is limited. The state estimates are more precise than the county estimates, and further gains in precision are achieved in the estimates for states participating in the SAAL and the SALS as a result of the increased sample sizes for these states.

It is important to take the prediction error in model-dependent indirect estimates into account when interpreting them. Users need to pay careful attention to the 95 percent credible interval bounds that are provided along with the indirect estimates to assess the range of uncertainty in the estimates. The credible intervals for the county indirect estimates have a median width of 14.5 percentage points for 2003 and 18.2 percentage points for 1992.

Credible intervals

Credible intervals have been computed to indicate the levels of uncertainty in the indirect estimates of the percentages of adults lacking Basic Prose Literacy Skills (BPLS). A credible interval is a posterior probability interval, used in Bayesian statistics for purposes similar to those of a confidence interval in traditional statistics. A 95 percent credible interval for an estimate of the percentage of adults lacking BPLS in a county gives a range that has a probability of 0.95 of containing the true percentage of adults lacking BPLS. For example, suppose a county's estimate for the percentage of adults lacking BPLS is 12 percent with a 95 percent credible interval of 5 to 25 percent (as in this example, the intervals are generally asymmetric around the estimate). This indicates that there is a probability of 0.95 that the actual value is between 5 and 25 percent.
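To make the idea of a posterior probability interval concrete, the sketch below derives a 95 percent interval from simulated posterior draws by taking the 2.5th and 97.5th percentiles (an equal-tailed interval). This is one common construction and is shown only as an illustration; the page does not state which construction was used. The draws come from an arbitrary Beta distribution chosen so that the mean is 12 percent, and the function name and all values are hypothetical.

```python
import random
import statistics


def credible_interval_95(posterior_draws):
    """Equal-tailed 95 percent interval: the 2.5th and 97.5th percentiles of the draws."""
    # With n=40, the cut points fall at 2.5%, 5%, ..., 97.5%.
    cut_points = statistics.quantiles(posterior_draws, n=40)
    return cut_points[0], cut_points[-1]


# Hypothetical posterior draws for one county's percentage lacking BPLS.
# Beta(3, 22) is an arbitrary stand-in with mean 0.12; it is not the actual model.
rng = random.Random(2003)
draws = [100 * rng.betavariate(3, 22) for _ in range(20000)]

lower, upper = credible_interval_95(draws)
print(round(statistics.mean(draws), 1), round(lower, 1), round(upper, 1))
```

Because the simulated distribution is skewed toward higher values, the upper bound lies farther from the mean than the lower bound does, which mirrors the asymmetry noted in the example above.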

Comparisons

Exact 95 percent credible intervals have been produced for the differences between the estimated percentages of adults lacking BPLS in 2003 for all pairs of states and for all pairs of counties within the same state. Exact credible intervals have not been computed for any other comparisons. For these other comparisons, the analysis has been restricted to determining whether or not the 95 percent credible interval for the difference contains 0, using the following approach:

If the credible interval for the estimate for county (or state) i does not overlap with the credible interval for the estimate for county (or state) j, then the credible interval for the difference does not contain 0.
If the credible interval for the estimate for county (or state) i is fully nested within the credible interval for the estimate for county (or state) j, then the credible interval for the difference contains 0.
If the credible intervals for the estimates for the two counties or two states partially overlap, then the determination is based on the calculation of an approximate credible interval.
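The three rules above can be sketched as a small decision function. This is a minimal illustration and not the code behind the comparison tool: the function name and return labels are hypothetical, the nesting rule is applied in either direction on the assumption that the labels i and j are interchangeable, and intervals that merely touch at an endpoint are treated as overlapping. The approximate interval for the partial-overlap case is not implemented here because its details are given in the Estimation Approach.

```python
def compare_intervals(interval_i, interval_j):
    """Classify a comparison from two (lower, upper) credible intervals.

    Returns one of three illustrative labels:
      "excludes 0"                 - intervals do not overlap
      "contains 0"                 - one interval is fully nested in the other
      "needs approximate interval" - intervals partially overlap
    """
    low_i, high_i = interval_i
    low_j, high_j = interval_j

    # Rule 1: no overlap, so the interval for the difference does not contain 0.
    if high_i < low_j or high_j < low_i:
        return "excludes 0"

    # Rule 2: full nesting (checked in both directions, an assumption made here).
    i_in_j = low_j <= low_i and high_i <= high_j
    j_in_i = low_i <= low_j and high_j <= high_i
    if i_in_j or j_in_i:
        return "contains 0"

    # Rule 3: partial overlap; an approximate interval would be needed.
    return "needs approximate interval"


print(compare_intervals((5, 10), (12, 25)))
print(compare_intervals((8, 12), (5, 25)))
print(compare_intervals((5, 15), (10, 25)))
```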

For more detail on the comparison of estimates, see Estimation Approach.

¹ The states without detectable county-level pairwise differences are Delaware, Hawaii, Maine, New Hampshire, Rhode Island, West Virginia, and Wisconsin.