Substate Substance Abuse Estimates from the 1999-2001 NSDUH
This report includes substate region-level estimates of 12 substance use measures (see Section B.2) using the combined data from the 1999, 2000, and 2001 National Surveys on Drug Use and Health (NSDUHs).
The survey-weighted hierarchical Bayes (SWHB) methodology used in the production of State estimates from the 1999-2003 surveys also was used in the production of the 1999-2001 substate estimates. The SWHB methodology is described by Folsom, Shah, and Vaish (1999). A brief discussion of the precision and validation of the estimates and interpretation of the prediction intervals (PIs) is given in Section B.1. Section B.2 lists the 12 substance use measures for which substate-level small area estimates were produced. The list of predictors used in the 1999-2001 substate-level small area estimation (SAE) modeling is given in Section B.3. The improved methodology used to select relevant predictors is described in Section B.4. The goals of SAE modeling, general model description, and the implementation of SAE modeling remain the same and are described in Appendix E of the 2001 State report (Wright, 2003). A general model description is given in Section B.5.
Small area estimates obtained using the SWHB methodology are design consistent (i.e., for States or substates with large sample sizes, the small area estimates are close to the robust design-based estimates). The substate small area estimates, when aggregated using the appropriate population totals, result in national small area estimates that are very close to the national design-based estimates. However, for reasons such as internal consistency, it is desirable to have national small area estimates exactly match the national design-based estimates. Beginning in 2002, exact benchmarking was introduced (see Appendix A, Section A.4, in Wright & Sathe, 2005).
The primary purpose of this report is to give policy officials a better perspective on the range of prevalence estimates within and across States. Because the data were collected in a consistent manner by field interviewers who adhered to the same procedures and administered the same questions across all States and substate areas, the results are comparable across the 50 States and the District of Columbia.
The 95 percent PI associated with each estimate provides a measure of the accuracy of the estimate. It defines the range within which the true value can be expected to fall 95 percent of the time. For example, the prevalence of past month use of marijuana in Region 1 in Alabama is approximately 3.8 percent, and the 95 percent PI ranges from 2.8 to 5.1 percent. Therefore, the probability is 0.95 that the true value is within that range. The PI indicates the uncertainty due to both sampling variability and model bias. The key assumption underlying the validity of the PIs is that the State- and substate-level error (or bias) terms in the models behave like random effects with zero means and common variance components.
A comparison of the standard errors (SEs) among substate areas with small (n ≤ 500), medium (500 < n ≤ 1,000), and large (n > 1,000) sample sizes for the 12 measures in this report shows that the small area estimates behave in predictable ways. Regardless of whether the substate area is from one of the eight States with a large annual sample size (3,000 to 4,000) or one of the other States (n = 900 annually), the sizes of the PIs are very similar and are primarily a function of the sample size of the substate area and the prevalence estimate of the measure. Substate areas with large sample sizes had the smallest SEs.
For past month use of alcohol, where the national prevalence for all persons aged 12 or older was 47.3 percent (for 1999-2001), the average relative standard error (RSE) was about 3.6 percent for substate areas with a sample size greater than 1,000. For substate areas with sample sizes between 500 and 1,000 records, the average RSE was 5.1 percent; for sample sizes smaller than 500, the average RSE was 6.4 percent.
For past month use of marijuana (with a national prevalence of 5.1 percent), the average RSE was 10.5 percent for substate areas with large samples. For medium sample sizes, the average RSE was 14.0 percent, and for samples smaller than 500, the RSE was 16.3 percent. Substance measures with the lowest prevalence, such as past year use of cocaine (1.7 percent nationally), displayed the highest average RSE. For sample sizes greater than 1,000, the average RSE was 15.6 percent. For substate areas of medium sample sizes, the average RSE was 19.0 percent, and for samples smaller than 500, the average RSE was 20.2 percent.
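The RSE figures above can be read as the SE expressed as a percentage of the estimate. The following is a minimal sketch of that relationship, not the report's own code. The function names are hypothetical, and applying an average substate RSE to the national prevalence is only an illustration of scale, because substate prevalences differ from the national figure.

```python
def relative_standard_error(estimate, standard_error):
    """Return the RSE as a percentage of the estimate."""
    if estimate <= 0:
        raise ValueError("estimate must be positive")
    return 100.0 * standard_error / estimate


def implied_standard_error(estimate, rse_percent):
    """Invert the RSE definition to get the SE in the estimate's own units."""
    return estimate * rse_percent / 100.0


def sample_size_class(n):
    """Classify a substate sample size.

    Assumption: the boundaries are read as n <= 500 (small),
    500 < n <= 1,000 (medium), and n > 1,000 (large).
    """
    if n <= 500:
        return "small"
    if n <= 1000:
        return "medium"
    return "large"


# Illustration only: national prevalence paired with the average RSE
# reported for large-sample substate areas.
alcohol_se = implied_standard_error(47.3, 3.6)     # in percentage points
marijuana_se = implied_standard_error(5.1, 10.5)   # in percentage points
print(round(alcohol_se, 2), round(marijuana_se, 2))
```

Under these assumptions, a low-prevalence measure can have a much larger RSE than a high-prevalence measure even though its SE in percentage points is smaller, which is the pattern the comparison above describes.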
The SAE methods used for substate regions in this report were previously validated for the NSDUH State-by-age group small area estimates (Wright, 2002). This validation exercise used direct estimates from pairs of large sample States (n = 7,200) as internal benchmarks. These internal benchmarks were compared with small area estimates based on random subsamples (n = 900) that mimicked a single year small State sample. The associated age group-specific small area estimates were based on sample sizes targeted at n = 300. Therefore, validation of the State-by-age group small area estimates should lend some validity to the small sample size substate small area estimates reported here.
Further validation of the substate region small area estimates is being pursued. It may be possible to compare the NSDUH substate estimates with those from State-sponsored surveys having similar data collection procedures. Internal benchmarking to direct NSDUH estimates also is possible for seven of the largest sample substate areas. Pooling of substate areas with similar characteristics also could yield useful benchmarks.
Substate-level small area estimates were produced for the following set of 12 binary (0, 1) substance use measures, using the 1999-2001 NSDUHs:
Local area data used as potential predictor variables in the mixed logistic regression models were obtained from several sources, including Claritas, the U.S. Bureau of the Census, the Federal Bureau of Investigation (FBI) (Uniform Crime Reports), Health Resources and Services Administration (Area Resource File), the Substance Abuse and Mental Health Services Administration (SAMHSA) (National Survey of Substance Abuse Treatment Services [N-SSATS]), and the National Center for Health Statistics (mortality data). The list of sources of data used in the modeling is provided below.
To obtain a detailed list of predictors, please see Appendix A, Section A.2, of the 2002-2003 State estimates report (Wright & Sathe, 2005).
To produce small area estimates based on the pooled 1999-2001 NSDUH data, the fixed effect predictors were selected using the following methodology:
The model described here is similar to the logistic mixed hierarchical Bayes (HB) model that has been used successfully since the 1999 NSDUH to produce age group-specific small area estimates for the 50 States and the District of Columbia. The following model was used:
$$\log\left[\frac{\pi_{aijk}}{1 - \pi_{aijk}}\right] = x_{aijk}^{\top}\beta_a + \eta_{ai} + v_{aij},$$
where $\pi_{aijk}$ is the probability of engaging in the behavior of interest (e.g., to use marijuana in the past month) for person-k belonging to age group-a in substate region-j of State-i. Let $x_{aijk}$ denote a $p_a \times 1$ vector of auxiliary variables associated with age group-a and $\beta_a$ denote the associated vector of regression parameters. The age group-specific vectors of auxiliary variables are defined for every block group in the Nation and also include person-level demographic variables, such as race/ethnicity and gender. The vectors of random effects $\eta_i = (\eta_{1i}, \ldots, \eta_{Ai})$ and $v_{ij} = (v_{1ij}, \ldots, v_{Aij})$ are assumed to be mutually independent with $\eta_i \sim N_A(0, D_\eta)$ and $v_{ij} \sim N_A(0, D_v)$, where $A$ is the total number of individual age groups modeled (generally $A = 4$). For HB estimation purposes, an improper uniform prior distribution is assumed for $\beta_a$, and proper inverse Wishart prior distributions are assumed for $D_\eta$ and $D_v$. The HB solution for $\pi_{aijk}$ involves a series of complex Markov Chain Monte Carlo (MCMC) steps to generate values of the desired fixed and random effects from the underlying joint distribution. The basic process is described in Folsom et al. (1999), Shah, Barnwell, Folsom, and Vaish (2000), and Wright (2003).
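To make the structure of the logistic mixed model concrete, the sketch below evaluates the predicted probability for one person from a fixed-effect term plus a State-level and a substate-level random effect for that person's age group. It is an illustration only and does not perform the MCMC estimation. The function names, predictor values, coefficients, and random-effect values are all hypothetical.

```python
import math


def inverse_logit(z):
    """Map a value on the logit scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predicted_probability(x, beta, state_effect, substate_effect):
    """Probability implied by a logistic mixed model for one person.

    x               -- auxiliary variables for the person's age group
    beta            -- regression parameters for that age group
    state_effect    -- State-level random effect for the age group
    substate_effect -- substate-level random effect for the age group
    """
    if len(x) != len(beta):
        raise ValueError("x and beta must have the same length")
    linear = sum(xi * bi for xi, bi in zip(x, beta))
    return inverse_logit(linear + state_effect + substate_effect)


# Hypothetical inputs: an intercept and two indicator predictors.
x = [1.0, 0.0, 1.0]
beta = [-3.0, 0.4, 0.2]
p = predicted_probability(x, beta, state_effect=0.1, substate_effect=-0.05)
print(round(p, 4))
```

In an HB fit, a calculation of this form would be repeated for each MCMC draw of the fixed and random effects, so that each draw yields its own set of probabilities.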
Once the required number of MCMC samples for the parameters of interest is generated and tested for convergence properties (see Raftery & Lewis, 1992), the small area estimates for each age group by race/ethnicity by gender cell within a block group can be obtained. These block group-level small area estimates then can be aggregated using the appropriate population count projections to form State- and substate-level small area estimates for the desired age group(s). These small area estimates then are benchmarked to the national design-based estimates (see Appendix A, Section A.4, in Wright & Sathe, 2005).
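The aggregation step can be sketched as a population-weighted average of cell-level estimates. This is a simplified illustration with hypothetical names and made-up cell values. It handles a single set of cell estimates and does not cover the benchmarking step or the per-draw MCMC processing.

```python
def aggregate_prevalence(cells):
    """Aggregate cell-level prevalence estimates to an area-level estimate.

    cells -- list of (population_count, estimated_prevalence) pairs, one per
             age group by race/ethnicity by gender cell within a block group
    """
    total_population = sum(pop for pop, _ in cells)
    if total_population <= 0:
        raise ValueError("total population must be positive")
    expected_users = sum(pop * prevalence for pop, prevalence in cells)
    return expected_users / total_population


# Hypothetical cells belonging to one substate region.
region_cells = [(1200, 0.04), (800, 0.06), (2000, 0.05)]
region_estimate = aggregate_prevalence(region_cells)
print(round(region_estimate, 4))
```

Weighting by population count projections, rather than averaging the cell estimates directly, keeps large cells from being underrepresented in the area-level estimate.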
Incidence rates are typically calculated as the number of new initiates of a substance during a period of time (such as in the past year) divided by the estimate of the number of person years of exposure (in thousands). This report uses a simpler definition, based on the model-based methodology, as follows:
Average annual incidence rate = {(Number of marijuana initiates in past 24 months) /
[(Number of marijuana initiates in past 24 months * 0.5) +
Number of persons who never used marijuana]} / 2.
For details on calculating the average annual rate of first use of marijuana, see Appendix A, Section A.6, of the 2002-2003 State estimates report (Wright & Sathe, 2005).
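The average annual incidence rate formula translates directly into a short function. The sketch below uses a hypothetical function name and made-up counts. It assumes, as the formula does, that each initiate in the past 24 months contributes half of the period as exposure.

```python
def average_annual_incidence_rate(initiates_24_months, never_used):
    """Average annual rate of first use of marijuana.

    initiates_24_months -- number of marijuana initiates in the past 24 months
    never_used          -- number of persons who never used marijuana
    """
    exposure = 0.5 * initiates_24_months + never_used
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    two_year_rate = initiates_24_months / exposure
    return two_year_rate / 2.0  # convert the 24-month rate to an annual average


# Hypothetical counts, not taken from the report.
rate = average_annual_incidence_rate(initiates_24_months=100, never_used=950)
print(rate)
```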
This page was last updated on January 15, 2009.