Appendix E: State Estimation Methodology

This report includes estimates of 19 substance use measures. Twelve of the measures used the same definition for 1999 through 2001 and have estimates of change between 1999–2000 and 2000–2001, the difference of two 2-year moving averages. Six substance abuse and dependence measures used the same definition for 2000 and 2001, but not for 1999; therefore, only the estimates for 2000–2001 are provided. One new measure, serious mental illness (SMI), was introduced in 2001, and State estimates have been produced for that single year.

This appendix describes the methodology used to measure change in State estimates (Section E.1), the validation of that methodology (Section E.2), the validation of the estimates of prevalence levels based on the combined 1999–2000 National Household Survey on Drug Abuse (NHSDA) data (Section E.3), caveats regarding small area estimation (SAE) (Section E.4), and the general methodology (hierarchical Bayes) used to create the State estimates (Section E.5). Included at the end of this appendix are tables showing the State response rates for 1999–2001, the State sample sizes for 1999–2001, and the State sample sizes for the 2001 incentive experiment.

E.1. Measuring Change in State Estimates Between 1999–2000 and 2000–2001

The estimates of change in State estimates presented in this report are based on the 1999 through 2001 NHSDAs. State estimates for 1999–2000 and 2000–2001 were produced by combining State-level NHSDA data with local-area county and Census block group/tract-level predictor variables from the States for the two time periods. The SAE methodology for estimating change is described in this section, while Section E.5 provides a general overview of SAE methodology. The moving average State prevalence estimates displayed in Appendix A for the overlapping 1999–2000 and 2000–2001 time periods were obtained from independent applications of RTI's survey-weighted hierarchical Bayes (SWHB) methodology.

The State estimates for 1999–2000 are the model-based small area estimates previously published by the Substance Abuse and Mental Health Services Administration (SAMHSA) (see Wright, 2002a, 2002b). These estimates were derived by first fitting logistic mixed models to the pooled 1999–2000 survey dataset. These models fit separate fixed and random effects for each of four age groups. Each age group model had 51 State-level random effects and 300 substate region-level random effects. The fixed predictor variables for each age group were defined at five levels, namely, person-level demographics, 1990 decennial Census block group-level items, tract-level items, county variables, and State variables. The same fixed predictors were used for all 3 years (1999, 2000, and 2001) of data but annual updates were made when more current versions became available.

Having estimated the common fixed and random effects from the pooled 1999–2000 dataset, year-specific predicted probabilities of substance use were formed at the block group (b) level for each of eight gender (2) by race/ethnicity (4) domains (d) within each of four age groups (a).

Year specificity in the State estimates was induced by updating the fixed predictor variables annually and by using year-specific block group-level population projections for the 32 age by gender by race/ethnicity domains to weight together the domain-specific probabilities of use. These year-t population projections, $N_{bad}(t)$, were purchased from Claritas Inc. Letting $\hat{\pi}_{bad}(t)$ denote the predicted probability of substance use for the age group a by race/ethnicity by gender subpopulation d in block group b for year t, the age group-specific estimates for State i were computed as population-weighted averages of the form

$$\hat{\pi}_{ia}(t) = \frac{\sum_{b \in U_i} \sum_{d=1}^{8} N_{bad}(t)\, \hat{\pi}_{bad}(t)}{\sum_{b \in U_i} \sum_{d=1}^{8} N_{bad}(t)},$$

where the summation over b extends over all the block groups belonging to the State i universe $U_i$ (that is, $b \in U_i$). Note that the domain d summations extend over the eight age group-specific gender by race/ethnicity domains within each block group.
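The population-weighted averaging described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name, the tuple layout, and the numbers are hypothetical and are not taken from the NHSDA processing system.

```python
from collections import defaultdict


def state_age_estimates(cells):
    """Population-weighted average of cell-level predicted probabilities.

    cells: iterable of (state, age_group, population, probability) tuples,
    one per block group by gender by race/ethnicity domain
    (a hypothetical layout chosen for this sketch).
    """
    weighted_sum = defaultdict(float)
    population = defaultdict(float)
    for state, age_group, pop, prob in cells:
        weighted_sum[(state, age_group)] += pop * prob
        population[(state, age_group)] += pop
    return {key: weighted_sum[key] / population[key] for key in weighted_sum}


# Made-up numbers for illustration only (not NHSDA data).
example_cells = [
    ("State X", "12-17", 100, 0.10),
    ("State X", "12-17", 300, 0.20),
    ("State X", "12-17", 50, 0.40),
    ("State X", "12-17", 50, 0.00),
]
estimates = state_age_estimates(example_cells)
print(estimates[("State X", "12-17")])
```

Cells with larger projected populations pull the State estimate toward their own predicted probability, which is how the annual population projections induce year specificity.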

To produce the 1999–2000 pooled estimates, the common fixed and random effect estimates were first employed to form State estimates $\hat{\pi}_{ia}(1999)$ and $\hat{\pi}_{ia}(2000)$ for 1999 and 2000, respectively. These annualized State estimates were then combined as population-weighted averages of the form

$$\hat{\pi}_{ia}(1999, 2000) = \frac{N_{ia}(1999)\, \hat{\pi}_{ia}(1999) + N_{ia}(2000)\, \hat{\pi}_{ia}(2000)}{N_{ia}(1999) + N_{ia}(2000)},$$

where $N_{ia}(t)$ denotes the year-t population projection for age group a in State i, calculated as the sum of the block group-level projections $N_{bad}(t)$ over all eight gender by race/ethnicity domains d and all block groups b in State i. The SWHB versions of these pooled estimates were computed as posterior means over 1,250 Gibbs samples drawn from the joint posterior distribution of the fixed and random effects. The 95 percent asymmetric prediction intervals (PIs) for these pooled 1999–2000 prevalence estimates were first formed as symmetric, approximately Gaussian, Bayes credible intervals on the log-odds scale. The end points of these log-odds symmetric intervals then were transformed back to the prevalence scale.
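The pooling step and the back-transformed interval can be sketched as follows. This is a hedged illustration rather than the production computation: it assumes the interval is centered at the mean of the log-odds of the Gibbs draws with a multiplier of 1.96, and the function names and toy draws are hypothetical.

```python
import math
import statistics


def pooled_prevalence(n_first, p_first, n_second, p_second):
    """Population-weighted average of two annual prevalence estimates."""
    return (n_first * p_first + n_second * p_second) / (n_first + n_second)


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def log_odds_interval(draws, z=1.96):
    """Symmetric interval on the log-odds scale, mapped back to prevalence.

    draws: posterior draws of a prevalence (each strictly between 0 and 1).
    Assumption for this sketch: center = mean of the log-odds draws,
    half-width = z times their standard deviation.
    """
    log_odds = [logit(p) for p in draws]
    center = statistics.mean(log_odds)
    spread = statistics.stdev(log_odds)
    return expit(center - z * spread), expit(center + z * spread)


# Toy draws for illustration only; the report used 1,250 Gibbs samples.
example_draws = [0.08, 0.10, 0.12]
lower, upper = log_odds_interval(example_draws)
print(lower, upper)
```

Because the back-transformation is nonlinear, an interval that is symmetric on the log-odds scale becomes asymmetric on the prevalence scale, with the longer arm pointing toward 50 percent.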

The State by age group prevalence estimates derived from the pooled 2000 and 2001 survey data were produced by refitting the logistic mixed models. In this independent refitting of the models, updated versions of the fixed predictors were used with the 2001 survey responses when updates were available. This refitting resulted in a new set of age group-specific fixed and random effects for the combined 2000 and 2001 surveys. As described previously, 1,250 Gibbs sample draws from the joint posterior distribution of these fixed and random effect parameters were used to calculate posterior means and 95 percent prediction intervals for the pooled 2000 and 2001 State i by age group a prevalence estimates.

The 2000 and 2001 models were fit independently of the previously fit 1999 and 2000 models. This independent analysis approach was followed because there was no desire to revise the previous estimates and the associated moving average change measures as the result of jointly modeling all 3 years of survey data. This approach does have a shortcoming when computing the Bayes significance level for an estimated moving average change measure. Specifically, one needs to estimate the posterior variance of a change measure defined as the log-odds ratio:

$$lor_{ia} = \ln\left[\frac{\hat{\pi}_{ia}(2000, 2001) \,/\, \{1 - \hat{\pi}_{ia}(2000, 2001)\}}{\hat{\pi}_{ia}(1999, 2000) \,/\, \{1 - \hat{\pi}_{ia}(1999, 2000)\}}\right],$$

where $\hat{\pi}_{ia}(1999, 2000)$ and $\hat{\pi}_{ia}(2000, 2001)$ denote the pooled 1999–2000 and 2000–2001 prevalence estimates for age group a in State i.

A change measure like the log-odds ratio is favored over the simple difference because the Bayes significance calculation is much less burdensome when the posterior distribution of the change measure is approximately Gaussian, as is the case for $lor_{ia}$ but not for the simple difference. Calculating the posterior variance of $lor_{ia}$ can be accomplished by using the posterior variance statistics that were previously obtained from the independent Markov chain Monte Carlo (MCMC) chains.

To complete the variance calculation for the log-odds ratio, a correlation estimate for the two log-odds statistics is required. To approximate this correlation, the 1999–2000 and 2000–2001 models were fit simultaneously. This simultaneous fit yielded an MCMC sample of 1,250 draws from the joint posterior distribution of both sets of fixed and random effects. To accommodate this simultaneous fitting of the 1999–2000 and 2000–2001 models, a concatenated dataset containing both of the pooled samples was created. Because the PROC GIBBS software allows for separate logistic mixed models for a set of nonoverlapping subpopulations, it was possible to simultaneously fit eight age group (4) by dataset (2) models as if there were no overlap in the two datasets. This simultaneous solution yielded a set of 1,250 MCMC replicates for the two overlapping log-odds statistics. In these simultaneous models, the eight age group by dataset random effects for each State and for each substate region were allowed to have a general variance-covariance matrix. It was hoped that these random effect covariances between datasets would largely account for the 2000 survey overlap.
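The role of the correlation in this variance calculation can be sketched as follows. The variance identity for a difference of two correlated quantities is standard; the two-sided Gaussian significance formula is an assumption made for this illustration, since the exact Bayes significance computation is not spelled out here. All names and numbers are hypothetical.

```python
import math


def log_odds_ratio_variance(var_first, var_second, correlation):
    """Variance of (second log-odds minus first log-odds).

    var_first, var_second: posterior variances of the two log-odds statistics.
    correlation: posterior correlation between them.
    """
    covariance = correlation * math.sqrt(var_first * var_second)
    return var_first + var_second - 2.0 * covariance


def gaussian_two_sided_level(log_odds_ratio, variance):
    """Two-sided significance level under a Gaussian approximation
    (assumed form, used here only for illustration)."""
    z = abs(log_odds_ratio) / math.sqrt(variance)
    return math.erfc(z / math.sqrt(2.0))


# Made-up inputs: the same change and variances, two different correlations.
for rho in (0.3, 0.9):
    variance = log_odds_ratio_variance(0.01, 0.01, rho)
    print(rho, variance, gaussian_two_sided_level(0.15, variance))
```

Holding the two log-odds variances fixed, a smaller correlation gives a larger variance for the log-odds ratio, and therefore wider intervals and larger significance levels.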

In the process of conducting the SAE change measure validation study (reported on in Section E.2), it was observed that the 95 percent prediction intervals for two of the SAE odds ratios (namely, those for past month alcohol use and past month cigarette use) were approximately as wide as or wider than the 95 percent confidence intervals (CIs) for the associated design-based odds ratio estimates. These interval comparisons are displayed in Table E.1. It had also been previously noted that the prediction intervals for the two SAE-based log-odds statistics involved in the log-odds ratios were substantially narrower than the corresponding design-based intervals. Therefore, it was clear that the correlations between the two log-odds statistics over the MCMC samples were substantially smaller than their design-based counterparts. Table E.2 shows these underestimated correlations as compared with their design-based counterparts.

These model-based MCMC correlations were underestimated as a consequence of the faulty assumption that the eight age group by dataset subpopulations in the simultaneous models were nonoverlapping. The overlap associated with the 2000 survey data was not adequately accounted for by the random effect correlations. There is an alternative form of the odds ratio estimator that employs nonoverlapping subpopulations and provides for proper MCMC-based correlation estimation. This odds ratio for change is based on simultaneously fitting the three annual models to produce 1,250 MCMC samples from the joint posterior distribution of the triple $\tilde{\pi}_{ia}(1999)$, $\tilde{\pi}_{ia}(2000)$, and $\tilde{\pi}_{ia}(2001)$. For this simultaneous model, there are 12 age (4) by year (3) subpopulation-specific models, each with its own set of fixed and random effects. In this case, the general covariance matrices for the State and substate random effects are 12 by 12 matrices corresponding to the 12-element (age group by year) vectors of random effects. The associated odds ratio is based on the pooled prevalences:

$$\tilde{\pi}_{ia}(1999, 2000) = \frac{N_{ia}(1999)\, \tilde{\pi}_{ia}(1999) + N_{ia}(2000)\, \tilde{\pi}_{ia}(2000)}{N_{ia}(1999) + N_{ia}(2000)}$$

and

$$\tilde{\pi}_{ia}(2000, 2001) = \frac{N_{ia}(2000)\, \tilde{\pi}_{ia}(2000) + N_{ia}(2001)\, \tilde{\pi}_{ia}(2001)}{N_{ia}(2000) + N_{ia}(2001)},$$

where $N_{ia}(t)$ is the year-t population projection for age group a in State i.

Note that the survey-weighted Bernoulli-type log likelihood employed in PROC GIBBS was appropriate for this simultaneous model because the 12 age group by year subpopulations were nonoverlapping. The purpose of using the more complex 2-year averaging scheme described previously was to minimize bias. If one assumes the fixed and random effects are common for the 2 years being pooled, this should yield small area estimates that are closer to the design-based estimates than the estimators above where year-specific parameters were assumed. For the odds ratio based on the averaged prevalence estimates, it is clear that the correlation between the two log-odds statistics should be high. This follows from the fact that the 2000 prevalence estimate is common to the two population-weighted averages. These correlation estimates based on the alternative estimator more properly reflect the true correlations associated with the type of averages presented in the body of this report. Table E.3 is similar to Table E.1 except that the prediction intervals were obtained using the correlations from the alternative method. Table E.4 displays the correlations from the alternative method and the corresponding design-based correlations. Tables E.5 to E.8 contrast the Bayes significance levels for these two correlation estimators. Note that the revised significance estimates [p value(2)] are smaller than the original ones [p value(1)]; they are about 20 percent smaller for past month use of cigarettes, alcohol, and marijuana, and about 6 percent smaller for past year use of cocaine.
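Why a shared 2000 term induces correlation between the two pooled log-odds statistics can be seen in a small simulation. This is a sketch under strong simplifying assumptions that are not part of the report's model: the three annual prevalences are drawn independently with equal spread, and the population weights are equal. The seed, the prevalence level, and the spread are arbitrary.

```python
import math
import random


def logit(p):
    return math.log(p / (1.0 - p))


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(2001)  # arbitrary seed so the sketch is repeatable

# 1,250 simulated draws of (1999, 2000, 2001) prevalences, independent by assumption.
draws = [[0.10 + rng.gauss(0.0, 0.005) for _ in range(3)] for _ in range(1250)]

# Equal-weight 2-year averages; the 2000 draw appears in both.
first = [logit((p99 + p00) / 2.0) for p99, p00, _ in draws]
second = [logit((p00 + p01) / 2.0) for _, p00, p01 in draws]

shared_corr = pearson(first, second)
print(shared_corr)
```

Even with independent annual draws, the two averages are clearly positively correlated because they share the 2000 component; positive correlation between years in the actual model would push the correlation higher still.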

E.2. Validation of Methodology to Measure Change

To validate the SAE models for estimating change between the pooled 1999–2000 small area estimates and the pooled 2000–2001 small area estimates, the design-based estimates of change for the eight large sample States were used as internal benchmarks. The eight large sample States had 2-year sample sizes that ranged between 6,200 and 9,700. Estimates were produced for four outcome variables representative of a range of prevalence rates: past year use of cocaine, past month use of marijuana, past month use of cigarettes, and past month use of alcohol. The goal of the validation was to compare the estimates for small States utilizing the SAE methodology with estimates based on the internal benchmarks.

E.2.1 Replicate Formation Methodology

The validation study was performed by first subsampling the eight large States; for each of these large States, four sample replicates ("pseudo" small States) were formed that mimicked the design properties of the 42 small States and the District of Columbia. A key feature of this replicate formation strategy was mimicking the 50 percent overlap between the 1999 and 2000 samples of 96 area segments and between the 2000 and 2001 segment samples in each small sample State. Because new samples of dwellings and persons were drawn from all sample segments every year, the survey design-induced covariance between years is limited to this 50 percent overlap of sample block groups/segments.

Exhibit E.1 presents the 50 percent segment overlap plan for the 3 survey years. Note that there are 48 field interviewer (FI) regions in each of the eight large States and 12 FI regions in each of the 42 small States and the District of Columbia. Each FI region has four quarters, and each quarter is then expected to have two area segments. For various reasons, some of the FI region-by-quarter slots may be empty. In the following illustration, segments A, C, E, and G in 1999 were kept in 2000. Segments B, D, F, and H were replaced by segments I, J, K, and L in 2000. In 2001, the segments I, J, K, and L of 2000 were kept, and segments A, C, E, and G from 2000 were replaced by segments M, N, O, and P.

Exhibit E.1 Sample Segment 50 Percent Overlap Plan for the 1999, 2000, and 2001 NHSDAs

FI Region	Quarter	1999 Segment	2000 Segment	2001 Segment
1	1	A	A	M
	1	B	I	I
	2	C	C	N
	2	D	J	J
	3	E	E	O
	3	F	K	K
	4	G	G	P
	4	H	L	L

FI = field interviewer.

To select the four pseudo small State samples from each large State, 12 pseudo FI regions were first created within each large sample State by pooling their 48 initial FI regions into groups of 4. Each of these pseudo FI regions then was expected to have eight area segments per calendar quarter (see Exhibit E.2). For each of these pseudo FI region-by-quarter sets of eight area segments, any segments that were devoid of interviews were first randomly replaced by a selection from the non-empty segments in the set. The segments for the 1999, 2000, and 2001 NHSDA data were filled in separately. Once complete sets of eight non-empty segments for the 1999, 2000, and 2001 NHSDA data in each of the pseudo FI region-by-quarter sets were assembled, the 1999, 2000, and 2001 data were linked using State-by-pseudo FI region-by-quarter-by-segment identification codes.

Exhibit E.2 An Example of Sample Segment Assignment in Pseudo FI Regions in 1999, 2000, and 2001 NHSDAs

Pseudo FI Region	Quarter	1999 Segment	2000 Segment	2001 Segment
1	1	a	a	m
		b	i	i
		c	c	n
		d	j	j
		e	e	o
		f	k	k
		g	g	p
		h	l	l

FI = field interviewer.

Let a, b, c, d, e, f, g, and h denote the eight segments in quarter 1 of pseudo FI region 1 in 1999. Approximately half of the eight segments represented cases where the 1999 segments were reused in 2000 (i.e., common segments a, c, e, and g in 1999 and 2000), and the remaining segments b, d, f, and h represented cases where 1999 segments were linked with new 2000 replacement segments i, j, k, and l. Similarly, between 2000 and 2001, segments i, j, k, and l are common segments, whereas segments a, c, e, and g are linked to new segments m, n, o, and p.

Next, the eight linked 1999 and 2000 segment pairs were stratified into two strata: the common segment pairs and the uncommon 1999 and 2000 segment pairs. For each of the four pseudo small States, one segment pair was randomly drawn from each stratum, so that each pseudo State contained one pair with a common segment in the 1999 and 2000 surveys and one pair with different segments in the two years. The 2001 segments then were forced into the same pseudo States according to the linkage between the 2000 and 2001 sample segments. For example, if segment "g" was assigned to pseudo State 1 in 1999, then segment "g" in 2000 (common between 1999 and 2000) and its linked 2001 replacement, segment "p," also were forced into pseudo State 1. Exhibit E.3 demonstrates a typical assignment of segments among the four pseudo States for the 1999, 2000, and 2001 NHSDAs.

Exhibit E.3 Typical Assignment of Segments among Four Pseudo States for 1999, 2000, and 2001 NHSDAs

Pseudo FI Region	Quarter	Pseudo State	1999 Segment	2000 Segment	2001 Segment
1	1	1	g	g	p
		1	b	i	i
		2	a	a	m
		2	h	l	l
		3	e	e	o
		3	d	j	j
		4	c	c	n
		4	f	k	k

FI = field interviewer.
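The assignment illustrated in Exhibit E.3 can be sketched for a single pseudo FI region-by-quarter set as follows. The segment labels follow Exhibit E.2; the function name and the use of a seeded random shuffle are assumptions made for this illustration, not a description of the software actually used.

```python
import random

# Linked (1999, 2000, 2001) segment chains from Exhibit E.2.
# "Common" chains reuse the 1999 segment in 2000; "uncommon" chains do not.
COMMON = [("a", "a", "m"), ("c", "c", "n"), ("e", "e", "o"), ("g", "g", "p")]
UNCOMMON = [("b", "i", "i"), ("d", "j", "j"), ("f", "k", "k"), ("h", "l", "l")]


def assign_pseudo_states(common, uncommon, rng):
    """Give each pseudo State one randomly chosen chain from each stratum."""
    common = list(common)
    uncommon = list(uncommon)
    rng.shuffle(common)
    rng.shuffle(uncommon)
    return {k + 1: [common[k], uncommon[k]] for k in range(len(common))}


assignment = assign_pseudo_states(COMMON, UNCOMMON, random.Random(0))
for pseudo_state, chains in assignment.items():
    print(pseudo_state, chains)
```

Because the 2001 segment travels with its linked 2000 segment, each pseudo State shares exactly one of its two segments between 1999 and 2000 and one between 2000 and 2001, which reproduces the 50 percent overlap.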

This subsampling validation exercise was repeated for all four quarters in a pseudo FI region and for all 12 pseudo FI regions in each of the eight large States. This resulted in 32 (8 large States × 4 subsamples from each large State) pseudo small States from eight large States. These pseudo small States mimicked the design properties of small States with the 50 percent sample segment overlap preserved across adjacent survey years.

E.2.2 Results of Validating the Small Area Estimates of Change Between 1999–2000 and 2000–2001

Tables E.9 to E.12 present the internal benchmark estimate (labeled "design-based") and the corresponding average estimate using the SAE procedures for the four substance use measures for each of the eight large States and the relative absolute bias (RAB) for each of the substance use measures. The estimate in each case is the odds of having used the substance in 2000–2001 divided by the odds of having used the substance in 1999–2000. In general, the average relative biases for the age 12 or older population are fairly small for substance use measures with larger prevalence rates and somewhat larger for the others. The average relative bias is worst for past year use of cocaine (12.7 percent for the population age 12 or older). Note, however, that the relative bias is generally conservative, producing SAE odds ratios that are closer to "no change" relative to the design-based odds ratios. For example, of the 32 pairs of State-by-age group estimates for cocaine, the SAE odds ratios are closer to 1.0 for 29 of the pairs and the design-based odds ratios are closer to 1.0 for only 3 pairs.
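The two comparisons used in this paragraph can be sketched as follows. The RAB formula shown (absolute difference as a percentage of the design-based value) and the use of absolute distance from 1.0 to judge "closer to no change" are assumptions made for this illustration, and the odds ratios are made up.

```python
def relative_absolute_bias(sae, design_based):
    """Absolute difference as a percentage of the design-based benchmark
    (assumed definition for this sketch)."""
    return abs(sae - design_based) / abs(design_based) * 100.0


def closer_to_no_change(sae_odds_ratio, design_odds_ratio):
    """True when the SAE odds ratio is nearer to 1.0 than the benchmark."""
    return abs(sae_odds_ratio - 1.0) < abs(design_odds_ratio - 1.0)


# Made-up (SAE, design-based) odds ratio pairs, for illustration only.
pairs = [(0.95, 0.80), (1.10, 1.30), (1.25, 1.05)]

rabs = [relative_absolute_bias(sae, design) for sae, design in pairs]
conservative = sum(closer_to_no_change(sae, design) for sae, design in pairs)
print(rabs, conservative)
```

A tally like `conservative` is the kind of count behind the statement that the SAE odds ratios were closer to 1.0 for 29 of the 32 cocaine pairs.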

Table E.3 presents the ratio of the widths of the 95 percent prediction intervals from the SAE procedure to the widths of the 95 percent confidence intervals from a direct estimate based on the same sample size. The estimates in the table are based on the recalculated (larger) estimate of the correlation between the two 2-year moving averages. The 95 percent prediction intervals are much narrower on average for each of the four substance measures validated, with width ratios ranging from 0.60 for past month use of marijuana and past year use of cocaine to 0.77 for past month use of cigarettes for persons age 12 or older. This represents an improvement in precision that is equivalent to a sample size almost 3 times larger for marijuana and cocaine and about 2 times larger for cigarettes, relative to the precision obtained from the corresponding direct design-based estimate.
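The link between a width ratio and an equivalent sample size can be sketched under the usual assumption that interval width shrinks in proportion to one over the square root of the sample size. The function name is hypothetical; the two ratios are the ones quoted above.

```python
def equivalent_sample_size_multiplier(width_ratio):
    """Sample-size factor a direct estimate would need to match the
    narrower interval, assuming width is proportional to 1 / sqrt(n)."""
    return 1.0 / width_ratio ** 2


for label, ratio in [("marijuana and cocaine", 0.60), ("cigarettes", 0.77)]:
    print(label, round(equivalent_sample_size_multiplier(ratio), 2))
```

Under that assumption the multipliers come out at roughly 2.8 and 1.7, in line with "almost 3 times" and, more loosely, "about 2 times."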

E.3. Validation of Combined Prevalence-Level Estimates for 1999–2000

The 2-year estimates had been validated in the 2000 State report for four variables: past month use of marijuana, past year use of cocaine, past month binge alcohol use, and past month use of cigarettes. The results of that validation are repeated here in Tables E.13 to E.16. On average, the relative absolute biases (RABs) were quite small. For the 12 or older age group, the RABs were as follows:

past month use of marijuana, 4.07 percent;
past year use of cocaine, 7.88 percent;
past month binge alcohol use, 0.98 percent; and
past month use of cigarettes, 1.22 percent.

Also, compared with the design-based confidence intervals, the 95 percent prediction intervals were much narrower, about 75 percent as wide for marijuana, binge alcohol, and cigarettes and 65 percent as wide for cocaine (Table E.17).

In addition, the 2-year estimates were compared with the corresponding 1-year estimates to ascertain the extent of improvement in estimation for the 42 States and the District of Columbia, given that those sample sizes would now be approximately double their size in 1999. For example, comparing the prediction intervals' widths across the 50 States and the District of Columbia, the SAE average prediction interval width for past month use of marijuana among persons 12 or older was 2.40 percent in 1999, but only 1.98 percent for 1999 and 2000 combined (see Section B.4.2 from Wright, 2002b). Just as importantly, because the States (and the District of Columbia) had smaller single-year sample sizes, the national model had a greater relative influence in the SAE estimates for 1999 than for 1999 and 2000 combined. Therefore, the 1999–2000 pooled State estimates would not be shrunk as much toward the national model-based estimate as would similar estimates based on a single year of data. One result is that the 2-year small area estimates would tend to be closer to their corresponding design-based estimates than small area estimates based on a single year of data. The other implication is that States with design-based estimates that were relatively lower or higher than other States would retain that distinction, and the overall range and spread of the State estimates would tend to be larger, for example, than it was in 1999. This should make it easier to identify States that have notably lower or higher substance use prevalence rates than other States.

E.4. Caveats

Some of the caveats regarding SAE are addressed in Chapter 7 in Volume I of this report. Tables E.18 to E.20 show the screening, interview, and overall response rates for the 50 States and the District of Columbia for 1999, 2000, and 2001, respectively. The response rates are somewhat higher in both 2000 and 2001 than in 1999.

In 2001, an incentive experiment was embedded in the regular data collection during quarters 1 and 2. For that experiment, small random samples were selected in each State proportionate to their population size, and sampled persons were assigned to receive $0, $20, or $40 for completing the questionnaire. Analysis of that data revealed that the response rates were significantly higher among those receiving an incentive than among those who did not receive an incentive and that the overall cost of the survey was less due to the much smaller number of callbacks that were necessary (Eyerman & Bowman, 2002). Initial analysis of that data did not indicate any significant differences in estimated prevalence levels between the incentive and nonincentive cases; however, subsequent analysis has revealed higher prevalence rates for the incentive cases for some of the substance measures. Because the incentive sample size is relatively small compared to the total State sample size, the decision was made to combine both incentive and nonincentive samples in 2001 to produce the national estimates and to produce the State estimates for 2000 and 2001 combined. For example, the incentive sample size for Alabama totaled 98 cases that received either the $20 or $40 incentive (Table E.21), but the total sample size for 2000–2001 for Alabama was 1,821 (Table E.22). The largest allocation of incentive sample cases was in Illinois. There, 442 cases received either the $20 or $40 incentive out of a total combined sample size of 7,218, about 6 percent. Table E.22 also presents the State sample sizes for 1999 through 2001. Table E.21 presents the State sample allocations for just the incentive experiment.

One other possible contributor to bias in the State estimates, and the estimates in general, is the effect of editing and imputation of the summary data. In developing the editing and imputation process for 1999 and subsequent years, the desire was to minimize the amount of editing because of its somewhat subjective nature, and instead let the random imputation process supply any partially missing information. Overall, the percentage of imputed information is quite small for any given substance.

The imputation method is based on a multivariate imputation in which some demographic and other substance use information from the respondent is used to determine a donor who is similar in those characteristics but has supplied data for the drug in question (Grau et al., 2001, 2002, 2003). Often, information also is available from the partial respondent on the recency of drug use. For example, respondents may have indicated that they used the drug in their lifetime or in the past year, but left blank the question about use in the past month. For many of the records, this type of auxiliary information was available. In a small proportion of cases, no auxiliary information was available, in which case a random donor with similar drug use patterns and demographic characteristics was used. For the different substances, the largest differences between the edited and the imputed estimates typically occurred when there was a lot of auxiliary information. For past month use of marijuana, based on the 1999 data, the State with the largest percentage change from edited to imputed data was Alabama, whose edited rate of use of marijuana was 2.1 percent and whose imputed rate of use was 3.1 percent, a relative increase of almost 50 percent.

E.5. SAE Methodology

E.5.1 Background

In response to the need for State-level information on substance abuse problems, SAMHSA began developing and testing SAE methods for the NHSDA in 1994 under a contract with RTI of Research Triangle Park, North Carolina. That developmental work used logistic regression models with data from the combined 1991 to 1993 NHSDAs and local area indicators, such as drug-related arrests, alcohol-related death rates, and block group/tract-level characteristics from the 1990 Census that were found to be associated with substance abuse. In 1996, the results were published for 25 States for which there were sufficient sample data (OAS, 1996). A subsequent report described the methodology in detail and noted areas in which improvements were needed (Folsom & Judkins, 1997).

The increasing need for State-level estimates of substance use led to the decision to expand the NHSDA to provide estimates for all 50 States and the District of Columbia on an annual basis beginning in 1999. It was determined that, with the use of modeling similar to that used with the 1991 to 1993 NHSDA data in conjunction with a sample designed for State-level estimation, a sample of about 67,500 persons would be sufficient to make reasonably precise estimates.

The State-based NHSDA sample design implemented in 1999 through 2001 had the following characteristics:

States were stratified into field interviewer (FI) regions that covered the geography of each State. The FI regions are composed of contiguous Census tracts and counties and were designed to yield about 75 interviews per region. In the 42 smaller States (by population) and the District of Columbia, there are 12 FI regions; in the eight large States, there are 48 FI regions.
Within each region, eight segments were randomly selected for 1999 and two were allocated to each calendar quarter of data collection. For 2000, one segment from each 1999 region by quarter pair was retained, and its partner was replaced by a new random selection. For 2001, all new segments selected in 2000 were retained and the others were replaced by randomly selected new segments.
Within each segment, households were screened, and a sample of one to two persons per household was selected. An average of nine responding persons per segment was sought. For the 1999 segments that were retained in 2000, a new sample of households and persons was drawn for the 2000 survey, as was the case in 2001.
The annual samples were selected so that approximately 900 responding persons, 300 in each age group (12 to 17, 18 to 25, and 26 or older), were drawn in each of the 42 States and the District of Columbia. In the eight large States, the person samples were allocated equally to the three age groups with overall respondent sample sizes ranging from 2,669 to 4,681 in the 1999 NHSDA, 3,478 to 5,022 in the 2000 NHSDA, and 3,502 to 4,023 in the 2001 NHSDA.

In preparation for the modeling of the 1999 data, RTI used the data from the combined 1994–1996 NHSDAs to develop an improved methodology that utilized more local area data and produced better estimates of the accuracy of the State estimates (Folsom, Shah, & Vaish, 1999). That effort involved the development of procedures that would validate the results for geographic areas with large samples. This work was reviewed by a panel with SAE expertise.¹ They approved of the methodology, but suggested further improvements for the modeling to be used to produce the 1999 State estimates. Those improvements were incorporated into the methodology finally used for the 1999 State estimates. Similar methodology (as described earlier) was used for the 2000 State report and this 2001 State report. The SWHB methodology is described below.

E.5.2 Goals of Modeling

There were several goals underlying the estimation process. The first was to model drug use at the lowest possible level and aggregate over the levels to form the State estimates. The chosen level of aggregation was the 32 age group (12 to 17, 18 to 25, 26 to 34, 35+) by race/ethnicity (white, non-Hispanic; black, non-Hispanic; Hispanic; Other non-Hispanic) by gender cells at the block group level. Estimated population counts were obtained from a private vendor for each block group for each of the 32 cells. This level of aggregation was desired because the NHSDA first stage of sample selection was at the block group level, so that there would be data at this level to fit a model. In addition, there was a great deal of information from the Census at the block group level that could be used as predictors in the models. If prevalence rates could be estimated for each of the 32 cells at the block group level, it would only be necessary to multiply the rates by the estimated population counts and aggregate to the State level.

Another goal of the estimation process was to include the sampling weight in the model in such a way that the small area estimates would converge to the design-based (sample-weighted) estimates when they were aggregated to a sufficient sample size. There was a desire for the estimates to have this characteristic so that there would be consistency with the survey-weighted national estimates based on the entire sample.

A third goal was to include as much local source data as possible, especially data related to each substance use measure. This would help provide a better fit beyond the strictly sociodemographic information. The desire was to use national sources of these data so that there would be consistency of collection and estimation methodology across States.

Recognizing that estimates based solely on these "fixed" effects would not reflect differences across States due to differences in laws, enforcement activities, advertising campaigns, outreach activities, and other such unique State contributions, a fourth goal was to include "random" effects to compensate for these differences. The types of random effects that could be supported by the NHSDA data were a function of the size of sample and the model fit to the sample data. Random effects were included at the State level and for substate regions comprising three FI regions. Although this grouping of the three FI regions was principally motivated by the need to accumulate enough of a sample to support good model fitting for the low-prevalence NHSDA outcomes, it also was reasoned that it would be possible to produce substate hierarchical Bayes (HB) estimates for areas comprised of these FI region groups, once 2 or 3 years of NHSDA data were available, because that would yield substate region samples of at least 400 respondents. For substate areas that do not conform to the substate region boundaries (e.g., counties and large municipalities), HB estimates could be derived from their elemental block group-level contributions, but the design-based data employed in the estimation of the associated substate region effects would not be restricted to the county or city of interest. This mismatch of FI region and county/large municipality boundaries weakens the theoretical appeal of the associated HB estimate. For this reason, substate HB estimates probably should be restricted to areas that can be matched reasonably well to FI region groups.
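The counts quoted for the random effects can be cross-checked against the sample design figures given in this appendix (12 or 48 FI regions per State, about 75 interviews per FI region per year, three FI regions per substate region). The sketch below is simple arithmetic; the constant names are chosen for this illustration.

```python
SMALL_AREAS = 42 + 1            # 42 smaller States plus the District of Columbia
LARGE_STATES = 8
FI_REGIONS_SMALL = 12
FI_REGIONS_LARGE = 48
INTERVIEWS_PER_FI_REGION = 75   # approximate design target per year
FI_REGIONS_PER_SUBSTATE_REGION = 3

total_fi_regions = (SMALL_AREAS * FI_REGIONS_SMALL
                    + LARGE_STATES * FI_REGIONS_LARGE)
substate_regions = total_fi_regions // FI_REGIONS_PER_SUBSTATE_REGION

# Expected respondents per substate region after pooling 2 survey years.
two_year_substate_sample = (FI_REGIONS_PER_SUBSTATE_REGION
                            * INTERVIEWS_PER_FI_REGION * 2)

print(total_fi_regions, substate_regions, two_year_substate_sample)
```

The result is 900 FI regions and 300 substate regions, matching the 300 substate region-level random effects used in each age group model, and about 450 respondents per substate region over 2 years, consistent with the "at least 400 respondents" figure.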

One of the difficulties of typical SAE has been obtaining good estimates of the accuracy of the SAEs with prediction intervals that give a good representation of the true probability of coverage of the intervals. Therefore, the final major goal was to provide accurate prediction intervals—ones that would approach the usual sample-based intervals as the sample size increases.

E.5.3 Variables Modeled

In the 2001 NHSDA, a set of 19 measures covering a variety of aspects of substance use and abuse was designated for estimation. For the first 12, three sets of estimates have been produced: one based on pooled 1999 and 2000 NHSDA data, another based on pooled 2000 and 2001 NHSDA data, and a third measuring the change between the first two. Estimates of change between two consecutive single years had not been precise enough for annual changes of the size observed to be declared statistically significant. For the next six variables, only estimates based on the pooled 2000 and 2001 data were possible because the definitions of those variables had changed between 1999 and 2000. The final variable, serious mental illness (SMI), was added in 2001. The 19 outcome variables are as follows:

past month use of any illicit drug,
past month use of marijuana,
perceptions of great risk of smoking marijuana once a month,
average annual rates of first use of marijuana,
past month use of any illicit drug other than marijuana,
past year use of cocaine,
past month use of alcohol,
past month binge alcohol use,
perceptions of great risk of having five or more drinks of an alcoholic beverage once or twice a week,
past month use of any tobacco product,
past month use of cigarettes,
perceptions of great risk of smoking one or more packs of cigarettes per day,
past year alcohol dependence or abuse,
past year alcohol dependence,
past year any illicit drug dependence or abuse,
past year any illicit drug dependence,
past year dependence or abuse for any illicit drug or alcohol,
past year treatment gap, and
past year serious mental illness.

E.5.4 Predictors Used in Logistic Regression Models

Local area data used as potential predictor variables in the logistic regression models were obtained from several sources, including Claritas, the Census Bureau, the FBI (Uniform Crime Reports), Health Resources and Services Administration (Area Resource File), SAMHSA (Uniform Facility Data Set), and the National Center for Health Statistics (mortality data). The major sources and the potential data items used in the modeling are listed below.

Claritas. The demographic data package called Building Block Basic, Age by Race for 1999 with projections to 2004 was used.
Census Bureau. Both 1990 Census (demographic and socioeconomic variables) and 1998 Food Stamp participation rates were used.
Federal Bureau of Investigation. Uniform Crime Report (UCR) arrest totals were used from http://fisher.lib.Virginia.EDU/crime/; the most current data are for 1998 for most counties, and previous years' data were used in a few cases.
Health Resources and Services Administration. Some variables were used relating to income and employment from the Area Resource File (ARF) February 2001 release from the Bureau of Health Professions, Office of Research and Planning.
National Center for Health Statistics. Mortality data coded using the International Classification of Diseases, 9th Revision (ICD-9), for 1993 to 1998 were used. The ICD-9 death rate data are from the National Center for Health Statistics at the Centers for Disease Control and Prevention.
SAMHSA, Office of Applied Studies. Uniform Facility Data Set (UFDS), 2000 data on drug and alcohol treatment rates were used from Synectics for Management Decisions, Inc.

The following lists provide the specific independent variables that were potential predictors in the models.

Claritas Data
Description	Level
% Population aged 0–18 in block group	Block group
% Population aged 19–24 in block group	Block group
% Population aged 25–34 in block group	Block group
% Population aged 35–44 in block group	Block group
% Population aged 45–54 in block group	Block group
% Population aged 55–64 in block group	Block group
% Population aged 65+ in block group	Block group
% Blacks in block group	Block group
% Hispanics in block group	Block group
% Other race in block group	Block group
% Whites in block group	Block group
% Males in block group	Block group
% Females in block group	Block group
% American Indian, Eskimo, Aleut in tract	Tract
% Asian, Pacific Islander in tract	Tract
% Population aged 0–18 in tract	Tract
% Population aged 19–24 in tract	Tract
% Population aged 25–34 in tract	Tract
% Population aged 35–44 in tract	Tract
% Population aged 45–54 in tract	Tract
% Population aged 55–64 in tract	Tract
% Population aged 65+ in tract	Tract
% Blacks in tract	Tract
% Hispanics in tract	Tract
% Other race in tract	Tract
% Whites in tract	Tract
% Males in tract	Tract
% Females in tract	Tract
% Population aged 0–18 in county	County
% Population aged 19–24 in county	County
% Population aged 25–34 in county	County
% Population aged 35–44 in county	County
% Population aged 45–54 in county	County
% Population aged 55–64 in county	County
% Population aged 65+ in county	County
% Blacks in county	County
% Hispanics in county	County
% Other race in county	County
% Whites in county	County
% Males in county	County
% Females in county	County

1990 Census Data
Description	Level
% Population who dropped out of high school	Tract
% Housing units built in 1940–1949	Tract
% Persons 16–64 with a work disability	Tract
% Hispanics who are Cuban	Tract
% Females 16 years or older in labor force	Tract
% Females never married	Tract
% Females separated/divorced/widowed/other	Tract
% One-person households	Tract
% Female head of household, no spouse, child <18	Tract
% Males 16 years or older in labor force	Tract
% Males never married	Tract
% Males separated/divorced/widowed/other	Tract
% Housing units built in 1939 or earlier	Tract
Average persons per room	Tract
% Families below poverty level	Tract
% Households with public assistance income	Tract
% Housing units rented	Tract
% Population 9–12 years of school, no high school diploma	Tract
% Population 0–8 years of school	Tract
% Population with associate's degree	Tract
% Population some college and no degree	Tract
% Population with bachelor's, graduate, professional degree	Tract
Median rents for rental units	Tract
Median value of owner-occupied housing units	Tract
Median household income	Tract

Uniform Crime Report Data
Description	Level
Drug possession arrest rate	County
Drug sale/manufacture arrest rate	County
Drug violations' arrest rate	County
Marijuana possession arrest rate	County
Marijuana sale/manufacture arrest rate	County
Opium cocaine possession arrest rate	County
Opium cocaine sale/manufacture arrest rate	County
Other drug possession arrest rate	County
Other dangerous non-narcotics arrest rate	County
Serious crime arrest rate	County
Violent crime arrest rate	County
Driving under influence arrest rate¹	County

Other Categorical Data
Description	Source	Level
=1 if Hispanic, =0 otherwise	Sample	Person
=1 if non-Hispanic Black, =0 otherwise	Sample	Person
=1 if non-Hispanic Other, =0 otherwise	Sample	Person
=1 if male, =0 if female	Sample	Person
=1 if Northeast region, =0 otherwise	1990 Census	State
=1 if Midwest region, =0 otherwise	1990 Census	State
=1 if South region, =0 otherwise	1990 Census	State
=1 if MSA with 1 million +, =0 otherwise	1990 Census	County
=1 if MSA with <1 million, =0 otherwise	1990 Census	County
=1 if non-MSA urban, =0 otherwise	1990 Census	Tract
=1 if underclass tract	Urban Institute	Tract
=1 if no Cubans in tract, =0 otherwise	1990 Census	Tract
=1 if urban area, =0 if rural area	1990 Census	Tract
=1 if no arrests for dangerous non-narcotics, =0 otherwise	UCR	County

Miscellaneous Data
Variable Description	Source	Level
Alcohol death rate, direct cause	ICD-9	County
Alcohol death rate, indirect cause	ICD-9	County
Cigarettes death rate, direct cause	ICD-9	County
Cigarettes death rate, indirect cause	ICD-9	County
Drug death rate, direct cause	ICD-9	County
Drug death rate, indirect cause	ICD-9	County
Alcohol treatment rate	UFDS	County
Alcohol and drug treatment rate	UFDS	County
Drug treatment rate	UFDS	County
% Families below poverty level	ARF	County
Unemployment rate	ARF	County
Per capita income (in thousands)	ARF	County
Food stamp participation rate	Census Bureau	County
Single state agency maintenance of effort²	National Association of State Alcohol and Drug Abuse Directors (NASADAD)	State
Block grant awards²	SAMHSA	State
Cost of Services Factor Index (2001–2003)²	SAMHSA	State
Total Taxable Resources Per Capita Index (1998)²	U.S. Department of Treasury	State
Average suicide rate (1996–1998, per 10,000)¹	ARF	County

¹ Indicates additional predictors used to model serious mental illness for 2001.
² Indicates additional predictors used to model treatment gap for 2000–2001.

E.5.5 Selection of Independent Variables for the Models

For serious mental illness (SMI) modeled using 2001 data alone, independent variables for each age group were identified by a Chi-squared Automatic Interaction Detector (CHAID) algorithm, which does not use sample weights. Prior to this process, all the continuous variables were categorized using deciles and were treated as ordinal in CHAID. Region was treated as a nominal categorical variable in CHAID. Significant (at 3 percent level) independent variables from each age group model and final nodes in the tree-growing process were identified as predictor variables destined for inclusion at a later step.

Independently, a SAS stepwise logistic regression model was fit for each age group. The SAS stepwise procedure was used because it could quickly run all of the variables for all of the models, although it was recognized that the software would not take into account the complex sample design. The independent variables included all the first-order (linear) polynomial trend contrasts across the 10 levels of the categorized variables plus the gender, region, and race variables. Significant variables (at the 3 percent level) were identified from this process. Based on the combined list from CHAID and SAS, a list of variables was created that included the corresponding second- and third-order polynomials and the interaction of the first-order polynomials with the gender, race, and region variables.
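The decile categorization and polynomial trend contrasts mentioned above can be illustrated with a minimal sketch. It is not the SAS or CHAID processing used for the report; the function names `decile_codes` and `trend_contrasts` are hypothetical, and the cut-point rule (empirical percentiles of the input) is an assumption made only for illustration.

```python
from bisect import bisect_right

def decile_codes(values):
    """Assign each value an ordinal decile category 0..9 from empirical cut points."""
    ordered = sorted(values)
    n = len(ordered)
    # Nine interior cut points near the 10th, 20th, ..., 90th percentiles.
    cuts = [ordered[(n * k) // 10] for k in range(1, 10)]
    return [bisect_right(cuts, v) for v in values]

def trend_contrasts(levels=10, max_order=3):
    """Orthogonal polynomial contrasts (linear, quadratic, cubic) over ordered levels."""
    centered = [k - (levels - 1) / 2 for k in range(levels)]
    basis = [[1.0] * levels]          # start with the constant term
    contrasts = []
    for order in range(1, max_order + 1):
        vec = [x ** order for x in centered]
        # Gram-Schmidt: remove the projection onto every lower-order term.
        for b in basis:
            coef = sum(v * u for v, u in zip(vec, b)) / sum(u * u for u in b)
            vec = [v - coef * u for v, u in zip(vec, b)]
        basis.append(vec)
        contrasts.append(vec)
    return contrasts

codes = decile_codes(list(range(100)))
linear, quadratic, cubic = trend_contrasts()
```

Each contrast sums to zero and is orthogonal to the others, so a first-order term can be screened first and the second- and third-order terms added afterward without changing the meaning of the linear term.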

Next, the variables were entered into a SAS stepwise logistic model at the 1 percent significance level. Because of past concerns about overfitting of the data in earlier estimation using the 1991 to 1993 NHSDA data, the significance levels were made quite stringent. These variables were then entered into a SUrvey DAta ANalysis (SUDAAN) logistic regression model because the SUDAAN software would adjust for the effects of the weights and other aspects of the complex sample design (RTI, 2001). All variables that were still significant at the 1 percent significance level were entered into the survey-weighted hierarchical Bayes (SWHB) process.

For outcome variables modeled using pooled 2000 and 2001 data, the predictor set was the same one used in the 1999–2000 analyses, which was obtained using the same variable selection method described above for SMI.

E.5.6 General Model Description

The model can be characterized as a complex mixed model (including both fixed and random effects) of the form:

$\lambda = X\beta + ZU$

Each of the symbols represents a matrix or vector. The leading term $X\beta$ is the usual (fixed) regression contribution, and $ZU$ represents random effects for the States and field interviewer (FI) region groups that the data will support and for which estimates are desired. Not obvious from the notation is that the form of the model is a logistic model used to estimate dichotomous data. The vector $\lambda$ has elements $\ln[\pi_{ijk}/(1-\pi_{ijk})]$, where $\pi_{ijk}$ is the propensity for the $k$th person in the $j$th FI composite region in the $i$th State to engage in the behavior of interest (e.g., to use marijuana in the past month). Also not obvious from the notation is that the model fitting utilizes the final "sample" weights as discussed above. The "sample" weights have been adjusted for nonresponse and poststratified to known Census counts.

The estimate for each State behaves like a "weighted" average of the design-based estimate in that State and the predicted value based on the national regression model. The "weights" in this case are functions of the relative precision of the sample-based estimate for the State and the predicted estimate based on the national model. The eight large States have large samples, and thus more "weight" is given to the sample estimate relative to the model-based regression estimate. The 42 small States and the District of Columbia put relatively more "weight" on the regression estimate because of their smaller samples. The national regression estimate actually uses national parameters that are based on the pooled 2000 and 2001 sample; however, the regression estimate for a specific State is based on applying the national regression parameters to that State's "local" county, block group, and tract-level predictor variables and summing to the State level. Therefore, even the national regression component of the estimate for a State includes "local" State data.
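The "weighted average" behavior can be illustrated with a simple precision-weighted composite. This is only a sketch of the general idea, not the SWHB computation: the function name `composite_estimate`, the variance inputs, and the example numbers are all hypothetical.

```python
def composite_estimate(direct, var_direct, model, var_model):
    """Precision-weighted average of a design-based and a model-based estimate.

    The weight on the design-based (direct) estimate grows as its sampling
    variance shrinks relative to the variance of the model prediction.
    """
    weight_direct = var_model / (var_direct + var_model)
    estimate = weight_direct * direct + (1.0 - weight_direct) * model
    return estimate, weight_direct

# Hypothetical large-sample State: small sampling variance, so the direct estimate dominates.
large_state = composite_estimate(0.060, 0.00001, 0.050, 0.00004)

# Hypothetical small-sample State: larger sampling variance, so the regression estimate dominates.
small_state = composite_estimate(0.060, 0.00016, 0.050, 0.00004)
```

With these made-up inputs, the weight on the direct estimate is 0.8 for the large-sample case and 0.2 for the small-sample case.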

The goal then was to come up with the best estimates of the fixed effects $\beta$ and the random effects $U$. This would lead to the best estimates of the log-odds vector $\lambda$, which would in turn lead to the best estimate of the propensities $\pi$. Once the best estimate of $\pi$ for each block group and each age/race/gender cell within a block group has been obtained, the results could be weighted by the projected Census population counts at that level to make estimates for any geographic area larger than a block group.
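The aggregation step can be sketched as follows: convert each cell's predicted log-odds to a prevalence and average the cells using population counts as weights. The names `inverse_logit` and `area_prevalence` and the input layout (a list of log-odds and population pairs) are assumptions made for illustration.

```python
import math

def inverse_logit(log_odds):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

def area_prevalence(cells):
    """Population-weighted prevalence for an area.

    `cells` is a list of (predicted_log_odds, projected_population) pairs,
    one per block group by age/race/gender cell inside the area.
    """
    total_population = sum(pop for _, pop in cells)
    expected_cases = sum(inverse_logit(lo) * pop for lo, pop in cells)
    return expected_cases / total_population

# Two hypothetical cells with prevalences 0.50 and 0.25 and equal populations.
example = area_prevalence([(0.0, 100), (math.log(1.0 / 3.0), 100)])
```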

In the model fitting for the pooled 2000 and 2001 data, the small number of predictor variables updated in 2001 were used in both their 2000 and 2001 versions when they appeared in a model. To produce the 2000–2001 pooled small area estimates, the common fixed and random effects were first employed to form State estimates $\hat{\pi}_{i,2000}$ and $\hat{\pi}_{i,2001}$ for 2000 and 2001, respectively. These annualized State estimates then were combined as population-weighted averages of the form

$\hat{\pi}_{i} = \dfrac{N_{i,2000}\,\hat{\pi}_{i,2000} + N_{i,2001}\,\hat{\pi}_{i,2001}}{N_{i,2000} + N_{i,2001}},$

where $N_{i,2000}$ and $N_{i,2001}$ are the population counts obtained from Claritas Inc.
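As a small worked illustration of combining two annual State estimates into a population-weighted average, the sketch below uses a hypothetical function name and made-up numbers; it does not reproduce any estimate in this report.

```python
def pooled_estimate(p_first, n_first, p_second, n_second):
    """Population-weighted average of two annual prevalence estimates."""
    return (n_first * p_first + n_second * p_second) / (n_first + n_second)

# Hypothetical State: prevalence 0.10 with population 1,000 in the first year
# and prevalence 0.20 with population 3,000 in the second year.
pooled = pooled_estimate(0.10, 1000, 0.20, 3000)
```

Because State population counts change little from one year to the next, the pooled value in practice stays close to the simple mean of the two annual estimates.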

E.5.7 Implementation of Modeling

The solution to the model equation in Section E.5.6 is not straightforward; it involves a series of iterative steps to generate values of the desired fixed and random effects from the underlying joint distribution. The basic process can be described as follows.

Let $\beta$ denote the matrix of fixed effects, $\eta_i$ be the matrix of State random effects ($i = 1, \ldots, 51$), and $\nu_{ij}$ denote the matrix of FI composite region effects $j$ within State $i$. Because the goal is to estimate separate models for four age groups, it is assumed that the random effect vectors are four-variate Normal with null mean vectors and 4×4 covariance matrices $D_\eta$ and $D_\nu$, respectively. To estimate the individual effects, a Bayesian approach is used to represent the joint density function given the data by $f(\beta, \eta, \nu, D_\eta, D_\nu \mid y)$. According to the Bayes process, this can be estimated once the conditional distributions are known:

$[\beta \mid \eta, \nu, D_\eta, D_\nu, y]$, $[\eta \mid \beta, \nu, D_\eta, D_\nu, y]$, $[\nu \mid \beta, \eta, D_\eta, D_\nu, y]$, and $[D_\eta, D_\nu \mid \beta, \eta, \nu, y]$.

To generate random draws from these distributions, Markov chain Monte Carlo (MCMC) processes need to be used. MCMC is a body of methods for generating pseudo-random draws from probability distributions via Markov chains. A Markov chain is fully specified by its starting distribution $P(X_0)$ and its transition kernel $P(X_t \mid X_{t-1})$.

Each MCMC step that involves the vector of binary outcome variables y in the conditioning set needs first to be modified by defining a pseudolikelihood using survey weights. In defining pseudolikelihood, weights are introduced after scaling them to the effective sample size based on a suitable design effect. Note that with the pseudolikelihood, the covariance matrix of the pseudoscore functions is no longer equal to the pseudoinformation matrix; therefore, a sandwich type of covariance matrix was used to compute the design effect. In this process, weights are largely assumed to be noninformative (i.e., unrelated to the outcome variable y). The assumption of noninformative weights is useful in finding tractable expressions for the appropriate information matrix of the pseudoscore functions. The pseudo log-likelihood remains an unbiased estimate of the finite-population log-likelihood regardless of this assumption.
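A weighted pseudo log-likelihood with rescaled weights can be sketched as below. The text only says that weights are scaled to the effective sample size "based on a suitable design effect"; the sketch assumes Kish's unequal-weighting effective sample size, which is an illustrative choice and not necessarily the one used in the SWHB software. The function names are hypothetical.

```python
import math

def scale_weights(weights):
    """Rescale survey weights so that they sum to an effective sample size.

    Assumption: the effective sample size is Kish's (sum w)^2 / sum(w^2).
    """
    total = sum(weights)
    n_effective = total ** 2 / sum(w * w for w in weights)
    return [w * n_effective / total for w in weights]

def pseudo_log_likelihood(beta, rows, outcomes, weights):
    """Survey-weighted Bernoulli log-likelihood for a logistic model."""
    scaled = scale_weights(weights)
    total = 0.0
    for x, y, w in zip(rows, outcomes, scaled):
        eta = sum(b * xi for b, xi in zip(beta, x))   # linear predictor
        # y*eta - log(1 + exp(eta)) is the logistic log-likelihood of one response.
        total += w * (y * eta - math.log1p(math.exp(eta)))
    return total

# Four hypothetical respondents, intercept-only model.
rows = [[1.0], [1.0], [1.0], [1.0]]
value = pseudo_log_likelihood([0.0], rows, [1, 0, 1, 0], [2.0, 2.0, 2.0, 2.0])
```

With equal weights the scaled weights are all 1, so the pseudo log-likelihood reduces to the ordinary log-likelihood; unequal weights shrink the total "information" below the nominal sample size.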

Step I $[\beta \mid \eta, \nu, y]$ (this does not depend on $D_\eta$, $D_\nu$)

With a flat prior for the fixed effects $\beta$, the conditional posterior is proportional to the pseudolikelihood function. For large samples, this posterior can be approximated by the multivariate normal distribution with mean vector equal to the pseudomaximum likelihood estimate and with asymptotic covariance matrix having the associated sandwich form. Assuming that the survey weights are noninformative makes the age group-specific $\beta$ vectors conditionally independent of each other. Therefore, the $\beta$ vectors can be updated separately at each MCMC cycle.

Step II $[\eta_i \mid \beta, \nu, D_\eta, y]$ (this does not depend on $D_\nu$)

Here, the conditional posterior is proportional to the product of the prior $f(\eta_i \mid D_\eta)$, the pseudolikelihood function, and the prior $f(\nu \mid D_\nu)$; this last prior can be omitted as it does not involve $\eta_i$. Calculating the denominator (or the normalization constant) of the posterior distribution for the State random effects $\eta_i$ requires multidimensional integration and is numerically intractable. To get around this problem, the Metropolis-Hastings (M-H) algorithm is used, which requires a dominating density convenient for Monte Carlo sampling. For this purpose, the mode and curvature of the conditional posterior distribution are used; these can be obtained simply from its numerator. Then a Gaussian distribution with matching mode and curvature is used to define the dominating density for M-H. As with the age group-specific parameters, the State-specific random effect vectors are conditionally independent of each other and can be updated separately at each MCMC cycle.
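The M-H step with a Gaussian matched to the mode and curvature of the conditional posterior can be sketched for a single scalar random effect. This is a deliberately simplified illustration, not the SWHB implementation: it assumes one age group, unweighted Bernoulli data, a known prior variance, and an independence-type M-H sampler; all function names are hypothetical.

```python
import math
import random

def log_posterior(u, offsets, outcomes, prior_var):
    """Unnormalized log posterior (the 'numerator') of a scalar random effect u."""
    value = -u * u / (2.0 * prior_var)               # Normal(0, prior_var) prior
    for offset, y in zip(offsets, outcomes):
        eta = offset + u                             # fixed part plus random effect
        value += y * eta - math.log1p(math.exp(eta))
    return value

def gradient_and_curvature(u, offsets, outcomes, prior_var):
    """First derivative and negative second derivative of the log posterior."""
    grad = -u / prior_var
    curvature = 1.0 / prior_var
    for offset, y in zip(offsets, outcomes):
        p = 1.0 / (1.0 + math.exp(-(offset + u)))
        grad += y - p
        curvature += p * (1.0 - p)
    return grad, curvature

def find_mode(offsets, outcomes, prior_var, iterations=25):
    """Newton-Raphson search for the posterior mode; returns (mode, curvature)."""
    u = 0.0
    for _ in range(iterations):
        grad, curvature = gradient_and_curvature(u, offsets, outcomes, prior_var)
        u += grad / curvature
    return u, gradient_and_curvature(u, offsets, outcomes, prior_var)[1]

def metropolis_hastings(offsets, outcomes, prior_var, n_draws, seed=0):
    """Independence M-H using a Gaussian with matching mode and curvature."""
    rng = random.Random(seed)
    mode, curvature = find_mode(offsets, outcomes, prior_var)
    sd = 1.0 / math.sqrt(curvature)

    def log_target_minus_proposal(u):
        log_q = -0.5 * ((u - mode) / sd) ** 2        # Gaussian proposal, up to a constant
        return log_posterior(u, offsets, outcomes, prior_var) - log_q

    current, accepted, draws = mode, 0, []
    for _ in range(n_draws):
        proposal = rng.gauss(mode, sd)
        log_ratio = log_target_minus_proposal(proposal) - log_target_minus_proposal(current)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            current = proposal
            accepted += 1
        draws.append(current)
    return draws, accepted / n_draws

# Hypothetical data: 20 respondents with a common fixed-effect offset of -1.
draws, acceptance_rate = metropolis_hastings([-1.0] * 20, [1] * 5 + [0] * 15, 1.0, 2000)
```

Only the unnormalized posterior enters the acceptance ratio, which is why the intractable normalization constant never has to be computed.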

Step III $[\nu_{ij} \mid \beta, \eta, D_\nu, y]$ (this does not depend on $D_\eta$)

Similar to Step II.

Step IV $[D_\eta \mid \eta]$, $[D_\nu \mid \nu]$ (here, $\eta$ and $\nu$ include all the information from $y$)

Here, the pseudolikelihood involving design weights comes in implicitly through the conditioning parameters $\eta$ and $\nu$ evaluated at the current cycle. An exact conditional posterior distribution is obtained because the inverse Wishart priors for $D_\eta$ and $D_\nu$ are conjugate.
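The conjugate update for a covariance matrix can be sketched with the standard result for zero-mean Normal effects: an inverse Wishart prior with degrees of freedom $\nu_0$ and scale $S_0$ becomes an inverse Wishart posterior with degrees of freedom $\nu_0 + n$ and scale $S_0$ plus the sum of outer products of the $n$ effect vectors. The prior settings and function names below are hypothetical; the report does not state the hyperparameters used.

```python
def inverse_wishart_update(prior_df, prior_scale, effects):
    """Posterior (df, scale) of an inverse Wishart given zero-mean effect vectors."""
    dim = len(prior_scale)
    post_scale = [row[:] for row in prior_scale]
    for vec in effects:
        # Add the outer product vec * vec' to the scale matrix.
        for a in range(dim):
            for b in range(dim):
                post_scale[a][b] += vec[a] * vec[b]
    return prior_df + len(effects), post_scale

def inverse_wishart_mean(df, scale):
    """Mean of an inverse Wishart distribution: scale / (df - dim - 1)."""
    dim = len(scale)
    if df <= dim + 1:
        raise ValueError("mean is defined only when df > dim + 1")
    return [[s / (df - dim - 1) for s in row] for row in scale]

# Two-dimensional toy example with two current-cycle effect vectors.
df, scale = inverse_wishart_update(4, [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 2.0]])
posterior_mean = inverse_wishart_mean(df, scale)
```

In a full sampler, a random draw from this posterior (for example with `scipy.stats.invwishart`) would replace the current covariance matrix at each cycle; the sketch stops at the posterior parameters to stay dependency-free.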

E.5.8 Remarks

In the NHSDA application, three FI regions were combined to form a minimum of four substate regions with corresponding random effects. This was done to ensure adequate sample sizes for estimation purposes.
There is self-calibration built in to the modeling. This is achieved via design effect-scaling of survey weights incorporated in the conditional posterior density so that small area estimates for large States become asymptotically equivalent to the design-based estimates. Similarly, survey-weighted estimates of the fixed parameters (in particular, the intercept) give calibration of the aggregate of State small area estimates to the national design-based estimate.
For posterior variance estimation purposes, the survey weights were largely assumed to be noninformative. The survey design effects on the posterior variance are therefore restricted to unequal weighting effects. It was assumed that all the design-related clustering effects are represented by between-State and between-substate (within-State) variability of random effects. This does not fully account for variability at lower levels of clustering if the design is nonignorable. However, sample size is not sufficient at lower levels to support stable estimates of random effects for area segments.
If the logistic mixed model fits well, the variance estimates should be reasonable. The self-calibration property provides some protection against model breakdown. Research is currently under way to develop a new MCMC algorithm that fully accounts for survey design effects on the small area estimate posterior prediction intervals.

E.6. References

Eyerman, J., & Bowman, K. (2002, January). 2001 National Household Survey on Drug Abuse: Incentive experiment combined quarter 1 and quarter 2 analysis. Rockville, MD: Substance Abuse and Mental Health Services Administration, Office of Applied Studies. [Available as a PDF at /nhsda/methods/incentive.pdf]

Folsom, R. E., & Judkins, D. R. (1997). Substance abuse in states and metropolitan areas: Model based estimates from the 1991–1993 National Household Surveys on Drug Abuse: Methodology report (DHHS Publication No. SMA 97–3140, Methodology Series M-1). Rockville, MD: Substance Abuse and Mental Health Services Administration, Office of Applied Studies. [Available at /methods.htm#methods]

Folsom, R. E., Shah, B., & Vaish, A. (1999). Substance abuse in states: A methodological report on model based estimates from the 1994–1996 National Household Surveys on Drug Abuse. In Proceedings of the Section on Survey Research Methods of the American Statistical Association (pp. 371–375). Washington, DC: American Statistical Association.

Grau, E. A., Bowman, K. R., Giacoletti, K. E. D., Odom, D. M., & Sathe, N. S. (2001, July). Imputation report. In 1999 National Household Survey on Drug Abuse: Methodological resource book (Vol. 1, Section 4). Rockville, MD: Substance Abuse and Mental Health Services Administration, Office of Applied Studies. [Available as a PDF at /nhsda/methods.cfm]

Grau, E. A., Bowman, K. R., Copello, E., Frechtel, P., Licata, A., & Odom, D. M. (2002, July). Imputation report. In 2000 National Household Survey on Drug Abuse: Methodological resource book (Vol. 1, Section 4). Rockville, MD: Substance Abuse and Mental Health Services Administration, Office of Applied Studies. [Available as a PDF at /nhsda/methods.cfm]

Grau, E. A., Barnett-Walker, K., Copello, E., Frechtel, P., Licata, A., Liu, B., & Odom, D. M. (2003, May). Imputation report. In 2001 National Household Survey on Drug Abuse: Methodological resource book (Vol. 1, Section 4). Rockville, MD: Substance Abuse and Mental Health Services Administration, Office of Applied Studies. [Available as a PDF at /nhsda/methods.cfm]

Office of Applied Studies. (1996). Substance abuse in states and metropolitan areas: Model based estimates from the 1991–1993 National Household Surveys on Drug Abuse—Summary report. Rockville, MD: Substance Abuse and Mental Health Services Administration. [Available as a WordPerfect 6.1 file at /analytic.htm]

RTI. (2001). SUDAAN user's manual: Release 8.0. Research Triangle Park, NC: RTI.

Wright, D. (2002a). State estimates of substance use from the 2000 National Household Survey on Drug Abuse: Volume I. Findings (DHHS Publication No. SMA 02–3731, NHSDA Series H-15). Rockville, MD: Substance Abuse and Mental Health Services Administration, Office of Applied Studies. [Available at /states.htm]

Wright, D. (2002b). State estimates of substance use from the 2000 National Household Survey on Drug Abuse: Volume II. Supplementary technical appendices (DHHS Publication No. SMA 02–3732, NHSDA Series H-16). Rockville, MD: Substance Abuse and Mental Health Services Administration, Office of Applied Studies. [Available at /states.htm]

Table E.1 Ratio of Average Widths of Change Between the 1999–2000 Pooled Data and the 2000–2001 Pooled Data (Based on the Underestimated Model-Based Correlations)

State	Age in Years			Total
	12–17	18–25	26+	
Past Month Use of Marijuana
CA	0.84	0.94	0.77	0.72
FL	0.74	1.02	0.72	0.73
IL	0.87	0.99	0.60	0.71
MI	0.74	1.04	0.88	0.84
NY	0.72	0.94	0.54	0.64
OH	0.75	1.01	0.75	0.80
PA	0.74	1.00	0.86	0.86
TX	0.94	1.01	0.38	0.67
Average	0.79	0.99	0.69	0.75
Past Year Use of Cocaine
CA	0.99	0.83	0.59	0.60
FL	0.64	1.20	0.92	1.05
IL	0.90	0.81	0.32	0.50
MI	0.09	0.96	0.79	0.79
NY	0.48	0.75	0.52	0.61
OH	0.44	1.07	0.69	0.87
PA	0.59	0.77	0.46	0.52
TX	0.86	0.97	0.39	0.67
Average	0.63	0.92	0.59	0.70
Past Month Use of Alcohol
CA	0.98	1.08	1.01	1.00
FL	0.82	0.91	1.03	1.01
IL	0.91	1.00	0.92	0.90
MI	0.96	0.99	1.00	0.95
NY	0.98	0.76	0.96	0.96
OH	0.93	0.87	1.09	1.10
PA	0.96	0.83	0.92	0.90
TX	1.25	1.03	1.10	1.07
Average	0.97	0.93	1.01	0.99
Past Month Use of Cigarettes
CA	1.03	1.14	1.02	0.99
FL	0.97	1.05	1.14	1.13
IL	1.04	1.20	1.10	1.12
MI	0.95	1.05	1.04	1.01
NY	0.81	1.10	1.11	1.08
OH	1.05	1.22	1.02	1.02
PA	1.02	1.05	1.10	1.07
TX	1.11	1.27	1.03	1.02
Average	1.00	1.14	1.07	1.05

Note: Ratio = Average width of model-based PIs of change for substates / Average width of design-based CIs of change for substates
Note: The change measure is defined as the odds ratio {P2/(1-P2)}/{P1/(1-P1)}, where P1 is the pooled 1999–2000 small area estimate and P2 is the pooled 2000–2001 small area estimate.
CI = confidence interval; PI = prediction interval.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.
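The two quantities defined in the notes, the odds-ratio change measure and the ratio of average interval widths, can be sketched as follows. The function names and the example intervals are hypothetical and do not reproduce any value in the table.

```python
def odds_ratio(p1, p2):
    """Change measure {P2/(1-P2)} / {P1/(1-P1)} for two pooled prevalences."""
    return (p2 / (1.0 - p2)) / (p1 / (1.0 - p1))

def width_ratio(model_intervals, design_intervals):
    """Average model-based PI width divided by average design-based CI width.

    Each argument is a list of (lower, upper) interval bounds, one per substate.
    """
    avg_model = sum(hi - lo for lo, hi in model_intervals) / len(model_intervals)
    avg_design = sum(hi - lo for lo, hi in design_intervals) / len(design_intervals)
    return avg_model / avg_design

# Hypothetical prevalences of 20 percent and 25 percent in the two pooled periods.
change = odds_ratio(0.20, 0.25)

# Hypothetical intervals for two substate regions.
ratio = width_ratio([(0.8, 1.2), (0.9, 1.5)], [(0.7, 1.3), (0.8, 1.8)])
```

A ratio below 1 means the model-based prediction intervals are, on average, narrower than the corresponding design-based confidence intervals.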


Table E.2 Average Correlation Between the 1999–2000 and the 2000–2001 Model-Based and Design-Based Estimates (Based on the Underestimated Model-Based Correlations)

State	Age in Years						Total
	12–17		18–25		26+		Total
	DB	MB	DB	MB	DB	MB	DB	MB
Past Month Use of Marijuana
CA	0.3204	0.1217	0.4943	0.1508	0.4107	0.3515	0.3273	0.3701
FL	0.5079	0.1998	0.5020	0.1456	0.3114	0.3024	0.3492	0.3308
IL	0.4133	0.1733	0.4996	0.1649	0.5736	0.3816	0.5988	0.3986
MI	0.3316	0.1322	0.4838	0.1203	0.5615	0.3651	0.5476	0.3843
NY	0.4372	0.2003	0.5343	0.1757	0.4083	0.3752	0.4609	0.3991
OH	0.3827	0.1516	0.6195	0.1514	0.5057	0.3990	0.5723	0.3984
PA	0.4838	0.1611	0.5863	0.1533	0.5799	0.3420	0.6406	0.3549
TX	0.5088	0.1337	0.5064	0.1675	0.3134	0.4462	0.4329	0.4362
Average	0.4346	0.1634	0.5321	0.1540	0.4633	0.3725	0.5094	0.3856
Past Year Use of Cocaine
CA	0.4937	0.1827	0.3807	0.1365	0.4240	0.2833	0.4380	0.3131
FL	0.3228	0.2723	0.4839	0.1286	0.6494	0.2852	0.5982	0.2994
IL	0.6058	0.3017	0.4796	0.1452	0.3945	0.2476	0.4316	0.2724
MI	0.4221	0.2550	0.5056	0.1419	0.5341	0.2935	0.5134	0.3233
NY	0.4502	0.2938	0.4186	0.1903	0.4097	0.2728	0.3996	0.3012
OH	0.5629	0.2872	0.4782	0.1389	0.5790	0.2679	0.5704	0.2887
PA	0.3517	0.2333	0.5553	0.1465	0.4333	0.2681	0.4394	0.2972
TX	0.3932	0.2160	0.3400	0.1274	0.2720	0.2830	0.3627	0.2952
Average	0.4455	0.2633	0.4635	0.1453	0.4662	0.2743	0.4726	0.2972
Past Month Use of Alcohol
CA	0.3987	0.0866	0.5756	0.0821	0.5560	0.1390	0.5808	0.1562
FL	0.4226	0.0998	0.5331	0.1181	0.4971	0.1539	0.5078	0.1659
IL	0.3669	0.1073	0.5651	0.0958	0.4712	0.1379	0.4637	0.1563
MI	0.4200	0.1142	0.4815	0.0836	0.5311	0.1466	0.4978	0.1634
NY	0.4680	0.1147	0.4835	0.1540	0.4485	0.1382	0.4914	0.1571
OH	0.3443	0.1063	0.5001	0.1032	0.4647	0.1207	0.4843	0.1383
PA	0.4636	0.0793	0.6181	0.1300	0.4895	0.1264	0.4856	0.1471
TX	0.6342	0.0738	0.5562	0.1084	0.6464	0.1576	0.6509	0.1700
Average	0.4444	0.0990	0.5351	0.1124	0.5083	0.1401	0.5136	0.1569
Past Month Use of Cigarettes
CA	0.3284	0.0717	0.5193	0.0491	0.5963	0.0760	0.5655	0.0910
FL	0.4907	0.0863	0.5048	0.0912	0.5069	0.0788	0.5184	0.0848
IL	0.4375	0.0827	0.5203	0.0861	0.5016	0.0367	0.5550	0.0577
MI	0.4284	0.0440	0.5433	0.0493	0.4787	0.0555	0.4999	0.0647
NY	0.3974	0.0829	0.5050	0.0715	0.4655	0.0581	0.4643	0.0706
OH	0.4731	0.0688	0.5462	0.0461	0.4433	0.0596	0.4696	0.0714
PA	0.4733	0.0734	0.5898	0.0483	0.4217	0.0558	0.4253	0.0727
TX	0.5882	0.0766	0.6083	0.0544	0.6135	0.1101	0.6321	0.1200
Average	0.4659	0.0735	0.5447	0.0634	0.4931	0.0652	0.5108	0.0778

Note: The design-based (DB) correlation is derived from the SUDAAN sampling variance and covariance calculations for P1 and P2, where P1 is the 1999–2000 pooled small area estimate and P2 is the 2000–2001 pooled small area estimate. SUDAAN uses between-replicate, within-FI (field interviewer) region mean squares and cross products. The DB correlation on the log-odds scale is the same as on the prevalence scale. The model-based (MB) correlations are Bayes posterior correlations for the log-odds calculated from the Markov chain Monte Carlo (MCMC) samples. The MB correlations are underestimated because the software cannot properly account for the sampling covariance resulting from the 2000 data overlap.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.
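The effect of the correlation on the precision of a change estimate follows from the standard identity Var(L2 − L1) = V1 + V2 − 2ρ√(V1·V2), where L1 and L2 are the two log-odds estimates. The sketch below applies that identity with made-up standard errors; it is an illustration of why an underestimated correlation widens the intervals for change, not the report's own computation.

```python
import math

def se_of_change(se_first, se_second, correlation):
    """Standard error of the difference of two correlated estimates."""
    variance = (se_first ** 2 + se_second ** 2
                - 2.0 * correlation * se_first * se_second)
    return math.sqrt(variance)

# Hypothetical log-odds standard errors of 0.10 in each pooled period.
se_if_uncorrelated = se_of_change(0.10, 0.10, 0.0)
se_if_correlated = se_of_change(0.10, 0.10, 0.5)
```

With equal standard errors, raising the correlation from 0 to 0.5 shrinks the standard error of the change by a factor of about 0.71, which is the direction of the difference between the interval-width ratios computed with underestimated and with appropriately estimated correlations.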

Table E.3 Ratio of Average Widths of Change Between the 1999–2000 Pooled Data and the 2000–2001 Pooled Data (Based on the Appropriately Estimated Model-Based Correlations)

State	Age in Years			Total
	12–17	18–25	26+	
Past Month Use of Marijuana
CA	0.65	0.72	0.60	0.56
FL	0.55	0.78	0.58	0.58
IL	0.69	0.73	0.49	0.57
MI	0.58	0.76	0.72	0.68
NY	0.56	0.72	0.46	0.54
OH	0.57	0.71	0.62	0.63
PA	0.57	0.71	0.67	0.66
TX	0.68	0.73	0.34	0.55
Average	0.61	0.73	0.56	0.60
Past Year Use of Cocaine
CA	0.71	0.66	0.53	0.54
FL	0.48	0.91	0.76	0.87
IL	0.69	0.64	0.28	0.42
MI	0.07	0.73	0.71	0.71
NY	0.36	0.63	0.44	0.52
OH	0.33	0.82	0.59	0.73
PA	0.47	0.56	0.39	0.43
TX	0.65	0.76	0.36	0.59
Average	0.47	0.71	0.51	0.60
Past Month Use of Alcohol
CA	0.72	0.76	0.73	0.72
FL	0.59	0.66	0.76	0.74
IL	0.67	0.70	0.65	0.63
MI	0.71	0.70	0.73	0.69
NY	0.70	0.55	0.71	0.71
OH	0.68	0.62	0.77	0.77
PA	0.70	0.58	0.66	0.65
TX	0.88	0.71	0.76	0.72
Average	0.71	0.66	0.72	0.70
Past Month Use of Cigarettes
CA	0.77	0.84	0.72	0.70
FL	0.71	0.78	0.81	0.80
IL	0.79	0.85	0.78	0.80
MI	0.69	0.76	0.81	0.78
NY	0.60	0.80	0.82	0.80
OH	0.78	0.84	0.77	0.76
PA	0.77	0.73	0.81	0.78
TX	0.81	0.90	0.75	0.74
Average	0.74	0.81	0.78	0.77

Note: Ratio = Average width of model-based PIs of change for substates / Average width of design-based CIs of change for substates
Note: The change measure is defined as the odds ratio {P2/(1-P2)}/{P1/(1-P1)}, where P1 is the pooled 1999–2000 small area estimate and P2 is the pooled 2000–2001 small area estimate.
CI = confidence interval; PI = prediction interval.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.

Table E.4 Average Correlation Between the 1999–2000 and the 2000–2001 Model-Based and Design-Based Estimates (Based on the Appropriately Estimated Model-Based Correlations)

State	Age in Years						Total
	12–17		18–25		26+		Total
	DB	MB	DB	MB	DB	MB	DB	MB
Past Month Use of Marijuana
CA	0.3204	0.4760	0.4943	0.4916	0.4107	0.5962	0.3273	0.6235
FL	0.5079	0.5380	0.5020	0.5025	0.3114	0.5441	0.3492	0.5775
IL	0.4133	0.4812	0.4996	0.5351	0.5736	0.5820	0.5988	0.6067
MI	0.3316	0.4588	0.4838	0.5279	0.5615	0.5752	0.5476	0.5944
NY	0.4372	0.5092	0.5343	0.5221	0.4083	0.5293	0.4609	0.5668
OH	0.3827	0.5138	0.6195	0.5711	0.5057	0.5844	0.5723	0.6269
PA	0.4838	0.4861	0.5863	0.5708	0.5799	0.5904	0.6406	0.6112
TX	0.5088	0.5371	0.5064	0.5606	0.3134	0.5498	0.4329	0.6190
Avg.	0.4346	0.5027	0.5321	0.5400	0.4633	0.5659	0.5094	0.6010
Past Year Use of Cocaine
CA	0.4937	0.5673	0.3807	0.4353	0.4240	0.4077	0.4380	0.4349
FL	0.3228	0.5644	0.4839	0.4814	0.6494	0.4919	0.5982	0.5117
IL	0.6058	0.5783	0.4796	0.4570	0.3945	0.4344	0.4316	0.4747
MI	0.4221	0.5396	0.5056	0.4837	0.5341	0.4272	0.5134	0.4568
NY	0.4502	0.5941	0.4186	0.4262	0.4097	0.4536	0.3996	0.4855
OH	0.5629	0.5787	0.4782	0.4728	0.5790	0.4549	0.5704	0.4816
PA	0.3517	0.4995	0.5553	0.5260	0.4333	0.4738	0.4394	0.5086
TX	0.3932	0.5457	0.3400	0.4495	0.2720	0.3754	0.3627	0.4430
Avg.	0.4455	0.5575	0.4635	0.4700	0.4662	0.4434	0.4726	0.4790
Past Month Use of Alcohol
CA	0.3987	0.4984	0.5756	0.5453	0.5560	0.5487	0.5808	0.5625
FL	0.4226	0.5282	0.5331	0.5375	0.4971	0.5352	0.5078	0.5494
IL	0.3669	0.5149	0.5651	0.5542	0.4712	0.5643	0.4637	0.5815
MI	0.4200	0.5062	0.4815	0.5398	0.5311	0.5416	0.4978	0.5613
NY	0.4680	0.5424	0.4835	0.5507	0.4485	0.5272	0.4914	0.5454
OH	0.3443	0.5266	0.5001	0.5399	0.4647	0.5625	0.4843	0.5791
PA	0.4636	0.5073	0.6181	0.5713	0.4895	0.5436	0.4856	0.5583
TX	0.6342	0.5414	0.5562	0.5711	0.6464	0.5996	0.6509	0.6189
Avg.	0.4444	0.5231	0.5351	0.5519	0.5083	0.5533	0.5136	0.5703
Past Month Use of Cigarettes
CA	0.3284	0.4741	0.5193	0.4868	0.5963	0.5384	0.5655	0.5452
FL	0.4907	0.5036	0.5048	0.5014	0.5069	0.5287	0.5184	0.5369
IL	0.4375	0.4614	0.5203	0.5375	0.5016	0.5109	0.5550	0.5240
MI	0.4284	0.5001	0.5433	0.5026	0.4787	0.4268	0.4999	0.4317
NY	0.3974	0.4829	0.5050	0.5028	0.4655	0.4770	0.4643	0.4851
OH	0.4731	0.4763	0.5462	0.5427	0.4433	0.4707	0.4696	0.4830
PA	0.4733	0.4727	0.5898	0.5377	0.4217	0.4913	0.4253	0.4996
TX	0.5882	0.5017	0.6083	0.5293	0.6135	0.5291	0.6321	0.5345
Avg.	0.4659	0.4852	0.5447	0.5210	0.4931	0.4920	0.5108	0.5005

Note: The design-based (DB) correlation is derived from the SUDAAN sampling variance and covariance calculations for P1 and P2, where P1 is the 1999–2000 pooled small area estimate and P2 is the 2000–2001 pooled small area estimate. SUDAAN uses between-replicate, within-FI (field interviewer) region mean squares and cross products. The DB correlation on the log-odds scale is the same as on the prevalence scale. The model-based (MB) correlations are Bayes posterior correlations for the log-odds calculated from the Markov chain Monte Carlo (MCMC) samples. The MB correlations are adjusted to account for the sampling covariance resulting from the 2000 data overlap.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.
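
The statement in the note that the DB correlation is the same on the log-odds scale as on the prevalence scale can be motivated as follows. This is a sketch that assumes a first-order Taylor (delta method) linearization of the logit; the report does not state which derivation was used.

```latex
\begin{align*}
\operatorname{Var}(\operatorname{logit} P_i)
  &\approx \frac{\operatorname{Var}(P_i)}{\{P_i(1-P_i)\}^{2}}, \qquad i = 1, 2, \\
\operatorname{Cov}(\operatorname{logit} P_1, \operatorname{logit} P_2)
  &\approx \frac{\operatorname{Cov}(P_1, P_2)}{P_1(1-P_1)\,P_2(1-P_2)}, \\
\operatorname{Corr}(\operatorname{logit} P_1, \operatorname{logit} P_2)
  &\approx \frac{\operatorname{Cov}(P_1, P_2)}{\sqrt{\operatorname{Var}(P_1)\,\operatorname{Var}(P_2)}}
   = \operatorname{Corr}(P_1, P_2).
\end{align*}
```

The scaling factors P_i(1-P_i) appear in both the numerator and the denominator of the correlation and cancel, so the linearized correlation is unchanged by the transformation.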

Table E.5 Comparison Between the p Values Obtained from Method 1 and Method 2 for Past Month Use of Marijuana

State	12–17			18–25			26 or Older			Total
State	p Value(1)	p Value(2)	Ratio	p Value(1)	p Value(2)	Ratio	p Value(1)	p Value(2)	Ratio	p Value(1)	p Value(2)	Ratio
CA1	0.889	0.861		0.900	0.865		0.946	0.931		0.940	0.919
CA2	0.903	0.872		0.974	0.965		0.780	0.709		0.748	0.671
CA3	0.284	0.162		0.726	0.658		0.851	0.828		0.964	0.956
CA4	0.756	0.689		0.067	0.024		0.995	0.994		0.467	0.339
Average	0.708	0.646	0.91	0.667	0.628	0.94	0.893	0.866	0.97	0.780	0.721	0.92
FL1	0.364	0.219		0.809	0.749		0.627	0.581		0.752	0.716
FL2	0.338	0.211		0.911	0.886		0.998	0.997		0.859	0.821
FL3	0.601	0.456		0.820	0.758		0.764	0.704		0.831	0.782
FL4	0.822	0.787		0.325	0.208		0.668	0.578		0.491	0.369
Average	0.531	0.418	0.79	0.716	0.650	0.91	0.764	0.715	0.94	0.733	0.672	0.92
IL1	0.636	0.560		0.195	0.075		0.628	0.552		0.352	0.238
IL2	0.999	0.999		0.298	0.174		0.487	0.395		0.290	0.198
IL3	0.539	0.443		0.205	0.073		0.373	0.291		0.135	0.072
IL4	0.970	0.959		0.450	0.335		0.634	0.558		0.452	0.344
Average	0.786	0.740	0.94	0.287	0.164	0.57	0.531	0.449	0.85	0.307	0.213	0.69
MI1	0.521	0.407		0.402	0.278		0.956	0.945		0.863	0.835
MI2	0.268	0.131		0.410	0.262		0.682	0.589		0.313	0.172
MI3	0.317	0.228		0.434	0.275		0.875	0.856		0.479	0.397
MI4	0.695	0.633		0.691	0.576		0.694	0.642		0.542	0.471
Average	0.450	0.350	0.78	0.484	0.348	0.72	0.802	0.758	0.95	0.549	0.469	0.85
NY1	0.676	0.596		0.995	0.993		0.981	0.979		0.916	0.903
NY2	0.460	0.303		0.110	0.038		0.584	0.535		0.197	0.132
NY3	0.438	0.349		0.590	0.453		0.778	0.742		0.534	0.461
NY4	0.841	0.802		0.474	0.350		0.667	0.609		0.520	0.440
Average	0.604	0.513	0.85	0.542	0.459	0.85	0.753	0.716	0.95	0.542	0.484	0.89
OH1	0.761	0.694		0.668	0.555		0.583	0.486		0.448	0.324
OH2	0.228	0.143		0.680	0.568		0.728	0.703		0.676	0.627
OH3	0.766	0.685		0.543	0.378		0.646	0.551		0.549	0.415
OH4	0.947	0.924		0.421	0.253		0.628	0.579		0.398	0.283
Average	0.676	0.612	0.91	0.578	0.439	0.76	0.646	0.580	0.90	0.518	0.412	0.80
PA1	0.136	0.069		0.578	0.429		0.933	0.916		0.532	0.434
PA2	0.949	0.937		0.704	0.577		0.819	0.778		0.722	0.642
PA3	0.692	0.594		0.442	0.291		0.582	0.452		0.388	0.238
PA4	0.466	0.331		0.570	0.442		0.622	0.543		0.409	0.308
Average	0.561	0.483	0.86	0.574	0.435	0.76	0.739	0.672	0.91	0.513	0.406	0.79
TX1	0.349	0.222		0.657	0.523		0.656	0.602		0.713	0.638
TX2	0.636	0.547		0.906	0.870		0.327	0.272		0.337	0.241
TX3	0.786	0.694		0.912	0.885		0.479	0.451		0.497	0.441
TX4	0.995	0.993		0.679	0.565		0.367	0.331		0.366	0.265
Average	0.692	0.614	0.89	0.789	0.711	0.90	0.457	0.414	0.91	0.478	0.396	0.83
Average across substates			0.87			0.80			0.92			0.84

Note: p value(1) represents the Bayes significance level obtained from Method 1.
Note: p value(2) represents the Bayes significance level obtained from Method 2.
Note: In Method 1, an eight age-group model was fit, in which age groups 1 to 4 correspond to the pooled 1999–2000 data and age groups 5 to 8 correspond to the pooled 2000–2001 data. The p value for this method was obtained by using the variance of the log-odds produced by fitting this model. In Method 2, a 12 age-group model was fit, in which age groups 1 to 4 correspond to the 1999 data, age groups 5 to 8 to the 2000 data, and age groups 9 to 12 to the 2001 data. The p values for Method 2 were obtained using the correlation produced by that model and the variances of the logits produced in Method 1.
Note: Ratio = Average p value(2) / Average p value(1).
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.
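
The role of the correlation in the Method 2 p values can be illustrated with a minimal sketch. It assumes that the variance of the change in log-odds is formed as var1 + var2 - 2*rho*sqrt(var1*var2) and that a two-sided normal approximation is used for the significance level. Both are assumptions made here for illustration; the report does not spell out the exact computation, and the function name and input values are hypothetical.

```python
import math


def change_p_value(logit1, logit2, var1, var2, rho):
    """Two-sided p value for a change in log-odds (normal approximation).

    logit1, logit2: log-odds for the two pooled periods
    var1, var2:     variances of those log-odds
    rho:            correlation between the two log-odds estimates
    """
    # Variance of a difference of two correlated quantities.
    var_diff = var1 + var2 - 2.0 * rho * math.sqrt(var1 * var2)
    if var_diff <= 0.0:
        raise ValueError("variance of the difference must be positive")
    z = (logit2 - logit1) / math.sqrt(var_diff)
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical inputs: the same logits and variances, two correlations.
for rho in (0.3, 0.6):
    p = change_p_value(-2.90, -2.80, 0.010, 0.010, rho)
    print(f"rho = {rho:.1f}: p = {p:.3f}")
```

Under these assumptions, a larger positive correlation reduces the variance of the difference and therefore yields a smaller p value for the same observed change, which is the direction of the ratios shown in the table.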

Table E.6 Comparison Between the p Values Obtained from Method 1 and Method 2 for Past Year Use of Cocaine

State	12–17			18–25			26 or Older			Total
State	p Value(1)	p Value(2)	Ratio	p Value(1)	p Value(2)	Ratio	p Value(1)	p Value(2)	Ratio	p Value(1)	p Value(2)	Ratio
CA1	0.765	0.672		0.831	0.793		0.689	0.664		0.810	0.792
CA2	0.946	0.931		0.375	0.263		0.936	0.931		0.671	0.645
CA3	0.989	0.985		0.841	0.799		0.384	0.331		0.380	0.333
CA4	0.945	0.924		0.575	0.505		0.962	0.957		0.745	0.714
Average	0.911	0.878	0.96	0.656	0.590	0.90	0.743	0.721	0.97	0.652	0.621	0.95
FL1	0.783	0.713		0.872	0.837		0.979	0.976		0.984	0.982
FL2	0.956	0.941		0.476	0.398		0.894	0.872		0.913	0.894
FL3	0.500	0.406		0.856	0.792		0.947	0.936		0.944	0.930
FL4	0.642	0.555		0.934	0.914		0.972	0.967		0.973	0.969
Average	0.720	0.654	0.91	0.785	0.735	0.94	0.948	0.938	0.99	0.954	0.944	0.99
IL1	0.823	0.762		0.383	0.210		0.826	0.781		0.601	0.486
IL2	0.782	0.739		0.698	0.658		0.829	0.803		0.756	0.725
IL3	0.594	0.508		0.720	0.675		0.650	0.604		0.624	0.582
IL4	0.643	0.532		0.336	0.203		0.683	0.662		0.445	0.382
Average	0.711	0.635	0.89	0.534	0.437	0.82	0.747	0.713	0.95	0.607	0.544	0.90
MI1	0.989	0.987		0.367	0.283		0.853	0.839		0.504	0.474
MI2	0.867	0.831		0.594	0.464		0.929	0.920		0.873	0.858
MI3	0.985	0.979		0.983	0.978		0.737	0.707		0.775	0.740
MI4	0.665	0.594		0.483	0.370		0.645	0.611		0.967	0.963
Average	0.877	0.848	0.97	0.607	0.524	0.86	0.791	0.769	0.97	0.780	0.759	0.97
NY1	0.390	0.306		0.560	0.526		0.773	0.736		0.711	0.681
NY2	0.939	0.917		0.441	0.357		0.757	0.722		0.545	0.478
NY3	0.822	0.769		0.766	0.714		0.937	0.926		0.870	0.843
NY4	0.670	0.538		0.527	0.428		0.933	0.925		0.886	0.867
Average	0.705	0.633	0.90	0.574	0.506	0.88	0.850	0.827	0.97	0.753	0.717	0.95
OH1	0.962	0.953		0.512	0.411		0.737	0.700		0.557	0.500
OH2	0.836	0.778		0.899	0.870		0.811	0.790		0.763	0.732
OH3	0.943	0.927		0.653	0.562		0.829	0.806		0.989	0.987
OH4	0.847	0.797		0.267	0.156		0.981	0.977		0.622	0.542
Average	0.897	0.864	0.96	0.583	0.500	0.86	0.840	0.818	0.97	0.733	0.690	0.94
PA1	0.721	0.677		0.269	0.127		0.663	0.623		0.349	0.269
PA2	0.618	0.564		0.402	0.252		0.758	0.701		0.555	0.453
PA3	0.846	0.811		0.829	0.793		0.723	0.665		0.658	0.603
PA4	0.936	0.909		0.699	0.585		0.795	0.770		0.698	0.653
Average	0.780	0.740	0.95	0.550	0.439	0.80	0.735	0.690	0.94	0.565	0.495	0.88
TX1	0.750	0.671		0.668	0.532		0.900	0.893		0.683	0.636
TX2	0.422	0.296		0.614	0.496		0.970	0.967		0.939	0.930
TX3	0.654	0.563		0.885	0.873		0.879	0.869		0.973	0.971
TX4	0.943	0.926		0.549	0.475		0.964	0.963		0.770	0.747
Average	0.692	0.614	0.89	0.679	0.594	0.87	0.928	0.923	0.99	0.841	0.821	0.98
Average across substates			0.93			0.87			0.97			0.94

Table E.7 Comparison Between the p Values Obtained from Method 1 and Method 2 for Past Month Use of Alcohol

State	12–17			18–25			26 or Older			Total
State	p Value(1)	p Value(2)	Ratio	p Value(1)	p Value(2)	Ratio	p Value(1)	p Value(2)	Ratio	p Value(1)	p Value(2)	Ratio
CA1	0.317	0.175		0.801	0.707		0.239	0.103		0.238	0.096
CA2	0.788	0.723		0.486	0.324		0.743	0.651		0.627	0.502
CA3	0.819	0.758		0.150	0.049		0.267	0.129		0.157	0.055
CA4	0.657	0.546		0.496	0.336		0.956	0.939		0.842	0.780
Average	0.645	0.551	0.85	0.483	0.354	0.73	0.551	0.456	0.83	0.466	0.358	0.77
FL1	0.306	0.153		0.287	0.155		0.157	0.052		0.193	0.074
FL2	0.386	0.239		0.806	0.734		0.826	0.761		0.819	0.748
FL3	0.733	0.624		0.308	0.154		0.875	0.837		0.982	0.976
FL4	0.872	0.829		0.603	0.465		0.987	0.982		0.943	0.923
Average	0.574	0.461	0.80	0.501	0.377	0.75	0.711	0.658	0.93	0.734	0.680	0.93
IL1	0.883	0.847		0.819	0.744		0.668	0.526		0.654	0.507
IL2	0.512	0.342		0.920	0.889		0.491	0.337		0.491	0.330
IL3	0.863	0.819		0.711	0.593		0.689	0.602		0.747	0.668
IL4	0.714	0.622		0.891	0.843		0.501	0.332		0.531	0.365
Average	0.743	0.658	0.88	0.835	0.767	0.92	0.587	0.449	0.77	0.606	0.468	0.77
MI1	0.879	0.839		0.978	0.970		0.332	0.180		0.366	0.211
MI2	0.655	0.549		0.383	0.210		0.121	0.034		0.096	0.020
MI3	0.464	0.331		0.479	0.325		0.631	0.524		0.531	0.402
MI4	0.716	0.622		0.097	0.016		0.273	0.130		0.179	0.059
Average	0.679	0.585	0.86	0.484	0.380	0.79	0.339	0.217	0.64	0.293	0.173	0.59
NY1	0.853	0.780		0.567	0.427		0.896	0.859		0.956	0.939
NY2	0.441	0.261		0.904	0.873		0.966	0.955		0.953	0.937
NY3	0.763	0.693		0.950	0.932		0.483	0.326		0.458	0.297
NY4	0.834	0.784		0.995	0.993		0.464	0.337		0.493	0.362
Average	0.723	0.630	0.87	0.854	0.806	0.94	0.702	0.619	0.88	0.715	0.634	0.89
OH1	0.697	0.580		0.579	0.437		0.361	0.187		0.411	0.233
OH2	0.438	0.337		0.327	0.156		0.359	0.219		0.271	0.133
OH3	0.950	0.930		0.393	0.245		0.877	0.823		0.793	0.701
OH4	0.831	0.753		0.128	0.037		0.853	0.789		0.684	0.558
Average	0.729	0.650	0.89	0.357	0.219	0.61	0.613	0.505	0.82	0.540	0.406	0.75
PA1	0.358	0.203		0.420	0.244		0.896	0.863		0.769	0.694
PA2	0.457	0.301		0.585	0.432		0.852	0.798		0.764	0.679
PA3	0.317	0.173		0.855	0.803		0.340	0.168		0.305	0.138
PA4	0.454	0.320		0.216	0.073		0.931	0.905		0.761	0.675
Average	0.397	0.249	0.63	0.519	0.388	0.75	0.755	0.684	0.91	0.650	0.547	0.84
TX1	0.528	0.392		0.991	0.987		0.747	0.649		0.827	0.753
TX2	0.970	0.958		0.757	0.662		0.419	0.253		0.381	0.206
TX3	0.864	0.796		0.773	0.684		0.888	0.842		0.839	0.769
TX4	0.667	0.539		0.704	0.568		0.181	0.037		0.171	0.031
Average	0.757	0.671	0.89	0.806	0.725	0.90	0.559	0.445	0.80	0.555	0.440	0.79
Average across substates			0.84			0.80			0.82			0.79

Table E.8 Comparison Between the p Values Obtained from Method 1 and Method 2 for Past Month Use of Cigarettes

State	12–17			18–25			26 or Older			Total
State	p Value(1)	p Value(2)	Ratio	p Value(1)	p Value(2)	Ratio	p Value(1)	p Value(2)	Ratio	p Value(1)	p Value(2)	Ratio
CA1	0.648	0.554		0.810	0.742		0.498	0.321		0.482	0.305
CA2	0.831	0.780		0.454	0.291		0.978	0.969		0.892	0.849
CA3	0.965	0.954		0.383	0.247		0.809	0.733		0.917	0.883
CA4	0.541	0.407		0.678	0.577		0.572	0.443		0.586	0.457
Average	0.746	0.674	0.90	0.581	0.464	0.80	0.714	0.617	0.86	0.719	0.624	0.87
FL1	0.304	0.178		0.943	0.926		0.469	0.294		0.425	0.246
FL2	0.344	0.199		0.771	0.687		0.823	0.770		0.849	0.801
FL3	0.791	0.700		0.797	0.730		0.695	0.588		0.656	0.536
FL4	0.393	0.261		0.113	0.029		0.977	0.966		0.861	0.797
Average	0.458	0.335	0.73	0.656	0.593	0.90	0.741	0.655	0.88	0.698	0.595	0.85
IL1	0.535	0.423		0.326	0.172		0.532	0.377		0.467	0.300
IL2	0.554	0.401		0.757	0.670		0.443	0.277		0.444	0.282
IL3	0.566	0.445		0.758	0.659		0.479	0.332		0.575	0.439
IL4	0.134	0.071		0.812	0.736		0.557	0.412		0.438	0.277
Average	0.447	0.335	0.75	0.663	0.559	0.84	0.503	0.350	0.70	0.481	0.325	0.67
MI1	0.191	0.052		0.541	0.404		0.266	0.148		0.176	0.078
MI2	0.893	0.856		0.942	0.919		0.414	0.295		0.435	0.314
MI3	0.951	0.932		0.833	0.773		0.457	0.361		0.425	0.334
MI4	0.690	0.601		0.257	0.115		0.606	0.495		0.508	0.380
Average	0.681	0.610	0.90	0.643	0.553	0.86	0.436	0.325	0.75	0.386	0.277	0.72
NY1	0.153	0.040		0.647	0.546		0.373	0.249		0.283	0.164
NY2	0.207	0.092		0.176	0.063		0.857	0.811		0.840	0.787
NY3	0.472	0.373		0.451	0.311		0.247	0.106		0.346	0.193
NY4	0.485	0.354		0.682	0.560		0.416	0.276		0.438	0.296
Average	0.329	0.215	0.65	0.489	0.370	0.76	0.473	0.361	0.76	0.477	0.360	0.76
OH1	0.684	0.572		0.473	0.280		0.420	0.281		0.541	0.411
OH2	0.953	0.941		0.928	0.899		0.639	0.537		0.633	0.529
OH3	0.069	0.015		0.251	0.102		0.417	0.272		0.231	0.103
OH4	0.617	0.500		0.488	0.317		0.889	0.853		0.942	0.923
Average	0.581	0.507	0.87	0.535	0.400	0.75	0.591	0.486	0.82	0.587	0.492	0.84
PA1	0.574	0.463		0.827	0.758		0.312	0.181		0.306	0.176
PA2	0.586	0.450		0.774	0.681		0.979	0.971		0.933	0.908
PA3	0.247	0.129		0.438	0.261		0.677	0.563		0.668	0.552
PA4	0.672	0.582		0.660	0.526		0.593	0.472		0.580	0.456
Average	0.520	0.406	0.78	0.675	0.557	0.82	0.640	0.547	0.85	0.622	0.523	0.84
TX1	0.910	0.882		0.982	0.974		0.827	0.760		0.813	0.739
TX2	0.989	0.986		0.496	0.327		0.749	0.657		0.658	0.545
TX3	0.317	0.157		0.783	0.705		0.914	0.888		0.804	0.742
TX4	0.717	0.591		0.727	0.612		0.949	0.928		0.992	0.988
Average	0.733	0.654	0.89	0.747	0.655	0.88	0.860	0.808	0.94	0.817	0.754	0.92
Average across substates			0.81			0.83			0.82			0.81

Table E.9 Relative Absolute Bias for Change Between Pooled 1999–2000 Data and Pooled 2000–2001 Data for Past Month Marijuana Use

State	Age in Years			Total
State	12–17	18–25	26+	Total
CA (design-based)	1.10	1.06	1.08	1.08
Average across 4 substates	1.06	1.06	1.01	1.03
Relative Absolute Bias	3.04	0.29	6.74	4.18
FL (design-based)	1.21	0.98	0.93	0.99
Average across 4 substates	1.10	1.02	0.99	1.02
Relative Absolute Bias	9.20	3.88	5.65	2.87
IL (design-based)	0.97	1.22	1.33	1.21
Average across 4 substates	1.01	1.16	1.13	1.12
Relative Absolute Bias	3.88	4.27	15.06	7.72
MI (design-based)	1.26	1.05	1.01	1.06
Average across 4 substates	1.14	1.05	1.05	1.06
Relative Absolute Bias	9.96	0.58	4.31	0.10
NY (design-based)	1.14	1.10	1.43	1.22
Average across 4 substates	1.04	1.11	1.07	1.07
Relative Absolute Bias	8.71	1.24	25.53	12.21
OH (design-based)	1.13	1.04	1.05	1.06
Average across 4 substates	1.06	1.05	1.10	1.07
Relative Absolute Bias	5.79	1.11	4.42	1.49
PA (design-based)	1.24	1.15	1.07	1.11
Average across 4 substates	1.12	1.08	1.07	1.08
Relative Absolute Bias	9.67	5.40	0.62	2.87
TX (design-based)	1.02	0.99	1.37	1.11
Average across 4 substates	1.08	1.01	1.18	1.08
Relative Absolute Bias	6.10	1.65	13.55	2.62
Average Relative Absolute Bias	7.04	2.30	9.48	4.26

Note: Relative absolute bias = 100*abs(Average model-based change over 4 substates - Large State design-based change) / Large State design-based change.
Note: The change measure is defined as the odds ratio {P2/(1-P2)}/{P1/(1-P1)}, where P1 is the pooled 1999–2000 small area estimate and P2 is the pooled 2000–2001 small area estimate.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.
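
The relative absolute bias defined in the note above can be computed as in the following minimal sketch. The function name and the four substate values are hypothetical. Because the tabulated entries are rounded to two decimals, recomputing the bias from the printed values will generally not reproduce the published figures exactly.

```python
def relative_absolute_bias(substate_estimates, state_design_based):
    """100 * |mean(substate model-based estimates) - design-based| / design-based."""
    if not substate_estimates:
        raise ValueError("at least one substate estimate is required")
    average = sum(substate_estimates) / len(substate_estimates)
    return 100.0 * abs(average - state_design_based) / state_design_based


# Hypothetical model-based changes (odds ratios) for four substates,
# compared with a hypothetical large-State design-based change.
substates = [1.04, 1.08, 1.05, 1.07]
print(f"{relative_absolute_bias(substates, 1.10):.2f}")
```

The same function applies to the prevalence-level comparisons, with small area prevalence estimates in place of the change measure.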

Table E.10 Relative Absolute Bias for Change Between Pooled 1999–2000 Data and Pooled 2000–2001 Data for Past Year Use of Cocaine

State	Age in Years			Total
State	12–17	18–25	26+	Total
CA (design-based)	0.98	1.12	1.31	1.21
Average across 4 substates	0.97	1.08	1.10	1.08
Relative Absolute Bias	0.51	3.92	16.27	10.46
FL (design-based)	0.84	0.79	0.79	0.80
Average across 4 substates	0.93	0.95	1.02	0.99
Relative Absolute Bias	11.54	19.50	29.31	24.10
IL (design-based)	0.71	1.29	1.43	1.33
Average across 4 substates	0.90	1.16	1.10	1.10
Relative Absolute Bias	26.64	10.71	23.10	17.29
MI (design-based)	1.26	0.94	0.65	0.83
Average across 4 substates	1.02	1.03	0.93	0.97
Relative Absolute Bias	18.44	10.07	43.20	17.23
NY (design-based)	0.82	1.27	1.23	1.21
Average across 4 substates	0.90	1.13	1.04	1.06
Relative Absolute Bias	9.71	10.81	15.46	12.36
OH (design-based)	1.10	0.85	0.92	0.90
Average across 4 substates	0.97	1.00	0.97	0.98
Relative Absolute Bias	11.51	17.03	5.74	9.10
PA (design-based)	1.14	1.27	1.17	1.20
Average across 4 substates	1.00	1.15	1.10	1.10
Relative Absolute Bias	12.02	10.08	6.23	8.12
TX (design-based)	0.87	0.98	1.11	1.00
Average across 4 substates	0.91	0.97	1.00	0.97
Relative Absolute Bias	4.14	0.93	9.99	3.00
Average Relative Absolute Bias	11.81	10.38	18.66	12.71

Note: Relative absolute bias = 100*abs(Average model-based change over 4 substates - Large State design-based change) / Large State design-based change.
Note: The change measure is defined as the odds ratio {P2/(1-P2)}/{P1/(1-P1)}, where P1 is the pooled 1999–2000 small area estimate and P2 is the pooled 2000–2001 small area estimate.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.

Table E.11 Relative Absolute Bias for Change Between Pooled 1999–2000 Data and Pooled 2000–2001 Data for Past Month Use of Alcohol

State	Age in Years			Total
State	12–17	18–25	26+	Total
CA (design-based)	0.94	1.07	1.01	1.02
Average across 4 substates	0.96	1.08	1.08	1.07
Relative Absolute Bias	1.84	0.82	6.44	5.02
FL (design-based)	1.14	0.96	1.09	1.07
Average across 4 substates	1.09	0.95	1.05	1.04
Relative Absolute Bias	4.97	1.17	3.41	3.18
IL (design-based)	1.03	0.99	1.08	1.05
Average across 4 substates	1.04	1.00	1.02	1.02
Relative Absolute Bias	1.70	0.87	4.73	3.46
MI (design-based)	1.06	1.10	1.11	1.09
Average across 4 substates	1.05	1.10	1.11	1.10
Relative Absolute Bias	1.43	0.18	0.44	0.27
NY (design-based)	1.03	1.00	0.96	0.97
Average across 4 substates	1.04	1.02	1.00	1.00
Relative Absolute Bias	0.91	1.69	3.62	3.03
OH (design-based)	1.04	1.08	1.01	1.02
Average across 4 substates	1.04	1.08	1.06	1.06
Relative Absolute Bias	0.21	0.14	5.30	4.00
PA (design-based)	1.16	1.15	1.07	1.08
Average across 4 substates	1.12	1.08	1.04	1.04
Relative Absolute Bias	3.77	6.13	3.00	3.25
TX (design-based)	0.99	1.01	1.01	1.01
Average across 4 substates	1.00	1.03	1.06	1.05
Relative Absolute Bias	1.73	1.23	5.15	4.04
Average Relative Absolute Bias	2.07	1.53	4.01	3.28

Table E.12 Relative Absolute Bias for Change Between Pooled 1999–2000 Data and Pooled 2000–2001 Data for Past Month Use of Cigarettes

State	Age in Years			Total
State	12–17	18–25	26+	Total
CA (design-based)	0.96	1.00	1.02	1.02
Average across 4 substates	1.00	0.98	0.97	0.97
Relative Absolute Bias	3.44	1.87	5.20	4.23
FL (design-based)	0.99	1.07	0.96	0.97
Average across 4 substates	0.96	1.03	0.96	0.97
Relative Absolute Bias	2.43	3.02	0.47	0.25
IL (design-based)	0.87	1.03	1.00	0.99
Average across 4 substates	0.89	1.02	1.00	1.00
Relative Absolute Bias	2.02	0.84	0.51	0.30
MI (design-based)	0.95	1.00	1.02	1.01
Average across 4 substates	0.94	1.01	0.99	0.99
Relative Absolute Bias	1.35	1.22	3.50	2.51
NY (design-based)	1.02	1.10	0.88	0.92
Average across 4 substates	1.01	1.07	0.91	0.94
Relative Absolute Bias	0.38	2.64	2.63	1.41
OH (design-based)	0.93	0.95	1.03	1.01
Average across 4 substates	0.91	0.97	1.01	1.00
Relative Absolute Bias	1.80	2.32	1.30	0.81
PA (design-based)	0.90	1.04	1.01	1.00
Average across 4 substates	0.91	1.03	1.00	1.00
Relative Absolute Bias	1.50	0.96	0.65	0.55
TX (design-based)	0.96	0.99	1.01	1.00
Average across 4 substates	0.96	0.98	0.99	0.99
Relative Absolute Bias	0.36	0.37	1.69	1.33
Average Relative Absolute Bias	1.66	1.65	1.99	1.42

Table E.13 Relative Absolute Bias for Past Month Use of Marijuana Based on Pooled 1999 and 2000 Data

State	Age in Years			Total
State	12–17	18–25	26+	Total
CA (design-based)	7.60	13.94	4.16	5.86
Average across 4 substates	7.47	13.45	3.77	5.49
Relative Absolute Bias	1.75	3.54	9.28	6.35
FL (design-based)	6.33	13.31	3.39	4.73
Average across 4 substates	6.80	13.28	3.52	4.87
Relative Absolute Bias	7.39	0.19	3.77	3.02
IL (design-based)	8.57	14.31	2.51	4.70
Average across 4 substates	7.69	14.45	2.75	4.81
Relative Absolute Bias	10.24	1.01	9.66	2.44
MI (design-based)	7.77	16.64	3.53	5.68
Average across 4 substates	8.01	16.92	3.40	5.64
Relative Absolute Bias	3.08	1.69	3.71	0.68
NY (design-based)	6.32	16.77	2.02	4.26
Average across 4 substates	7.08	15.38	2.62	4.63
Relative Absolute Bias	12.08	8.26	29.53	8.69
OH (design-based)	6.07	14.31	2.49	4.40
Average across 4 substates	6.68	13.98	2.44	4.38
Relative Absolute Bias	10.03	2.31	2.17	0.50
PA (design-based)	5.83	14.16	2.79	4.42
Average across 4 substates	6.81	13.91	2.75	4.45
Relative Absolute Bias	16.90	1.75	1.63	0.71
TX (design-based)	6.00	10.41	1.34	3.22
Average across 4 substates	5.84	10.59	1.77	3.55
Relative Absolute Bias	2.65	1.79	32.35	10.19
Average Relative Absolute Bias	8.01	2.57	11.51	4.07

Note: Relative absolute bias = 100 × abs(Average small area estimate over 4 substates - Large State design-based estimate) / Large State design-based estimate.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999 and 2000.

Table E.14 Relative Absolute Bias for Past Year Use of Cocaine Based on Pooled 1999 and 2000 Data

State	Age in Years			Total
State	12–17	18–25	26+	Total
CA (design-based)	2.05	4.79	1.29	1.85
Average across 4 substates	2.00	4.76	1.17	1.75
Relative Absolute Bias	2.19	0.65	9.24	5.36
FL (design-based)	1.52	5.96	1.18	1.73
Average across 4 substates	1.55	4.86	1.20	1.63
Relative Absolute Bias	2.18	18.43	1.60	5.78
IL (design-based)	0.96	3.93	1.11	1.47
Average across 4 substates	1.41	4.36	1.08	1.55
Relative Absolute Bias	47.60	10.97	2.88	5.43
MI (design-based)	1.02	5.04	0.86	1.42
Average across 4 substates	1.41	4.72	1.13	1.62
Relative Absolute Bias	38.23	6.34	30.39	14.07
NY (design-based)	1.18	3.87	1.01	1.37
Average across 4 substates	1.46	4.30	1.10	1.53
Relative Absolute Bias	23.10	11.10	9.67	11.32
OH (design-based)	0.78	4.98	0.92	1.43
Average across 4 substates	1.32	4.68	1.07	1.57
Relative Absolute Bias	69.04	6.15	17.00	9.45
PA (design-based)	1.18	4.39	1.00	1.41
Average across 4 substates	1.47	4.57	1.02	1.48
Relative Absolute Bias	25.11	4.01	1.71	4.45
TX (design-based)	2.66	6.10	0.83	1.82
Average across 4 substates	2.32	5.54	1.17	1.95
Relative Absolute Bias	12.90	9.07	41.21	7.17
Average Relative Absolute Bias	27.54	8.34	14.21	7.88

Table E.15 Relative Absolute Bias for Past Month Binge Alcohol Use Based on Pooled 1999 and 2000 Data

State	Age in Years			Total
State	12–17	18–25	26+	Total
CA (design-based)	9.12	32.46	18.58	19.42
Average across 4 substates	9.16	32.16	18.55	19.36
Relative Absolute Bias	0.40	0.93	0.18	0.32
FL (design-based)	7.93	35.02	17.72	18.67
Average across 4 substates	8.79	34.68	17.60	18.62
Relative Absolute Bias	10.94	0.97	0.68	0.28
IL (design-based)	11.53	41.83	21.43	23.13
Average across 4 substates	11.00	41.62	21.00	22.72
Relative Absolute Bias	4.60	0.50	2.00	1.77
MI (design-based)	10.88	42.23	19.08	21.23
Average across 4 substates	10.84	41.12	19.55	21.44
Relative Absolute Bias	0.38	2.64	2.47	1.00
NY (design-based)	10.14	39.47	18.61	20.33
Average across 4 substates	9.92	38.99	18.74	20.35
Relative Absolute Bias	2.25	1.22	0.72	0.11
OH (design-based)	9.97	41.73	20.32	22.04
Average across 4 substates	10.42	41.67	19.95	21.79
Relative Absolute Bias	4.48	0.15	1.84	1.13
PA (design-based)	9.30	42.13	20.55	21.97
Average across 4 substates	10.20	41.92	19.67	21.35
Relative Absolute Bias	9.67	0.50	4.26	2.84
TX (design-based)	11.07	35.62	20.08	21.31
Average across 4 substates	10.78	36.06	20.15	21.39
Relative Absolute Bias	2.59	1.24	0.35	0.39
Average Relative Absolute Bias	4.41	1.02	1.56	0.98

Table E.16 Relative Absolute Bias for Past Month Use of Cigarettes Based on Pooled 1999 and 2000 Data

State	Age in Years			Total
State	12–17	18–25	26+	Total
CA (design-based)	8.73	29.62	22.07	21.62
Average across 4 substates	9.16	30.65	21.53	21.41
Relative Absolute Bias	4.93	3.49	2.41	0.98
FL (design-based)	10.92	34.60	25.16	24.85
Average across 4 substates	11.59	35.52	24.92	24.82
Relative Absolute Bias	6.16	2.65	0.96	0.13
IL (design-based)	15.61	43.44	25.57	26.93
Average across 4 substates	15.16	42.31	25.29	26.51
Relative Absolute Bias	2.87	2.60	1.11	1.53
MI (design-based)	15.68	43.81	25.14	26.57
Average across 4 substates	15.91	42.82	26.38	27.42
Relative Absolute Bias	1.45	2.27	4.93	3.17
NY (design-based)	12.28	36.29	23.95	24.31
Average across 4 substates	12.19	36.30	24.08	24.40
Relative Absolute Bias	0.76	0.03	0.54	0.38
OH (design-based)	15.83	45.66	28.21	29.21
Average across 4 substates	16.06	44.67	27.70	28.71
Relative Absolute Bias	1.45	2.16	1.83	1.71
PA (design-based)	16.21	42.32	24.97	26.14
Average across 4 substates	16.36	41.74	25.32	26.37
Relative Absolute Bias	0.97	1.39	1.44	0.87
TX (design-based)	12.73	34.49	23.12	23.57
Average across 4 substates	12.39	35.11	23.37	23.80
Relative Absolute Bias	2.74	1.79	1.07	0.98
Average Relative Absolute Bias	2.67	2.05	1.79	1.22

Table E.17 Ratio of Average Widths for Pooled 1999 and 2000 Data

State	Age in Years			Total
State	12–17	18–25	26+	Total
Past Month Use of Marijuana
CA	0.76	0.71	0.75	0.76
FL	0.72	0.76	0.77	0.81
IL	0.62	0.70	0.79	0.74
MI	0.72	0.81	0.73	0.80
NY	0.79	0.70	0.91	0.85
OH	0.67	0.64	0.62	0.67
PA	0.71	0.65	0.65	0.71
TX	0.72	0.72	0.67	0.75
Average	0.71	0.71	0.74	0.76
Past Year Use of Cocaine
CA	0.70	0.66	0.52	0.58
FL	0.53	0.60	0.60	0.64
IL	0.65	0.66	0.46	0.54
MI	0.54	0.58	0.59	0.65
NY	0.46	0.71	0.75	0.79
OH	0.60	0.62	0.61	0.68
PA	0.61	0.59	0.50	0.57
TX	0.62	0.65	0.72	0.71
Average	0.59	0.63	0.59	0.65
Past Month Binge Alcohol Use
CA	0.82	0.76	0.77	0.81
FL	0.71	0.63	0.72	0.73
IL	0.64	0.66	0.70	0.69
MI	0.69	0.75	0.71	0.71
NY	0.74	0.60	0.76	0.77
OH	0.85	0.60	0.75	0.72
PA	0.75	0.59	0.70	0.69
TX	0.79	0.71	0.70	0.72
Average	0.75	0.66	0.73	0.73
Past Month Use of Cigarettes
CA	0.82	0.84	0.65	0.66
FL	0.71	0.74	0.86	0.86
IL	0.67	0.83	0.69	0.69
MI	0.79	0.71	0.73	0.72
NY	0.64	0.76	0.82	0.82
OH	0.72	0.81	0.75	0.75
PA	0.72	0.69	0.81	0.78
TX	0.72	0.74	0.68	0.66
Average	0.72	0.77	0.75	0.74

Note: Ratio = Average width of model-based prediction intervals for substates / Average width of design-based confidence intervals for substates.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999 and 2000.
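
The ratio of average widths defined in the note above can be computed as in the following minimal sketch. The function name and the interval endpoints are hypothetical and are not taken from the survey data.

```python
def ratio_of_average_widths(model_based_intervals, design_based_intervals):
    """Average width of model-based PIs divided by average width of design-based CIs.

    Each argument is a list of (lower, upper) pairs, one pair per substate.
    """
    def average_width(intervals):
        if not intervals:
            raise ValueError("at least one interval is required")
        return sum(upper - lower for lower, upper in intervals) / len(intervals)

    return average_width(model_based_intervals) / average_width(design_based_intervals)


# Hypothetical prevalence intervals (percent) for four substates.
model_based = [(4.1, 6.9), (3.8, 6.4), (4.5, 7.5), (4.0, 6.6)]
design_based = [(3.6, 7.4), (3.2, 7.0), (3.9, 8.1), (3.5, 7.1)]
print(f"{ratio_of_average_widths(model_based, design_based):.2f}")
```

A ratio below 1 indicates that the model-based intervals are, on average, narrower than the design-based intervals for the same substates.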

Table E.18 1999 NHSDA Weighted Screening and Interview Response Rates, by State

State	Screening Response Rate	Interview Response Rate	Overall Response Rate	State	Screening Response Rate	Interview Response Rate	Overall Response Rate
Total	89.63	68.55	61.44	Missouri	91.32	73.59	67.21
Alabama	92.60	71.36	66.08	Montana	92.76	76.39	70.86
Alaska	91.07	77.20	70.31	Nebraska	89.99	72.05	64.84
Arizona	94.43	65.87	62.21	Nevada	79.89	63.05	50.37
Arkansas	95.71	80.45	77.00	New Hampshire	85.36	69.87	59.65
California	87.47	64.12	56.08	New Jersey	89.65	65.24	58.48
Colorado	91.62	65.84	60.32	New Mexico	96.12	77.77	74.75
Connecticut	85.62	58.60	50.17	New York	84.28	59.98	50.55
Delaware	87.13	58.36	50.85	North Carolina	92.87	71.84	66.72
District of Columbia	93.35	79.93	74.61	North Dakota	89.89	77.48	69.65
Florida	89.94	68.20	61.33	Ohio	90.35	67.78	61.24
Georgia	90.47	66.97	60.59	Oklahoma	91.58	67.79	62.08
Hawaii	89.11	67.61	60.25	Oregon	85.20	71.57	60.98
Idaho	92.93	75.45	70.11	Pennsylvania	92.34	68.99	63.71
Illinois	87.35	63.74	55.68	Rhode Island	86.68	66.72	57.83
Indiana	91.68	73.06	66.98	South Carolina	91.96	65.92	60.61
Iowa	92.44	69.69	64.41	South Dakota	94.35	76.14	71.84
Kansas	90.59	72.89	66.03	Tennessee	90.92	67.70	61.56
Kentucky	92.36	73.75	68.12	Texas	92.57	75.12	69.54
Louisiana	94.81	76.97	72.98	Utah	93.16	81.70	76.11
Maine	89.96	75.18	67.63	Vermont	90.26	74.49	67.24
Maryland	87.78	64.66	56.76	Virginia	89.84	66.28	59.55
Massachusetts	80.59	61.82	49.82	Washington	86.49	75.06	64.92
Michigan	88.21	66.54	58.70	West Virginia	95.59	74.31	71.03
Minnesota	89.46	77.72	69.53	Wisconsin	90.19	73.05	65.89
Mississippi	94.51	82.77	78.23	Wyoming	93.79	72.62	68.11

Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999.
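
The tabulated overall response rates are consistent with the product of the weighted screening and interview response rates. The following minimal sketch checks that relationship for three rows of Table E.18. The product rule is inferred here from the tabulated values rather than stated in this appendix, and the function name is hypothetical.

```python
def overall_response_rate(screening_rate, interview_rate):
    """Overall rate (percent) as the product of two rates given in percent."""
    return screening_rate * interview_rate / 100.0


# (screening, interview, published overall) from Table E.18 (1999).
rows = {
    "Total": (89.63, 68.55, 61.44),
    "Alabama": (92.60, 71.36, 66.08),
    "Nevada": (79.89, 63.05, 50.37),
}

for name, (screening, interview, published) in rows.items():
    computed = overall_response_rate(screening, interview)
    print(f"{name}: computed {computed:.2f}, published {published:.2f}")
```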

Table E.19 2000 NHSDA Weighted Screening and Interview Response Rates, by State

State	Screening Response Rate	Interview Response Rate	Overall Response Rate	State	Screening Response Rate	Interview Response Rate	Overall Response Rate
Total	92.84	73.93	68.64	Missouri	92.25	70.80	65.31
Alabama	95.50	77.98	74.47	Montana	94.91	80.21	76.13
Alaska	95.43	80.24	76.58	Nebraska	93.13	74.58	69.46
Arizona	92.99	73.78	68.61	Nevada	92.08	74.44	68.54
Arkansas	97.19	81.00	78.73	New Hampshire	92.41	75.12	69.42
California	90.99	69.50	63.24	New Jersey	91.96	66.56	61.21
Colorado	94.84	75.26	71.37	New Mexico	97.43	80.80	78.72
Connecticut	89.83	71.36	64.10	New York	88.78	73.73	65.46
Delaware	92.91	68.25	63.42	North Carolina	94.51	73.19	69.17
District of Columbia	93.50	85.56	80.00	North Dakota	94.43	79.46	75.03
Florida	94.64	75.73	71.67	Ohio	94.89	75.79	71.92
Georgia	92.95	69.76	64.84	Oklahoma	93.06	74.85	69.66
Hawaii	91.95	78.45	72.14	Oregon	91.87	73.91	67.90
Idaho	93.94	74.45	69.94	Pennsylvania	94.37	73.50	69.36
Illinois	88.71	65.59	58.19	Rhode Island	91.26	74.11	67.63
Indiana	92.62	73.87	68.42	South Carolina	94.69	77.84	73.71
Iowa	94.78	80.00	75.83	South Dakota	95.15	76.67	72.95
Kansas	92.28	73.45	67.79	Tennessee	90.25	72.45	65.39
Kentucky	95.79	84.14	80.59	Texas	94.72	78.12	74.00
Louisiana	95.04	80.81	76.80	Utah	95.11	83.44	79.36
Maine	92.39	78.46	72.49	Vermont	92.62	80.80	74.83
Maryland	94.88	76.88	72.94	Virginia	91.44	75.18	68.75
Massachusetts	89.77	66.45	59.65	Washington	93.59	75.45	70.61
Michigan	93.19	73.18	68.20	West Virginia	95.19	78.17	74.41
Minnesota	94.66	80.62	76.32	Wisconsin	94.33	75.06	70.81
Mississippi	93.60	79.14	74.07	Wyoming	95.41	76.61	73.09

Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 2000.

Table E.20 2001 NHSDA Weighted Screening and Interview Response Rates, by State

State	Screening Response Rate	Interview Response Rate	Overall Response Rate	State	Screening Response Rate	Interview Response Rate	Overall Response Rate
Total	91.86	73.31	67.34	Missouri	93.12	78.34	72.95
Alabama	92.20	73.31	67.59	Montana	95.08	77.50	73.68
Alaska	96.03	79.62	76.46	Nebraska	94.04	76.47	71.91
Arizona	93.50	76.41	71.44	Nevada	95.32	75.37	71.84
Arkansas	96.70	75.36	72.88	New Hampshire	92.35	76.00	70.18
California	92.46	71.83	66.42	New Jersey	87.52	70.28	61.51
Colorado	94.78	70.64	66.95	New Mexico	97.07	80.81	78.45
Connecticut	92.16	69.79	64.32	New York	84.33	68.67	57.91
Delaware	92.03	69.07	63.57	North Carolina	92.76	72.11	66.89
District of Columbia	86.40	78.30	67.65	North Dakota	94.38	77.62	73.25
Florida	91.15	72.34	65.94	Ohio	93.46	76.51	71.51
Georgia	91.53	70.84	64.84	Oklahoma	93.07	74.69	69.51
Hawaii	91.13	68.17	62.12	Oregon	93.40	77.36	72.25
Idaho	93.83	76.75	72.01	Pennsylvania	93.65	74.97	70.21
Illinois	85.85	64.39	55.28	Rhode Island	90.97	69.70	63.41
Indiana	92.29	69.68	64.31	South Carolina	94.46	71.52	67.55
Iowa	94.00	77.52	72.87	South Dakota	94.13	80.36	75.64
Kansas	94.35	77.32	72.96	Tennessee	94.37	74.43	70.24
Kentucky	94.76	76.62	72.61	Texas	93.00	77.77	72.33
Louisiana	94.47	74.21	70.11	Utah	96.19	80.23	77.18
Maine	90.69	84.36	76.51	Vermont	93.00	80.29	74.67
Maryland	92.45	79.19	73.21	Virginia	91.50	75.20	68.81
Massachusetts	89.99	67.51	60.76	Washington	93.67	74.07	69.38
Michigan	91.28	73.71	67.28	West Virginia	94.34	70.06	66.10
Minnesota	93.10	79.88	74.36	Wisconsin	92.85	70.98	65.91
Mississippi	95.62	73.73	70.50	Wyoming	94.44	76.73	72.46

Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 2001.
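
The overall response rates in Table E.20 appear to be the product of the weighted screening and interview response rates. The sketch below checks that relationship for a few rows copied from the table. The function and variable names are illustrative, not part of the NHSDA documentation, and the product rule is inferred from the tabulated values rather than stated here. Because the published rates are rounded to two decimals, the check allows a small tolerance.

```python
# Weighted response rates (percent) copied from Table E.20:
# state -> (screening, interview, overall as published)
TABLE_E20 = {
    "Total": (91.86, 73.31, 67.34),
    "Illinois": (85.85, 64.39, 55.28),
    "New Mexico": (97.07, 80.81, 78.45),
    "New York": (84.33, 68.67, 57.91),
}


def overall_rate(screening_pct, interview_pct):
    """Product of the two stage rates, expressed as a percentage.

    Assumption: overall = screening x interview, inferred from the table.
    """
    return screening_pct * interview_pct / 100.0


def max_discrepancy(rows):
    """Largest absolute gap between the product and the published overall rate."""
    return max(abs(overall_rate(s, i) - o) for s, i, o in rows.values())


if __name__ == "__main__":
    for state, (s, i, o) in TABLE_E20.items():
        print(f"{state}: product={overall_rate(s, i):.2f} published={o:.2f}")
    print(f"max discrepancy: {max_discrepancy(TABLE_E20):.4f}")
```

Gaps of about 0.01 percentage point (New Mexico, for example) are what one would expect if the published overall rate was computed from unrounded stage rates.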

Table E.21 Total Number of Respondents in the Incentive Experiment, by State, for 2001

State	$0	$20	$40	State	$0	$20	$40
Total	4,233	2,489	2,878	Missouri	50	31	40
Alabama	79	45	53	Montana	65	38	69
Alaska	18	10	9	Nebraska	74	23	38
Arizona	63	41	22	Nevada	51	29	75
Arkansas	29	24	10	New Hampshire	91	67	44
California	144	94	93	New Jersey	86	29	30
Colorado	63	54	37	New Mexico	122	25	65
Connecticut	136	66	115	New York	336	209	224
Delaware	120	62	60	North Carolina	26	21	9
District of Columbia	80	54	35	North Dakota	22	17	11
Florida	216	93	142	Ohio	208	106	176
Georgia	28	8	17	Oklahoma	74	58	50
Hawaii	5	11	1	Oregon	68	46	68
Idaho	39	28	23	Pennsylvania	196	103	119
Illinois	313	209	233	Rhode Island	80	48	35
Indiana	7	8	17	South Carolina	71	58	48
Iowa	49	31	29	South Dakota	35	31	41
Kansas	76	42	77	Tennessee	35	36	74
Kentucky	43	25	32	Texas	203	133	90
Louisiana	49	20	17	Utah	80	40	54
Maine	103	42	41	Vermont	21	10	10
Maryland	19	8	15	Virginia	0	0	0
Massachusetts	96	50	55	Washington	75	65	66
Michigan	187	109	157	West Virginia	49	28	39
Minnesota	53	36	24	Wisconsin	0	0	0
Mississippi	43	21	29	Wyoming	57	47	60

Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 2001.

Table E.22 Total Number of Respondents, by State, for 1999, 2000, and 2001

State	1999	2000	2001	State	1999	2000	2001
Total	66,706	71,764	68,929	Missouri	903	893	882
Alabama	826	936	885	Montana	899	914	896
Alaska	879	833	951	Nebraska	847	906	920
Arizona	824	927	964	Nevada	756	925	944
Arkansas	926	960	911	New Hampshire	791	883	913
California	4,681	5,022	3,729	New Jersey	933	1,200	1,069
Colorado	865	911	886	New Mexico	830	874	872
Connecticut	768	891	1,055	New York	2,669	3,589	4,023
Delaware	883	928	893	North Carolina	1,167	1,043	852
District of Columbia	776	918	877	North Dakota	951	896	883
Florida	3,096	3,478	3,502	Ohio	3,234	3,678	3,706
Georgia	1,164	1,145	940	Oklahoma	858	973	862
Hawaii	895	945	887	Oregon	915	864	880
Idaho	943	894	936	Pennsylvania	3,460	3,997	3,734
Illinois	3,201	3,660	3,558	Rhode Island	789	950	895
Indiana	1,044	1,061	915	South Carolina	832	855	891
Iowa	907	921	961	South Dakota	936	855	931
Kansas	886	897	922	Tennessee	938	947	921
Kentucky	969	1,018	911	Texas	3,951	4,020	3,604
Louisiana	934	939	909	Utah	1,280	1,031	895
Maine	856	901	896	Vermont	802	981	926
Maryland	887	967	961	Virginia	946	1,047	929
Massachusetts	762	1,002	933	Washington	1,070	1,006	911
Michigan	3,109	3,576	3,768	West Virginia	910	950	876
Minnesota	1,019	893	883	Wisconsin	1,066	1,119	883
Mississippi	955	917	885	Wyoming	918	828	913

Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.
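
Because the State estimates in this report are 2-year moving averages, the respondent counts behind each estimate come from two adjacent survey years. The sketch below adds the yearly counts from Table E.22 to give nominal pooled counts for 1999–2000 and 2000–2001. The names are illustrative, and these sums are raw respondent totals only: they are not the effective sample sizes implied by the survey weights or by the hierarchical Bayes model.

```python
# Respondent counts copied from Table E.22: state -> (1999, 2000, 2001)
RESPONDENTS = {
    "Total": (66706, 71764, 68929),
    "California": (4681, 5022, 3729),
    "Wyoming": (918, 828, 913),
}


def pooled_counts(y1999, y2000, y2001):
    """Nominal respondent counts for the two overlapping 2-year periods."""
    return {"1999-2000": y1999 + y2000, "2000-2001": y2000 + y2001}


if __name__ == "__main__":
    for state, counts in RESPONDENTS.items():
        print(state, pooled_counts(*counts))
```

The 2000 respondents contribute to both periods, which is why estimates for the two periods are not statistically independent.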

¹ The panel included William Bell of the U.S. Bureau of the Census; Partha Lahiri of the Joint Program in Survey Methodology, Interim Director of the University of Maryland Statistics Consortium; Balgobin Nandram of Worcester Polytechnic Institute; Wesley Schaible, formerly Associate Commissioner for Research and Evaluation at the Bureau of Labor Statistics; J.N.K. Rao of Carleton University; and Alan Zaslavsky of Harvard University. Other attendees involved in the development or discussion were Ralph Folsom, Judith Lessler, Avinash Singh, and Akhil Vaish of RTI and Joe Gfroerer and Doug Wright of SAMHSA.

