Appendix B: Statistical Methods and Limitations of the Data: National Household Survey on Drug Abuse, 2000

An important limitation of the NHSDA estimates of drug use prevalence is that they are only designed to describe the target population of the survey, i.e., the civilian noninstitutionalized population aged 12 and older. Although this population includes almost 98% of the total U.S. population aged 12 and older, it does exclude some important and unique subpopulations who may have very different drug-using patterns. The survey excludes active military personnel, who have been shown to have significantly lower rates of illicit drug use. Persons living in institutional group quarters, such as prisons and residential drug treatment centers, are not included in the NHSDA and have been shown in other surveys to have higher rates of illicit drug use. Also excluded are homeless persons not living in a shelter on the survey date, another population shown to have higher than average rates of illicit drug use. Appendix C describes other surveys that provide data for these populations.

The sampling error of an estimate is the error caused by the selection of a sample instead of conducting a census of the population. Sampling error is reduced by selecting a large sample and by using efficient sample design and estimation strategies such as stratification, optimal allocation, and ratio estimation.

With the use of probability sampling methods in the NHSDA, it is possible to develop estimates of sampling error from the survey data. These estimates have been calculated for all prevalence estimates presented in this report using a Taylor series linearization approach that takes into account the effects of the complex NHSDA design features. The sampling errors are used to identify unreliable estimates and to test for the statistical significance of differences between estimates.

Estimates of proportions, such as drug use prevalence rates, take the form of nonlinear statistics whose variances cannot be expressed in closed form. Variance estimation for nonlinear statistics is performed using a first-order Taylor series approximation in RTI's SUDAAN software package. The approximation is unbiased for sufficiently large samples and has proven to be at least as accurate as, and less costly to implement than, competitors such as balanced repeated replication or jackknife methods (Rao and Wu, 1985).
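As a rough illustration of the linearization idea, the sketch below estimates the variance of a weighted proportion by replacing the ratio with its first-order Taylor expansion. It assumes a single-stage, with-replacement sample with no strata or clusters, which is far simpler than the NHSDA design that SUDAAN handles; the function name and the toy data are hypothetical.

```python
from typing import Sequence, Tuple


def linearized_proportion_variance(
    y: Sequence[int], w: Sequence[float]
) -> Tuple[float, float]:
    """Return (p_hat, var_hat) for a weighted proportion.

    Simplifying assumption: units are sampled independently with
    replacement, with no stratification or clustering.
    """
    n = len(y)
    total_w = sum(w)
    p_hat = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    # Linearized values from the first-order Taylor expansion of
    # the ratio sum(w * y) / sum(w) around its estimated value.
    z = [wi * (yi - p_hat) / total_w for wi, yi in zip(w, y)]
    z_bar = sum(z) / n
    var_hat = n / (n - 1) * sum((zi - z_bar) ** 2 for zi in z)
    return p_hat, var_hat


# Toy data: four respondents with equal weights, one reporting use.
p_hat, var_hat = linearized_proportion_variance([1, 0, 0, 0], [1.0] * 4)
```

With equal weights the result reduces to p(1 - p)/(n - 1), which is a quick way to sanity-check the sketch.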

Corresponding to proportion estimates, $\hat{p}_d$, the number of drug users, $Y_d$, can be estimated as

$$\hat{Y}_d = \hat{N}_d \, \hat{p}_d ,$$

where $\hat{N}_d$ is the estimated population total for domain d, and $\hat{p}_d$ is the estimated proportion for domain d. The standard error for the total estimate is obtained by multiplying the standard error of the proportion by $\hat{N}_d$, i.e.,

$$\mathrm{se}(\hat{Y}_d) = \hat{N}_d \, \mathrm{se}(\hat{p}_d) .$$

This approach is theoretically correct when the domain size estimates $\hat{N}_d$ are among those forced to Census Bureau population projections through the weight calibration process. In these cases, $\hat{N}_d$ is clearly not subject to sampling error.

For domain totals $Y_d$ where $\hat{N}_d$ is not fixed, this formulation may still provide a good approximation if we can reasonably assume that the sampling variation in $\hat{N}_d$ is negligible relative to the sampling variation in $\hat{p}_d$. In most analyses conducted for prior years, this has been a reasonable assumption.

For some of the tables produced from the 2000 data, it was clear that the above approach yielded an underestimate of the variance of a total because the domain size estimate, $\hat{N}_d$, was subject to considerable variation. In these cases, a different method was used to estimate variances. SUDAAN provides an option to directly estimate the variance of the linear statistic that estimates a population total. Using this option did not affect the standard error estimates for the corresponding proportions presented in the same sets of tables.
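A small numeric sketch of the two cases follows. The first function applies the multiplication rule described above for a fixed domain size; the second adds a term for variation in the domain size using a delta-method approximation that assumes the estimated proportion and the estimated domain size are uncorrelated. That second formula is an illustrative assumption, not the direct SUDAAN estimator used for the report, and all input values are hypothetical.

```python
import math


def se_total_fixed_n(n_hat: float, se_p: float) -> float:
    """SE of a total when the domain size is fixed by weight calibration."""
    return n_hat * se_p


def se_total_variable_n(
    n_hat: float, p_hat: float, se_p: float, se_n: float
) -> float:
    """Delta-method SE of n_hat * p_hat, assuming zero covariance."""
    return math.sqrt((n_hat * se_p) ** 2 + (p_hat * se_n) ** 2)


# Hypothetical domain: 2,000,000 persons, 10% prevalence,
# SE of 1 percentage point, and an SE of 150,000 on the domain size.
fixed = se_total_fixed_n(2_000_000, 0.01)
variable = se_total_variable_n(2_000_000, 0.10, 0.01, 150_000)
```

In this hypothetical case the fixed-size formula gives about 20,000 while the version that allows the domain size to vary gives about 25,000, which shows how treating a variable domain size as fixed understates the standard error.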

As was done in the past, direct survey estimates considered to be unreliable due to unacceptably large sampling errors are not shown in this report and are noted by asterisks (*) in the appendix tables that contain such estimates. The criterion used for suppressing all direct survey estimates was based on the relative standard error (rse), which is defined as the ratio of the standard error (se) to the estimate.

Proportion estimates (p) within the range 0 < p < 1, rates, and the corresponding estimated numbers of users were suppressed if:

$$\mathrm{rse}[-\ln(p)] > 0.175 \ \text{when } p \le 0.5, \quad \text{or} \quad \mathrm{rse}[-\ln(1-p)] > 0.175 \ \text{when } p > 0.5 .$$

Using a first-order Taylor series approximation to estimate rse[-ln(p)] and rse[-ln(1-p)], we have the following, which was used for computational purposes:

$$\frac{\mathrm{se}(p)/p}{-\ln(p)} > 0.175 \ \text{when } p \le 0.5, \quad \text{or} \quad \frac{\mathrm{se}(p)/(1-p)}{-\ln(1-p)} > 0.175 \ \text{when } p > 0.5 .$$

The separate formulae for p ≤ 0.5 and p > 0.5 produce a symmetric suppression rule; that is, if p is suppressed, then so is 1 - p. This is an ad hoc rule that requires an effective sample size in excess of 50. When 0.05 < p < 0.95, the symmetric properties of the rule produce a local maximum effective sample size of 68 at p = 0.5. Thus, estimates near p = 0.5 with effective sample sizes below 68 are suppressed. A local minimum effective sample size of 50 occurs at p = 0.2 and again at p = 0.8 within this same interval, so estimates near these values of p are suppressed when their effective sample sizes fall below 50.

In previous NHSDA surveys, these varying sample size restrictions sometimes produced unusual occurrences of suppression for a particular combination of prevalence rates. For example, in some cases, lifetime prevalence rates near p = 0.5 were suppressed (effective sample size was less than 68 but greater than 50), while the corresponding past year or past month estimates near p = 0.2 were not suppressed (effective sample sizes were greater than 50). To reduce the occurrence of this type of inconsistency, a minimum effective sample size of 68 was added to the suppression criteria in the 2000 NHSDA. As p approaches 0.00 or 1.00 outside the interval (0.05, 0.95), the suppression criteria will still require increasingly larger effective sample sizes. For example, if p = 0.01 and 0.001, the effective sample size must exceed 152 and 684, respectively.
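The effective sample sizes quoted above can be reproduced numerically. The sketch below assumes a suppression threshold of 0.175 on the relative standard error of -ln(p) and a binomial-type standard error, sqrt(p(1 - p)/n_eff); the function name and that simplification are assumptions made for illustration.

```python
import math


def min_effective_n(p: float, threshold: float = 0.175) -> float:
    """Effective sample size below which the log-based rule suppresses p."""
    q = p if p <= 0.5 else 1.0 - p  # the rule is symmetric around 0.5
    # Solve sqrt((1 - q) / (q * n)) / (-ln q) = threshold for n.
    return (1.0 - q) / (q * (threshold * math.log(q)) ** 2)


for p in (0.5, 0.2, 0.01, 0.001):
    print(p, round(min_effective_n(p)))
```

Under these assumptions the four values round to 68, 50, 152, and 684, matching the figures in the text.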

Also new to the 2000 survey is a minimum nominal sample size suppression criterion (n = 100) that protects against unreliable estimates caused by small design effects and small nominal sample sizes. Prevalence estimates are also suppressed if they are close to zero or 100 percent (i.e., if p < 0.00005 or if p > 0.99995).

Estimates of other totals (e.g., number of initiates), along with means and rates that are not bounded between 0 and 1, are suppressed if:

$$\mathrm{rse}(\text{estimate}) > 0.5 .$$

Additionally, estimates of mean age of first use were suppressed if the sample size was smaller than 10 respondents; also, the estimated incidence rate and number of initiates were suppressed if they rounded to 0.

The suppression criteria for various NHSDA estimates are summarized in Table B.1 below.


Table B.1. Summary of 2000 NHSDA Suppression Rules
Estimate	Suppress if:
Prevalence rate, p, with nominal sample size, n, and design effect, deff	(1) p < 0.00005 or p > 0.99995; or (2) $\frac{\mathrm{se}(p)/p}{-\ln(p)} > 0.175$ when p ≤ 0.5, or $\frac{\mathrm{se}(p)/(1-p)}{-\ln(1-p)} > 0.175$ when p > 0.5; or (3) effective n < 68; or (4) n < 100; where effective n = n/deff. Note: The rounding portion of this suppression rule for prevalence rates will produce some estimates that round at one decimal place to 0.0% or 100.0% but are not suppressed from the tables.
Estimated number (numerator of p)	The estimated prevalence rate, p, is suppressed. Note: In some instances when p is not suppressed, the estimated number may appear as a 0 in the tables; this means that the estimate is greater than 0 but less than 500 (estimated numbers are shown in thousands).
Mean age at first use, with nominal sample size, n	rse > 0.5, or n < 10
Incidence rate	Rounds to less than 0.1 per thousand person-years of exposure, or rse > 0.5
Number of initiates	Rounds to fewer than 1,000 initiates, or rse > 0.5
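
The prevalence-rate rules in Table B.1 can be combined into a single check. The following is a minimal sketch, not the production code: the function name is hypothetical, the 0.175 threshold is assumed for the log-based criterion, and effective n is taken to be the nominal sample size divided by the design effect.

```python
import math


def suppress_prevalence(p: float, se: float, n: int, deff: float) -> bool:
    """Return True if a prevalence estimate would be suppressed."""
    if p < 0.00005 or p > 0.99995:
        return True  # too close to 0 or 100 percent
    if n < 100 or n / deff < 68:
        return True  # nominal or effective sample size too small
    if p <= 0.5:
        rse_log = (se / p) / (-math.log(p))
    else:
        rse_log = (se / (1.0 - p)) / (-math.log(1.0 - p))
    return rse_log > 0.175


# Hypothetical estimates
print(suppress_prevalence(0.5, 0.05, 400, 2.0))   # reliable enough to show
print(suppress_prevalence(0.01, 0.009, 5000, 2.0))  # relative error too large
```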

This section describes the methods that were used to compare the prevalence estimates in this report. Customarily, the observed difference between estimates is evaluated in terms of its statistical significance. "Statistical significance" refers to the probability that a difference as large as that observed would occur due to random error in the estimates if there were no difference in the prevalence rates for the population groups being compared. The significance of observed differences in this report is generally reported at the 0.05 and 0.01 levels. When making comparisons between the 1999 and 2000 prevalence estimates, one can test the null hypothesis (no difference in the 1999 and 2000 prevalence rates) against the alternative hypothesis (there is a difference in prevalence rates) using the standard difference in proportions test expressed as

$$Z = \frac{p_1 - p_2}{\sqrt{\mathrm{var}(p_1) + \mathrm{var}(p_2) - 2\,\mathrm{cov}(p_1, p_2)}} ,$$

where $p_1$ = 1999 estimate, $p_2$ = 2000 estimate, $\mathrm{var}(p_1)$ = variance of 1999 estimate, $\mathrm{var}(p_2)$ = variance of 2000 estimate, and $\mathrm{cov}(p_1, p_2)$ = covariance between $p_1$ and $p_2$.

Under the null hypothesis, Z is asymptotically distributed as a normal random variable. Calculated values of Z can therefore be referred to the unit normal distribution to determine the corresponding probability level (i.e., p-value). Since there is a 50 percent overlap in the sampled segments between the 1999 and 2000 NHSDAs, the covariance term in the formula for Z will, in general, be greater than zero. Estimates of Z along with its p-value were calculated with SUDAAN software from RTI (Research Triangle Institute), using the analysis weights and accounting for the sample design as described in Appendix A. A similar procedure and formula for Z are used for estimated totals.

When making comparisons of estimates for different population subgroups from the same data year, the covariance term, which is usually small and positive, was ignored. Dropping a positive covariance overstates the variance of the difference, so this results in somewhat conservative tests of hypotheses that sometimes fail to establish statistical significance when in fact it exists.
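For readers who want to reproduce the test by hand, the sketch below computes Z and a two-sided p-value from two estimates and their variances. It is a simplified illustration, not the SUDAAN procedure: the function name and input values are hypothetical, and the covariance defaults to zero, as in the same-year subgroup comparisons described above.

```python
import math


def z_test(p1: float, p2: float, var1: float, var2: float, cov: float = 0.0):
    """Difference-in-proportions test; returns (Z, two-sided p-value)."""
    z = (p1 - p2) / math.sqrt(var1 + var2 - 2.0 * cov)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # unit normal, two-sided
    return z, p_value


# Hypothetical estimates of 6.3% and 7.0%, each with a standard error of 0.2%
z_no_cov, p_no_cov = z_test(0.063, 0.070, 0.002 ** 2, 0.002 ** 2)
# Same estimates with a small positive covariance between them
z_cov, p_cov = z_test(0.063, 0.070, 0.002 ** 2, 0.002 ** 2, cov=1e-6)
```

Including the positive covariance shrinks the denominator and yields a larger absolute Z, which is why ignoring it makes the test conservative.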

Nonsampling errors can occur from nonresponse, coding errors, computer processing errors, errors in the sampling frame, reporting errors, and other errors not due to sampling. Nonsampling errors are reduced through data editing, statistical adjustments for nonresponse, close monitoring and periodic retraining of interviewers, and improvement in various quality control procedures.

Although nonsampling errors can often be much larger than sampling errors, measurement of most nonsampling errors is difficult or impossible. However, some indication of the effects of some types of nonsampling errors can be obtained through proxy measures such as response rates and from other research studies.

Response rates for the NHSDA were stable for the period of 1994-1998, with the screening response rate at about 93% and the interview response rate at about 78% (response rates discussed in this Appendix are weighted). In 1999, the CAI screening response rate was 89.6% and the interview response rate was 68.6%. A more stable and experienced field interviewer workforce improved these rates in 2000. Of the 182,576 eligible households sampled for the 2000 NHSDA main study, 169,769 were successfully screened, for a weighted screening response rate of 92.8% (Table B.2). In these screened households, a total of 91,961 sample persons were selected, and completed interviews were obtained from 71,764 of these sample persons, for a weighted interview response rate of 73.9%. A total of 10,109 (15.0%) sample persons were classified as refusals, 4,834 (5.5%) were not available or never at home, and 5,254 (5.5%) did not participate for various other reasons, such as physical or mental incompetence or language barrier (Table B.3). Tables B.4 and B.5 show the distribution of the selected sample by interview code and age group. The weighted interview response rate was highest among 12 to 17 year olds (82.6%), females (75.1%), blacks and Hispanics (76.2% and 78.0%, respectively), persons in nonmetropolitan areas (77.6%), and persons residing in the South (76.4%) (Table B.6).

The increase in nonresponse between the 1998 and 1999 NHSDAs can be attributed primarily to the hiring of many new and inexperienced Field Interviewers in 1999 and to larger than usual turnover. By the end of 2000, the interviewer workforce consisted primarily of experienced interviewers, and fewer were leaving for other jobs. In 1999, 1,997 Field Interviewers were hired and trained to conduct the computer-assisted interviewing (CAI) and paper-and-pencil interviewing (PAPI) surveys. More than a third of them (37.7%) did not complete the survey year. In 2000, the number of trained interviewers decreased to 1,356 (since only CAI interviews were conducted in 2000), and the attrition rate dropped to 29.8%. Both prior NHSDA experience and on-the-job experience were shown to be related to nonresponse. Previously experienced interviewers and interviewers with one, two, or three quarters of on-the-job experience were more successful at obtaining an interview.

The overall weighted response rate, defined as the product of the weighted screening response rate and weighted interview response rate, was 61.5% in 1999 and 68.6% in 2000 (an 11.5 percent improvement over the 1999 rate). Nonresponse bias can be expressed as the product of the nonresponse rate (1 - R) and the difference in the characteristic of interest between respondents and nonrespondents in the population (P_r - P_nr). Thus, assuming the quantity (P_r - P_nr) is fixed over time, the improvement in response rates in 2000 will result in estimates with lower nonresponse bias.
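The arithmetic in this paragraph can be checked directly. The sketch below uses the rounded weighted rates quoted in this appendix and the standard deterministic expression for nonresponse bias, the nonresponse rate (1 - R) times (P_r - P_nr). The difference between respondents and nonrespondents is unknown in practice, so the value used here is purely hypothetical, as are the function names.

```python
def overall_response_rate(screening: float, interview: float) -> float:
    """Overall rate as the product of screening and interview rates."""
    return screening * interview


def nonresponse_bias(response_rate: float, diff: float) -> float:
    """Bias = nonresponse rate (1 - R) times (P_r - P_nr)."""
    return (1.0 - response_rate) * diff


r_1999 = overall_response_rate(0.896, 0.686)
r_2000 = overall_response_rate(0.928, 0.739)

# Assumed for illustration: respondents' prevalence is 2 percentage
# points below that of nonrespondents in both years.
hypothetical_diff = -0.02
bias_1999 = nonresponse_bias(r_1999, hypothetical_diff)
bias_2000 = nonresponse_bias(r_2000, hypothetical_diff)
```

With the same assumed difference in both years, the higher 2000 response rate gives a bias that is smaller in absolute value.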

Among survey participants, item response rates were above 98% for most questionnaire items. However, inconsistent responses for some items, including the drug use items, are common. Estimates of substance use from the NHSDA are based on the responses to multiple questions by respondents, so that the maximum amount of information is used in determining whether a respondent is classified as a drug user. Inconsistencies in responses are resolved through a logical editing process that involves some judgment on the part of survey analysts and is a potential source of nonsampling error. Because of the automatic routing through the CAI questionnaire (e.g., lifetime drug use questions which skip entire modules when answered "no"), there is less editing of this type than in the PAPI questionnaire used in previous years.

Table B.2. Weighted Percent and Sample Size for 1999 and 2000 NHSDAs by Screening Result Code


Screening Result	1999 NHSDA		2000 NHSDA
Screening Result	Sample Size	Weighted Percent	Sample Size	Weighted Percent
Total Sample	223,868	100.00	215,860	100.00
Ineligible Cases	36,026	15.78	33,284	15.09
Eligible Cases	187,842	84.22	182,576	84.91
Ineligibles	36,026	100.00	33,284	100.00
Vacant	18,034	49.71	16,796	50.76
Not a Primary Residence	4,516	12.90	4,506	13.26
Not a Dwelling Unit	4,626	12.70	3,173	9.33
All Military Personnel	482	1.22	414	1.21
Other, Ineligible	8,368	23.46	8,395	25.43
Eligible Cases	187,842	100.00	182,576	100.00
Screening Complete	169,166	89.63	169,769	92.84
No One Selected	101,537	54.19	99,999	55.36
One Selected	44,436	23.63	46,981	25.46
Two Selected	23,193	11.82	22,789	12.03
Screening Not Complete	18,676	10.37	12,807	7.16
No One Home	4,291	2.38	3,238	1.82
Respondent Unavailable	651	0.36	415	0.24
Physically or Mentally Incompetent	419	0.24	310	0.16
Language Barrier - Hispanic	102	0.06	83	0.05
Language Barrier - Other	486	0.28	434	0.27
Refusal	11,097	5.92	7,535	4.14
Other, Access Denied	1,536	1.08	748	0.45
Other, Eligible	38	0.02	7	0.00
Other, Problem Case	56	0.03	37	0.02

Table B.3. Weighted Percent and Sample Sizes for 1999 and 2000 NHSDA by Final Interview Code Among Persons Aged 12 or Older


Final Interview Code	1999 NHSDA		2000 NHSDA
Final Interview Code	Sample Size	Weighted Percent	Sample Size	Weighted Percent
Total Selected Persons	89,883	100.00	91,961	100.00
Interview Complete	66,706	68.55	71,764	73.93
No One at Dwelling Unit	1,795	2.13	1,776	2.02
Respondent Unavailable	3,897	4.53	3,058	3.52
Break-Off	50	0.07	72	0.09
Physically/Mentally Incompetent	1,017	2.62	1,053	2.57
Language Barrier - Spanish	168	0.12	109	0.08
Language Barrier - other	480	1.46	441	1.06
Refusal	11,276	17.98	10,109	14.99
Parental Refusal	2,888	1.01	2,655	0.88
Other	1,606	1.53	924	0.86

Table B.4. Weighted Percent and Sample Sizes for 1999 and 2000 NHSDA by Final Interview Code Among Persons Aged 12 to 17


Final Interview Code	1999 NHSDA		2000 NHSDA
Final Interview Code	Sample Size	Weighted Percent	Sample Size	Weighted Percent
Total Selected Persons	32,011	100.00	31,242	100.00
Interview Complete	25,384	78.07	25,756	82.58
No One at Dwelling Unit	322	1.09	278	0.86
Respondent Unavailable	872	3.04	617	2.05
Break-Off	13	0.03	18	0.05
Physically/Mentally Incompetent	244	0.76	234	0.76
Language Barrier - Spanish	15	0.03	10	0.03
Language Barrier - other	58	0.18	50	0.20
Refusal	1,808	5.97	1,455	4.52
Parental Refusal	2,885	9.50	2,641	8.35
Other	410	1.33	183	0.59

Table B.5. Weighted Percent and Sample Size for 1999 and 2000 NHSDA by Final Interview Code Among Persons Aged 18 or Older


Final Interview Code	1999 NHSDA		2000 NHSDA
Final Interview Code	Sample Size	Weighted Percent	Sample Size	Weighted Percent
Total Selected Persons	57,872	100.00	60,719	100.00
Interview Complete	41,322	67.41	46,008	72.92
No One at Dwelling Unit	1,473	2.25	1,498	2.16
Respondent Unavailable	3,025	4.71	2,441	3.69
Break-Off	37	0.07	54	0.09
Physically/Mentally Incompetent	773	2.85	819	2.78
Language Barrier - Spanish	153	0.13	99	0.09
Language Barrier - other	422	1.62	391	1.16
Refusal	9,468	19.41	8,654	16.22
Parental Refusal	3	0.00	14	0.01
Other	1,196	1.55	741	0.89

Table B.6. Response Rates and Sample Sizes for the 1999 and 2000 NHSDAs by Demographic Characteristics


	1999 NHSDA			2000 NHSDA
	Selected Persons	Completed Interviews	Weighted Response Rate	Selected Persons	Completed Interviews	Weighted Response Rate
Total	89,883	66,706	68.55%	91,961	71,764	73.93%
Age
12-17	32,011	25,384	78.07%	31,242	25,756	82.58%
18-25	30,439	22,151	71.21%	29,424	22,849	77.34%
26 or Older	27,433	19,171	66.76%	31,295	23,159	72.17%
Gender
Male	43,883	31,987	67.12%	44,899	34,375	72.68%
Female	46,000	34,719	69.81%	47,062	37,389	75.09%
Race/Ethnicity
Hispanic	11,203	8,755	74.59%	11,454	9,396	77.95%
Non-Hispanic, White	63,211	46,272	67.98%	64,517	49,631	73.39%
Non-Hispanic, Black	10,552	8,044	70.39%	10,740	8,638	76.19%
Non-Hispanic, All Other Races	4,917	3,635	59.28%	5,250	4,099	67.31%
Region
Northeast	16,794	11,830	64.03%	18,959	14,394	71.68%
Midwest	24,885	18,103	69.63%	25,428	19,355	73.23%
South	27,390	21,018	70.93%	27,217	22,041	76.38%
West	20,814	15,755	67.47%	20,357	15,974	72.68%
County Type
Large Metro	36,101	25,901	65.15%	37,754	28,744	71.77%
Small Metro	30,642	22,612	69.98%	31,400	24,579	74.96%
Nonmetro	23,140	18,193	74.97%	22,807	18,441	77.58%

In addition, less logical editing is used with the CAI data because statistical imputation is relied upon more heavily to determine the final values of drug use variables in cases where logical editing could otherwise be used to make a determination. The combined amount of editing and imputation in the CAI data is still considerably less than the total amount used in prior PAPI surveys. For the 2000 CAI data, for example, 3.2% of the estimate of past month hallucinogen use is based on logically edited cases and 5.4% on imputed cases, for a combined amount of 8.6%. For the 1999 CAI data, 1.7% of the estimate of past month hallucinogen use is based on logically edited cases and 4.6% on imputed cases, for a combined amount of 6.2%. In the 1998 NHSDA (administered using PAPI), the amount of editing and imputation for past month hallucinogen use was 60% and 0%, respectively, for a total of 60%. The combined amount of editing and imputation for the estimate of past month heroin use is 5.0% for the 2000 CAI, 14.8% for the 1999 CAI, and 37.0% for the 1998 PAPI data.

While working on the 2000 NHSDA imputations, a programming error was discovered in the 1999 imputations of recency of use, frequency of use, and age at first use for several drugs. This error resulted in overestimates of past year and past month use of marijuana, inhalants, heroin, and alcohol. Thus, estimates such as past month any illicit drug use and use of any illicit drug other than marijuana were also affected. The error was limited to cases which did not have complete recency information, where it was necessary to maintain consistency between the 30-day frequency and 12-month frequency data during the imputation process. This error did not affect lifetime use measures. Because of the sequential nature of the imputation procedures (i.e., imputed values for a substance processed early are used subsequently in the imputation of data on other substances), it was necessary to reimpute recency of use, frequency of use, and age at first use measures for all substances. Rerunning the imputations for all substances provided the opportunity to employ several minor enhancements to the imputation procedure that had been developed for the 2000 data, thereby improving consistency between the 1999 and 2000 estimates. Due to these enhancements and the random nature of the imputation process, the revised 1999 substance use estimates are slightly different from those previously published for all substances. Below is a discussion of how the error was discovered and the corrective actions that were taken. More information about the statistical imputation procedures used in the NHSDA data can be found in Appendix A. A more complete discussion of the imputation error can be found in the 1999 NHSDA Methodological Resource Book, Section 4.

New quality control checks were instituted on the 2000 imputations of substance use variables. These checks were also applied to the 1999 data, revealing unusual imputation results for alcohol, marijuana, inhalant, and heroin use variables. Results showed that a large proportion of respondents who were known lifetime users, but had missing recency information, had been imputed to be past month and past year users. Further checking of computer programs involved in the imputation of these variables identified the error.

If a respondent is a past month user of one of these four substances, he or she should have values for frequency of use in the past month and in the past year. Legitimate values for users are 1 to 30 for past month frequency and 1 to 365 for past year frequency. (For the 12-month frequency, the variable that is actually used in the imputation of missing values is the proportion of the past year that the donor used a particular drug.) However, if the respondent is a user of a substance in the past year but not the past month, he or she would not have a value for the 30-day frequency of use variable. Moreover, respondents who did not use a substance in the past year would not have values for either of the frequency of use variables. Before the NHSDA imputation programs are run, the editing procedures assign "skip" codes for the frequency of use variables for these respondents for whom frequency information is not present: a "93" for the 30-day frequency variables and a "993" for the 12-month frequency variables.

For NHSDA respondents with missing values for certain key items (such as recency and frequency of substance use), the imputation procedure involves defining a "donor pool" which consists of respondents with complete data that can be "donated" to the respondents with missing data. This process is done within subgroups of users based on the amount of information that is known. For example, respondents with missing data on lifetime use of a substance draw from a donor pool that includes both users and nonusers, but respondents who are known to be lifetime users but have unknown recency draw from a donor pool of lifetime users, excluding the nonusers. For many of the substance use measures, the imputation is multivariate, meaning that a respondent with more than one item missing will receive imputed values for all those missing items from a single donor.

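The donor pool for respondents whose recency is not completely known should consist of respondents with a variety of values for recency and frequency of use, including skip codes for frequency of use where applicable. For example, if a respondent is a lifetime user of marijuana but past year and past month use information is missing, donors consist of the following possibilities: 1) past month users, who have valid values for both the 30-day and 12-month frequency variables; 2) past year but not past month users, who have a valid 12-month frequency and the skip code for the 30-day frequency; and 3) lifetime but not past year users, who have skip codes for both frequency variables.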

One of the constraints built into the imputation programs is to make sure that each respondent's 12-month frequency of use is greater than his or her 30-day frequency, provided he or she is a past month user. Thus, potential donors are checked to make sure that when their frequency-of-use information is donated to a respondent with missing data, it is consistent with pre-existing frequency-of-use data for that respondent. The error resulted from implementing this check across all potential donors, regardless of their recency of use. As a result, missing data values were incorrectly used in comparisons that were designed to work only with valid frequency-of-use values. Many potential donors who were past year but not past month users were excluded from the donor pool because their past year frequency was less than 93, the skip code for 30-day frequency of use. Even more significant, potential donors who were lifetime but not past year users were entirely excluded from the donor pool because the proportion of the past year that the donor used was correctly coded to a missing value for these cases. The donated 12-month frequency that was derived from this proportion was therefore also missing. These missing values were then compared with the past month frequency skip code (93) and determined to be smaller by the software used (SAS). The result of these donor pool restrictions was that for respondents who were known lifetime users of any of the four drugs but had missing information on recency of use, the imputation procedure applied a donor pool made up entirely of past year users, most of whom were past month users.
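The logic of the error can be illustrated with a short sketch. It is not the NHSDA imputation code: the record layout, field names, and helper functions are hypothetical, and SAS's treatment of a missing numeric value as smaller than any number is emulated explicitly. The recency-aware check is only one plausible way to apply constraints that depend on the donor's recency of use.

```python
from dataclasses import dataclass
from typing import Optional

SKIP_30_DAY = 93  # skip code for the 30-day frequency variable


@dataclass
class Donor:
    recency: str  # "past_month", "past_year", or "lifetime"
    freq_30: int  # 1-30, or 93 when skipped
    freq_12: Optional[int]  # 1-365, or None when missing


def sas_less_than(a: Optional[int], b: int) -> bool:
    """Emulate SAS ordering, where a missing value sorts below any number."""
    return True if a is None else a < b


def faulty_check(d: Donor) -> bool:
    # Bug: the comparison runs for every donor, so skip code 93 and
    # missing values are treated as if they were valid frequencies.
    return not sas_less_than(d.freq_12, d.freq_30)


def recency_aware_check(d: Donor) -> bool:
    # Only past month users have two valid frequencies to compare.
    if d.recency == "past_month":
        return not sas_less_than(d.freq_12, d.freq_30)
    return True


donors = [
    Donor("past_month", 10, 120),
    Donor("past_year", SKIP_30_DAY, 40),   # dropped by the bug: 40 < 93
    Donor("past_year", SKIP_30_DAY, 200),  # kept by the bug: 200 >= 93
    Donor("lifetime", SKIP_30_DAY, None),  # dropped by the bug: missing < 93
]
kept_faulty = [faulty_check(d) for d in donors]
kept_fixed = [recency_aware_check(d) for d in donors]
```

In this toy pool the faulty check retains only past year users, which mirrors the restriction described above, while the recency-aware check keeps all four donors.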

Table B.7. Comparison of Original And Revised Estimates of Percentages Reporting Past Year and Past Month Use of Illicit Drugs and Alcohol Among Persons Aged 12 or Older: 1999



	Past Year		Past Month
Drug	1999 Original	1999 Revised	1999 Original	1999 Revised
Any Illicit Drug¹	11.9	11.5	6.7	6.3
Marijuana and Hashish	8.9	8.6	5.1	4.7
Heroin	0.2	0.2	0.1	0.1
Inhalants	1.1	0.9	0.5	0.3
Any Illicit Drug Other Than Marijuana¹	6.3	6.1	2.9	2.7
Alcohol	62.6	62.3	47.3	46.4
Binge Use	-	-	20.2	20.2
Heavy Use	-	-	5.6	5.7

Table B.8. Comparison of Original And Revised Estimates of Percentages Reporting Past Year and Past Month Use of Illicit Drugs and Alcohol Among Persons Aged 12 to 17: 1999



	Past Year		Past Month
Drug	1999 Original	1999 Revised	1999 Original	1999 Revised
Any Illicit Drug¹	20.3	19.8	10.9	9.8
Marijuana and Hashish	14.4	14.2	7.7	7.2
Heroin	0.3	0.3	0.2	0.2
Inhalants	4.6	3.9	1.9	1.1
Any Illicit Drug Other Than Marijuana¹	12.0	11.6	5.3	4.5
Alcohol	34.9	34.1	18.6	16.5
Binge Use	-	-	10.9	10.1
Heavy Use	-	-	2.5	2.4

- Not available.
¹ Any Illicit Drug indicates use at least once of marijuana/hashish, cocaine (including crack), heroin, hallucinogens (including LSD and PCP), inhalants, or any prescription-type psychotherapeutic used nonmedically. Any Illicit Drug Other Than Marijuana indicates use at least once of any of these listed drugs, regardless of marijuana/hashish use; marijuana/hashish users who also have used any of the other listed drugs are included.
² Nonmedical use of any prescription-type pain reliever, tranquilizer, stimulant, or sedative; does not include over-the-counter drugs.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999 CAI.

In the revised programs for the multivariate imputation of recency and frequency of use, the consistency constraints that are applied depend upon the recency of use of the potential donor. Hence, donors who are past month users have one set of consistency constraints applied, past year but not past month users have another set, and lifetime but not past year users have yet another set.

Tables B.7 and B.8 present the 1999 estimates before the error was corrected (original) and after the correction (revised). These original estimates are presented in the 1999 NHSDA Summary of Findings Report (SAMHSA, 2000c); the revised 1999 estimates are included in this report. As expected, most revised estimates are lower than the original estimates. Measures with the most notable decrease were past year and past month use of inhalants, particularly among adolescents. For example, past year inhalant use among persons aged 12 to 17 decreased from 4.6 percent to 3.9 percent (Table B.8).

NHSDA estimates are based on self-reports of drug use, and their value depends on respondents' truthfulness and memory. Although many studies have generally established the validity of self-report data and the NHSDA procedures were designed to encourage honesty and recall, some degree of underreporting is assumed. No adjustment to NHSDA data is made to correct for this (Appendix D lists a number of references addressing the validity of self-reported drug use data). The methodology used in the NHSDA has been shown to produce more valid results than other self-report methods (e.g., by telephone) (Turner, Lessler, and Gfroerer 1992; Aquilino 1994). However, comparisons of NHSDA data with data from surveys conducted in classrooms suggest that underreporting of drug use by youth in their homes may be substantial (Gfroerer 1993; Gfroerer, Wright, and Kopstein 1997).

While the redesign has improved the NHSDA estimates of substance use prevalence, it also made it difficult to assess long-term trends. Because of the major differences between the CAI and PAPI methods, it is not appropriate to compare the 1999 or 2000 CAI estimates of substance use prevalence to earlier NHSDA estimates to assess changes over time in substance use. To assess trends, SAMHSA fielded a supplemental national sample employing the PAPI methodology in 1999. This sample of 13,809 persons employed a paper questionnaire that was identical to the one fielded in 1998. Weighting, editing, and imputation procedures were also conducted in a manner comparable to prior years' surveys.

In spite of the efforts taken to maintain total methodological comparability, analyses have suggested that the 1999 PAPI data are not comparable to earlier data. Investigations into possible problems related to data collection, response rates, Quarter 1 startup, weighting, editing and imputation were done to see if any procedural changes or errors may underlie the problem. While no technical problems or obvious causes associated with these factors have been discovered, one line of inquiry was to investigate possible interviewer experience effects. That study shows that respondents were more likely to report substance use in interviews conducted by inexperienced interviewers than by experienced interviewers. Differences were found in prevalence rates based on data collected by experienced and inexperienced interviewers. Because of the expansion of the sample, a significantly larger proportion of the interviewers in 1999 were inexperienced than in prior years. Also observed was a decline in substance use rates over time (within 1999) that seemed to be correlated only with the growing experience of interviewers.

The impact on prevalence estimates is large enough that comparisons of the 1999 PAPI estimates to estimates from earlier NHSDAs should not generally be included to describe long-term trends. However, based on analysis of statistical models that account for the effect of interviewer experience, adjustments to 1999 PAPI data (in the form of revised analysis weights) have been developed for a limited set of key trend measures of interest. Analysis of the CAI sample discussed in this Appendix indicates smaller interviewer experience effects.

In view of the large discrepancies between the distributions of the interviewer characteristics over the two years, the bounds on the poststratification adjustment factor had to be broadened to keep the same set of covariates in the model in addition to the new interviewer experience covariates. As a result, the realized design effect for the total sample increased from 3.01 to 5.77 because, on average, the adjusted weights were about twice as large as the original weights for the prior NHSDA experience interviewer data while being cut in half for data corresponding to interviewers with no prior NHSDA experience.
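The link between weight adjustments and the design effect can be seen with Kish's approximation for the effect of unequal weighting, n Σw² / (Σw)². This is only a sketch under stated assumptions: it captures the weighting component alone, not the full design effect that also reflects stratification and clustering, and the weights below are hypothetical rather than taken from the survey.

```python
from typing import Sequence


def kish_weighting_effect(weights: Sequence[float]) -> float:
    """Kish's approximate design effect due to unequal weights."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


equal = [1.0] * 100
# Hypothetical adjustment: double the weights of 20 cases, halve the other 80
adjusted = [2.0] * 20 + [0.5] * 80

effect_equal = kish_weighting_effect(equal)
effect_adjusted = kish_weighting_effect(adjusted)
```

Equal weights give a value of 1, and the doubled and halved weights in this example raise it to 1.5625, which illustrates why broadening the adjustment bounds increases the realized design effect.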

In the 1999 NHSDA Summary of Findings Report (SAMHSA, 2000c), it was reported that the large change in the distribution of experienced and inexperienced Field Interviewers (FI) between the 1998 and 1999 surveys was associated with unanticipated and unusually large increases in substance use rates for data collected using the paper and pencil interview (PAPI) method. The report also found that data collected from interviewers with prior NHSDA experience resulted in drug use rates that were significantly lower than rates based on data collected from interviewers with no prior NHSDA experience. As a result, the 1999 PAPI estimates presented in the above SAMHSA report were based on analysis weights that were adjusted to measures representing the 1998 FI experience distribution.

Along with fielding PAPI data, the 1999 NHSDA marked the beginning of the use of computer-assisted interviewing (CAI) methods to solicit data from over 66,000 respondents in 50 states and the District of Columbia that year. This section will focus on the analysis of 1999 and 2000 CAI data to determine the impact of FI experience on drug use estimates (PAPI data were not collected in 2000). Overall, it was found these interviewer effects still remain although not as pronounced as found in the PAPI data. Based on these findings, it was not necessary to adjust the CAI analysis weights as was done with the 1999 PAPI data.

Similar to analyses of the 1998 and 1999 PAPI data, Field Interviewer experience for 1999 and 2000 CAI data was defined in two different ways: (1) a two-level overall experience variable (no prior NHSDA experience, some prior NHSDA experience), and (2) interview order, which is a measure of experience level over the course of the survey year (i.e., 1=first interview conducted, 100=100th interview conducted). Here, interview order is defined in terms of a five-level variable (1-19, 20-39, 40-59, 60-99, and 100+). For the 1999 CAI, interviewers with no experience were simply those who did not have NHSDA experience prior to the 1999 survey. For the 2000 survey, interviewers with no experience were those who did not have NHSDA experience prior to 1999 and did not complete any interviews in 1999; thus, until the 2000 survey, these individuals did not have any experience collecting NHSDA data. Tables B.9 and B.10 present the distribution of CAI Field Interviewers and interviews in 1999 and 2000 according to interviewer experience. Over 86 percent of the 1999 interviewer workforce had no prior NHSDA experience, and they were responsible for about 78 percent of the 66,706 completed interviews. In contrast, less than 28 percent of the 2000 interviewer workforce had no prior NHSDA experience, and they conducted less than 15 percent of the 71,764 completed interviews. The large number of inexperienced interviewers in 1999 was due to extensive hiring to work the sample, which had expanded threefold from 1998. Note that over half of the interviews were conducted by FIs before their 40th interview in either survey year. Table B.11 (which is the weighted version of Table B.10) shows results similar to Table B.10. Overall, the 1999 FI workforce and collected data were dominated by inexperienced interviewers, while the opposite was true in 2000.
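As an illustration of the five-level interview order variable, the following sketch bins an interview's sequence number and checks the statement that over half of the interviews were conducted before an interviewer's 40th interview. The counts are the unweighted figures from Table B.10; the function names are hypothetical and are not part of any NHSDA processing system.

```python
def interview_order_group(order: int) -> str:
    """Map an interviewer's running interview count to the five-level variable."""
    if order < 1:
        raise ValueError("interview order starts at 1")
    if order <= 19:
        return "1-19"
    if order <= 39:
        return "20-39"
    if order <= 59:
        return "40-59"
    if order <= 99:
        return "60-99"
    return "100+"


def share_before_40th(counts: dict) -> float:
    """Fraction of interviews falling in the 1-19 and 20-39 groups."""
    return (counts["1-19"] + counts["20-39"]) / sum(counts.values())


# Unweighted counts from Table B.10 (no prior + some prior NHSDA experience).
counts_1999 = {"1-19": 18713 + 2999, "20-39": 12088 + 2656,
               "40-59": 7902 + 2262, "60-99": 8505 + 3076, "100+": 5114 + 3391}
counts_2000 = {"1-19": 5036 + 15744, "20-39": 2633 + 13143,
               "40-59": 1276 + 10163, "60-99": 1126 + 12244, "100+": 426 + 9973}

print(interview_order_group(40))                 # 40-59
print(round(share_before_40th(counts_1999), 3))  # 0.547
print(round(share_before_40th(counts_2000), 3))  # 0.509
```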

Tables B.12 and B.13 compare 1999 CAI and PAPI weighted estimates of lifetime use of any illicit drug and nonmedical use of any psychotherapeutic drug by prior interviewer experience and interview order. Both the 1999 PAPI and 1999 CAI estimates show a decreasing trend as the interview order increases; also, estimates within a given year and interview order were generally higher among interviewers with no prior NHSDA experience than among those with some experience. However, the decline among PAPI interviewers was generally larger than among CAI interviewers. For example, among PAPI interviewers, the rate of lifetime nonmedical use of any psychotherapeutic drug decreased overall by 38.8 percent between the 1-19 and 100+ interview order groups (from 13.4 percent to 8.2 percent) (Table B.13). In comparison, estimates from the same interview order groups in the CAI declined by 15.8 percent (from 15.8 percent to 13.3 percent). Estimates of lifetime use of any illicit drug also declined for both PAPI and CAI overall, although at a slower rate between the lowest and highest interview order groups among CAI interviewers.
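The percent changes quoted here are relative changes between the 1-19 and 100+ interview order groups. A short sketch of the arithmetic, using the "All Interviews" values from Table B.13 (the function name is hypothetical):

```python
def percent_change(first_group_rate: float, last_group_rate: float) -> float:
    """Relative change, in percent, from the 1-19 group to the 100+ group."""
    return 100.0 * (last_group_rate - first_group_rate) / first_group_rate


# Lifetime nonmedical use of any psychotherapeutic drug, Table B.13.
print(round(percent_change(13.4, 8.2), 1))   # 1999 PAPI: -38.8
print(round(percent_change(15.8, 13.3), 1))  # 1999 CAI:  -15.8
```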

Using the same two drug measures, Table B.14 contains prevalence rates from the 2000 survey as a function of interview order and experience. Parallel to what was observed from the 1999 PAPI and CAI data, there appears to be an inverse relationship between interview order and drug use rates.

To investigate the effects of adjusting for interview experience on various measures of change, a logistic regression model was used with the results shown as odds ratios. RTI's (Research Triangle Institute) SUDAAN was employed and the analysis weights were used in both years. The sample structure was represented using standard NHSDA analysis NEST statements for variance strata and variance replicates. The drug use measures modeled were lifetime, past year, and past month use of any illicit drug, marijuana, and nonmedical use of any psychotherapeutic drug (Table B.15). In these models, the response variable was a dichotomous measure of drug use (1=yes, 0=no). Odds ratios that are in bold and less than 1 for the "change from 1999 to 2000" effect indicate that 2000 estimates are significantly lower than the 1999 estimates; other odds ratios shown in bold are statistically significantly different from the reference class (at the α=0.05 level of significance). Results are shown before and after the adjustment for covariates. The covariates used are the following: (1) year (1999, 2000); (2) prior interviewer experience (no NHSDA experience, some NHSDA experience); (3) interview order (1-19, 20-39, 40-59, 60-99, and 100+); (4) age of respondent (12-17, 18-25, 26-34, 35+); (5) census region (Northeast, North Central, South, and West); (6) gender of respondent; (7) race/ethnicity of respondent (Hispanic, Non-Hispanic black, and Non-Hispanic, all other races); and (8) population density (1 million or more persons in a Metropolitan Statistical Area (MSA); 250,000 to 999,999 persons in an MSA; less than 250,000 persons in an MSA; persons not in an MSA and not in a rural area; and persons not in an MSA and in a rural area).

Table B.15 shows the unadjusted odds ratio for the "change from 1999 to 2000" to be, in general, similar to the model odds ratio, which controls for demographics, prior interviewer experience, and interview order. Most notable are odds ratios that are generally lower for experienced interviewers compared to those with no prior experience. However, compared to the PAPI analysis (using exactly the same model on the 1998 and 1999 PAPI data), the CAI odds ratios comparing experienced to inexperienced interviewers are much closer to 1.00. For example, the PAPI odds ratios for nonmedical use of any psychotherapeutic drug during the lifetime and past month were 0.69 and 0.59 (statistically significant), respectively (SAMHSA, 2000c), compared to 0.85 (statistically significant) and 1.02 (not statistically significant), respectively, for CAI. Statistically significant odds ratios for any illicit and marijuana lifetime use from the PAPI data were also lower, ranging from 0.84 to 0.90 compared to 0.88 to 0.92 from the CAI data.
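For readers less familiar with odds ratios, the "before adjustment" year effect can be approximated directly from two prevalence estimates. The sketch below is only an illustration of the arithmetic, not the SUDAAN logistic regression used for Table B.15; it uses the "All Interviews" lifetime percentages for the 1999 CAI (Tables B.12 and B.13) and the 2000 CAI (Table B.14), and the function name is hypothetical.

```python
def odds_ratio(p_comparison: float, p_reference: float) -> float:
    """Odds of the comparison group divided by odds of the reference group.

    Both arguments are proportions strictly between 0 and 1.
    """
    odds_comparison = p_comparison / (1.0 - p_comparison)
    odds_reference = p_reference / (1.0 - p_reference)
    return odds_comparison / odds_reference


# Lifetime use of any illicit drug: 38.9% in 2000 vs. 39.7% in 1999 (CAI).
print(round(odds_ratio(0.389, 0.397), 2))  # 0.97

# Lifetime nonmedical use of any psychotherapeutic: 14.5% vs. 15.4%.
print(round(odds_ratio(0.145, 0.154), 2))  # 0.93
```

Both values agree with the "before adjustment" lifetime entries in Table B.15. The model-adjusted odds ratios cannot be obtained this way because they also depend on the covariates.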

Table B.16 shows results from age-specific models for lifetime and past month use of any illicit substance. Results for marijuana (not shown) are similar to results for any illicit substance. Except for the elimination of age, the covariates are the same as in the model used for Table B.15. As before, results are shown before and after adjustment for demographics, prior interviewer experience, and interview order. Similarly, across age groups, the adjustment does not significantly change the magnitude of the year-to-year change. Compared to the 1999 PAPI analysis, the odds ratios for Field Interviewers with some NHSDA experience were generally higher in the CAI interviewing environment (although still generally below 1.00).

In order to examine more directly the effect that the more experienced field interviewer workforce in 2000 would have on the 1999 estimates, and subsequently on trends, the analysis weights in the 1999 CAI were adjusted (in Table B.17 in this appendix only). More specifically, the 1999 analysis weights were adjusted by introducing additional controls from the 2000 survey into the poststratification step of the 1999 weighting process. The additional control totals were derived by using the 2000 weighted distribution as shown in Table B.11 (i.e., 86.0% with prior NHSDA experience vs. 14.0% with no prior NHSDA experience; 30.0% with interview number 1-19, 55.2% in the category 20-99, and 14.7% in the 100+ category). Since the 2000 control totals for FI experience were so different from the observed ones for the 1999 CAI, a drastic weight adjustment was required, which resulted in a threefold increase in the design effect due to unequal weighting (from 4.6 before adjustment to 15.9 after adjustment). On average, the adjusted weights were about 3.5 times larger than the original weights for the prior NHSDA experience interviewer data, while being cut by a factor of 0.3 for data corresponding to interviewers with no prior NHSDA experience. Table B.17 presents past month use of various illicit drugs, alcohol, and tobacco for 1999 (adjusted and unadjusted for interviewer experience) and 2000. As with the unadjusted 1999 estimates, the results of this interviewer experience adjustment show very few statistically significant differences between the adjusted 1999 and 2000 estimates. The measures showing statistically significant differences from 2000 were not the same for the adjusted and the unadjusted 1999 estimates. However, the direction of the change (statistically significant or not) was consistent. For example, for binge alcohol use among persons aged 12 or older, there is a statistically significant increase between the adjusted 1999 estimate (19.3 percent) and the 2000 estimate (20.6 percent). The unadjusted 1999 estimate was 20.2 percent, which, while not statistically different from the 2000 estimate, was likewise lower than the 2000 estimate. Similar occurrences can be seen for cocaine use (aged 18 and over), heroin use (aged 12 to 17), use of pain relievers (aged 12 to 17), binge alcohol use (aged 18 and over), and cigarette use (aged 12 to 17).
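The idea behind the reweighting can be sketched in a deliberately simplified form: a one-dimensional poststratification on prior interviewer experience only, with no bounds on the adjustment factors and no other covariates. The actual adjustment controlled for interview order and the standard poststratification covariates as well, so this sketch is not expected to reproduce the average factors of about 3.5 and 0.3 reported above. The weighted totals come from Table B.11; the function name is hypothetical.

```python
def poststratification_factors(observed_totals: dict, target_shares: dict) -> dict:
    """Factor per cell = (target share * grand total) / observed weighted total."""
    grand_total = sum(observed_totals.values())
    return {cell: target_shares[cell] * grand_total / observed_totals[cell]
            for cell in observed_totals}


# 1999 CAI weighted totals in thousands (Table B.11, Subtotals row).
observed_1999 = {"no_prior": 163355.0, "some_prior": 57768.0}
# 2000 weighted distribution of interviews by prior experience (Table B.11).
target_2000 = {"no_prior": 0.140, "some_prior": 0.860}

factors = poststratification_factors(observed_1999, target_2000)
print({cell: round(f, 2) for cell, f in factors.items()})
# {'no_prior': 0.19, 'some_prior': 3.29}
```

Even this simplified version shows why the adjustment is drastic: weights for interviews by experienced interviewers must be inflated severalfold while the remaining weights are sharply reduced, which increases weight variability and therefore the design effect.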

The analysis presented here indicates that the uneven mix of experienced and inexperienced NHSDA field interviewers between 1999 and 2000 had some effect on estimated drug use rates in 1999, 2000, and the trend. Overall, the 1999 and 2000 CAI rates of decline are smaller in magnitude than the 1999 PAPI rates of decline, which is an indication that the CAI methods are playing a role in reducing the effects of FI experience on substance use rates. However, because the mechanism of these effects is unknown, additional studies will be undertaken to increase our understanding of this phenomenon. In the meantime, analyses of interviewer effects as seen in this Appendix will continue to be presented in subsequent reports.

These findings have resulted in added emphasis being placed, in training and in the field, on encouraging both experienced and new FIs to follow the interview protocol.

Table B.9. Unweighted Distribution of Interviewers by Field Interviewer Experience: 1999 and 2000 CAI


Prior Interviewer NHSDA Experience	CAI Interviewers
	1999		2000
	No.	%	No.	%
None	1544	86.4	368	27.5
Some	243	13.6	968	72.5
Total	1787	100.0	1336	100.0


Table B.10. Unweighted Distribution of CAI Interviews by Interview Order and Prior Interviewer Experience: 1999 and 2000 CAI
Interview Order	1999 CAI					2000 CAI
	No Prior NHSDA		Some Prior NHSDA		Total	No Prior NHSDA		Some Prior NHSDA		Total
	No.	%	No.	%	%	No.	%	No.	%	%
1-19	18,713	28.1	2,999	4.5	32.6	5,036	7.0	15,744	21.9	29.0
20-39	12,088	18.1	2,656	4.0	22.1	2,633	3.7	13,143	18.3	22.0
40-59	7,902	11.9	2,262	3.4	15.2	1,276	1.8	10,163	14.2	15.9
60-99	8,505	12.8	3,076	4.6	17.4	1,126	1.6	12,244	17.1	18.6
100 +	5,114	7.7	3,391	5.1	12.8	426	0.6	9,973	13.9	14.5
Subtotals	52,322	78.4	14,384	21.6	100.0	10,497	14.6	61,267	85.4	100.0
Total	66,706					71,764


Table B.11. Weighted Distribution of CAI Interviews by Interview Order and Prior Interviewer Experience (Numbers in Thousands): 1999 and 2000 CAI
Interview Order	1999 CAI					2000 CAI
	No Prior NHSDA		Some Prior NHSDA		Total	No Prior NHSDA		Some Prior NHSDA		Total
	No. (000)	%	No. (000)	%	%	No. (000)	%	No. (000)	%	%
1-19	66,339	30.0	14,760	6.7	36.7	15,335	6.9	51,724	23.2	30.0
20-39	39,169	17.7	12,646	5.7	23.4	7,957	3.6	38,896	17.4	21.0
40-59	22,925	10.4	8,582	3.9	14.3	3,376	1.5	31,086	13.9	15.4
60-99	22,507	10.2	11,166	5.1	15.2	3,361	1.5	38,677	17.3	18.8
100 +	12,416	5.6	10,613	4.8	10.4	1,259	0.6	31,610	14.2	14.7
Subtotals	163,355	73.9	57,768	26.1	100.0	31,287	14.0	191,993	86.0	100.0
Total	221,123					223,280


Table B.12. Percent Reporting Lifetime Use of Any Illicit Drug by Interview Order and Prior Interviewer Experience: 1999 PAPI and 1999 CAI
Interview Order	1999 PAPI			1999 CAI
Interview Order	No Prior NHSDA	Some Prior NHSDA	All Interviews	No Prior NHSDA	Some Prior NHSDA	All Interviews
1-19	39.9	36.3	39.3	41.5	39.5	41.1
20-39	40.3	41.8	40.7	40.8	39.4	40.5
40-59	38.0	37.7	37.9	38.9	35.4	38.0
60-99	37.7	37.8	37.7	40.7	34.8	38.7
100 +	35.7	30.6	33.8	37.1	35.8	36.5
All Interviews	38.9	37.1	38.5	40.5	37.3	39.7
% Change from 1-19 to 100+ Interviews	-10.5%	-15.7%	-14.0%	-10.6%	-9.4%	-11.2%


Table B.13. Percent Reporting Lifetime Nonmedical Use of Any Psychotherapeutic Drug by Interview Order and Prior Interviewer Experience: 1999 PAPI and 1999 CAI
Interview Order	1999 PAPI			1999 CAI
Interview Order	No Prior NHSDA	Some Prior NHSDA	All Interviews	No Prior NHSDA	Some Prior NHSDA	All Interviews
1-19	13.3	13.8	13.4	16.0	14.8	15.8
20-39	11.9	10.9	11.7	16.5	16.4	16.5
40-59	12.7	7.2	11.1	15.6	11.4	14.4
60-99	10.6	8.5	10.0	16.2	13.3	15.2
100 +	9.2	6.7	8.2	14.2	12.2	13.3
All Interviews	12.0	9.7	11.4	16.0	13.9	15.4
% Change from 1-19 to 100+ Interviews	-30.8%	-51.4%	-38.8%	-11.3%	-17.6%	-15.8%


Table B.14. Percent Reporting Lifetime Use of Any Illicit and Nonmedical Use of Any Psychotherapeutic Drug by Interview Order and Prior Interviewer Experience: 2000 CAI
Interview Order	2000 CAI Any Illicit Drug			2000 CAI Nonmedical Use of Any Psychotherapeutic
Interview Order	No Prior NHSDA	Some Prior NHSDA	All Interviews	No Prior NHSDA	Some Prior NHSDA	All Interviews
1-19	42.9	40.9	41.4	18.4	15.7	16.3
20-39	40.0	38.7	38.9	17.1	14.8	15.2
40-59	43.5	35.9	36.6	15.3	11.9	12.2
60-99	45.7	38.1	38.7	13.5	13.7	13.7
100 +	34.0	36.8	36.7	10.8	13.4	13.3
All Interviews	42.2	38.4	38.9	16.9	14.1	14.5
% Change from 1-19 to 100+ Interviews	-20.7%	-10.0%	-11.4%	-41.3%	-14.6%	-18.4%


Table B.15. Odds Ratios for Year, Prior Interviewer Experience, and Order Effects for Any Illicit Drug, Marijuana, and Nonmedical Use of Any Psychotherapeutic: 1999 and 2000 CAI
Description	Any Illicit			Marijuana			Any Psychotherapeutics
Description	Life-time	Past year	Past Month	Life-time	Past year	Past Month	Life-time	Past year	Past Month
Change from 1999 to 2000
Before adjustment	0.97	0.95	1.00	0.98	0.96	1.02	0.93	0.94	0.96
Model adjustment	1.06	1.03	1.05	1.05	1.04	1.07	1.05	1.00	0.97
Prior interviewer experience
No NHSDA (reference class)	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
Some NHSDA	0.88	0.89	0.94	0.92	0.90	0.93	0.85	0.91	1.02
Interview order
1-19 (reference class)	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
20-39	0.94	0.93	0.90	0.93	0.94	0.87	1.00	0.94	1.04
40-59	0.85	0.85	0.84	0.88	0.85	0.79	0.82	0.95	1.07
60-99	0.92	0.88	0.87	0.95	0.92	0.88	0.90	0.86	0.84
100+	0.84	0.84	0.86	0.86	0.86	0.85	0.83	0.83	0.91

Odds ratios in bold are statistically different from 1.00 at the 0.05 level of significance.


Table B.16. Odds Ratios for Year, Prior Interviewer Experience, and Order Effects for Any Illicit Drug, by Age Category: 1999 and 2000 CAI
Description	Lifetime				Past Month
Description	12-17	18-25	26-34	35+	12-17	18-25	26-34	35+
Change from 1999 to 2000
Before adjustment	0.96	0.95	0.91	0.99	0.99	0.96	1.16	0.98
Model adjustment	1.02	1.02	0.97	1.10	1.08	1.04	1.14	1.01
Prior interviewer experience
No NHSDA (reference class)	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
Some NHSDA	0.92	0.89	0.93	0.86	0.88	0.90	1.06	0.97
Interview order
1-19 (reference class)	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
20-39	0.98	0.91	0.92	0.95	0.92	0.88	0.90	0.92
40-59	0.89	0.91	0.83	0.84	0.93	0.90	0.80	0.78
60-99	0.92	0.89	0.88	0.94	0.89	0.94	0.83	0.83
100+	0.89	0.84	0.77	0.86	0.94	0.84	0.85	0.84

Odds ratios in bold are statistically different from 1.00 at the 0.05 level of significance.

Table B.17. Percentages Reporting Past Month Use of Illicit Drugs, Alcohol, and Tobacco by Age Group: 1999, 1999 Adjusted¹ and 2000


	TIME PERIOD AND AGE
	Total			12-17			18 and Over
Drug	1999	1999 Adj¹	2000	1999	1999 Adj¹	2000	1999	1999 Adj¹	2000
Any Illicit Drug²	6.3	6.2	6.3	9.8	9.2	9.7	5.8	5.9	5.9

Marijuana and Hashish	4.7	4.7	4.8	7.2	6.9	7.2	4.4	4.4	4.5
Cocaine	0.7	0.6	0.5	0.5	0.4	0.6	0.7^a	0.6	0.5
Crack	0.2	0.2	0.1	0.1	0.1	0.1	0.2	0.3	0.1
Heroin	0.1	0.1	0.1	0.2^a	0.2	0.1	0.1	0.1	0.1
Hallucinogens	0.4	0.5	0.4	1.1	1.1	1.2	0.3	0.4	0.4
LSD	0.2	0.2	0.2	0.6	0.7	0.5	0.2	0.2	0.1
PCP	0.0	0.0	0.0	0.1	0.1	0.1	0.0	0.0	0.0
Inhalants	0.3	0.2	0.3	1.1	0.9	1.0	0.2	0.2	0.2
Nonmedical Use of Any Psychotherapeutic³	1.8	2.0	1.7	2.9	2.6	3.0	1.7	1.9	1.6
Pain Relievers	1.2	1.3	1.2	2.1	1.9^a	2.3	1.1	1.3	1.1
Tranquilizers	0.5	0.7	0.4	0.5	0.5	0.5	0.5	0.7	0.4
Stimulants	0.4	0.6	0.4	0.7	0.6	0.8	0.4	0.6	0.3
Sedatives	0.1	0.1	0.1	0.2	0.2	0.2	0.1	0.1	0.1
Any Illicit Drug Other Than Marijuana	2.7	2.9	2.6	4.5	4.2	4.6	2.5	2.7	2.3

Alcohol	46.4	46.3	46.6	16.5	16.4	16.4	50.0	49.8	50.2
"Binge" Alcohol Use⁴	20.2	19.3^a	20.6	10.1	9.8	10.4	21.4	20.4^a	21.8
Heavy Alcohol Use⁴	5.7	5.2	5.6	2.4	2.4	2.6	6.1	5.5	6.0
Cigarettes	25.8	25.5	24.9	14.9^b	14.5	13.4	27.0	26.7	26.3
Smokeless Tobacco	3.4	3.2	3.4	2.3	2.1	2.1	3.6	3.3	3.5

^aDifference between estimate and 2000 estimate is statistically significant at the .05 level.
^bDifference between estimate and 2000 estimate is statistically significant at the .01 level.
¹ 1999 Adj estimates have been adjusted to reflect the 2000 distribution of NHSDA interviewing experience among field interviewers.
² Any Illicit Drug indicates use at least once of marijuana/hashish, cocaine (including crack), heroin, hallucinogens (including LSD and PCP), inhalants, or any prescription-type psychotherapeutic used nonmedically. Any Illicit Drug Other Than Marijuana indicates use at least once of any of these listed drugs, regardless of marijuana/hashish use; marijuana/hashish users who also have used any of the other listed drugs are included.
³ Nonmedical use of any prescription-type pain reliever, tranquilizer, stimulant, or sedative; does not include over-the-counter drugs.
⁴ "Binge" Alcohol Use is defined as drinking five or more drinks on the same occasion on at least 1 day in the past 30 days. By "occasion" is meant at the same time or within a couple hours of each other. Heavy Alcohol Use is defined as drinking five or more drinks on the same occasion on each of 5 or more days in the past 30 days; all Heavy Alcohol Users are also "Binge" Alcohol Users.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999 and 2000.

For diseases, the incidence rate for a population is defined as the number of new cases of the disease, N, divided by the person time, PT, of exposure, or:

$$IR = \frac{N}{PT}.$$

The person time of exposure can be measured for the full period of the study or for a shorter period. The person time of exposure ends at the time of diagnosis (e.g., Greenberg et al., 1996, pp. 16-19). Similar conventions are applied for defining the incidence of first use of a substance.

Beginning in 1999, the NHSDA questionnaire allows for collection of year and month of first use for recent initiates. Month, day, and year of birth are also obtained directly or imputed in the process. In addition, the questionnaire call record provides the date of the interview. By imputing a day of first use within the year and month of first use reported or imputed, the key respondent inputs in terms of exact dates are known. Exposure time can be determined in terms of days and converted to an annual basis.

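Having exact dates of birth and first use also allows us to determine person time of exposure during the targeted period, t. Let the target time period for measuring incidence be specified in terms of dates; e.g., for the period 1998 we would specify:

$$t = [t_1, t_2) = [\text{1 Jan 1998},\ \text{1 Jan 1999}),$$

a period that includes 1 January 1998 and all days up to but not including 1 January 1999. The target age group can also be defined by a half-open interval as a = [a₁, a₂). For example, the age group 12 to 17 would be defined by a = [12, 18) for persons at least age 12, but not yet age 18. If person i was in age group a during period t, the time and age interval, $L_{t,a,i}$, can then be determined by the intersection:

$$L_{t,a,i} = [t_1, t_2) \cap \big[\mathrm{DOB}_i\,\mathrm{MOB}_i\,(\mathrm{YOB}_i + a_1),\ \mathrm{DOB}_i\,\mathrm{MOB}_i\,(\mathrm{YOB}_i + a_2)\big),$$

assuming we can write the time of birth in terms of day (DOB_i), month (MOB_i), and year (YOB_i). Either this intersection will be empty ($L_{t,a,i} = \emptyset$) or we will designate it by the half-open interval

$$L_{t,a,i} = [m_{1,i},\ m_{2,i}),$$

where:

$$m_{1,i} = \max\{t_1,\ \mathrm{DOB}_i\,\mathrm{MOB}_i\,(\mathrm{YOB}_i + a_1)\}, \qquad m_{2,i} = \min\{t_2,\ \mathrm{DOB}_i\,\mathrm{MOB}_i\,(\mathrm{YOB}_i + a_2)\}.$$

The date of first use, $t_{fu,d,i}$, is also expressed as an exact date. An incident of first drug d use by person i in age group a occurs in time t if $t_{fu,d,i} \in [m_{1,i}, m_{2,i})$. The indicator function I_i(d, a, t) used to count incidents of first use is set to 1 when $t_{fu,d,i} \in [m_{1,i}, m_{2,i})$, and to 0 otherwise. The person time exposure measured in years and denoted by e_i(d, a, t) for a person i of age group a depends on the date of first use. If the date of first use precedes the target period ($t_{fu,d,i} < m_{1,i}$), then e_i(d, a, t) = 0. If the date of first use occurs after the target period or if person i has never used drug d, then:

$$e_i(d,a,t) = \frac{m_{2,i} - m_{1,i}}{365}.$$

If the date of first use occurs during the target period, exposure ends at the date of first use:

$$e_i(d,a,t) = \frac{t_{fu,d,i} - m_{1,i}}{365}.$$

Note that both I_i(d, a, t) and e_i(d, a, t) are set to zero if the target period $L_{t,a,i}$ is empty; i.e., person i is not in age group a during time t. The incidence rate is then estimated as a weighted ratio estimate:

$$IR(d,a,t) = \frac{\sum_i w_i\, I_i(d,a,t)}{\sum_i w_i\, e_i(d,a,t)},$$

where w_i is the analysis weight for person i.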
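The exact-date bookkeeping described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the production NHSDA code: days are converted to years by dividing by 365, a 29 February birthday is moved to 28 February in non-leap years, and all function names and sample records are hypothetical.

```python
from datetime import date
from typing import Iterable, Optional, Tuple


def add_years(d: date, years: int) -> date:
    """Birthday 'years' later; 29 Feb falls back to 28 Feb (an assumption)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def incident_and_exposure(dob: date, first_use: Optional[date],
                          a1: int, a2: int, t1: date, t2: date) -> Tuple[int, float]:
    """Return (indicator, exposure in years) for one person.

    The person is in the age group [a1, a2) during the period [t1, t2)
    on the half-open interval [m1, m2); an empty interval gives (0, 0.0).
    """
    m1 = max(t1, add_years(dob, a1))
    m2 = min(t2, add_years(dob, a2))
    if m1 >= m2:                       # not in the age group during the period
        return 0, 0.0
    if first_use is None or first_use >= m2:
        return 0, (m2 - m1).days / 365  # never used, or first use afterwards
    if first_use < m1:
        return 0, 0.0                   # first use before the target interval
    return 1, (first_use - m1).days / 365


def incidence_rate(records: Iterable[Tuple[float, date, Optional[date]]],
                   a1: int, a2: int, t1: date, t2: date) -> float:
    """Weighted ratio estimate: sum(w * indicator) / sum(w * exposure)."""
    numerator = denominator = 0.0
    for weight, dob, first_use in records:
        indicator, exposure = incident_and_exposure(dob, first_use, a1, a2, t1, t2)
        numerator += weight * indicator
        denominator += weight * exposure
    if denominator == 0.0:
        raise ValueError("no person time of exposure in the target period")
    return numerator / denominator


# Hypothetical records: (analysis weight, date of birth, date of first use).
t1, t2 = date(1998, 1, 1), date(1999, 1, 1)
sample = [(1000.0, date(1985, 7, 1), None),
          (1000.0, date(1985, 7, 1), date(1998, 7, 2))]
print(incidence_rate(sample, 12, 18, t1, t2))
```

In the sample, the first person contributes a full year of exposure and no incident, while the second contributes one incident and 182/365 of a year, so the estimated rate is 1 / (1 + 182/365) incidents per person-year.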

Prior to the 1999 survey, exact date data were not available for computing incidence rates. For these rates, a person was considered to be of age a during the entire time interval t if his/her a-th birthday occurred during time interval t (generally, a single year). If the person initiated use during the year, the person time exposure was approximated as one-half year for all such persons rather than computing it exactly for each person.

Because of the new methodology, the incidence estimates discussed in section 5 are not strictly comparable to the estimates before the 1999 NHSDA. Since they are based on retrospective reports by survey respondents as was the case for earlier estimates, they may be subject to some of the same kinds of biases.

Bias due to differential mortality occurs because some persons who were alive and exposed to the risk of first drug use in the historical periods shown in the tables died before the 1999 NHSDA was conducted. This bias is probably very small for estimates shown in this report. Incidence estimates are also affected by memory errors, including recall decay (tendency to forget events occurring long ago) and forward telescoping (tendency to report that an event occurred more recently than it actually did). Together, these memory errors would tend to result in estimates for earlier years (i.e., 1960s and 1970s) that are downwardly biased (because of recall decay) and estimates for later years that are upwardly biased (because of telescoping). There is also likely to be some underreporting bias due to social acceptability of drug use behaviors and respondents' fear of disclosure. This is likely to have the greatest impact on recent estimates, which reflect more recent use and reporting by younger respondents. Finally, for drug use that is frequently initiated at age 10 or younger, estimates based on retrospective reports one year later underestimate total incidence because 11-year-old children are not sampled by the NHSDA. Prior analyses showed that alcohol and cigarette (any use) incidence estimates could be significantly affected by this. Therefore, for these drugs no 1998 estimates were made.

A recent study (Johnson, Gerstein, and Rasinski, 1998) concluded that the marijuana incidence trend from the NHSDA was biased because the reporting of initiation declines as the length of time between initiation and the survey increases. However, this study did not address very recent estimates, i.e., 1996-98, which could be biased because they reflect recent drug use and because they are heavily based on the reports of adolescents. In order to better understand the size of the biases and to assess the reliability of estimates for recent years, OAS performed an analysis of estimates based on single years of NHSDA data. This analysis focused on three drugs: cocaine, heroin, and marijuana. Using the survey data from 1994 to 1998, estimates were made of the number of initiates, the rate of initiation for youth aged 12 to 17, and the rate of initiation for persons aged 18 to 25. For the 1994 survey, an estimate was made for the year 1993. For the 1995 survey, another estimate was made for the year 1993. In this way, two recent estimates of the same year could be compared. Similarly, the 1995 and 1996 data provided two estimates for 1994, the 1996 and 1997 surveys provided two estimates for 1995, and the 1997 and 1998 surveys provided two estimates for 1996. Since these calculations represent two measurements of the same population characteristic, they would ideally be the same. Examples of these estimates are shown in the following table:

Table B.18. Comparison of Initiation Rates by Year of Initiation and Survey Year

	Year of Initiation								Avg. of Ratio of 1-Year Recall to 2-Year Recall
	1993		1994		1995		1996
	Year of Survey
	1994	1995	1995	1996	1996	1997	1997	1998
Rate for Age 12-17
Marijuana	59.2	53.7	74.2	75.2	75.7	73.6	83.2	75.6	1.055
Cocaine	8.9	5.0	10.2	5.7	10.6	8.0	11.3	11.0	1.480
Heroin	0.7	0.5	2.1	1.4	2.5	1.8	3.9	1.5	1.722
Rate for Age 18-25
Marijuana	46.9	41.4	42.1	55.9	47.7	53.4	53.6	50.5	0.960
Cocaine	12.8	12.8	9.9	11.8	13.8	14.7	14.8	13.9	0.961
Heroin	0.1	1.4	1.4	2.1	2.4	1.9	2.3	3.0	0.692
Number of Initiates
Marijuana	2,035	1,783	2,251	2,548	2,368	2,443	2,540	2,384	1.015
Cocaine	595	538	533	530	652	654	675	664	1.031
Heroin	41	62	122	97	141	93	171	127	1.195

Drug initiation rates for youth aged 12 to 17 for the more hard-core drugs (like cocaine and heroin) appear to be most prone to bias. For example, on average across the four initiation years, the estimate for the rate of initiation of cocaine use among youth aged 12 to 17 was 48% higher the first time the estimate could be made than the second time. This indicates a probable bias in the estimation; however, it is unclear which estimate is the correct one. As a result, one should be cautious in interpreting any changes between the prior year and the most recent year in youth initiation rates for the more stigmatized drugs. Since only five years of data were used to estimate how the rate of incidence changes between the first year it can be estimated and the second, one should also be cautious about inferring the magnitude of the bias (for example, that it is 48% for cocaine).

In the above table, the average ratio of 1-year recall to 2-year recall is calculated across four initiation years. Implicit in the table is the fact that the individual ratios vary around the average. For example, taking the marijuana incidence rates for persons aged 18 to 25, the four individual ratios can be calculated as 1.13, 0.75, 0.89, and 1.06. While the average ratio is 0.96, the year-to-year variation is much larger, ranging from 0.75 to 1.13. So it is clear that, for any single year, the bias implied by the sample estimates could be negative or positive. Since it is not clear whether the 1-year recall or the 2-year recall estimate is closer to the unbiased true value, the estimate used for the most recent year could be as much as 25 percent too high or too low in this example. The samples for 1999 and 2000 based on the new computer-assisted interviewing method are significantly larger than those in prior years; therefore, estimates of bias should suffer from less sampling variability and should be less variable than before. Nevertheless, since there are only two years under the new computer-assisted interviewing method, and therefore only one possible calculation of the ratio of 1-year to 2-year recall, more analysis is needed to see how stable the new estimates from CAI will be.
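The ratios discussed in this paragraph can be reproduced from the marijuana initiation rates for persons aged 18 to 25 in Table B.18. The short sketch below divides each 1-year recall estimate by the corresponding 2-year recall estimate and averages the four ratios; the function name is hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def recall_ratios(one_year: Sequence[float], two_year: Sequence[float]) -> List[float]:
    """Ratio of the 1-year recall estimate to the 2-year recall estimate."""
    return [first / second for first, second in zip(one_year, two_year)]


# Marijuana initiation rates, ages 18-25, for initiation years 1993-1996.
one_year_recall = [46.9, 42.1, 47.7, 53.6]  # surveys 1994, 1995, 1996, 1997
two_year_recall = [41.4, 55.9, 53.4, 50.5]  # surveys 1995, 1996, 1997, 1998

ratios = recall_ratios(one_year_recall, two_year_recall)
print([round(r, 2) for r in ratios])  # [1.13, 0.75, 0.89, 1.06]
print(round(mean(ratios), 2))         # 0.96
```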
