METHODS
Estimation
An estimate of the number of trucks for a particular state
and truck characteristic was computed in the following manner. Weighted
estimates of the number of trucks having the characteristic of interest
were computed for each of the five truck strata. The weight for a given
truck was the product of two factors: the reciprocal of the truck’s
probability of selection and a nonresponse adjustment factor. (See the
Nonsampling Error section for a description of the nonresponse
adjustment procedure.) The truck stratum estimates were summed to form
a state-level estimate.

Two types of truck miles estimates are provided. Distributed truck
miles estimates, as shown in
Table 8, were computed by apportioning each truck’s annual miles into
the appropriate category based on the percent of miles driven in the
category as reported by the respondent. Truck miles estimates presented
in all other tables were computed by attributing 100 percent of an
individual truck’s annual miles to the category with the greatest
reported percentage. For example, say a particular truck was driven
50,000 miles in the survey year and the respondent indicated 80 percent
of the trips were between 201 and 500 miles from the home base, while
20 percent of the trips were between 101 and 200 miles from the home
base. In Table 8, 40,000 miles would be tabulated in the “201 to 500
miles” category and 10,000 miles would be tabulated in the “101 to
200 miles” category. In all other tables, 50,000 miles would be
tabulated in the “201 to 500 miles” category.

To compute an estimate
of the average miles per truck, the total miles estimate was divided by
the number of trucks estimate for the characteristic of interest.

RELIABILITY OF THE ESTIMATES
Estimates in published tables are based on data from the 2002 Vehicle
Inventory and Use Survey and administrative records. To maintain
confidentiality, no estimates are published that would disclose the
operations of an individual truck. The total error of a published
estimate may be considered to be comprised of sampling error and
nonsampling error. Individuals who use the Vehicle Inventory and Use
Survey estimates to create new estimates should cite the Census Bureau
as the source of only the original estimates.

The total error of an estimate based on a sample survey is the
difference between the estimate and the population parameter that it
estimates. This error may be considered to be comprised of sampling
error and nonsampling error. Sampling error is the difference between
the estimate and the result that would be obtained from a complete
enumeration of the sampling frame conducted under the same survey
conditions. This error occurs because characteristics differ among
sampling units and because only a subset of the entire population is
measured in a sample survey. Nonsampling error encompasses all other
factors that contribute to the total error of a sample survey estimate.
The accuracy of a survey result may be affected by these two types of
errors.

Measures of Sampling Variability
Because the estimates are based on a sample, exact agreement
with the results that would be obtained from a complete enumeration of
the truck registrations on the sampling frame is not expected. However,
because each truck included on the sampling frame has a known
probability of being selected into the sample, it is possible to
estimate the sampling variability of the survey estimates.

An estimate from a particular sample and the standard error
associated with the estimate can be used to construct a confidence
interval. A confidence interval is a range about a given estimator that
has a specified probability of containing the result of a complete
enumeration of the sampling frame conducted under the same survey
conditions. Associated with each interval is a percentage of
confidence, which is interpreted as follows. If, for each possible
sample, an estimate of a population parameter and its approximate
standard error were obtained, then:
To illustrate the computation of a confidence interval for an
estimate of the number of trucks, assume that an estimate of trucks is
3,377.8 thousand and the coefficient of variation for this estimate is
2.9 percent, or 0.029. First obtain the standard error of the estimate
by multiplying the number of trucks estimate by its coefficient of
variation. For this example, multiply 3,377.8 thousand by 0.029. This
yields a standard error of 97.9562 thousand. The upper and lower bounds
of the 90-percent confidence interval are computed as 3,377.8 thousand
plus or minus 1.645 times 97.9562 thousand. Consequently, the
90-percent confidence interval is 3,216.7 thousand to 3,538.9 thousand.
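
The arithmetic in this example can be reproduced with a short Python
sketch. The function name and its parameters are illustrative
assumptions, not part of any Census Bureau software; the multiplier
1.645 is the 90-percent value used in the example above.

```python
def confidence_interval(estimate, cv, z=1.645):
    """Return (standard error, lower bound, upper bound).

    estimate: the published estimate (here, thousands of trucks)
    cv:       coefficient of variation as a proportion (2.9 percent -> 0.029)
    z:        multiplier for the confidence level (1.645 for 90 percent)
    """
    standard_error = estimate * cv   # standard error = estimate x CV
    margin = z * standard_error      # half-width of the interval
    return standard_error, estimate - margin, estimate + margin


# Values taken from the worked example in the text.
se, lower, upper = confidence_interval(3377.8, 0.029)
print(f"standard error: {se:.4f} thousand")
print(f"90-percent interval: {lower:.1f} to {upper:.1f} thousand")
```

The same function applies to any published estimate for which a
coefficient of variation is given; only the inputs change.
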
If corresponding confidence intervals were constructed for all possible
samples of the same size and design, approximately 9 out of 10 (90
percent) of these intervals would contain the result obtained from a
complete enumeration of all trucks on the sampling frame.

Nonsampling Error
Nonsampling error encompasses all other factors that contribute to the
total error of a sample survey estimate and may also occur in censuses.
It is often helpful to think of nonsampling error as arising from
deficiencies or mistakes at some point in the survey process.
Nonsampling error can be attributed to many sources:
A potential source of bias in the estimates is nonresponse. Nonresponse
is defined as the failure to obtain all the intended measurements or
responses about all the trucks in the sample. Two types of nonresponse
are often distinguished. Unit nonresponse is used to describe the
failure to obtain any of the substantive measurements about a sampled
truck. In most cases of unit nonresponse, the questionnaire was never
returned to the Census Bureau after several attempts to elicit a
response. Item nonresponse occurs either when a question is unanswered
or the response to the question fails computer or analyst edits. The
procedures used to account for unit and item nonresponse are discussed
below.

Unit nonresponse is handled in the estimation procedure by reweighting.
To apply this method of nonresponse adjustment, we make the assumption
that the population of trucks can be divided into a finite number of
mutually exclusive adjustment cells so that within each cell, all the
population elements possess similar characteristics and share a similar
probability of responding, if selected into the sample. The adjustment
cells for the 2002 VIUS are identical to the sampling strata. A
nonresponse adjustment factor is computed for each adjustment cell and
is equal to the ratio of the number of truck registrations selected
into the sample to the number of responses received within each cell.
In this sense, reweighting allocates characteristics to the
nonrespondents in proportion to the characteristics observed for the
respondents within each adjustment cell. The amount of bias introduced
by this nonresponse adjustment procedure depends on the extent to which
the nonrespondents differ, characteristically, from the respondents in
each adjustment cell.

For item nonresponse, a missing value is replaced by a predicted value
obtained from an appropriate model for nonresponse. This procedure is
called imputation.
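
As a minimal sketch of the unit nonresponse reweighting described
above, the Python below computes an adjustment factor for each cell and
combines it with the reciprocal of the selection probability to form
each responding truck's weight. The stratum names, selection
probabilities, and response flags are made-up illustrative values, and
the function names are hypothetical; this is not the Census Bureau's
production procedure.

```python
from collections import Counter

# Hypothetical sample records: (stratum, selection probability, responded?)
sample = [
    ("pickups", 0.01, True),
    ("pickups", 0.01, False),
    ("pickups", 0.01, True),
    ("pickups", 0.01, True),
    ("truck_tractors", 0.05, True),
    ("truck_tractors", 0.05, False),
]


def nonresponse_factors(records):
    """Trucks sampled divided by responses received, per adjustment cell.

    Assumes every cell has at least one respondent.
    """
    sampled = Counter(stratum for stratum, _, _ in records)
    responded = Counter(stratum for stratum, _, ok in records if ok)
    return {cell: sampled[cell] / responded[cell] for cell in sampled}


def final_weights(records):
    """Weight = (1 / selection probability) x cell adjustment factor.

    Only responding trucks receive a weight.
    """
    factors = nonresponse_factors(records)
    return [
        (stratum, (1.0 / prob) * factors[stratum])
        for stratum, prob, ok in records
        if ok
    ]


factors = nonresponse_factors(sample)
weights = final_weights(sample)

# Summing the weights over all strata gives the estimated number of trucks.
estimated_trucks = sum(weight for _, weight in weights)
print(factors)
print(estimated_trucks)
```

In this toy data the adjusted respondent weights sum to the same total
as the unadjusted weights of all sampled trucks, which is the sense in
which respondents stand in for nonrespondents within a cell.
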
To impute annual miles and lifetime miles, we divide the sample into a
finite number of mutually exclusive cells based on state of
registration and related vehicle characteristics. For each cell,
estimates of average annual miles and average lifetime miles are
computed based on those trucks in the cell for which annual miles and
lifetime miles have been reported. Missing values are then replaced
with the appropriate average values. A slightly different imputation
procedure is used to impute length and average weight (empty weight
plus cargo weight). For these data items, we replace a missing value
with data from a truck with similar characteristics for which length
and average weight have been reported.

For all other data items, no imputation is performed. Instead,
separate estimates are published in a “Not reported” category. For
example, a respondent who did not indicate the type of business in
which his/her truck was used would be included in the estimate for the
“Not reported” category. Users of the estimates should exercise
caution when allocating the estimate for the “Not reported” category
to the estimates for the reported categories in the proportions
observed for the reported categories. This is because the
characteristics of the trucks for which we obtained information may
differ significantly from those trucks for which we obtained no
information.

Unpublished Estimates
Additional statistics not shown in the tables are obtainable by
tabulating records on a CD-ROM containing the survey microdata. These
additional estimates have not been included in the published reports
because of high sampling variability, poor response, or other factors
that may make them potentially misleading. It should be noted that some
unpublished estimates can be derived directly from these reports by
subtracting published estimates from their respective totals. However,
the estimates obtained by such subtraction would be subject to the poor
response rates or high sampling variability as previously described.
Data users should take into account the magnitude of "Not Reported"
categories when assessing estimates computed using data contained in
the CD-ROM.

Individuals who use estimates from the published reports to
create new estimates should cite the Census Bureau as the source of
only the original estimates. Individuals who use the CD-ROM microdata
to create estimates not published by the Census Bureau should cite the
Census Bureau as the source of only the microdata used, and not as the
source of the new estimates.

CONFIDENTIALITY AND DISCLOSURE
Title 13 of the United States Code authorizes the Census Bureau to
conduct censuses and surveys. Section 9 of the same Title requires that
any information collected from the public under the authority of Title
13 be maintained as confidential. Section 214 of Title 13 and Sections
3559 and 3571 of Title 18 of the United States Code provide for the
imposition of penalties of up to five years in prison and up to
$250,000 in fines for wrongful disclosure of confidential census
information. In accordance with Title 13, no estimates are published
that would disclose the operations of an individual firm. The Census
Bureau’s internal Disclosure Review Board sets the confidentiality
rules for all data releases. A checklist approach is used to ensure
that all potential risks to the confidentiality of the data are
considered and addressed.

A disclosure of data occurs when an individual can use
published statistical information to identify either an individual or
firm that has provided information under a pledge of confidentiality.
Disclosure limitation is the process used to protect the
confidentiality of the survey data provided by an individual or firm.
Using disclosure limitation procedures, the Census Bureau modifies or
removes the characteristics that put confidential information at risk
for disclosure. Although it may appear that a table shows information
about a specific individual or business, the Census Bureau has taken
steps to disguise or suppress the original data while making sure the
results are still useful. The techniques used by the Census Bureau to
protect confidentiality in tabulations vary, depending on the type of
data.
Last Revised: 09/03/2004