Chapter 16.
Consumer Expenditures and Income
Sample Design
Selection of households
The Consumer Expenditure Survey is a nationwide household survey
designed to represent the total U.S. civilian noninstitutional
population. The selection of households begins with the definition
and selection of primary sampling units (PSUs). PSUs are counties
(or parts thereof), groups of counties, or independent cities grouped
together into geographic entities called core-based statistical
areas (CBSAs). The sample of PSUs currently used in the survey
consists of 91 areas, of which 75 urban areas are also used by the
Consumer Price Index program.
The 91 PSUs are classified into four categories:
- 21 A PSUs, which are metropolitan CBSAs with a
population over 2.7 million people
- 38 X PSUs, which are metropolitan CBSAs with a
population under 2.7 million people
- 16 Y PSUs, which are micropolitan CBSAs, defined as areas
that have at least one urban cluster of at least 10,000 but less
than 50,000 population, plus adjacent territory that has a
high degree of social and economic integration with the core
as measured by commuting ties
- 16 Z PSUs, which are non-CBSA areas, and are often
referred to as rural PSUs
Within these 91 PSUs, the sampling frame (the list of addresses from
which the sample is drawn) is now generated from the 2000 Census
100-Percent Detail File. It is augmented by a sample of addresses
drawn from new construction permits and by extra housing units
identified through coverage improvement techniques.
The population represented by the survey is the total U.S.
civilian noninstitutional population, both urban and rural. It
includes people living in houses, condominiums, apartments, and
group quarters such as college dormitories. It excludes people
such as military personnel living on base, nursing home residents,
and people in prisons.
The U.S. Census Bureau selects a sample of approximately 12,000
addresses per year to participate in the Diary Survey. Usable diaries
are obtained from approximately 7,100 households at those addresses.
Diaries are not obtained from the other addresses due to refusals,
vacancies, ineligibility, or the nonexistence of a housing unit at
the selected address. The actual placement of diaries is spread
equally over all 52 weeks of the year.
The Interview Survey is a rotating panel survey in which
approximately 14,000 addresses are contacted in each calendar
quarter of the year. One-fifth of the addresses contacted each
quarter are new to the survey; their first (bounding) interviews
provide baseline data that are not used to compute the survey's
published expenditure estimates. Excluding these bounding interviews
and interviews not completed due to refusals, vacancies, ineligibility,
or the nonexistence of a housing unit at the selected address,
usable interviews are obtained from approximately 7,100 households
each quarter. After a housing unit has been in the sample for five
consecutive quarters, it is dropped from the survey and a new housing
unit is selected to replace it.
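The rotation described above can be sketched in a few lines. This is only an illustration, assuming one cohort of addresses enters per quarter and stays for five quarters; the function name `simulate_rotation` and the cohort bookkeeping are hypothetical and not part of the survey's own systems.

```python
from collections import deque

def simulate_rotation(quarters, panel_length=5):
    """For each quarter, list the interview number of every cohort in the sample."""
    cohorts = deque()  # each entry is the quarter in which a cohort entered
    history = []
    for q in range(quarters):
        cohorts.append(q)                # a new cohort enters (bounding interview)
        if len(cohorts) > panel_length:  # a cohort past its fifth quarter is dropped
            cohorts.popleft()
        history.append([q - start + 1 for start in cohorts])
    return history

history = simulate_rotation(8)
steady = history[-1]                              # interview numbers once the panel is full
bounding_share = steady.count(1) / len(steady)    # share of cohorts giving a bounding interview
print(steady, bounding_share)
```

Once the panel is full, one cohort in five is on its first interview, which matches the one-fifth share of bounding interviews stated in the text.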
Note: The sample design described above with 91
PSUs is based on information collected in the 2000 Census, and has
been in use since 2006. The original 2000 census-based sample design
was introduced in 2005 and consisted of 102 PSUs: 28 A PSUs,
42 X PSUs, 16 Y PSUs, and 16 Z PSUs. However, budget cuts in
2006 forced seven A PSUs to be changed to the X category,
and 11 X PSUs to be dropped from the sample. Dropping 11 X PSUs
from the sample reduced the number of sampled addresses and
interviewed households by approximately 8 percent. Otherwise
the original and current 2000 census-based sample designs are
the same.
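The PSU counts in the note can be reconciled with a short calculation. The sketch below simply replays the two changes described (seven A PSUs reclassified as X, eleven X PSUs dropped); the dictionary layout is an illustrative choice.

```python
# PSU counts in the original 2005 design, by category
original = {"A": 28, "X": 42, "Y": 16, "Z": 16}

current = dict(original)
current["A"] -= 7    # seven A PSUs reclassified as X
current["X"] += 7
current["X"] -= 11   # eleven X PSUs dropped from the sample

print(sum(original.values()), current, sum(current.values()))
```

The result agrees with the 21 A, 38 X, 16 Y, and 16 Z PSUs listed for the current 91-PSU design.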
Cooperation levels
Response data for the 2005 Consumer Expenditure Survey are shown in
table 1. For the Interview Survey, the totals refer to housing units
in the second through fifth quarters of the survey (the non-bounding
interviews), with each unique housing unit providing up to four
usable interviews. For the Diary Survey, the totals refer to housing
units in weeks 1 and 2 of the survey, with each unique housing unit
providing up to two usable interviews. Most Diary respondents
participate for both weeks.
There are three general categories of nonresponse:
- Type A nonresponses are refusals, temporary absences,
and noncontacts
- Type B nonresponses are vacant housing units, housing units
with temporary residents, and housing units under construction
- Type C nonresponses are destroyed or abandoned housing units,
and housing units converted to nonresidential use
Response rates are defined to be the percent of eligible housing
units (that is, the designated sample less Type B and Type C
nonresponses) from which usable interviews are collected. In the
2005 Interview Survey there were 39,988 eligible housing units from
which 29,804 usable interviews were collected, resulting in a response
rate of 74.5 percent. In the 2005 Diary Survey there were 21,309
eligible housing units from which 15,126 usable interviews were
collected, resulting in a response rate of 71.0 percent.
Table 1. Analysis of responses in the Consumer Expenditure Survey, 2005
Sample unit | Interview survey | Diary survey
Housing units designated for the survey | 49,242 | 26,054
Less Type B or C nonresponses | 9,254 | 4,745
Equals eligible units | 39,988 | 21,309
Less Type A nonresponses | 10,184 | 6,183
Equals Interview units | 29,804 | 15,126
Percent of eligible units interviewed | 74.5 | 71.0
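The response-rate definition can be checked against the 2005 counts with a small calculation. The function below is an illustrative sketch; its name and argument names are assumptions, and only the counts come from table 1.

```python
def response_rate(designated, type_b_or_c, type_a):
    """Return (eligible units, interviewed units, response rate in percent)."""
    eligible = designated - type_b_or_c   # designated sample less Type B and C
    interviewed = eligible - type_a       # eligible units less Type A
    return eligible, interviewed, 100.0 * interviewed / eligible

interview = response_rate(49_242, 9_254, 10_184)
diary = response_rate(26_054, 4_745, 6_183)
print(interview[0], interview[1], round(interview[2], 1))
print(diary[0], diary[1], round(diary[2], 1))
```

Both surveys reproduce the eligible and interviewed counts in the table, and the rates round to the published 74.5 and 71.0 percent.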
Estimation methodology
The estimation of population quantities of interest, such as the
average expenditure per consumer unit on a particular item, is
achieved through the use of weights. Each consumer unit in the
survey is assigned a weight, which is the number of similar consumer
units in the U.S. civilian noninstitutional population the
sampled consumer unit represents. Using these weights, the
average expenditure per consumer unit on a particular item
category is estimated by

$$\bar{y} = \frac{\sum_{i \in s} w_i y_i}{\sum_{i \in s} w_i},$$

where
$\bar{y}$ = average expenditure per consumer unit on the item category,
$y_i$ = expenditure made by the $i$th consumer unit on the item category,
$w_i$ = weight of the $i$th consumer unit in the sample, and
$s$ = sample of consumer units that participate in the survey.
For example, if $y_i$ is the expenditure on butter
made by the $i$th consumer unit in the sample
during a given time period, then $\bar{y}$
is an estimate of the average expenditure
on butter made by all consumer units in the U.S. civilian
noninstitutional population during that time period.
If one wanted to estimate the proportion of consumer units that
purchased butter during a given time period, then the same formula
is applied, where $y_i$ is set equal to 1 if
the $i$th consumer unit
purchased butter during the time period, and 0 if it did not.
When this 1/0 definition of $y_i$ is used, $\bar{y}$
is an estimate of the
proportion of all consumer units in the U.S. civilian
noninstitutional population that purchased butter during the
given time period.
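The two uses of the estimator described above (a mean expenditure and a purchase proportion) can be illustrated as follows. This is a minimal sketch, assuming the estimator is the weighted mean of the reported values; the three consumer units, their weights, and their butter expenditures are made-up numbers.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of weight * value divided by the sum of the weights."""
    return sum(w * y for y, w in zip(values, weights)) / sum(weights)

# Hypothetical sample of three consumer units
weights = [10_000, 12_000, 8_000]      # number of consumer units each one represents
butter_spend = [4.0, 0.0, 6.0]         # dollars spent on butter in the period

mean_spend = weighted_mean(butter_spend, weights)

# Same formula with a 1/0 indicator gives the proportion of purchasers
bought = [1 if y > 0 else 0 for y in butter_spend]
share_buying = weighted_mean(bought, weights)

print(mean_spend, share_buying)
```

Recoding the expenditure as a 1/0 indicator is the only change needed to move from a mean to a proportion; the weights and the formula stay the same.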
Several factors are involved in computing the weight of each
consumer unit for which a usable interview is received. Each
consumer unit is initially assigned a base weight, which is equal
to the inverse of the consumer unit's probability of being
selected for the sample. Base weights in the Consumer Expenditure
Survey are typically around 10,000, which means that a consumer
unit in the sample represents 10,000 consumer units in the
U.S. civilian noninstitutional population: itself plus 9,999
other consumer units that were not selected for the sample.
The base weight is then adjusted by the following factors
to correct for certain nonsampling errors:
- Weighting control factor. This adjusts for subsampling in
the field. Subsampling occurs when a data collector visits a
particular address and discovers multiple housing units where
only one housing unit was expected.
- Noninterview adjustment factor. This adjusts for interviews
that cannot be conducted in occupied housing units due to a consumer
unit's refusal to participate in the survey or the inability to contact
anyone at the sample unit in spite of repeated attempts. This
adjustment is based on region of the country, household tenure
(owner/renter), consumer unit size, and race of the reference person.
- Calibration factor. This adjusts the weights to 24 known
population counts to account for frame undercoverage. These known
population counts are for age, race, household tenure (owner/renter),
region of the country, and urban/rural. The population counts are
updated quarterly. Each consumer unit is given its own unique
calibration factor. There are infinitely many sets of calibration
factors that make the weights add up to the 24 known population
counts, and the Consumer Expenditure Survey selects the set that
minimizes the amount of change made to the initial weights
(initial weight = base weight × weighting control factor ×
noninterview adjustment factor).
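The weighting steps above can be sketched as follows. The first function follows the stated product for the initial weight. The second is a deliberately simplified stand-in for calibration: it assumes the known counts are for non-overlapping groups, so one ratio per group is enough, whereas the survey calibrates to 24 counts at once under a minimum-change criterion. All function names, factor values, and population counts here are hypothetical.

```python
def initial_weight(selection_probability, weighting_control=1.0,
                   noninterview_adjustment=1.0):
    """Base weight (inverse selection probability) times the two adjustment factors."""
    base_weight = 1.0 / selection_probability
    return base_weight * weighting_control * noninterview_adjustment

def calibrate_disjoint(initial_weights, groups, known_counts):
    """Scale weights so each group's weights sum to its known count.

    Simplification: groups must not overlap; this is not the survey's
    24-count minimum-change procedure.
    """
    totals = {}
    for w, g in zip(initial_weights, groups):
        totals[g] = totals.get(g, 0.0) + w
    return [w * known_counts[g] / totals[g]
            for w, g in zip(initial_weights, groups)]

# A unit selected with probability 1 in 10,000, with made-up adjustment factors
w0 = initial_weight(1 / 10_000, weighting_control=2.0, noninterview_adjustment=1.25)

# Made-up initial weights, tenure groups, and population counts
weights = [10_000.0, 12_000.0, 8_000.0, 10_000.0]
tenure = ["owner", "owner", "renter", "renter"]
counts = {"owner": 24_200.0, "renter": 17_100.0}
final = calibrate_disjoint(weights, tenure, counts)
print(w0, final)
```

In this toy case every owner weight is scaled by the same ratio and every renter weight by another, so the calibrated weights add up to the assumed counts for each tenure group.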
Precision of the estimates
The precision of the estimator $\bar{y}$
is measured by its standard error. Standard errors measure the
sampling variability of the Consumer Expenditure Survey estimates.
That is, they measure the uncertainty in the survey estimates caused
by the fact that a random sample of consumer units from across the
United States is used instead of collecting data from every
consumer unit in the nation.
Standard errors are estimated using the method of balanced
repeated replication. In this method the sampled PSUs are divided
into 43 groups (called strata), and the consumer units within each
stratum are randomly divided into two half samples. Half of the
consumer units are assigned to one half sample, and the other
half are assigned to the other half sample. Then 44 different
estimates of $\bar{y}$ are created using data from only one half sample
per stratum. There are many combinations of half samples that
can be used to create these replicate estimates, and the
Consumer Expenditure Survey uses 44 of them that are created in a
balanced way with a 44 × 44 Hadamard matrix. The standard
error of $\bar{y}$ is then estimated by

$$SE(\bar{y}) = \sqrt{\frac{1}{44} \sum_{r=1}^{44} (\bar{y}_r - \bar{y})^2},$$

where $\bar{y}_r$ is the $r$th replicate estimate of $\bar{y}$.
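A toy version of balanced repeated replication is sketched below. It assumes the standard error is the square root of the mean squared deviation of the replicate estimates from the full-sample estimate, and it uses 3 strata with a 4 × 4 Hadamard matrix in place of the survey's 43 strata and 44 × 44 matrix. The data, function names, and the choice to drop the all-ones column are illustrative assumptions.

```python
from math import sqrt

def weighted_mean(units):
    """units: iterable of (weight, expenditure) pairs."""
    units = list(units)
    return sum(w * y for w, y in units) / sum(w for w, _ in units)

def brr_standard_error(strata, design):
    """strata: list of (half_a, half_b), each a list of (weight, y) pairs.
    design: rows of +1/-1; entry [r][h] picks half_a (+1) or half_b (-1)
    of stratum h for replicate r."""
    full = weighted_mean(u for half_a, half_b in strata for u in half_a + half_b)
    replicates = []
    for row in design:
        chosen = []
        for h, (half_a, half_b) in enumerate(strata):
            chosen.extend(half_a if row[h] == 1 else half_b)
        replicates.append(weighted_mean(chosen))
    variance = sum((r - full) ** 2 for r in replicates) / len(replicates)
    return full, replicates, sqrt(variance)

# 4 x 4 Hadamard matrix; the all-ones first column is dropped so that
# each remaining column (one per stratum) is balanced between +1 and -1.
H4 = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
design = [row[1:] for row in H4]

# Made-up data: one consumer unit per half sample, equal weights
strata = [
    ([(10_000, 5.0)], [(10_000, 3.0)]),
    ([(10_000, 2.0)], [(10_000, 4.0)]),
    ([(10_000, 6.0)], [(10_000, 6.0)]),
]
full, replicates, se = brr_standard_error(strata, design)
print(full, replicates, se)
```

Because the columns of the design are mutually orthogonal, every pair of strata sees each combination of half samples equally often, which is the sense in which the replicates are balanced.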
The coefficient of variation is a related measure of sampling
variability. It measures the variability of the survey
estimate relative to the mean. It is defined by the equation

$$CV(\bar{y}) = \frac{SE(\bar{y})}{\bar{y}}$$

and usually is expressed as a percent.
Table 2. Precision of the Consumer Expenditure Survey expenditure estimates, integrated Diary and Interview survey data, 2005
Item category | Average annual expenditure per consumer unit | Standard error, $SE(\bar{y})$ | Coefficient of variation, $CV(\bar{y})$ (in percent)
Total expenditures | $46,409 | $254 | 0.55
Food | 5,931 | 42 | 0.71
Housing | 15,167 | 120 | 0.79
Apparel | 1,886 | 40 | 2.10
Transportation | 8,344 | 130 | 1.55
Health care | 2,664 | 25 | 0.94
Entertainment | 2,388 | 54 | 2.26
Personal care | 541 | 7 | 1.28
Cash contributions | 1,663 | 43 | 2.60
Personal insurance and pensions | 5,204 | 59 | 1.13
Other | 2,621 | - | -
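The coefficient of variation can be recomputed from the published estimates and standard errors. The sketch below assumes the coefficient of variation is the standard error divided by the estimate, times 100; the function name is illustrative and the three rows are taken from table 2.

```python
def coefficient_of_variation(estimate, standard_error):
    """Standard error as a percent of the estimate."""
    return 100.0 * standard_error / estimate

# (average annual expenditure, standard error) from table 2
table2 = {
    "Total expenditures": (46_409, 254),
    "Food": (5_931, 42),
    "Housing": (15_167, 120),
}
cvs = {item: round(coefficient_of_variation(mean, se), 2)
       for item, (mean, se) in table2.items()}
print(cvs)
```

These three rows reproduce the published values. Because the published standard errors are rounded to whole dollars, other rows recomputed this way can differ from the table in the second decimal place.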