Centers for Disease Control and Prevention


National Center for Health Statistics 3311 Toledo Road Hyattsville, Maryland 20782
Toll Free Data Inquiries 1-800-232-4636




National Health Interview Survey (NHIS)
Celebrating the First 50 Years: 1957-2007


6.2.  Tobacco Related Codebooks

This section provides general and detailed information for data analysts who need to understand how the tobacco-related codebooks have changed over the years.  See Section 6.2.2 for a list of codebook files and Section 6.2.3 for the web links.

6.2.1 Overview of the NHIS Tobacco Codebook Files

To analyze data on tobacco use, the analyst needs to consult the NHIS codebooks for the desired years to determine which variables to use in the analysis.  The NHIS codebook format has undergone major changes over the years, so the essential information is summarized here.

NHIS codebooks are abbreviated documents that present condensed information from the questionnaire and the dataset to assist the analyst in selecting variables and writing computer programs to analyze data using standard statistical packages such as SAS, SPSS, SUDAAN, or STATA.  Synonyms for codebook used in the NHIS documentation include file layout, tape layout, and data dictionary.

Note that there is not always a one-to-one relationship between a question in the questionnaire and a variable in the public use dataset.  Some questions correspond to several variables, either because more than one answer can be selected or because recodes were computed.  Conversely, a given recode may be a composite of several variables.  For some questions, the answers are withheld from the NHIS public use dataset to protect confidentiality, although recoded information may be released.  There can be multiple files for each NHIS data year, and there is a corresponding codebook for each data file.
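
To illustrate how a single recode can be a composite of several variables, here is a minimal Python sketch.  The variable names (EVER_SMOKED, SMOKE_NOW), the answer codes, and the recode categories are hypothetical and are not taken from any NHIS codebook; the actual names and codes must come from the codebook for the data year being analyzed.

```python
# Hypothetical answer codes; real NHIS codes must be taken from the codebook.
YES, NO = 1, 2
EVERY_DAY, SOME_DAYS, NOT_AT_ALL = 1, 2, 3


def smoking_status(ever_smoked, smoke_now):
    """Combine two hypothetical source variables into one recode."""
    if ever_smoked == NO:
        return "never"
    if ever_smoked == YES and smoke_now in (EVERY_DAY, SOME_DAYS):
        return "current"
    if ever_smoked == YES and smoke_now == NOT_AT_ALL:
        return "former"
    return "unknown"  # refused, not ascertained, or otherwise missing


# Made-up records for illustration only.
records = [
    {"EVER_SMOKED": 1, "SMOKE_NOW": 1},
    {"EVER_SMOKED": 1, "SMOKE_NOW": 3},
    {"EVER_SMOKED": 2, "SMOKE_NOW": None},
    {"EVER_SMOKED": 9, "SMOKE_NOW": None},
]

statuses = [smoking_status(r["EVER_SMOKED"], r["SMOKE_NOW"]) for r in records]
print(statuses)
```

Here one recode summarizes two questions, which is why a codebook can list a variable that has no single matching question in the questionnaire.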

Generally, an NHIS codebook contains the following components:

 For the whole dataset:

 The name and data year for the dataset.

 The sample size.

 For each variable:

 A descriptive label for each variable (variable label).  This label is usually an abbreviation of the question from the questionnaire.  It could also be a brief description of a recode or computed variable.

 The location of each variable (inclusive column numbers).

 A list of answer codes for categorical variables (value).

 A label for each categorical code (value label).

 A range of answers for continuous variables (such as height or weight).

 The unweighted frequencies for each value of each variable in that dataset, so that analysts can check the accuracy of their output.

 In recent years, the codebook may also include variable names that follow the SAS naming conventions (8 characters or fewer; starts with a letter; contains only numbers, letters, and a limited number of special characters; does not include any embedded spaces).
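
 The per-variable components listed above (label, inclusive column numbers, value labels, and unweighted frequencies) can be sketched in Python as follows.  This is only an illustration under stated assumptions: the class name CodebookEntry, the column positions, the codes, the data lines, and the "published" frequencies are all made up and do not come from any NHIS file.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CodebookEntry:
    """One variable as described in a codebook (all values here are made up)."""
    label: str
    start_col: int  # first column, 1-based, inclusive
    end_col: int    # last column, 1-based, inclusive
    value_labels: dict = field(default_factory=dict)

    def extract(self, line):
        # Codebook columns are 1-based and inclusive; Python slices are
        # 0-based and exclude the end index, so only the start shifts by one.
        return line[self.start_col - 1:self.end_col].strip()


# Hypothetical categorical variable stored in column 4 of each record.
entry = CodebookEntry(
    label="Hypothetical yes/no item",
    start_col=4,
    end_col=4,
    value_labels={"1": "Yes", "2": "No", "9": "Unknown"},
)

# Made-up fixed-width records.
lines = ["0011", "0022", "0031", "0049"]

# Unweighted frequencies computed from the data.
observed = Counter(entry.extract(line) for line in lines)

# Hypothetical frequencies as they might be printed in a codebook.
published = {"1": 2, "2": 1, "9": 1}

# Any code whose observed count differs from the published count.
mismatches = {
    code: (observed.get(code, 0), count)
    for code, count in published.items()
    if observed.get(code, 0) != count
}

for code, count in sorted(observed.items()):
    print(code, entry.value_labels.get(code, "?"), count)
print("mismatches:", mismatches)
```

 An empty mismatches dictionary suggests that the column positions were read correctly; a nonempty one often points to an off-by-one error in the column numbers.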

 Variable names in NHIS Codebooks:

 Starting in 1997, when the NHIS switched from paper forms to computer-assisted personal interviewing (CAPI), each NHIS variable and recode was assigned a SAS-compatible variable name (8 characters or fewer, starting with an alphabetic character).

 Some supplements from 1996 and earlier were retroactively assigned variable names when they were released for public use (such as the 1992 Cancer Control and Epidemiology files).  The NHIS core files from 1996 and earlier continued to identify the location of each field by column numbers, not by variable names.
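
 The naming conventions described above can be checked with a short Python sketch.  It assumes that the underscore is the only permitted special character, which the text above does not specify; the function name and the example names are hypothetical and are not NHIS variable names.

```python
import re

# At most 8 characters, starting with a letter, followed by letters,
# digits, or underscores. Treating the underscore as the only allowed
# special character is an assumption made for this sketch.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,7}")


def is_valid_name(name):
    """Return True if `name` satisfies the conventions assumed above."""
    return NAME_PATTERN.fullmatch(name) is not None


# Hypothetical names for illustration only.
for candidate in ["SMOKEVAR", "2SMOKE", "SMOKE NOW", "SMOKESTATUS", "AGE_P1"]:
    print(candidate, is_valid_name(candidate))
```

 A check like this can be useful when assigning names to fields from the 1996 and earlier core files, which are identified only by column numbers.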

Major Changes to the NHIS Codebook Format:

1965-1996:  The NHIS codebooks from these years included: sample size, frequencies, column numbers, variable labels, answer codes, and value labels.  These codebooks did not usually include variable names.

1997-2003:  With the 1997 questionnaire redesign, a variable name was systematically created for every variable and included in each codebook.

2004 to present:  Beginning with the 2004 data year, NHIS codebook information can be found in the following files:

 Variable summary:  This file contains the question number, the variable name, the variable label, the column location and the column length (width).

 Variable layout:  This file contains the question number, the variable name, the value codes and the value labels.

 Variable frequencies:  This file contains the question number, the variable name, unweighted frequencies and percentages for the categorical variables in each file.  The responses for certain continuous variables, such as age, are given in broad categories.
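
 Because the 2004 and later codebook information is split across three files, an analyst may want to bring it back together by variable name.  The Python sketch below shows one way to do so.  It assumes the three files have already been parsed into lists of dictionaries; the field names, the variable name VARA, and all values are hypothetical, since the actual file formats are not described here.

```python
# Hypothetical parsed rows from the three codebook files.
summary = [
    {"question": "Q1", "name": "VARA", "label": "Hypothetical yes/no item",
     "start": 10, "length": 1},
]
layout = [
    {"question": "Q1", "name": "VARA", "code": "1", "value_label": "Yes"},
    {"question": "Q1", "name": "VARA", "code": "2", "value_label": "No"},
]
frequencies = [
    {"question": "Q1", "name": "VARA", "code": "1", "count": 120, "percent": 60.0},
    {"question": "Q1", "name": "VARA", "code": "2", "count": 80, "percent": 40.0},
]


def build_codebook(summary, layout, frequencies):
    """Merge the three row lists into one dictionary keyed by variable name."""
    codebook = {}
    for row in summary:
        # Convert start column plus width into inclusive column numbers.
        end = row["start"] + row["length"] - 1
        codebook[row["name"]] = {
            "question": row["question"],
            "label": row["label"],
            "columns": (row["start"], end),
            "values": {},
        }
    for row in layout:
        values = codebook[row["name"]]["values"]
        values[row["code"]] = {"value_label": row["value_label"], "count": None}
    for row in frequencies:
        values = codebook[row["name"]]["values"]
        # Tolerate codes that appear in the frequencies but not in the layout.
        item = values.setdefault(row["code"], {"value_label": None, "count": None})
        item["count"] = row["count"]
    return codebook


codebook = build_codebook(summary, layout, frequencies)
print(codebook["VARA"]["columns"])
print(codebook["VARA"]["values"]["1"])
```

 The merged structure resembles the single-file codebooks of earlier years, with the label, column location, value labels, and unweighted frequencies for each variable in one place.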


This page last reviewed October 15, 2008

U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES