National Center for Health Statistics  Monitoring the Nation's Health

NHANES Web Tutorial

Frequently Asked Questions (FAQ)

These Frequently Asked Questions (FAQs) and answers cover the most common questions encountered when working with Continuous NHANES (1999 and on), NHANES III, NHANES II, and NHANES I data. The FAQs are arranged by tutorial module topic.

Survey Orientation

Survey Overview

Question 1. I noticed your survey acronyms changed from NHES to NHANES. Are there any differences between these surveys?

Question 2. Have the data contents remained constant across surveys?

Question 3. Why do you call NHANES conducted since 1999 "continuous NHANES"?

Navigate the NHANES Website

Question 1. Can I access data from current and historic surveys conducted by your agency from the website?

Question 2. Why are certain variables or data files not publicly released on the website?

Question 3. How do you decide the contents for each NHANES survey?

Question 4. I noticed there is a lot of information in the data documentation.  Do I have to read these documents?

Data Structure & Contents

Question 1. Are NHANES data structured the same way throughout the years?

Question 2. What are the main differences in data structure between the continuous NHANES and NHANES III?

Question 3. I cannot find my variables in NHANES III series 11 No. 1a or 2a. Where else should I search?

Preparing an Analytic Dataset

Locate Variables

Question 1. How are continuous NHANES data (1999-present) organized in the publicly accessible website?

Question 2. Why are there so many data files?

Question 3. How do I know which component contains the variables of interest to me?

Question 4. Once I know which cycle and component to search for my variables, what is the fastest way to find them?

Question 5. My search resulted in a long list of variables. Which one is appropriate for my analysis?

Question 6. What kinds of NHANES documents are available, and how are they best used?

Question 7. On the NHANES 2003-2004 data page I see links for data.  How do I access the data from these links?

Question 8. Next to the name of each questionnaire section, laboratory component, or exam component on the NHANES 2003-2004 data page there are links that appear as follows: [Data, Docs, Procedures].  What are these links for?

Question 9. I know NHANES collected information on certain topics, but I couldn't find them on the variable lists. What happened to those data?

Question 10. Why aren't the adolescent data on alcohol use, smoking, sexual behavior, reproductive health and drug use available as a public release file?

Download Data Files

Question 1. Where can I access NHANES data files?

Question 2. What format are the data files in?  Can they be used with SAS, SPSS, or Stata?

Question 3. I have downloaded the files, but I cannot run any of them with my statistical program. What happened?

Question 4. What operating system do I need to extract these files?

Question 5. Why do I need to go through the trouble of extracting and saving these files? Can't I just double click on the files and let the SAS program extract and save them automatically?

Append & Merge Datasets

Question 1. I noticed that continuous NHANES data files are released in 2-year cycles. What do I do if I need to combine different years together?

Question 2. Why do I have to check the contents of the data files before appending the data? What do I do if I find variables named or labeled differently?

Question 3. The variables I'm interested in come from interview, examination and laboratory components. How do I combine them together?

Question 4. What is the unique identifier in NHANES data that we need to append or merge data by?

Clean & Recode Data

Question 1. What percent of missing data is usually acceptable for NHANES data analysis?

Question 2. How are missing values, "blank but applicable", "don't know" and other values coded?

Question 3. Why do I have to check the missing data?

Question 4. How do I determine the skip patterns for a questionnaire section?

Question 5. How do I check for outliers, and what do I do with influential outliers?

Format & Label Data

Question 1. Do I have to format and label all variables?

Question 2. Are there rules on how to format and label variables?

Survey Design Factors

Sample Design

Question 1. What do you mean by the phrase "NHANES is a complex survey"?

Question 2. How do you draw an NHANES sample?

Question 3. What is a Sample Weight?

Question 4. Do I have to use sample weights and other survey design variables?

Question 5. What are Masked Variance Units (MVUs) and why do we need them in analyses?

Question 6. Why does NHANES oversample some groups but not others? Do you oversample different groups over the years?

Specifying Weighting Parameters

Question 1. How are NHANES weights constructed?

Question 2. How do NHANES weights account for different response rates to the in-home interview and MEC exam?

Question 3. Will data and weights be available on public use files for single years such as 1999, 2000, 2001, or 2002?

Question 4. I was told I have to use the 4-year weights provided on public use files for 1999-2002. Why can't I combine weights together myself?

Question 5. How do I calculate 6-year weights?

Question 6. What are the subsample weights and how are they constructed?

Question 7. Can you combine subsample weights?

Question 8. When I subset NHANES data, should I do it in SUDAAN or in SAS data steps?

Variance Estimation

Question 1. What kind of sampling features may affect the variance estimates of NHANES data?

Question 2. What would happen to the variance estimates if standard statistical software for simple random samples is used?

Question 3. How do you estimate the impact of a complex sample design on variance estimates?

Question 4. Are there specific mathematical formulas you recommend to use for computing variance estimates for complex survey data?

Question 5. Why do you emphasize degrees of freedom so much in your NHANES tutorial?

Question 6. How do you properly calculate the degrees of freedom?

Question 7. Are there any differences between SAS and SUDAAN software in terms of handling the degrees of freedom?

Question 8. How do you generate confidence intervals using SAS or SUDAAN?

NHANES Analysis

Descriptive Statistics

Question 1. In the tutorial, you recommend checking the frequency distribution of each variable before analysis. Why?

Question 2. Is it a good idea to get frequency tables for all variables in your analysis, and print them out for reference?

Question 3. If the statistics for normality turn out to be significant in my analysis, does that mean I cannot use parametric tests any more?

Question 4. What do you use percentiles for?

Question 5. Can you generate percentiles with SAS Survey Procedures?

Question 6. When should you use geometric means instead of arithmetic means?

Question 7. In the Descriptive Statistics module you demonstrated how to calculate prevalence for hypertension. But the definition you used in this tutorial is different from the one I usually use. Why is that?

Hypothesis Testing

Question 1. Can we use the Student's t-test for NHANES data?

Question 2. How should I handle the degrees of freedom when conducting hypothesis testing with NHANES data?

Question 3. When I calculate confidence intervals for a point estimate in NHANES, should I use the t score or the Z score in the formula?

Question 4. Do I have to use weights and design based methods when calculating confidence intervals?

Question 5. Can I get confidence intervals for highly skewed variables?

Question 6. Can I obtain geometric means and their confidence intervals using SAS PROC SURVEYMEANS?

Question 7. What procedures would you recommend for chi-square testing?

Age Standardization

Question 1. When do I have to use age standardization?

Question 2. Are age-adjusted rates usually different from unadjusted rates in NHANES data?

Question 3. There are different methods for age-standardization. Which do you recommend for NHANES data?

Question 4. When do you recommend the use of population estimates?

Question 5. How do you calculate population estimates for NHANES data?

Question 6. Where can we obtain CPS totals for continuous NHANES data?

Question 7. Can you combine population totals across survey cycles, or for multiple age and gender or race/ethnic subgroups?

Question 8. Why can't you just sum the final sampling weights for the population totals?

Linear Regression

Question 1. When do you use linear regression for NHANES data?

Question 2. Which test statistics would you recommend for regression analysis of NHANES data: Wald F, Satterthwaite adjusted F, or Satterthwaite adjusted chi-square?

Question 3. How do you specify a multiple regression model in SUDAAN with both continuous and discrete independent variables plus interaction terms?

Question 4. Can you do multiple regression analysis in SAS?

Question 5. How do you select a reference category in a regression analysis?

Logistic Regression

Question 1. What statistical software can I use for logistic regression analysis?

Question 2. Are these software packages very similar in programming languages?

Question 3. How do you select weights for logistic regression models?

Question 4. How do you code the dependent variable for event and non-event?

Question 5. When I run both the SAS Survey and SUDAAN programs for the same logistic regression model, why do I sometimes get different results?

Page Last Modified: August 05, 2008

National Center for Health Statistics
3311 Toledo Road
Hyattsville, MD 20782
Phone: 1-866-441-NCHS (6247)
For data inquiries, use nchsquery@cdc.gov

Problems or comments about the Tutorial?
Email the Tutorial Team: NHANESWebTutorial@cdc.gov

Centers for Disease Control and Prevention, 1600 Clifton Rd, Atlanta, GA 30333, U.S.A
Tel: (404) 639-3311 / Public Inquiries: (404) 639-3534 / (800) 311-3435