Centers for Disease Control and Prevention


Scientific Data Index


CDC's Central Data Repository

The Epidemiology Program Office (EPO) maintains CDC's Central Data Repository.  This central repository 
is a library of public use datasets with a prefix of "CC36" or "CC37".  These files are available for use 
by all CDC employees with mainframe accounts.  The documentation files for these datasets are available on 
the CDC WONDER Web site.  

Using This Scientific Data Index

This reference is divided into sections corresponding to the seven categories of datasets:

Within each category section, the datasets are listed by title.

To find a dataset name, first select the appropriate category, then select the documentation file for the appropriate dataset.  The corresponding dataset name is listed at the top of the documentation file.  For multiple 
dataset names, a separate file will be listed.

User Support

TSO/ROSCOE Support:
For help in TSO or ROSCOE, try the "HowTo" utility in each system.  For more specific help, call the Mainframe Support Hotline at (404) 639-7500.

Scientific Data Support:
For help with specific datasets, please call the Information Technology Branch at (770) 488-8360.

CDC WONDER Support:
For assistance with CDC WONDER, please call user support at (770) 488-8360.

Using CDC WONDER To Obtain Technical Specifications/Documentation

CDC WONDER Scientific Documentation 

The documentation provides all the technical specifications associated with the file.  These specifications 
include Background, Abstract, Record Layout, Tables, and any Appendices.

Definition of Acronyms

ARF       Area Resource File
CPS       Current Population Survey: March - Annual Demographic File; September - Veterans and Smoking Supplement
FARS      Fatal Accident Reporting System
FIPS      Federal Information Processing Standards
HANES     Health and Nutrition Examination Survey
ICD       International Classification of Diseases
ILTCP     Inventory of Long Term Care Places
LSOA      Longitudinal Study on Aging
MARF      Master Area Reference File
NAMCS     National Ambulatory Medical Care Survey
NCHS      National Center for Health Statistics
NCYFS     National Children and Youth Fitness Study
NHDS      National Hospital Discharge Survey
NHIS      National Health Interview Survey
NIDA      National Institute on Drug Abuse
NMCUES    National Medical Care Utilization and Expenditure Survey
NMES      National Medical Expenditure Survey (replaces NMCUES)
NMFI      National Master Facility Inventory
NNHS      National Nursing Home Survey
NSFG      National Survey of Family Growth
PUMS      Public Use Microdata Sample
SEER      Surveillance, Epidemiology and End Results
STF       Summary Tape File


Census / Population

Decennial Data

Decennial data files contain population data from the census conducted every 10 years.

Intercensal Data

Intercensal data files contain population estimates that use two decennial censuses as endpoints and 
interpolate to derive estimates for the years between them.
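
As a rough illustration of the interpolation idea, the following Python sketch spaces estimates evenly between two census counts.  Straight-line interpolation, the function name, and the population figures are all assumptions made for this example; the estimates in these files may be produced by more elaborate methods.

```python
def intercensal_estimates(start_year, start_pop, end_year, end_pop):
    """Linearly interpolate a population for every year between two censuses.

    Assumption: a straight-line change between the two endpoint counts.
    """
    span = end_year - start_year
    if span <= 0:
        raise ValueError("end_year must be later than start_year")
    yearly_change = (end_pop - start_pop) / span
    return {
        year: round(start_pop + yearly_change * (year - start_year))
        for year in range(start_year, end_year + 1)
    }


# Made-up counts for a hypothetical county at two census dates.
estimates = intercensal_estimates(1980, 100_000, 1990, 120_000)
print(estimates[1985])
```

Both endpoints are reproduced exactly, so the estimates never disagree with either census count.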

Postcensal Data

Postcensal data files contain population estimates that use one decennial census as a starting point and 
extrapolate to derive estimates for subsequent years prior to the following census.
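
The extrapolation idea can be sketched in Python as follows.  The constant annual growth rate, the function name, and the figures are assumptions made for this example only; they do not describe the method actually used to prepare these files.

```python
def postcensal_estimates(base_year, base_pop, annual_growth_rate, years_ahead):
    """Extrapolate forward from one census count.

    Assumption: the population grows by the same rate every year.
    """
    if years_ahead < 0:
        raise ValueError("years_ahead must not be negative")
    return {
        base_year + k: round(base_pop * (1 + annual_growth_rate) ** k)
        for k in range(years_ahead + 1)
    }


# Made-up base count and growth rate for a hypothetical county.
estimates = postcensal_estimates(1990, 100_000, 0.01, 3)
print(estimates[1993])
```

Unlike an intercensal estimate, nothing anchors the far end of the series, so the error in an extrapolated figure can grow with each year beyond the base census.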

Population Projections

Population projection files contain projections that use one decennial census or postcensal estimates as a 
base and extrapolate to derive totals for many years into the future.

STF1 

Summary Tape File 1 (STF1) contains 100-percent population and housing counts.  Population items 
include age, race, sex, marital status, Hispanic origin,  household type, and household relationship.  
Population items are cross tabulated by age, race,  Hispanic origin, or sex.  Housing items include occupancy/vacancy status, tenure, units in structure, contract rent, meals included in rent, value, and 
number of rooms in housing unit.  Housing data are cross tabulated by race or Hispanic origin of 
householder or by tenure.

Geographic Area
 File  A    States, counties, county subdivisions, places, census tracts, block numbering areas (BNA's), 
                block groups (BG's). Also Alaska Native areas and State parts of American Indian areas

 File  B    States, counties, county subdivisions,  places, census tracts/BNA's, BG's, blocks. Also Alaska 
               Native areas and State parts of American Indian areas

 File  C    U.S., regions, divisions, States (including summaries such as urban and rural), counties, places of 
               10,000 or more inhabitants, county subdivisions of 10,000 or more inhabitants in selected States, 
               metropolitan areas (MA's), urbanized areas (UA's), American Indian and Alaska Native areas

 File  D    Congressional districts (CD's) of the 103rd Congress by State; and within each CD: counties, 
                places of 10,000 or more inhabitants, county subdivisions of 10,000 or more inhabitants in 
                selected States, Alaska Native areas, and American Indian areas

STF2  

STF2 contains over 2,100 cells/items of 100-percent population and housing counts and characteristics for each 
geographic area.  Each of the STF2 files includes a set of tabulations for the total population and 
separate presentations of tabulations by race and Hispanic origin.

Geographic Area
 File A    In MA's: counties, places of 10,000 or more inhabitants, and census tracts/BNA's.  In the 
              remainder of each State: counties,  places of 10,000 or more inhabitants,  and census 
              tracts/BNA's

 File B    States (including summaries such as urban and rural), counties, places of 1,000 or more 
              inhabitants, county subdivisions, State parts of American Indian areas, and Alaska Native areas

 File C    U.S., regions, divisions, States (including summaries such as urban and rural), counties, places of
              10,000 or more inhabitants, county subdivisions of 10,000 or more inhabitants in selected States,
              all county subdivisions in New England MA's, American Indian and Alaska Native areas, MA's, 
              UA's

STF3 

This file primarily contains sample data inflated to represent the total population.  In addition, the file contains 
100 percent counts and unweighted sample counts of persons and housing units.  STF3B and STF3C have identical tables and format except for the omission of 100 percent counts for population and housing in STF3B.

Geographic Area
 File A     States, counties, county subdivisions, places, census tracts/BNA's, BG's.  Also Alaska Native 
                areas and State parts of American Indian areas

 File  B     Five-digit ZIP Codes within each State

 File  C    U.S., regions, divisions, States, counties, places of 10,000 or more inhabitants, county subdivisions 
                of 10,000 or more inhabitants in selected States, American Indian and Alaska Native areas, MA's, 
                UA's

 File  D    CD's of the 103rd Congress by State; and within each CD:  counties, places of 10,000 or more 
                inhabitants, county subdivisions of 10,000 or more inhabitants in selected States

Public-Use Microdata Sample - PUMS

Public-use microdata samples are files that contain records for a sample of housing units, with information 
on the characteristics of each unit and the people in it.  In order to protect the confidentiality of respondents, 
the Bureau excludes identifying information from the records.  Within the limits of the sample size and 
geographic detail provided, these tapes permit users with special needs to prepare virtually any tabulations 
of the data they may desire.

This data is available in two Microdata Samples.

 File  5%    County groups, counties, county subdivisions, and places with 100,000 or more inhabitants

 File  1%    Metropolitan Areas (MA's) and other large areas (1990) with 100,000 or more 
                  inhabitants

Current Population Survey (CPS) -  Annual Demographic File

This file, also known as the Annual Demographic File, provides the usual monthly labor force data, but 
in addition, provides supplemental data on work experience, income, and migration.  Comprehensive 
information is given on the employment status, occupation, and industry of persons 14 years old and over.  Additional data for persons 15 years old and older are available concerning weeks worked and hours 
per week worked, reason not working full time, total income and income components, and residence on 
March 1.  This file also contains data covering seven non-cash income sources:  food stamps, school lunch program, employer-provided group health insurance plan, employer-provided pension plan, Medicaid, 
Medicare, CHAMPUS or military health care, and energy assistance.  Characteristics such as age, sex, 
race, household relationship, and Spanish origin are shown for each person in the household enumerated.


Classification of Codes

Area Resource File System

The Office of Data Analysis and Management (ODAM) sponsors the maintenance of a health resources information system for the Bureau of Health Professions (BHPr).  This system, the Area Resource File 
System (ARFS), is designed to be used by health analysts and other professionals seeking consistent,
current, and compatible information or conducting research on the nation's health care delivery system.  
The Area Resource File System consists of four major components:  (1) the basic Area Resource File 
(ARF), which is a massive county-specific data base; (2) a State/National Timeseries data base; (3) a
microcomputer data series containing demographic, health facilities, and health professions data extracts 
for use on microcomputers; and (4) internal components that include detailed hospital data files and over 
50 detailed disciplinary support files.

FIPS Geographic Coding Schemes Information/ FIPS PUB 55, Update

FIPS files provide codes for named populated places, primary county divisions (such as townships and 
census county divisions), Indian reservations, and several kinds of facilities.  They also provide compatible
codes for counties and county equivalents.  Areas of the U.S. covered in this guideline are all 50 States, 
the District of Columbia, and the outlying territories of American Samoa, Guam, Northern Mariana Islands, 
Puerto Rico, Trust Territory of the Pacific Islands, and U.S. Virgin Islands.  In addition to the entry name 
and its standard code, the Guideline provides other identifying and cross-referencing data for each file entry.  
A feature of the 9th Update is a complete listing of populated places from the files of the Geographic Names Information System (GNIS) of the U.S. Geological Survey.  Names matched with those in GNIS are 
identified by a special indicator.  Similarly, a complete listing of U.S. Post Offices is included, and names matched to Post Office names also are identified by a special sentinel.

Geocodes

The geocode (geographic code) files provide state and county geographic codes.  These codes also identify 
city and metropolitan areas. 

ICD-9 Full and Abbreviated (ABB) Titles

The International Classification of Diseases, 9th Revision (ICD-9) was published by the World Health Organization (WHO) and is used by member nations to collect and report mortality and morbidity statistics.  While the 9th revision of ICD is acceptable for mortality use, it was determined by the DHEW that ICD-9 
was not sufficiently detailed for the morbidity requirements of the health care systems in the United States.  
Thus, the Commission on Professional and Hospital Activities through its affiliated division, the Council on 
Clinical Classifications (CCC), prepared a clinical modification of ICD-9, the International Classification of Diseases, 9th Revision,  Clinical Modification (ICD-9-CM).  ICD-9-CM was put into use in the United
States during 1979 and is completely compatible with the WHO version of ICD-9.

Standard Metropolitan Statistical Area (SMSA)

The Office of Management and Budget began defining metropolitan areas in 1950 in order to standardize 
federal statistical reporting activities.  The OMB definition is based on three criteria: population, density, 
and commuting.  A metropolitan area must have a central core of a certain population size and a densely populated adjacent area with a certain percentage of the area's residents commuting into the city for 
employment.  Attendant to this is the expectation of a high degree of social and economic integration 
between the city and the adjacent area.

U.S. Postal Service County-Zip Code Cross Reference File

The County Cross-Reference file is a product which provides a relationship between ZIP+4 codes and 
Federal Information Processing Standard (FIPS) county codes.  The file allows users who have assigned 
ZIP+4 codes to their address files to obtain county data at the ZIP+4 level.

City/County Databooks

The County and City Data Book files are a machine-readable presentation of the primary data contents 
of the printed CCDB (the Data Book).  The file consists of a series of 3,042-character records, one for each 
state, county, SMSA, census region, census division, and federal administrative region.

Each record contains all the data for one area.  The areas are sorted in  tables as they are in the data book.  
Each record contains the full alphabetic name of the area as well as the standard geographic codes which
apply.  Data from all tables  have been organized  into a single record format with any one item having the 
same location throughout.
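
Because every item occupies the same position in every record, a record can be read by slicing fixed column ranges.  The Python sketch below shows the idea; the field names and column offsets are hypothetical and are not taken from the actual record layout, which is given in the dataset's documentation file.

```python
# Hypothetical layout: (field name, start, end), zero-based, end exclusive.
# These offsets are placeholders, not the real Data Book record layout.
LAYOUT = [
    ("state_fips", 0, 2),
    ("county_fips", 2, 5),
    ("area_name", 5, 35),
]


def parse_record(line, layout=LAYOUT):
    """Cut one fixed-width record into named fields, trimming padding."""
    return {name: line[start:end].strip() for name, start, end in layout}


# A made-up record built to match the hypothetical layout above.
sample = "13" + "121" + "Fulton County".ljust(30)
record = parse_record(sample)
print(record)
```

Keeping the layout in one table means that a corrected offset only has to be changed in one place.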

Master Area Reference File - MARF 2

The Master Area Reference File (MARF) is the 1980 census counterpart of the Master Enumeration 
District List (MEDList) prepared for the 1970 census.  It links State or State equivalent, county or county equivalent, minor civil division (MCD)/census county division (CCD), and place names with their
respective geographic codes.  It is also an abbreviated summary file containing selected population and 
housing unit counts.

The second version of MARF (MARF 2) has the same geographic coverage as the first MARF and 
includes the following additional information:  FIPS place codes; latitude and longitude coordinates for geographic areas down to the BG/ED level; land area in square miles for geographic areas down to the 
level of places or minor civil divisions (11 selected States) with a population of 2,500 or more; total 
population and housing count estimates based on sample returns; and per capita income for all geographic 
areas included in the file.

Mortality and Morbidity

Compressed Mortality File

The Compressed Mortality File (CMF) is a county-level national mortality and population data base
spanning the years 1968-87.  Differential mortality trends can be easily and efficiently examined because
of the compact nature of this file.  The mortality data base of the CMF is derived from the U.S. micro-data
death records for this period.  The variables included on the condensed file are:  1) county of residence,
2) year of death, 3) race (white, black, other for 1979-87; white, all other for 1968-78), 4) sex, 5) age
group at death (15 age groups), and 6) underlying cause of death (4-digit ICD code).  The number of
records was reduced by counting records that were identical with respect to underlying cause of death,
age group at death, race, sex, and county of residence and then adding a count field.  The population data
base of the 1968-78 CMF is derived from annual estimates for each U.S. county by 5-year age groups,
race, and sex.  These estimates reflect adjustments based on the 1980 Census and were prepared by the
Bureau of the Census with modifications by NCHS.  The population data base of the 1979-87 CMF is
derived from intercensal estimates prepared by Richard Irwin of Demo-Detail, with modifications by
NCHS.  To permit the calculation of infant mortality rates, NCHS live-birth data were substituted for the
estimates of the population under one year of age.
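
The record-reduction step described above (collapsing identical records and adding a count field) can be sketched in Python as follows.  The field names and sample values are hypothetical, and grouping on all six listed variables, including year of death, is an assumption of this example.

```python
from collections import Counter

# Hypothetical field names for the six variables listed above.
KEYS = ("county", "year", "race", "sex", "age_group", "cause")


def compress(records, keys=KEYS):
    """Collapse records that agree on every key into one row with a count."""
    counts = Counter(tuple(rec[k] for k in keys) for rec in records)
    return [dict(zip(keys, combo), count=n) for combo, n in counts.items()]


# Made-up death records: the first two are identical on every key.
deaths = [
    {"county": "13121", "year": 1985, "race": "white", "sex": "F",
     "age_group": "65-74", "cause": "4100"},
    {"county": "13121", "year": 1985, "race": "white", "sex": "F",
     "age_group": "65-74", "cause": "4100"},
    {"county": "13121", "year": 1985, "race": "black", "sex": "M",
     "age_group": "55-64", "cause": "1629"},
]
compressed = compress(deaths)
print(len(compressed))
```

The counts always sum to the original number of records, so death totals computed from the compressed rows match totals computed from the detail records.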

Fetal Death 

The Fetal Death data file is maintained by calendar year.  The information on fetal deaths was abstracted
from the Report of Fetal Death forms received from the States by the National Center for Health Statistics (NCHS) and this file contains a record for each form received.  Data from New York, excluding  New York 
City, were submitted in machine readable form.  All other data were coded and keyed by the  U.S. Bureau
of the Census.

Mortality Follow-Back Survey

The National Mortality Followback Survey (NMFS) is a national sample of approximately 1 percent of the
U.S. resident deaths of persons 25 years of age or more.  Information on about 18,500 deaths occurring in
1986 was obtained by mail questionnaires, telephone or personal interviews of the next-of-kin of the
decedent or others familiar with the decedent's lifestyle.  The survey addresses
(1) socioeconomic differentials in mortality, (2) the association between risk factors and mortality,
(3) health care provided in the last year of life, and (4) the reliability of certain items reported on the death certificate.

Mortality Detail Data (1962-67)

Mortality (underlying cause of death) data include all deaths occurring within the United States.  Deaths 
of U.S. civilians and deaths of members of  the Armed Forces occurring outside the United States are not included.  Data are obtained from certificates filed for deaths occurring in each State.  Causes of death for 1962-63 were coded according to the International Statistical Classification of Diseases, 1955 Revision, 
Volume I.  

Mortality Detail Data

Vital statistics data relating to mortality provide demographic and cause-of-death data for deaths occurring 
during the calendar year.  The data are based on information abstracted from all death certificates filed in
vital statistics offices of each State and the District of Columbia.  Data were obtained from all certificates 
for 1968-71 and for 1973-78.  Data were obtained from a 50-percent sample of certificates for 1972.  
Causes of death for 1968-78 were coded according to the Eighth Revision of the International Classification 
of Diseases, Adapted for use in the United States. 

Multiple Cause of Death Data (1968-78)

Mortality (multiple cause of death) data include all deaths occurring within the United States.  Deaths of
U.S. civilians and deaths of members of the Armed Forces occurring outside the United States are not
included.  Data are obtained from certificates filed for deaths occurring in each State.  Data were obtained
from all certificates for 1968-71 and 1973-78.  Data were obtained from a 50 percent sample of certificates
for 1972.  A detailed data tape file is available for each year.  Death certificate numbers are not on the tapes.  Causes of death for 1968-78 were coded according to the Eighth Revision, International Classification of Diseases, Adapted.  

Multiple Cause of Death Data 

Data were obtained from all certificates for 1979-80, and 1983-88.  Multiple cause data for 1981 and 
1982 were obtained from a 50 percent sample of certificates from 19 registration areas, and for other 
States, data were obtained from all certificates.  The user must be aware that the multiple cause files and 
the underlying cause files for 1981 and 1982 differ in that underlying cause files were processed on a 100 
percent basis.  A detailed data tape file is available for each year.  Death certificate numbers are not on 
the tapes.  Causes of death for 1979-  were coded according to the Ninth Revision, International 
Classification of Diseases.  Data items contained in the files include:

Subset of Mortality Data

These data files are subsets of the Mortality (underlying cause of death) data from  NCHS.  The data include 
all deaths occurring within the United States.  (Deaths of U.S. civilians and deaths of members of the Armed Forces occurring outside the United States are not included.)  The data were obtained from certificates filed 
for deaths occurring in each State.  Data were obtained from a 50-percent sample of certificates for 1972.  Causes of death for 1968-78 were coded according to the Eighth Revision, International Classification of Diseases, Adapted.  Causes of death for 1979-87 were coded 
according to the Ninth Revision, International Classification of Diseases.

Natality Data

Linked Birth/Infant Death Data, Birth Cohorts

The Linked Birth/Infant Death Data Set consists of two separate data files.  The first file includes linked 
records of live births and infant deaths for the birth cohort -- also referred to as the numerator file.  The 
second file is the live birth file -- referred to as the denominator file.  The files are offered as a numerator/ denominator data set to give users the means to compute infant mortality rates.  The linked file is comprised 
of deaths to infants who died before their first birthday.  Infant death records were extracted from the NCHS mortality statistics files.  Linked birth records were extracted from a file that contained the NCHS natality statistical file,  a small number of late-filed birth certificates, and certificates from selected States that were 
needed to match to an infant death record.  This file is not identical with the NCHS natality statistical file.
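
The numerator/denominator arrangement maps directly onto the usual infant mortality rate, infant deaths per 1,000 live births.  The Python sketch below assumes the two counts have already been tallied from the numerator and denominator files; the function name and the figures are made up for illustration.

```python
def infant_mortality_rate(infant_deaths, live_births):
    """Return infant deaths per 1,000 live births for one birth cohort."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return 1000.0 * infant_deaths / live_births


# Made-up cohort totals: deaths from the numerator file,
# births from the denominator file.
rate = infant_mortality_rate(infant_deaths=38_000, live_births=3_800_000)
print(rate)
```

The same function can be applied within any subgroup, provided the deaths and the births are restricted to that subgroup in both files.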

Natality Detail Data

The Natality data include all births occurring within the U.S.  The data are obtained from  certificates filed 
for births occurring in each State.  Data were obtained from a 50-percent sample of certificates from 1969-1971.  Starting in 1972 all records were included for States that participated in the Vital Statistics Cooperative Program (VSCP).  The number of States participating in the VSCP increased from 4 in 1972 
to 46 in 1984; beginning in 1985, all States and the District of Columbia participated.  Each record in the  Natality dataset contains a weight field which is designed to inflate tabular totals to the national birth figures.
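
Tabulations from a file that mixes sampled and fully reported records should sum the weight field rather than count records.  The Python sketch below illustrates this; the field names and the weight values (2 for a record from a 50-percent sample, 1 for a fully reported record) are assumptions made for the example.

```python
from collections import defaultdict


def weighted_births_by_state(records):
    """Sum the weight field within each State instead of counting records."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec["state"]] += rec["weight"]
    return dict(totals)


# Made-up records with hypothetical field names and weights.
sample_records = [
    {"state": "GA", "weight": 2},  # from a 50-percent sample
    {"state": "GA", "weight": 2},
    {"state": "FL", "weight": 1},  # from a fully reported State
]
totals = weighted_births_by_state(sample_records)
print(totals)
```

Counting records instead of summing weights would understate births in any area and year that was sampled.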

Natality Detail Data, Subset

These data files are subsets of the Detail Natality data from NCHS.   The Natality data include all births 
occurring within the U.S.  The data are obtained from certificates filed for births occurring in each State.  
Data were obtained from a 
50-percent sample of certificates from 1969-1971.  Starting in 1972 all records were included for States 
that participated in the Vital Statistics Cooperative Program (VSCP).  The number of States participating 
in the VSCP increased from 4 in 1972 to 46 in 1984; beginning in 1985, all States and the District of 
Columbia participated.


Registries

Surveillance, Epidemiology and End Results (SEER)

A continuing project of the National Cancer Institute (NCI), the SEER Program collects cancer data on a routine basis from designated population-based cancer registries in various areas of the country.  Trends in cancer incidence, mortality, and patient survival in the
United States, as well as many other studies, are derived from this data bank.  The geographic areas 
comprising the SEER Program's data base represent an estimated 9.6% of the United States population.  
By the end of 1988, the data base contained information on 1.5 million cases diagnosed since 1973; approximately 120,000 new cases are added yearly.

Surveillance, Epidemiology and End Results (SEER) Population Data

This population file was constructed at the National Cancer Institute (NCI) for the analysis of SEER and 
other databases.  It contains population data for a specified set of years and geographic areas.  The source 
of these population data may be the actual counts provided by the U.S. Census Bureau for one of the decennial 
census years, 1970 or 1980; estimates for intercensal years or special racial groups provided by the U.S. 
Census Bureau; or estimates computed by NCI.

Health Surveys

Behavioral Risk Factor Surveillance System

The Behavioral Risk Factor Surveillance System (BRFSS) is an on-going random-digit-dialed telephone 
survey used to determine the prevalence among adults 18 and older of behaviors and practices--such as 
cigarette smoking,  seat belt use, blood cholesterol screening, high blood pressure control,  physical activity, weight control, alcohol use, and drinking and driving--which are related to the leading causes of death in 
the US.  To maximize comparability, methods and questionnaires are standardized across participating states.

Hispanic Health and Nutrition Examination Survey

Hispanic HANES (HHANES) was conducted on a nationwide probability sample of approximately 
16,000 persons, aged 6 months-74 years, in the non-institutionalized population of eligible Hispanics:  
Mexican-Americans in the southwest; Puerto Ricans in the New York area (defined as selected counties 
in New York, New Jersey, and Connecticut); and Cuban-Americans in Dade County (Miami), Florida.  
Of this sample, 11,653 persons were examined:  7,462 Mexican-Americans, 1,357 Cuban-Americans, 
and 2,834 Puerto Ricans.  Examinations were conducted from July 1982 through December 1984.  
Hispanics were included in past health and nutrition examinations, but not in sufficient numbers to produce estimates of the health of Hispanics in general nor specific data from Puerto Ricans, Mexican-Americans, 
or Cuban-Americans.  All examinees had a medical history, dental exam, body measurements, a dietary
interview, and numerous laboratory tests on blood and urine specimens.  Children six and over had vision 
and hearing tests.  Most of the other specialized tests, such as gallbladder ultrasound, glucose tolerance,
electrocardiogram, and liver disease tests, were given to a selection of those 20 years or older.

National Health and Nutrition Examination Survey I

The first Health and Nutrition Examination Survey (HANES I), conducted during the period 1971-75, 
was designed to measure the nutritional status and health of the U.S. population ages 1-74 years and to 
obtain more detailed information on the health status and medical care needs of adults ages 25-74 years
in the civilian non-institutionalized population.  The information obtained during the course of the survey 
consisted of:  a detailed dietary interview;  general medical histories and detailed histories of respiratory 
disease,  cardiovascular disease, and arthritis; health care needs and general well-being questionnaire; a 
general medical examination; dental, dermatological, and ophthalmological examination; anthropometry; 
vision tests and speech tests; x-ray of the hands, knees, hips and chest; blood and urinary laboratory tests; electrocardiogram; spirometry and pulmonary diffusion tests; and tap water samples.  There were a total of 32,331 sample persons of which 31,973 were interviewed and 23,808 were examined.

National Health & Nutrition Examination Survey I, Follow-up Data

The first Health and Nutrition Examination Survey (HANES I), conducted during the period 1971-75, was designed to measure the nutritional status and health of the U.S. population ages 1-74 years and to obtain
more detailed information on the health status and medical care needs of adults ages 25-74 years in the
civilian non-institutionalized population.  The 1982-84 HANES I Epidemiologic Follow-up Study population
is comprised of the 14,407 persons aged 25-74 years at the time of the HANES I survey.  Tracing was successfully completed on 93 percent of the cohort.  Personal interviews including weight, pulse, and blood pressure measurements were conducted with traced, surviving subjects.  Interviews with proxy respondents
were conducted if the subject was deceased or incapacitated.

National Health and Nutrition Examination Survey II

The second Health and Nutrition Examination Survey (NHANES II), conducted during the period 1976-80,
was designed to measure and monitor the nutritional status and health of the U.S. population ages 6 months through 74 years.  A similar survey, NHANES I, was conducted from 1971-1975.  During NHANES II,
data were collected by means of a household questionnaire, medical histories,  dietary questionnaires, a
physical examination, spirometry trials, electrocardiograms, body measurements,  audiometry, speech and 
allergy tests, x-rays, a medication/vitamin usage questionnaire, a behavior questionnaire and laboratory 
analyses of blood and urine samples.  There were a total of 27,801 sample persons of which 25,286 were interviewed and 20,322 were examined.

National Health Interview Survey Data, Core Data

The National Health Interview Survey is a continuing nationwide survey of the U.S. civilian
non-institutionalized population conducted in households.  Each week a probability sample of households is interviewed by trained personnel of the U.S. Bureau of the Census to obtain information about the health and
other characteristics of each living member of the sample household.  During a year the sample is composed
of 36,000 to 46,000 households including 92,000 to 135,000 people, depending on the year.

Information is obtained on the number of restricted activity days, bed days, work or school loss days, and
all physician visits occurring during the 2-week period prior to the week of the interview.  Data are also
obtained on the acute and chronic conditions that are responsible for these days or visits.  Respondents are
asked about long-term limitation of activity and chronic conditions related to this disability.  All conditions are coded according to the International Classification of Diseases, using the limited diagnostic detail available
from a household respondent.  Data are obtained on all hospital episodes during the prior 12 months,
including length of stay and whether or not surgery was performed.

National Health Interview Survey Summary Data (1969-81)

This is a summary file containing information from the NHIS Public Use Tapes from 1969 through 1981. 
The file contains information on all persons interviewed for NHIS who were 30 years of age or older at
the time of interview.  The file includes variables of specific substantive interest  (e.g., ethnicity) for which information was available for at least 8 years.  Variables were selected only once for this file, though parallel entries existed on the input files that were merged to produce this file.  Variables that related only to data processing and collection such as interview number, etc. were excluded.  The file includes conditions that
were defined as chronic by NCHS and caused limitation of activity either as the primary cause or as a
secondary cause.

National Health Interview Survey Supplement Data

Current health topics are added each year to the National Health Interview Survey's (NHIS) basic
questionnaire.  The current health topics generally change each year.  These changes facilitate a
response to the need for population-based data on current or emerging health issues and coverage
of a wide variety of topics.  Some of the topics include:

 Adoption
 Aging
 AIDS
 Alcohol
 Cancer Control/Epidemiology
 Child Health
 Dental Health
 Diabetes
 Health Insurance
 Immunization
 Longest Job Held
 Mental Health
 Occupational Health
 Polio
 Smoking
 
National Ambulatory Medical Care Survey

The National Ambulatory Medical Care Survey (NAMCS) is a nationwide survey designed to meet the
needs for objective, reliable information about the provision and use of ambulatory medical care services
in the United States.  Findings are based on a sample representative of all ambulatory office visits to
physicians in the United States who are engaged in patient care in an office setting; physicians who are not engaged in patient care in an office setting, physicians in government service, and physicians in the specialties
of anesthesiology, pathology, and radiology are excluded from the survey.  Specially trained interviewers
visited the physicians prior to their participation in the survey, provided them with survey materials, and
thoroughly instructed each physician and staff member in the methods and definitions to be used.  During a randomly assigned 7-day period, data for a systematic random sample of visits were recorded by the
physicians or their staff on an encounter form provided for that purpose.  Data were obtained on selected demographic characteristics of patients, several clinical aspects of the visit, including medications (if any),
and physician specialty and type of practice.

National Hospital Discharge Survey

The National Hospital Discharge Survey is conducted by the National Center for Health Statistics.  It
provides a continuous sample of hospital discharge records, collecting medical and demographic
information for calculating statistics on hospital utilization.  The survey consists of data abstracted from
the face sheets of the medical records for sampled inpatients discharged from a national sample of
nonfederal short-stay hospitals, located in the 50 States and the District of Columbia.

National Maternal and Infant Health Survey

The National Maternal and Infant Health Survey (NMIHS) was conducted by the National Center for
Health Statistics to study factors related to poor pregnancy outcome, such as adequacy of prenatal care; inadequate and excessive weight gain during pregnancy; maternal smoking, drinking, and drug use; and
pregnancy and delivery complications.

National Medical Care Utilization and Expenditure Survey

The National Medical Care Utilization and Expenditure Survey (NMCUES) was a panel survey designed
to collect data about the U.S. civilian non-institutionalized population in 1980.  Information was obtained
on health, access to and use of medical services, associated charges and sources of payment, and health insurance coverage.  NMCUES consisted of three survey components.  The National Household
Component comprised about 6,000 randomly selected households that were interviewed five times during
14 months in 1980-81.  The State Medicaid Household Component comprised households drawn from Medicaid files in California, Michigan, New York,
and Texas (1,000 households in each State).  Each household was interviewed five times during 14 months in 1980-81.  The Administrative Records Component was used to obtain information on program eligibility and payments for persons receiving Medicare and Medicaid.  The NMCUES Public Use Data
Files contain only respondent data from the National Household survey.  These data are from a sample of
17,123 persons representing the civilian non-institutionalized population of the United States.

National Medical Expenditures Survey

The National Medical Expenditures Survey (NMES) provides national estimates of health status and estimates of
insurance coverage and the use of services, expenditures, and sources of payment for the period from January 1 to December 31, 1987.
The reports of health care expenditures and insurance coverage obtained in the household surveys are
supplemented by additional surveys; most important among these are the Health Insurance Plan Survey
of employers and insurers of consenting household survey respondents, and the Medical Provider Survey
of physicians, osteopaths, and inpatient and outpatient facilities, including home health care agencies, that
were reported as providing services to any member of the non-institutionalized population sample.  A
Medicare Records component is used to provide a record check on 1987 eligibility status and claims
information of all Medicare beneficiaries, including those in the institutional population.

National Nursing Home Survey

The purpose of the National Nursing Home Survey (NNHS) is to collect baseline and trend statistics
about nursing facilities, their services, residents, discharges, and staff.  The resulting published statistics
will describe the Nation's nursing facilities and the health status of their residents.  These data are used for studying the utilization of nursing facilities, for supporting research directed at finding effective means for
treatment of long-term health problems, and for setting national policies and priorities.  Data for the NNHS 
were collected from a nationally representative sample of 1,220 nursing and related care homes using a combination of personal interview and self-enumeration techniques.

National Survey of Ambulatory Surgery

The National Survey of Ambulatory Surgery was undertaken to obtain information about the use of 
ambulatory surgery.  Ambulatory, or outpatient, surgery has increased in the United States since the early
1980s.  Two major reasons for this increase were advances in medical technology and cost containment initiatives.  This survey, conducted by the National Center for Health Statistics (NCHS) and implemented in 1994, covers ambulatory surgery procedures performed in hospitals and free-standing ambulatory surgery centers in the United States.  A brief description of the survey design and data collection procedures is given below.  A more detailed description of the survey design, data collection procedures, and the estimation
process will be published in a forthcoming report from the NCHS.

Youth Risk Behavior Survey

The Youth Risk Behavior Survey (YRBS) was conducted as a follow-back to the National Center for
Health Statistics' 1992 National Health Interview Survey.  The YRBS was sponsored by the Division of Adolescent and School Health, National Center for Chronic Disease Prevention and Health Promotion.
This survey is one piece of a larger system of research, the Youth Risk Surveillance System, that was 
developed to monitor the major risk behaviors of American youth.


General Data Sets

National Children and Youth Fitness Study, Phase I / Phase II

The National Children and Youth Fitness Study (NCYFS) was conducted under the auspices of the
Office of Disease Prevention and Health Promotion of the U.S. Public Health Service to assist in
addressing the health objectives for the nation related to fitness and exercise among youth.  The
objectives are outlined in the 1980 report, Promoting Health/Preventing Disease:  Objectives for the
Nation.  The NCYFS generated normative data by grade/sex and age/sex for nine fitness measures,
described the physical activity patterns of youth and provided a preliminary analysis of the relationships
between exercise and fitness.  Among the health promotion objectives motivating the study was the
development of a database and data systems that would permit definition of the relationships between
exercise patterns and health and the tracking of patterns of participation in physical activity.  The NCYFS database was designed with the health promotion objectives related to monitoring systems in mind.  The
data file contains 8,800 observations representing a national probability sample of fifth through twelfth
grade boys and girls.

The NCYFS II, which dealt with 6- to 9-year-olds, was the logical completion of NCYFS I, which was
limited to 10- to 18-year-olds.  The NCYFS II generated normative data by grade/sex and age/sex for
nine fitness measures, described the physical activity patterns of youth, and provided a preliminary analysis
of the relationships between exercise and fitness.  Among the health promotion objectives motivating the
study was the development of a database and data systems that would permit definition of the relationships between exercise patterns
and health and the tracking of patterns of participation in physical activity.  The NCYFS II database was designed with the health promotion objectives related to monitoring systems in mind.  The data file contains
4,678 observations representing a national probability sample of 6- to 9-year-old boys and girls.

National Survey of Family Growth 4

The National Survey of Family Growth (NSFG), Cycle IV, was conducted by Westat, Inc., under
contract to the National Center for Health Statistics (NCHS).  In-person interviews were conducted
with women 15-44 years of age, of all marital statuses.  The data from the NSFG are used by the
National Center for Health Statistics as the basis for a series of reports on fertility, family formation, contraception, and related issues.  In addition, agencies that support the NSFG use the data for their
own research programs.

North Carolina Medical Examiner Deaths

The North Carolina Medical Examiner (NCME) Deaths data include all deaths investigated by the North Carolina Medical Examiner's Office from 1972 through 1984.  The data have been extracted from the
original NCME tape files and put into a SAS file for CDC use.

U.S. Department of Agriculture

This is the most recent data base of food composition data.  It contains only data for the most recent
update sections to Agricultural Handbook No. 8, sections 17, 20, and 21, and values that have been
updated since their release in sections contained on an earlier tape.  The format is the same as that
used in the USDA Nutrient Data Base for Standard Reference.  The data file format and a list of food
descriptions and item numbers, which may be printed for use as a coding manual, appear as the first
data set on the tape.

A file to link foods already existing under an old code to the new code assigned in the updated section for
the same item has been included as the third data set on the tape.  The fourth file on the tape is a data set
of food codes that have been deleted from Release 7 of the Standard Reference.  The fifth file on the tape
is a data set of changes that have been made to Release 7 of the Standard Reference file.  The changes
are in the same format as the regular records.  The sixth file on the tape is a set of nutrient additions to be
made to the Release 7 version of the Standard Reference file.  These values are additions to the existing file
and not changes to the values that are there.  The format used is the same as that used for the data records
in the full file, which allows one to merge these records with those already there.  There are 1,086 records
in this file.
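
The way the link, deletion, change, and addition files relate to the Standard Reference file can be illustrated with a minimal Python sketch.  It is not the USDA procedure: the tape's record layout is not reproduced here, so the sketch assumes each file has already been read into a dictionary keyed by food code, and the function name, the nutrient names, the food codes, the values, and the order in which the files are applied are all hypothetical.  It also assumes that linked foods are re-keyed to their new codes.

```python
from typing import Dict, Iterable

Record = Dict[str, float]   # nutrient name -> value (hypothetical layout)
Table = Dict[str, Record]   # food code -> nutrient record


def apply_updates(reference: Table,
                  links: Dict[str, str],
                  deleted: Iterable[str],
                  changes: Table,
                  additions: Table) -> Table:
    """Return a new table with the update files applied to `reference`."""
    # Work on a copy so the original reference table is left untouched.
    working: Table = {code: dict(rec) for code, rec in reference.items()}

    # Deletion file: drop food codes removed from the reference file.
    for code in deleted:
        working.pop(code, None)

    # Link file: move records from an old code to its new code (assumption).
    working = {links.get(code, code): rec for code, rec in working.items()}

    # Change file: overwrite values that are already present.
    for code, rec in changes.items():
        working.setdefault(code, {}).update(rec)

    # Addition file: only fill in missing nutrients; existing values stay.
    for code, rec in additions.items():
        target = working.setdefault(code, {})
        for nutrient, value in rec.items():
            target.setdefault(nutrient, value)

    return working


# Made-up example data, for illustration only.
reference = {
    "01001": {"protein": 0.85, "fat": 81.1},
    "01002": {"protein": 1.0},
    "09999": {"protein": 2.0},
}
result = apply_updates(
    reference,
    links={"01002": "17002"},
    deleted=["09999"],
    changes={"01001": {"fat": 80.0}},
    additions={"01001": {"fat": 99.0, "zinc": 0.09}},
)
print(result)
```

In this sketch a change replaces the stored fat value, while the addition for the same nutrient is ignored because a value is already there.  That mirrors the distinction drawn above between changes and additions.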