U.S. National Institutes of Health National Cancer Institute

Race Recode Changes
For Data through 2005 (November 2007 Submission)

The algorithms for creating the race recode variables in the SEER incidence and US mortality data were modified starting with the November 2005 submission of data.  All of the variable names within the SEER*Stat and SEER*Prep software were modified for clarity and to avoid compatibility issues between submissions of data.

For incidence and mortality rate calculations, we recoded detailed race information into four major categories in order to make them compatible with available annual population estimates used as denominators for the rates. These categories are:

  • White
  • Black
  • American Indian/Alaskan Native
  • Asian or Pacific Islander

The available race codes for the fields in the underlying incidence and mortality data have changed over the years. For some years, both the SEER incidence and NCHS mortality data included a code for “all other races” even though every race already had its own code, so the “all other races” code was not needed. Nevertheless, some cases/deaths were coded to this category. Previously, when creating the race recodes, all cases/deaths with the “all other races” value were treated as Asian or Pacific Islander. These cases/deaths are now assigned to a new race code in all of the race recodes. This code is labeled “Other – unspecified (1991+)” for incidence data and “Other – unspecified (1978-1991)” for mortality data. This race category has no associated populations and is treated similarly to "unknown" race in most cases.

To reproduce our previous methodology, group the "Other – unspecified (year range)" category with the category that contains Asian or Pacific Islander in the race recode you are working with.
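As a rough illustration of that regrouping, the Python sketch below folds the "Other – unspecified" counts back into a target category. It is not SEER*Stat functionality: the function names, the string labels, and the idea of holding counts in a dictionary are assumptions made for this example, and the default target assumes a race recode in which Asian or Pacific Islander is its own category.

```python
from collections import Counter
from typing import Dict


def regroup_other_unspecified(category: str,
                              target: str = "Asian or Pacific Islander") -> str:
    """Return `target` for any 'Other – unspecified (year range)' label.

    All other category labels are returned unchanged. The label matching
    is a hypothetical convention for this sketch.
    """
    if category.startswith("Other") and "unspecified" in category:
        return target
    return category


def regroup_counts(counts: Dict[str, int],
                   target: str = "Asian or Pacific Islander") -> Dict[str, int]:
    """Merge 'Other – unspecified' counts into `target`, as in the previous method."""
    merged = Counter()
    for category, n in counts.items():
        merged[regroup_other_unspecified(category, target)] += n
    return dict(merged)
```

For a race recode that does not list Asian or Pacific Islander separately, pass the label of the category that contains it as `target`.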

The “Race/ethnicity” variable used to create the race recodes in the SEER incidence data has been revised for the data through 2005 (November 2007 submission). This field is created from the Race1 and Indian Health Service (IHS) Link Recode variables. If Race1 is white, unknown, or other and the IHS Link Recode is positive, then Race/ethnicity is set to American Indian/Alaskan Native; otherwise, Race/ethnicity is set to the Race1 value. The previous method is described at: http://seer.cancer.gov/seerstat/variables/seer/yr1973_2004/race_ethnicity/
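The Race1 and IHS Link Recode rule can be sketched in Python as follows. This is a minimal illustration rather than SEER's implementation: the function name, the string labels standing in for Race1 values, and the boolean `ihs_link_positive` flag are all hypothetical, since the actual fields use coded values.

```python
# Hypothetical string labels; the real Race1 and IHS Link Recode fields are coded.
OVERRIDABLE_RACE1 = {"White", "Unknown", "Other"}
AIAN = "American Indian/Alaskan Native"


def race_ethnicity(race1: str, ihs_link_positive: bool) -> str:
    """Derive Race/ethnicity from Race1 and the IHS Link Recode.

    A positive IHS link overrides Race1 only when Race1 is white,
    unknown, or other; every other Race1 value is kept unchanged.
    """
    if ihs_link_positive and race1 in OVERRIDABLE_RACE1:
        return AIAN
    return race1
```

Under these assumed labels, `race_ethnicity("White", True)` yields American Indian/Alaskan Native, while `race_ethnicity("Black", True)` stays Black because a positive link does not override a specific Race1 value.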

Spanish-Hispanic-Latino Ethnicity

Hispanic ethnicity is not mutually exclusive of race: Hispanics may be White, Black, Asian/Pacific Islander, or American Indian/Alaska Native.

Incidence data for Hispanics are based on the NAACCR Hispanic Identification Algorithm (NHIA). When producing statistics for Hispanic ethnicity from SEER incidence data, we exclude cases from the Alaska Native Registry and Kentucky.

For state exclusions that SEER uses when producing Hispanic (and non-Hispanic) mortality rates, see Policy for Calculating Hispanic Mortality.

Combining Race and Ethnicity in Rate Analyses

Some SEER incidence and mortality databases in SEER*Stat are now linked to both race (White, Black, AI/AN, API) and Hispanic origin within the same database. While this makes it possible to produce rates for all 8 combinations of these variables (4 races by Hispanic/non-Hispanic), the SEER Program does not recommend using all of the combinations. SEER reports Hispanic/non-Hispanic rates only for all races combined, white, and non-white.
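The difference between the 8 available combinations and the groupings SEER reports can be made concrete with a short Python sketch. The labels and the `reporting_groups` helper are hypothetical, and the sketch assumes that "non-white" means the Black, AI/AN, and API categories combined, which the text above does not spell out.

```python
from itertools import product

RACES = ["White", "Black", "AI/AN", "API"]
ORIGINS = ["Hispanic", "Non-Hispanic"]

# Every race by Hispanic-origin pairing the linked databases make possible.
ALL_COMBINATIONS = list(product(RACES, ORIGINS))  # 4 x 2 = 8 pairs

# The race groupings for which SEER reports Hispanic/non-Hispanic rates.
REPORTED_RACE_GROUPS = ["All races", "White", "Non-white"]


def reporting_groups(race: str) -> list:
    """Return the reported race groupings a case of `race` contributes to.

    Assumption for this sketch: any of the four races other than White
    counts as 'Non-white'.
    """
    if race not in RACES:
        raise ValueError(f"unexpected race label: {race!r}")
    return ["All races", "White" if race == "White" else "Non-white"]
```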

American Indian/Alaskan Native Statistics

When producing statistics for American Indians/Alaska Natives from SEER incidence data, SEER frequently includes only cases in a Contract Health Service Delivery Area (CHSDA).

The following spreadsheet has the CHSDA 2006 variable definition used in SEER*Stat: [MS Excel File] [PDF File]