U.S. National Institutes of Health National Cancer Institute

Race Recode Changes
For Data through 2003 (November 2005 Submission)

The algorithms for creating the race recode variables in the SEER incidence and US mortality data were modified starting with the November 2005 submission of data.  All of the variable names within the SEER*Stat and SEER*Prep software have been modified for clarity and to avoid compatibility issues between submissions of data.

For incidence and mortality rate calculations, we have recoded detailed race information into four major categories in order to make them compatible with available annual population estimates used as denominators for the rates. These categories are:

  • White
  • Black
  • American Indian/Alaskan Native
  • Asian or Pacific Islander

The available race codes for the fields in the underlying incidence and mortality data have changed over the years.  For some years, both the SEER incidence and NCHS mortality data have had a code available for “all other races”, when in fact every race was already represented, and therefore the “all other races” code was not needed.  Nevertheless, some cases/deaths were coded to this category.  In the past, when creating the race recodes, all cases/deaths with the “all other races” value were treated as Asian or Pacific Islander.  These cases/deaths are now coded into a new race code in all of the race recodes.  This code is labeled “Other – unspecified (1991+)” for incidence data and “Other – unspecified (1978-1991)” for mortality data.  This race category does not have associated populations and is treated similarly to “unknown” race in most cases.

If you are interested in reproducing our previous methodology, you can simply group the "Other – unspecified (year range)" category with the appropriate category (depending on the race recode you are working with).
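As a rough illustration of that regrouping, the sketch below folds the "Other – unspecified" counts back into another category. It assumes the four-category recode described above, where the previous methodology placed these cases/deaths in Asian or Pacific Islander. The function name, the dictionary layout, and the counts are hypothetical and are not part of SEER*Stat.

```python
def regroup_other_unspecified(counts, other_label,
                              target_label="Asian or Pacific Islander"):
    """Return a copy of `counts` with `other_label` merged into `target_label`.

    `counts` maps a race recode label to a number of cases/deaths.
    The default target is an assumption that holds only for the
    four-category recode; other recodes may need a different target.
    """
    merged = {label: n for label, n in counts.items() if label != other_label}
    merged[target_label] = merged.get(target_label, 0) + counts.get(other_label, 0)
    return merged


# Made-up counts, for illustration only.
example_counts = {
    "White": 10,
    "Asian or Pacific Islander": 3,
    "Other – unspecified (1991+)": 2,
}
previous_style = regroup_other_unspecified(
    example_counts, "Other – unspecified (1991+)"
)
```

The original dictionary is left unchanged, so rates under both the current and the previous grouping can be compared side by side.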

The “Race/ethnicity” variable used to create the race recodes in the SEER incidence data has been revised.  Previously, this field was simply Race1 from the NAACCR file format.  Now this field is created from Race1, Race2, and the Indian Health Service (IHS) Link variable.  Race/ethnicity starts as Race1.  If Race1 is white and Race2 is a specified non-white race, then the value from Race2 is used.  After this check, if Race/ethnicity is still white and there is a positive IHS Link, then Race/ethnicity is set to American Indian/Alaskan Native.
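The derivation order described above can be sketched as follows. This is a minimal illustration, not the SEER implementation: it uses text labels rather than NAACCR codes, the function and constant names are hypothetical, and the set of values treated as "not a specified race" is an assumption.

```python
WHITE = "White"
AIAN = "American Indian/Alaskan Native"

# Assumption: these Race2 values do not count as a "specified" race.
UNSPECIFIED = {None, "", "Unknown", "Other - unspecified"}


def derive_race_ethnicity(race1, race2, ihs_link_positive):
    """Derive Race/ethnicity from Race1, Race2, and the IHS Link result."""
    # Step 1: Race/ethnicity starts as Race1.
    value = race1

    # Step 2: if Race1 is white and Race2 is a specified non-white race,
    # use the value from Race2.
    if value == WHITE and race2 not in UNSPECIFIED and race2 != WHITE:
        value = race2

    # Step 3: if the value is still white and the IHS Link is positive,
    # set it to American Indian/Alaskan Native.
    if value == WHITE and ihs_link_positive:
        value = AIAN

    return value
```

Because the Race2 check runs first, a positive IHS Link changes the result only when the value is still white after that check. For example, `derive_race_ethnicity("White", "Black", True)` keeps the Race2 value rather than switching to American Indian/Alaskan Native.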

Spanish-Hispanic-Latino Ethnicity

Hispanic ethnicity is not mutually exclusive of race: a person of Hispanic origin may be White, Black, Asian/Pacific Islander, or American Indian/Alaska Native.

Incidence data for Hispanics are based on the NAACCR Hispanic Identification Algorithm (NHIA). When producing statistics using SEER Incidence data for Hispanic ethnicity, we exclude cases from Hawaii, Seattle, the Alaska Native Registry, and Kentucky.

For state exclusions that SEER uses when producing Hispanic (and non-Hispanic) mortality rates, see Policy for Calculating Hispanic Mortality.

Combining Race and Ethnicity in Rate Analyses

Some SEER incidence and mortality databases in SEER*Stat are now linked to both race (White, Black, AI/AN, API) and Hispanic origin within the same database.  While this provides the ability to produce rates for the eight combinations of these variables (four races by Hispanic/non-Hispanic), the SEER Program does not recommend using all of the combinations.  SEER reports Hispanic/non-Hispanic rates only for all races combined, white, and non-white.
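To make the counting concrete, the sketch below enumerates the eight race-by-origin combinations and then collapses them to the groupings SEER reports. The variable and function names are hypothetical, and treating "non-white" as the three remaining race categories is an assumption made for illustration.

```python
from itertools import product

RACES = [
    "White",
    "Black",
    "American Indian/Alaskan Native",
    "Asian or Pacific Islander",
]
ORIGINS = ["Hispanic", "Non-Hispanic"]

# Every race paired with every origin: 4 x 2 = 8 combinations.
all_combinations = list(product(RACES, ORIGINS))


def reporting_race_group(race):
    """Collapse a race category to white / non-white (assumed grouping)."""
    return "White" if race == "White" else "Non-white"


# Groupings reported by SEER: all races combined, white, and non-white,
# each split by Hispanic origin.
reported_groups = sorted(
    {("All races", origin) for origin in ORIGINS}
    | {(reporting_race_group(race), origin) for race, origin in all_combinations}
)
```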

American Indian/Alaskan Native Statistics

When producing statistics using SEER Incidence data for American Indians/Alaska Natives, SEER includes only cases from Connecticut, Detroit, Iowa, New Mexico, Seattle, Utah, Atlanta, and the Alaska Native Registry, and excludes cases diagnosed in 2003.
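A selection rule like this one can be expressed as a simple predicate. The sketch below is illustrative only: the registry names are plain text labels taken from the sentence above rather than SEER registry codes, and the function name and arguments are hypothetical.

```python
# Registries included in American Indian/Alaska Native statistics,
# written as text labels (an assumption; real data use registry codes).
AIAN_REGISTRIES = frozenset({
    "Connecticut",
    "Detroit",
    "Iowa",
    "New Mexico",
    "Seattle",
    "Utah",
    "Atlanta",
    "Alaska Native Registry",
})
EXCLUDED_DIAGNOSIS_YEAR = 2003


def include_in_aian_statistics(registry, year_of_diagnosis):
    """Return True if a case meets both conditions stated above."""
    return (
        registry in AIAN_REGISTRIES
        and year_of_diagnosis != EXCLUDED_DIAGNOSIS_YEAR
    )
```

The same pattern, with an exclusion set instead of an inclusion set, would apply to the registries left out of the Hispanic incidence statistics.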