Importing SEER*Stat Data into DevCan, Exercise 2

Step 1:  Prepare the Cancer Incidence data

  1. Start SEER*Stat.
  2. Start a new Rate Session.
  3. On the Data tab, select the database "Incidence - SEER 13 Regs Limited-Use, Nov 2004 Sub for Expanded Races (1992-2002)".
  4. On the Statistic tab, select Rates (Crude) as your type of statistic.
  5. Go to the Selection tab.
  6. Open the File menu and click Dictionary.
  7. Open the Race, Sex, Year Dx, Registry, County folder.
  8. Create a User-Defined variable based on "Race Recode Y". (You may have already created this variable in Exercise 1. If you saved it then, you can reuse it now.) It should have three groupings: "All Races" (which includes all available values), "White", and "Black". (See "Naming groupings to be imported into DevCan" under Learn More.) Call this variable "Race recode Y (All, White, Black)".
  9. Open the Site Specific Sequence Numbers folder.
  10. Create a User-Defined variable based on "Site - malignant (most detail)". It should include the following groupings:

    • Breast - mal
    • Cervix Uteri - mal
    • Corpus Uteri - mal
    • Uterus, NOS - mal
    • Ovary - mal
    • Vagina - mal
    • Vulva - mal
    • Other Female Genital Organs - mal

    These groupings should already exist; instead of creating them anew, you can simply delete all of the variable's other groupings.

    Call the new variable "Site - mal (most detail) - Female genital".
  11. When you are done creating the variables, Close the dictionary.
  12. On the Selection tab, Edit the Race, Sex, Year Dx, Registry, County (Pop, Case Files) selection statement to read:
    {Race, Sex, Year Dx, Registry, County.Sex} = 'Female'
    AND {Race, Sex, Year Dx, Registry, County.Year of diagnosis} = '2000','2001','2002'
  13. Edit the Other (Case Files) selection statement to read:
    {Site Specific Sequence Numbers.SS seq # - mal (most detail)} = 1
    AND {User-Defined.Site - mal (most detail) - Female genital} = 'Breast - mal','Cervix Uteri - mal','Corpus Uteri - mal','Uterus, NOS - mal','Ovary - mal','Vagina - mal','Vulva - mal','Other Female Genital Organs - mal'
  14. Do not check the "Select Only Malignant Behavior" or "Select Only the First Matching Record for Each Person" boxes. (See "The 'Select Only...' checkboxes" under Learn More.)
  15. On the Table tab, arrange the variables as follows:
    • Page
      • Site - mal (most detail) - Female genital
    • Row
      • Race recode Y (All, White, Black)
    • Column
      • Age Recode with <1 year olds

    Be sure to arrange the variables in exactly this order.
  16. Go to the Output tab.
  17. Enter a title for the matrix.
  18. Choose to Display Rates as Cases Per 100,000.
  19. Execute the session. Your matrix will be calculated and displayed in a new window.
  20. Save the matrix with a filename that identifies it as the Cancer Incidence matrix. Compare it to ssdc2_cancer_incidence.sim if necessary.
  21. Open the Matrix menu. Select Export, then Text File.
  22. Set up the options as follows:
    • Output Variables as: Numeric Representation
    • Line Delimiter: DOS/Windows (CR/LF)
    • Missing Character: Space
    • Field Delimiter: Tab
    • Check the boxes to Remove All Thousands Separators (Commas) and Remove Flags (Footnote), Prefix and Suffix Characters. Leave the other checkboxes unmarked.
  23. Export the matrix with a filename that identifies it as the Cancer Incidence data.
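
To check the exported file before importing it into DevCan, you could read it back with a short script. The sketch below is only an illustration of the export options chosen in step 22 (tab field delimiter, CR/LF line endings, a single space for missing values, no thousands separators or flags). The function name `parse_export` is hypothetical, and the sketch assumes that every non-missing field is numeric; the actual column layout of your file depends on your session.

```python
import csv
import io


def parse_export(text):
    """Parse tab-delimited export text into rows of floats (None = missing).

    Assumption: every non-missing field is numeric, which is what the
    "Numeric Representation" and "Remove..." options are meant to produce.
    """
    rows = []
    # newline="" keeps the CR/LF endings intact so the csv module handles them.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t")
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        # A field holding only the missing character (a space) becomes None.
        rows.append([None if f.strip() == "" else float(f) for f in fields])
    return rows
```

To use it on a real file, read the file with `open(path, newline="")` and pass its contents to `parse_export`. A `ValueError` from `float` would suggest that a comma, flag, or label survived the export.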

Learn More...

  • Naming groupings to be imported into DevCan: DevCan expects the name of every grouping in an age variable to begin with that grouping's starting age. For example, a grouping containing the ages 65 - 69 could be named "65-69" or "65 and up", but if it were named "Ages 65-69" or ">= 65", DevCan would not be able to import the variable. This causes a problem with invalid data, because SEER*Stat uses the label "Invalid value(s)" for a grouping containing data in an invalid format. Non-age variables, such as those used in these tutorials, are not affected by this restriction.
  • The "Select Only..." checkboxes: In Exercise 1, you marked the "Select Only Malignant Behavior" and "Select Only the First Matching Record for Each Person" boxes on the Selection tab. In this exercise, leave them unmarked, because they are redundant with criteria you have already established in the selection statements. In particular, marking "Select Only the First Matching Record for Each Person" would limit your results to each person's first cancer of any of the types you specified. What you want instead is each person's first cancer of each type, which you have already requested by specifying a sequence number of 1.
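
The two notes above can be illustrated with a small sketch. It is not DevCan's or SEER*Stat's actual logic: `starting_age` assumes that "first characters are the starting age" means a leading run of digits, and the records are made-up tuples of (person, site, site-specific sequence number) listed in order of diagnosis.

```python
import re


def starting_age(name):
    """Return the leading integer of a grouping name, or None if there is none."""
    match = re.match(r"\d+", name)
    return int(match.group()) if match else None


# Hypothetical records: (person, site, site-specific sequence number).
RECORDS = [
    ("A", "Breast - mal", 1),
    ("A", "Ovary - mal", 1),
    ("A", "Breast - mal", 2),
    ("B", "Ovary - mal", 1),
]


def sequence_one(records):
    """Keep each person's first cancer of each site (sequence number 1)."""
    return [r for r in records if r[2] == 1]


def first_match_per_person(records):
    """Keep only the first matching record for each person, whatever the site."""
    first = {}
    for record in records:
        first.setdefault(record[0], record)
    return list(first.values())
```

With these records, `sequence_one` keeps person A's breast and ovary cancers, while `first_match_per_person` keeps only A's breast cancer and drops the ovary record.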

Last modified: 15 Apr 2007