SEER*Stat/DevCan Importing Tutorials
The following information and exercises will teach you how to:
- prepare cancer incidence and mortality data using SEER*Stat, and
- import it into DevCan for analysis.
These instructions assume that you have SEER*Stat installed and are familiar with how to use it.
If you do not know how to perform SEER*Stat tasks such as building selection statements or
creating user-defined variables, work through the SEER*Stat tutorials before starting these
exercises.
This page provides general instructions for the process of preparing data in SEER*Stat and importing
it into DevCan. To work through specific examples, see the exercises.
How to Import SEER*Stat Incidence and Mortality Data into DevCan
- Open two new Rate sessions in SEER*Stat.
- Select an Incidence database in one session, and a Mortality database in the other.
Be sure to choose databases that share the same submission year and that both
cover the time period for which you want statistics.
- In both sessions, select Rates (Crude) for your type of statistic.
- Define selection criteria in each session according to your interest.
The SEER incidence databases contain data only for the geographical areas covered by
the SEER registries, while the US Mortality
databases contain data for the entire United States. If you are using the SEER incidence
databases, we recommend defining geographical selection criteria so that both sessions cover the same areas.
- When you edit the Incidence session, it is important that you
construct your selection statement so that it will find only one tumor of each kind
per person. You are interested in what percentage of the population has had a
particular type of cancer, not in how many tumors anyone has had.
(This is not an issue when you are working with a mortality database, because a
mortality database can only have one record per person.) If you are working with only one cancer site,
do this by marking the Select Only the First Matching Record for Each Person
box on the Selection tab. Multiple cancer sites are more complex; see Exercise 2
for details.
- On each session's Table tab, choose the variables by which
you would like your matrix to be organized.
- You must choose corresponding variables (for example, "Age at Diagnosis" and
"Age at Death", or "Site Recode" and "Cause of Death") and put them
in the same places (i.e., in the same order from top to bottom on the Table tab)
in both sessions.
- In order to provide DevCan with the necessary data in the correct format, you
must make the population-defining age variable the last variable listed (usually,
the last Column variable).
- In addition, the first age in the data must be 0, and
the first characters of the name of each age group must be the numeric representation of that group's
starting age (e.g., a grouping named "00-04" is acceptable, but "Ages 00-04" is not).
DevCan will not allow the data to be imported if these conditions are not met. Please note: if you are
using the "Age Recode with <1 year olds" variable in the SEER databases, you will need to
create a user-defined copy of this variable without the "Unknown" age group.
- Do not choose any variables that define the tumor itself. You have already
ensured that your results will include only one tumor per person, and you
do not want to risk excluding any of those tumors from the
table. However, it is safe to choose variables that define the person, such as those
in the first two categories ("Age at Diagnosis"
and "Race, Sex, Year Dx" for Incidence databases, and "Age at Death" and
"Race, Sex, Year Dth, State, Cnty, Reg" for Mortality databases).
- On each session's Output tab, set Display Rates as Cases Per to "100,000".
Title your matrix and adjust the other settings according to your preference.
- Execute both sessions and save the matrices. These are the Incidence of Cancer and Cancer Mortality matrices.
- In the Mortality session, go to the Selection and Table tabs and
remove all selection criteria and variables that specify the particular cancer site(s).
Execute the session again and save the matrix under a different name. This is the All Causes of Mortality matrix.
- Export the matrices with the following settings:
- Output Variables as: Numeric Representation
- Line Delimiter: DOS/Windows (CR/LF)
- Missing Character: Space
- Field Delimiter: Tab
- Check the boxes to Remove All Thousands Separators (Commas) and
Remove Flags (Footnote), Prefix and Suffix Characters. Leave the other checkboxes unmarked.
- In DevCan, open the Database menu and select Import SEER*Stat Data.
- Use the Browse buttons next to each field to locate the appropriate exported SEER*Stat ".dic" files,
then Execute the task.
- When prompted, enter a name for the new database in which to save this data. You can use
this name later to retrieve the data you are importing.
- You may receive warning messages at this point, particularly if you are importing data on
multiple cancer sites. Check that the warning messages do not say anything unexpected before you proceed.
- Select your desired values for the listed variables, and use the drop-down list on the
toolbar to select how the statistics should be displayed.
- Execute the session. The results are displayed in the area at the bottom of the screen.
- You can Save and/or Print these reports as desired.
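The age-group naming rules in the Table tab step can be checked before you export. The following is a minimal Python sketch, not part of SEER*Stat or DevCan: the function name check_age_labels and the sample labels are hypothetical, and it tests only the two conditions stated above (the first group starts at age 0, and every group name begins with its starting age as a number). DevCan's own validation may differ.

```python
import re


def check_age_labels(labels):
    """Return a list of problems found in age-group names (empty if none).

    Hypothetical helper: it mirrors the two naming conditions described
    in the steps above, not DevCan's actual import logic.
    """
    problems = []
    for position, label in enumerate(labels):
        # The name must begin with digits giving the group's starting age.
        match = re.match(r"[0-9]+", label)
        if match is None:
            problems.append(f"{label!r} does not begin with a number")
        elif position == 0 and int(match.group()) != 0:
            # The first age in the data must be 0.
            problems.append(f"first group {label!r} does not start at age 0")
    return problems


if __name__ == "__main__":
    for name in (["00-04", "05-09", "85+"], ["Ages 00-04", "05-09", "Unknown"]):
        print(name, "->", check_age_labels(name) or "OK")
```

Under this check a group named "Unknown" is reported because its name does not begin with a number, which matches the advice to create a user-defined copy of the age variable without that group.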
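If an import fails, it can help to confirm that the exported data really follows the export settings listed above. The sketch below is an illustration only: check_export_bytes is a hypothetical helper, and it assumes the exported data is a plain text file saved alongside the ".dic" file. It looks for DOS/Windows (CR/LF) line endings, tab field delimiters, and leftover commas.

```python
def check_export_bytes(raw: bytes):
    """Return a list of problems in exported matrix text (empty if none).

    Hypothetical helper: it checks only the line-delimiter, field-delimiter,
    and thousands-separator settings named in the export step.
    """
    problems = []

    # Line Delimiter: every line feed should follow a carriage return.
    bare_line_feeds = raw.count(b"\n") - raw.count(b"\r\n")
    if bare_line_feeds:
        problems.append(f"{bare_line_feeds} line(s) do not end with CR/LF")

    # Field Delimiter: each non-empty line should contain at least one tab.
    lines = [line for line in raw.splitlines() if line]
    untabbed = sum(1 for line in lines if b"\t" not in line)
    if untabbed:
        problems.append(f"{untabbed} line(s) contain no tab delimiter")

    # Thousands separators should have been removed, so no commas are expected.
    if b"," in raw:
        problems.append("commas found; thousands separators may remain")

    return problems


if __name__ == "__main__":
    sample = b"0\t0\t12\t100000\r\n0\t1\t7\t98000\r\n"  # made-up rows
    print(check_export_bytes(sample) or "OK")
```

To check a real export, read the file in binary mode (for example with open(path, "rb").read()) so that the line endings are not translated before the check runs.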
Mortality Data Only
If you want to create a database with only mortality information, you can omit the incidence
data. In that case, ignore the instructions that pertain to the Incidence of Cancer database, but
set up the Cancer Mortality and All Causes of Mortality databases as directed. When importing the data
into DevCan, you need not enter anything in the Incidence Dictionary File field.
Exercise 1 (one cancer site)
Exercise 2 (multiple cancer sites)