National Cancer Institute
Health Services & Economics Branch
Cancer Control and Population Sciences

Programming Support:

SEER-Medicare: Using SEER*Stat to Analyze SEER-Medicare Data

SEER*Stat is statistical software for analyzing SEER and other cancer-related databases. It can be used to view individual cancer records and to produce incidence, mortality, and survival statistics. One reason to use SEER*Stat rather than other statistical packages for the analysis of SEER-Medicare linked data is that SEER*Stat performs some statistical calculations that are either unavailable in other packages (e.g., relative survival) or cumbersome to perform in them (e.g., age-adjustment, linking the correct population denominators with incidence files). In addition, various "back-end" statistical programs developed especially for the analysis of population-based cancer statistics have been made to work easily with SEER*Stat output. For a summary of new methods and software for the analysis of population-based cancer statistics see:

The SEER*Prep software converts ASCII text data files to the SEER*Stat database format, allowing you to analyze your cancer data using SEER*Stat. If you are a new SEER*Prep user, see How to use SEER*Prep to Create a Database. Follow the steps below to create a database using SEER-Medicare data for use in SEER*Stat.

I. Install SEER*Stat and SEER*Prep

To use SEER*Stat to analyze your data, you must use SEER*Prep Version 1.9 or later and a compatible version of SEER*Stat. Current versions of the SEER*Prep and SEER*Stat software are available on the SEER Web site.

II. Download Files

  1. Download the appropriate SAS program from the table below. The number after 'infile' in the program name refers to the record length of the PEDSF. For example, if the PEDSF that you received has a record length of 1234, you would need to download the "" program.
  2. Next, download the Database Description file compatible with the SAS program that you selected. This file provides SEER*Prep with a description of the SAS program's output file.

    SAS Program    Database Description File
                   SEERMed2.PEDSF650.dd
                   SEERMed.PEDSF-674.dd
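
Choosing the right SAS program depends on knowing the record length of the PEDSF you received. The following is a minimal Python sketch for checking it, assuming the PEDSF is a fixed-length ASCII text file with one record per line. The function names and any file path you pass in are illustrative only and are not part of the SEER-Medicare distribution.

```python
from collections import Counter


def record_lengths(path, encoding="ascii"):
    """Count how many records have each length, ignoring line terminators."""
    counts = Counter()
    # newline="" keeps the original terminators so we can strip them ourselves
    # without disturbing trailing blanks that belong to the record.
    with open(path, "r", encoding=encoding, newline="") as handle:
        for line in handle:
            counts[len(line.rstrip("\r\n"))] += 1
    return counts


def single_record_length(path):
    """Return the record length, or raise if the records are not all one length."""
    counts = record_lengths(path)
    if len(counts) != 1:
        raise ValueError(f"expected one record length, found {dict(counts)}")
    return next(iter(counts))
```

Calling `single_record_length` on your PEDSF (the path is whatever you saved the file as) returns the number to match against the program names in the table. An error here suggests the file was truncated or altered in transfer.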

III. Preparing SEER-Medicare data for use with SEER*Stat

  1. Make the necessary modifications to the "Create.SEERPrep.infile*.sas" program. There are instructions in the comment block at the top of the program. Search for 'USER' and make changes as described in the following steps.
    1. You will need to change the input and output filenames. These include the PEDSF sent to you by IMS, any analysis file(s) from which you wish to use variables, and the filename where you wish to save the output data from this program. Your analysis file (your selected cohort with your constructed variables) must contain the SEER Registry/Case number as a length 10 character string and the SEER Sequence number for the tumor as a length 2 character string. You may use multiple analysis files, but you will have to make more extensive changes within the program to handle this. If you do choose to use an analysis file, verify that the filename or libname statement has been un-commented.
    2. You will need to select an end date (month and year) for your study. These should be preset to the last year of Medicare claims files available, but you can select other values more appropriate to your analysis.
    3. If you have chosen to merge with an analysis file, you must now modify the 'ANALYSIS MERGE SECTION'. This section of code assumes that the incoming file is a SAS data library. If you have a text file, you will need to add an input statement. You must select the variables you wish to add to the output file for analysis in SEER*Stat. PLEASE NOTE: the variables you select for use in SEER*Stat must be numeric. If you wish to use variables from multiple files, you must add additional code to input and merge more data sets. Be sure to un-comment this section (delete the 2 lines '/* USER comment out begins' and 'USER comment out ends */').
    4. Again, if you have chosen to merge with an analysis file, you must modify the output statement where the lenX_n variables are shown. 'lenX' shows the length of the variables which may be placed in the given location. You MUST put your variables in fields of the same length. SEER*Stat will not correctly handle data if you use this space in a manner not intended. Places have been provided for 10 variables of length 1 and 10 of length 2. There is also space for 5 variables each of length 3, length 4 and length 5. '_n' shows the count of the given length (len3_2 is the 2nd variable of length 3). This is just a convenience to make keeping track of your variables easier.
  2. Run the customized "Create.SEERPrep.infile*.sas" program.
  3. Use SEER*Prep to convert the data file you just created into a SEER*Stat database as described below.
    1. Open the database description file. Use the File menu to open the Database Description file that you downloaded in Step II.
    2. Edit the Database Name and provide a meaningful name for your new database.
    3. Select the Input Case File(s) by using the 'Add...' button to select the file you created with the SAS program.
    4. Set the study cutoff date. The date should match the end date of your study (the end date selected in step 1, substep 2 of this section).
    5. You may edit the 'User-specified nth length x' variables starting in column 300 to have a name and format consistent with your analysis variables. Remove the 'User-specified nth length x' variables that you are not using.
    6. Save the modifications you have made to the "*.dd" file. Use Save As... to save the dd file with a meaningful name.
    7. Use SEER*Prep's verify feature to check your data. Review the verify report. If any problems require changes to the data, you will need to make these changes outside of SEER*Prep.
    8. Create a database using the Create file command in SEER*Prep. Be aware that if you select 'Exclude for all invalid values' on the Record Exclusion screen, any record not found in your analysis file will be excluded from the SEER*Stat database.
  4. Start the SEER*Stat application. Your new database will be listed on the data tab in all appropriate session types. If it is not listed, check the Secondary Data Location specified in SEER*Stat's Preferences. It should match SEER*Prep's User-Defined Data location.
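
Before editing the output statement in Section III, step 1, it can help to plan which analysis variable goes in which lenX_n field. The following is a minimal Python sketch of that bookkeeping, using the slot counts stated above (10 fields each of length 1 and 2, and 5 fields each of length 3, 4 and 5). The function name and the example variable names are hypothetical, and the sketch only plans the assignment; it does not modify the SAS program.

```python
# Number of user fields available for each field length, as described in the steps above.
SLOT_CAPACITY = {1: 10, 2: 10, 3: 5, 4: 5, 5: 5}


def assign_slots(variable_widths):
    """Map each variable name to a lenX_n field of exactly its width.

    variable_widths: dict of variable name -> field width in characters.
    Raises ValueError for an unsupported width or when a width runs out of fields.
    """
    used = {width: 0 for width in SLOT_CAPACITY}
    assignment = {}
    for name, width in variable_widths.items():
        if width not in SLOT_CAPACITY:
            raise ValueError(f"{name}: no user fields of length {width}")
        if used[width] >= SLOT_CAPACITY[width]:
            raise ValueError(f"{name}: all length-{width} fields are already in use")
        used[width] += 1
        assignment[name] = f"len{width}_{used[width]}"
    return assignment


if __name__ == "__main__":
    # Hypothetical numeric analysis variables and their widths.
    plan = assign_slots({"chemo_flag": 1, "comorb_score": 2, "days_to_tx": 4})
    for name, slot in plan.items():
        print(f"{name} -> {slot}")
```

The resulting plan can also serve as a checklist when renaming the 'User-specified nth length x' variables in SEER*Prep and removing the ones that are not used.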

Last modified:
20 Nov 2007