SEER-Medicare: Calculation of Comorbidity Weights
To assist investigators with the calculation of comorbidity weights, NCI is providing two SAS macros. These macros produce comorbidity weights as developed by Charlson et al. (J Clin Epidemiol 1993;46:1075-9).
The first decision an investigator must make is whether to use claims from the hospital file (MEDPAR) only, or also to include the diagnoses on claims submitted by physicians (carrier data), as described in Klabunde et al (J Clin Epidemiol. 2000 Dec;53(12):1258-67). The rationale for including diagnoses from physician claims is that many more people see a physician than are hospitalized, which increases the chance of identifying comorbid conditions. If MEDPAR is the only file being used, the first SAS macro is not needed and investigators may skip to the second macro.
The first macro is needed only if both physician and hospital claims are being used to identify comorbid conditions. For physician and outpatient claims, it requires that a patient's diagnosis appear on at least two different claims that are more than 30 days apart. The reason is that diagnoses on physician and outpatient claims have not been validated, and a physician may have recorded a diagnosis as present when the correct coding would have been to "rule out" the condition. Conditions that do not appear on two such claims are considered "rule out" diagnoses and are not counted as comorbid conditions. This step prevents overestimation of comorbidity when physician or outpatient claims are used.
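The two-claim rule above can be illustrated with a short Python sketch. This is not the NCI SAS macro. The function name, the tuple layout, and the choice to match claims on a condition label (rather than on individual diagnosis codes) are assumptions made for illustration only. The sketch applies to physician and outpatient claims; hospital claims are not subject to the rule.

```python
from collections import defaultdict
from datetime import date


def confirmed_conditions(claims, min_gap_days=30):
    """Keep conditions seen on two claims more than min_gap_days apart.

    claims: iterable of (patient_id, claim_date, condition) tuples taken
    from physician or outpatient claims (hypothetical layout).
    Returns {patient_id: set of confirmed conditions}.
    """
    dates = defaultdict(list)
    for patient_id, claim_date, condition in claims:
        dates[(patient_id, condition)].append(claim_date)

    confirmed = defaultdict(set)
    for (patient_id, condition), seen in dates.items():
        # The earliest and latest claims give the widest gap, so if any
        # pair is more than min_gap_days apart, this pair is too.
        if (max(seen) - min(seen)).days > min_gap_days:
            confirmed[patient_id].add(condition)
    return dict(confirmed)


example_claims = [
    ("A", date(2000, 1, 5), "diabetes"),
    ("A", date(2000, 3, 1), "diabetes"),  # 56 days later: confirmed
    ("A", date(2000, 1, 5), "chf"),
    ("A", date(2000, 1, 20), "chf"),      # 15 days later: "rule out"
    ("B", date(2000, 1, 1), "copd"),      # single claim: "rule out"
]
print(confirmed_conditions(example_claims))  # {'A': {'diabetes'}}
```

Note that a gap of exactly 30 days does not qualify, because the claims must be more than 30 days apart.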
The second macro calculates Charlson comorbidity weights from the claims. It considers the ICD-9 diagnosis codes, ICD-9 procedure codes, and HCPCS procedure codes on the claims. Researchers combining hospital data with physician or outpatient data need to use the file produced by the first macro as input.
Building an Input File for the Macros
Regardless of which files an investigator decides to use, the files should be subset to the limited set of variables described below. All ICD-9-CM diagnosis codes on these records should be 5 characters long, and ICD-9-CM procedure codes should be 4 characters long. Before invoking the SAS macros, remove any decimal points or blanks occurring within a code (e.g., diagnosis code '123.4' becomes '1234 ').
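As an illustration of the code clean-up described above, here is a minimal Python sketch. The function names are hypothetical. Padding short codes with trailing blanks follows the '1234 ' example for diagnosis codes; applying the same padding to procedure codes is an assumption.

```python
def normalize_code(code, width):
    """Remove decimal points and blanks, then left-justify to a fixed width."""
    cleaned = code.replace(".", "").replace(" ", "")
    if len(cleaned) > width:
        raise ValueError(f"code {code!r} is longer than {width} characters")
    return cleaned.ljust(width)  # pad on the right with blanks


def normalize_dx(code):
    """ICD-9-CM diagnosis codes: 5 characters."""
    return normalize_code(code, 5)


def normalize_proc(code):
    """ICD-9-CM procedure codes: 4 characters (padding is an assumption)."""
    return normalize_code(code, 4)


print(repr(normalize_dx("123.4")))  # '1234 '
```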
Variables to retain for the macros:
- MEDPAR data - retain REGCASE, admission date, diagnosis codes (dx) 1-10, surgery codes 1-10 and length of stay. Set filetype=M.
- Carrier data - retain REGCASE, claim from date, HCPCS, diagnosis codes 1-5 (4 of the diagnoses are from the header of the claim and one is the line item diagnosis). The carrier data can have more than one claim for the same date of service and all claims for each date should be included. Set filetype=N.
- Outpatient data - retain REGCASE, claim from date, HCPCS, diagnosis codes 1-10, ICD procedure codes 1-10. Set filetype=O.
The final SAS file, which combines data from whichever of the above sources you choose to analyze, should include:
- REGCASE
- Date (admission date from MEDPAR; claim from date from carrier and outpatient)
- HCPCS (blank for MEDPAR records)
- Diagnosis codes 1-10 (DX6-DX10 are blank for carrier records)
- Surgery/procedure codes 1-10 (all blank for carrier records; ICD procedure codes for outpatient records)
- Length of stay (used only for MEDPAR records; blank for carrier and outpatient records)
- Filetype (M = MEDPAR, N = NCH/carrier, O = outpatient)
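The combined layout can be sketched in Python as one record type shared by all three sources. This is only an illustration of which fields are filled or left blank for each file type. The class, field, and function names are hypothetical and are not the variable names the SAS macros expect.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

N_CODES = 10  # diagnosis and procedure code slots per record


def _pad(codes, n=N_CODES):
    """Pad a list of codes with blanks up to n entries."""
    codes = list(codes)
    if len(codes) > n:
        raise ValueError(f"at most {n} codes allowed, got {len(codes)}")
    return codes + [""] * (n - len(codes))


@dataclass
class ClaimRecord:
    regcase: str
    claim_date: date   # admission date (MEDPAR) or claim from date
    filetype: str      # 'M' = MEDPAR, 'N' = carrier, 'O' = outpatient
    dx: List[str]      # 10 slots; slots 6-10 blank for carrier
    hcpcs: str = ""    # blank for MEDPAR
    procs: List[str] = field(default_factory=lambda: [""] * N_CODES)
    los: Optional[int] = None  # length of stay, MEDPAR only


def from_medpar(regcase, admission_date, dx, surgery_codes, los):
    return ClaimRecord(regcase, admission_date, "M", _pad(dx),
                       procs=_pad(surgery_codes), los=los)


def from_carrier(regcase, from_date, hcpcs, dx):
    if len(dx) > 5:
        raise ValueError("carrier records carry at most 5 diagnosis codes")
    return ClaimRecord(regcase, from_date, "N", _pad(dx), hcpcs=hcpcs)


def from_outpatient(regcase, from_date, hcpcs, dx, procs):
    return ClaimRecord(regcase, from_date, "O", _pad(dx),
                       hcpcs=hcpcs, procs=_pad(procs))
```

Records built this way from any mix of sources can be collected in a single list, mirroring the single combined file described above.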