The Delay Model
For each cancer site, many combinations of covariates were considered in models predicting the delay probabilities.
We evaluated each model by fitting it to data from the annual data submissions between 1983 and 2006 and then
predicting the counts for the 2007 submission. For each cancer site, the model that minimized the sum of squared
prediction errors was chosen as the default final model. To favor a more parsimonious model, however, we added a
further selection step in which possible competing models were identified using the following criteria:
- the competing model had fewer parameters than the default model, and
- the percent change in prediction error between the competing and the default models per extra parameter (i.e., the
percent change in prediction error divided by the difference in the number of parameters between the two models)
was less than 1 percent.
If more than one competing model met the criteria, the model with the smallest percentage change per extra parameter
was generally selected. However, if other competing models had fewer parameters and their percentage changes per
extra parameter exceeded the smallest one by no more than 0.02, the competing model with the fewest parameters
(rather than the model with the smallest percentage change per extra parameter) was selected. The chosen model was
then refitted using all data (1983-2007 submissions, 1981-2005 diagnosis years) to estimate the delay distributions
and calculate delay-adjusted estimates of the cancer counts.
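The selection rule above can be sketched in a few lines of Python. This is only an illustration under stated
assumptions, not the code used for the analysis: the Candidate class, its field names, and select_model are
hypothetical, the percent change is taken relative to the default model's prediction error, and the 0.02 tolerance
is read as percentage points.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for one fitted delay model."""
    name: str
    n_params: int
    sse: float  # sum of squared prediction errors for the held-out submission


def pct_change_per_param(comp: Candidate, default: Candidate) -> float:
    """Percent change in prediction error per parameter dropped from the default."""
    pct_change = 100.0 * (comp.sse - default.sse) / default.sse
    return pct_change / (default.n_params - comp.n_params)


def select_model(candidates, threshold=1.0, tie_tol=0.02):
    # Default model: smallest sum of squared prediction errors.
    default = min(candidates, key=lambda c: c.sse)

    # Competing models: fewer parameters and less than `threshold` percent
    # change in prediction error per extra parameter.
    competing = [
        (pct_change_per_param(c, default), c)
        for c in candidates
        if c.n_params < default.n_params
    ]
    competing = [(p, c) for p, c in competing if p < threshold]
    if not competing:
        return default  # assumption: the default is kept when nothing qualifies

    best_pct, best = min(competing, key=lambda pc: pc[0])

    # Prefer an even smaller model if its change per parameter is within
    # `tie_tol` of the smallest one.
    smaller = [
        c for p, c in competing
        if c.n_params < best.n_params and p - best_pct <= tie_tol
    ]
    return min(smaller, key=lambda c: c.n_params) if smaller else best
```

For example, with a 10-parameter default model, an 8-parameter competitor at 0.50 percent per parameter, and a
5-parameter competitor at 0.51 percent per parameter, the sketch returns the 5-parameter model because the two
percentages differ by less than 0.02.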
Cancer incidence rates, age-adjusted to the 2000 US standard million population, were then calculated with and
without adjustment for reporting delay. Joinpoint linear regression was used to obtain the annual percentage changes
for the 1975-2005 incidence rates for the data series with and without delay adjustment. Because the delay distribution
was assumed complete after 25 years, incidence rates for diagnosis years prior to 1982 were not adjusted for reporting delay.
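As a rough illustration of the rate calculation, the sketch below applies direct age standardization and a simple
delay correction. It rests on assumptions that are not taken from the text: the function names are hypothetical, the
delay-adjusted count is taken to be the observed count divided by the estimated fraction of cases reported so far, a
single fraction is applied to every age group, and the weights in the usage note are toy values rather than the actual
2000 US standard population.

```python
def age_adjusted_rate(counts, populations, std_weights, per=100_000):
    """Directly standardized rate: weighted mean of age-specific rates."""
    total_weight = sum(std_weights)
    weighted = sum(
        (count / pop) * weight
        for count, pop, weight in zip(counts, populations, std_weights)
    )
    return per * weighted / total_weight


def delay_adjust(counts, reported_fraction):
    """Inflate observed counts by the estimated fraction reported so far.

    Assumption: one reporting fraction applies to all age groups.
    """
    if not 0.0 < reported_fraction <= 1.0:
        raise ValueError("reported_fraction must be in (0, 1]")
    return [count / reported_fraction for count in counts]
```

With two age groups, counts of 10 and 50, populations of 100,000 and 50,000, and toy weights of 600,000 and 400,000,
the unadjusted rate is 46 per 100,000; if only 80 percent of cases are assumed to have been reported, the
delay-adjusted rate is 57.5 per 100,000.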
In the joinpoint regression analyses, up to three change points (i.e., four trend-line segments) were allowed, and
these were modeled to fall either at whole years or midway between diagnosis years. Change points were constrained to
be at least 2 years away from both the beginning and the end of the data series and at least 2 years apart. Models
were fitted by weighted least squares (weighted by the appropriate variances of the age-adjusted incidence rates)
using the joinpoint regression software.
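The constraints above define a finite grid of admissible change-point configurations, which the following sketch
enumerates. It also shows how an annual percentage change can be obtained from a single weighted log-linear segment.
This is not the joinpoint software's implementation: both function names are hypothetical, the log-linear form with
APC = 100 * (exp(slope) - 1) is the usual convention for such trends, and the delta-method weights (rate squared over
variance) are an assumption about what "appropriate variances" means.

```python
import math
from itertools import combinations


def candidate_joinpoints(first_year, last_year, max_points=3,
                         min_end_gap=2.0, min_sep=2.0, step=0.5):
    """List every admissible tuple of change points, including the empty one."""
    n_steps = int(round((last_year - first_year) / step))
    grid = [first_year + i * step for i in range(n_steps + 1)]
    allowed = [
        t for t in grid
        if t - first_year >= min_end_gap and last_year - t >= min_end_gap
    ]
    configs = [()]  # zero change points: a single trend line
    for k in range(1, max_points + 1):
        for combo in combinations(allowed, k):
            # combinations() preserves the sorted order of `allowed`,
            # so checking neighbours is enough.
            if all(b - a >= min_sep for a, b in zip(combo, combo[1:])):
                configs.append(combo)
    return configs


def segment_apc(years, rates, variances):
    """Annual percentage change from a weighted log-linear fit of one segment."""
    logs = [math.log(r) for r in rates]
    # Delta-method weights: Var(log r) is approximately Var(r) / r**2.
    weights = [r * r / v for r, v in zip(rates, variances)]
    total = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, years)) / total
    y_bar = sum(w * y for w, y in zip(weights, logs)) / total
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, years, logs))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, years))
    slope = sxy / sxx
    return 100.0 * (math.exp(slope) - 1.0)
```

For a 1975-2005 series, the grid of single change points runs from 1977 to 2003 in half-year steps. A full fit would
evaluate the weighted residual sum of squares for each configuration and then choose among models with different
numbers of change points, a step this sketch leaves out.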
Results show that adjusting for delay tends to raise cancer incidence rates in the more recent reporting years. While
this adjustment increases the rate of change over the most recent diagnosis years, it is likely to lead to the
detection of a new joinpoint only rarely. See Clegg et al. (2002) for details on the impact of reporting-delay
adjustment on SEER cancer incidence rates.