Note from the National Guideline Clearinghouse (NGC): The National Institute for Health and Clinical Excellence (NICE) commissioned an independent academic centre to perform an assessment of the manufacturer's submission on the technology considered in this appraisal and prepare an Evidence Review Group (ERG) report. The report for this technology appraisal was prepared by the School of Health and Related Research (ScHARR), University of Sheffield (see the "Availability of Companion Documents" field).
Critique of Manufacturer's Approach
Clinical Effectiveness
Critique of Submitted Evidence Syntheses
The Strength of the Evidence (Internal Validity)
The search strategy was poorly designed (see the "Description of Methods Used to Collect/Select the Evidence" field), but the ERG has not determined that any relevant primary studies were missed as a result. The inclusion criteria were adequately defined, but the manufacturer's study selection and use of the published evidence appeared highly selective and arbitrary in the reporting of outcome data (Sections 4.1.3, 4.1.6 and 4.1.7 of the ERG Report [see the "Availability of Companion Documents" field]). The manufacturer's approach to validity assessment appears to have been adequate, but the template they were asked to use by NICE has problems (Section 4.1.5 of the Assessment Report). The manufacturer's reporting of secondary outcomes, particularly adverse events, was somewhat haphazard (Section 4.2.4.1 of the ERG Report).
Critical outcomes used in the model were poorly defined in the manufacturer's submission and not reported in the public domain. As the review team could not access the individual patient data from the pivotal trial, they were unable to validate the manufacturer's analysis of these data. Where comparisons with similar published outcomes are possible, there is no evidence of any inexplicable discrepancies (Section 3.4.1 of the Assessment Report [see the "Availability of Companion Documents" field]).
Time is an important factor in breast cancer, which has a long natural history, with recurrences occurring beyond 20 years. The median follow-up in the pivotal trial is only one year. Many consider disease-free survival a surrogate for long-term, all-cause mortality in breast cancer. This has only been empirically demonstrated in other classes of treatment (standard cytotoxics and tamoxifen). The manufacturer reasons, by analogy alone, that the empirically known, short-term harm-benefit profile of trastuzumab will result in a long-term harm-benefit profile similar to that empirically known for other classes of drug (Sections 3.5, 4.1.7 and 4.1.8.2 of the ERG Report).
The Applicability of the Results (External Validity)
Women at elevated risk of a cardiac event were not recruited to the clinical trials which evaluated trastuzumab (Sections 3.1.2 and 4.3.2 of the ERG Report [see the "Availability of Companion Documents" field]). Those women who were recruited were intensively monitored. This puts the onus (and the additional cost of screening) on the National Health Service (NHS) to replicate an eligible population for whom the treatment will be as safe as in the clinical trials. If the current shortfall in cardiac monitoring capacity is not adequately addressed, women treated with trastuzumab will be at elevated risk of heart failure compared with those in the clinical trials.
A restrictive scope allowed the manufacturer to exclude from any serious discussion the FinHer study (Section 4.1.3.2 of the ERG Report [see the "Availability of Companion Documents" field]). The manufacturer rightly pointed out that the underlying anthracycline-containing regimen was different to any used in the NHS. However, cancer clinicians have noted that the nine-week regimen examined in this study may facilitate lower cost, greater patient convenience, and reduced risk of cardiotoxicity, although the evidence is not as strong as that for 52 weeks.
Considerable heterogeneity of study populations in terms of the concomitant chemotherapies received, and lack of knowledge about what regimens are in use in the NHS, make generalisation from the published evidence problematic (Sections 3.3.1 and 3.3.2 of the ERG Report [see the "Availability of Companion Documents" field]), although the direction and extent of clinical effect seem relatively consistent across different baseline treatment programmes.
Cost-Effectiveness
Overview of Manufacturer's Economic Evaluation
A state transition cohort model was used to compare the lifetime impact of one year of adjuvant trastuzumab therapy to no trastuzumab following standard chemotherapy regimens based on the Herceptin Adjuvant (HERA) trial. The clinical effectiveness aspect of the model is based upon the HERA trial, which was an international, multi-centre, randomised trial in women with HER2-positive primary breast cancer.
Additional Work Undertaken by the ERG
Clinical Effectiveness
The ERG carried out the following analyses which the manufacturer declined to undertake: (1) a meta-analysis of trials to derive a more precise estimate of treatment effect in terms of overall survival (Section 7.1.1 of the ERG Report [see the "Availability of Companion Documents" field]), disease-free survival (Section 7.1.2 of the ERG Report), distant recurrence (Section 7.1.3 of the ERG Report) and cardiac toxicity (Section 7.1.4 of the ERG Report); and (2) a critical evaluation of the role of the FinHer study in decision-making (Section 7.1.5 of the ERG Report).
For time-to-event outcomes, summary statistics from the published literature were meta-analysed using the method described by Parmar, Torri, & Stewart (1998) with a fixed effects model. Heterogeneity between trial results was tested using the chi² test and the I² measurement. The chi² test measures the amount of variation in a set of trials. Small p values (p<0.10) suggest that there is more heterogeneity present than would be expected by chance. I² is the proportion of variation that is due to heterogeneity, rather than chance. Large values of I² suggest heterogeneity. I² values of 25%, 50%, and 75% could be interpreted as representing low, moderate, and high heterogeneity.
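As an illustration only, the fixed effects pooling and heterogeneity statistics described above can be sketched as follows. This is a minimal sketch and not the ERG's actual analysis: it assumes that each trial reports a hazard ratio with a 95% confidence interval, derives the standard error of the log hazard ratio from that interval (one of several routes to trial-level summary statistics), and uses inverse-variance weights. The function names and the example figures are hypothetical and are not taken from the ERG Report.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of the log hazard ratio, recovered from a 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def fixed_effect_pool(studies, z=1.96):
    """Inverse-variance fixed effects pooling of hazard ratios.

    studies: list of (hazard_ratio, ci_lower, ci_upper) tuples.
    Returns the pooled hazard ratio, its 95% CI, Cochran's Q
    (the chi-squared heterogeneity statistic), its degrees of
    freedom, and I-squared as a percentage.
    """
    log_hrs = [math.log(hr) for hr, _, _ in studies]
    weights = [1.0 / se_from_ci(lo, hi, z) ** 2 for _, lo, hi in studies]
    total_w = sum(weights)

    pooled_log = sum(w * y for w, y in zip(weights, log_hrs)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled_log) ** 2 for w, y in zip(weights, log_hrs))
    df = len(studies) - 1

    # I-squared: share of total variation attributed to heterogeneity,
    # truncated at zero when Q is smaller than its degrees of freedom.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    return {
        "hr": math.exp(pooled_log),
        "ci": (math.exp(pooled_log - z * pooled_se),
               math.exp(pooled_log + z * pooled_se)),
        "q": q,
        "df": df,
        "i2": i2,
    }

# Hypothetical inputs for illustration; not trial data.
example = fixed_effect_pool([(0.54, 0.43, 0.67), (0.48, 0.39, 0.59)])
print(example)
```

The p value for the chi² test would be obtained by referring Q to a chi-squared distribution with the stated degrees of freedom; that step is omitted here to keep the sketch free of external dependencies.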
The Absolute Risk Reduction (ARR) and Number Needed to Treat (NNT) for time-to-event outcomes were calculated using methods described by Altman and Andersen (1999). This method uses the numbers of patients still at risk (alive) at the time corresponding to the estimated probabilities (reported or imputed), or hazard ratios and 95% confidence intervals, to calculate confidence intervals for each statistic.
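The hazard ratio route to these statistics can be sketched as follows. This is a minimal sketch of one of the approaches described by Altman and Andersen (1999), under the assumption that the survival probability in the control group at the time of interest is treated as known and that the confidence limits are obtained by substituting the limits of the hazard ratio. The alternative route, which uses the numbers of patients still at risk, is not shown. The function name and the example figures are hypothetical.

```python
import math

def arr_nnt_from_hr(control_survival, hr, hr_lower, hr_upper):
    """Absolute risk reduction and number needed to treat at a fixed time.

    control_survival: survival probability in the control group at time t.
    hr, hr_lower, hr_upper: hazard ratio (treatment vs control) and its
    95% confidence limits.

    Under proportional hazards the treated-group survival at time t is
    control_survival ** hr, so ARR = control_survival ** hr - control_survival
    and NNT = 1 / ARR.
    """
    def arr(h):
        return control_survival ** h - control_survival

    point = arr(hr)
    arr_low, arr_high = sorted([arr(hr_lower), arr(hr_upper)])

    nnt = 1.0 / point if point != 0 else math.inf

    # If the ARR interval includes zero, the NNT interval is not a single
    # continuous range, so no interval is reported in this sketch.
    if arr_low > 0:
        nnt_ci = (1.0 / arr_high, 1.0 / arr_low)
    else:
        nnt_ci = None

    return {"arr": point, "arr_ci": (arr_low, arr_high),
            "nnt": nnt, "nnt_ci": nnt_ci}

# Hypothetical inputs for illustration; not trial data.
result = arr_nnt_from_hr(control_survival=0.80, hr=0.50,
                         hr_lower=0.40, hr_upper=0.625)
print(result)
```

Because the control-group survival is held fixed, the interval reflects only the uncertainty in the hazard ratio and will tend to be narrower than one that also accounts for uncertainty in the survival estimate.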
Cost-Effectiveness
As a result of the communication with Roche, the ERG has developed what they believe to be a reasonable revised base-case. Sensitivity analysis has also been carried out to ensure that the model results are robust. The analysis is described in Section 7.2 of the ERG Report (see the "Availability of Companion Documents" field).