Note from the National Guideline Clearinghouse (NGC): The National Institute for Health and Clinical Excellence (NICE) commissioned an independent academic centre to perform a systematic literature review on the technology considered in this appraisal and prepare an Evidence Review Group (ERG) report. The ERG report for this technology appraisal was prepared by the West Midlands Health Technology Assessment Collaboration, University of Birmingham (see the "Availability of Companion Documents" field).
Clinical Effectiveness
Critique of Manufacturer's Approach
This critique is based on a formal critical appraisal recorded in Appendix 1 of the ERG report (see the "Availability of Companion Documents" field). An additional appraisal (Appendix 2 of the ERG report) was conducted on the Cochrane review, which was used as the foundation for the Schering-Plough review of clinical evidence. In addition, two key studies included in both reviews were reappraised and the abstracted data rechecked; records of these processes are provided in Appendices 3 and 4 of the ERG report (see the "Availability of Companion Documents" field).
Description and Critique of Manufacturer's Approach to Validity Assessment
There is very limited information in the submission about how validity assessment was conducted. Although no recognised framework appears to have been used to assess the threats to validity, the commentary does consider all the main aspects of randomised controlled trial (RCT) quality that one would expect a systematic review to examine. The ERG has confirmed that key included studies (ACT I and II and Jarnerot et al.) are substantially free from threats to internal validity as claimed. Additional information on the quality of randomisation and blinding, not present in the published report of ACT I and II, was obtained from the original trial reports; it was reassuring and consistent with the Schering-Plough submission. Although this was not highlighted in the submission, the ERG concurred with the preceding Cochrane review by Lawson et al. that information on randomisation and blinding was limited in the study by Jarnerot et al., and also noted that this study had been terminated early because of slow recruitment.
Description and Critique of Manufacturer's Outcome Selection
The submission provides information on most of the outcomes mentioned in the scope and on which data appear to have been collected in the included studies. An exception is the information collected on EuroQol-5 dimensions (EQ-5D) in ACT I, which appears only in the clinical trial report provided by Schering-Plough as additional information. This information is useful in helping to gauge not just the statistical significance of the effect on quality of life occurring in the trial, but also the size of this effect.
Information on colectomy rates is not clearly reported, particularly that relating to the ACT studies. There is inconsistency between the rates in ACT I and II and between the rates claimed at 54 weeks in ACT I and the information provided in the trial reports for the rate at 30 weeks.
The process of data abstraction is poorly described. The ERG has confirmed that there are some major data abstraction errors in the data for the key ACT I and II studies. Fortunately, these errors do not seriously affect the interpretation of the clinical evidence, as the direction of effect is unaltered. The meta-analysis of the results of ACT I and II is affected, and the correct values are indicated in the next section.
Description and Critique of the Statistical Approach Used
The summary of the results, in which results on common outcomes at similar time-points are presented together, is the weakest component of the Clinical Evidence Section. The approach is basically qualitative, but there is little systematic attempt to draw out the overall patterns of results, and particularly to deal with the different scenarios (subacute and acute/"rescue") the included studies represent. Fortunately, the limited number of included studies makes it possible for the reader to identify what the patterns are without much assistance.
There is some use of meta-analysis in summarising the results of ACT I and II alone. Unfortunately, this meta-analysis contains errors, which are indicated in the table titled Corrections to: "Table 16. Pooled Results from ACT I/II trials" in the ERG report (see the "Availability of Companion Documents" field).
As already indicated, this does not alter the interpretation of the clinical evidence greatly. Some estimates of effect were underestimated. The major change concerns the estimates of heterogeneity, which was very marked in the original analyses. There is still some heterogeneity in the revised analyses, particularly in the estimate of effect on clinical remission at 8 weeks. Although less noteworthy than in the original analyses, it is still an issue worth highlighting, particularly as ACT I and II were studies with virtually identical designs.
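For readers unfamiliar with how heterogeneity between two trials is usually quantified, the following is a minimal sketch of inverse-variance fixed-effect pooling with Cochran's Q and the I-squared statistic. It is not the method documented in the submission or the ERG report, which this summary does not specify. The function name and the input values are hypothetical and are not data from ACT I or ACT II.

```python
import math


def pooled_fixed_effect(estimates, std_errors):
    """Pool study estimates (e.g. log risk ratios) by inverse-variance weighting.

    Returns the pooled estimate, its standard error, Cochran's Q and I^2 (%).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations of each study from the pooled value.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    degrees_of_freedom = len(estimates) - 1
    # I^2: share of total variation attributed to heterogeneity rather than chance.
    i_squared = max(0.0, (q - degrees_of_freedom) / q) * 100 if q > 0 else 0.0
    return pooled, pooled_se, q, i_squared


# Hypothetical log risk ratios and standard errors for two trials (illustrative only).
hypothetical_log_rr = [0.5, 1.5]
hypothetical_se = [0.5, 0.5]
pooled, pooled_se, q, i_squared = pooled_fixed_effect(hypothetical_log_rr, hypothetical_se)
print(f"pooled={pooled:.3f} se={pooled_se:.3f} Q={q:.3f} I2={i_squared:.1f}%")
```

With only two studies, Q has a single degree of freedom, so the heterogeneity estimate is imprecise and should be read alongside the individual trial results.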
Summary of Results
There is no succinct summary of results, and there is no attempt to consider whether the overall pattern of results differs depending on the circumstances in which infliximab is given.
The ERG offers a summary of results, based on the information presented in the submission (refer to section 4.2.1 of the ERG report [see the "Availability of Companion Documents" field]).
Critique of Submitted Evidence Syntheses
The primary research seems generally robust, particularly for the subacute setting. There is some uncertainty about what the effects on colectomy rates are. However, the main challenge is understanding the magnitude of the effect of infliximab on a patient's health-related quality of life. The evidence on effectiveness in the acute situation is less robust, primarily because the number of patients investigated is still relatively small. However, the two studies in this category both apparently had problems recruiting patients.
The more important provisos concern the limitations of the method used to review the available research in the submission. These include:
- Poor recording of review method
- Errors in data abstraction
- Poor summary of included studies and errors in meta-analysis
- Failure to investigate heterogeneity between the results of the ACT studies
- No clear indication how the results of the included studies might vary between the subacute and "rescue" scenarios
- Possible exaggeration of the measured effect on health-related quality of life in the ACT I and II studies
- Possible under-emphasis of the potential for infliximab to affect risk of malignancy in a group already at increased risk of malignancy.
The ERG appraisal has sought to compensate for these limitations within the constraints of the process. Ideally an independent systematic review would have been undertaken in parallel with the Schering-Plough submission.
Refer to Sections 4.1 and 4.2 and Appendices 1, 2, 3, and 4 of the ERG report (see the "Availability of Companion Documents" field) for more information on clinical effectiveness.
Economic Evaluation
Overview of Manufacturer's Economic Evaluation
The report of the cost-effectiveness work focuses almost entirely on the de novo model and economic evaluation undertaken by the manufacturer.
A Markov model has been built using Microsoft Excel to compare two treatment strategies, infliximab versus standard care, in terms of costs and quality-adjusted life years (QALYs). The patient group modelled has moderate to severe active ulcerative colitis (UC) and includes patients "who have had an inadequate response to conventional therapy including corticosteroids and 6-mercaptopurine (6-MP) or azathioprine (AZA), or who are intolerant to or have medical contraindications for such therapies." The main submission only considered patients in this category (although the manufacturer's clarification response included results for patients with more severe disease, for whom surgery is the comparator considered). This modelling was undertaken, in part, using data from the ACT trials.
Two separate treatment strategies have been evaluated, strategies A and B, which differ in the assumption made about continuation of infliximab therapy. Strategy A modelled the continuation of infliximab in treatment responders who achieved and maintained remission or mild health states. In contrast, strategy B considered a narrower therapy continuation group defined as responders who achieve and maintain remission.
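As a general illustration of how a Markov cohort model produces costs and QALYs for two strategies, the sketch below runs a small cohort simulation and computes an incremental cost-effectiveness ratio (ICER). It is not the manufacturer's model: the three health states, transition probabilities, utilities, costs, number of cycles, and the 3.5% discount rate are all hypothetical values chosen for illustration.

```python
def run_markov(transition, start, utilities, costs, cycles,
               cycle_years=1.0, discount=0.035):
    """Run a Markov cohort model; return discounted cost, QALYs and final state mix.

    transition[i][j] is the per-cycle probability of moving from state i to state j.
    costs are per cycle; utilities are weighted by cycle length to give QALYs.
    """
    distribution = list(start)
    n_states = len(distribution)
    total_cost = 0.0
    total_qalys = 0.0
    for cycle in range(cycles):
        discount_factor = 1.0 / (1.0 + discount) ** (cycle * cycle_years)
        total_cost += discount_factor * sum(
            p * c for p, c in zip(distribution, costs))
        total_qalys += discount_factor * cycle_years * sum(
            p * u for p, u in zip(distribution, utilities))
        # Move the cohort forward one cycle.
        distribution = [
            sum(distribution[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    return total_cost, total_qalys, distribution


def icer(cost_new, qalys_new, cost_old, qalys_old):
    """Incremental cost per QALY gained of the new strategy over the old one."""
    return (cost_new - cost_old) / (qalys_new - qalys_old)


# Hypothetical states: remission, mild disease, moderate-to-severe disease.
utilities = [0.85, 0.70, 0.40]                      # hypothetical utility weights
start = [0.0, 0.0, 1.0]                             # cohort starts in active disease
standard_care = [[0.80, 0.15, 0.05],
                 [0.10, 0.70, 0.20],
                 [0.05, 0.15, 0.80]]                # hypothetical probabilities
new_treatment = [[0.85, 0.10, 0.05],
                 [0.15, 0.70, 0.15],
                 [0.15, 0.25, 0.60]]                # hypothetical probabilities
cost_old, qalys_old, _ = run_markov(standard_care, start, utilities,
                                    costs=[500.0, 1500.0, 4000.0], cycles=10)
cost_new, qalys_new, _ = run_markov(new_treatment, start, utilities,
                                    costs=[6500.0, 7500.0, 10000.0], cycles=10)
print(f"ICER: {icer(cost_new, qalys_new, cost_old, qalys_old):.0f} per QALY")
```

In a sketch like this, strategies A and B could be represented by applying the drug cost only in those states where treatment is assumed to continue, which is the assumption on which the two strategies differ.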
Sensitivity Analyses
Extensive one-way sensitivity analysis was undertaken to consider variation in utility values, time horizon of the model, the assumption concerning an average patient's weight, and discount rates.
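The idea of a one-way sensitivity analysis can be sketched as follows: each parameter in turn is set to a low and a high value while all others are held at their base-case values, and the model is re-run. The toy model, parameter names, and values below are hypothetical and do not come from the submission.

```python
def one_way_sensitivity(model, base_params, ranges):
    """Vary one parameter at a time; return (low_result, high_result) per parameter."""
    results = {}
    for name, (low, high) in ranges.items():
        outputs = []
        for value in (low, high):
            params = dict(base_params)   # copy so the base case is never modified
            params[name] = value
            outputs.append(model(params))
        results[name] = tuple(outputs)
    return results


def toy_icer(params):
    """Hypothetical toy model: weight-based drug cost divided by QALYs gained."""
    incremental_cost = params["drug_cost_per_kg"] * params["weight_kg"]
    incremental_qalys = params["utility_gain"] * params["years"]
    return incremental_cost / incremental_qalys


base_case = {"drug_cost_per_kg": 10.0, "weight_kg": 70.0,
             "utility_gain": 0.1, "years": 10.0}          # hypothetical values
ranges = {"weight_kg": (60.0, 80.0), "utility_gain": (0.05, 0.15)}
for name, (low_result, high_result) in one_way_sensitivity(
        toy_icer, base_case, ranges).items():
    print(f"{name}: {low_result:.0f} to {high_result:.0f}")
```

Because only one parameter changes at a time, this approach does not capture joint uncertainty across parameters, which is the role of probabilistic sensitivity analysis.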
Model Validation
The electronic version of the model was made available to the ERG in an executable form. The model has been run using the inputs stated in the manufacturer's report, and the same results have been obtained. The workings of the model have been audited and, whilst the ERG has found some programming errors, none of them is serious, in that they do not change the results in a meaningful way.
Critique of Approach Used
Model Type and Structure
The use of a Markov model is appropriate as the disease is characterised by progression over time and so a modelling approach that can deal with transition between states and the timing of events is required.
A further issue relates to the consideration of adverse events in the model. The only adverse events considered explicitly in the model are those that led to discontinuation of the study drug. Other events described in ACT trial papers as 'serious adverse events', 'infections requiring antimicrobial treatment', and 'serious infections' (bacterial infection, etc.) are not accounted for in the model. The only model health state that considers adverse events is the temporary discontinuation state. Thus, any costs or disutilities arising from such serious adverse events associated with infliximab use have been ignored.
Sensitivity Analysis
The sensitivity analysis has explored the robustness of results to variation in some of the key parameters. The probabilistic sensitivity analysis (PSA) has been undertaken in a very partial manner, with distributions placed around selected parameters only. Further, the selection of normal distributions for the utility data appears arbitrary and has the potential to lead to values outside the acceptable range (e.g., utility values greater than 1). Errors in the interpretation of the PSA and calculation of the cost-effectiveness acceptability curve (CEAC) have been identified and detailed.
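The concern about normal distributions for utilities can be illustrated with a short simulation. The sketch below draws utility values from a normal distribution and from a beta distribution fitted by the method of moments to the same mean and standard deviation, then reports the share of draws falling outside the 0 to 1 range. It also shows one common way to compute a single point on a CEAC, as the proportion of simulations with positive incremental net monetary benefit at a given threshold. The mean, standard deviation, threshold, and function names are hypothetical; the beta distribution is a commonly used alternative and is not stated in the source to be the ERG's recommendation.

```python
import random


def beta_params(mean, sd):
    """Method-of-moments alpha and beta for a beta distribution on (0, 1)."""
    variance = sd ** 2
    if not 0 < mean < 1 or variance >= mean * (1 - mean):
        raise ValueError("mean and sd are not compatible with a beta distribution")
    common = mean * (1 - mean) / variance - 1
    return mean * common, (1 - mean) * common


def share_out_of_range(samples):
    """Proportion of sampled utilities below 0 or above 1."""
    return sum(1 for s in samples if s < 0 or s > 1) / len(samples)


def ceac_point(incremental_costs, incremental_qalys, threshold):
    """Proportion of simulations with positive incremental net monetary benefit."""
    favourable = sum(
        1 for cost, qalys in zip(incremental_costs, incremental_qalys)
        if threshold * qalys - cost > 0)
    return favourable / len(incremental_costs)


rng = random.Random(0)                  # fixed seed so the run is repeatable
mean, sd = 0.88, 0.10                   # hypothetical utility mean and SD
normal_draws = [rng.gauss(mean, sd) for _ in range(10000)]
alpha, beta = beta_params(mean, sd)
beta_draws = [rng.betavariate(alpha, beta) for _ in range(10000)]
print("normal, share outside [0, 1]:", share_out_of_range(normal_draws))
print("beta, share outside [0, 1]:", share_out_of_range(beta_draws))
```

Repeating `ceac_point` over a range of threshold values and plotting the results gives the full acceptability curve.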
Refer to Section 5 of the ERG report (see the "Availability of Companion Documents" field) for additional information on economic evaluation.