Summary
Evidence Report/Technology Assessment: Number 58
Please Note: The evidence report this summary was derived from has not been updated within the past 5 years and is therefore no longer considered current. It is maintained for archival purposes only.
Under its Evidence-based Practice Program, the Agency for Healthcare Research and Quality (AHRQ) is developing scientific information for other agencies and organizations on which to base clinical guidelines, performance measures, and other quality improvement tools. Contractor institutions review all relevant scientific literature on assigned clinical care topics and produce evidence reports and technology assessments, conduct research on methodologies and the effectiveness of their implementation, and participate in technical assistance activities.
Overview
An extensive literature documents a high
prevalence of errors in clinical diagnosis
discovered at autopsy. Multiple studies have
suggested no significant decrease in these errors
over time. Despite these findings, autopsies have
dramatically decreased in frequency in the United
States and many other countries.
In 1994, the last year for which national U.S. data exist, the
autopsy rate for all non-forensic deaths fell below
6 percent. The marked decline in autopsy rates from
previous rates of 40-50 percent undoubtedly reflects
various factors, including reimbursement issues,
the attitudes of clinicians regarding the utility of
autopsies in the setting of other diagnostic
advances, and general unfamiliarity with the
autopsy and techniques for requesting it,
especially among physicians-in-training.
The autopsy is valuable for its role in
undergraduate and graduate medical education,
the identification and characterization of new
diseases, and contributions to the understanding
of disease pathogenesis. Although extensive, these
benefits are difficult to quantify. This systematic
review studied the more easily quantifiable
benefits of the autopsy as a tool in performance
measurement and improvement. Such benefits
largely relate to the role of the autopsy in
detecting errors in clinical diagnosis and
unsuspected complications of treatment.
It is
hoped that characterizing the extent to which the
autopsy provides data relevant to clinical
performance measurement and improvement will
help inform strategies for preserving the benefits
of routinely obtained autopsies and for
considering the autopsy's wider use as an instrument for
quality improvement.
This report does not attempt to address the
roles of the autopsy in medical education;
furthering medical research; quality control
within pathology; verification, second-opinion
consultations, and legal documentation of
findings; the bereavement process for surviving
family members; or other benefits that are
described in many of the sources listed in the
bibliography (Appendix F). In addition to being
difficult to quantify, these benefits apply
primarily to teaching hospitals. To address the
role of the autopsy as an outcome measure and
tool for quality improvement, the report focuses
on benefits likely to apply to all hospitals, such as
the detection of important diagnostic errors and
related quality problems.
Reporting the Evidence
This report synthesizes the autopsy literature
as it relates to the following four key questions:
- To what extent does the autopsy reveal
important diagnoses that were clinically
unsuspected prior to death?
- To what extent does the autopsy provide a
useful performance measure or audit of clinical
diagnosis in general?
- What impact do autopsy findings have on
clinical performance improvement?
- To what extent are vital statistics compromised
by low autopsy rates?
To address the above questions adequately, we
also sought evidence pertaining to the properties
of the autopsy as a diagnostic test. Specifically, we
looked for any information describing autopsy
quality, accuracy, and precision or reproducibility.
It is important to note that, though the phrase
"diagnostic error" appears throughout this report,
the discrepancies between clinical and autopsy diagnoses to
which we refer do not necessarily represent errors in the sense
of mistakes, "slips," or other such terms. Some of these
discrepancies do undoubtedly result from failures to consider
an appropriately broad differential diagnosis, misinterpretation
of test results, and other quality problems, so that resulting
discrepant diagnoses detected at autopsy do warrant the label
"diagnostic errors." However, other such discrepancies clearly
represent acceptable limits to clinical diagnosis, based on the
performance of current technologies or the occurrence of
atypical clinical presentations. (In fact, one of the areas of
future research identified by this report involves characterizing
the relative distribution of these two types of clinical-autopsy
diagnostic discrepancies.) Despite these considerations, we use
the term "diagnostic errors" because it appears so commonly in
the autopsy literature.
Target Population
The patient population covered in this report includes all
patients (adult and pediatric, male and female, and so on)
in various settings, although predominantly consisting of
hospitalized patients. We did not specifically exclude medical
examiner cases, but few studies from the forensic literature
addressed the specific questions posed in this report.
Search Strategy
We conducted an extensive search of the MEDLINE®
database, supplemented by hand searches of article
bibliographies and consultation with experts in the field. For
articles published in languages other than English, we reviewed
the abstract (if available) to determine whether or not the
study reported methodologies or findings qualitatively
different from those described in the English-language
literature.
Study Inclusion Criteria
The autopsy literature consists entirely of observational
studies, rendering problematic the development of appropriate
inclusion and exclusion criteria, as the vast majority of
systematic reviews involve at least some randomized controlled
trials. In the absence of relevant and well-established quality
scoring systems, we adopted fairly minimal inclusion and
exclusion criteria. For studies reporting diagnostic error rates
detected at autopsy, we required:
- Well-defined patient samples consisting of consecutive or
randomly sampled autopsies meeting explicit criteria—convenience samples were excluded.
- Clinical diagnoses derived from autopsy request forms
submitted by clinicians or chart review performed by the
study investigators—clinical diagnoses derived solely from
death certificates were excluded.
- Classification schemes for discrepancies between clinical and
autopsy diagnoses conforming to one of three categories—potentially treatable causes of death ("Class I"), other major
missed diagnoses, and discrepant disease categorizations
based on standard international classification coding. These
classifications (defined further in the report) encompass the
majority of studies reported in the literature. Studies that
reported clinical diagnoses simply as "correct/incorrect" or
"confirmed/unconfirmed" were excluded.
Data Collection and Analysis
Articles identified from the literature search were stored in a
reference database and categorized according to the study
questions addressed. Structured abstraction forms were then
used to collect demographic data (pertaining to patients and
institutions), salient methodologic features and results. Each
article was abstracted by at least two of the four reviewers
(three physicians and one non-physician research
assistant). One of the physicians reviewed all of the articles.
Findings
To address the first key question pertaining to the extent to
which autopsies reveal clinically unsuspected important
diagnoses, we reviewed studies assessing the performance of
the autopsy as a diagnostic test. Given the generally accepted
role of the autopsy as the ultimate diagnostic standard for
many aspects of clinical care, the test characteristics of the
autopsy have received surprisingly little attention.
- The quality of the autopsy has received little systematic
study, with the only evidence pertaining to perinatal
autopsies, where two studies show that deficiencies relative
to reporting standards (i.e., a proxy measure for potentially
inadequate quality) appear to be common.
- The potential for error or disagreement in autopsy
interpretations has been assessed in only one small study. In
relation to the determination of principal diagnoses relating
to the cause of death in technically adequate autopsies,
diagnostic uncertainty persists in 1-5 percent of cases, although
rates of up to 40 percent have been reported, depending on the
type of autopsy case (e.g., perinatal). Importantly, errors in
classification of autopsy diagnoses involving even a few
percent of cases substantially distort estimates of the
performance of clinical diagnosis when autopsy is used as
the gold standard.
- The reproducibility of judgments about errors in clinical
diagnosis as indicated by autopsy findings has only been
mentioned in passing in the autopsy literature. Studies from
the health care quality and medical error literature suggest
that reproducibility of similar types of judgments is likely
fair to moderate at best.
There is insufficient literature to address: a) the quality of
the autopsy, b) the technical adequacy in interpreting autopsy
findings, and c) the reliability of judgments made regarding
autopsy-detected discrepancies. There is also no literature that
addresses the quality of training in autopsy pathology or the
ability of physicians to utilize autopsy findings.
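The point that misclassification of even a few percent of autopsy diagnoses can distort estimates of clinical performance can be illustrated with a short calculation. The Python sketch below is not taken from the report: it assumes a simple misclassification model in which the autopsy wrongly flags some truly concordant cases and wrongly clears some truly discrepant ones, and every rate in it is hypothetical.

```python
def observed_error_rate(true_rate, false_pos, false_neg):
    """Apparent clinical error rate when the autopsy itself misclassifies cases.

    true_rate  -- assumed true proportion of cases with a clinical error
    false_pos  -- assumed share of concordant cases wrongly flagged at autopsy
    false_neg  -- assumed share of discrepant cases wrongly cleared at autopsy
    """
    return true_rate * (1.0 - false_neg) + (1.0 - true_rate) * false_pos


# Hypothetical true clinical error rate of 10 percent.
true_rate = 0.10
for autopsy_error in (0.01, 0.03, 0.05):
    apparent = observed_error_rate(true_rate, autopsy_error, autopsy_error)
    relative_shift = (apparent - true_rate) / true_rate
    print(f"autopsy misclassification {autopsy_error:.0%}: "
          f"apparent error rate {apparent:.1%} "
          f"(relative shift {relative_shift:+.0%})")
```

Under these assumptions, the distortion grows as the true error rate falls, because false flags among the many concordant cases outweigh missed discrepancies among the few discrepant ones.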
In terms of the four main study questions:
- To what extent does the autopsy reveal important diagnoses
that were clinically unsuspected prior to death?
- The chance that autopsy will reveal a misdiagnosis that
may have affected outcome (i.e., a Class I error) was
10.2 percent (95 percent CI: 6.7-15.3 percent) using data from all studies
and the base values of time (1980), autopsy rate (overall
mean rate of 44.3 percent), country (U.S.) and case mix
(general autopsies). Restricting the analysis to data from
U.S. institutions only yielded a slightly higher point
estimate but almost entirely overlapping confidence
interval, 11.2 percent (95 percent CI: 6.9-17.5 percent). Adjusting for
changes in autopsy rates, and the effects of case mix and
the country, the probability of a Class I error showed a
relative decrease of 26.2 percent per decade (p=0.10).
- The base probability of the autopsy detecting a major
error in a given case was 25.6 percent (95 percent CI: 20.8-31.2 percent)
when data from all institutions were included. Using data
from U.S. institutions only, the probability of the autopsy
detecting a major error in a given case was slightly lower
at 24.0 percent, but with an almost entirely overlapping 95 percent
CI of 17.6-31.5 percent. Major error rates also showed a
similar decrease over time, but, in contrast to the results
for Class I errors, this relationship was statistically
significant. Relative to the base rate in 1980, the
prevalence of major errors exhibited a relative decrease of
28.0 percent (95 percent CI: 9.8-42.6 percent) per decade.
- The regression analysis supported the expected inverse
correlation between error rate and autopsy rate (i.e., that
lower autopsy rates produce higher error rates due to
selection of diagnostically challenging cases), but this
effect is relatively modest. Specifically, every 10 percent increase
in the autopsy rate is associated with a relative decrease in
Class I errors of 7.8 percent (p=0.18). For major errors, this
relationship was more substantial and statistically
significant, with every 10 percent increase in autopsies
associated with a relative decrease in major errors of 12 percent
(p=0.0003).
- Using the regression model to compute rates of autopsy-detected
diagnostic errors over a range of autopsy rates
and as a function of time, contemporary (year 2000)
autopsies detect Class I errors in 3.8-7.9 percent of cases and
major errors in 8.0-22.8 percent of cases. These ranges reflect
variations in autopsy rates from 5-100 percent.
- The weak relationship between autopsy rates and error
rates in the general analysis was supplemented by review
of studies specifically addressing the issue of clinical
selection of diagnostically challenging or uncertain cases.
These studies indicated that clinicians cannot reliably
predict which autopsies will be of high diagnostic yield,
reinforcing the conclusion that the relatively unchanged
diagnostic error rates do not simply reflect competing
effects of medical progress (leading to fewer errors) and
fewer autopsies (leading to selection for cases likely to
have errors).
- Because of the recent interest in medical error and patient
safety, we specifically looked for studies that reported the
proportion of autopsies that detected clinically
unsuspected complications of care. These data were
usually mentioned in passing in these studies, with no
study specifically focusing on this issue. Thus, the extent
to which these complications contributed to death (and
even the extent to which they were truly unsuspected)
was often unclear. For this reason, and because of the
heterogeneity of the case mix in the relatively small
sample of studies reporting the relevant data, we did not
pool estimates for rates of autopsy-detection of
unsuspected complications of care. Nonetheless, the 11
studies that did provide data on this point indicated that
approximately 1-5 percent of autopsies disclose unsuspected
complications of care.
- To what extent does the autopsy provide a useful
performance measure or audit of clinical diagnosis in
general?
- Autopsy studies commonly report diagnostic "error
rates," but these error rates involve autopsied cases only.
It is commonly assumed that the true denominator of
interest is all deaths; hence the interest in increased
autopsy rates. However, the denominator of interest for
clinical performance measurement is, in fact, all patients
receiving care during the autopsy observation period.
Only one autopsy study provides any data on clinical
diagnoses for patients discharged alive from the hospital
during the same observation period as for the autopsy
series. Because of the importance of this question, we
searched extensively for studies outside the autopsy
literature per se for potentially relevant studies.
- Specifically, we looked for studies reporting clinical
diagnoses and other follow-up data on cohorts of patients
(e.g., all patients admitted to a given hospital during a
defined observation period), not just the diagnoses
obtained for patients who died and went to autopsy.
Supplementing autopsy findings with the results of ante
mortem diagnostic testing and/or clinical follow-up for
patients who did not die permits determination of the
numerator and denominator required to assess the
sensitivity of clinical diagnosis. Despite an extensive
search, we found appropriate studies for only five target
conditions: pulmonary embolism (PE), acute myocardial
infarction (MI), acute appendicitis, aortic dissection, and
active tuberculosis.
- Among these five conditions, the performance of clinical
diagnosis exhibited substantial variation, with excellent
performance only for acute MI and to a lesser extent PE.
Even for these two conditions, the high sensitivities
obtained likely overstate clinical performance, as focusing
on the dichotomous outcome of correct or incorrect
identification of one target condition (PE or MI)
obscures the extent to which other important conditions
are missed once these target diagnoses are ruled out. A
patient who is correctly identified as not having an MI
counts as a success, regardless of whether or not the
underlying cause of the patient's presenting complaint is
ever diagnosed.
- What impact do autopsy findings have on clinical
performance improvement?
- No intervention study has directly addressed the impact
of autopsy findings on clinical practice or performance
improvement. Consequently, the study objectives in this
regard were not met, including not being able to perform
a cost effectiveness analysis, as the effectiveness of the
autopsy in reducing errors and other quality problems
remains unknown. This does not invalidate the potential
role of the autopsy in relation to clinical practice or
performance improvement, but does reveal an important
gap in the literature.
- To what extent are vital statistics compromised by low
autopsy rates?
- Major error rates detected by autopsy indicate substantial
inaccuracies in death certificates and hospital discharge
data, both of which play important roles in
epidemiologic research and health care policy decisions.
Previous studies have suggested that these errors roughly
cancel each other out (i.e., for a given condition, false
positive and false negative diagnoses are roughly equal).
However, this finding has not been consistent across
studies. Even when present, this balancing effect applies
only when considering the most general of diagnostic
categories (cardiovascular, neoplastic, infectious,
metabolic, and so on). Thus, the current evidence is
adequate to suggest that the epidemiologic data for
important diseases such as myocardial infarction, breast
cancer, pneumonia, stroke, and so on, all contain
substantial inaccuracies—in the 20-30 percent range reported
for major errors.
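The regression estimates reported above for Class I and major errors can be combined into a rough projection. The Python sketch below is illustrative only. It assumes that the reported relative decreases compound multiplicatively on the error probability itself, and that "every 10 percent increase" in the autopsy rate means 10 percentage points. The report's regression model may be specified on a different scale, so this sketch should not be expected to reproduce the published ranges exactly. The function name and its parameters are hypothetical.

```python
def projected_rate(base_rate, year, autopsy_rate,
                   decade_decrease, per_10pt_decrease,
                   base_year=1980, base_autopsy_rate=44.3):
    """Project an error rate (in percent) away from the report's base case.

    base_rate          -- error rate at the base values (percent)
    decade_decrease    -- relative decrease per decade (e.g., 0.262)
    per_10pt_decrease  -- relative decrease per 10-point rise in autopsy rate
    Assumption: both effects compound multiplicatively on the rate itself.
    """
    decades = (year - base_year) / 10.0
    steps = (autopsy_rate - base_autopsy_rate) / 10.0
    return (base_rate
            * (1.0 - decade_decrease) ** decades
            * (1.0 - per_10pt_decrease) ** steps)


# Point estimates quoted in the findings (all-institution analyses).
class_i = dict(base_rate=10.2, decade_decrease=0.262, per_10pt_decrease=0.078)
major = dict(base_rate=25.6, decade_decrease=0.280, per_10pt_decrease=0.12)

for label, params in (("Class I", class_i), ("Major", major)):
    for autopsy_rate in (5.0, 44.3, 100.0):
        rate = projected_rate(year=2000, autopsy_rate=autopsy_rate, **params)
        print(f"{label} errors, year 2000, autopsy rate {autopsy_rate}%: "
              f"{rate:.1f}%")
```

Because the sketch uses only point estimates, it conveys none of the uncertainty expressed by the confidence intervals and p-values given in the findings.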
Conclusions
The findings of this review have different implications
depending on the level of analysis—individual clinicians,
hospitals, or the health care system as a whole. From the point
of view of the individual clinician, the chance that autopsy will
reveal important unsuspected diagnoses in a given case remains
significant. Moreover, clinicians do not seem able to reliably
predict the cases in which such findings are more likely to occur.
Thus, clinicians have compelling reasons to request autopsies
far more often than currently occurs.
At the institutional level, the role of the autopsy is less clear.
The prevalence of missed diagnoses among autopsied patients
(or even all deaths) provides a numerator, but not a
denominator with which to assess the rate at which patients
with a given condition remain undiagnosed until death. Using
autopsy results to track hospital quality requires not only
explicitly defined error rates, but also data on the number of
patients discharged alive with diagnoses that appear among the
list of conditions first detected at autopsy. Clearly, though, the
unexpected findings at autopsy in specific cases are of interest
to institutions as a whole and not just the individual treating
clinicians.
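The numerator and denominator issue described above can be made concrete with a small calculation. The Python sketch below is a minimal illustration, not a method from the report: the function name is hypothetical, the counts are invented, and the calculation ignores patients whose condition was missed and who never came to autopsy, so the result is at best an upper bound on true sensitivity.

```python
def clinical_sensitivity(diagnosed_in_life, first_found_at_autopsy):
    """Sensitivity of clinical diagnosis for one target condition.

    diagnosed_in_life      -- patients with the condition identified before
                              death or discharge (confirmed by testing or
                              follow-up)
    first_found_at_autopsy -- patients in whom the condition was first
                              detected at autopsy
    """
    total_with_condition = diagnosed_in_life + first_found_at_autopsy
    if total_with_condition == 0:
        raise ValueError("no patients with the condition were observed")
    return diagnosed_in_life / total_with_condition


# Hypothetical hospital cohort for a single condition over one period.
missed_at_autopsy = 20   # the only figure an autopsy series supplies
diagnosed_alive = 180    # requires discharge and follow-up data

sensitivity = clinical_sensitivity(diagnosed_alive, missed_at_autopsy)
print(f"Estimated sensitivity of clinical diagnosis: {sensitivity:.0%}")
```

The autopsy series alone yields only the count of missed cases. Without the second count, the same 20 missed diagnoses could correspond to very high or very low sensitivity.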
However, no study has ever examined the impact of
performing autopsies (and communicating autopsy findings
back to clinicians) on institutional performance improvement.
This represents a major area for future research, but should not
detract from the finding that many institutions perform too
few autopsies to allow any meaningful assessment of local
diagnostic performance and other quality problems, no matter
how communication and feedback to clinicians occurs.
At the level of the entire health care system, existing
literature provides two compelling reasons to pursue autopsies.
First, results for the five conditions examined in this report
suggest that clinical diagnosis in routine practice may not
perform as well as is generally believed by clinicians or as
suggested by the literature assessing specific aspects of clinical
diagnosis (e.g., new tests) in research settings. Better
characterizing the performance of clinical diagnosis for
common conditions would clearly benefit the entire health
system and identify important targets for quality improvement
that could be pursued in a concerted manner.
The second benefit to the health care system as a whole
relates to vital statistics and other epidemiologic data. Vital
statistics impact important decisions about allocation of
funding for research and other aspects of health care policy.
The existing literature demonstrates that clinical diagnoses,
whether obtained from death certificates or hospital discharge
data, contain major inaccuracies compared with diagnoses
generated from postmortem findings. The use of autopsy data
to correct inaccuracies in epidemiologic data would likely
confer multiple benefits on the health care system as a whole.
Future Research
- Various aspects of the performance of the autopsy as a
diagnostic test (e.g., the reproducibility of findings between
pathologists) remain undefined and represent areas for
further research. More specifically relevant to the present
review is the inter-rater reliability for error classifications in
specific cases, i.e., establishing the extent to which
pathologists, clinicians or other peer reviewers agree that a
particular case does or does not involve a clinically
important diagnostic error.
- The causes of important diagnostic discrepancies remain
uncharacterized. This represents a very important area of
investigation. Discrepancies between efficacy and
effectiveness (i.e., differences between the performance of a
diagnostic or therapeutic procedure in routine practice
compared to the result in the research literature) have
diverse causes. Broadly speaking, though, discrepancies are
caused by a) quality problems related to underuse, overuse
and misuse of diagnostic or therapeutic procedures, and b)
patient factors, including atypical presentations and
complex interactions between comorbid conditions and
patient demographic factors. Neither of these categories is
captured in the "efficacy literature" (i.e., clinical trials), as
the nature of research settings makes underuse, overuse or
misuse unlikely, and stringent patient selection reduces the
complexities of comorbid conditions and multiple
competing diagnostic considerations.
Autopsy data provide a window into discrepancies between
efficacy and effectiveness both for therapeutics (by detecting
clinically unsuspected complications of care) and diagnostics
(by detecting the diagnostic discrepancies discussed in this
report). In both cases, but perhaps especially the latter, the
autopsy can play a pivotal role in spearheading
investigations into the causes of these discrepancies. Where
discrepancies prove to represent quality problems, the
institution benefits and, where they reflect differences
between the types of patients receiving care in routine
practice and clinical trials, the whole health system may
benefit from awareness of these findings.
- Future research should establish strategies for optimizing the
utility of the autopsy at the institutional level. No study has
ever directly assessed the impact of detecting errors in
clinical diagnosis on subsequent clinical performance. Thus,
future research should establish optimal methods of
involving clinicians in the autopsy process (or
communicating its results to them) and effective ways of
stimulating change based on autopsy findings. Until such
research is performed, it is not clear to what extent autopsy
rates need to be increased as opposed to achieving
improvements in communication and utilization of
information generated from autopsies performed at current
rates.
- Future research should establish the optimal means of using
autopsy data to provide more accurate vital statistics and
other important epidemiologic data. The first step might
be to validate the findings suggested in this review, namely
that current vital statistics contain substantial inaccuracies.
Such an undertaking might involve funding a small number
of demographically diverse institutions to achieve high
institutional autopsy rates, with prospectively determined
protocols for autopsy performance and error classification.
Even one year's worth of data from such a project would
likely document substantial inaccuracies in vital statistics.
Continuing such a project could also provide ongoing
epidemiologic data, as well as more meaningful error rates
that could be used to fuel quality improvement efforts
throughout the health system. Such a program would not
replace autopsies as routinely performed elsewhere, that is,
this suggested research program would not be equivalent to
a system of regional autopsy centers performing autopsies
on behalf of other institutions. Rather, these centers would
act as surveillance centers for basic causes of death and
detection of quality problems and present numerous
opportunities for basic research into the pathogenesis of
acute and chronic illnesses.
Availability of the Full Report
The full evidence report from which this summary was
taken was prepared for the Agency for Healthcare Research
and Quality (AHRQ) by the University of California at San Francisco-Stanford Evidence-based Practice Center (EPC), Stanford, CA, under Contract No. 290-97-0013. Printed copies may be obtained free of charge from the AHRQ Publications
Clearinghouse by calling 1-800-358-9295. Requesters should ask
for Evidence Report/Technology Assessment No. 58, The
Autopsy as an Outcome and Performance Measure.
The Evidence Report is also online on the National Library of Medicine Bookshelf, or can be downloaded as a PDF file (1 MB).
AHRQ Publication No. 03-E001
Current as of October 2002
Internet Citation:
Autopsy as an Outcome and Performance Measure. Summary, Evidence Report/Technology Assessment: Number 58. AHRQ Publication No. 03-E001, October 2002. Agency for Healthcare Research and Quality, Rockville, MD. http://www.ahrq.gov/clinic/epcsums/autopsum.htm