Evidence: Its Meanings in Health Care and in Law

Summary of the 10 April 2000 IOM and AHRQ Workshop, "Evidence": Its Meanings and Uses in Law, Medicine, and Health Care

Clark C. Havighurst, Duke University School of Law; Peter Barton Hutt, Covington & Burling; Barbara J. McNeil, Harvard Medical School; and Wilhelmine Miller, Institute of Medicine


Notice of Copyright

This article was originally published in the Journal of Health Politics, Policy and Law. All rights reserved. This material may be saved for personal use only, but may not be otherwise reproduced, stored, or transmitted by any medium, print or electronic, without the explicit permission of the copyright holder. Any alteration to or republication of this material is expressly prohibited.

It is a violation of copyright law to reproduce any copyrighted information from this publication without first obtaining separate permission directly from the copyright holder who may charge fees for the use of such materials. It is the responsibility of the user to contact and obtain the needed copyright permissions prior to reproducing materials in any form.

Permission requests should be directed to:
Journals Division
Duke University Press
Box 90660
Durham, NC 27708
Fax: (919) 688-3524


Contents

Introduction
The Meaning of Evidence and the Practice of Evidence-Based Medicine
Rules of Evidence, Claims of Medical Expertise, and the Daubert Trilogy
Coverage Policies, Medical Liability, and Cost-Effectiveness Analysis
Using Evidence Appropriately in Medicine, Health Care, and the Law
Appendix: Participants
References and Notes

Introduction

In April 2000 the Institute of Medicine (IOM) and the Agency for Healthcare Research and Quality (AHRQ) jointly hosted a one-day workshop to explore an intriguing and important intersection of medicine and law: the courtroom presentation of science-based medical evidence and expertise. This workshop was inspired by a concern that legal uses and interpretations of science-based medical evidence, particularly population studies and the findings of controlled clinical trials, may diverge substantially from the uses and interpretations of that evidence by the medical and health care researchers who produce it and by the practitioners and health plans that use it in making clinical decisions and policies.

Recognizing that a preliminary discussion among professions was needed even to describe the nature of their differences, the IOM and AHRQ, at the instigation of John M. Eisenberg, director of AHRQ, convened about twenty clinicians, epidemiologists, health services researchers, health plan executives, practicing and academic lawyers, jurists, and social scientists in the field of legal medicine (see appendix for participants). Participants and presenters were asked to formulate empirical research questions concerning both evidence-based medicine (EBM) and judicial practices that might increase familiarity with, and therefore promote greater reliance on, the use of science-based medical evidence by the courts. Workshop participants were further asked to identify policy issues relating to the application of evidence-based medical findings that were emerging in the context of congressional consideration of patient protection legislation and reform of health plan liability law.

The four background papers commissioned for this workshop provided the participants with a common frame of reference for the issues to be addressed during the day. These papers were the first drafts of the authors' articles in this special issue. The authors were variously asked to address the following questions:

In addition to these commissioned papers, Susan Haack, professor of philosophy at the University of Miami, provides a broad overview of judicial rulings on and interpretations of scientific evidence and expert testimony of the past century. In her article, "An Epistemologist in the Bramble-Bush: At the Supreme Court with Mr. Joiner," Haack argues that inquiry in the natural sciences is, in practice, similar to empirical inquiry of other sorts and cannot be distinguished by particular methodologies, contrary to what recent Supreme Court decisions have presumed. Substantive scientific knowledge, and not simply ascertaining that the proper scientific techniques or methods are followed in producing the evidence in question, is needed to determine the degree of warrant of a particular scientific claim or theory. In some cases, judges will not be able to avoid ruling on substantive scientific questions. Haack concludes her article by raising both practical and policy questions about the presentation of scientific evidence and expert testimony in the courtroom.

In "Proof and Policy from Medical Research Evidence," Cynthia D. Mulrow, professor of medicine at the University of Texas Health Science Center-San Antonio and director of the San Antonio Evidence-Based Practice Center (EPC), describes the evolution of what physicians take to be evidence for the practices they adopt and reviews the principles now accepted in the medical research community for evaluating medical research evidence. In expanding her paper for publication here, Mulrow has been joined by Kathleen N. Lohr, chief scientist at Research Triangle Institute (RTI) and director of the RTI-University of North Carolina EPC, as coauthor. Lohr has worked extensively in conceptualizing and establishing evaluative criteria for clinical practice guidelines. The article reflects this work as well as Mulrow's analysis of the evolution of EBM concepts and practice as presented at the workshop.

Daniel W. Shuman, professor of law at Southern Methodist University, whose research interests include scientific evidence and the law, was asked to consider the courts' use of scientific evidence, particularly the terms under which expert witnesses testify. Shuman's essay, "Expertise in Law, Medicine, and Health Care," contrasts two models of the judge's role in deciding what scientific evidence is presented to a jury: the traditional adversarial approach and the more recent "gatekeeper" approach. In particular, Shuman assesses how judicial gatekeeping has been influenced by the Supreme Court's trilogy of decisions on the admissibility of expert testimony under the Federal Rules of Evidence: Daubert v. Merrell Dow Pharmaceuticals (509 U.S. 579 [1993]), General Electric Co. v. Joiner (522 U.S. 136 [1997]), and Kumho Tire Co. v. Carmichael (526 U.S. 137 [1999]). Shuman concludes that these landmark decisions have had less of an impact on judges' prescreening of scientific evidence and expert testimony in medicine than might have been expected and offers several explanations for this. He further suggests several strategies for strengthening the scientific literacy of the legal profession and for enhancing the ability of lawyers and judges to identify the quality of scientific findings offered as evidence as an issue meriting judicial attention.

Peter D. Jacobson, professor in the Department of Health Policy and Management in the University of Michigan School of Public Health, presented the third paper at the workshop. Here his article "Cost-Effectiveness Analysis in the Courts: Recent Trends and Future Prospects," with coauthor Matthew L. Kanna, addresses the question of whether and how courts consider the use of cost-effectiveness analysis (CEA) and related evaluative techniques such as cost-benefit analysis and risk-utility analysis by health plans to establish coverage policies and justify medical treatment decisions. Finding little health care litigation that explicitly involved the application of CEA, Jacobson examines its role in product liability cases and how juries have reacted to its application by manufacturers in making product safety design and recall decisions. Finally, Jacobson considers factors that might precipitate the explicit use of techniques such as CEA by health plans in developing benefit packages and making medical necessity determinations.

In the final presentation of the workshop, "Evidence-Based Medicine and the Law: The Courts Confront Clinical Practice Guidelines," Arnold J. Rosoff, professor of legal studies and health care systems at the University of Pennsylvania's Wharton School, reviewed courts' treatment of clinical practice guidelines (CPGs) as evidence of a standard of care in contrast with the traditional use of customary (local) medical practice as the standard of care applied in malpractice cases. Rosoff draws lessons from the consideration given to CPGs in the courts for the likely fate of science-based medical evidence when presented in litigation of health plan coverage policies and medical necessity determinations. His discussion gives particular attention to a variety of professional cross-cultural conflicts and communication gaps. Conflicts arise not only between physicians or medical researchers testifying as to treatment standard of care and the lawyers prosecuting a malpractice case or health plan coverage dispute, but also between the "artful practitioner" and the physician-scientist or medical researcher regarding the application of research to individual patient care decisions. Rosoff emphasizes the need in such conflicts for a common vocabulary and understanding of the questions at issue in such cases, a task that this workshop only began to address.

The remainder of this introductory essay highlights issues raised by participants in discussions that followed the authors' presentations. Several workshop participants were asked to serve as "first respondents." Barbara S. Hulka, Kenan Professor in the Department of Epidemiology at the School of Public Health, University of North Carolina, and Judge Sam C. Pointer Jr., formerly chief judge of the U.S. District Court for the Northern District of Alabama, commented on Mulrow's and Shuman's presentations. Drummond Rennie, adjunct professor of medicine at the University of California-San Francisco, and David M. Eddy, senior advisor for health policy and management at Kaiser Permanente Southern California, followed with comments on Jacobson's and Rosoff's presentations. The discussion summary that follows is organized thematically, following the general order of the presentations.


The Meaning of Evidence and the Practice of Evidence-Based Medicine

In his introductory remarks, Kenneth I. Shine, president of the IOM, characterized medicine as "the largest cottage industry in the United States," one in which evidence-based practice is still a relatively young and controversial concept. He noted that science should be applied not only in the practice of medicine but to that practice as well, as the IOM had recently done in a report to the Health Care Financing Administration on defining "medical necessity." This turned out to be an extremely difficult task, Shine said, because outcomes data are sparse.

Shine observed that the distinction between "efficacy" (evidence of an effect under ideal conditions, such as those of double-blind, randomized controlled trials) and "effectiveness" (evidence of what actually works in practice) is a subtle and important one for all users of evidence. Last, he noted the challenge that time constraints pose, not only for bringing science-based evidence to bear on legal proceedings but also for its use by clinicians overwhelmed with new information.

Following Shine, John M. Eisenberg described his agency's mission as the sponsorship of research that produces evidence about effectiveness of health care practices and the translation of that research into practice. Although physicians believe that their practices have always been evidence based, many in the research community do not concede that this is so. Eisenberg identified three different levels at which issues of applying evidence in health care arise:

  1. The clinical level, as practitioners make patient care decisions.
  2. The level of a health care system as, for instance, in selecting particular drugs as part of a formulary or deciding which treatments will be covered under a health plan.
  3. The level of public policy, both after the fact, in legal standards established in court cases, and prospectively, as in Medicare coverage policy determinations.

Eisenberg questioned whether the rules of evidence, that is, the ways in which evidence is brought to bear on the question at hand, are the same in each context. How research-based medical evidence is characterized and how it is used to address different types of questions in different settings were recurring themes over the course of the day.

Mulrow acknowledged that medical research evidence is just one type of evidence that clinicians take into account. She pointed out that even with the explosion over the past decade of published studies (over 2 million annually), of biomedical journals (30,000), and of controlled trials of medical therapies (perhaps a quarter million), research evidence remains limited and spotty; it cannot answer all questions of medical practice policy for all patients and conditions. EBM emphasizes a structured and critical examination of the medical research literature. Mulrow noted that research evidence can be ambiguous and requires interpretation and judicious weighting of its significance. Just how such evidence is assembled and interpreted depends on the use to which it will be put (see Mulrow and Lohr's article in this issue).

The process of changing medical practice in response to EBM is gradual and irregular, many participants noted. Eddy remarked that despite physicians' assumption that their practice is rooted in empirical science, the past three decades have produced incontrovertible evidence that clinical practice deviates from research-based recommendations: "All the studies of variations in practice patterns, . . . of inappropriate care, when you look at what doctors actually do, compared with what we know does and doesn't work, we found we missed the mark not 2, 3, or 5 percent of the time, but 10, 20, 57 percent of the time. It is all over the place. . . . We have to drop the old assumptions."

Mulrow introduced another theme that was echoed in the discussions following her presentation: the relationship of research-based evidence to CPGs and medical standards of care. These concepts, particularly the notion of standards of practice or standard of care, were perhaps the most problematic in terms of what physicians and health care researchers, on the one hand, and lawyers, on the other, understood them to be and do in the context of medical care, as Mulrow observed at the close of the workshop:

Standards are another way in which recommendations based on research evidence might be expressed:

"Standards" were thus understood by the discussants to be enforceable by courts in malpractice cases and other legal disputes. Courts have traditionally established legal standards applicable to health care by reference to customary medical practice and prevailing medical opinion, as testified to by medical experts. Although some medical expert testimony may reflect the expert's awareness of scientific literature as well as his or her knowledge of customary practice, courts have seldom treated such testimony as primarily scientific in character.

As Mulrow and Lohr indicate in their article, legal standards differ from scientific findings of efficacy and safety in potentially incorporating value judgments and cost considerations. However, as long as the legal system looks to medical custom as the principal source of standards, it can incorporate cost-benefit trade-offs only to the extent that physicians make such trade-offs in their clinical choices—as they may be doing increasingly under pressure from payers and managed care plans.

The discussants considered the extent to which CPGs might be used to narrowly prescribe physician practices in a particular health plan as, for instance, in the case of a plan that instructs its medical group to follow certain guidelines. Researchers in the field of EBM did not seem comfortable, however, with such prescriptive use of CPGs, which they characterize as nonbinding recommendations that reflect developments in EBM. The gap between thinking of CPGs as advice to clinicians and using them as prescriptive legal standards was not bridged by the discussants. CPGs, while potentially valuable in establishing standards of care in malpractice cases, have not often been employed in this fashion, as Rosoff notes in his article.


Rules of Evidence, Claims of Medical Expertise, and the Daubert Trilogy

The sea change in the treatment of scientific evidence and expert witnesses anticipated in the wake of the Supreme Court's 1993 Daubert ruling has not yet been realized, Shuman argues in his essay. The traditional "adversarial" model of the American legal system, he writes, "assumes we are more likely to uncover the truth about a contested event as the result of the efforts of the parties who have a self-interest in the outcome of the investigation than from the efforts of a judge charged only with an official duty to investigate the case." Under this model, the lower court trial judge's role of ruling on the admissibility of expert testimony has focused primarily on the expert's qualifications, leaving assessment of the expert's methods and procedures, the substance of his or her testimony, to be determined by the jury. The trial judge has significant discretion in evaluating the expert's qualifications, Shuman noted, citing a 1977 Sixth Circuit Court decision: "But the only question for the trial judge who must decide whether or not to allow the jury to consider a proffered expert's opinions is 'whether his knowledge of the subject matter is such that his opinion will most likely assist the trier of fact in arriving at the truth.' " (United States v. Barker, 553 F.2d 1013, 1024 [6th Cir. 1977]).

This traditional approach was presumably supplanted, at least in the federal courts, by a new model grounded in three Supreme Court decisions of the past decade on the admissibility of expert testimony under the Federal Rules of Evidence: Daubert v. Merrell Dow, General Electric Co. v. Joiner, and Kumho Tire Co. v. Carmichael. The new gatekeeper model requires the trial judge to impose a more rigorous standard for admitting expert testimony, "ensuring that an expert's testimony rests on a reliable foundation and is relevant to the task at hand" (Daubert). Before Daubert, many federal courts had relied on the District of Columbia Court of Appeals' 1923 decision in Frye v. United States (293 F. 1013 [D.C. Cir. 1923]), which looked to professional consensus or general acceptance within the relevant professional community to determine the admissibility of novel scientific evidence. Daubert held that the Federal Rules of Evidence revised this standard. Both Joiner and Kumho Tire, the second and third of the Supreme Court's decisions in this trilogy, made it clear that the trial judge has broad and almost unreviewable discretion in applying Daubert standards for assessing the admissibility of all expert evidence, Shuman explained.

Shuman's review of the impact of the Daubert decision in different areas of civil litigation revealed a mixed record. He finds that the threshold for admissibility of medical and other kinds of expertise has risen significantly in toxic tort and product liability cases but not in medical malpractice litigation. Although critics of medical malpractice suits had hoped that Daubert would eliminate unreliable expert testimony in such cases, Shuman argues that so long as expert (physician) testimony as to customary practice within the local community is accepted on its face (as it continues to be), Daubert challenges to admitting expert testimony as to the standard of care are not likely to be successful. For these courts, the scientific validity of practice standards is simply beside the legal point at issue. To put it more directly, if (as other discussants noted) medical practice does not routinely conform to the best and most current scientific evidence, the courts' continued focus on professional custom for setting the legally recognized standard of care reinforces this disregard of scientific knowledge.

Shuman noted that a number of stratagems can be employed by judges to sidestep the issue of the validity of scientific evidence in deciding a case, but that these appear to some to be applied selectively. He argues that the standard of scientific rigor has consistently increased in product liability and toxic tort cases, where it works to the benefit of defendants. In contrast, in criminal cases, where the prosecution and defense tend to rely on the very same experts in different cases, the Daubert-based challenges to medical testimony have not been raised. If EBM is to be taken seriously in medical quality of care and coverage cases, lawyers and judges must come to believe that scientific validity is truly what matters. If rigorous, empirically based evidence is insisted upon only in cases where it is advantageous for one class of litigants, the demand for better science is unmasked as merely a (possibly biased) legal tactic.

In responding to Daniel Shuman's argument as to the perception of bias in the application of Daubert-inspired standards of evidence, Richard O. Lempert suggested that the Daubert line of cases is a judicial response to docket pressure. The Daubert rule has become one of the means by which the judiciary has, over the past twenty years, extended summary judgment beyond its classic purpose, which was to resolve civil cases when there was no genuine issue of material fact. Daubert increases the number of situations in which summary judgment can be used to dispose of cases that would ordinarily be very lengthy and expensive to try.

In Joiner and Kumho Tire, the courts have said they will not question, except in the most extreme situations, the trial judges' exercise of discretion as to the admissibility of scientific evidence. According to Lempert, some early cases following Daubert suggest that when excluding "junk science" hurts tort plaintiffs, such questionable evidence will be excluded. By contrast, when the state offers junk science in criminal cases to better its chances of convicting a defendant, it will be admitted because that admission will seem to further justice. At some point the practice of using shifting standards of admissibility is going to become obvious and thus intolerable. A single standard must be consistently applied, Lempert argued. He also predicted that the bite of Daubert and its progeny would not be limited to junk science, but that it would be used to exclude good scientific evidence in situations where one party's scientific evidence, though valid, seemed overwhelmed by the evidence on the other side. The danger, arguably realized in Kumho Tire, is that courts, under the guise of deciding preliminary questions of admissibility, would be taking from juries issues that the Seventh Amendment gives to juries to resolve.

Lempert's final point was that although truth is the formal goal of the legal system, a plaintiff need not show his or her claims to be true to win a case and a defendant can prevail without proving the other party's claims are false. To prevail in a case, a party must demonstrate only that a preponderance of the evidence supports the claim. If there is weak scientific evidence for the plaintiff and no evidence for the defendant, is that enough to give the plaintiff a verdict? It depends either on whether the judge decides to exclude the plaintiff's weak evidence on the grounds that the science is not good enough to get to a jury, or on what the jury thinks of the evidence if it does hear it. Contrary to popular perception, overall tort juries show no noticeable pro-plaintiff bias, Lempert claimed, and plaintiffs who get to juries often lose there. A party can also prove its case almost indisputably with scientific evidence. These cases seldom reach court reporters because they are usually dropped or settled without trial, depending on which side has the benefit of the science.

A final set of issues raised by Shuman's review of the Daubert decision's impact on malpractice and coverage cases concerned the extent and rate of uptake of Daubert. As long as medical experts are perceived to be testifying only on the matter of customary practice, they need not be qualified as expert scientists, since the opinions they express relate to what doctors actually do, not what scientific evidence suggests they should be doing. Although David Eddy's observations (quoted earlier) about the incongruence of EBM and actual medical practice might be understood to suggest that custom is a poor source of legal standards, the courts, adhering to a presumption that professionals generally practice according to scientific principles, have yet to look explicitly beyond medical practice directly to scientific evidence to define standards in malpractice cases. Likewise, coverage determinations based on "medical necessity" are often made by reference to professional opinion. For these reasons, courts have not subjected medical testimony in malpractice and coverage disputes to the closer kind of scrutiny that scientific testimony receives under Daubert and its progeny.

Joseph S. Cecil, of the Federal Judicial Center, suggested that federal judges have been caught off guard by the expectations that have arisen following the 1993 decision, and that they have made more than a good faith effort at trying to engage in an informed discussion about the basis of opinion. Acknowledging the limitations of the research he undertook in support of his presentation, Shuman pointed out that because settled cases are unreported, the extent to which courts employed the Daubert standard could not be learned from reported and appellate rulings alone, as these reflected only a small fraction of all cases. More detailed knowledge of judicial performance in malpractice cases and coverage disputes might therefore reveal some spillover from the federal Daubert ruling in state malpractice cases or in cases involving coverage issues.

This and other acknowledged limits of what we can know prompted several discussants to propose empirical research studies of how judges, lawyers, and juries evaluate, understand, and draw inferences from probabilistic and science-based evidence. Lohr suggested that probability and statistics might be applied to protect against making erroneous inferences and that links can be made back to the law from science and medical research in terms of how people think about reasonable doubt and preponderance of evidence. She noted that thresholds for statistical significance may be set differently for different purposes in health outcome measures, in particular for self-report instruments that seek to measure quality of life. For example, the standard for making distinctions between groups is generally lower than the standard for individual patient decisions.
