The Outcome of Outcomes Research at AHCPR: Final Report

Introduction (The past is prologue to the future)

This report has been developed to contribute to ongoing internal discussions at the Agency for Health Care Policy and Research (AHCPR) and for consideration by those involved in Outcomes and Effectiveness Research (OER) in three main areas:

  1. Developing a framework for understanding and communicating the impact of OER on health care practice and outcomes.
  2. Identifying specific examples of projects that illustrate the research impact framework.
  3. Deriving lessons and options from past efforts that may help develop strategies that will increase the measurable impact of future research sponsored by AHCPR.

The paper first provides some background on the development of the original effectiveness initiative in the late 1980s, then describes a framework for organizing the various impacts that OER has on practice and outcomes. Using this framework, and a series of case studies collected by staff from the Center for Outcomes and Effectiveness Research (COER), we list broad accomplishments associated with funded efforts by AHCPR. We then discuss some major lessons that have been learned over the past decade regarding OER, and finally offer a number of specific recommendations that AHCPR should consider in strategic planning. The primary question toward which this analysis is targeted is: "How can the OER program at AHCPR most effectively advance the field of health services research (HSR), contribute to public health, and address the expectations of policymakers and stakeholders?"

The establishment of AHCPR and the OER program in late 1989 stimulated the development of a series of new methods to relate the processes of care to the outcomes that people experience and care about. While relatively simple in concept, OER in fact represented a significant departure from traditional clinical research, in response to payers' and policymakers' concerns that widely demonstrated practice variations represented important sources of "cost without benefit." From the outset, however, the boundaries of OER have not been sharply defined. AHCPR's funding line in the budget labeled "MEDTEP" (Medical Treatment Effectiveness Program) supported both OER and clinical practice guidelines development. In fact, AHCPR's enabling legislation reflects a tension between an expectation that research conducted in typical practice settings would lead to sustained changes in practice, and an opposing belief that a specific intervention in the form of guidelines would be needed to facilitate improvements in practice. Responses to internal efforts to solicit customer input for future priorities (e.g., a Federal Register notice published in 1996) indicate that many people outside of AHCPR still consider clinical practice guidelines to be an intrinsic component of "OER." Indeed, AHCPR's annual reports to Congress on the status of the MEDTEP program focused almost exclusively on guidelines rather than research.

The focus of this report, however, is the research program. Just as OER represents a new approach to evaluating clinical practice, so too this analysis represents a new attempt to evaluate the value of research investments over the past decade. Evaluation of research investments has not been a systematic component of the research enterprise in the United States (OTA, 1994), so there are few precedents to inform the approach used here. However, it was and remains clear that AHCPR's existence reflected a strong belief on the part of Congress and other stakeholders that "success" should be assessed in terms that move well beyond the traditional outputs of the research enterprise (i.e., publications). While one clear motivation for producing this report was a perception that OER specifically, and the work of AHCPR generally, is not always supported unequivocally by policymakers (particularly members of Congress), it is also our hope that this effort ("the outcomes of outcomes research") is an important first step toward redefining the goals of OER, as well as an honest appraisal of prior successes and opportunities for improvement.

President John F. Kennedy, quoting the philosopher George Santayana, once remarked that those who could not learn history's lessons were doomed to repeat them. The most important purpose of this analysis is to build on what has been done well and to learn from what has been done less well in order to develop strategies to enhance the visible impact of sponsored research in the future.


Background

History of the science

The expectations that surround outcomes research can best be understood in the context of events that contributed to the establishment of AHCPR in 1989. One major milestone was the implementation of the prospective payment system (PPS) for Medicare inpatient care in the mid-1980s. Soon after the system was in place, the public and policymakers became concerned about Medicare patients being forced out of the hospital because "their DRG had run out." (DRG stands for Diagnosis Related Group. Under PPS, the insurer pays the hospital a lump sum based on the patient's illness rather than on the number of days in the hospital or the type of care provided. The concern was that patients would be discharged once they had stayed in the hospital a certain number of days rather than when they were clinically ready to leave.) At congressional hearings on the quality of care under Medicare, the phrase "quicker but sicker" captured the central concern about the impact of the new financial incentives. William Roper, who became HCFA Administrator in 1986, promoted the use of Medicare databases to monitor the quality of hospital care through measurement of mortality rates, readmission rates, and other adverse outcomes.

In the decade prior to the institution of PPS, John Wennberg and others had been developing a conceptual framework and methods for exploring the impact of health care services on patient outcomes. In 1987, several meetings were convened by the Department of Health and Human Services (HHS) that included Roper, as well as Wennberg, David Eddy, Robert Brook, and others to explore whether Medicare databases could be useful on a large scale for quality monitoring and improvement. Wennberg's work on geographic variations in medical practice (McPherson et al., 1982), studies on appropriateness of care led by Brook (Leape et al., 1990; Chassin et al., 1987), and Eddy's analysis of the poor quality of medical evidence (Eddy and Billings, 1988) set the stage for a major Federal initiative to improve the knowledge base for medicine. Roper and others announced this effectiveness initiative in a New England Journal of Medicine article in 1988 (Roper et al., 1988). The major responsibility for carrying this initiative forward found an institutional home when AHCPR was established in 1989.

At congressional hearings in support of the potential value of OER, John Wennberg was a frequent witness. In one of his appearances, he described his work comparing patterns of practice and outcomes in Boston and New Haven, showing that the additional resources consumed in Boston were not associated with better outcomes when compared with the thriftier New Haven practice patterns. Wrapping up his testimony, he advised members of Congress that "If 10 Bostons could become New Havens, the savings to Medicare would amount to $500 million." By implication, for this to happen, the OER community would have to conduct "the necessary scientific studies that allow physicians to define optimum treatments..." (Wennberg, 1984). This framed the conceptual paradigm for (and expectations of) effectiveness research: Through the retrospective study of patterns of care, optimal treatments would be defined, and substantial economic savings would be achieved.

The effectiveness initiative itself represented an important hypothesis: Guidance for optimal medical practice could be gleaned from analysis of data routinely gathered in the process of delivering and paying for patient care. AHCPR is in part the institutional embodiment of that hypothesis, and the output of the past decade offers some empirical evidence with which to assess its validity.

While some influential members of Congress were convinced of the value of establishing a major new program in OER, much larger forces and stakeholders dominated the politics surrounding the establishment of AHCPR (Gray, 1992). With the proposal to establish the Agency virtually dead because of the requirement of the balanced budget amendment, the American Medical Association (AMA) pushed strongly to keep the Agency. The AMA was negotiating physician payment reforms, and probably supported AHCPR in part to rationalize its opposition to expenditure targets with automatic fee reductions. Support of OER demonstrated the medical profession's commitment to reducing waste through scientific study and evidence-based guidelines. There was not, however, a widespread and deep-seated belief among policymakers and stakeholders that Federal support for OER (and guidelines) was essential.

These few historical observations highlight at least three important themes about the policy context of OER that continue to be relevant. First, the effectiveness initiative was explicitly constructed around substituting analytic efforts for clinical trials. Database analysis, systematic literature review, decision analysis, and guideline development were the methodological staples of early OER. The extent to which questions about "what works" in health care could be answered with these methods is gradually becoming clearer. Second, policymakers were explicitly told that large, measurable savings would result from better studies of the effectiveness of health care. And third, the political support of organized medicine was context-dependent, and would not necessarily be dependable over time. There was not a deeply held commitment to OER in the policy or professional community that could be relied on when the value of this activity was re-examined by Federal policymakers.

Taken together, these themes highlight AHCPR/COER's current and future challenge. The substance of the work undertaken by this Agency is analytically complex, and often apparently resistant to easy translation for policymakers and clinicians. Despite this complexity, expectations continue to be very high that research done by the Agency will have clear, measurable impact on health care quality and costs.


Definition of OER

The terms "outcomes research" and "effectiveness research" have been used to refer to a wide range of studies, and there is no single definition for either that has gained widespread acceptance. As these fields evolved, it appears that "outcomes research" emerged from a new emphasis on measuring a greater variety of impacts on patients and patient care (function, quality of life, satisfaction, readmissions, costs, etc.). The term "effectiveness research" was used to emphasize the contrast with efficacy studies, and highlighted the goal of learning how medical interventions affected real patients in "typical" practice settings (OTA, 1994). Effectiveness studies sought to understand the impact of health care on patients with diverse characteristics, rather than highly homogeneous study populations. While the terms may have different initial roots, there does not appear to be much value in distinguishing these activities, and the field is generally referred to as OER.

For purposes of this paper, we have adopted the following definition:

OER evaluates the impact of health care (including discrete interventions such as particular drugs, medical devices, and procedures as well as broader programmatic or system interventions) on the health outcomes of patients and populations. OER may include evaluation of economic impacts linked to health outcomes, such as cost-effectiveness and cost utility. OER emphasizes health problem- (or disease-) oriented evaluations of care delivered in general, real-world settings; multidisciplinary teams; and a wide range of outcomes, including mortality, morbidity, functional status, mental well-being, and other aspects of health-related quality of life. OER may entail any in a range of primary data collection methods and secondary (or "synthetic") methods that combine data from primary studies (Mendelson et al., 1998).

Technically, studies that describe patterns of care without reporting "outcomes" might be more appropriately called health services research rather than OER. For example, a study that shows rates of cardiovascular procedures vary by race or gender might be an OER study if it also reported mortality for these demographic groups, but not if it reported only the utilization patterns. For this report, we consider descriptive studies of patterns of care to be part of the spectrum of OER studies. This is in part because these studies often provided the initial "map" that made subsequent outcomes studies possible, and in part because a focus on variations in practice was a critical stimulus for identifying important topics for further studies that did explore outcomes.


Data sources

A number of resources were used to develop this report, including: review of published articles from grants; review of the original conceptual papers that launched the "outcomes movement," as well as critiques of those papers; an analysis of private sector involvement in OER conducted by Lewin; a survey of Principal Investigators (PIs); interviews with selected PIs; interviews with a former Director of the Center for Medical Effectiveness Research, AHCPR; discussions with COER staff; and recommendations made by investigators and stakeholders at two expert meetings (January and October 1997). A framework for assessing the impact of funded studies was developed and used in the analysis, along with selected "all star" case studies of specific projects.


Framework—From Research Findings to Clinical Excellence

Description

One major impetus for developing this report on the status of OER was the need to translate a growing body of research into relevant insights for policymakers (public policymakers, systems leaders, and clinical policymakers). In addition, the challenge of identifying evidence of impact on clinical practice, which often occurs long after the grant support has concluded, stimulated a clear need to determine where and when AHCPR-supported OER has influenced practice. This led to some careful thinking about the different types of results, or impact, that are prompted by OER. A framework was developed that outlines an idealized process by which basic findings in OER are linked over time to increasingly concrete impacts on the health of patients. This framework was then developed into a more detailed conceptual diagram (Figure 1). Examples of the various levels are provided in Table 1.

Level 1 impacts: All effects of research studies that do not represent a direct change in policy or practice. This includes new tools and methods for research, instruments and techniques to assist clinical decisionmaking, and studies that identify areas in which scientific knowledge is needed. For example, some studies have produced analytic tools for use in other research or clinical practice, such as the VF-14 (the Visual Function-14 measure), the benign prostatic hyperplasia (BPH) symptom index, and severity adjustment methods such as the Total Illness Burden Index. Level 1 impacts are also produced when studies describe findings inconsistent with current clinical paradigms, and stimulate rethinking and questioning within a clinical specialty.

Level 2 impacts: A policy or program is created as a direct result of the research (e.g., use of the information by health plans, professional organizations, legislative bodies, regulators, accrediting organizations, etc.).

Level 3 impacts: A change in what clinicians or patients do, or changes in a pattern of care. Level 3a impacts are those that are demonstrated in a limited study population as a result of a specific intervention. Level 3b impacts are trends identified outside a formal research context.

Level 4 impacts: Actual impact on health outcomes (clinical, economic, quality of life, satisfaction). Level 4a impacts are those demonstrated in a limited study population as a result of a specific intervention. Level 4b impacts are those identified outside a formal research context.
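
The four levels defined above can be read as a small classification scheme. As an illustrative sketch only (the report itself contains no code, and the names `Level`, `Impact`, `label`, and `highest_level` are hypothetical), the Python below encodes the levels and tags one example per level taken from Table 1. The a/b sub-level is left unset for those examples because Table 1 does not state it.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable, Optional


class Level(IntEnum):
    """Levels of impact as defined in the framework (names are assumptions)."""
    KNOWLEDGE = 1   # effects that are not a direct change in policy or practice
    POLICY = 2      # a policy or program created as a direct result of research
    PRACTICE = 3    # a change in what clinicians or patients do
    OUTCOMES = 4    # actual impact on health outcomes


@dataclass(frozen=True)
class Impact:
    citation: str
    level: Level
    # For levels 3 and 4 only: True means sub-level "a" (limited study
    # population), False means "b" (outside a formal research context),
    # None means the sub-level is not known.
    in_study_population: Optional[bool] = None

    def label(self) -> str:
        """Return the framework label, e.g. '2', '3a', or '4b'."""
        if self.level < Level.PRACTICE or self.in_study_population is None:
            return str(int(self.level))
        return f"{int(self.level)}{'a' if self.in_study_population else 'b'}"


def highest_level(impacts: Iterable[Impact]) -> Optional[Level]:
    """Highest level reached by any impact, or None if there are none."""
    return max((i.level for i in impacts), default=None)


# One example per level, using the level assignments given in Table 1.
TABLE_1_EXAMPLES = [
    Impact("Bates et al., 1995", Level.KNOWLEDGE),
    Impact("Berman et al., 1997", Level.POLICY),
    Impact("Goldenberg, 1998", Level.PRACTICE),
    Impact("O'Connor et al., 1996", Level.OUTCOMES),
]
```

Using an ordered enumeration is a design choice made for this sketch: it lets a portfolio of projects be summarized by the highest level reached, without implying that every project passes through the levels in sequence.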


Table 1. Levels of Impact and Examples

Level 1: Impact on knowledge base, future research.

  Adverse drug events occur in 6.5 percent of admissions and result in an additional length of stay of 2.2 days and costs of $3,244 (Bates et al., 1995; Bates et al., 1997).

  In a study of long-term outcomes following lumbar disk surgery, a literature synthesis showed that there was better short-term relief with surgery than conservative care but after 4 years, outcomes were similar. Ten percent of patients underwent additional surgery (Deyo and Patrick, 1995).

Level 2: Impact on policies and change agents.

  Children receiving less expensive antibiotics for otitis media did as well as or better than those receiving more expensive antibiotics. Led to development of guidelines by the American Academy of Pediatrics recommending less expensive antibiotics and to a HEDIS quality measure (Berman et al., 1997).

  PTCA mortality is related to volume of procedures performed by the cardiologist and the hospital. Led to recommendations by the American College of Cardiology (ACC) and American Heart Association (AHA) to raise volume requirements for cardiologists (Hannan et al., 1997; Hirshfeld et al., 1998).

Level 3: Impact on clinical practice.

  Dissemination of information about indications for antenatal corticosteroids increased their use from 20 to 70 percent of appropriate cases; the increase was significantly greater in hospitals with active than with passive dissemination efforts (Goldenberg, 1998).

  Developed VF-14 measure to assess indications for and outcomes after cataract surgery. Replaced visual acuity as gold standard. Now routinely used by ophthalmologists and required by the National Eye Institute for sponsored research (Steinberg et al., 1994).

Level 4: Impact on patient outcomes.

  Data feedback, training in continuous quality improvement, and visits to other medical centers reduced CABG mortality by 24 percent (O'Connor et al., 1996).

  The Pneumonia Severity Index was used to triage patients with community-acquired pneumonia to inpatient or outpatient therapy. Patients triaged to outpatient care were more satisfied with their care and returned to work and usual activities more quickly. Outpatient care was safe and resulted in measurable savings (Fine et al., 1994).



Implications

According to this model, impacts at lower levels may be prerequisites for achieving impact at higher levels. Improvements in outcomes (level 4) are built on a foundation of studies that have identified problems, created new analytic and measurement tools to explore those problems, and compared different approaches to managing the problem. A framework for evaluating the success of OER provides a context for linking progress in basic studies with changes in practice and improvement in outcomes. It is a conceptual model rather than a literal step-by-step description of the OER process. The process from scientific knowledge development to its practical application will rarely be as systematic and orderly—or strategic—as described in this framework. The main purpose of this framework is to emphasize the relationship between research that does not directly induce or document changes in patterns of care or improved outcomes and subsequent improvements in population health.

The levels of impact also make clear the challenge facing AHCPR in conveying to policymakers the value of OER. Judgments about the value of OER will depend heavily on the level of impact expected by sponsors, stakeholders, and policymakers. That perspective will determine whether level 1 or 2 impacts are understood to be important contributions or another example of wasted Federal research dollars.

Level 1 impacts clearly make a contribution to the health care knowledge base. For example, extensive literature reviews were done by the low birthweight Patient Outcomes Research Team (PORT) to document the lack of evidence of benefit for numerous popular interventions in pregnancy (NEJM, August 1998). This would be classified as a level 1 impact, but it is potentially of considerable importance: it allows certain practices to be discouraged and an appropriate research agenda for promising but unproven interventions to begin.

Documentation of level 2 impacts provides suggestive evidence that a change in a health outcome will result, but may still be viewed as inadequate by policymakers. For example, the inclusion in HEDIS of the rate of post-myocardial infarction beta-blocker use in the elderly will probably lead to greater use of that therapy. Randomized studies have already demonstrated that mortality will decrease with use of beta-blockers in these patients. In this case, the level 2 impact (a policy change by the National Committee for Quality Assurance [NCQA]) is very likely to prompt improvement efforts in many organizations. Past experience suggests that the introduction of new quality measures is associated with successful interventions to alter practice, but this connection may not be self-evident to key decisionmakers.

Even when level 3 or 4 impacts are observed, it is rarely straightforward to link these changes in health care practice or outcome to studies that may have contributed to them. Other factors will usually need to be identified to provide an adequate explanation for why a specific health improvement occurs. The complexity of the process of health care decisions, and the health care system itself, ensures that any change in that system will be the consequence of multiple interrelated factors, some of which are controllable, and others of which are not. For example, the 50 percent reduction in the rate of prostate surgery over the past decade is associated with several important changes in knowledge about and treatment of prostatism (such as new drug therapy as well as improved understanding of the risks and benefits of treatment). While many observers credit the PORT investigators with the trend toward less aggressive surgical management of benign prostate disease, it is challenging to isolate the role of an individual factor when many forces are at work simultaneously.

The impact framework draws upon and highlights several observations about the process by which OER builds upon itself and influences health care policy and practice. First, it helps to explain the incremental nature of the process that begins with discovering which questions to ask and ends with improved health outcomes. A better understanding of this process by researchers and Agency staff may stimulate improved explanations of the value of research with low level impacts.

Second, the framework portrays a process of change that is more complex and subject to external influences than was understood when AHCPR was established. The simplified view widely held in 1989 was that clinicians directly incorporated new research findings into practice, and were therefore the primary audience for OER. Changes in the external environment of health care have made it clear that clinicians are influenced by multiple factors and forces, and that information is necessary but not sufficient to influence behavior (Davis et al., 1995). Level 2 of the framework identifies a number of "change agents" through which research findings may be transmitted to decisionmakers. This reflects the growing recognition that the policies, organization, financial arrangements, and other features of health care organizations play an important role in the translation of research into practice. In order to have practical value, research on medical effectiveness must be designed to capture the heterogeneity of organizations.

The first wave of OER studies sought to understand the relationship between patient characteristics and outcomes. It is now clear that understanding the relationship between characteristics of health organizations and outcomes is another requirement for producing useful effectiveness studies. Figure 2 provides a conceptual model which blends the levels of impact with this understanding of the importance of organizations.

Third, the complexity of the change process ensures that any beneficial changes that do occur will be difficult to trace back to OER studies that may have contributed to them. The Agency needs to be more purposeful about identifying high level impacts and tracing them back to related OER projects.

Fourth, the framework begins to provide some guidance for considering strategies to improve impact, and to assess any improvements that occur. While the process is long and complex, an important challenge for AHCPR is to shorten this time frame. The framework offers one roadmap for researchers and funders to use when considering new studies and implementation efforts. Coordinated strategies to achieve level 3 and 4 impacts should be formulated early and updated often as part of the research process.

Finally, by clarifying different types of "impact" that occur at different levels, the framework defines accountability for the OER enterprise. We can identify what investigators mean by impact, and what policymakers understand impact to be. While these definitions may have differed in the past, the more detailed understanding of "impact" should help focus discussions of what still needs to be achieved.

