Background Article

Current Methods of the U.S. Preventive Services Task Force: A Review of the Process

Russell P. Harris, M.D., M.P.H.^a, Mark Helfand, M.D., M.S.^b, Steven H. Woolf, M.D., M.P.H.^c, Kathleen N. Lohr, Ph.D.^d, Cynthia D. Mulrow, M.D.^e, Steven M. Teutsch, M.D., M.P.H.^f, and David Atkins, M.D., M.P.H.^g, for the Methods Work Group, Third U.S. Preventive Services Task Force.¹

Address correspondence to: Russell P. Harris, M.D., M.P.H., Cecil G. Sheps Center for Health Services Research, CB# 7590, 725 Airport Rd., The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7590. E-mail: rharris@med.unc.edu

This article originally appeared in the American Journal of Preventive Medicine.

Abstract
Introduction
Scope and Selection of Topics
   Scope
   Selection of Topics
Review of the Evidence
   Intensity
   Setting the Focus for Admissible Evidence
   Literature Search and Abstraction
Assessing Magnitude of Net Benefit
   Assessing Magnitude of Benefits
   Assessing Magnitude of Harms
   Assessing Net Benefits: Weighing Benefits and Harms
Extrapolation and Generalization
Translating Evidence into Recommendations
   General Principles
   Codes and Wording of Statements
Drafting the Report
External Review
Conclusion
Acknowledgements
References and Notes

Abstract

The U.S. Preventive Services Task Force (USPSTF/Task Force) represents one of several efforts to take a more evidence-based approach to the development of clinical practice guidelines. As methods have matured for assembling and reviewing evidence and for translating evidence into guidelines, so too have the methods of the USPSTF. This paper summarizes the current methods of the third USPSTF, supported by the Agency for Healthcare Research and Quality (AHRQ) and two of the AHRQ Evidence-based Practice Centers (EPCs).

The Task Force limits the topics it reviews to those conditions that cause a large burden of suffering to society and that also have available a potentially effective preventive service. It focuses its reviews on the questions and evidence most critical to making a recommendation. It uses analytic frameworks to specify the linkages and key questions connecting the preventive service with health outcomes. These linkages, together with explicit inclusion criteria, guide the literature searches for admissible evidence.

Once assembled, admissible evidence is reviewed at three strata: (1) the individual study, (2) the body of evidence concerning a single linkage in the analytic framework, and (3) the body of evidence concerning the entire preventive service. For each stratum, the Task Force uses explicit criteria as general guidelines to assign one of three grades of evidence: good, fair, or poor. Good or fair quality evidence for the entire preventive service must include studies of sufficient design and quality to provide an unbroken chain of evidence-supported linkages, generalizable to the general primary care population, that connect the preventive service with health outcomes. Poor evidence contains a formidable break in the evidence chain such that the connection between the preventive service and health outcomes is uncertain.

For services supported by overall good or fair evidence, the Task Force uses outcomes tables to help categorize the magnitude of benefits, harms, and net benefit from implementation of the preventive service into one of four categories: substantial, moderate, small, or zero/negative.

The Task Force uses its assessment of the evidence and magnitude of net benefit to make a recommendation, coded as a letter: from A (strongly recommended) to D (recommend against). It gives an I recommendation in situations in which the evidence is insufficient to determine net benefit.

The third Task Force and the EPCs will continue to examine a variety of methodologic issues and document work group progress in future communications.

Keywords: MEDLINE; preventive health services; evidence-based medicine; methods; practice guidelines.

Introduction

The U.S. Preventive Services Task Force (Task Force/USPSTF) represents one of several efforts by governments and national organizations to take a more evidence-based approach to the development of clinical practice guidelines. Guidelines developed by an evidence-based approach tend to be based on conclusions supported more by scientific evidence than by expert opinion (1). Efforts are made to link the strength of recommendations to the quality of evidence, to make that linkage transparent and explicit, and to ensure that the review of evidence is comprehensive, objective, and attentive to quality (2).

Methods for reviewing the evidence have matured over the years as groups have gained experience in developing evidence-based guidelines. Systematic searches of multiple bibliographic research databases help ensure thorough and unbiased identification of the relevant literature. Predetermined selection criteria minimize bias and improve the efficiency of reviewing that literature. Quality criteria developed by methodologists guide judgments of weaknesses and strengths of individual research studies. Frameworks and models explicitly define methods for rating and integrating multiple pieces of heterogeneous evidence (3).

Methods for linking evidence and recommendations have also matured (4). Initially the recommendations of the USPSTF and other evidence-based groups were strongly correlated with the research design of the most important studies. An A recommendation, for example, usually meant that use of the preventive service was supported by a randomized controlled trial (RCT) (5,6). Guideline developers now understand the need to consider the evidence as a whole, including the trade-offs among benefits, harms, and costs and the net benefit relative to other health care needs for optimal resource allocation (7).

In the case of prevention, moreover, special scientific and policy considerations apply in reviewing evidence and setting policy. Preventive services require a distinctive logic in considering, for example, the incremental benefit of early detection or the ability of counselors to motivate behavior change. Because the populations affected by preventive care recommendations are often large and have no recognized symptoms or signs of the target condition, harms incurred by even a small percentage can affect a large number of people. Thus, the potential for doing greater harm than good must be taken seriously.

In the context of these methodologic advances and with an awareness of the many unresolved issues for which sound methods are lacking, the third Task Force formed a methods subcommittee (Methods Work Group). It comprises members of the Task Force, representatives of the Canadian Task Force on Preventive Health Care, staff of the two Evidence-based Practice Centers (EPCs) that support the Task Force, and staff of the Agency for Healthcare Research and Quality (AHRQ). The mission of the Work Group is to revisit methods used by previous U.S. Preventive Services Task Forces, to develop more sophisticated methods to be used in current work, and to understand better the theoretical considerations for problems that lack easy answers.

The discussions of this group and subsequent discussions by the entire Task Force have led to several modifications of Task Force methods and identified areas that need further examination. This article describes the methods in current use by the third USPSTF. As the Task Force identifies better ways to do its work, the Methods Work Group will explore additional revisions and refinements to its methods.

We discuss these changes in the sequence of steps of recommendation development: scope and selection of topics, review of the evidence, assessing the magnitude of net benefit, extrapolation and generalization, translating evidence into recommendations, drafting the report, and external review.

Scope and Selection of Topics

Scope

In defining its scope of interest, the Task Force must consider types of services, populations of patients and providers, and sites for which its recommendations are intended. Clarifying these definitions has both methodologic and practical importance. Resource limitations make it impossible for the Task Force to review evidence for all services that prevent disease; the project must, therefore, set boundaries.

The third Task Force has retained the previous policy of focusing on screening tests, counseling interventions, immunizations, and chemoprevention delivered to persons without recognized symptoms or signs of the target condition.

As in the past, this Task Force decided not to make recommendations concerning services to prevent complications in patients with established disease (e.g., coronary artery disease and diabetes). It does, however, make recommendations for preventing morbidity or mortality from a second condition among those who have a different established disease.

The Task Force does make recommendations for people at different levels of risk for a condition. Many people in the general population have one or more risk factors for the Task Force's target conditions. Because the balance between benefits and harms sometimes differs between people at higher risk and those at lower risk, Task Force recommendations may vary across these different groups.

Although the Task Force does not conduct systematic searches of evidence for services to prevent complications in people with established disease, it may cite such studies when they are relevant for people without established disease. Often, compelling evidence that screening tests and treatments can reduce morbidity and mortality comes from patients with extant disease rather than from asymptomatic populations. For example, the review of lipid screening (8) would be incomplete if it did not discuss studies of the efficacy of statins in patients with coronary artery disease.

The populations for whom Task Force recommendations are intended include patients seen in traditional primary care or other clinical settings (e.g., dieticians' offices, cardiologists' offices, emergency departments, hospitals, school-based clinics, urgent care facilities, student health clinics, family planning clinics, nursing homes, and homes). As before, the third Task Force has excluded consideration of preventive services outside the clinical setting (e.g., nonclinic-based programs at schools, worksites, and shopping centers), leaving this analysis to the Centers for Disease Control and Prevention's (CDC) Guide to Community Preventive Services (9). For selected topics, however, the Task Force may examine evidence from community-based settings to evaluate the effectiveness of interventions conducted in the clinical arena.

Selection of Topics

In the second edition of the Guide to Clinical Preventive Services (6), the Task Force reviewed 70 preventive care topics, including more than 100 actual services. These had been selected on the basis of the burden of suffering to society or individuals and the potential effectiveness of one or more preventive interventions. The third Task Force briefly considered using an explicit grading process to rank the priority of topics. Because the second Task Force had undertaken such an exercise with disappointing results, the current Task Force did not pursue it.

Instead, the third Task Force started with the topics reviewed in the second Guide to Clinical Preventive Services (6). From the 70 topics, the EPCs, AHRQ, and Task Force leaders identified 55 likely to have new evidence or continued controversy. For these 55 topics, the EPCs undertook limited literature searches and prepared brief summaries of the new evidence, current controversies, and critical issues. The EPCs prepared similar summaries of 15 new topics suggested by previous Task Force members, the public, outside experts, federal agencies, and health care organizations. AHRQ and the EPCs also invited about 60 private health and consumer groups and federal agencies to rate the need to update old chapters and to nominate new topics.

Based on this information, the USPSTF ranked the priority of topics at its first meeting in November 1998. It initially assigned 12 topics to the two EPCs (six to each EPC) for review and has subsequently added more topics in a phased schedule (Table 1).

Table 1. Topics completed or under review by the third U.S. Preventive Services Task Force.

Evidence-based Practice Center: Research Triangle Institute—University of North Carolina.

Updates:

Screening for and treating adults for lipid disorders.
Screening for type 2 diabetes mellitus.
Counseling in the clinical setting to prevent unintended pregnancy.
Counseling to promote a healthy diet.
Screening for visual impairment in children aged 0 to 5 years.
Screening for depression.
Screening for cervical cancer.
Screening for prostate cancer.
Screening for colorectal cancer.
Aspirin chemoprevention for the primary prevention of cardiovascular events.
Screening for hypertension.
Screening for gestational diabetes.
Screening for asymptomatic coronary artery disease.
Screening for dementia.
Screening for obesity.
Screening for suicide risk.
Counseling to prevent dental and periodontal disease.

New:

Chemoprevention of breast cancer.
Screening for developmental delay.

Evidence-based Practice Center: Oregon Health Sciences University.

Updates:

Screening for breast cancer.
Screening for skin cancer.
Counseling to prevent skin cancer.
Screening for family violence.
Screening for problem drinking.
Counseling to prevent youth violence.
Postmenopausal hormone chemoprevention.
Screening for chlamydial infection.
Universal newborn hearing screening.
Screening for lung cancer.
Screening for ovarian cancer.
Screening for iron deficiency anemia.
Screening for neural tube defects.
Screening for asymptomatic carotid artery stenosis.
Screening for Down syndrome.
Screening for osteoporosis.
Counseling to promote physical activity.

New:

Screening for bacterial vaginosis in pregnancy.
Counseling to promote breastfeeding.
Vitamin supplementation to prevent cancer and cardiovascular disease.

The responsible EPC assigns a lead author and a variable number of additional local personnel to each topic. The Task Force assigns two or three of its own members ("Task Force liaisons") to collaborate on the review. The local EPC group and the Task Force liaisons constitute the "topic team" for each review. The EPCs make certain that all topic team personnel are trained in Task Force methods and the content area of the review.

Review of the Evidence

Intensity

Current methods for conducting systematic reviews emphasize a comprehensive literature search and evaluation and detailed documentation of methods and findings (10). An advantage of this approach is that it avoids the tendency of some guideline panels to cite evidence selectively in support of their recommendations. This approach also enables others outside the process to understand, judge, and replicate the interpretation of the evidence. The disadvantage of this approach is that it produces long, detailed reports of interest to a minority of readers and of limited value to busy clinicians. The process is also resource intensive and requires months of work and considerable expenditures for literature searches and staff. Despite the disadvantages, many evidence-based groups use this approach when reviewing evidence.

For a group such as the Task Force and its EPCs, which must examine multiple topics at once, limited resources and time require compromises in the intensity of reviews. Full-scale systematic reviews for every topic considered are not possible. One strategy for striking a balance, already noted, is topic prioritization. Another strategy, initiated by the second Task Force, is to focus the review on the questions and evidence most critical to making a recommendation.

Setting the Focus for Admissible Evidence

Analytic Framework

The second Task Force introduced diagrams, called "causal pathways," to map out the specific linkages in the evidence that must be present for a preventive service to be considered effective. The third Task Force retained these diagrams, renaming them "analytic frameworks." The analytic framework (Figures 1 and 2) uses a graphical format to make explicit the populations, preventive services, diagnostic or therapeutic interventions, and intermediate and health outcomes to be considered in the review. It demonstrates the chain of logic that evidence must support to link the preventive service to improved health outcomes (11,12,13).

In the analytic framework, the arrows ("linkages"), labeled with a preventive service or a treatment, represent the questions that evidence must answer; dotted lines represent associations; rectangles represent the intermediate outcomes (rounded corners) or the health states (square corners) by which those linkages are measured. Figure 1 illustrates the analytic framework for a screening service, in which a population at risk (left side of the figure) undergoes a screening test to identify early-stage disease. A generic analytic framework for a counseling topic is given in Figure 2.

In Figure 1, an "overarching" linkage (arrow 1) above the primary framework represents evidence that directly links screening to changes in health outcomes. For example, an RCT of chlamydia screening established a direct, causal connection between screening and a reduction in pelvic inflammatory disease (14). That is, a single body of evidence establishes the connection between the preventive service (screening) and health outcomes.

When direct evidence is lacking or is of insufficient quality to be convincing, the Task Force relies on a chain of linkages to assess the effectiveness of a service. In Figure 1, these linkages correspond to key questions about the accuracy of screening tests (arrow 3), the efficacy of treatment (arrow 4 or arrow 5 for intermediate or health outcomes, respectively), and the association between intermediate measures and health outcomes (dotted line 6). Intermediate outcomes (e.g., changes in serum lipid levels or eradication of chlamydia infection as measured by a DNA probe) are often used in studies as indicators of efficacy; health outcomes are measures that a patient can feel or experience, including death, quality of life, pain, and function. Curved arrows below the primary framework (arrows 7 and 8 in Figure 1) indicate adverse events or harms (ovals). Each arrow in the analytic framework relates to one or more "key questions" that specify the evidence required to establish the linkage (see the legends for Figures 1 and 2). These questions help organize the literature searches, the results of the review, and the writing of reports.

As can be seen in Figures 1 and 2, the framework supporting a service is considered indirect if two or more bodies of evidence are required to assess the effectiveness of the service. For example, no controlled studies provide direct evidence that screening for skin cancer lowers mortality (15). To infer benefit, one must piece together evidence about the accuracy of the screening test, how much earlier screening detects skin cancer or its precursors than would be the case without screening, the existence of effective treatment, whether treatment at an earlier stage improves health outcomes, and the existence and magnitude of associated harms. These criteria are similar to those outlined by the World Health Organization (16) and by Frame and Carlson (17).
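
The distinction between direct and indirect evidence can be sketched as a small data structure. The following Python fragment is an illustration only, not a tool used by the Task Force: the names `Linkage` and `classify_evidence`, the boolean `supported` flag, and the "broken" label are assumptions introduced here, and the flags in the example are placeholders rather than findings of any review.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Linkage:
    """One arrow (or dotted line) in an analytic framework."""
    label: str
    supported: bool  # hypothetical flag reflecting a reviewer's judgment


def classify_evidence(overarching: Linkage, chain: Sequence[Linkage]) -> str:
    """Classify how a preventive service connects to health outcomes.

    'direct'   - the overarching linkage is itself supported
    'indirect' - every linkage in the chain is supported
    'broken'   - at least one required linkage is unsupported
    """
    if overarching.supported:
        return "direct"
    if chain and all(link.supported for link in chain):
        return "indirect"
    return "broken"


# Illustrative flags only; they are not findings of any review.
overarching = Linkage("screening -> health outcomes", supported=False)
chain = [
    Linkage("accuracy of the screening test", supported=True),
    Linkage("efficacy of treatment", supported=True),
    Linkage("intermediate outcome -> health outcome", supported=False),
]
print(classify_evidence(overarching, chain))  # broken
```

In practice each linkage is judged on a body of evidence rather than a single yes/no flag; the boolean is a deliberate simplification to show why one unsupported linkage interrupts the whole chain.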

Admissible evidence

The third Task Force focuses its reviews primarily on the evidence most likely to influence recommendations. For example, it maintains the tradition of giving greater weight to evidence that preventive services influence health outcomes rather than intermediate outcomes. Although some intermediate outcomes (e.g., advanced-stage breast or colon cancer) are so closely associated with health outcomes that they are logical surrogates, many others (e.g., physiological changes or histopathologic findings) are less convincing because their reliability in predicting adverse health outcomes has weaker scientific support (18,19). Accordingly, the topic teams often do not fully review studies that do not address outcomes of interest.

The topic team determines the bibliographic databases to be searched and the specific inclusion and exclusion criteria (i.e., admissible evidence) for the literature on each key question. Such criteria typically include study design, population studied, year of study, outcomes assessed, and length of follow-up. Topic teams specify criteria on a topic-by-topic basis rather than adhering to generic criteria. If high-quality evidence is available, the topic teams may exclude lower-quality studies. Conversely, if higher-quality evidence is lacking, the teams may examine lower-quality evidence. In general, the topic teams exclude non-English language references.
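
As a rough sketch of how such topic-specific criteria might be applied, the Python fragment below filters a study record against one set of inclusion criteria. All names (`Study`, `Criteria`, `is_admissible`) and all values in the example are hypothetical; actual criteria are set by each topic team for each key question.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet


@dataclass(frozen=True)
class Study:
    design: str
    population: str
    year: int
    language: str
    outcomes: FrozenSet[str]
    followup_months: int


@dataclass(frozen=True)
class Criteria:
    designs: FrozenSet[str]
    populations: FrozenSet[str]
    earliest_year: int
    outcomes: FrozenSet[str]
    min_followup_months: int
    languages: FrozenSet[str] = frozenset({"English"})


def is_admissible(study: Study, criteria: Criteria) -> bool:
    """Return True only if the study meets every inclusion criterion."""
    return (
        study.design in criteria.designs
        and study.population in criteria.populations
        and study.year >= criteria.earliest_year
        and study.language in criteria.languages
        and bool(study.outcomes & criteria.outcomes)  # at least one outcome of interest
        and study.followup_months >= criteria.min_followup_months
    )


# Hypothetical criteria and study, for illustration only.
criteria = Criteria(
    designs=frozenset({"RCT", "cohort"}),
    populations=frozenset({"adults"}),
    earliest_year=1994,
    outcomes=frozenset({"mortality"}),
    min_followup_months=12,
)
study = Study("RCT", "adults", 1997, "English", frozenset({"mortality"}), 24)

print(is_admissible(study, criteria))                              # True
print(is_admissible(replace(study, language="German"), criteria))  # False
```

Relaxing the criteria when higher-quality evidence is lacking would correspond to widening `designs` for that key question, not to changing the filter itself.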

The second Task Force reviewed studies published through 1995. Thus, literature searches to update these topics usually extend from 1994 to the present, although new or refocused key questions may extend the search to older literature. For new topics, all searches begin with 1966 unless topic-specific reasons limit the search to a shorter time span or require an examination of even older literature. If a search finds a well-performed systematic review that directly addresses the literature on a key question through a given date, the topic team may use this review to capture the literature for those dates. The team can then restrict its own search to dates not covered by the existing systematic review.
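
The date logic described above can be summarized in a short sketch. The function name `search_start`, the use of January 1 as the start of each default year, and the one-day increment after a review's cutoff are assumptions made for illustration; topic-specific reasons may override any of these defaults.

```python
from datetime import date, timedelta
from typing import Optional

UPDATE_START = date(1994, 1, 1)     # usual start for updated topics
NEW_TOPIC_START = date(1966, 1, 1)  # usual start for new topics


def search_start(is_update: bool,
                 review_covered_through: Optional[date] = None) -> date:
    """Return the first date the topic team's own search should cover.

    If a well-performed systematic review already covers the literature
    through a given date, the search begins the day after that date,
    but never earlier than the usual default for the topic type.
    """
    default = UPDATE_START if is_update else NEW_TOPIC_START
    if review_covered_through is None:
        return default
    return max(default, review_covered_through + timedelta(days=1))


print(search_start(is_update=True))                      # 1994-01-01
print(search_start(is_update=False,
                   review_covered_through=date(1998, 12, 31)))  # 1999-01-01
```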

The topic team documents these strategies for sharpening focus—the analytic framework, key questions, and criteria for admissible evidence—in an initial work plan. This work plan is presented to the Task Force at its first meeting after the topic has been assigned, allowing the Task Force the opportunity to modify the direction and scope of the review, as needed.
