The U.S. Preventive Services Task Force (USPSTF) systematically reviews the evidence concerning both the benefits and harms of widespread implementation of a preventive service. It then assesses the certainty of the evidence and the magnitude of the benefits and harms. On the basis of this assessment, the USPSTF assigns a letter grade to each preventive service signifying its recommendation about provision of the service (see Table 1 below). An important, but often challenging, step is determining the balance between benefits and harms to estimate "net benefit" (that is, benefits minus harms).
Table 1. U.S. Preventive Services Task Force Recommendation Grid*
| Certainty of Net Benefit | Magnitude of Net Benefit: Substantial | Moderate | Small | Zero/Negative |
|---|---|---|---|---|
| High | A | B | C | D |
| Moderate | B | B | C | D |
| Low | Insufficient | Insufficient | Insufficient | Insufficient |
*A, B, C, D, and Insufficient represent the letter grades of recommendation or statement of insufficient evidence assigned by the U.S. Preventive Services Task Force after assessing certainty and magnitude of net benefit of the service (see the "Rating Scheme for the Strength of the Recommendations" field).
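The grid in Table 1 can be read as a simple lookup from certainty and magnitude of net benefit to a grade. The following is a minimal illustrative sketch of that lookup, not a Task Force tool; the names `GRADE_GRID` and `recommendation_grade` are assumptions introduced here for illustration, and the sketch assumes that low certainty yields an "Insufficient" statement regardless of magnitude.

```python
# Illustrative lookup mirroring Table 1 (names are hypothetical, not USPSTF software).
GRADE_GRID = {
    "high": {"substantial": "A", "moderate": "B", "small": "C", "zero/negative": "D"},
    "moderate": {"substantial": "B", "moderate": "B", "small": "C", "zero/negative": "D"},
}


def recommendation_grade(certainty: str, magnitude: str) -> str:
    """Return the Table 1 entry for a certainty level and a magnitude of net benefit."""
    certainty = certainty.strip().lower()
    magnitude = magnitude.strip().lower()

    # Assumption: with low certainty the table gives "Insufficient",
    # whatever magnitude is supplied.
    if certainty == "low":
        return "Insufficient"

    if certainty not in GRADE_GRID:
        raise ValueError(f"unknown certainty level: {certainty!r}")
    if magnitude not in GRADE_GRID[certainty]:
        raise ValueError(f"unknown magnitude of net benefit: {magnitude!r}")
    return GRADE_GRID[certainty][magnitude]


if __name__ == "__main__":
    print(recommendation_grade("High", "Substantial"))
    print(recommendation_grade("Moderate", "Small"))
    print(recommendation_grade("Low", "Moderate"))
```

The sketch encodes only the grid. The judgments that produce the certainty and magnitude inputs are made by the Task Force through the evidence review described in the rest of this section.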
The overarching question that the Task Force seeks to answer for every preventive service is whether evidence suggests that provision of the service would improve health outcomes if implemented in a general primary care population. For screening topics, this standard could be met by a large randomized, controlled trial (RCT) in a representative asymptomatic population with follow-up of all members of both the group "invited for screening" and the group "not invited for screening."
Direct RCT evidence about screening is often unavailable, so the Task Force considers indirect evidence. To guide its selection of indirect evidence, the Task Force constructs a "chain of evidence" within an analytic framework. Each arrow in the framework defines a key question, and each key question represents a link in the chain of evidence. Rectangles in the framework represent the intermediate outcomes (rounded corners) or the health outcomes (square corners); ovals represent harms. To form an unbroken chain, evidence must support each link in the chain, thereby connecting the target population (far left side of the framework) to the improved health outcome (far right side of the framework). For each key question, the body of pertinent literature is critically appraised, focusing on the following 6 questions:
- Do the studies have the appropriate research design to answer the key question(s)?
- To what extent are the existing studies of high quality? (i.e., what is the internal validity?)
- To what extent are the results of the studies generalizable to the general U.S. primary care population and situation? (i.e., what is the external validity?)
- How many studies have been conducted that address the key question(s)? How large are the studies? (i.e., what is the precision of the evidence?)
- How consistent are the results of the studies?
- Are there additional factors that assist us in drawing conclusions (e.g., presence or absence of dose-response effects, fit within a biologic model)?
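The "chain of evidence" idea described above, in which every key question is a link that must be supported, can be sketched as a small data structure. This is an illustrative sketch only: the `KeyQuestion` class, the `supported` flag, and the example links are hypothetical, and reducing each link to a yes/no value simplifies what is a graded judgment in practice.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KeyQuestion:
    """One link in the chain of evidence (hypothetical structure)."""
    label: str       # short description of the link
    supported: bool  # simplified flag: does appraised evidence support this link?


def chain_is_unbroken(links: List[KeyQuestion]) -> bool:
    """A chain is unbroken only if it has links and every link is supported."""
    return len(links) > 0 and all(link.supported for link in links)


def broken_links(links: List[KeyQuestion]) -> List[str]:
    """Return the labels of links that lack supporting evidence."""
    return [link.label for link in links if not link.supported]


if __name__ == "__main__":
    # Hypothetical links for an unnamed screening topic.
    chain = [
        KeyQuestion("screening test detects the condition", True),
        KeyQuestion("treatment improves an intermediate outcome", True),
        KeyQuestion("intermediate outcome predicts the health outcome", False),
    ]
    print(chain_is_unbroken(chain))
    print(broken_links(chain))
```

In this simplified picture, a single unsupported link is a gap in the evidence connecting the target population to the improved health outcome.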
The next step in the Task Force process is to use the evidence from the key questions to assess whether there would be net benefit if the service were implemented. In 2001, the USPSTF published an article that documented its systematic processes of evidence evaluation and recommendation development. At that time, the Task Force's overall assessment of evidence was described as good, fair, or poor. The Task Force realized that this rating seemed to apply only to how well studies were conducted and did not fully capture all of the issues that go into an overall assessment of the evidence about net benefit. To avoid confusion, the USPSTF has changed its terminology. Whereas individual study quality will continue to be characterized as good, fair, or poor, the term certainty will now be used to describe the Task Force's assessment of the overall body of evidence about net benefit of a preventive service and the likelihood that the assessment is correct. Certainty will be determined by considering all 6 questions listed above; the judgment about certainty will be described as high, moderate, or low.
In the Task Force's assessment of certainty about net benefit, the evaluation of the evidence from each key question plays a primary role. The Task Force makes recommendations for real-world medical practice in the United States and must determine to what extent the evidence for each key question, even evidence from screening RCTs or treatment RCTs, can be applied to the general primary care population. Frequently, studies are conducted in highly selected populations under special conditions. The Task Force must consider differences between the general primary care population and the populations studied in RCTs and make judgments about the likelihood of observing the same effect in actual practice.
It is also important to note that 1 of the key questions in the analytic framework refers to the potential harms of the preventive service. The Task Force considers the evidence about the benefits and harms of preventive services separately and equally. Data about harms are often obtained from observational studies because harms observed in RCTs may not be representative of those found in usual practice and because some harms are not completely measured and reported in RCTs.
Putting the body of evidence for all key questions together as a chain, the Task Force assesses the certainty of net benefit of a preventive service by asking the 6 major questions listed above. The Task Force would rate a body of convincing evidence about the benefits of a service that, for example, derives from several RCTs of screening in which the estimate of benefits can be generalized to the general primary care population as "high" certainty (see the "Rating Scheme for the Strength of the Recommendations" field). The Task Force would rate a body of evidence that is not clearly applicable to general practice or has other defects in quality, research design, or consistency of studies as "moderate" certainty. Certainty is "low" when, for example, there are gaps in the evidence linking parts of the analytic framework, when evidence to determine the harms of treatment is unavailable, or when evidence about the benefits of treatment is insufficient. Table 4 in the methodology document listed below (see the "Availability of Companion Documents" field) summarizes the current terminology used by the Task Force to describe the critical assessment of evidence at all 3 levels: individual studies, key questions, and overall certainty of net benefit of the preventive service.
Sawaya GF et al. Update on the methods of the U.S. Preventive Services Task Force: estimating certainty and magnitude of net benefit. Ann Intern Med. 2007;147:871-875. [5 references]