Summary of the Evidence

Breast Cancer Screening


By Linda L. Humphrey, M.D., M.P.H.a; Mark Helfand, M.D., M.S.a; Benjamin K.S. Chan, M.S.a; and Steven H. Woolf, M.D., M.P.H.b.

Address correspondence to: Linda Humphrey, Oregon Health & Science University, Mailcode BICC, 3181 SW Sam Jackson Park Road, Portland, OR 97201-3098; E-mail: humphrey@ohsu.edu.

This article originally appeared in the Annals of Internal Medicine.


The summaries of the evidence briefly present evidence of effectiveness for preventive health services used in primary care clinical settings, including screening tests, counseling, and chemoprevention. They summarize the more detailed Systematic Evidence Reviews, which are used by the U.S. Preventive Services Task Force (USPSTF) to make recommendations.


Contents

Epidemiology
Methods
Results
Discussion
Acknowledgments
References
Notes
Appendix

Epidemiology

Breast cancer is the second leading cause of cancer death among North American women. Approximately 1 in 8.2 women will receive a diagnosis of breast cancer during her lifetime, and 1 in 30 will die of the disease.1 Breast cancer incidence increases with age,1 and although significant progress has been made in identifying risk factors and genetic markers, more than 50 percent of cases occur in women without known major predictors.2-5

This review was commissioned to assist the current U.S. Preventive Services Task Force (USPSTF) in updating its recommendations on breast cancer screening. We focus on information that was not available in 1996, when the previous USPSTF examined the issue.6 Our goal was to critically appraise and synthesize evidence about the overall effectiveness of breast cancer screening, as well as its effectiveness among women younger than 50 years of age.

Methods

The analytic framework, literature search, and data extraction are described in detail in the Appendix. Briefly, we searched the Cochrane Controlled Trials Registry, MEDLINE®, PREMEDLINE, and reference lists6-8 for randomized, controlled trials of screening with death from breast cancer as an outcome. In all, we reviewed 154 publications from eight eligible randomized trials of screening mammography and two trials of breast self-examination (BSE). We abstracted details about patient population, design, quality, data analysis, and published results at each reported length of followup. We also evaluated previous meta-analyses of these trials and of screening test characteristics, as well as studies evaluating the harms associated with false-positive test results.

We used predefined criteria developed by the current USPSTF to assess the internal validity of the trials.9 Two authors rated the internal validity of each study as "good," "fair," or "poor." Disagreements were resolved by further review and discussion. In the USPSTF system, a study that meets all the criteria for internal validity is rated as good quality.9 The rating reflects a judgment that the results of the study are very likely to be correct. The fair-quality rating is used for studies that have important but not major flaws and implies that the findings are probably valid. A study that has a major flaw in design or execution—one that is serious enough to invalidate the results of the study—is rated as poor quality. We based our quality ratings on the entire set of publications from a trial rather than on individual articles.

The USPSTF criteria for internal validity are listed in Appendix Table 1. All of the mammography trials met the first three criteria: They clearly defined interventions, measured important outcomes, and used intention-to-treat analysis. Therefore, our quality ratings reflect differences among the studies on the remaining criteria:

  1. Initial assembly of comparable groups.
  2. Maintenance of comparable groups and minimization of differential loss to followup or overall loss to followup.
  3. Use of outcome measurements that were equal, reliable, and valid.

The Appendix describes our approach to applying these criteria in more detail.

We conducted new meta-analyses to incorporate new information about the quality of the trials and longer followup results. Breast cancer is known for its biological heterogeneity10 as well as for late recurrences.10 Thus, longer followup is relevant in evaluating mortality rates, particularly in younger women. In addition, for several of the trials, the most recent analyses correct flaws in earlier reports.

Six of the eight mammography trials were designed to assess the effectiveness of mammography over a broad age range, rather than its comparative effectiveness in various age subgroups. One trial specifically examined women 40 to 49 years of age because the earliest trial seemed to show no benefit in this subgroup. The USPSTF posed these questions for the meta-analysis:

We answered each question in two parts. First, using WinBUGS software (MRC Biostatistics Unit, Cambridge, United Kingdom), we constructed a two-level Bayesian random-effects model to estimate the effect size from multiple data points for each study and to derive a pooled estimate of relative risk reduction and credible intervals (CrIs) for a given length of followup.11 Second, we pooled the most recent results of each trial to calculate the absolute and relative risk reduction, using the results of the first analysis to estimate the mean length of followup.
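The pooling step can be illustrated with a simpler, frequentist stand-in. The sketch below is not the two-level Bayesian model fitted in WinBUGS; it uses the DerSimonian-Laird random-effects estimator on log relative risks and yields a confidence interval rather than a credible interval. The function names and the trial counts are hypothetical placeholders, not data from the trials reviewed here.

```python
import math

def log_rr_and_variance(deaths_screen, n_screen, deaths_control, n_control):
    """Log relative risk and its approximate variance from trial counts."""
    log_rr = math.log((deaths_screen / n_screen) / (deaths_control / n_control))
    variance = (1 / deaths_screen - 1 / n_screen
                + 1 / deaths_control - 1 / n_control)
    return log_rr, variance

def pool_random_effects(trials):
    """DerSimonian-Laird pooled relative risk with a 95% confidence interval.

    `trials` is a list of (deaths_screen, n_screen, deaths_control, n_control).
    """
    effects = [log_rr_and_variance(*t) for t in trials]
    y = [e[0] for e in effects]
    v = [e[1] for e in effects]

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1 / vi for vi in v]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

    # Cochran's Q and the method-of-moments between-trial variance tau^2.
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(trials) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-trial variance.
    w_re = [1 / (vi + tau2) for vi in v]
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    return {
        "rr": math.exp(y_re),
        "ci_low": math.exp(y_re - 1.96 * se),
        "ci_high": math.exp(y_re + 1.96 * se),
        "tau2": tau2,
    }

# Hypothetical counts for three trials (not taken from the review).
example_trials = [
    (80, 20000, 100, 20000),
    (45, 15000, 50, 15000),
    (120, 40000, 160, 40000),
]
result = pool_random_effects(example_trials)
print(result)
```

A Bayesian hierarchical model of the kind described above would additionally place prior distributions on the pooled effect and the between-trial variance and could use several followup points per trial; this sketch uses one result per trial only.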

To avoid bias that could result from excluding any data from valid studies, we included the results of all trials of fair quality or better in the base-case analysis. The disadvantage of this approach is that it combines results from two distinct types of studies. The six population-based trials randomly assigned women to an invitation-to-screening group or to a control group that received "usual care" and was followed passively. In these trials, women who were invited to screening but chose not to be screened were included in the analysis of the "screened" group. Two trials from Canada, the Canadian National Breast Screening Study-1 (CNBSS-1) and the Canadian National Breast Screening Study-2 (CNBSS-2), differed from the other six trials. First, the Canadian trials used mass media to recruit a sample of volunteers, and all women randomly assigned to mammography had mammography at least once.12,13 Second, in CNBSS-2, the control group was screened periodically with clinical breast examination (CBE). To estimate the relative risk reduction and the number needed to invite to screen to prevent one breast cancer death compared with usual care, we reanalyzed the data excluding the results of the Canadian studies.

Role of the Funding Source

This study was funded by the Agency for Healthcare Research and Quality (AHRQ). Agency staff and members of the USPSTF reviewed and made substantive recommendations about the analyses and final manuscript. Agency approval was required before the manuscript could be submitted for publication.

Results

Description of Trials

The eight randomized trials of mammography identified in our review12-23 varied in recruitment of participants, mammography protocol, control groups, and size (Table 1). Six trials examined the effectiveness of screening among women between 40 and 74 years of age; one trial enrolled women in their 40s, and one enrolled only women in their 50s. Four trials from Sweden tested mammography only,14-17,23-26 and the other four, from Canada, New York, and Edinburgh, Scotland, tested mammography and clinical breast examination.12,13,18-22,27

Study Quality

We found important methodologic limitations in all of the trials and rated all but one as fair, using USPSTF criteria. Table 1 lists the flaws of each trial and indicates how they influenced the overall ratings. The two reviewers rated the Swedish and Canadian trials as fair. Their initial ratings for the Edinburgh study and for the Health Insurance Plan of Greater New York (HIP) study differed. After extensive peer review and detailed review of the publications associated with these trials, the reviewers reached a consensus that the HIP study should be rated as fair and the Edinburgh study should be rated as poor.

The HIP trial (conducted from 1963 to 1966) was the first trial of breast cancer screening. It is difficult to critically appraise because the early publications that describe it differ in detail from more recent ones. We found several limitations of this trial, including inadequate description of allocation concealment and poor reporting of intervention and control group numbers. In addition, we found better ascertainment of clinical variables (including previous mastectomy) among the invitation-to-screening cohort than among the passively followed control group. However, we viewed this as an expected consequence of a study design in which a control group receives usual care and is not contacted. The screening and control groups differed from each other slightly in education, menopausal status, and previous breast lumps; however, the differences were not systematic and did not favor one group over the other. The strengths of the trial included intention-to-treat analysis, little contamination, and blind review of deaths. We did not find the faults severe enough to rate the study as poor quality and rated it as fair, which signifies that the results were probably valid at the time the study was conducted.

The Canadian trials met all of the USPSTF criteria for a rating of good quality, except for adequacy of allocation concealment. They differed from the other trials because all participants had a history and physical examination before randomization. This design permitted exclusion of patients who had a history of breast cancer and extensive examination of the baseline differences between groups. The Swedish trials all had limitations that resulted in a rating of fair rather than good. The Stockholm and Malmö trials, which were individually randomized, did not report whether allocation was concealed. The Gothenburg and Swedish Two-County studies, which were cluster randomized, had small differences in mean age between the invited and control groups. Such differences are expected to occur in a cluster-randomized trial, do not indicate failure of randomization or a problem in the trial execution, and can be adjusted for in statistical analyses.28 Both the Gothenburg trial and the Swedish Two-County Trial provided insufficient data to determine whether randomization distributed other important confounders equally among the groups, but comparison of overall mortality rates in the invited and control groups does not suggest that a major imbalance occurred.29

As originally conducted, the Swedish trials had important flaws related to measurement of the primary outcome measure, death from breast cancer. In the Swedish Two-County, Gothenburg, and Stockholm trials, review of deaths was unblinded and criteria for the assignment of cause of death were unclear. Another concern about the Swedish trials as a group related to screening of the control groups. Originally, the Swedish trials used the "evaluation" method of analysis, in which mortality rates in the screened population were calculated only for cancer diagnosed between the time of randomization and the last mammographic examination. When the evaluation method of analysis is used, control group screening can introduce bias unless it is performed concurrently with the final instance of mammography in the screened group.30,31 This method is inferior to the "followup" method of analysis, in which all deaths that occur after randomization are included in the analysis. The followup method of analysis dilutes relative benefit over time, particularly in studies that offered screening to the control group and in areas where widespread screening is adopted.

We considered these flaws to be adequately corrected in subsequent analyses by the trialists. In a 1993 overview of the trials, an independent end point committee used an explicit protocol to perform blind assessment of cause of death.32 Participants were linked to an external cancer registry and were excluded from the analysis if breast cancer had been diagnosed before the trial began. For the Swedish trials as a whole, death from every cause except breast cancer was similar in the compared groups.33 In the Swedish Two-County Trial, the reduction in rates of advanced breast cancer,34 which are not related to judgments about the causes of death, was similar to the reduction in breast cancer mortality rates.35 The overview also reanalyzed the data by using the followup method of analysis and found very little difference between the recalculated and original relative risk values. A recent review8 critical of the Swedish studies raised concern about bias in postrandomization exclusions, as evidenced by variation in the reported number of participants. This concern was effectively addressed in a recent update of these trials, which explained that this variation was due to the use of different methods for estimating the number of women in each birth cohort rather than to manipulation after randomization.23 The update also reported more recent results of the Swedish trials by using both the followup and evaluation methods of analysis.

We rated the Edinburgh study as poor quality because of a serious imbalance between the control and screened groups. General practitioners' practices were randomized in clusters without matching for socioeconomic factors. As a result, socioeconomic status, a predictor of stage at diagnosis as well as death from breast cancer, was significantly lower in the control group than in the mammography group. All-cause mortality was dramatically higher in the control group than in the screened group (20.1 more deaths per 10,000 person-years [confidence interval (CI), 13.3 to 26.9]).29 This difference is close to 25 times larger than the difference in breast cancer deaths between the groups and confirms our assessment that the trial was severely flawed.

Sensitivity of Mammography

Since no gold standard can be applied to the entire screened population, the denominator used for estimating sensitivity is the total number of breast cancer cases diagnosed in a given interval. The results of recent, good-quality systematic reviews of the accuracy of mammography in the screening trials are summarized in Table 2.36,37 The overall sensitivity for all rounds of screening was lowest in the HIP trial. Otherwise, one study was not clearly better or worse than another. For a 1-year screening interval, the sensitivity of first mammography ranged from 71 percent to 96 percent. Sensitivity was substantially lower for women in their 40s than for older women.
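As a minimal sketch of this denominator, sensitivity can be computed as screen-detected cancers divided by the sum of screen-detected cancers and interval cancers diagnosed within the chosen interval. The function name and the counts are hypothetical illustrations, not values from Table 2.

```python
def screening_sensitivity(screen_detected, interval_cancers):
    """Sensitivity with all cancers diagnosed in the interval as denominator.

    screen_detected: cancers found at the screening examination
    interval_cancers: cancers diagnosed after a negative screen but before
                      the end of the chosen interval (for example, 1 year)
    """
    total = screen_detected + interval_cancers
    if total == 0:
        raise ValueError("no cancers diagnosed in the interval")
    return screen_detected / total

# Hypothetical counts: 45 screen-detected and 15 interval cancers.
print(screening_sensitivity(45, 15))  # 0.75
```

Because interval cancers accumulate over time, a longer interval enlarges the denominator and lowers the estimate, which is one reason sensitivity figures should be compared only for the same screening interval.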

The data in Table 2 cannot be applied to individual patients because they are not adjusted for several factors that are known to affect sensitivity. These include patient factors (use of hormone replacement therapy, mammographic breast density), technical factors (the quality of mammography, the number of mammographic views), and provider factors (the experience of radiologists and their propensity to label the results of an examination abnormal, the choice of followup evaluation for abnormal mammograms).36,38-42

Specificity and Positive Predictive Value

In the randomized trials, the specificity of a single mammographic examination was 94 percent to 97 percent.36,43-44 This indicates that 3-6 percent of women who did not have cancer underwent further diagnostic evaluation, typically a clinical examination, more mammographic views, or ultrasonography. The positive predictive value of one-time mammography ranged from 2 percent to 22 percent for abnormal results requiring further evaluation and from 12 percent to 78 percent for abnormal results requiring biopsy (Table 3).36,45,46 Estimates from community settings suggest a graded, continuous increase in predictive value with age. For example, among 31,814 average-risk women screened in California from 1985 to 1992, the positive predictive value for further evaluation was 1-4 percent among those 40 to 49 years of age, 4-9 percent among those 50 to 59 years of age, 10-19 percent among those 60 to 69 years of age, and 18-20 percent among those 70 years of age and older.47
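The dependence of positive predictive value on age can be sketched with Bayes' rule: at a fixed sensitivity and specificity, predictive value rises as the prevalence of cancer in the screened group rises. In the sketch below, the specificity of 0.95 lies within the range reported above, while the sensitivity of 0.85 and the prevalence values are assumptions chosen only for illustration.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability of cancer given an abnormal result (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Assumed prevalences of undiagnosed cancer at the time of screening.
for prevalence in (0.002, 0.005, 0.010):
    ppv = positive_predictive_value(0.85, 0.95, prevalence)
    print(f"prevalence {prevalence:.3f}: PPV {ppv:.3f}")
```

Even with high specificity, most abnormal results are false positives when prevalence is low, which is consistent with the low predictive values observed among women 40 to 49 years of age.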

Effectiveness of Mammography in Reducing Breast Cancer Mortality

Table 4 summarizes the most recent results from trials that included at least some participants older than 50 years of age. The four Swedish trials that compared two to six rounds of mammography with usual care23,26 reported 9 percent to 32 percent reductions in the risk for death from breast cancer. The results of the trials have changed little over time (Figure 1). The reduction was statistically significant in only one of these trials (the Swedish Two-County study) (relative risk [RR], 0.68; CI, 0.59 to 0.80).26 The number of times mammography was performed and the frequency of screening did not seem to explain the variation among the Swedish studies. A previous meta-analysis found little change when the individual trial results were adjusted for type of randomization and degree of adherence.48

Of the four studies that evaluated the combination of mammography and CBE (Table 4), three were of at least fair quality.12,13,18,27,49 The HIP trial reported a relative risk reduction that began 5 years after randomization and remained below 1 after 16 or more years of followup (RR, 0.79). The CNBSS-2, which compared annual mammography and CBE with annual CBE among women 50 to 59 years of age, showed no benefit 13 years after the study began.12,20 The CNBSS-1, which compared annual mammography and CBE with usual care in women 40 to 49 years of age, also showed no benefit.

In our meta-analysis of results from all age groups combined, we excluded the Edinburgh trial (which we rated as poor) and used the results from both Canadian trials. The summary relative risk was 0.84 (95 percent CrI, 0.77 to 0.91), equivalent to a number needed to screen of 1,224 (CrI, 665 to 2,564) an average of 14 years after study entry. To estimate the effectiveness of an invitation to screen compared with usual care, we also excluded the Canadian trials, which recruited volunteers. The summary relative risk was 0.81 (CrI, 0.73 to 0.89), and the number needed to invite to screen was 1,008 (CrI, 531 to 2,128). The relative risks by year of observation (including trial plus followup time) are shown in Figure 1, which suggests a gradual decrease in benefit with longer observation time.
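The link between a relative risk and a number needed to screen can be sketched as the reciprocal of the absolute risk reduction. The control-group risk used below is an assumption for illustration; the review derived its figures from its own pooled estimates, not from this shortcut.

```python
def number_needed_to_screen(control_risk, relative_risk):
    """NNS = 1 / absolute risk reduction = 1 / (control_risk * (1 - RR))."""
    absolute_risk_reduction = control_risk * (1 - relative_risk)
    if absolute_risk_reduction <= 0:
        raise ValueError("no risk reduction; NNS is undefined")
    return 1 / absolute_risk_reduction

# Assumed control-group risk of breast cancer death over the followup period.
assumed_control_risk = 0.005
print(round(number_needed_to_screen(assumed_control_risk, 0.84)))
```

With an assumed control-group risk of 5 deaths per 1,000 women, a relative risk of 0.84 gives roughly 1,250, the same order of magnitude as the reported 1,224. The calculation also shows why the number needed to screen falls as baseline risk rises, even when the relative risk is unchanged.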

Effectiveness of Mammography Among Women 40 to 49 Years of Age

Since 1963, seven randomized, controlled trials have included women 40 to 49 years of age, approximately 200,000 participants. With the exception of one of the Canadian studies, none of the trials was planned to evaluate breast cancer screening in this age group and none had sufficient power. Two trials, the Stockholm trial and CNBSS-1, showed no benefit for this age group even with longer followup (Table 5). The other five trials suggest a benefit (risk reduction, 13 percent to 42 percent), and one (the Gothenburg trial) has reported a statistically significant risk reduction since 1996. These findings reflect results after 11 to 19 years of observation; the median period of active screening was 6 years (range, 4 to 15 years).

In our meta-analysis, excluding the Edinburgh trial, the summary relative risk was 0.85 (CrI, 0.73 to 0.99) after 14 years of observation, with a number needed to screen of 1,792 (CrI, 764 to 10,540) to prevent one death from breast cancer. Some might argue that the Canadian study should be excluded in calculating the number needed to invite to screen because its participants were prescreened volunteers who may have differed from the general population. When the Canadian study was excluded, the summary relative risk was 0.80 (CrI, 0.67 to 0.96) and the number needed to invite to screen was 1,385 (CrI, 659 to 6,060). Figure 1 shows an increasing screening benefit among this age group with a longer period of observation.

Among women 50 years of age or older, the summary relative risk was 0.78 (CrI, 0.70 to 0.87) after 14 years of observation, with a number needed to screen of 838 (CrI, 494 to 1,676) to prevent one death from breast cancer. As shown in Figure 1, the benefit has decreased with longer duration of followup.

We found seven meta-analyses of the effectiveness of mammography in women 40 to 49 years of age (Table 6).8,30,32,48,50-58 Our results, which reflect exclusion of one flawed trial, longer followup in six of the trials, and corrected results for the Swedish trials, were consistent with those of most previous meta-analyses. Two meta-analyses,8,51 including one from the Cochrane Collaboration, produced results that differed substantially from ours. The Cochrane review reported a summary relative risk of 1.03 (CI, 0.77 to 1.38) but based this on only two trials.

Effectiveness of Mammography in Older Women

Direct evidence of effectiveness among older women is limited to two trials that included women older than 65 years of age. Both of these trials reported relative risk reductions: RR, 0.68 (CI, 0.51 to 0.89) among women 65 to 74 years of age25 and RR, 0.79 among women 70 to 74 years of age.59 In the recent Swedish overview, the summary relative risk among women 65 to 74 years of age was 0.78 (CI, 0.62 to 0.99).23,60

Clinical Breast Examination

The test characteristics of CBE, based on data from trials designed specifically for breast cancer screening, were recently reviewed.61 Sensitivity ranged from 40 percent to 69 percent, specificity from 88 percent to 99 percent, and positive predictive value from 4 percent to 50 percent when mammography and interval cancer were used as the criterion standard. One community study showed that over 10 years of biennial screening, 13.4 percent of women had false-positive results on CBE at least once; risk for such results was higher among women younger than 50 years of age.62

No trial has compared CBE alone with no screening. However, two randomized, controlled trials involving the use of mammography and CBE had mortality reductions of 29 percent and 14 percent.18,27,63 A controlled, nonrandomized United Kingdom trial of CBE and mammography showed a nonsignificant mortality reduction of 14 percent (RR, 0.86; CI, 0.73 to 1.01).64

What is the contribution of CBE to these reductions in mortality rate? Among studies showing a benefit of screening, mortality reductions in trials of CBE with mammography are similar to those in trials including mammography only. In the CNBSS-2, in which women 50 to 59 years of age were randomly assigned to annual CBE and mammography or to annual CBE,65 the relative risk for death was 0.97 (CI, 0.62 to 1.52).13 This suggests that mammography has little additive benefit in the setting of a careful, detailed CBE.

Breast Self-Examination

Because neither CBE nor mammography is 100 percent sensitive, breast self-examination (BSE) has been advised as an important screening method among women older than 20 years of age. However, its effectiveness in decreasing death from breast cancer has been controversial because evidence from clinical trials is limited. Observational studies evaluating BSE and breast cancer stage at diagnosis or death have had mixed results.45,66

In two randomized, controlled trials with 5 to 10 years of followup, both conducted outside the United States, breast cancer mortality rates were similar in women instructed in BSE and in noninstructed controls.67-69 Both studies involved large numbers of women who were meticulously trained with proper technique and had numerous reinforcement sessions; mammography was not part of routine screening in the countries involved. In both trials, physician visits and biopsy for benign breast lesions increased among those educated in BSE. To date, no studies have evaluated other potential adverse outcomes of BSE, such as anxiety and subsequent screening behavior.

Adverse Effects

The most frequently discussed adverse effects of mammography are the anxiety, discomfort, and cost associated with positive test results, many of which are false positive, and the diagnostic procedures they generate. For a woman undergoing regular mammography, cumulative specificity may be more relevant than the specificity of a single examination. In one community setting involving 2,400 women 40 to 69 years of age, 6.5 percent of mammography results requiring further evaluation were false positives (specificity, 93.5 percent). When evaluated on an individual basis, however, approximately 24 percent of women had at least one false-positive result on mammography requiring further work-up during 10 years of biennial screening (average of 4 mammograms per woman), indicating a 10-year cumulative specificity of 76.2 percent. For every $100 spent on screening, $33 was spent on the evaluation of false-positive results.62
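The gap between single-examination and cumulative specificity can be approximated with a short calculation. The sketch below assumes that successive examinations are independent and share the same specificity, a simplification that the cited study does not state.

```python
def cumulative_false_positive_risk(specificity, n_screens):
    """Probability of at least one false positive in n independent screens."""
    return 1 - specificity ** n_screens

# Single-examination specificity of 93.5 percent over 4 mammograms.
risk = cumulative_false_positive_risk(0.935, 4)
print(f"{risk:.3f}")  # 0.236
```

Under these assumptions the cumulative risk is about 23.6 percent, close to the observed figure implied by the 10-year cumulative specificity of 76.2 percent.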

Anxiety over an abnormal mammogram is documented in some,70-74 but not all,71,75 studies. These studies generally suggest that anxiety dissipates after cancer is ruled out, but some studies suggest that some women worry persistently.72,74-76 The anxiety associated with an abnormal mammogram does not seem to dissuade women from undergoing further screening77 and may even be associated with improved adherence to recommended screening intervals.70,78,79 Many women are willing to accept the risk for false-positive results. In one survey, 99 percent of women understood that false-positive examination results occur with screening, although they underestimated the likelihood. Of importance, 63 percent stated that they would accept 500 instances of false-positive examination results to save one life.80

Some view diagnosis and treatment of ductal carcinoma in situ (DCIS) as potential adverse consequences of mammography. There is incomplete evidence regarding the natural history of DCIS, the need for treatment, and treatment efficacy, and some women may receive treatment of DCIS that poses little threat to their health. In a 1992 study, 44 percent of women with DCIS were treated with mastectomy and 23 percent to 30 percent were treated with lumpectomy or radiation.81,82 In one survey, only 6 percent of women were aware that mammography might detect nonprogressive breast cancer.80

Radiation exposure is also a potential risk associated with mammography.83 Using risk estimates provided by the Biological Effects of Ionizing Radiation report of the National Academy of Sciences, and assuming a 4 mGy mean glandular dose from each two-views-per-breast bilateral mammography, Feig and Hendrick estimated that annual mammography of 100,000 women for 10 years beginning at 40 years of age would induce no more than eight deaths from breast cancer.84 Women with an inherited susceptibility to ionizing radiation damage have higher risk for radiogenic breast cancer,10,85 although this has not been documented in association with mammography.

Discussion

Fair-quality, relatively consistent evidence suggests that mammography screening reduces breast cancer death among women 40 to 74 years of age. We found no evidence that inclusion of CBE conferred greater benefit than mammography alone. We also found no evidence supporting the role of BSE in reducing breast cancer mortality.

Over the three decades in which mammography trial data have been available, critical reviewers and the investigators themselves have discussed limitations and irregularities in data reporting. One highly publicized review by the Cochrane Collaboration criticized the trials in regard to randomization, postrandomization exclusions, and determination of deaths from breast cancer.8 It found all but two of the trials, the Malmö trial and the Canadian trials, severely flawed or of poor quality and prompted some official bodies to question their support for screening mammography.

We identified many of the same design problems highlighted in the Cochrane review but reached different conclusions about their bearing on the validity of the findings. With the exception of the Edinburgh trial, we found inadequate evidence to conclude that the specific flaws identified introduced biases of sufficient magnitude or direction to invalidate the findings or to cause us to reject the inference that screening mammography reduces breast cancer mortality rates.

The effectiveness of screening in women 40 to 49 years of age is a longstanding controversy. In early years, it centered on the lack of evidence that observed risk reductions were statistically significant.6,52,86 That argument has dissipated over time as more evidence has shown a significant separation in survival curves with longer followup. The delay in the separation of those curves, however, has prompted some to question whether the observed benefits are due to the detection of cancer after 50 years of age, suggesting little incremental benefit from initiating screening at 40 years of age and exposing women to the harms of screening for an extra decade.87,88 We found little evidence to convincingly address this concern and some evidence that some benefit from screening women 40 to 49 years of age would be sacrificed if screening began at age 50 years.27,89

The use of 50 years of age as a threshold is somewhat arbitrary (except that it approximates the age of menopause). The risks for developing and dying of breast cancer are continuous variables that increase with age, and the greatest increase in incidence actually occurs before menopause.90,91 We found that the relative risk reduction achieved with mammography screening does not differ substantially by age, although the time required to obtain the benefit is longer for younger women. On the other hand, younger women have more potential years of life to gain by screening. Thus, the variable most affected by age is absolute risk reduction, which increases as a continuum with age while the number needed to screen decreases. The age of 50 years has no special bearing on this pattern, and some question the scientific rationale for treating women 40 to 49 years of age as a special entity.92

What emerges as a more important concern, across all age groups, is whether the magnitude of benefit is sufficient to outweigh the harms. The risk for false-positive results and their consequences decreases with age. Thus, although mammography at any age poses a tradeoff of benefits and harms, the balance between increasing absolute risk reduction and decreasing harms grows more favorable over time. The age at which this tradeoff becomes acceptable is a subjective judgment that cannot be answered on scientific grounds, although early evidence suggests that women will tolerate a high risk for false-positive results. As noted earlier, 63 percent of women in one study stated that they would accept 500 instances of false-positive results to save one life.80 On the basis of the results of our meta-analysis, we calculated that over 10 years of biennial screening among 40-year-old women invited to be screened, approximately 400 women would have false-positive results on mammography and 100 women would undergo biopsy or fine-needle aspiration for each death from breast cancer prevented.

A limitation of our meta-analysis is that we combined studies that used different methods of analysis. In the most recent report from the Swedish trials,23 Nyström and colleagues did not report individual study-level data using the followup method. The pooled followup analysis reported by Nyström and colleagues in 2002 suggests that the use of the followup method would have resulted in a smaller estimate of relative risk reduction.

Women older than 70 years of age have the highest incidence of breast cancer, and test performance in these women is likely to be similar to that in women 50 to 70 years of age. Therefore, theoretically, mammography should be at least as effective for women older than 65 years of age as it is for younger women. Offsetting this potential benefit, however, is the greater comorbidity observed in elderly persons. The potential benefit of early detection is unlikely to be realized in women who have other diseases that diminish life expectancy, in those who would not tolerate evaluation or treatment, and in those with impaired quality of life (for example, dementia).93 In addition, no data from randomized, controlled trials provide information about the morbidity associated with screening, followup, and treatment among women older than 74 years of age. Finally, a major concern in elderly women is the diagnosis and treatment of DCIS, since mortality rates from DCIS are low (1-2 percent at 10 years) and 99 percent of DCIS is treated surgically.94

The interval at which mammography was performed in the screening trials varied between 12 and 33 months, but annual mammography was no more effective than biennial mammography. Data from the Swedish Two-County Trial indicate that the period in which breast cancer can be detected before it presents clinically is shorter for women 40 to 49 years of age.95-97 Annual screening may be more important in this age group than in older women, but we found no direct proof for this hypothesis in the controlled trials that have been completed so far.

We found no evidence that CBE or BSE reduces breast cancer mortality. Whether the BSE trials are generalizable to the United States, where the use of CBE and mammography and the incidence of breast cancer are higher, is uncertain. It is also uncertain whether BSE might be beneficial to women who are not in the age ranges at which mammography is recommended or do not avail themselves of mammography. In the setting of CBE and mammography, the probability of finding a significant decrease in mortality rates is likely to be small.

In summary, when judged as population-based trials of cancer screening, most mammography trials are of fair quality. Their flaws reflect tradeoffs in planning that make the trial results widely generalizable but decrease internal validity. In absolute terms, the mortality benefit of mammography screening is small enough that biases in the trials could erase or create it. However, we found that although these trials were flawed in design or execution, there is insufficient evidence to conclude that most were seriously biased and consequently invalid.

Future research should be directed toward developing new screening methods as well as methods of improving the sensitivity and specificity of mammography. Methods of reducing surgical biopsy rates and complications of treatment should also be studied, as should communication of the risks and benefits associated with screening to patients. Finally, efforts to identify breast cancer risk factors with high attributable risk, as well as appropriate prevention strategies, should continue. Even in the best screening settings, most deaths from breast cancer are not currently prevented.
