Guide to Clinical Preventive Services, Second Edition (1996)

Section One: Screening

1. Screening for Asymptomatic Coronary Artery Disease

Burden of Suffering

Ischemic heart disease is the leading cause of death in the U.S., accounting for approximately 490,000 deaths in 1993. 1 The American Heart Association estimates that approximately 1.5 million Americans will suffer a myocardial infarction (MI) in 1995, and one third will not survive the event. 2 Atherosclerotic coronary artery disease (CAD) is the underlying cause of most ischemic cardiac events and can result in myocardial infarction, congestive heart failure, cardiac arrhythmias, and sudden cardiac death. Clinically significant CAD is uncommon in men under 40 and premenopausal women, but risk increases with advancing age and in the presence of risk factors such as smoking, hypertension, diabetes, high cholesterol, and family history of heart disease. Although mortality from heart disease has declined steadily over the past three decades in the U.S., 2 the total burden of coronary disease is predicted to increase substantially over the next 30 years due to the increasing size of the elderly population. 3 The cost of medical care and lost economic productivity due to heart disease in the U.S. has been projected to exceed $60 billion in 1995. 2

Angina is the most common presenting symptom of myocardial ischemia and underlying CAD, but in many persons the first evidence of CAD may be myocardial infarction or sudden death. 4 It has been estimated that 1-2 million middle-aged men have asymptomatic but physiologically significant coronary disease, also referred to as silent myocardial ischemia. 4,5

Accuracy of Screening Tests

There are two screening strategies to reduce morbidity and mortality from CAD. The first involves screening for modifiable cardiac risk factors, such as hypertension, elevated serum cholesterol, cigarette smoking, physical inactivity, and diet (see Chapters 2, 3, and 54-56). The second strategy is early detection of asymptomatic CAD. The principal tests for detecting asymptomatic CAD include resting and exercise ECGs, which can provide evidence of previous silent myocardial infarctions and silent or inducible myocardial ischemia. Thallium-201 scintigraphy, exercise echocardiography, and ambulatory ECG (Holter monitoring) are less commonly used for screening purposes. The efficacy of each of these tests may be evaluated by (a) its ability to detect atherosclerotic plaque, and (b) its ability to predict the occurrence of a serious clinical event in the future (acute MI, sudden cardiac death).

Several resting ECG findings (ST depression, T-wave inversion, Q waves, and left axis deviation) increase the likelihood of coronary atherosclerosis and of future coronary events. However, these findings are uncommon in asymptomatic persons, occurring in only 1-4% of middle-aged men without clinical evidence of CAD, 6,7 and they are not specific for CAD. One third to one half of patients with angiographically normal coronary arteries have Q waves, T-wave inversion, or ST-T changes on their resting ECG. 8-10 Conversely, a normal ECG does not rule out CAD. In the Coronary Artery Surgery Study, 29% of patients with symptomatic, angiographically proven CAD demonstrated a normal resting ECG. 11 Asymptomatic persons with baseline ECG abnormalities (Q waves, ST segment depression, T-wave inversion, left ventricular hypertrophy, and ventricular premature beats) have a higher risk of future coronary events. 6,12-19 However, prospective studies lasting between 5 and 30 years have found that symptomatic CAD develops in only 3-15% of persons with these ECG findings. 6,13,18,20 Furthermore, most coronary events occur in persons without resting ECG abnormalities. 6,7,18,21,22 Thus, routine ECG testing in asymptomatic persons, in whom the pretest probability of having CAD is relatively low, is not an efficient process for detecting CAD or for predicting future coronary events.

The exercise ECG is more accurate than the resting ECG for detecting clinically important CAD. Most patients with asymptomatic CAD do not have a positive exercise ECG, however. 23-26 ECG changes often do not become apparent until an atherosclerotic plaque has progressed to the point that it significantly impedes coronary blood flow. 24,27 In addition, most asymptomatic persons with an abnormal exercise ECG result (usually defined by a specific magnitude of ST-segment depression) do not have underlying CAD. 27,28 A 1989 meta-analysis found considerable variability in the accuracy of exercise-induced ST depression for predicting CAD (sensitivity 23-100%, specificity 17-100%). 29 Although several investigators reported that adjusting the ST segment for heart rate (ST/HR slope or ST/HR index) improves the ability to predict significant CAD 30-32 and future coronary events, 25 other studies have not shown an advantage. 33-37

The exercise ECG is also more accurate than the resting ECG in predicting future coronary events. While asymptomatic persons with a positive exercise ECG are more likely to experience an event than those with negative tests, 25,38-43 longitudinal studies following such patients for 4 to 13 years have shown that only 1-11% will suffer an acute MI or sudden death. 25,42,44,45 As with resting ECG, the majority of events will occur in those with a negative exercise test result. 24,26,44-47 The pathophysiology of acute coronary syndromes may explain the insensitivity of exercise ECG for subsequent coronary events. Unstable angina, MI, and sudden death often result from an acute, occluding thrombus precipitated by the rupture of a mild, non-flow-limiting plaque. 48-50 Among healthy men who subsequently developed symptomatic CAD after a negative screening test, 73% experienced an MI or sudden death as their initial manifestation. 24,45 In contrast, the majority of asymptomatic persons with a positive exercise ECG develop angina as their initial event. 5,24,45,51 Thus, while exercise ECG may predict the presence of more severe coronary stenosis and risk of angina in asymptomatic persons, it does not accurately predict risk of acute coronary events.

The addition of thallium-201 scintigraphy to conventional exercise testing improves its accuracy in detecting CAD, making it a useful diagnostic test in persons with symptoms of CAD. 52,53 However, the probability of CAD after a positive scan is low in asymptomatic persons, and most coronary events occur in those with a negative test result. 23,44 Because of these limitations and its expense, thallium-201 scintigraphy is not a practical screening test for asymptomatic persons. 23,44,52,54 The ambulatory ECG can detect episodes of ST-segment depression which may indicate silent ischemia in asymptomatic persons with CAD. These episodes, however, also occur commonly in healthy volunteers 55-57 and are not reliable predictors of future coronary events, even in asymptomatic or mildly symptomatic patients with documented CAD. 58,59 There have been no studies of exercise echocardiography in screening asymptomatic populations for CAD.

False-positive screening test results are undesirable for several reasons. Persons with abnormal results frequently undergo invasive diagnostic procedures such as coronary angiography. Abnormal test results may produce considerable anxiety. An abnormal ECG tracing may disqualify some patients from jobs, insurance eligibility, and other opportunities, although the extent of these problems is not known. Proposed strategies for reducing false-positive results include performing workups in accordance with a Bayesian model, 60 using discriminant functions to interpret the stress ECG, 41 and targeting testing to high-risk groups.
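The leverage that pretest probability exerts on predictive value can be made concrete with Bayes' theorem. The following minimal sketch is illustrative only; the prevalence, sensitivity, and specificity are assumed values, not figures from the studies cited above.

    # Post-test probability of CAD after a positive screening test
    # (Bayes' theorem). All inputs below are assumed for illustration.
    def post_test_probability(prevalence, sensitivity, specificity):
        true_pos = prevalence * sensitivity
        false_pos = (1 - prevalence) * (1 - specificity)
        return true_pos / (true_pos + false_pos)

    # Assumed: 5% prevalence of significant CAD in asymptomatic middle-aged
    # men vs. 50% in patients with typical angina; 65% sensitivity and 90%
    # specificity for the exercise ECG.
    print(post_test_probability(0.05, 0.65, 0.90))  # ~0.26
    print(post_test_probability(0.50, 0.65, 0.90))  # ~0.87

Under these assumptions, roughly three quarters of positive tests in the asymptomatic group are false positives, which is the arithmetic motivating the strategies listed above.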

Effectiveness of Early Detection

Although case-control and cohort studies show that asymptomatic persons with selected ECG findings are at increased risk of MI and cardiac death, 5,7,22,25,38-43 there is little evidence that routine screening is an effective means to reduce the incidence of acute coronary events in asymptomatic persons. Antianginal drugs such as nitroglycerin, beta-adrenergic blockers, and calcium channel blockers reduce the frequency and the duration of silent ischemia. 61-63 In a recent study, atenolol reduced the incidence of cardiac events (MI, cardiac arrest, or worsening angina) in patients who had both silent ischemia and CAD documented by angiography or prior MI. 64,65 Extrapolating these benefits to completely asymptomatic patients with silent ischemia on routine screening may not be justified, given their much lower risk of acute events. 46

Both aspirin therapy and drug treatment for high cholesterol reduce the incidence of MI and cardiac mortality in patients with symptomatic coronary disease, but the balance of risks and benefits of these therapies in asymptomatic patients is not resolved (see Chapters 2 and 69). Benefits are more likely to exceed risks in asymptomatic patients with underlying coronary disease, however, due to their higher absolute risk of MI and coronary death. New diagnostic techniques may prove more sensitive than angiography in identifying the mild-to-moderate plaques that are a risk factor for developing an acute occlusive thrombus. 66,67 Their utility will remain in question, however, until appropriate trials demonstrate that early detection and treatment of small coronary plaques is more effective than treatment based on identifiable risk factors (e.g., high blood pressure or high cholesterol) in asymptomatic patients. 48,49

Among patients with symptomatic coronary disease, coronary artery bypass grafting prolongs life compared with medical therapy in patients with left main coronary or three-vessel disease with poor left ventricular function. 11 The prevalence of high-risk coronary disease among asymptomatic persons, however, is very low; while some patients may suffer an MI or sudden cardiac death as their initial manifestation of CAD, most patients with severe coronary disease initially develop angina. 5,45 As a result, it is not clear that the benefit of identifying a small number of individuals with severe coronary disease before they develop symptoms is sufficient to justify routine screening of large populations of asymptomatic persons. Recent randomized trials have demonstrated that percutaneous transluminal coronary angioplasty (PTCA) reduces the frequency of angina in patients with symptomatic CAD, but it does not reduce the incidence of MI or cardiac death. 68,69 The value of coronary angioplasty for asymptomatic coronary stenoses is not known.

A screening ECG has been recommended to provide a "baseline" to help interpret changes in subsequent ECGs. 70 Even when important differences are noted between the baseline ECG and a subsequent tracing, these do not necessarily reflect ongoing or recent ischemia. Using the development of a new Q wave on serial ECG as a criterion, the Framingham Study reported an annual incidence of unrecognized MI of 5.4/1,000 men aged 65-74. 71 Less specific changes develop more commonly than Q waves. Baseline ECGs are often not available when needed for comparison, nor do they significantly contribute to decision making for patients being evaluated for chest pain, 72-75 especially in those with no history of cardiovascular disease. 76 One large study found that a baseline ECG was available in 55% of patients evaluated for acute chest pain. 73 The availability of a prior ECG was associated with a small but significant reduction in hospitalization rates for those patients who had chest pain not due to acute MI. Only a small subset of the asymptomatic population is likely to benefit from having a baseline ECG, however: those with baseline ECG abnormalities suggestive of ischemia who subsequently develop acute noncardiac chest pain. Savings from preventing a few unnecessary hospitalizations among these patients must be weighed against the high costs of routine ECG screening in the large population of asymptomatic persons.

Another argument for ECG screening is that the early identification of persons at increased risk for CAD on the basis of ECG findings may help to modify other important cardiac risk factors such as cigarette smoking, hypertension, and elevated serum cholesterol. 70 While the efficacy of risk factor modification is well established, 22,77 no studies have evaluated whether identifying high-risk patients with abnormal ECGs improves efforts to modify risk factors or leads to better clinical outcomes.

Periodic ECG screening is often recommended for persons who might endanger public safety were they to experience an acute cardiac event at work (e.g., airline pilots, bus and truck drivers, railroad engineers). Cardiac events in such individuals are more likely to affect the safety of a large number of persons, and clinical intervention, either through medical treatment or work restrictions, might prevent such catastrophes. No studies have addressed the efficacy of ECG screening in these persons, however.

Preliminary exercise ECG testing has also been recommended for sedentary persons planning to begin vigorous exercise programs, based on evidence that strenuous exertion may increase the risk of sudden cardiac death. The usual underlying cause of sudden cardiac death during exercise is hypertrophic cardiomyopathy or congenital coronary anomalies in young persons and CAD in older persons. Cardiac events during exercise in persons without symptomatic heart disease are uncommon, however, and exercise ECG may not accurately predict those who are at risk. Among over 3,600 asymptomatic, hypercholesterolemic middle-aged men who underwent submaximal exercise ECG during the Lipid Research Clinics Coronary Primary Prevention Trial, 62 (2%) subsequently experienced an acute cardiac event during moderate or strenuous physical activity during follow-up (average 7.4 years). 78 Although men with exercise-induced ECG changes were at increased risk, only 11 of 62 events occurred in men with an abnormal baseline exercise test (sensitivity 18%). Moreover, few of the men with abnormal test results experienced an activity-related event during follow-up (positive predictive value 4%). Although the negative predictive value of baseline ECG was high (over 98%), it was no better than multivariate analysis based on clinical risk factors alone. Given the low incidence of activity-related events in middle-aged men, and the uncertain benefit of restricting activity in those with abnormal exercise tests, the potential benefits of pre-exercise testing appear small. In populations at low risk for heart disease, any benefits of detecting the rare individual with asymptomatic CAD may be offset by adverse effects of labeling and exercise restrictions for the larger number of persons with false-positive ECG results.
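The reported test characteristics follow from the published counts. A brief reconstruction is shown below; the cohort size and the number of abnormal tests are approximations (the latter back-calculated from the reported 4% positive predictive value), not figures taken directly from the trial report.

    # Reconstructing the LRC pre-exercise screening figures cited above.
    events = 62            # activity-related cardiac events during follow-up
    detected = 11          # events in men with an abnormal baseline test
    abnormal_tests = 275   # assumed: roughly 11 / 0.04, from the 4% PPV
    cohort = 3600          # assumed: "over 3,600" men screened

    sensitivity = detected / events                  # 11/62, ~18%
    ppv = detected / abnormal_tests                  # 11/275, ~4%
    missed = events - detected                       # 51 events after a normal test
    npv = 1 - missed / (cohort - abnormal_tests)     # ~98.5%
    print(f"sensitivity {sensitivity:.0%}, PPV {ppv:.0%}, NPV {npv:.1%}")

The arithmetic makes the limitation concrete: over 80% of the men who went on to have an activity-related event had a normal screening test, and roughly 96% of abnormal tests were false alarms.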

Recommendations of Other Groups

The routine use of the resting electrocardiogram to screen for CAD in asymptomatic adults is not recommended by the American College of Physicians (ACP) 79 or the Canadian Task Force on the Periodic Health Examination. 80 The American Academy of Family Physicians (AAFP) recommends a baseline electrocardiogram for men 40 years and older with two or more cardiac risk factors and for sedentary men about to begin a vigorous exercise program; this recommendation is under review. 81 A task force sponsored by the American College of Cardiology and the American Heart Association (ACC/AHA) recommends baseline testing for all persons over 40 years of age and for those about to have exercise stress testing. 82

The AAFP recommends exercise electrocardiography for those whose jobs are linked to public safety (e.g., pilots, air traffic controllers) or that require high cardiovascular performance (e.g., police officers, firefighters). 81 The American College of Sports Medicine recommends exercise ECG testing for men over age 40, women over age 50, and other asymptomatic persons with multiple cardiac risk factors, prior to beginning a vigorous exercise program. 83 The ACC/AHA recognize that the exercise ECG is frequently used to screen asymptomatic persons in some high-risk groups but concluded that there is divergence of opinion with respect to its usefulness. 84 The ACP does not recommend exercise testing with ECG or thallium scintigraphy as a routine screening procedure in asymptomatic adults. 79,85

Discussion

Heart disease is the leading cause of death in the U.S., and interventions that produce even modest reductions in the incidence of acute ischemic events may have substantial public health benefits. Although the resting electrocardiogram can detect evidence of coronary heart disease in asymptomatic persons and identify individuals at increased risk of future coronary events, the ECG has important weaknesses as a screening test. The large majority of asymptomatic persons with abnormal ECG results do not have CAD and are at relatively low risk for developing symptomatic heart disease in the near future. Routine screening may subject many of them to the inconvenience, expense, and potential risks of follow-up testing (e.g., cardiac catheterization or radionuclide imaging) to evaluate false-positive screening results. Although exercise testing is more sensitive and specific for high-grade coronary stenoses, the exercise ECG is too time-consuming and expensive for routine use in asymptomatic persons. Finally, neither resting nor exercise ECG reliably detects the mild to moderate atherosclerotic lesions which are often responsible for acute coronary events.

A second important problem with screening for asymptomatic CAD is the lack of evidence that earlier detection leads to better outcomes. The only interventions proven to reduce coronary events in asymptomatic persons are modifications of risk factors such as smoking, high cholesterol, and elevated blood pressure. These interventions, however, should be encouraged for all patients with modifiable risk factors, not only those with screening tests suggestive of CAD. The benefits of more invasive treatments for coronary stenosis (e.g., bypass surgery, angioplasty) are unproven in asymptomatic persons. For certain occupations, such as pilots and heavy equipment operators, where sudden death or incapacitation would endanger the safety of others, considerations other than benefit to the individual patient may favor screening. Although screening cannot reliably identify all persons at risk of an acute event, it may increase the margin of safety for the public.

To minimize the potential adverse effects of false-positive test results, routine screening with ECG should be avoided in populations where the prevalence of CAD is low, including most adults under 40, and middle-aged men and women without coronary risk factors. Even in high-risk individuals, the benefits of screening to identify asymptomatic CAD are unproven. For some persons, however, identifying those at high risk of coronary mortality may help guide treatment decisions (e.g., use of aspirin or cholesterol-lowering drugs).

There are major costs associated with widespread screening with resting ECG in asymptomatic adults, and use of other screening tests (ambulatory ECG, exercise testing, and echocardiography) would be substantially more expensive. 79 The inconvenience, expense, and potential risks of routine screening might be justified if it significantly reduced the incidence of MI and sudden cardiac death, but such evidence is not yet available. Until appropriate studies demonstrate a benefit of screening for CAD, identification and treatment of major cardiac risk factors such as hypertension, elevated serum cholesterol, and cigarette smoking remain the only proven measures for reducing coronary morbidity and mortality in asymptomatic persons.

CLINICAL INTERVENTION

There is insufficient evidence to recommend for or against screening middle-aged and older men and women for asymptomatic coronary artery disease with resting electrocardiography (ECG), ambulatory ECG, or exercise ECG ("C" recommendation). Recommendations against routine screening may be made on other grounds for persons who are not at high risk of developing symptomatic CAD; these grounds include the limited sensitivity and low predictive value of an abnormal resting ECG in asymptomatic persons, and the high costs of screening and follow-up. Screening selected high-risk asymptomatic persons (e.g., those with multiple cardiac risk factors) is indicated only where results will influence treatment decisions (e.g., use of aspirin or lipid-lowering drugs in asymptomatic persons). Screening individuals in certain occupations (pilots, truck drivers, etc.) can be recommended on other grounds, including possible benefits to public safety. The choice of specific screening test for asymptomatic CAD is left to clinical discretion: exercise ECG is more accurate than resting ECG but is considerably more expensive.

Routine ECG screening as part of the periodic health visit or preparticipation sports physical is not recommended for asymptomatic children, adolescents, and young adults ("D" recommendation).

Clinicians should emphasize proven measures for the primary prevention of coronary disease in all patients (see Chapter 3, Screening for Hypertension; Chapter 2, Screening for High Blood Cholesterol; Chapter 54, Counseling to Prevent Tobacco Use; Chapter 55, Counseling to Promote Physical Activity; and Chapter 56, Counseling to Promote a Healthy Diet).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Dennis L. Disch, MD, and Harold C. Sox, Jr., MD.

2. Screening for High Blood Cholesterol and Other Lipid Abnormalities

Burden of Suffering

Elevated blood cholesterol is one of the major modifiable risk factors for coronary heart disease (CHD), 1 the leading cause of death in the U.S. CHD accounts for approximately 490,000 deaths each year, 2 and angina and nonfatal myocardial infarction (MI) are a source of substantial morbidity. CHD is projected to cost over $60 billion in 1995 in the U.S. in medical expenses and lost productivity. 3 The incidence of CHD is low in men under age 35 and in premenopausal women (1-2/1,000 annually), 4 but climbs exponentially during middle age for both men and women. The onset of CHD is delayed approximately 10 years in women compared with men, probably due to effects of estrogen, 5 but women account for 49% of all CHD deaths in the U.S. 2 Clinical events are the result of a multifactorial process that begins years before the onset of symptoms. Autopsy studies detected early lesions of atherosclerosis in many adolescents and young adults. 6-10 The onset of atherosclerosis and symptomatic CHD is earlier among persons with inherited lipid disorders such as familial hypercholesterolemia (FH) 11 and familial combined hyperlipidemia (FCH). 12

Serum Cholesterol and Risk of Coronary Heart Disease.

Epidemiologic, pathologic, animal, genetic, and clinical studies support a causal relationship between blood lipids (usually measured as serum levels) and coronary atherosclerosis. 1,13-15 Extended follow-up of large cohorts (predominantly middle-aged men) 16-18 provides evidence that CHD risk increases in a continuous and graded fashion, beginning with cholesterol levels as low as 150-180 mg/dL; [a] this association extends to cholesterol levels measured as early as age 20 in men. 14,19 During middle age, for each 1% increase in total cholesterol, CHD risk increases by an estimated 3%. 20 High cholesterol (>=240 mg/dL) is also a risk factor in middle-aged women, but most coronary events in women occur well after menopause. 5,17,21-24 Some studies report that cholesterol alone is a weak predictor of CHD mortality in the elderly, 24a,190 but an overview of 24 cohort studies indicates that high cholesterol remains a risk factor for CHD after age 65, 23 with the strongest associations among healthier elderly populations followed over longer periods. 25-27 The association is weaker in older women than in men 23 and is not consistent for cholesterol levels measured after age 75. 28-31

Expert panels have defined high and "borderline high" (200-239 mg/dL) cholesterol to simplify clinical decisions. 1 Because CHD is a multifactorial process, however, there is no definition of high cholesterol that discriminates well between individuals who will or will not develop CHD. 32,33 Due to nonlipid risk factors, persons with cholesterol below 240 mg/dL account for the majority of all CHD events. 34,35 Among middle-aged men, 9-12% of those with cholesterol 240 mg/dL or greater will develop symptomatic CHD over the next 7-9 years, 34,36 but most of them have multiple other risk factors for CHD. 35 The excess (i.e., absolute) risk due to high cholesterol (and the probable benefit of lowering cholesterol) increases with the underlying risk of CHD. In a 12-year study of over 316,000 men aged 35-57, the excess CHD mortality attributable to high cholesterol was greatest in men over age 45, and in those who smoked or had hypertension. 16 The increase in CHD mortality associated with a given increment in serum cholesterol was steepest at very high values (>300 mg/dL). 16 Excess risk from high cholesterol is smaller in women, who have less than half the CHD risk of men at any given cholesterol level. 17,23,37 Although the relative risk associated with high serum cholesterol declines with age, 17,23,28 the excess risk generally does not, due to the much higher incidence of CHD in older persons. 31,38,39


[a] To convert values for serum total cholesterol, HDL-C, and LDL-C to mmol/L, multiply by 0.02586. Equivalent values for commonly used thresholds are 280 mg/dL = 7.2 mmol/L, 240 mg/dL = 6.2 mmol/L, 200 mg/dL = 5.2 mmol/L.
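For readers working in SI units, a minimal sketch of the conversion described in this footnote:

    # Convert serum cholesterol from mg/dL to mmol/L using the factor
    # 0.02586 given in the footnote above.
    def mg_dl_to_mmol_l(value_mg_dl):
        return value_mg_dl * 0.02586

    for threshold in (200, 240, 280):
        print(f"{threshold} mg/dL = {mg_dl_to_mmol_l(threshold):.1f} mmol/L")
    # 200 mg/dL = 5.2 mmol/L; 240 mg/dL = 6.2 mmol/L; 280 mg/dL = 7.2 mmol/L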

Other Lipid Constituents and Risk of Coronary Disease.

The risk associated with high total cholesterol is primarily due to high levels of low-density lipoprotein cholesterol (LDL-C), 1 but there is a strong, independent, and inverse association between high-density lipoprotein cholesterol (HDL-C) levels and CHD risk. 40-42 Low HDL-C increases risk even when cholesterol is below 200 mg/dL, 41 a pattern present in up to 20% of men with confirmed CHD. 43 In many studies, measures of HDL-C or the ratio of total cholesterol to HDL-C are better predictors of CHD risk than is serum cholesterol alone. 5,22,23,24a,41,44 High total cholesterol in association with high HDL-C (>=60 mg/dL) is common in older women (especially those taking estrogen) but is not associated with an increased risk for CHD. 1,41 The importance of triglycerides as an independent risk factor for CHD remains uncertain. 40,45 Three large studies reported strong associations between triglyceride levels over 200-300 mg/dL (2.26-3.39 mmol/L) and cardiovascular mortality in women, 21,22,24 but other analyses found no association after controlling for obesity, fasting glucose, or low HDL-C. 46 The combination of high triglycerides and low HDL-C often occurs in association with other CHD risk factors such as hypertension and diabetes and is associated with a high risk of CHD. 46a

Prevalence of High Cholesterol and Low HDL-C.

Serum total cholesterol and LDL-C increase 1-2 mg/dL per year in men from ages 20-40, 2 mg/dL per year in women from ages 40-60, 47 and an average of 18% during the perimenopausal period, due in part to age-related increases in weight. 48 The prevalence of serum cholesterol 240 mg/dL or higher increases from 8-9% in adults under age 35 to nearly 25% for men age 55 and nearly 40% for women over 65. 49 Approximately 11% of men and 3% of women over age 20 have low HDL-C (<35 mg/dL) with desirable or borderline-high total cholesterol. 49

Accuracy of Screening Tests

Both total cholesterol and HDL-C can be measured in venipuncture or finger-stick specimens from fasting or nonfasting individuals. Due to normal physiologic variation and measurement error, a single measurement may not reflect the patient's true (or average) cholesterol level. Stress, minor illness, posture, and seasonal fluctuations may cause serum cholesterol to vary 4-11% within an individual. 50 Laboratory assays are subject to random errors, due to variation in sample collection, handling, and reagents, and to systematic errors (bias), due to methods that consistently overestimate or underestimate cholesterol values. 51 In a survey of 5,000 clinical laboratories, 93% of the measurements were within 9% of a reference standard. 52 Desktop analyzers can produce reliable results, but some devices may not meet standards for accuracy. 53 Variation in training and operating technique can introduce additional error when instruments are used outside clinical laboratories. 54 Average bias for measurements based on capillary specimens compared to venous specimens was +4-7%. 55

As a result of these considerations, a single measure of serum cholesterol could vary as much as 14% from an individual's average value under acceptable laboratory conditions. 50 For an individual with a "true" cholesterol level of 200 mg/dL, the 95% range of expected values is 172-228 mg/dL. 56 Some authorities therefore recommend advising patients of their "cholesterol range," rather than a single value. 56 Where more precise estimates are necessary, an average of at least two measurements on two occasions has been recommended, and a third if the first two values differ by more than 16%. 50
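The quoted range follows directly from the 14% figure. A small sketch, assuming the variation is symmetric around the true value:

    # 95% range of expected values for a single cholesterol measurement,
    # given combined biologic and analytic variation of +/-14% (assumed
    # symmetric around the true value).
    def expected_range(true_value_mg_dl, variation=0.14):
        return (true_value_mg_dl * (1 - variation),
                true_value_mg_dl * (1 + variation))

    low, high = expected_range(200)
    print(f"true 200 mg/dL -> single-test range {low:.0f}-{high:.0f} mg/dL")
    # true 200 mg/dL -> single-test range 172-228 mg/dL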

Screening Children by Family History.

Although cholesterol levels in childhood correlate moderately well with levels in adulthood (correlation coefficient 0.4-0.6), many children with elevated serum cholesterol (defined as serum cholesterol >=200 mg/dL or LDL-C >=130 mg/dL, the 90-95th percentile in U.S. children under 19 years) 57 do not have high cholesterol as adults. 58-60 Furthermore, the association between childhood cholesterol levels and CHD in adults has not been studied. Because of the familial aggregation of CHD and hypercholesterolemia, 57,61,62 some experts recommend screening for family history of either premature cardiovascular disease (age 55 or younger) or parental hypercholesterolemia (>=240 mg/dL) to identify a subset of children who are more likely to be at risk from hypercholesterolemia as adults. 57 Under this definition, only 25% of all children would be screened, but the predictive value of family history is limited: 81-90% of children with such histories have normal cholesterol. 63-66 Even when parental cholesterol has been measured and found to be elevated, most children have normal cholesterol values. 57,67,68

Parental and childhood cholesterol levels are highest in heterozygous FH (estimated prevalence 1 in 500), which is strongly associated with premature CHD. Up to 50% of men with FH develop clinical CHD by age 50. 69,70 Screening based on family history, as defined above, does not appear to be an efficient strategy for detecting FH, however. Many children would be screened, and few of those identified and treated for high cholesterol would have FH. 71 By itself, a parental history of premature CHD is likely to detect less than half of all children with FH. 70 Tracing and screening families of index cases with FH may be more cost-effective than population screening for FH. 72

Screening for Other Lipid Abnormalities.

Measurements of HDL-C and triglycerides are less reliable than measurement of total cholesterol due to greater biologic and analytic variability. 73,74 The 95% range of expected values for an individual with HDL-C of 37 mg/dL is 29-45 mg/dL. 75 A survey of 250 laboratories found that one third of all HDL-C measurements varied more than 10% from a reference value. 76 Triglycerides must be measured on fasting specimens. Even then, intraindividual variation is greater than 20%, and a single measure is inadequate to categorize levels as high or normal. 73,74 Measurement of apolipoproteins (e.g., apoB) has been evaluated as a screening test for FH, familial coronary disease, and high LDL-C, but these assays are not yet widely available or adequately standardized. 57

Effectiveness of Early Detection

No long-term study has compared routine cholesterol screening to alternate strategies (selective case-finding or universal dietary advice without screening) with change in cholesterol levels or CHD incidence as an outcome. The increase in cholesterol screening over the past decade in the U.S. has been accompanied by significant improvements in dietary knowledge, 77 fat consumption, 78 average cholesterol levels, 79 and CHD mortality, 14 but it is difficult to isolate the contribution of screening from other factors (e.g., public education, changes in food supply) that may account for these trends. In community- or practice-based trials, patients receiving risk-factor screening and targeted dietary advice had slightly lower average cholesterol levels (1-3%) than did unscreened controls at 1-3-year follow-up, but dietary interventions were limited. 80-82 Whether screening improves the effectiveness of routine dietary advice has been examined in two short-term studies in which all subjects received counseling about diet; cholesterol screening modestly improved mean cholesterol levels in one study but had no effect in the other. 83,84 In a school-based study in which all children received similar health education, cardiovascular risk-factor screening (including cholesterol measurement) was associated with improved dietary knowledge and self-reported behavior, but changes in lipid levels were not assessed. 85

The primary evidence to support cholesterol screening is the ability of cholesterol-lowering interventions to reduce the risk of CHD in patients with high cholesterol. These benefits are now well established for persons with preexisting atherosclerotic vascular disease. In individual trials and overviews of studies enrolling persons with angina or prior myocardial infarction (MI), cholesterol-lowering treatments slowed the progression of atherosclerosis, 86 reduced the incidence of CHD, 87,88 and reduced overall mortality. 89 In the first long-term trial of newer cholesterol-lowering drugs, treatment with simvastatin over 5.4 years reduced coronary mortality 42% and all-cause mortality 30% in 4,444 men and women with coronary disease. 90

The absolute benefit of treating high cholesterol in persons without cardiovascular disease, however, is much smaller due to the much lower risk of death or MI (annual CHD mortality 0.1-0.3% in middle-aged men with asymptomatic high cholesterol vs. 2-10% per year in patients with symptomatic CHD). 91 The risks and benefits of lowering cholesterol in asymptomatic persons -- primarily middle-aged men with very high cholesterol -- have been examined in trials using medications, modified diets in institutional patients, or outpatient dietary counseling, and in overviews of these trials.

Trials of Cholesterol-Lowering Drugs in Asymptomatic Men.

Three large, multicenter, placebo-controlled trials of lipid-lowering medications provide the best evidence that lowering cholesterol can reduce combined CHD incidence (fatal and nonfatal events) in asymptomatic persons. These trials enrolled hypercholesterolemic middle-aged men (age 30-59, mean cholesterol 246-289 mg/dL) and lowered total cholesterol 9-10% (and LDL-C 10-13%) over periods of 5-7 years. In the World Health Organization Cooperative Trial, treatment with clofibrate significantly reduced the incidence of nonfatal MI by 25%, 92 but this benefit was offset by significant increases in noncardiac and total mortality (40% and 30% respectively, p = 0.01). 93 The Lipid Research Clinics (LRC) Coronary Primary Prevention Trial reported a significant 19% reduction in cumulative incidence of MI and sudden cardiac death in patients treated with cholestyramine over 7 years (7.0% vs. 8.6%). 36 In the Helsinki Heart Study, treatment with gemfibrozil significantly reduced the 5-year cumulative incidence of cardiac events by 34% (2.7% vs. 4.1%). 94 Most of the benefit of gemfibrozil was confined to men with a high ratio of LDL-C to HDL-C (>=5) and triglycerides >200 mg/dL. 95 Effects on CHD mortality were not statistically significant in any of these trials. Two additional drug trials reported 1-3-year results in largely asymptomatic populations. 96,97 Roughly 30% of subjects had CHD at entry, however, and these patients accounted for most of the coronary events during follow-up.
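The absolute differences behind these relative reductions are worth making explicit. The sketch below derives them from the cumulative incidences quoted above; the number-needed-to-treat (NNT) figures are computed here for illustration and are not reported by the trials.

    # Absolute risk reduction (ARR), relative risk reduction (RRR), and
    # number needed to treat (NNT), derived from the cumulative incidences
    # quoted above. NNT values are illustrative derivations only.
    def risk_reduction(control_rate, treated_rate):
        arr = control_rate - treated_rate
        return arr, arr / control_rate, 1 / arr

    for name, control, treated in (("LRC, 7 years", 0.086, 0.070),
                                   ("Helsinki, 5 years", 0.041, 0.027)):
        arr, rrr, nnt = risk_reduction(control, treated)
        print(f"{name}: ARR {arr:.1%}, RRR {rrr:.0%}, NNT ~{nnt:.0f}")
    # approx: LRC ARR 1.6%, RRR 19%, NNT ~63; Helsinki ARR 1.4%, RRR 34%, NNT ~71

Viewed this way, the 19% and 34% relative reductions correspond to absolute differences of 1-2 percentage points, or roughly 60-70 men treated for 5-7 years to prevent one coronary event.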

Trials of Diet in Institutionalized Persons.

Demonstrating a clinical benefit of modern cholesterol-lowering diets in asymptomatic persons has proven difficult. In three controlled trials in institutionalized patients, fat-modified diets reduced serum cholesterol 12-14% with generally favorable effects on CHD over periods of up to 8 years. 98-100 Each of these studies used diets high in polyunsaturated fat, which have been associated with adverse effects, 15 and none excluded patients with CHD. As a result, their findings may not be applicable to currently recommended low-fat diets in asymptomatic persons.

Trials of Dietary Advice in Outpatients.

The only trials to examine the clinical benefits of a diet low in total and saturated fat in persons without CHD are multifactorial intervention trials, which offered dietary counseling, smoking cessation advice, and/or treatment of high blood pressure to middle-aged men. 101-105 Among Norwegian smokers with very high cholesterol levels (mean 320 mg/dL) and fat consumption (44% calories), dietary advice lowered cholesterol 13% and, in conjunction with smoking cessation, reduced CHD incidence by 47%. 103 The remaining trials achieved much smaller (0-5%) reductions in cholesterol and insignificant effects on CHD; the benefits of intervention in some studies may have been limited by ineffective counseling and follow-up, 101,104 lower cholesterol levels at baseline, 101 or adverse effects of other therapies. 102,105 In the most systematic test of dietary counseling in adults, 10 weekly group sessions and periodic individual counseling were provided over 6 years to over 6,000 men (mean cholesterol 253 mg/dL). 102 Average cholesterol level declined 5% in men receiving counseling, but only 2% compared to controls. Greater changes were observed in men who lost at least 5 pounds and those with higher serum cholesterol at baseline, 106 but there was no significant reduction in CHD mortality or incidence in the intervention group. 102,107

Short-term metabolic studies and selected trials in patients with CHD indicate that reducing dietary saturated fat and/or increasing polyunsaturated fat intake can reduce elevated total and LDL-C as much as 10-20%. 108-110 Due to variable compliance, trials of diet counseling in the primary care setting have achieved much smaller and inconsistent average reductions in serum cholesterol in asymptomatic persons (0-4%). 80-82,111-116 Although larger changes have been reported in uncontrolled follow-up studies after cholesterol screening, 117,118 these results may be biased by selective or short-term follow-up and regression to the mean in persons with high cholesterol. Ongoing studies are examining the efficacy of cholesterol screening and intervention in primary care settings in the U.S. 119 More stringent diets can produce larger reductions in cholesterol, 120 but long-term data in asymptomatic persons are limited. Two trials in women at risk for breast cancer lowered total fat intake to 20% of calories and reduced total cholesterol 6-7% over 1-2 years. 121,122

Overviews of Cholesterol-Lowering Trials.

At least 10 quantitative overviews (meta-analyses) of randomized trials have attempted to resolve uncertainties about the risks and benefits of lowering cholesterol, including effects on mortality. 18,88,89,91,123-128 Three recent overviews provide the most comprehensive analyses of long-term cholesterol-lowering trials published through 1993: 35 diet and drug trials were included in one analysis, 91 28 in the second (excluding trials that used estrogens or thyroxine), 18,89 and 22 in the third (all trials achieving at least a 4% reduction in cholesterol for at least 3 years). 128 These overviews support a dose-response relationship between change in serum cholesterol and reduction in CHD incidence (fatal and nonfatal events combined) comparable to that predicted from epidemiologic studies: after 2-5 years of treatment, each 1% reduction in serum cholesterol yields a 2-3% reduction in total CHD, for both diet and drug interventions, and in patients with or without CHD at entry. 18,89,128
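A worked instance of this dose-response estimate, using illustrative arithmetic only:

    # Dose-response estimate from the overviews above: each 1% reduction in
    # serum cholesterol yields roughly a 2-3% reduction in CHD incidence.
    # The 9-10% cholesterol reductions achieved in the drug trials therefore
    # predict roughly 18-30% fewer CHD events, consistent with the 19-34%
    # relative reductions those trials observed.
    for cholesterol_drop in (9, 10):
        low, high = 2 * cholesterol_drop, 3 * cholesterol_drop
        print(f"{cholesterol_drop}% cholesterol reduction -> ~{low}-{high}% fewer CHD events")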

When only trials enrolling asymptomatic persons were analyzed, however, neither CHD mortality nor total mortality was significantly reduced by cholesterol lowering: 128 difference in total mortality among treated versus control subjects = +6%, 95% confidence interval (CI) -3% to +17%. 89 Moreover, noncardiac mortality was increased 20-24% among patients treated with lipid-lowering medications. 89,91,128 While observing similar effects of treatment, each overview offered distinct interpretations of these findings. Law et al. concluded that the increase in noncoronary mortality was most likely due to chance: the finding was of borderline statistical significance (p = 0.02), did not reflect any consistent cause of excess mortality among trials, and was independent of compliance with therapy. 89 Gordon attributed adverse effects to trials employing hormones or fibrate medications. 128 Davey Smith et al. concluded that lipid-lowering drugs reduced overall mortality in high-risk persons (i.e., persons with CHD), but were harmful in those at lower risk. 91 When trials were stratified by the observed CHD mortality in the control group, drug treatment was associated with a significant 20% increase in all-cause mortality in 10 trials enrolling low-risk subjects (CHD mortality <1% per year), including the WHO, LRC, and Helsinki studies. 91 A single trial (the WHO clofibrate study) accounts for nearly half of all patient-years of treatment in persons without CHD 128 and has a strong influence on results of any meta-analysis.

Due to methodologic concerns about combining results from trials employing different cholesterol-lowering drugs and diets, meta-analysis cannot prove or disprove possible harms from lipid-lowering medications. 129 These analyses, however, illustrate the importance of underlying CHD risk in determining whether expected benefits are likely to justify possible risks of treatment. Even if drugs are safe, the margin of benefit may be small for many persons with asymptomatic hypercholesterolemia. In the LRC and Helsinki trials, preventing one coronary death required treating 300-400 middle-aged men for 5-7 years. 36,94 The benefits of lipid-lowering medications on nonfatal CHD are more pronounced but must be weighed against the unpleasant and occasionally serious side effects of some drugs (see below). 92,127,130 The newest class of lipid-lowering drugs, HMG-CoA reductase inhibitors or "statins," lowers cholesterol more effectively and appears to be well-tolerated in trials lasting up to 6 years. 90,97 These drugs are more likely to have significant effects on mortality in patients without CHD, but long-term trials of these agents in asymptomatic persons have yet to report results. 131

Cholesterol Reduction in Women.

Lipid-lowering medications and diet effectively lower cholesterol in women, 132 but no trial has specifically examined the benefits of cholesterol reduction in asymptomatic women. 133 Trials that included female subjects with CHD observed qualitatively similar benefits of cholesterol reduction on angiographic or clinical endpoints in women and men. 90,99,133 In the 4S trial, simvastatin significantly reduced CHD incidence, but not mortality, in women with CHD. 90 Two trials in women without CHD, with a cumulative enrollment of more than 6,000 women, observed no effect of drug or diet treatment on CHD incidence or mortality after 1-3 years; 96,100 the short duration of follow-up may have limited the power of these studies to detect a difference. 133 Long-term data on drug therapy in women are limited, with the exception of estrogen therapy (see Chapter 68).

Cholesterol Reduction in Older Adults.

The benefit of lowering cholesterol in older persons has been questioned due to the weak association between serum cholesterol and all-cause mortality after age 60. 17,28,30 Associations between cholesterol and mortality in unselected elderly populations, however, are likely to be confounded by the increasing prevalence of chronic illnesses which increase mortality and independently lower serum cholesterol. 26,134,135 Direct evidence that cholesterol reduction is beneficial in asymptomatic older persons is not yet available, but cholesterol-lowering diets and medications reduced overall mortality 26-30% in persons over 60 with clinical CHD. 90,136,137 In two trials in patients without CHD that included older subjects, however, cholesterol reduction produced significant benefits in younger but not in older patients (over age 60 or 65). 96,98 Newer cholesterol-lowering agents are efficacious and well-tolerated in older patients. 90,138 A large multicenter trial is under way to examine the effectiveness of pravastatin and various antihypertensive medications in asymptomatic persons over age 60 with hypertension and high cholesterol. 138 There are few controlled trials of dietary counseling to lower cholesterol in older patients; no significant change in cholesterol levels was observed among rural Medicare recipients offered diet counseling 139 or older patients receiving diet counseling and placebo medication. 138

Cholesterol Reduction in Adolescents and Young Adults.

Determining the benefits of lowering cholesterol in children, adolescents, and young adults is difficult, due to their low near-term risk of clinical coronary disease. The assumption that early treatment is more effective than treatment begun later in life 57 rests on observations that early atherosclerosis is present in many adolescents and young adults, is associated with lipid levels, progresses with age, 6 and is difficult to reverse in middle age. 86 New evidence, however, suggests that much of the clinical benefit of lowering cholesterol can be achieved within 2-5 years of initiating therapy. 18 These benefits have been attributed to stabilizing "lipid-rich" lesions 87 and improving endothelial function, 140 and they suggest that the additional benefits of early drug therapy for hypercholesterolemia (i.e., before middle age) may not justify the added expense and possible risks of longer treatment. Intensive diet or drug intervention for adolescents and young adults with FH, although never tested in a prospective trial, has become standard treatment due to the very high levels of LDL-C and dramatically increased risk of premature CHD in persons with FH. 11,71 Even in FH, however, most clinical events occur in middle age (i.e., after age 40), and risk is variable: MI was rare before age 30 in men in one study, and onset of CHD is later in women and nonsmokers with FH. 69,141

Modified diets lower cholesterol in young adults, but the contribution of universal screening in motivating risk reduction in young persons is uncertain. Neither a multiple-intervention trial in Australian workers 142 nor a study of risk assessment in a general practice in the U.K. 116 demonstrated that screening and dietary advice led to long-term reduction in cholesterol levels in younger men (under age 35-40). The effectiveness of screening and dietary counseling has not been adequately studied in young adults and cannot be predicted reliably from studies in middle-aged men.

Cholesterol Reduction in Children.

Dietary fat intake in children is associated with total cholesterol and LDL-C levels, 143,144 but controlled trials have not consistently demonstrated that individual dietary counseling is effective in children. 145-147 The largest trial reported that children with elevated LDL-C who received intensive family-oriented dietary counseling (30 sessions over 3 years) experienced a significant but modest (3.2 mg/dL) reduction in mean LDL-C compared with controls. 148 Uncontrolled studies of dietician counseling for hyperlipidemic children and adolescents have reported larger short-term reductions in mean cholesterol and LDL-C, 149-153 but such studies are prone to bias from regression to the mean and selective follow-up. Physical activity and fitness are associated with higher levels of HDL-C in children and adolescents, but controlled and uncontrolled trials 154-159 have reported inconsistent effects of exercise interventions on lipids. Drug therapy effectively lowers cholesterol in children, but side effects limit compliance with bile-acid resins, the only therapy currently recommended for routine use in children. 57 In one study of 80 children with FH or FCH, only 13% were still compliant with resin therapy after 3 years. 160 Ongoing studies are examining the safety and efficacy of newer agents in children.

Potential Adverse Effects of Screening and Intervention.

Measurement of serum cholesterol is safe and relatively inexpensive, but widespread screening may have some undesirable consequences. In populations in which the potential benefits of early detection may be small (e.g., low-risk young persons), the possibility of harm may influence decisions about universal screening. 161 Anecdotal reports have described decreased well-being in persons diagnosed with high cholesterol (i.e., "labeling"), 162 but a prospective study did not confirm this effect. 163 Other possible adverse effects of screening include inconvenience and expense of screening and follow-up, opportunity costs to the busy clinician, misinformation due to inaccurate results, and reduced attention to diet in persons with "desirable" cholesterol levels. 164 The importance of possible adverse effects of screening has not been systematically studied.

The safety of cholesterol-lowering interventions is especially important in children and young persons. Dietary restrictions may reduce intake of calories, calcium, vitamins, and iron in children, 165-167 and failure-to-thrive due to excessively fat-restricted diets has been reported, albeit rarely, in children. 168,169 In the most comprehensive trial of dietary intervention in children, however, no adverse effects on growth, sexual development, psychological measures, iron status, or blood micronutrients were detected at 3-year follow-up. 148 Other controlled studies also support the safety of properly performed dietary intervention in children. 147,166,170 The elderly may also be at risk from modified diets if adequate intake of calories, calcium, and essential vitamins is not maintained, but these effects have not been directly examined.

The inappropriate use of drug therapy is of greater concern, especially in young persons in whom the benefit of early drug treatment may not justify the costs and possible risks. 18,161 According to a national survey of pediatricians and family physicians, one in six regularly prescribed drugs for hypercholesterolemic children, and a substantial number did so based on inappropriate criteria, or used drugs not routinely recommended for children. 171 Persons under age 40 accounted for over 1 million prescriptions for lipid-lowering drugs in 1992; 172 gemfibrozil was the second most commonly prescribed lipid-lowering drug in the U.S. in 1992, 172 despite limited indications for its use 1 and important safety concerns. Fibrate medications (e.g., clofibrate and gemfibrozil) have been associated with an increase in gallstone disease, 92 adverse trends in CHD mortality 93,173 and cancer mortality in individual trials, 93,174 and a significant increase in noncoronary mortality in a recent overview of long-term trials. 128 HMG-CoA reductase inhibitors have not been associated with important adverse effects in trials lasting up to 6 years. 90 The safety of lifelong therapy with these agents cannot yet be determined; several medications in this class have been reported to cause liver tumors in animal studies.

Early Detection of Other Lipid Abnormalities.

The importance of detecting low HDL-C or high triglycerides remains unproven, especially in persons with normal serum cholesterol. Weight loss in obese subjects, 132,175 smoking cessation, exercise, 176,176a and moderate alcohol consumption 177 can raise HDL-C and/or lower triglyceride levels. Some of these lifestyle interventions have only small effects, however, and most can be recommended independent of lipid levels. Most importantly, no trial has directly examined the benefit of raising HDL-C or lowering triglycerides. 40,45 Secondary analyses of several trials have attributed varying proportions of the clinical benefit of drug therapy to increases in HDL-C, 40,94,95 or reductions in triglycerides, 136 but all of the subjects had high total or LDL cholesterol. The benefit of drug treatment for low HDL-C and normal cholesterol has not been determined but is being studied in men with CHD. 43

Recommendations of Other Groups

The National Cholesterol Education Program Adult Treatment Panel II, convened by the National Heart, Lung, and Blood Institute, recommends routine measurement of nonfasting total cholesterol and HDL-C in all adults age 20 or older at least once every 5 years. 1,178 The Canadian Task Force on the Periodic Health Examination concluded there was insufficient evidence to recommend routine cholesterol screening but endorsed case-finding in men 30-59 years old. 127 The American Academy of Family Physicians 179 recommends measurement of total cholesterol at least every 5 years in adults age 19 and older; these recommendations are under review. The American College of Obstetricians and Gynecologists recommends periodic screening of cholesterol in all women over age 20, and in selected high-risk adolescents. 180 In guidelines revised in 1995, the American College of Physicians (ACP) concluded that screening serum cholesterol was appropriate but not mandatory for asymptomatic men aged 35-65 and women aged 45-65; screening is not recommended for younger persons unless they are suspected of having a familial lipoprotein disorder or have multiple cardiac risk factors. The ACP concluded that evidence was not sufficient to recommend for or against screening asymptomatic persons between the ages of 65 and 75, but it recommends against screening after age 75. 181

Selective screening of children and adolescents is recommended by the National Cholesterol Education Program Expert Panel on Blood Cholesterol Levels in Children and Adolescents, 57 the American Academy of Pediatrics (AAP), 182 the Bright Futures guidelines, 183 the American Medical Association Guidelines for Adolescent and Preventive Services (GAPS), 184 and the American Academy of Family Physicians. 179 Screening with nonfasting cholesterol in all children and adolescents who have a parental history of hypercholesterolemia, and with fasting lipid profile in those with a family history of premature cardiovascular disease, is recommended. These organizations recommend that children who have multiple risk factors for CHD (such as smoking or obesity) and whose family history cannot be ascertained be screened at the discretion of the physician.

Discussion

Elevated serum cholesterol is an important risk factor for CHD in men and women in the U.S., and there is now good evidence that lowering serum cholesterol can reduce the risk of CHD. Whereas measures that lower serum cholesterol and provide other health benefits (e.g., regular physical activity, reducing dietary fat, and maintaining a healthy weight) should be encouraged in all persons, cholesterol screening can identify high-risk individuals who are most likely to benefit from individualized dietary counseling or drug treatment. In addition, screening may help clinicians and patients identify priorities for risk factor modification and reinforce public awareness of the importance of a healthy diet.

Some important questions remain, however, about routine lipid screening in asymptomatic and low-risk persons, including when to begin screening and which constituents to measure. Overall, evidence is strongest for screening for high serum cholesterol in middle-aged men (ages 35-65), based on the reduction in coronary morbidity in trials enrolling asymptomatic men with very high cholesterol (mean 280 mg/dL). The epidemiology and pathophysiology of CHD is similar in men and women, suggesting that reducing high cholesterol levels will also reduce CHD in asymptomatic women. Extrapolations to premenopausal women may not be appropriate, however, given their low risk of CHD and the apparent protective effects of estrogen on CHD incidence. The optimal age to begin screening women is not known; the later onset of hypercholesterolemia and CHD suggests that routine screening should begin around age 45.

Direct evidence that screening and intervention is effective in persons over age 65 is not yet available, but epidemiologic studies indicate that the risks of high cholesterol extend up to age 75. Given the high risk of CHD in the elderly, and the benefits of lowering cholesterol in symptomatic older men and women, screening may be reasonable in older persons who do not have major comorbid illnesses. Since individual cholesterol levels usually plateau by age 65 in women (and earlier in men), continued screening is less important in patients who have had desirable cholesterol levels throughout middle age.

There is not yet evidence that routine lipid screening is effective in reducing cholesterol levels or CHD risk in younger populations. Universal screening is an inefficient way to identify the small number of hypercholesterolemic young persons at risk for premature CHD, most of whom have multiple nonlipid risk factors or a history suggestive of familial dyslipidemia. Most "high-risk" young persons (excluding young men with FH) have a near-term risk of CHD well below that of hypercholesterolemic middle-aged men, 185 and are not appropriate candidates for early drug therapy. Screening young persons can provide information to help stimulate lifestyle changes, but promoting a healthy lifestyle (e.g., healthy diet, regular physical activity, etc.) is important for all young persons, including the majority with "desirable" cholesterol levels. International comparisons suggest that cholesterol levels explain only part of the strong association between diet and heart disease. 186a As a result, it is uncertain whether routine cholesterol screening in low-risk younger populations is of sufficient benefit to justify the inconvenience, costs, and possible risks of screening and treatment. In a study modeling the benefits of cholesterol screening, a conservative strategy of screening only middle-aged men and others with multiple CHD risk factors produced benefits comparable to screening all adults over age 20; if interventions had adverse effects on quality of life, the more conservative strategy was preferable. 186 Should future studies demonstrate that routine screening and targeted interventions are more effective in the primary care setting than universal dietary advice in young persons, this would provide some additional justification for early screening.

The benefits of screening children are even less certain. Progression of atherosclerosis in childhood is limited, many children with high cholesterol are not hypercholesterolemic as adults, and it is uncertain whether or not reducing cholesterol levels in childhood will significantly alter the risk of CHD many years later. Given the limited effectiveness of dietary counseling, poor compliance with currently recommended drug therapy, and the potential for adverse reactions in children, widespread pediatric screening might result in more harm than good.

The benefit of measuring HDL-C or triglycerides at initial screening is unproven. Measures to lower high triglycerides or raise HDL-C (e.g., weight reduction in obese persons and exercise) have relatively modest effects and should be encouraged regardless of lipid levels. Measures of HDL-C and lipoprotein analysis improve the estimation of coronary risk and should be obtained to guide treatment decisions in patients with high total cholesterol. There is, however, no evidence that they significantly improve the management of patients who do not have high total cholesterol.

While a single cholesterol test is relatively inexpensive, the cumulative costs of screening can be substantial under protocols calling for measurement of HDL-C, periodic screening, and detailed evaluation and treatment of the large population with high cholesterol. To be effective, dietary interventions require regular follow-up and reinforcement. Under optimistic assumptions, tailored dietary therapy in middle-aged men is estimated to cost more than $20,000 per year of life gained, when costs of screening and follow-up are included. 187 Drug treatment of asymptomatic middle-aged men (assuming no important adverse effects) has been estimated to cost at least $50,000-90,000 per year of life saved. 35,188 HMG-CoA reductase inhibitors are substantially more expensive than earlier medications, but they lower LDL-C more effectively and also raise HDL-C. These agents may improve the cost-effectiveness of drug therapy for asymptomatic hypercholesterolemia, especially in high-risk men, 189 but the long-term safety and effectiveness of these agents in persons without CHD have not yet been established.

CLINICAL INTERVENTION

Periodic screening for high blood cholesterol, using specimens obtained from fasting or nonfasting individuals, is recommended for all men ages 35-65 and women ages 45-65 ("B" recommendation). There is insufficient evidence to recommend for or against routine screening in asymptomatic persons after age 65, but screening may be considered on a case-by-case basis ("C" recommendation). Older persons with major CHD risk factors (smoking, hypertension, diabetes) who are otherwise healthy may be more likely to benefit from screening, based on their high risk of CHD and the proven benefits of lowering cholesterol in older persons with symptomatic CHD. Cholesterol levels are not a reliable predictor of risk after age 75, however. There is insufficient evidence to recommend routine screening in children, adolescents, or young adults ("C" recommendation). For adolescents and young adults who have a family history of very high cholesterol, premature CHD in a first-degree relative (before age 50 in men or age 60 in women), or major risk factors for CHD, screening may be recommended on other grounds: the greater absolute risk attributable to high cholesterol in such persons, and the potential long-term benefits of early lifestyle interventions in young persons with high cholesterol. Recommendations against routine screening in children may be made on other grounds, including the costs and inconvenience of screening and follow-up, greater potential for adverse effects of treatment, and the uncertain long-term benefits of small reductions in childhood cholesterol levels.
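
The age and sex cutoffs above amount to a simple decision rule. For illustration only, the sketch below (Python; the function name and return values are invented for this example, not part of the guideline) encodes the recommendation categories as stated, omitting the risk-factor caveats discussed in the surrounding text.

```python
def cholesterol_screening_grade(age: int, sex: str) -> str:
    """Recommendation category for routine cholesterol screening as
    summarized above. Illustrative only; sex is "M" or "F".
    """
    if sex == "M" and 35 <= age <= 65:
        return "B"  # periodic screening recommended
    if sex == "F" and 45 <= age <= 65:
        return "B"  # periodic screening recommended
    if age > 65:
        return "C"  # insufficient evidence; consider case by case
    return "C"      # children, adolescents, young adults: insufficient evidence
```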

The appropriate interval for periodic screening is not known. Periodic screening is most important when cholesterol levels are increasing (e.g., middle-aged men, perimenopausal women, and persons who have gained weight). An interval of 5 years has been recommended by experts, 1 but longer intervals may be reasonable in low-risk subjects (including those with previously desirable cholesterol levels).

There is insufficient evidence to recommend for or against routine measurement of HDL-C or triglycerides at initial screening ("C" recommendation). For high-risk persons (middle-aged persons with high cholesterol or multiple nonlipid risk factors for CHD), measurement of HDL-C or lipoprotein analysis can be recommended to help identify individuals at highest risk of CHD, in whom individual diet or drug therapy may be indicated.

Decisions about interventions for high cholesterol should be based on at least two measures of cholesterol and assessment of the absolute risk of CHD in each individual. This assessment should take into account the age of the patient (higher risk in men over 45 and women over 55), results of lipoprotein analysis (or ratio of total cholesterol to HDL-C), and the presence and severity of other risk factors for CHD (see above). 178 More specific algorithms for risk assessment have been published. 185 Initial therapy for patients with elevated cholesterol is counseling to reduce consumption of fat (especially saturated fat) and promote weight loss in overweight persons. A two-step dietary program effective in lowering serum cholesterol has been described in detail elsewhere. 1 Benefits of drug therapy are likely to justify costs and potential risks only in persons at high risk of CHD (e.g., middle-aged men and postmenopausal women with very high cholesterol or multiple risk factors). The risks and benefits of drug therapy in asymptomatic persons over 65 have not yet been determined. In postmenopausal women with high cholesterol, estrogen therapy can lower LDL-C and raise HDL-C and is associated with lower risk of CHD in epidemiologic studies (see Chapter 68). Patients should receive information on the potential benefits, costs, and risks of long-term therapy before beginning treatment with cholesterol-lowering drugs.
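
The assessment steps just described can be read as a triage procedure. The following sketch is a minimal illustration of that logic; the numeric thresholds (200 and 240 mg/dL cutpoints and a total cholesterol to HDL-C ratio of 5) are commonly cited values assumed here for concreteness, not figures quoted from this chapter.

```python
def assess_high_cholesterol(tc_mg_dl: list, hdl_mg_dl: float,
                            age: int, sex: str, nonlipid_risk_factors: int) -> str:
    """Illustrative triage per the assessment described above.
    tc_mg_dl: total cholesterol measurements (at least two, since
    decisions should not rest on a single value). Thresholds are
    assumptions for this example, not guideline citations.
    """
    if len(tc_mg_dl) < 2:
        return "repeat measurement"           # need at least two measures
    mean_tc = sum(tc_mg_dl) / len(tc_mg_dl)
    ratio = mean_tc / hdl_mg_dl               # total cholesterol : HDL-C
    age_risk = (sex == "M" and age > 45) or (sex == "F" and age > 55)
    if mean_tc >= 240 and (ratio > 5 or nonlipid_risk_factors >= 2 or age_risk):
        return "dietary therapy; consider drugs if absolute CHD risk is high"
    if mean_tc >= 200:
        return "dietary counseling and follow-up"
    return "continue periodic screening"
```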

All adults, adolescents, and children over age 2 years, including those with normal cholesterol levels, should receive periodic counseling regarding dietary intake of fat and saturated fat (see Chapter 56) and other measures to reduce the risk of coronary disease (see Chapters 3, 54, and 55).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by David Atkins, MD, MPH, and Carolyn DiGuiseppi, MD, MPH.

3. Screening for Hypertension

Burden of Suffering

Hypertension is usually defined as a diastolic blood pressure of 90 mm Hg or higher or a systolic pressure of 140 mm Hg or higher. 1 It is present in an estimated 43 million Americans and is more common in blacks and older adults. 1a Hypertension is a leading risk factor for coronary heart disease, congestive heart failure, stroke, ruptured aortic aneurysm, renal disease, and retinopathy. These complications of hypertension are among the most common and serious diseases in the U.S., and successful efforts to lower blood pressure could thus have substantial impact on population morbidity and mortality. Heart disease is the leading cause of death in the U.S., accounting for nearly 740,000 deaths each year (287 deaths per 100,000 population), and cerebrovascular disease, the third leading cause of death, accounts for about 150,000 deaths each year (58/100,000). 2 Milder forms of hypertension predict progression to more severe elevations and development of cardiovascular disease. 1,3,4 Coronary heart disease mortality begins to increase at systolic blood pressures above 110 mm Hg and at diastolic pressures above 70 mm Hg. 5 The prevalence of unrecognized and uncontrolled hypertension, and the mortality from cardiovascular disease, have declined substantially in the U.S. in the past several decades. 1

Treatable (also known as secondary) causes of hypertension such as aortic coarctation or renovascular disease also may be associated with severe consequences, including congestive heart failure, aortic rupture, or stroke. 6-9 There are no population data available for estimating the true prevalence of secondary hypertension. The incidence of coarctation of the aorta has been estimated at 0.2-0.6/1,000 live births and the prevalence at 0.1-0.5/1,000 children. 10-12

Accuracy of Screening Tests

The most accurate devices for measuring blood pressure (e.g., intra-arterial catheters) are not appropriate for routine screening because of their invasiveness, technical limitations, and cost. Office sphygmomanometry (the blood pressure cuff) remains the most appropriate screening test for hypertension in the asymptomatic population. Although this test is highly accurate when performed correctly, false-positive and false-negative results (i.e., recording a blood pressure that is not representative of the patient's average blood pressure) do occur in clinical practice. 13 One study found that 21% of persons diagnosed as mildly hypertensive based on office sphygmomanometry had no evidence of hypertension when 24-hour ambulatory recordings were obtained. 14

Errors in measuring blood pressure may result from instrument, observer, and/or patient factors. 15 Examples of instrument error include manometer dysfunction, pressure leaks, stethoscope defects, and cuffs of incorrect width or length for the patient's arm size. The observer can introduce errors due to sensory impairment (difficulty hearing Korotkoff sounds or reading the manometer), inattention, inconsistency in recording Korotkoff sounds (e.g., Phase IV vs. Phase V), and subconscious bias (e.g., "digit preference" for numbers ending with zero or preconceived notions of "normal" pressures). The patient can be the source of misleading readings due to posture and biologic factors. Posture (i.e., lying, standing, sitting) and arm position in relation to the heart can affect results by as much as 10 mm Hg. 15 Biologic factors include anxiety, meals, tobacco, alcohol, temperature changes, exertion, and pain. Due to these limitations in the test-retest reliability of blood pressure measurement, it is commonly recommended that hypertension be diagnosed only after more than one elevated reading is obtained on each of three separate visits over a period of one to several weeks. 1

Additional factors affect accuracy when performing sphygmomanometry on children; these difficulties are especially common when testing infants and toddlers under 3 years of age. 16-18 First, there is increased variation in arm circumference, requiring greater care in the selection of cuff sizes. 19 Second, the examination is more frequently complicated by the anxiety and restlessness of the patient. Third, the disappearance of Korotkoff sounds (Phase V) is often difficult to hear in children, and Phase IV values are often substituted. Fourth, erroneous Korotkoff sounds can be produced inadvertently by the pressure of the stethoscope diaphragm against the antecubital fossa. Finally, the definition of pediatric hypertension has itself been uncertain because of confusion over normal values during childhood. The definition of hypertension in childhood is essentially arbitrary, based on age-specific percentiles. 18 Age-, sex-, and height-specific blood pressure nomograms for U.S. children and adolescents have been published more recently, based on data from 56,108 children aged 1-17 years. 20

Self-measured (home) blood pressure and ambulatory blood pressure monitoring may provide useful information in special circumstances (e.g., research, persistent "white-coat" hypertension), but there is insufficient evidence at present to warrant their routine use in screening. 1,21-28

Effectiveness of Early Detection

There is a direct relationship between the magnitude of blood pressure elevation and the benefit of lowering pressure. In persons with malignant hypertension, the benefits of intervention are most dramatic: treatment increases 5-year survival from near zero (data from historical controls) to 75%. 29 Over the past 30 years, the results of many randomized clinical trials of the effects of antihypertensive drug therapy on morbidity and mortality in adult patients (>=21 years of age) with less severe hypertension have been published. 30-32 The efficacy of treating hypertension is clear, as demonstrated in a number of older randomized controlled trials in adults with diastolic blood pressures ranging from 90 to 129 mm Hg. 33-38 For example, in the Veterans Administration Cooperative Study on Antihypertensive Agents, middle-aged men with diastolic blood pressure averaging 90 through 114 mm Hg experienced a significant reduction in "morbid" events (e.g., cerebrovascular hemorrhage, congestive heart failure) after treatment with antihypertensive medication. 34

Persons with mild (Stage 1) to moderate (Stage 2) 1 diastolic hypertension (90-109 mm Hg) also benefit from treatment. 30,39-41 This was confirmed in the Hypertension Detection and Follow-Up Program, a randomized controlled trial involving nearly 11,000 hypertensive men and women, of whom 40% were black. 39 The intervention group received standardized pharmacologic treatment ("stepped care") while the control group was referred for community medical care. There was a statistically significant 17% reduction in 5-year all-cause mortality in the group receiving standardized drug therapy; the subset with diastolic blood pressure 90-104 mm Hg experienced a 20% reduction in mortality. 39 Deaths due to cerebrovascular disease, ischemic heart disease, and other causes were also significantly reduced in the stepped care group. 42 Similar effects on all-cause mortality and cardiovascular events have been reported in other randomized controlled trials, such as the Australian National Blood Pressure Study (initial diastolic blood pressure 95-109 mm Hg) 40 and the Medical Research Council (MRC) trial (diastolic blood pressure 90-109 mm Hg). 41 In these two trials, the relative reduction in rates of stroke or other trial endpoints with treatment was similar in those with diastolic blood pressures <95 or 95-99 mm Hg and those with higher diastolic blood pressures, although the absolute benefit was less due to smaller initial risk of stroke and other diseases at lower blood pressures. Both trials included untreated control groups and did not report a significant reduction in deaths from noncardiovascular causes in the actively treated groups, confirming that the benefit was due to antihypertensive treatment rather than to other medical care.

Earlier studies included some subjects over age 65 years, but in insufficient numbers to permit firm conclusions. Four large, randomized placebo-controlled trials have since demonstrated conclusively the benefit of antihypertensive treatment in elderly subjects (aged 60-97 years). 43-48 Three of these studies included persons with diastolic blood pressures of 90-120 mm Hg, and among them reported significant reductions in all-cause mortality, 46 cardiovascular mortality, 43,46 cardiovascular events, 47 and strokes. 46,47 The Systolic Hypertension in the Elderly Program (SHEP) trial included over 4,000 subjects >=60 years of age with isolated systolic hypertension (systolic blood pressure >= 160 mm Hg, with diastolic blood pressure < 90 mm Hg), and reported significant reductions in the incidence of stroke, myocardial infarction, and left ventricular failure. 48 A meta-analysis combining these and other trials that included persons aged >=60 years demonstrated that antihypertensive treatment in elderly persons significantly reduced mortality from all causes (-12%), stroke (-36%), and coronary heart disease (-25%), as well as stroke and coronary heart disease morbidity. 49 This meta-analysis suggested reduced benefit with increasing age, although differences were not statistically significant. A second meta-analysis of randomized controlled trials in persons over age 60 years concluded that absolute 5-year morbidity and mortality benefits derived from trials were greater for older than for younger subjects. 50 This meta-analysis calculated that 18 (95% CI, 14-25) elderly hypertensive subjects needed to be treated for 5 years to prevent one cardiovascular event.
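
The number needed to treat (NNT) quoted above is simply the reciprocal of the absolute risk reduction (ARR); working backward from the reported figure gives a sense of the underlying event rates (an arithmetic illustration, not a value taken from the meta-analysis):

\[
\mathrm{NNT} = \frac{1}{\mathrm{ARR}} = \frac{1}{p_{\mathrm{control}} - p_{\mathrm{treated}}}, \qquad \mathrm{NNT} = 18 \;\Longrightarrow\; \mathrm{ARR} = \frac{1}{18} \approx 5.6\% \text{ over 5 years.}
\]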

Treatment of hypertension is associated with multiple benefits, including reduced coronary heart disease and vascular deaths, but meta-analyses suggest it produces the largest reductions in cerebrovascular morbidity and mortality. 30-32,49,50 Improved treatment of high blood pressure has been credited with a substantial portion of the greater than 50% reduction in age-adjusted stroke mortality that has been observed since 1972. 1,51,52

Although the efficacy of antihypertensive treatment for essential (also called primary) hypertension has been well established in clinical research, certain factors may influence the magnitude of benefit from hypertension screening achieved in actual practice. Compliance with drug therapy may be limited by the inconvenience, side effects, and cost of these agents. 53,54 Serious or life-threatening drug reactions in the clinical trials were rare, but less serious side effects were common, resulting in discontinuation of randomized treatment (almost 20% by the fifth year in the MRC trial, 41 for example) or a substantial increase in patient discomfort. 34 Higher incidences of mild hypokalemia, hyperuricemia, and elevated fasting blood sugar have also been reported in treated individuals. 35 A population-based case-control study suggested an increased risk of primary cardiac arrest with certain diuretic regimens (e.g., higher doses, use without potassium-sparing therapy). 55 However, current drug regimens, including low-dose diuretics, are associated with fewer adverse effects and with favorable effects on quality of life. 55a Newer classes of drugs (e.g., calcium channel blockers, angiotensin-converting enzyme inhibitors) have not been assessed in long-term trials with clinical endpoints. Their effects on cardiovascular morbidity and mortality may differ from the effects reported in the clinical trials cited above, which used diuretics or beta-blockers.

Whether hypertension screening is equally effective for other populations or with treatments other than drugs is less clear. The benefits of hypertension treatment are less well studied in certain population groups, such as children (see below), Native Americans, Asians and Pacific Islanders, and Hispanics. The effects of nonpharmacologic first-line therapy (i.e., weight reduction in overweight patients, increased physical activity, sodium restriction, and decreased alcohol intake) on cardiovascular morbidity and mortality are unstudied. Although these nonpharmacologic therapies can sometimes lower blood pressure in the short term, 1,56-62a the magnitude of blood pressure reduction achieved is generally smaller than that achieved with drug therapy, and both the magnitude and duration of reduction in actual practice may be limited by biologic factors (e.g., varying responsiveness to sodium restriction) and the difficulties of maintaining behavioral changes (e.g., weight loss). Some of these interventions, such as sodium restriction, may also have adverse effects on quality of life. 63

The detection of high blood pressure during childhood is of potential value in identifying those children who are at increased risk of primary hypertension as adults and who might benefit from earlier intervention and follow-up. Hypertensive vascular and end-organ damage may begin in childhood, 64-69 although it is unclear how strongly these pathophysiologic changes are associated with subsequent cardiovascular disease. Prospective cohort studies have shown that children with high blood pressure are more likely than other children to have hypertension as adults. 70-78 Correlation coefficients from these studies were generally less than 0.5, however, suggesting a limited role for high blood pressure in childhood as a predictor of adult hypertension. Although controlled trials in children show that short-term (up to 3 years) effects on blood pressure can be achieved with changes in diet and activity, 79-82 studies demonstrating long-term changes in blood pressure are lacking. There are no trials showing that lowering blood pressure in childhood results in reduced blood pressure in adulthood. A relationship between lowering blood pressure during childhood and improved morbidity and mortality in later life is unlikely to be demonstrated, given the difficulty of performing such studies.

A relatively high proportion of children with hypertension have secondary, potentially curable forms. Among children and adolescents whose hypertension was evaluated in primary care centers, an estimated 28% had secondary hypertension (e.g., renal parenchymal disease, coarctation of the aorta). 69 This contrasts with hypertensive adults seen in primary care settings, of whom only 7% are estimated to have secondary hypertension. 83 Screening children and adolescents may be justifiable if the morbidity of these conditions is improved by early detection and treatment. Many causes of secondary hypertension in childhood are detectable by careful history-taking (e.g., preterm birth, umbilical artery catheter, chronic pyelonephritis, renal disease, bronchopulmonary dysplasia; symptoms of cardiac, renal, endocrinologic, or neurologic disease) or physical examination (e.g., murmur, decreased femoral pulses, abdominal bruit). 69,84 Characteristic symptoms and signs, such as those of aortic coarctation, are often overlooked, however. 85-87 Numerous surgical case series suggest that delay in surgical repair of aortic coarctation increases the likelihood of irreversible hypertension, 88-94 although none of these series controlled for other differences between persons presenting early versus late in life. Uncontrolled studies indicate that some important causes of hypertension for which definitive cures are available, including coarctation and renovascular disease, may not be diagnosed until complications such as congestive heart failure, aortic rupture, or stroke occur. 6-9 Prognosis with early surgical intervention is improved compared with historical controls. 88,95

Recommendations of Other Groups

Recommendations for adults have been issued by the Joint National Committee on Detection, Evaluation, and Treatment of High Blood Pressure, 1 and similar recommendations have been issued by the American Heart Association. 96 These call for routine blood pressure measurement at least once every 2 years for adults with a diastolic blood pressure below 85 mm Hg and a systolic pressure below 130 mm Hg. Measurements are recommended annually for persons with a diastolic blood pressure of 85-89 mm Hg or systolic blood pressure of 130-139 mm Hg. Persons with higher blood pressures require more frequent measurements. The American College of Physicians (ACP) 97 and the American Academy of Family Physicians (AAFP) 98 recommend that all adults 18 years and older be screened for hypertension every 1-2 years. The AAFP policy is currently under review. The ACP also recommends screening at every physician visit for other reasons, and that those in high-risk groups (e.g., diastolic 85-89 mm Hg, previous history of hypertension) be screened on an annual basis. The Canadian Task Force on the Periodic Health Examination recommends that all persons aged 21 years and over receive a blood pressure measurement during any visit to a physician ("case finding"). 99

The American Academy of Pediatrics (AAP), 100 the National Heart, Lung, and Blood Institute, 18 the AAFP, 98 Bright Futures, 101 the American Medical Association, 102 and the American Heart Association 103 recommend that children and adolescents receive blood pressure measurements every 1 or 2 years during regular office visits. The Canadian Task Force found insufficient evidence to recommend for or against routine blood pressure measurement in persons under age 21 years. 99 The AAP recommends against universal neonatal blood pressure screening. 104

Discussion

It is clear from several large randomized clinical trials that lowering blood pressure in hypertensive adults is beneficial and that death from several common diseases can be reduced through the detection and treatment of high blood pressure. Estimates suggest that an average diastolic blood pressure reduction of 5-6 mm Hg in everyone with hypertension could reduce the incidence of coronary heart disease by 14% and the incidence of strokes by 42%. 30,31 At the same time, it is important for clinicians to minimize the potential harmful effects of detection and treatment. For example, if performed incorrectly, sphygmomanometry can produce misleading results. Some hypertensive patients thereby escape detection (false negatives) and some normotensive persons receive inappropriate labeling (false positives), which may have certain psychological, behavioral, and even financial consequences. 105 Treatment of hypertension can also be harmful as a result of medical complications, especially related to drugs. Clinicians can minimize these effects by using proper technique when performing sphygmomanometry, making appropriate use of nonpharmacologic methods, and prescribing antihypertensive drugs with careful adherence to published guidelines. 1,106-108

The diastolic blood pressure above which therapy has been proven effective (i.e., diastolic blood pressure > 90 mm Hg) is to a large extent based on the artificial cutpoints chosen for study purposes rather than on a specific biologic cutpoint defining increased risk. The coronary heart disease mortality risk associated with blood pressure occurs on a continuum that extends well below the arbitrarily defined level for abnormal blood pressure, beginning for systolic blood pressure above 110 mm Hg and for diastolic pressure above 70 mm Hg. 5 Nevertheless, many organizations outside the U.S. have been reluctant to recommend drug therapy for persons with diastolic blood pressures below 100 mm Hg who lack additional risk factors. 106,108-111 Drug treatment of mild hypertension is of particular concern for young adults: the evidence for therapeutic benefit comes primarily from several older trials 34,36,38 that included only a few individuals in their 20s, the potential adverse effects of decades of antihypertensive therapy are undefined, and the absolute benefits in young adults are likely to be limited given their small risk of stroke and coronary heart disease. For persons with mild hypertension, most recommendations suggest including age and/or the presence of other cardiovascular disease risk factors or concomitant diseases (e.g., smoking, obesity, renal disease, peripheral vascular disease) to modify treatment decisions. 1,106,108-111

Tracking studies and pathophysiologic evidence suggest there may be some benefit from early detection of primary hypertension in childhood, but there is insufficient evidence to support routine screening solely for this purpose. The lack of evidence is of concern because it is unclear whether a policy of routinely screening all children and adolescents to detect primary hypertension would achieve sufficient clinical benefit later in life to justify the costs and potential adverse effects of widespread testing and treatment. Potentially curable causes of hypertension, which account for a relatively large proportion of cases in young children, are often overlooked on history and physical examination, with rare but potentially catastrophic consequences. Evidence from case series and multiple time series indicates that early detection of secondary hypertension in childhood is of substantial benefit to the small number of patients affected.

CLINICAL INTERVENTION

Periodic screening for hypertension is recommended for all persons >=21 years of age ("A" recommendation). The optimal interval for blood pressure screening has not been determined and is left to clinical discretion. Current expert opinion is that adults who are believed to be normotensive should receive blood pressure measurements at least once every 2 years if their last diastolic and systolic blood pressure readings were below 85 and 140 mm Hg, respectively, and annually if the last diastolic blood pressure was 85-89 mm Hg. 1 Sphygmomanometry should be performed in accordance with recommended technique. 1 Hypertension should not be diagnosed on the basis of a single measurement; elevated readings should be confirmed on more than one reading at each of three separate visits. In adults, current blood pressure criteria for the diagnosis of hypertension are an average diastolic pressure of 90 mm Hg or greater and/or an average systolic pressure of 140 mm Hg or greater. 1 Once confirmed, patients should receive appropriate counseling regarding physical activity (Chapter 55), weight reduction and dietary sodium intake (Chapter 56), and alcohol consumption (Chapter 52). Evidence should also be sought for other cardiovascular risk factors, such as elevated serum cholesterol (Chapter 2) and smoking (Chapter 54), and appropriate intervention should be offered when indicated. The decision to begin drug therapy may include consideration of the level of blood pressure elevation, age, and the presence of other cardiovascular disease risk factors (e.g., tobacco use, hypercholesterolemia), concomitant disease (e.g., diabetes, obesity, peripheral vascular disease), or target-organ damage (e.g., left ventricular hypertrophy, elevated creatinine). 1,106,108 Antihypertensive drugs should be prescribed in accordance with recent guidelines 1,106,108 and with attention to current techniques for improving compliance. 53,54
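
The screening intervals and diagnostic thresholds in this paragraph reduce to a small decision rule. The sketch below (Python; the function name and the wording of the returned advice are invented for illustration) encodes only the thresholds quoted above and is not a substitute for the cited guidelines.

```python
def bp_screening_advice(systolic: int, diastolic: int) -> str:
    """Screening-interval logic for adults believed to be normotensive,
    per the thresholds quoted in the paragraph above. Illustrative only.
    """
    if systolic >= 140 or diastolic >= 90:
        # Hypertension must not be diagnosed from a single measurement:
        # confirm on more than one reading at each of three separate visits.
        return "confirm with repeated readings at three separate visits"
    if 85 <= diastolic <= 89:
        return "remeasure annually"
    return "remeasure at least once every 2 years"
```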

Measurement of blood pressure during office visits is also recommended for children and adolescents ("B" recommendation). This recommendation is based on the proven benefits from the early detection of treatable causes of secondary hypertension; there is insufficient evidence to recommend for or against routine periodic blood pressure measurement to detect essential (primary) hypertension in this age group. Sphygmomanometry should be performed in accordance with the recommended technique for children, and hypertension should only be diagnosed on the basis of readings at each of three separate visits. 18 In children, criteria defining hypertension vary with age. 18 Age-, sex-, and height-specific blood pressure nomograms for U.S. children and adolescents have been published. 20

Routine counseling to promote physical activity (Chapter 55) and a healthy diet (Chapter 56) for the primary prevention of hypertension is recommended for all children and adults.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Carolyn DiGuiseppi, MD, MPH, based in part on material prepared for the Canadian Task Force on the Periodic Health Examination by Alexander G. Logan, MD, FRCPC, and Christopher Patterson, MD, FRCPC.

4. Screening for Asymptomatic Carotid Artery Stenosis

Burden of Suffering

Cerebrovascular disease is the third leading cause of death in the U.S., accounting for over 149,000 deaths in 1993. 1 Most stroke-related morbidity and mortality occur in older adults: 87% of all deaths and 74% of all hospitalizations occur in persons age 65 years or older. 2 Strokes can result in substantial neurologic deficits as well as serious medical and psychological complications. With an estimated 3 million stroke survivors, 3 this illness places enormous burdens on family members and caretakers, often necessitating skilled care in an institutional setting. The direct and indirect costs of stroke in the U.S. have been estimated at $30 billion annually. 4 The principal risk factors for ischemic stroke are increased age, hypertension, smoking, coronary artery disease, atrial fibrillation, and diabetes. 4-7 Of these, the most important modifiable risk factors are hypertension and smoking. 8 Improved treatment of high blood pressure has been credited with the greater than 50% reduction in age-adjusted stroke mortality that has been observed since 1972 (see Chapter 3).

Population-based cohort studies have established that persons with carotid artery stenosis are at increased risk for subsequent stroke, myocardial infarction (MI), and death. 9,10 The risk of stroke is greatest for persons with neurologic symptoms such as transient ischemic attacks (TIAs), but is also increased in patients with asymptomatic lesions. The prevalence of hemodynamically significant carotid stenosis varies with age and other risk factors: population-based studies estimate that 0.5% of persons in their 50s and about 10% of those over age 80 have carotid stenosis greater than 50%. 11 The proportion of all strokes attributable to previously asymptomatic carotid stenosis seems to be small, however. In a study of 250 patients over age 60 with cerebral infarction, only 13% had ipsilateral carotid stenosis of 70% or greater. 12

Accuracy of Screening Tests

Two methods are used to screen for carotid artery stenosis: clinical auscultation for carotid bruits and noninvasive studies of the carotid artery. Neck auscultation is an imperfect screening test for carotid stenosis. There is considerable interobserver variation among clinicians in the interpretation of the key auditory characteristics -- intensity, pitch, and duration -- of importance in predicting stenosis. 13 In addition, a cervical bruit can be heard in 4% of the population over age 40, but the finding is not specific for significant carotid artery stenosis. Between 40% and 75% of arteries with asymptomatic bruits do not have significant compromise in blood flow. 14 Similar sounds can also be produced by anatomic variation and tortuosity, venous hum, goiter, and transmitted cardiac murmur. 13,15-17 Finally, hemodynamically significant stenotic lesions may exist in the absence of an audible bruit. 13,15,18 Using 70-99% stenosis on a carotid angiogram as a reliable standard, a carotid bruit has a sensitivity of 63-76% and a specificity of 61-76% for clinically significant stenosis. 19
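
The poor predictive value of a bruit follows directly from these operating characteristics once a realistic prevalence is assumed. The sketch below applies Bayes' rule using midpoints of the sensitivity and specificity ranges quoted above and an assumed 2% prevalence of 70-99% stenosis (an assumption for illustration; the text reports roughly 10% prevalence of >50% stenosis only above age 80).

```python
def ppv(sens: float, spec: float, prev: float) -> float:
    """Positive predictive value from sensitivity, specificity,
    and prevalence via Bayes' rule."""
    true_pos = sens * prev
    false_pos = (1.0 - spec) * (1.0 - prev)
    return true_pos / (true_pos + false_pos)

# Midpoints of the quoted ranges: sensitivity ~0.70, specificity ~0.69.
print(round(ppv(0.70, 0.69, 0.02), 3))  # ~0.044
```

At these values, roughly 95% of positive auscultatory findings would be false positives; even at 10% prevalence the predictive value rises only to about 20%, consistent with the caution above about auscultation as a screening test.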

Persons with cervical bruits can be evaluated further with greater accuracy by noninvasive study of the carotid arteries. Older techniques (e.g., spectral analysis phonoangiography, continuous-wave or pulsed Doppler ultrasound, B-mode real-time ultrasound, oculoplethysmography, ophthalmodynamometry, periorbital directional Doppler ultrasound, and thermography) have been replaced largely by carotid duplex sonography, which combines the capabilities of B-mode and Doppler ultrasound. A 1995 meta-analysis of 70 studies comparing the accuracy of noninvasive diagnostic tests to carotid angiography (the reference standard) concluded that carotid duplex, carotid Doppler, and magnetic resonance angiography (MRA) were equally effective in diagnosing carotid stenosis of 70% or greater: estimated sensitivity ranged from 83% to 86%, specificity from 89% to 94%. 20 Depending on the underlying population characteristics, the positive predictive value of carotid duplex ranges from 82% to 97%. 21 The performance of noninvasive tests for screening asymptomatic persons, however, has not been assessed in a prospective study. Although MRA seems to be quite sensitive and spares patients the risks of conventional angiography, it is unlikely to be a useful screening test due to costs (over $400) and inconvenience. 22

Effectiveness of Early Detection

The rationale for testing for carotid artery stenosis is not only that persons with asymptomatic stenoses are at increased risk for cerebrovascular disease, 11,12 but that early detection can reduce the resulting morbidity. According to this rationale there are several benefits to early detection of asymptomatic carotid stenosis. An awareness of the diagnosis might motivate patients to modify other risk factors (e.g., high blood pressure, smoking, physical inactivity). Performing carotid endarterectomy in some individuals might prevent subsequent cerebral infarction distal to the obstruction. Finally, antiplatelet drugs (aspirin and ticlopidine) might reduce stroke risk in asymptomatic individuals with carotid artery stenosis. No study has specifically compared a strategy of screening and early intervention in asymptomatic persons to intervening only in symptomatic patients (e.g., those with TIAs). The first symptom of carotid stenosis in some patients may be an irreversible stroke, however. A number of studies have examined whether interventions in asymptomatic persons can reduce the subsequent incidence of fatal and nonfatal stroke.

A bruit over the carotid artery is a fair indicator of vascular disease but a poor predictor that ischemic stroke will occur in its arterial distribution. The proportion of persons with asymptomatic bruits who will experience stroke is small: the annual incidence of stroke ipsilateral to a bruit and unheralded by TIAs is only 1-3%. 9,10,16,23-25 Higher grades of stenosis (assessed by sonography) are associated with increasing risk of neurologic events, rising to 5-7% per year with high-grade stenosis or total occlusion. 26,27 However, in those persons who will suffer a stroke, the degree of carotid stenosis does not always predict the risk of cerebral infarction, 16,23,28 or its location. 11,12 Carotid artery lesions may be less a predictor of atherothrombotic strokes than of generalized atherosclerotic disease; persons with carotid artery disease are considerably more likely to die from ischemic heart disease than from a cerebrovascular event. 9,10

One of the major justifications for screening is the belief that carotid endarterectomy for high-grade, asymptomatic lesions detected through screening can prevent stroke. Three studies published before 1987 reported improved outcomes after endarterectomy. These studies provide poor-quality evidence because they included previously symptomatic patients, were convenience samples derived from surgeons' practices, had inconsistent measurement criteria, or did not randomly assign patients. 17,29,30 Four more recent randomized trials have compared aspirin with endarterectomy in patients with asymptomatic carotid artery stenosis. The first study, comparing aspirin alone with endarterectomy alone, 31 enrolled only 71 patients before it was terminated due to excess MI in the surgical group; no conclusions could be drawn regarding the effectiveness of endarterectomy for preventing stroke. 32 The second study, the 1991 European CASANOVA trial, randomized 410 patients with moderately severe stenosis (> 50% but < 90%) to treatment with aspirin/dipyridamole or aspirin/dipyridamole plus surgery. 33 The protocol was complex: some patients in both groups had contralateral symptoms, patients with stenosis greater than 90% were excluded, and 72 patients received therapy appropriate for the other group (more in the group randomized to surgery). There were no differences in the numbers of neurologic events and deaths between the two groups. The power of the study, however, was insufficient to exclude a clinically important benefit in the surgical group. 33 A third study, published in 1993, randomized 444 older veterans (mean age 64) with 50% or greater carotid stenosis to aspirin plus carotid endarterectomy or aspirin therapy alone. Patients who underwent carotid endarterectomy had lower rates of ipsilateral neurologic events, the primary endpoint: the combined incidence of TIAs, transient monocular blindness, and stroke was 8% in the surgery group versus 21% with aspirin only (p < 0.001), during an average follow-up of 48 months. The two groups had similar outcomes, however, using a combined endpoint of stroke or death from any cause. The power of the study was insufficient to exclude up to a 20% reduction in stroke in the surgically treated group. 34 The generalizability of this study was limited by the lack of female subjects and by the excessive morbidity and mortality in both groups (over 40% incidence of stroke or death in both groups over the 4-year follow-up).

The Asymptomatic Carotid Artery Study (ACAS), 35 funded by the National Institutes of Health, recently reported final results that provide stronger evidence of the benefit of endarterectomy for asymptomatic stenoses. 36 This multicenter study randomized 1662 patients with asymptomatic stenoses greater than 60% (mean stenosis 73%) to endarterectomy plus aspirin or to aspirin alone. Most patients (87%) were over age 60, and more than two thirds had coronary heart disease. The trial was stopped after a median follow-up of 2.7 years. The estimated 5-year risk for ipsilateral stroke or perioperative stroke or death was 5.1% for surgical patients and 11% for medically treated patients, a reduction in cumulative risk of 53% (95% confidence interval, 22 to 72). The absolute reduction in the combined incidence of major ipsilateral stroke, major perioperative stroke, or perioperative death, however, was considerably smaller (estimated 5-year risk of 3.4% in the surgery group vs. 6% in the medical group), not statistically significant (p = 0.13), and evident only in the fifth year of follow-up. Subgroup analyses suggest that endarterectomy may be less effective in women than in men (17% vs. 66% reduction in 5-year event rate), possibly due to higher perioperative complication rates (3.6% in women vs. 1.7% in men); neither of these differences between genders was statistically significant, however. The medical centers participating in this trial had been rigorously evaluated for the quality of patient management, and only surgeons with a perioperative complication rate of less than 3% among asymptomatic patients were allowed to participate. 37 Published studies have reported a perioperative mortality ranging from 1% to 3%, 33,34,38-40 and a perioperative stroke rate ranging between 2% and 10%, depending on patient characteristics and surgical expertise. 13,33,34,39-44 In six prospective trials of endarterectomy published after 1990, perioperative complication rates (stroke and death combined) ranged from 3% to 8%. 38 Complication rates seem to be lower in asymptomatic patients than in symptomatic patients, however. 38,40 A fifth trial of surgery versus medical management for asymptomatic carotid artery stenosis is still in progress. 11
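
The contrast between the 53% headline figure and the modest absolute benefit is the difference between relative and absolute risk reduction. Using the rounded 5-year risks reported above (the trial's 53% reflects unrounded data):

\[
\mathrm{RRR} = \frac{0.110 - 0.051}{0.110} \approx 54\%, \qquad \mathrm{ARR} = 0.110 - 0.051 = 5.9\%, \qquad \mathrm{NNT}_{5\,\mathrm{yr}} = \frac{1}{0.059} \approx 17.
\]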

Antiplatelet therapy with aspirin or ticlopidine offers a second possible intervention to reduce the risk of stroke in patients with asymptomatic carotid artery stenosis. Clinical trials have demonstrated a benefit of aspirin in reducing stroke among symptomatic patients (i.e., in persons with TIAs or stroke), 45-48 but a large trial in asymptomatic physicians (prevalence of carotid disease unknown) observed no benefit on stroke. 49 Among patients with asymptomatic carotid disease, who have a lower risk of ischemic events than do symptomatic patients, chronic aspirin therapy may not provide sufficient benefits to justify the documented risks of hemorrhagic complications (see Chapter 69). A multicenter prospective study comparing aspirin to placebo in asymptomatic patients with >50% carotid stenosis found no difference in stroke rates. 50 Ticlopidine is an alternative to aspirin for patients with risk factors for gastrointestinal hemorrhage or aspirin intolerance, and for patients who continue to have vascular events despite aspirin therapy, but its use is limited by high cost and a small risk of neutropenia (approximately 1%). 51,52 The efficacy of ticlopidine in patients with asymptomatic carotid artery stenosis is not known.

Reducing serum lipids may slow the progression of carotid atherosclerosis and reduce clinical events. In a randomized trial enrolling patients with moderately elevated levels of LDL cholesterol (130-190 mg/dL) and early carotid atherosclerosis diagnosed by B-mode ultrasound, lovastatin induced regression of atherosclerosis and reduced total cardiovascular events compared to placebo. 52a Lipid-lowering drug therapy has not been examined specifically for treatment of advanced carotid stenosis, but is generally recommended for patients with high cholesterol and symptomatic vascular disease, based on its ability to reduce coronary heart disease mortality (see Chapter 2). No controlled studies have examined whether patients change behaviors (e.g., smoking cessation or dietary modification) on learning the results of carotid artery examinations.

Recommendations of Other Groups

Although auscultation of the carotid arteries is widely considered a routine component of the physical examination, the Canadian Task Force on the Periodic Health Examination 53 recommended against screening for bruits in asymptomatic persons, based on the poor sensitivity and specificity of cervical bruits as an indicator of significant carotid stenosis. The American Academy of Family Physicians recommends auscultation for carotid bruits in people age 40 and older with risk factors for cerebrovascular or cardiovascular disease, those with neurologic symptoms (e.g., TIA), or those with a history of cardiovascular disease; 54 this policy is currently under review. The 1988 guidelines of the American College of Physicians recommend that patients with asymptomatic bruits should not have further diagnostic testing but should be educated about potential symptoms of a TIA in the carotid circulation. 55 In 1988, an ad hoc multidisciplinary consensus panel involved in designing the ACAS study recommended a baseline noninvasive study of the carotid arteries in persons considered at high risk for extracranial carotid arterial disease. 56 In 1992, the Ad Hoc Committee of the Joint Council of the Society for Vascular Surgery and the North American Chapter of the International Society for Vascular Surgery recommended that patients with asymptomatic carotid artery stenosis greater than 75% who are otherwise healthy and have a projected life expectancy more than 5 years should be considered for surgery if the operative morbidity and mortality rates are less than 3%. 57

Discussion

The effectiveness of routine screening and intervention to reduce morbidity from asymptomatic carotid artery disease remains uncertain. The most effective interventions to prevent stroke are smoking cessation and the identification and treatment of hypertension. Although screening will detect some patients with asymptomatic high-grade carotid lesions who may benefit from endarterectomy, such patients account for only a small proportion of all strokes. In addition, there are several reasons to be cautious about undertaking widespread screening in asymptomatic persons on the basis of the current evidence: 11,38,58 the risk of major stroke ipsilateral to stenotic lesions is relatively low without surgery (approximately 1% per year); the absolute reduction in major stroke and death due to surgery over 5 years in ACAS was small and not conclusive; surgery may result in other nonfatal complications (cranial nerve injury, MI, etc.); and the low complication rate of the ACAS-selected surgeons is not likely to reflect the typical risk of endarterectomy in the community. If complication rates of surgery are higher or the underlying risk of stroke lower than reported for the ACAS study, the risks of surgery for asymptomatic carotid artery disease may outweigh the benefits. Routine screening will also subject some patients without significant carotid disease to the risks of angiography (1% risk of stroke), due to occasional false-positive results of carotid ultrasound.

As a result, it is not yet clear whether widespread screening in the primary care setting will be an effective way to reduce morbidity and mortality from stroke. Noninvasive testing for carotid artery stenosis is expensive (over $150 for carotid duplex or Doppler ultrasound); 20 the cost of screening 50% of the population over age 60 in the U.S. has been estimated at over $7 billion. 58 Auscultation for bruits involves little direct expense and may detect a majority of patients with severe stenosis, but the costs of follow-up testing of all patients with asymptomatic bruits could be substantial. Revised cost-effectiveness analyses of various screening and treatment strategies for asymptomatic carotid disease are under way. Patients most likely to benefit from screening are older men (over age 60) who have other risk factors for stroke, no contraindications to major surgery, and access to high-quality vascular surgery centers. Evidence regarding the effectiveness of antiplatelet drugs for asymptomatic persons is not yet sufficient to make a recommendation.

CLINICAL INTERVENTION

There is insufficient evidence to recommend for or against screening asymptomatic persons for carotid artery stenosis, using physical examination or carotid ultrasound ("C" recommendation). A recommendation may be made on other grounds to discuss the potential benefits of screening with high-risk patients (e.g., persons over age 60 at high risk for vascular disease), provided that high-quality vascular surgical care is available (surgical morbidity and mortality less than 3%). These other grounds include the increased prevalence of significant carotid disease, and the possible long-term benefit of endarterectomy in patients with asymptomatic stenosis greater than 60% when performed by qualified surgeons. Patients should be screened and counseled about other risk factors for cerebrovascular disease as discussed in other chapters (see Chapters 3 and 54).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Stephen Tabet, MD, MPH, Alfred O. Berg, MD, MPH, and David Atkins, MD, MPH.

5. Screening for Peripheral Arterial Disease

Burden of Suffering

Peripheral arterial disease (PAD) becomes increasingly common with age. An estimated 12-17% of the population over age 50 have PAD. 1-4 Increased mortality has been well documented in patients with PAD, a disease that is strongly associated with coronary artery disease and that shares many of the same risk factors. 1,2,5-9 Although only a small proportion of individuals with PAD and intermittent claudication develop skin breakdown or limb loss, pain and associated disability often restrict ambulation and the overall quality of life. 1,5 Persons at increased risk for PAD include cigarette smokers and persons with diabetes mellitus or hypertension. 1,5,9,10 Diabetic PAD is responsible for about 50% of all amputations. 1

Accuracy of Screening Tests

There is evidence that a history of intermittent claudication and the palpation of peripheral pulses are unreliable techniques for the detection of PAD. 1,3,8,11 In one study, a battery of noninvasive tests for PAD was administered to 624 hyperlipidemic subjects aged 38-82. 7 In this population, the sensitivity and positive predictive value of a classic history of claudication were only 54% and 9%, respectively, when compared with the results of formal noninvasive testing. The sensitivity of an abnormal posterior tibial pulse was 71%, the positive predictive value was 48%, and the specificity was 91%. An abnormal dorsalis pedis pulse had a sensitivity of only 50%; this artery is congenitally absent in 10-15% of the population. 11 The authors concluded that symptoms and abnormal pulses are not pathognomonic for PAD. 7 Greater accuracy has been achieved with noninvasive testing using Doppler ankle-arm pressure ratios, measurement of reactive hyperemia after exercise, pulse reappearance time, ultrasound duplex scanning, and plethysmography. 1,5,12,13 At present, however, additional data on sensitivity, specificity, and positive predictive value of these tests in asymptomatic populations are needed before noninvasive testing can be considered for routine screening.

Effectiveness of Early Detection

Because surgery for PAD is offered only to patients with symptomatic disease, the rationale for the early detection of asymptomatic PAD is that risk factor modification following detection might lower subsequent morbidity and mortality from PAD and systemic atherosclerotic disease. By virtue of its strong association with coronary atherosclerosis and coronary events, 5 the early diagnosis of PAD might also lead to the detection of asymptomatic coronary heart disease. Evidence of these benefits is lacking, however. There has been no research to examine whether the detection and treatment of asymptomatic persons with PAD reduces the morbidity or mortality observed in symptomatic patients. It is clear that certain interventions are beneficial in symptomatic persons. There is evidence, for example, that patients who stop smoking have marked improvement in PAD symptoms and reduced overall cardiovascular mortality. 1,14 Certain antithrombotic drugs may also be of benefit. 15 It is unknown whether treatment measures used in symptomatic patients are beneficial in asymptomatic patients. 1,6 Examples include walking programs, control of weight and blood pressure, correction of elevated serum lipids and glucose, proper foot care, and certain drugs.

Recommendations of Other Groups

There are no official recommendations for physicians to screen asymptomatic persons for PAD, although inspection of the skin and palpation of peripheral pulses are often included in the physical examination of the extremities. A recent international workshop sponsored by the American Diabetes Association and American Heart Association recommends annual screening for PAD in patients with diabetes. 16

Discussion

Evidence is lacking that routine screening for PAD in asymptomatic persons is effective in reducing morbidity or mortality from this disease. Many of the behavioral interventions that might be prescribed after detecting PAD -- smoking cessation (Chapter 54), blood pressure control (Chapter 3), and exercise (Chapter 55) -- can be recommended without screening and are of proven value in the prevention of other atherosclerotic conditions, such as coronary heart and cerebrovascular disease. Screening by physical examination in the general population of asymptomatic adults, where the prevalence of PAD is low, is likely to produce a substantial number of false-positive results. Positive screening results will necessitate expensive noninvasive tests and may lead to potentially hazardous invasive tests such as arteriography. At the same time, it is not known whether the early detection of PAD in asymptomatic patients will result in more effective treatment of risk factors or better outcomes.

CLINICAL INTERVENTION

Routine screening for peripheral arterial disease in asymptomatic persons is not recommended ("D" recommendation). Clinicians should screen for hypertension (see Chapter 3) and hypercholesterolemia (Chapter 2), and they should provide appropriate counseling regarding the use of tobacco products (Chapter 54), physical activity (Chapter 55), and nutritional risk factors for atherosclerotic disease (Chapter 56). Clinicians should be alert to symptoms of PAD in persons at increased risk (persons over age 50, smokers, diabetics) and evaluate patients who have clinical evidence of vascular disease.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Stephen Tabet, MD, MPH, and Alfred O. Berg, MD, MPH.

6. Screening for Abdominal Aortic Aneurysm

Burden of Suffering

Approximately 8,700 deaths from abdominal aortic aneurysm (AAA) were reported in the U.S. in 1990, 1 but undiagnosed ruptured aneurysms are probably responsible for many additional cases of sudden death in older persons. Once rupture occurs, massive intraabdominal bleeding is usually fatal unless prompt surgery can be performed. A review of six case-series including 703 cases of ruptured aneurysm estimated that only 18% of all patients with ruptured AAA reached a hospital and survived surgery. 2 The large majority of deaths from AAA occur in older men and women; men over 60 and women over 70 accounted for 95% of all deaths from AAA in a recent report. 2 Approximately 0.8% of male deaths and 0.3% of female deaths among persons over 65 years of age in the U.S. were attributed to AAA in 1990. 1

An aneurysm is usually defined as a focal dilation of the aorta at least 150% of the normal aortic diameter. 3 Given a normal aortic diameter in older men of 2 cm (range 1.4-3.0 cm), 4 an aortic diameter above 3 cm usually indicates an aneurysm. The pathogenesis of aneurysms is not completely understood, but well-established risk factors for AAA include increasing age, male gender, and family history of aneurysms. 3 The male to female ratio for death from AAA is 11:1 between ages 60 and 64 and narrows to 3:1 between ages 85 and 90. 5 Other possible risk factors include tobacco use, hypertension, peripheral vascular disease, and presence of peripheral arterial aneurysms. 3,6-8 In populations over age 60, estimates of prevalence range from 2% to 8% and increase with age. 8-10 A recent community study in England screened nearly 9,000 men and women aged 65-80 with ultrasound: 7% of men and 1% of women had an aneurysm at least 3 cm in diameter. 11 Among all patients, only 0.6% had aneurysms 5 cm or larger, and only 0.3% had aneurysms of 6 cm or more. 11 There are only limited data on the incidence of new aneurysms in a previously screened population. In one study, 189 men who had a normal ultrasound at age 65-66 years were rescreened 5 years later; only 2 (1%) had an aortic diameter greater than 3 cm. 12

Few aneurysms less than 4 cm in diameter will rupture. 2,13,14 Overall, 3-6% of aneurysms greater than 4 cm in diameter will rupture annually, 14,15 but the rate of rupture is directly related to the size of the aneurysm. The natural history of most aneurysms is one of gradual enlargement; growth rates have been estimated to average 0.2 cm/year for aneurysms under 4 cm, and 0.5 cm/year for those over 6 cm. 8

Efficacy of Screening Tests

Two tests, palpation of the abdomen during physical examination and abdominal ultrasound, have been seriously advocated as screening tests for AAA. Other tests that can detect aneurysms -- plain radiographs of the abdomen, computed tomography (CT), and magnetic resonance imaging (MRI) -- are either not sensitive enough or are too expensive to be practical for screening in asymptomatic populations.

The accuracy of physical examination in detecting AAA is not completely known. Large aneurysms are easier to detect than small ones, and it is easier to detect aneurysms in thin people. Estimates of the sensitivity of physical examination in detecting AAA range from 22% to 96%. 17,18 The high sensitivity obtained in series of preoperative cases probably represents the preponderance of large aneurysms in this population. Lederle reported a sensitivity of 50% and a positive predictive value of 35% in a high-risk population screened in an internal medicine clinic (9% prevalence of AAA). 19 Four of five aneurysms greater than 5 cm in diameter in this series were detected by palpation. In contrast, Allen reported a 22% sensitivity, 94% specificity, and positive predictive value of 17% in a population with a 5% prevalence of aneurysms. 17 No large-scale community-based studies of screening for AAA by physical examination have been reported.

Ultrasound is an extremely sensitive and specific test for AAA of all sizes, at least in cases where the diagnosis and size of the aneurysm can be confirmed at surgery. Reported sensitivities range from 82% to 99%, with sensitivity approaching 100% in some series of patients with a pulsatile mass. 16 In a small proportion of patients, visualization of the aorta will be inadequate due to obesity, bowel gas, or periaortic disease. Although ultrasound screening is noninvasive and relatively simple, compliance with invitations to be screened has been variable (50-64% attendance) in community screening trials. 7,11 Diagnostic abdominal ultrasound is currently expensive in the U.S. ($100-$175 per examination), but screening for AAA alone could probably be performed much more quickly and cheaply. 2

Effectiveness of Early Detection

No prospective or retrospective controlled trials of screening for AAA that include outcome data have yet been reported. A pilot trial in England that offered screening at random to older subjects has enrolled 15,000 men and women, but it may not have sufficient power to prove a benefit in mortality. 8 The difficulty of identifying all deaths caused by AAA, combined with varying compliance with screening, may make it difficult to conduct definitive controlled trials of AAA screening. 8,20

Surgical resection and repair with an artificial graft is a very effective treatment for AAA. Among 13 large case-series of surgery for nonruptured aneurysms published since 1980, overall surgical mortality was 4% (range 1.4-6.5%); mortality during emergency surgery for rupture is much higher, averaging 49% (range 23-69%). 3 Mortality after elective surgery is often due to underlying cardiovascular disease in patients with AAA. If the patient survives the immediate postoperative period, long-term survival is comparable to similar persons without aneurysms, but late postoperative complications (graft infection, graft occlusion, and aortoenteric fistula) may result in additional deaths and morbidity. 3 The high prevalence of cardiovascular disease in patients with AAA and competing causes of morbidity and mortality in older patients may diminish the benefit of detecting asymptomatic aneurysms in older populations. Of 124 patients aged 65-80 who had large aneurysms detected in a community screening program, 27% were deemed unfit for surgery or died of other causes before surgery. 11 In recent series, up to 40% of patients undergoing surgery for nonruptured aneurysms had died within 6 years after surgery, primarily due to coronary heart disease or stroke. 3,21

Risk of elective surgery must be balanced against the risk of rupture of an untreated aneurysm, which is directly related to aneurysm size. Most vascular surgeons currently recommend surgery for asymptomatic aneurysms 5 cm or larger, since the risk of rupture (25-41% over 5 years) is substantially higher than risks from surgery. 3 While more aggressive management of smaller aneurysms (4-5 cm) has been recommended by some, 22 others have suggested that asymptomatic, slow-growing aneurysms under 6 cm can be successfully followed by serial ultrasound. 2,11 A large community-based screening program, which employed this conservative strategy over 8 years, observed two cases of rupture among 29 subjects with aneurysms 5-5.9 cm, for a rate of 1.5%/year. 11,23 A model fitted to data from 13 studies of untreated aneurysms supports a relatively low risk of rupture in aneurysms less than 6 cm; estimated annual rates of rupture for aneurysms 4-4.9, 5-5.9, 6-6.9, and over 7 cm were 1%, 3%, 9%, and 25%, respectively. 2 These data, which are based largely on incidentally detected cases, may not reflect accurately the prognosis of asymptomatic aneurysms discovered by routine ultrasound screening. Furthermore, decisions to forgo surgery in patients with larger aneurysms were likely to have been influenced by factors (e.g., age, comorbidity, lack of symptoms) that may have independently influenced the risk of rupture. Trials are currently ongoing to determine the optimal management of patients with AAA that are 4-5.4 cm in size. 24
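
The size-banded rupture estimates above can be restated compactly. The sketch below uses the model's quoted annual rates and, as a deliberate simplification, treats each rate as constant over 5 years and ignores aneurysm growth, so the projected cumulative risks are illustrative rather than the model's own outputs.

    # Annual rupture rates by aneurysm diameter, as quoted from the model
    # fitted to 13 studies of untreated aneurysms; the 5-year projection
    # assumes a constant annual rate and no growth (a simplification).

    ANNUAL_RUPTURE_RISK = {
        "4.0-4.9 cm": 0.01,
        "5.0-5.9 cm": 0.03,
        "6.0-6.9 cm": 0.09,
        ">=7.0 cm": 0.25,
    }
    ELECTIVE_SURGICAL_MORTALITY = 0.04  # ~4% average in case-series since 1980

    for band, rate in ANNUAL_RUPTURE_RISK.items():
        five_year = 1 - (1 - rate) ** 5
        print(f"{band}: {rate:.0%}/year, ~{five_year:.0%} over 5 years "
              f"(vs. ~{ELECTIVE_SURGICAL_MORTALITY:.0%} elective surgical mortality)")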

Recommendations of Other Groups

The Canadian Task Force on the Periodic Health Examination 7 concluded that there was poor evidence to include or exclude screening for AAA in the periodic health examination of asymptomatic individuals. They noted, however, that targeted physical examination may be considered prudent for men over 60, and that ultrasound screening could be considered in selected high-risk men over 60: smokers with other risk factors for AAA (hypertension, claudication, family history, or other vascular disease).

Discussion

No prospective or retrospective controlled trials of screening for AAA have yet been reported that include data on mortality or other clinical outcomes. At present, the only effective intervention available for patients with aneurysms is major abdominal surgery. Until further data are available from population-based screening trials, it is uncertain whether the projected benefit from preventing ruptured aneurysms is sufficient to justify the costs of widespread screening and the potential risks from increased surgery. While there is general consensus that resection is indicated for incidentally discovered, large aneurysms (6 cm or larger), these are relatively uncommon in the general population; the appropriate management of smaller (4-5 cm) aneurysms remains controversial. Data from older case series may not be a reliable guide to the natural history of asymptomatic aneurysms discovered by ultrasound. For many older patients with small aneurysms, the risk of dying from coronary heart disease or stroke is much higher than the risk from ruptured AAA. 21

The benefits of routine screening will depend on other parameters that merit further research: the proportion of clinically important aneurysms that are detected without screening; the sensitivity and specificity of abdominal palpation for detecting AAA in the primary care setting; risk factors for rapid growth or rupture of AAA; and long-term morbidity of patients undergoing elective surgery. Patient compliance with recommendations for follow-up or surgery will also directly influence the ability of screening to prevent ruptured aneurysms.

A recent cost-effectiveness analysis compared different screening protocols in a high-risk population of men between 60 and 80 years of age. 25 The authors concluded that a single screen for AAA by abdominal palpation might be considered cost-effective, but it would be of small clinical benefit (average increase in life expectancy of 0.002 year). A single screen with ultrasound was at the high end of the cost range that might be considered cost-effective ($41,550/year of life gained), and repeat screening was not cost-effective. They noted that, due to the variable quality of the available data, screening for AAA could prove to be very cost-effective or could actually cause a net harm. If low-cost screening ultrasound were available (vs. $150 average charge for diagnostic ultrasound in the U.S.), ultrasound screening would be much more cost-effective, and preferable to physical examination. 25

CLINICAL INTERVENTION

There is insufficient evidence to recommend for or against routine screening for abdominal aortic aneurysms with abdominal palpation or ultrasound ("C" recommendation). Recommendations against routine ultrasound screening in the general population may be made on other grounds, such as the low prevalence of clinically significant AAA and the high cost of screening. Although direct evidence that screening for AAA reduces mortality or morbidity is not available in any population, clinicians may decide to screen selected high-risk patients, due to the significant burden of disease and the availability of effective surgical treatment for large aneurysms. Men over age 60 who have other risk factors (e.g., vascular disease, family history of AAA, hypertension, or smoking) are at highest risk for AAA and death due to ruptured aneurysms. Screening is not indicated for patients who are not appropriate candidates for major abdominal surgery (e.g., those with severe cardiac or pulmonary disease). If screening is performed, it is not certain whether ultrasound or abdominal palpation is the preferred test. Abdominal palpation is less expensive but also less sensitive than ultrasound. Cost-effectiveness analysis suggests that repeat examination of individuals with a previous normal ultrasound is not indicated. 24

The draft of this chapter was prepared for the U.S. Preventive Services Task Force by Paul S. Frame, MD, and David Atkins, MD, MPH.

7. Screening for Breast Cancer

Burden of Suffering

In the U.S. in 1995, there were an estimated 182,000 new cases of breast cancer diagnosed and 46,000 deaths from this disease in women. 1 Approximately 32% of all newly diagnosed cancers in women are cancers of the breast, the most common cancer diagnosed in women. 1 The annual incidence of breast cancer increased 55% between 1950 and 1991. 2 The incidence in women during the period 1987-1991 was 110/100,000. 2 In 1992, the annual age-adjusted mortality from breast cancer was 22/100,000 women. 3 The age-adjusted mortality rate from breast cancer has been relatively stable over the period from 1930 to the present. 1,2 For women, the estimated lifetime risk of dying from breast cancer is 3.6%. 2 Breast cancer resulted in 2.2 years of potential life lost before age 65 per 1,000 women under age 65 in the U.S. during 1986-1988. 4 This rate was surpassed only by deaths resulting from motor vehicle injury and infections. Breast cancer is the leading contributor to cancer mortality in women aged 15-54, 1 although 48% of new breast cancer cases and 56% of breast cancer deaths occur in women age 65 and over. 2 As the large number of women in the "baby boom" generation age, the number of breast cancer cases and deaths will increase substantially unless age-specific incidence and mortality rates decline.

Important risk factors for breast cancer include female gender, residence in North America or northern Europe, and older age. 5 In American women, the annual incidence of breast cancer increases with age: 127 cases/100,000 for women aged 40-44; 229/100,000 for women aged 50-54; 348/100,000 for women aged 60-64; and 450/100,000 for women aged 70-74. 2 The risk for a woman with a family history of breast cancer in a first-degree relative is increased about 2-3-fold, and for women under 50 it is highest when the relative had premenopausally diagnosed breast cancer. 6-9 Women with previous breast cancer or carcinoma in situ and women with atypical hyperplasia on breast biopsy are also at significantly increased risk. 6,7,10-12 Other factors associated with increased breast cancer risk include a history of proliferative breast lesions without atypia on breast biopsy, late age at first pregnancy, nulliparity, high socioeconomic status, and a history of exposure to high-dose radiation. 6,7,10-12 Associations between breast cancer and oral contraceptives, long-term estrogen replacement therapy, obesity, and a diet high in fat have been suggested, but causal relationships have not been established. 6,7,13,14

Accuracy of Screening Tests

The three screening tests usually considered for early detection of breast cancer are clinical breast examination (CBE), x-ray mammography, and breast self-examination (BSE). Estimates of the sensitivity and specificity of these maneuvers depend on a number of factors, including the size of the lesion, the characteristics of the breast being examined, the age of the patient, the extent of follow-up to identify false negatives, the skill and experience of the examiner or radiographic interpreter, and (in the case of mammography) the quality of the mammogram. Because multiple clinical trials have demonstrated the effectiveness of screening, measures of screening test performance (such as sensitivity and specificity) are primarily helpful in comparing trials, screening programs, and community practice. Uniform definitions, however, are necessary for such comparisons. For example, different studies may use similar definitions of sensitivity, such as the number of screen-detected cancers compared to the total of screen-detected cancers plus interval cancers, but one may use a fixed interval (e.g., 12 months) 15 and another a variable interval (e.g., time to next screen), 16 making direct comparisons difficult. The ability to detect interval cancers may also vary and will affect such estimates.
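
As a concrete illustration of the definitional problem just described, the sketch below uses hypothetical counts to show how the same screening program yields different sensitivity estimates depending on how many interval cancers the chosen ascertainment window captures.

    # Program sensitivity as defined in screening-trial comparisons:
    # screen-detected cancers / (screen-detected + interval cancers).
    # All counts are hypothetical.

    def program_sensitivity(screen_detected, interval_cancers):
        return screen_detected / (screen_detected + interval_cancers)

    # A fixed 12-month window capturing 25 interval cancers:
    print(program_sensitivity(75, 25))  # 0.75
    # A longer time-to-next-screen window capturing 40 interval cancers:
    print(program_sensitivity(75, 40))  # ~0.65 -- same program, lower estimate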

A review 17 of the current clinical trial data, published and unpublished, summarized screening test performance for mammography using uniform definitions. Sensitivity of mammography did not dramatically differ across the trials. Estimates from three Swedish trials using mammography alone averaged about 75%, while estimates for mammography combined with CBE ranged from 75% in the Health Insurance Plan of Greater New York (HIP) to 88% in the Edinburgh trial and the Canadian National Breast Cancer Screening Study in women aged 50-59 (NBSS 2). Specificity estimates ranged from 98.5% in the HIP trial to 83% in the Canadian NBSS 2. Sensitivity estimates for mammography alone and for combined screening with CBE have generally been 10-15% lower for women aged 40-49 compared with women greater than age 50. 15,17-19 Preliminary results from two North American demonstration projects suggest improved sensitivity of mammography, especially for women in their forties, with current mammographic techniques. 20 Significant variations in interpreter performance have also been observed. 21-23 In the Canadian trials, agreement was about 50% beyond that attributable to chance between radiologists at five screening centers and a single reference radiologist. 21

The effectiveness of CBE alone has not been evaluated directly, but comparisons of the sensitivity and specificity of this maneuver to those of mammography can be considered. The Canadian NBSS 2 was designed to assess the incremental value of mammography above a careful, thorough (5-10 minutes) CBE. 24,25 Preliminary results showing no incremental benefit highlighted the fact that higher sensitivity (88% for mammography plus CBE vs. 63% for CBE alone) 17 may not guarantee improved effectiveness. Specificity was comparable or slightly better for CBE alone. Sensitivity of CBE for women aged 40-49 (Canadian NBSS 1) was about 10% lower at initial screen compared to the estimate for women aged 50-59 (Canadian NBSS 2). 26 Specificity estimates were similarly lower for younger women.

Data regarding the accuracy of BSE are extremely limited. One report calculated an upper limit of sensitivity ranging from 12 to 25% by assuming all interval cases in the clinical trials were detected by BSE. 17 Using a similar approach, the overall sensitivity of BSE alone was estimated to be 26% in women also screened by mammography and CBE in the Breast Cancer Detection Demonstration Project (BCDDP). 27 Estimated BSE sensitivity decreased with age, from 41% for women aged 35-39 to 21% for women aged 60-74. 27 Thus, as currently practiced, BSE appears to be a less sensitive form of screening than is CBE or mammography, and its specificity remains uncertain. The sensitivity of BSE can be improved by training, as measured by the proportion of benign lumps 28 detected on human models and artificial lumps 29 on silicone breast models, although whether this improved detection on models translates into improved personal BSE performance is unknown.

Adverse effects of screening tests are an important consideration. False-positive tests, resulting from the effort to maximize disease detection, may have negative consequences including unnecessary diagnostic tests. In the Canadian trials there were 7-10% false positives from combined screening with mammography and CBE among women aged 40-49 and 4.5-8% among those aged 50-59. 24,30 In a study of the yield of a first mammographic screening among women, half as many cancers per 1,000 first screening mammograms were diagnosed in women aged 40-49 (3/1,000) compared to women aged 50-59 (6/1,000). 31 Yet, women aged 40-49 underwent twice as many diagnostic tests per cancer detected compared to women aged 50-59 (43.9 vs. 21.9). Women aged 60-69 had a higher yield from screening, with 13 breast cancers diagnosed per 1,000 first screening mammograms and 10.2 diagnostic tests performed per cancer detected.
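
Combining the figures above yields a derived quantity that makes the age contrast plainer: diagnostic tests generated per 1,000 women screened. The inputs are the cited study values; the derived column is simple arithmetic shown for illustration and was not reported in that form.

    # First-screen yield by age band, as cited above:
    # (cancers per 1,000 first mammograms, diagnostic tests per cancer).
    cohorts = {
        "40-49": (3, 43.9),
        "50-59": (6, 21.9),
        "60-69": (13, 10.2),
    }
    for ages, (cancers_per_1000, tests_per_cancer) in cohorts.items():
        tests_per_1000 = cancers_per_1000 * tests_per_cancer
        print(f"ages {ages}: {cancers_per_1000} cancers and "
              f"~{tests_per_1000:.0f} diagnostic tests per 1,000 screened")
    # The workup burden per 1,000 screened is similar across age bands
    # (~130), but the cancer yield per test roughly doubles with each band.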

Mammographic screening may also adversely affect psychological well-being. Increased anxiety about breast cancer after a false-positive mammogram has been reported both at short- and long-term follow-up in studies surveying groups of screened women. 32,33 No impact on compliance in obtaining future screening examinations was observed, however. Women who underwent a surgical biopsy as a result of a false-positive screening mammogram were more likely to report their workup as a stressful experience than were those who did not have a biopsy. 32

Excess breast cancers in populations that received doses of ionizing radiation significantly greater than those currently delivered by mammography, such as survivors of the atomic bombings in Japan 34 and patients with benign breast disease, 35 have raised concerns about the potential radiation risk from screening mammograms. There is no direct evidence of an increased risk of breast cancer from mammographic screening, however. Assuming a mean breast dose of 0.1 rad from a mammogram and extrapolating from higher doses of radiation, modeling suggests that in a group of 100,000 women receiving annual screening from ages 50 to 75, 12.9 years would be lost due to radiogenic cancers but 12,623 years would be gained through a 20% reduction in breast cancer mortality as a result of that screening. 34
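
Restating the modeling result above as a single ratio makes the magnitude of the trade-off plain. The two inputs are the cited model's outputs; the division is simple arithmetic.

    # Projected life-years per 100,000 women screened annually at ages 50-75,
    # assuming a 0.1 rad mean breast dose per mammogram and a 20% mortality
    # reduction, as quoted from the cited model.
    years_lost_radiogenic = 12.9
    years_gained_screening = 12_623

    ratio = years_gained_screening / years_lost_radiogenic
    print(f"~{ratio:,.0f} life-years gained per life-year lost to radiation")
    # -> ~979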

Fewer data are available regarding adverse effects associated with CBE and BSE. A dramatic increase in false-positives was observed after instruction in BSE in a nonrandomized controlled trial evaluating performance on human models, 28 although no increase was found in a randomized controlled trial evaluating performance on silicone breast models. 29 The latter study also assessed the impact of training on variables other than detection performance on models. Adverse effects, such as unnecessary physician visits, heightened anxiety levels, or increased radiographic and surgical procedures, were not observed. 29

Effectiveness of Early Detection

Seven randomized controlled trials 16,30,36-40 have evaluated the effectiveness of screening for breast cancer in women by either mammography alone or combined with CBE compared to no periodic screening. The age of participants at date of first invitation ranged from 40 to 74. The six trials 16,36-40 that included women aged >=50 showed a reduction in breast cancer mortality of 20-30% in the intervention group. The reduction was statistically significant in the Health Insurance Plan of Greater New York (HIP), 37 the Swedish two-county trials, 16 an overview of the Swedish trials, 40 and two meta-analyses of the trials. 41,42

The results of these six trials including women aged >=50 have convincingly demonstrated the effectiveness of mammographic screening (with or without CBE) for breast cancer in women aged 50-69. The HIP trial screened women aged 40-64 with annual CBE and two-view mammography. 37 For women who were over age 50 at the time of entry into the study, mortality from breast cancer in the intervention group was more than 50% lower than in the control group at 5 years, decreasing to a 21% difference after 18 years of follow-up. The Edinburgh trial 36 screened women aged 45-64 from 84 general medicine practices with two-view mammography and CBE on the initial screen followed by annual CBE and biennial single-view mammography. Preliminary results at seven years found a relative risk of 0.80 (95% confidence interval [CI], 0.54 to 1.17) for women aged 50 and older at entry. The results from 10-year follow-up showed little change. 17 An overview pooled the data through 1989 from the four Swedish randomized controlled trials of breast cancer screening with mammography alone. 40 All women diagnosed with breast cancer before randomization were excluded and endpoints were independently reviewed. Breast cancer mortality was reduced by about 30% for women aged 50-69 at entry using an endpoint of breast cancer as the underlying cause of death. A meta-analysis that included the most recently published results of these trials reported a 23% reduction in breast cancer mortality for women aged 50 and older. 42 A meta-analysis of European case-control studies done within screening mammography programs also reported significantly reduced breast cancer mortality among women aged 50 and older. 42

There are few data regarding the optimal periodicity of screening in this age group. Although an annual interval has been recommended by many groups, an analysis of data from the Swedish two-county study found little evidence that an annual interval would confer greater benefit than screening every 2 years for women over the age of 50. 19 This trial used mammography alone, but the reduction in breast cancer mortality was similar to that seen in the trials combining CBE with mammography. 36,37 The similar mortality reductions found in screening trials using periodicities ranging from 12 to 33 months in women aged >=50 suggest that biennial screening intervals are as effective as annual intervals. In a meta-analysis of the trials evaluating screening mammography, 42 the estimated reduction in breast cancer mortality was the same (23%) for screening intervals of 12 months and 18-33 months in women aged 50-74.

There is limited and conflicting evidence regarding the benefit of screening women aged 70-74. The Swedish two-county trial and BCDDP time series included women up to age 74 at entry, and each found a reduction of breast cancer mortality for the intervention group as a whole. 16,44 The Swedish overview, however, reported a relative risk of 0.98 (95% CI, 0.63 to 1.53) at 12-year follow-up for the age subgroup 70-74. 40 The wide confidence interval, due to small numbers in this subgroup analysis, does not preclude the possibility of a substantial benefit from screening in this age group. No clinical trials have evaluated screening in women over 74 years of age at enrollment.

Although all six trials found a benefit of screening among the total group of enrolled women who were 40-74 years at entry, 16,36-40 there is uncertainty about the effectiveness of screening women between the ages of 40 and 49. The Canadian NBSS 1 was specifically designed to address this uncertainty. 30 This trial compared combined annual mammography and CBE to an initial CBE among women aged 40-49 at entry. At 7-year follow-up, no benefit of annual screening was observed (RR = 1.36; 95% CI, 0.84 to 2.21). This Canadian trial has been the subject of much criticism. 45-47 Possible irregularities with randomization have been refuted by its investigators. 48 An independent review is planned by the National Cancer Institute of Canada to determine whether the randomization was compromised. Although mammography quality issues have also been a concern, there is little evidence to suggest that the practices were inconsistent with the standards of the other clinical trials or community practice at the time of the study. 48 In addition, improvement in mammographic quality over the course of the study period was noted by both inside and outside observers. 48 The proportion of controls receiving mammography, 26%, can be compared to available estimates of 13% in the two-county trial, 24% in the Malmo trial (35% for women 45-49), and 24% in the Stockholm trial. 38,39,49 This contamination may nevertheless have reduced the trial's ability to detect a benefit from the screening intervention. An excess of node-positive cancers detected in the intervention group raised concerns about subject randomization. 30 While this may have been the result of chance, other contributing factors suggested by the investigators include under-ascertainment secondary to lower surgical dissection rates in the control group and incomplete breast cancer ascertainment at preliminary follow-up (although these possibilities are unlikely to account for all of the observed excess). 48 Although the effect of these factors should diminish with long-term follow-up, the results are unlikely to achieve statistical significance because sample size calculations were based on an estimated 40% reduction in breast cancer mortality, which is greater than the typical reduction in mortality observed in the other six trials that included women in this age group. 30

Subgroup analyses of the other trials that included women under 50 have yielded conflicting evidence regarding the benefit of screening women aged 40-49. No benefit was observed in the Stockholm trial or in one arm of the two-county trial, while the remaining trials reported nonsignificant benefits of about 22% or more. 17,50 One meta-analysis, which pooled the results from 7-year follow-up of six published clinical trials without adjustment for variation in screening method or interval, found no reduction in breast cancer mortality for women in their forties (RR = 1.08; 95% CI, 0.85 to 1.39). 41 When the Canadian trial was excluded from the analysis, the estimate changed little (RR = 0.99; 95% CI, 0.74 to 1.32). The overview of the Swedish trials found a nonsignificant 13% reduction (RR = 0.87; 95% CI, 0.63 to 1.20) only after 8-12 years of follow-up in this age group. 40 More recent meta-analyses of published mammography trial data reported nonsignificant 8-10% reductions in breast cancer mortality in women aged 40-49. 42,43 One meta-analysis reported a significant benefit for women in this age group when unpublished data were included and the Canadian trial was excluded. 43 Longer duration of follow-up was associated with a greater reduction in mortality, although this finding may have been due to chance. 42 Thus, there is conflicting evidence from clinical trials and meta-analyses, primarily based on subgroup analyses, regarding the benefit of screening women aged 40-49. An ongoing British trial is evaluating the effectiveness of annual mammography screening in women enrolled at age 40 or 41.

A recent analysis of data by tumor size, nodal status, and stage from the BCDDP, a U.S. screening project using annual two-view mammography and CBE, suggests comparable 14-year survival rates for women 40-49 and women 50-59. 51 A similar analysis of breast cancers detected in the Swedish two-county trial confirms this finding. 52 Based on these data, time series comparisons of survival, frequency of interval cancers in the two-county trial, and subgroup analysis of available clinical trial data, some experts have suggested that annual screening intervals may be necessary to achieve a reduction in breast cancer mortality from screening for women aged 40-49. 19,20,52 In a meta-analysis of published trial results, however, the estimated mortality reduction from screening women in this age group was similar for 12- and 18-33-month screening intervals (1% and 12%, respectively). 42

There is no direct evidence that assesses the effectiveness of CBE alone compared to no screening. Modeling studies of the HIP trial estimated that two thirds of the effectiveness of the combined screening may have been a result of CBE. 53-56 The Canadian NBSS 2 was designed to test the incremental value of annual mammography over a careful annual CBE among women aged 50-59 at study entry. 24 At 7-year follow-up, there was no difference in breast cancer mortality for the group receiving combined screening compared to CBE alone (RR = 0.97; 95% CI, 0.62 to 1.52). This result suggests that thorough CBE may be as effective as mammography for screening in this age group. The confidence interval is wide, however, and substantial benefit or harm from screening is not excluded by the preliminary data. Concerns regarding the early quality of mammography of Canadian NBSS 1 also apply to this trial. 48 Long-term follow-up and additional studies are needed to confirm this apparent lack of an incremental benefit of mammography above a careful, thorough annual CBE. It is also unclear whether CBE adds benefit to screening with mammography. A meta-analysis of mammography trial results reported similar reductions in breast cancer mortality with and without the addition of CBE. 42

Evidence for effectiveness of BSE alone is also limited. In the United Kingdom Trial of Early Detection of Breast Cancer, a nonrandomized community trial, 40-50% of women living in two districts participated in BSE instruction that included a short film and a lecture by a specially trained health provider. 57 At 7-year follow-up, there was no reduction in breast cancer mortality in the BSE communities compared with the control districts. Baseline comparability of intervention and control districts, treatment variation by community, and contamination by other screening modalities were not assessed, however. A World Health Organization (WHO) population-based randomized controlled trial in Leningrad comparing formal BSE instruction to no intervention in women aged 40-64 has reported increases in physician visits, referrals for further screening tests, and excisional biopsies in the intervention group at 5-year follow-up. 58 Breast cancer patients in the two groups did not differ in number, size, or nodal status of their tumors. Completeness of endpoint assessment is a concern in this study, given the lack of a national tumor registry. Follow-up through 1999 is planned for reporting mortality results. In a case-control study of women who had been diagnosed with advanced stage (TNM III or IV) breast cancer, there was no association between disease status and self-reported BSE. 59 Proficiency in practicing BSE, however, was reported as poor by both cases and controls. For the small group of women reporting thorough BSE compared to all others, the relative risk was 0.54 (95% CI, 0.30 to 0.98). A meta-analysis of pooled data from 12 descriptive studies found that women who practiced BSE before their illness were less likely to have a tumor of 2.0 cm or more in diameter or to have evidence of extension to lymph nodes. 60 The studies from which these data were obtained, however, suffer from important design limitations and provide little information on clinical outcome (i.e., breast cancer mortality). Retrospective studies of the effectiveness of BSE have produced mixed results. 27,61-63

Recommendations of Other Groups

The American Cancer Society (ACS), 64 American College of Radiology, 65 American Medical Association, 66 American College of Obstetricians and Gynecologists (ACOG), 67 and a number of other organizations 65 recommend screening with mammography every 1-2 years and annual CBE beginning at the age of 40, and annual mammography and CBE beginning at age 50.

The American Academy of Family Physicians (AAFP) recommends CBE every 1-3 years for women aged 30-39 and annually for those aged 40 and older, and mammography annually beginning at age 50; 68 these recommendations are currently under review. The American College of Physicians (ACP) recommends screening mammography every 2 years for women aged 50-74 and recommends against mammograms for women under 50 or over 75 years and against baseline mammograms. 69 The ACP makes the same recommendations for high-risk women, unless the woman expresses great anxiety about breast cancer or insists on more intensive screening. The Canadian Task Force on the Periodic Health Examination recommends annual CBE and mammography for women aged 50-69 and recommends against mammograms in women under 50. 70 The National Cancer Institute states there is a general consensus among experts that routine mammography and CBE every 1-2 years in women aged 50 and over can reduce breast cancer mortality, and that randomized clinical trials have not shown a statistically significant reduction in mortality for women under the age of 50. 71

Organizations that presently recommend routine teaching of BSE include the AAFP, 68 ACOG, 67 and ACS. 64 The recommendations of the AAFP are currently under review.

Discussion

At this time, there is little doubt that breast cancer screening by mammography with or without CBE has the potential of reducing mortality from breast cancer for women aged 50 through about 70. The benefit derived from biennial screening appears to be quite similar to the benefit derived from annual screening. Given this similarity in effectiveness, biennial screening is likely to have the added benefit of increased cost-effectiveness. The incremental value of CBE above mammography or vice versa is uncertain, although the Canadian NBSS 2 24 suggests that careful CBE may be as effective as mammography.

Evidence does not establish a clear benefit from screening in women aged 40-49. Only the Canadian NBSS 1 30 was designed to test the effectiveness of screening in this age group, however, and none of the trials had adequate power for subgroup analysis. If screening is in fact ineffective in younger women, one possible explanation is a lower sensitivity of mammography in younger women (see Accuracy of Screening Tests). Other possibilities include suboptimal screening intervals, differential (less aggressive) treatment offered to women with mammographically detected cancer, and varying biologic characteristics of breast tumors. 52,72 The Swedish overview, 40 HIP, 37 and Edinburgh 36 trials suggest some benefit in women aged 40-49 after 8-12 years of follow-up, but it is possible that the delayed benefit is due to screening women in their fifties who entered the trials in their middle to late forties. 72a The final results of the Canadian NBSS 1 may provide important information. An ongoing British trial and a proposed trial in Europe, which will enroll women only in their early forties and compare mammography to no screening, could also clarify this issue. 73 Until further information is available, it is unclear whether any potential improvement in breast cancer mortality achieved by screening women aged 40-49 is of sufficient magnitude to justify the potential adverse effects that may occur as a result of widespread screening.

Because breast cancer incidence increases with age, the burden of suffering due to breast cancer in elderly women is substantial. In addition, there is no evidence, as there is for younger women, that the sensitivity of mammography in older women is lower than in women aged 50-69. This is an age group, moreover, in which underutilization of breast cancer screening is common. 74-76 In a decision analysis of the utility of screening women over 65 for breast cancer, screening saved lives at all ages, but the savings decreased substantially with increasing age and comorbidities. 77 In the oldest women, those aged >=85, short-term morbidity such as anxiety or discomfort from the screening may have outweighed the small benefits. Until more definitive data become available for elderly women, it is reasonable to concentrate the large effort and expense associated with screening mammography on women in the age group for which benefit has been most clearly demonstrated: those aged 50-69. Screening women aged 70 and older might be considered on an individual basis, depending on general health and other considerations (e.g., personal preference of the patient).

The age range of 50-69 years, for which mammography has been proven effective, is to a large extent based on artificial cutpoints chosen for study purposes rather than on biologic cutpoints above or below which the ratio of benefits to risks sharply decreases. Both the incidence of breast cancer and the sensitivity of mammography increase with age. Thus, it is logical to expect that women in their seventies, for whom only limited clinical trial experience is available, would also benefit from breast cancer screening. For women aged <50 years, evidence has been insufficient to establish a clear benefit from breast cancer screening. This age cutpoint may be a marker for biologic changes that occur with age, especially menopause. It is therefore plausible that women in their late forties, particularly postmenopausal women, might derive some intermediate benefit from screening. The risks and benefits of mammography and CBE may be best considered as changing on an age continuum rather than at a specific chronologic age. Guidelines for breast cancer screening should be interpreted with this in mind.

No large study has quantitated the effectiveness of breast cancer screening by either CBE or mammography for women who are at higher risk of developing breast cancer than the general population. The increased incidence of disease in high-risk women increases the positive predictive value (PPV) of screening tests used in this group. For example, in a community screening program, the PPV of mammography was increased 2-3-fold for women with a family history of breast cancer. 31 This is an especially important consideration for women under 50, in whom the benefit of screening has not been established for the general population. There may be a benefit from screening younger women in high-risk groups, but studies confirming this effect are lacking. Nevertheless, given their increased burden of suffering, screening high-risk women under age 50 may be considered on an individual basis for women who express a strong preference for such screening.
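
The dependence of predictive value on prevalence that underlies this reasoning follows directly from Bayes' theorem. In the sketch below, sensitivity and specificity are held fixed at assumed values and only prevalence varies; all numbers are illustrative and are not drawn from the cited screening program.

    # PPV from sensitivity, specificity, and prevalence (Bayes' theorem).
    def ppv(sensitivity, specificity, prevalence):
        true_pos = sensitivity * prevalence
        false_pos = (1 - specificity) * (1 - prevalence)
        return true_pos / (true_pos + false_pos)

    # At low prevalence, doubling or tripling the prevalence roughly
    # doubles or triples the PPV, mirroring the 2-3-fold increase cited above.
    for prevalence in (0.003, 0.006, 0.009):
        print(f"prevalence {prevalence:.1%}: PPV = {ppv(0.75, 0.95, prevalence):.1%}")
    # -> 4.3%, 8.3%, 12.0%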

Data regarding the effectiveness of BSE are extremely limited, and the accuracy of BSE as currently practiced appears to be considerably inferior to that of CBE and mammography. False-positive BSE, especially among younger women in whom breast cancer is uncommon, could lead to unnecessary anxiety and diagnostic evaluation, although a small randomized clinical trial 29 did not find such adverse effects. A point also worth considering is that time devoted to teaching BSE may reduce time available for prevention efforts with proven effectiveness. Given the present state of knowledge and the potential adverse effects and opportunity cost, a recommendation for or against inclusion of teaching BSE during the periodic health examination cannot be made.

CLINICAL INTERVENTION

Screening for breast cancer every 1-2 years, with mammography alone or mammography and annual clinical breast examination (CBE), is recommended for women aged 50-69 ("A" recommendation). Clinicians should refer patients to mammographers who use low-dose equipment and adhere to high standards of quality control. Such standards have recently been established by the Mammography Quality Standards Act, a federal law mandating that all mammography sites in the U.S. be accredited through a process approved by the Department of Health and Human Services. 78 There is insufficient evidence to recommend annual CBE alone for women aged 50-69 ("C" recommendation). For women aged 40-49, there is conflicting evidence of fair to good quality regarding clinical benefit from mammography with or without CBE, and insufficient evidence regarding benefit from CBE alone; therefore, recommendations for or against routine mammography or CBE cannot be made based on the current evidence ("C" recommendation). There is no evidence specifically evaluating mammography or CBE in high-risk women under age 50; recommendations for screening such women may be made on other grounds, including patient preference, high burden of suffering, and the higher PPV of screening, which would lead to fewer false positives than are likely to occur from screening women of average risk in this age group. There is limited and conflicting evidence regarding clinical benefit of mammography or CBE for women aged 70-74 and no evidence regarding benefit for women over age 75; however, recommendations for screening women aged 70 and over who have a reasonable life expectancy may be made based on other grounds, such as the high burden of suffering in this age group and the lack of evidence of differences in mammogram test characteristics in older women versus those aged 50-69 ("C" recommendation). There is insufficient evidence to recommend for or against teaching BSE in the periodic health examination ("C" recommendation).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Marisa Moore, MD, MPH, and Carolyn DiGuiseppi, MD, MPH.

8. Screening for Colorectal Cancer

Burden of Suffering

Colorectal cancer is the second most common form of cancer in the U.S. and has the second highest mortality rate, accounting for about 140,000 new cases and about 55,000 deaths each year. 1 An individual's lifetime risk of dying of colorectal cancer in the U.S. has been estimated to be 2.6%. 2 About 60% of patients with colorectal cancer have regional or distant metastases at the time of diagnosis. 1 Estimated 5-year survival is 91% in persons with localized disease, 60% in persons with regional spread, and only 6% in those with distant metastases. 2 The average patient dying of colorectal cancer loses 13 years of life. 2 In addition to the mortality associated with colorectal cancer, this disease and its treatment -- surgical resection, colostomies, chemotherapy, and radiotherapy -- can produce significant morbidity. Persons at highest risk of colorectal cancer include those with uncommon familial syndromes (i.e., hereditary polyposis and hereditary nonpolyposis colorectal cancer [HNPCC]) and persons with longstanding ulcerative colitis. 3,4 Familial syndromes are estimated to account for 6% of all colorectal cancers, 3 and various genetic mutations associated with these syndromes have been identified. 4a Other principal risk factors include a history of colorectal cancer or adenomas in a first-degree relative, a personal history of large adenomatous polyps or colorectal cancer, and a prior diagnosis of endometrial, ovarian, or breast cancer. In an analysis of two large cohorts involving over 840,000 patient-years of follow-up, a family history of colorectal cancer was associated with a significant increase in risk in younger persons (1.7-4-fold increase between ages 40 and 60), but was not associated with a significantly increased risk in persons after age 60; 4b risk was higher in persons with more than one affected relative. The absolute increase in lifetime risk in persons with a family history was modest, however: an estimated cumulative incidence of colorectal cancer by age 65 of 4% vs. 3% in persons without a family history. 4b Diets high in fat or low in fiber may also increase the risk of colorectal cancer. 3

Accuracy of Screening Tests

The principal screening tests for detecting colorectal cancer in asymptomatic persons are the digital rectal examination, fecal occult blood testing (FOBT), and sigmoidoscopy. Less frequently mentioned screening tests include barium enema and colonoscopy, which have been advocated primarily for high-risk groups. The digital rectal examination is of limited value as a screening test for colorectal cancer. The examining finger, which is only 7-8 cm long, has limited access even to the rectal mucosa, which is 11 cm in length. A negative digital rectal examination provides little reassurance that the patient is free of colorectal cancer because fewer than 10% of colorectal cancers can be palpated by the examining finger. 3

A second screening maneuver is FOBT. The reported sensitivity and specificity of FOBT for detecting colorectal cancer in asymptomatic persons are 26-92% and 90-99%, respectively (usually based on two samples from three different stool specimens), with the widely varying estimates reflecting differences in study designs. 5-10 Positive reactions on guaiac-impregnated cards, the most common form of testing, can signal the presence of bleeding from premalignant adenomas and early-stage colorectal cancers. The guaiac test can also produce false-positive results, however. The ingestion of foods containing peroxidases, 11 and gastric irritants such as salicylates and other antiinflammatory agents, 12 can produce false-positive test results for neoplasia. Nonneoplastic conditions, such as hemorrhoids, diverticulosis, and peptic ulcers, can also cause gastrointestinal bleeding. FOBT can also miss small adenomas and colorectal malignancies that bleed intermittently or not at all. 13,14 Other causes of false-negative results include heterogeneous distribution of blood in feces, 15 ascorbic acid and other antioxidants that interfere with test reagents, 16 and extended delay before testing stool samples. 17

As a result, when FOBT is performed on asymptomatic persons, the majority of positive reactions are falsely positive for neoplasia. The reported positive predictive value among asymptomatic persons over age 50 is only about 2-11% for carcinoma and 20-30% for adenomas. 5,6,9,18-20 Assuming a false-positive rate of 1-4%, a person who receives annual FOBT from age 50 to age 75 has an estimated 45% probability of receiving a false-positive result. 21 This large proportion of false-positive results is an important concern because of the discomfort, cost, and occasional complications associated with follow-up diagnostic tests, such as barium enema and colonoscopy. 22,23 Rehydration of stored slides can improve sensitivity, but this occurs at the expense of specificity. 24 In one study, rehydration improved sensitivity from 81% to 92%, but it decreased specificity from 98% to 90% and lowered positive predictive value from 6% to 2%. Due to the high false-positive rate, about one third of the entire screened population of asymptomatic patients underwent colonoscopy for abnormal FOBT results within a 13-year period. 5
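
The 45% figure quoted above follows from compounding the per-test false-positive rate over repeated screens. The sketch below assumes independent tests with a constant per-test rate; a rate near the middle of the cited 1-4% range reproduces the estimate.

    # Probability of at least one false positive over n independent tests
    # with a constant per-test false-positive rate.
    def cumulative_false_positive(per_test_rate, n_tests):
        return 1 - (1 - per_test_rate) ** n_tests

    # Annual FOBT from age 50 through age 75 is 26 tests.
    for rate in (0.01, 0.023, 0.04):
        p = cumulative_false_positive(rate, 26)
        print(f"per-test rate {rate:.1%}: {p:.0%} chance of >=1 false positive")
    # -> 23%, 45%, 65%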

Other tests have been proposed to improve the accuracy of screening for fecal occult blood. Current evidence is equivocal as to whether HemoQuant (SmithKline Diagnostics, Sunnyvale, CA), a quantitative measurement of hemoglobin in the stool, has better sensitivity or specificity than does qualitative FOBT. 9,10,25-29 Recently developed hemoglobin immunoassays offer the promise of improved sensitivity and specificity but require further evaluation before being considered for routine screening. 30,31

The third screening test for colorectal cancer is sigmoidoscopy. Sigmoidoscopic screening in asymptomatic persons detects 1-4 cancers per 1,000 examinations. 32,33 However, the sensitivity and diagnostic yield of sigmoidoscopy screening vary with the type of instrument: the rigid (25 cm) sigmoidoscope, the short (35 cm) flexible fiberoptic sigmoidoscope, or the long (60 cm) flexible fiberoptic sigmoidoscope. Since only 30% of colorectal cancers occur in the distal 20 cm of bowel, and less than half occur in or distal to the sigmoid colon, 34-37 the length of the sigmoidoscope has a direct effect on case detection. The rigid sigmoidoscope, which has an average depth of insertion of about 20 cm 38-44 and allows examination to just above the rectosigmoid junction, 45 can detect only about 25-30% of colorectal cancers. The 35-cm flexible sigmoidoscope, however, can visualize about 50-75% of the sigmoid colon and can detect about 50-55% of polyps. Longer 60-cm instruments have an average depth of insertion of 40-50 cm, reaching the proximal end of the sigmoid colon in 80% of examinations, 46,47 with the capability of detecting 65-75% of polyps and 40-65% of colorectal cancers. 48-52 Researchers have examined the feasibility of introducing a 105-cm flexible sigmoidoscope in the family practice setting, 53 but it is unclear whether the added length substantially increases the rate of detection of premalignant or malignant lesions. Barium enema studies have confirmed that some neoplasms within reach of the sigmoidoscope may not be seen on endoscopy. 54
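
The reach-versus-yield relation described above can be restated compactly. The fractions below are midpoints of the cited ranges and are illustrative only; actual case detection also depends on each examination's sensitivity for lesions within its reach.

    # Approximate share of all colorectal cancers lying within reach of
    # each instrument, using midpoints of the ranges cited above.
    scope_reach = {
        "rigid sigmoidoscope (~20 cm insertion)": 0.27,  # cited 25-30%
        "60-cm flexible sigmoidoscope": 0.52,            # cited 40-65%
        "colonoscope (entire colon)": 0.95,              # cited ~95%
    }
    for scope, fraction in scope_reach.items():
        print(f"{scope}: ~{fraction:.0%} of colorectal cancers within reach")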

Sigmoidoscopy can also produce false-positive results, primarily by detecting polyps that are unlikely to become malignant during the patient's lifetime. Autopsy studies have shown that 10-33% of older adults have colonic polyps at death, 55 but only 2-3% have colorectal cancer. 56-58 Depending on the type of adenomatous polyp, an estimated 5-40% eventually become malignant, 59 a process that takes an average of 10-15 years. 60,61 It follows that the majority of asymptomatic persons with colonic polyps discovered on routine sigmoidoscopic examination will not develop clinically significant malignancy during their lifetime. For these persons, the interventions that typically follow such a discovery (i.e., biopsy, polypectomy, frequent colonoscopy) -- procedures that are costly, anxiety provoking, and potentially harmful -- are unlikely to be of significant clinical benefit.

Other potential screening tests for colorectal cancer include colonoscopy and barium enema, which appear to have comparable accuracy. About 95% of colorectal cancers are within reach of the colonoscope, and the examination has an estimated 75-95% sensitivity in detecting lesions within its reach. 20,21 Colonoscopy, which requires sedation and often involves the use of a hospital suite, is more expensive than other screening tests and has a higher risk of anesthetic and procedural complications. The estimated sensitivity and specificity of air-contrast barium enema in detecting lesions within its reach are about 80-95% and 90%, respectively, using subsequent diagnosis as a reference standard. 21 Some retrospective studies have reported a higher sensitivity of barium enema for detecting colorectal cancer (about 90-96%), 62,63 with pathologic diagnosis as the reference standard, but these estimates generally do not account for the selection bias introduced by the case-selection methods.

Effectiveness of Early Detection

Persons with early-stage colorectal cancer at the time of diagnosis appear to have longer survival than do persons with advanced disease. 2 Since there is little information on the extent to which lead-time and length biases (see Chapter ii) account for these differences, researchers in the U.S. and Europe launched large clinical trials in the late 1970s to collect prospective data on the effects of screening on colorectal cancer mortality.

Two of these trials 5,6 examined the effect of routine FOBT on colorectal cancer mortality. A randomized controlled trial involving over 46,000 volunteers over age 50 found that the 13-year cumulative mortality from colorectal cancer was 33% lower among persons advised to undergo annual FOBT (5.9 deaths per 1,000) than among a control group that was not offered screening (8.8 deaths per 1,000). 5 The report provided insufficient data, however, to determine to what extent observed differences in outcome were attributable to FOBT or to the large number of colonoscopies that were performed due to frequent false-positive FOBT. An analysis of the study data by other authors suggested that one third to one half of the mortality reduction was due to "chance" selection of persons for colonoscopy, 64 but the assumptions in the analysis have been disputed by the authors. 65 Another controlled trial, 6 which was not randomized, assigned over 21,000 patients to a control group that received a standard periodic health examination or to a study group that was also offered FOBT; both groups received sigmoidoscopy screening. Among new patients (first visit to the preventive medicine clinic), colorectal cancer mortality was 43% lower in the study group than in controls, a difference of borderline statistical significance (p = 0.053, one-tail), and there was no difference in outcomes among patients seen previously at the clinic. Recent case-control studies have also reported a 31-57% reduction in risk among persons receiving FOBT. 66,67 Three large clinical trials of FOBT screening, currently under way in Europe, are expected to report their results in the coming years. 7,68,69

Recent case-control studies have provided important information on the effectiveness of sigmoidoscopy screening. The largest study found that 9% of persons who died of colorectal cancer occurring within 20 cm of the anus had previously undergone a rigid sigmoidoscopic examination, whereas 24% of persons in the control group had received the test. 70 The adjusted odds ratio of 0.41 (95% confidence interval, 0.25-0.69) suggested that sigmoidoscopy screening reduced the risk of death by 59% for cancers within reach of the sigmoidoscope. The investigators noted that the adjusted odds ratio for patients who died of more proximal colon cancers was 0.96. This finding added support to the hypothesis that the reduced risk of death from cancers within reach of the rigid sigmoidoscope was due to screening rather than to confounding factors. Another case-control study reported that the odds ratio for dying of colorectal cancer was 0.21 in screened subjects, and the benefit appeared to be limited to cancers within the reach of the sigmoidoscope. 71

Older evidence of the effectiveness of sigmoidoscopy screening suffered from important design limitations. A randomized controlled trial of multiphasic health examinations, which included rigid sigmoidoscopy, reported that the study group had significantly lower incidence and mortality rates from colorectal cancer. 72-74 A subsequent analysis of the data, however, revealed that the proportion of subjects receiving sigmoidoscopy and the rate of detection or removal of polyps were similar in both the study and control groups, thus suggesting little benefit from sigmoidoscopy. 75 Two large screening programs found that persons receiving periodic rigid sigmoidoscopy had less advanced disease and better survival from colon cancer than was typical of the general population. 76-78 However, both studies lacked internal controls and used nonrandomized methods to select participants; other methodologic problems with these investigations are outlined in other reviews. 75,79

An important consideration in assessing the effectiveness of sigmoidoscopic screening is the potential iatrogenic risk associated with the procedure. Complications from sigmoidoscopy are relatively rare in asymptomatic persons but can be serious. Perforations are reported to occur in approximately 1 of 1,000-10,000 rigid sigmoidoscopic examinations. 20,21,32,80 Although there are fewer data available on flexible sigmoidoscopy, the complication rate appears to be less than or equal to that observed for rigid sigmoidoscopy. The reported risk of perforation from colonoscopy is about 1 in 500-3,000 examinations, 5,81 and the risk of serious bleeding is 1 in 1,000. 5 The estimated risk of perforation during barium enema is 1 in 5,000-10,000 examinations. 82

There is little useful evidence regarding the effectiveness of colonoscopy or barium enema screening in asymptomatic persons. Several recent studies describe colonoscopy screening of asymptomatic persons, but they report only the anatomic distribution of polyps and do not address clinical outcomes. 48,49,83 A prospective study demonstrated a significantly lower incidence of subsequent colorectal cancer in patients with previously diagnosed adenomas who received periodic colonoscopy and polypectomy, but potential biases in the control groups (historical controls and population incidence rates) prevent definitive conclusions. 84 No studies have directly examined the effectiveness of routine barium enema screening in decreasing colorectal cancer mortality in asymptomatic persons. Modeling studies suggest its effectiveness might be comparable to a screening strategy of periodic sigmoidoscopy. 21

There is limited information on the optimal age to begin or end screening and the frequency with which it should be performed. The age groups in which screening has been shown to decrease mortality are ages 50-80 for FOBT 5 and over age 45 for sigmoidoscopy. 70 Theoretically, the potential yield from screening should increase beyond age 50, since the incidence of colorectal cancer after this age doubles every 7 years. 2 Modeling studies suggest that beginning screening at age 40 rather than at age 50 offers no improvement in life expectancy. 21 There is little evidence from which to determine the proper age for discontinuing screening. The optimal interval for screening is less certain for sigmoidoscopy than for FOBT, for which there is good evidence of benefit from annual screening. A modeling study of sigmoidoscopy screening estimated that an interval of 10 years would preserve 90% of the effectiveness of annual screening; this model assumes that adenomatous polyps take 10-14 years to become invasive cancers. 21 Another model suggested that an interval of 2-4 years would allow detection of 95% of all polyps greater than 13 mm in diameter. 85 In a case-control study, the risk reduction associated with sigmoidoscopy screening did not diminish during the first 9-10 years after sigmoidoscopy. 70 Other studies suggest that a single sigmoidoscopic screening examination may be adequate for low-risk individuals, 86 an approach being investigated in the United Kingdom. 87

Primary preventive measures against colorectal cancer are currently under investigation. An association between colorectal cancer and dietary intake of fat and fiber has been demonstrated in a series of epidemiologic studies (see Chapter 56). Case-control and cohort studies also suggest that aspirin use may decrease the risk of colon cancer. 88-90

Recommendations of Other Groups

The American Cancer Society recommends annual digital rectal examination for all adults beginning at age 40, annual FOBT beginning at age 50, and sigmoidoscopy every 3-5 years beginning at age 50. 91 Similar recommendations have been issued by the American Gastroenterological Association, 92 the American Society for Gastrointestinal Endoscopy, 92 and the American College of Obstetricians and Gynecologists. 93 The American College of Physicians' (ACP) guidelines, revised in 1995, recommend offering a variety of screening options to persons from age 50 to 70, depending on local resources and patient preferences: flexible sigmoidoscopy, colonoscopy, or air-contrast barium enema, repeated at 10-year intervals. The ACP recommends that annual FOBT be offered to persons who decline these screening tests, but concluded that there was relatively little benefit of continuing endoscopic screening beyond age 70 in individuals who had been adequately screened up to that age. 21 The American College of Radiology recommends screening with barium enema every 3-5 years as an equivalent alternative to periodic sigmoidoscopy. 94 The recommendations of the American Academy of Family Physicians are currently under review. 95 Most organizations recommend more intensive screening of those in high-risk groups (e.g., familial polyposis, inflammatory bowel disease) with periodic colonoscopy or barium enema. The Canadian Task Force on the Periodic Health Examination concluded that there was insufficient evidence to support screening of asymptomatic individuals over age 40 but that persons with a history of cancer family syndrome should be screened with colonoscopy. 96 An expert panel convened by the Agency for Health Care Policy and Research is expected to issue guidelines for colorectal cancer screening and surveillance in 1996.

Discussion

In summary, recent studies have provided compelling evidence of the effectiveness of FOBT and sigmoidoscopy screening, but the evidence is not definitive. At least one randomized controlled trial and several observational studies have shown that annual FOBT in persons over age 50 can reduce colorectal cancer mortality. This evidence does not, however, clarify whether the observed benefits were due to FOBT or to the effect of performing colonoscopy on a large proportion of the screened population. For sigmoidoscopy, a case-control study supports a strong association between regular screening and reduced mortality from cancers within reach of the sigmoidoscope. This study was limited, however, by its small number of cases, potential selection biases, and inability to provide prospective evidence of benefit. There are additional concerns about the adverse effects, costs, and optimal frequency of screening. Studies that will help resolve these uncertainties are currently in progress; the final results of ongoing European FOBT trials will be unavailable for several years, however, and a large United States study 97 of FOBT and sigmoidoscopy screening will not be completed until the turn of the century.

An important limitation to the effectiveness of screening for colorectal cancer is the ability of patients and clinicians to comply with testing. Patients may not comply with FOBT for a variety of reasons, 68,98 but compliance rates are generally higher than for sigmoidoscopy. Recent clinical trials report compliance rates of 50-80% for FOBT among volunteers, 5-7,68,69 but lower rates (about 15-30%) have been reported in community screening programs. 99-101 Although the introduction of flexible fiberoptic instruments has made sigmoidoscopy more acceptable to patients, 102 the procedure remains uncomfortable, embarrassing, and expensive, and many patients may therefore be reluctant to agree to this test. A survey of patients over age 50 found that only 13% wanted to receive a sigmoidoscopy examination after being advised that they should receive the test; the most common reasons cited for declining the test were cost (31%), discomfort (12%), and fear (9%). 103 In a study in which sigmoidoscopy was recommended repeatedly, only 31% of participants consented to the procedure, 72-74 but this study was performed during years when rigid sigmoidoscopy was common. Compliance rates as low as 6-12% have been reported. Studies suggest that physician motivation is a major determinant of patient compliance, 104,105 and physicians may be reluctant to perform screening sigmoidoscopy on asymptomatic persons. It has been estimated that a typical family physician with 3,000 active patients (one third aged 50 or older) would have to perform five sigmoidoscopies daily to screen the population initially and two daily procedures for subsequent screening (see the sketch below). 33 In addition, examinations using 60-cm sigmoidoscopes are more time-consuming 35-39 and require more extensive training 106-108 than do those using shorter instruments.
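The workload estimate cited above can be reconstructed roughly as follows. The sketch assumes the initial screening backlog is spread over about 200 working days and that rescreening occurs every 2 years over 250 working days per year; these scheduling assumptions are illustrative, not figures from the original study.

```python
# Rough reconstruction of the practice-workload estimate described above.
# The working-day counts and the rescreening interval are assumptions made
# for illustration; only the practice size and age mix come from the text.
active_patients = 3000
eligible = active_patients // 3      # one third aged 50 or older -> 1,000 patients

initial_days = 200                   # assumed days to work through the initial backlog
per_day_initial = eligible / initial_days    # 5.0 sigmoidoscopies per day

rescreen_years = 2                   # assumed rescreening interval
working_days = 250                   # assumed working days per year
per_day_ongoing = eligible / (rescreen_years * working_days)  # 2.0 per day

print(f"Initial screening: {per_day_initial:.0f} procedures per day")
print(f"Subsequent screening: {per_day_ongoing:.0f} procedures per day")
```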

Another limitation to screening is its cost. Although a formal cost-effectiveness analysis of screening for colorectal cancer is beyond the scope of this chapter, the economic implications associated with the widespread performance of FOBT and sigmoidoscopy are clearly significant. A single flexible sigmoidoscopic examination costs between $100 and $200. 22,109 A policy of routine FOBT and sigmoidoscopic screening of all persons in the United States over age 50 (about 63 million persons) would cost over $1 billion per year in direct charges. 109 Others have calculated that FOBT screening alone could cost the United States and Canada between $500 million and $1.2 billion each year. 110,111 Another model predicted that performing annual FOBT on persons over age 65 would cost about $35,000 per year of life saved; adding flexible sigmoidoscopy would increase the cost to about $42,000 to $45,000 per year of life saved. 112 Mathematical models suggest that barium enema screening every 3-5 years might have comparable or superior cost-effectiveness when compared with sigmoidoscopy screening, but neither the clinical effectiveness nor the acceptability of barium enema screening has been demonstrated directly in clinical studies.

The downstream effects of screening are also of concern. The logistical difficulties and costs of performing FOBT and sigmoidoscopy on a large proportion of the U.S. population are significant, given the limited acceptability of the tests and the expense of screening and follow-up. Moreover, the tests have potential adverse effects that must be considered, such as false-positive results that lead to expensive and potentially harmful diagnostic procedures. Studies that have reported reduced mortality from FOBT used rehydrated slides to increase sensitivity, thereby producing a higher proportion of false-positive results than with nonrehydrated slides; 32% of the annually screened population underwent colonoscopy during a 13-year follow-up period. 65 If this rate is extrapolated to the 63 million Americans over age 50 who would receive annual FOBT, about 20 million persons would require colonoscopy.
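The extrapolation in the preceding paragraph is simple to verify; the sketch below applies the trial's 13-year colonoscopy rate to the U.S. population figure quoted above.

```python
# Applies the 32% cumulative colonoscopy rate observed under annual
# rehydrated-slide FOBT to the 63 million Americans over age 50.
population_over_50 = 63_000_000
colonoscopy_rate = 0.32   # fraction undergoing colonoscopy over 13 years of screening

projected = population_over_50 * colonoscopy_rate
print(f"Projected colonoscopies: about {projected / 1e6:.0f} million persons")
```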

The full implications of this "screening cascade" need to be considered, along with the scientific evidence of clinical benefits, before reaching conclusions about appropriate public policy. For example, using nonrehydrated rather than rehydrated slides could substantially reduce the adverse effects and costs of a national screening program. As noted earlier, data from a major screening trial suggest that this substitution could increase the positive predictive value of FOBT from 2% to 6%, subjecting far fewer screened persons to unnecessary colonoscopy. This improvement in specificity, however, comes at the expense of sensitivity, which decreased from 92% with rehydrated slides to 81% with nonrehydrated slides. The use of nonrehydrated slides would therefore allow a much larger proportion of persons with cancer to escape detection.
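A brief calculation shows how this trade-off plays out per positive test and per cancer. The sensitivities and predictive values are those quoted above; expressing PPV as work-ups per cancer detected (1/PPV) is the only step added here.

```python
# Illustrates the rehydrated vs. nonrehydrated trade-off using the
# sensitivity and PPV figures quoted in the text.
slides = {
    "rehydrated":    {"sensitivity": 0.92, "ppv": 0.02},
    "nonrehydrated": {"sensitivity": 0.81, "ppv": 0.06},
}

for name, s in slides.items():
    missed = (1 - s["sensitivity"]) * 1000   # cancers missed per 1,000 cancers present
    workups = 1 / s["ppv"]                   # positive tests (colonoscopies) per cancer found
    print(f"{name}: misses {missed:.0f} of every 1,000 cancers; "
          f"about {workups:.0f} work-ups per cancer detected")
```

Rehydration thus finds more cancers, but at the cost of roughly three times as many diagnostic work-ups per cancer detected.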

The special considerations that apply to persons at increased risk of colorectal cancer are complicated by inadequate epidemiologic and effectiveness data and inconsistent disease classifications. Having a single family member with colorectal cancer does not carry the high risk associated with hereditary cancer syndromes (e.g., familial polyposis, HNPCC). 4,113 A family history suggestive of the latter includes a pattern of diagnoses consistent with autosomal dominant inheritance of a highly penetrant disorder. Characteristic features include a family history of colorectal cancer diagnosed at an early age, frequent cases of multiple primary cancers, or florid adenomatous colonic polyps. Performing periodic colonoscopy to screen for cancer in these groups may be justified in light of the high risk of disease and the incidence of proximal colonic lesions, but there is no direct evidence to determine the optimal strategy in this population.

CLINICAL INTERVENTION

Screening for colorectal cancer is recommended for all persons aged 50 or over ("B" recommendation). Effective methods include FOBT and sigmoidoscopy. There is insufficient evidence to determine which of these screening methods is preferable or whether the combination of FOBT and sigmoidoscopy produces greater benefits than either test alone. Although there is good evidence to support FOBT on an annual basis, there is insufficient evidence to recommend a periodicity for sigmoidoscopy screening. A frequency of every 3-5 years has been recommended by other groups on the basis of expert opinion, and a well-designed case-control study suggests that protection remains unchanged for at least 10 years after rigid sigmoidoscopy. Current evidence suggests that at least some of the benefits of FOBT in reducing colorectal cancer mortality may be achieved through colonoscopic evaluation of abnormal results. Widespread FOBT or sigmoidoscopy screening is therefore likely to generate substantial direct and indirect costs. Appropriate public policy may require consideration of factors other than the scientific evidence of clinical benefit. The appropriate age to discontinue screening has not been determined.

Patients who are offered these tests should receive information about the potential benefits and harms of the procedures, the probability of false-positive results, and the nature of the tests that will be performed if an abnormality is detected. FOBT screening should adhere to current guidelines for dietary restrictions, sample collection, and storage. Although slide rehydration increases the sensitivity of FOBT, it also decreases specificity, and there is insufficient evidence to determine whether rehydration results in better outcomes than screening with nonrehydrated slides. Sigmoidoscopy should be performed by a trained examiner. The instrument should be selected on the basis of examiner expertise and patient comfort. Longer (e.g., 60-cm instrument) flexible sigmoidoscopes have greater sensitivity and are more comfortable than shorter, rigid sigmoidoscopes.

There is insufficient evidence to recommend for or against routine screening with digital rectal examination, barium enema, or colonoscopy ("C" recommendation). Recommendations against using these tests for screening average-risk persons may be made on other grounds (e.g., availability of alternate tests of proven effectiveness, inaccuracy of digital rectal examination, costs and risks of colonoscopy).

In persons with a single first-degree relative with colon cancer, it is not clear that the modest increase in the absolute risk of cancer justifies routine use of colonoscopy over other screening methods. The increased risk of developing cancer at younger ages may justify beginning screening before age 50 in persons with a positive family history, however, especially when affected relatives developed colorectal cancer at younger ages. Direct evidence of the benefit of screening in younger persons is not available for any group. For persons with a family history of hereditary syndromes associated with a very high risk of colon cancer (i.e., familial polyposis or HNPCC), as well as those previously diagnosed with ulcerative colitis, high-risk adenomatous polyps, or colon cancer, regular endoscopic screening is part of routine diagnosis and management; referral to specialists is appropriate for these high-risk patients.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Steven H. Woolf, MD, MPH.

9. Screening for Cervical Cancer

Burden of Suffering

Approximately 16,000 new cases of cervical cancer are diagnosed each year, and about 4,800 women die from this disease annually. 1 The lifetime risk of dying from cervical cancer in the U.S. is 0.3%. 1a Although the 5-year survival rate is about 90% for persons with localized cervical cancer, it is considerably lower (about 14%) for persons with advanced (Stage IV) disease. The incidence of invasive cervical cancer has decreased significantly over the last 40 years, due in large part to organized early detection programs. Although all sexually active women are at risk for cervical cancer, the disease is more common among women of low socioeconomic status, those with a history of multiple sex partners or early onset of sexual intercourse, and smokers. The incidence of invasive cervical cancer among young white women has increased recently in the United States. Infection with human immunodeficiency virus (HIV) and certain types of human papilloma virus (HPV) also increases the risk of cervical cancer. 2

Accuracy of Screening Tests

The principal screening test for cervical cancer is the Pap smear. Although the Pap smear can sometimes detect endometrial, vaginal, and other cancers, 3,4 its use as a screening test is intended for the early detection of cervical dysplasia and cancer. Other proposed cervical screening tests include cervicography, colposcopy, and testing for HPV infection. The role of pelvic examination, which usually accompanies the collection of the cervical specimen, is discussed in Chapter 14 in relation to ovarian cancer screening.

Precise data on the sensitivity and specificity of the Pap smear in detecting cancer and dysplasia are lacking due to methodologic problems. Depending on study design, false-negative rates of 1-80% have been reported; a range of 20-45% has been quoted most frequently, primarily in studies comparing normal test results with subsequent smears. 5-11 Studies using cone biopsy results as the reference standard have reported false-negative rates as low as 10%. 12 Although reliable data are lacking, specificity is probably greater than 90% 13 and may be as high as 99%. 6,11 The detection of precursor cervical intraepithelial neoplasia (CIN) by Pap smears may have poor specificity for cervical carcinoma, however, because a substantial proportion of CIN-1 lesions do not progress to invasive disease or may regress spontaneously. The test-retest reliability of Pap smears is influenced to some extent by variations in the expertise and procedures of different cytopathology laboratories.

A large proportion of diagnostic errors may be attributable to laboratory error. In one study of over 300 laboratories given slides with known cytologic diagnoses, false-negative diagnoses were made in 7.5% of smears with moderate dysplasia or frank malignancy, and false-positive diagnoses were made in 8.9% of smears with no more than benign atypia. 14 A survey of 73 laboratories in one state revealed a false-negative rate of 4.4% and a false-positive rate of 2.7%. 15 These data were reported in 1990, before the introduction of federal legislation designed to improve the accuracy of cytopathologic laboratory interpretation. 16 With the adoption of the Bethesda system for classification of cervical diagnoses, 17 a large proportion of benign smears are interpreted as "atypical," a finding that poses little premalignant potential but that often generates intensive follow-up testing.

Another cause of false-negative Pap smears is poor specimen collection technique. A 1991 survey of 600 laboratories found that 1-5% of specimens received were either unsatisfactory or suboptimal, generally because endocervical cells were absent from the smear. 18 Another study found that poor sampling technique accounted for 64% of false-negative results. 19 The Pap smear has traditionally been obtained with a spatula, to sample the ectocervix, and a cotton swab, to obtain endocervical cells. A 1990 survey found that about half of physicians used a spatula and cotton swab to collect Pap smears. 20 In recent years, new devices have been introduced to improve sampling of the squamocolumnar junction. Controlled studies have shown that using an endocervical brush in combination with a spatula is more likely to collect endocervical cells than using a spatula or cotton swab. 21-30 There is conflicting evidence, however, as to whether the endocervical brush increases the detection rate for abnormal smears or affects clinical outcomes. 31-33 There is also conflicting evidence regarding the importance of collecting endocervical cells. Although some large series have reported that CIN is detected more than twice as frequently when endocervical cells are present, 34,35 other series 36,37 have shown no association between the presence of endocervical cells and the detection rate for dysplasia. The brush is more expensive than the cotton swab, but studies suggest that this cost is easily recovered by the reduced need for repeat testing. 38 Other methods for improving the sensitivity of cervical cancer screening, such as acetic acid washes to improve the visibility of lesions, remain investigational. 39,40

There are important potential adverse effects associated with inaccurate interpretation of Pap smears. False-negative results are significant because CIN or more invasive lesions may escape detection and progress to more advanced disease during the period between tests. The potential adverse effects of false-positive results include patient anxiety regarding the risk of cervical cancer, 41,42 as well as the unnecessary inconvenience, discomfort, and expense of follow-up diagnostic procedures. Studies have shown that the distribution of patient education materials that explain the meaning of abnormal results is associated with a reduction in patient anxiety and stress and a better patient understanding of test results. 43-45

Other tests, such as cervicography and colposcopy, have been proposed to help improve the sensitivity of screening, 46 but their accuracy and technical requirements are suboptimal. Cervicography, in which a photograph of the cervix is examined for atypical lesions, has a sensitivity that is comparable to the Pap smear (approximately 60%) but a much lower specificity (approximately 50%); the reported positive predictive value in most studies is only 1-26%, and about 10-15% of cervigrams are unsatisfactory. 47-51 Colposcopy, in which the cervix is examined under magnification with acetic acid washing and suspicious lesions are biopsied, is widely performed on women with abnormal Pap smears but has poor sensitivity (34-43%), specificity (68%), and positive predictive value (4-13%) when used as a screening test for cervical neoplasia in asymptomatic women. 52-54 Other disadvantages of colposcopy screening include its cost, the limited availability of the equipment, the time and skills required to perform the procedure, and patient discomfort. Using a 10-point pain score, one study reported that women who underwent colposcopy gave the procedure scores ranging from 3 to 4.6. 55

Another proposed screening strategy is testing for HPV infection, a known risk factor for cervical cancer. Of the more than 70 types of HPV that have been identified, several oncogenic forms (e.g., types 16 and 18) have a strong epidemiologic association with cervical cancer. However, the natural history of how HPV infection progresses to cancer is poorly understood. 56 One study of women infected with either HPV type 16 or 18 found that 67% of the lesions remained unchanged or regressed after a mean of 5 years, 29% progressed to a more advanced stage of dysplasia, and 3% recurred. 57 The high prevalence of HPV infection in young women also limits its predictive value. In one study, nearly half of female college students had evidence of HPV when tested by polymerase chain reaction technology. 58 The reported positive predictive value of this HPV test for CIN-2 or CIN-3 lesions and carcinoma is less than 10%. 59 HPV typing to identify women with oncogenic strains may improve the future accuracy of the test and its role in directing follow-up, but its current suitability for routine screening in asymptomatic women is limited by its poor predictive value, uncertain natural history, and, due to the absence of an effective treatment, the lack of evidence that screening affects clinical outcomes. 60

Effectiveness of Early Detection

Early detection of cervical neoplasia provides an opportunity to prevent or delay progression to invasive cancer by performing clinical interventions such as colposcopy, conization, cryocautery, laser vaporization, loop electrosurgical excision, and, when necessary, hysterectomy. 61 There is evidence that early detection through routine Pap testing and treatment of precursor CIN can lower mortality from cervical cancer. Correlational studies in the United States, Canada, and several European countries comparing cervical cancer data over time have shown dramatic reductions in the incidence of invasive disease and a 20-60% reduction in cervical cancer mortality rates following the implementation of cervical screening programs. 62-70 Case-control studies have shown a strong negative association between screening and invasive disease, also suggesting that screening is protective. 71-75 These observational studies do not constitute direct evidence that screening was responsible for the findings, 76 and randomized controlled trials to provide such evidence have not been performed. Nonetheless, the large body of supportive evidence accumulated to date has prompted the adoption of routine cervical cancer screening in many countries and makes performance of a controlled trial of Pap smears unlikely for ethical reasons.

Observational data suggest that the effectiveness of cervical cancer screening increases when Pap testing is performed more frequently. 72 Aggressive dysplastic and premalignant lesions are less likely to escape detection when the interval between smears is short. There are, however, diminishing returns as frequency is increased. 71,77 Although studies have shown that reducing the interval between Pap smears from 10 years to 5 years is likely to achieve a significant reduction in the risk of invasive cervical cancer, case-control studies and mathematical modeling have demonstrated that shortening the interval further to 2-3 years offers only slight added benefit. 71,78-80 There is little evidence that women who receive annual screening are at significantly lower risk for invasive cervical cancer than are women who are tested every 3-5 years. These findings were confirmed in a major study of eight cervical cancer screening programs in Europe and Canada involving over 1.8 million women. 81 According to this report, the cumulative incidence of invasive cervical cancer was reduced 64.1% when the interval between Pap tests was 10 years, 83.6% at 5 years, 90.8% at 3 years, 92.5% at 2 years, and 93.5% at 1 year. These estimates were for women aged 35-64 who had at least one screening before age 35, and they are based on the assumption of 100% compliance.

Recommendations of Other Groups

A consensus recommendation that all women who are or have been sexually active, or who have reached age 18, should have annual Pap smears has been adopted by the American Cancer Society, National Cancer Institute, American College of Obstetricians and Gynecologists (ACOG), American Medical Association, American Academy of Family Physicians (AAFP), and others. 82 The recommendation permits Pap testing less frequently after three or more annual smears have been normal, at the discretion of the physician. Guidelines for determining frequency based on risk factors have been issued by ACOG. 83 The consensus did not recommend an age to discontinue Pap testing. The AAFP recommends that screening can be discontinued at age 65 if there is documented evidence of previously negative smears, but its recommendations are currently under review. 84 The American College of Physicians (ACP) recommends Pap smears every 3 years for women aged 20-65, and every 2 years for women at high risk. 85 The ACP also recommends screening women aged 66-75 every 3 years if not screened in the 10 years before age 66. The Canadian Task Force on the Periodic Health Examination recommends screening for cervical cancer with annual Pap smears in women following initiation of sexual activity or age 18, and after two normal smears, screening every 3 years to age 69. 86 The Canadian Task Force recommends considering more frequent screening for women at increased risk. In their guidelines for adolescent preventive services (GAPS), the American Medical Association recommends annual screening with a Pap test for female adolescents who are sexually active or age 18 or older. 87 Bright Futures also recommends annual Pap testing for sexually active adolescent females. 88 Similar recommendations have been endorsed by the American Academy of Pediatrics. 89

Discussion

It has been estimated that screening women aged 20-64 every 3 years with Pap testing reduces cumulative incidence of invasive cervical cancer by 91%, requires about 15 tests per woman, and yields 96 cases for every 100,000 Pap smears. Annual screening reduces incidence by 93%, but requires 45 tests and yields only 33 cases for every 100,000 tests. 81 Empirical data also support the effectiveness of a 3-year interval. A study of 25,000 Dutch women found that screening a stable population every 3 years reduced the incidence of squamous cell carcinoma of the cervix from 0.38 per 1,000 to zero within 12 years. 67 There are, in addition, important economic considerations to performing Pap tests every 2-3 years, since annual testing could double or triple the total number of smears taken on over 92 million American women at risk, 90 yet provide only limited added benefit in lowering mortality. 81
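The diminishing returns of annual testing can be summarized directly from these figures; the sketch below uses only the numbers quoted in the paragraph above.

```python
# Compares 3-year vs. annual Pap screening (women aged 20-64) using the
# tests-per-woman and incidence-reduction figures quoted in the text.
every_3_years = {"tests_per_woman": 15, "incidence_reduction": 0.91}
annual = {"tests_per_woman": 45, "incidence_reduction": 0.93}

extra_tests = annual["tests_per_woman"] - every_3_years["tests_per_woman"]
extra_benefit = annual["incidence_reduction"] - every_3_years["incidence_reduction"]

print(f"Annual screening requires {extra_tests} additional tests per woman")
print(f"for only {extra_benefit:.0%} further reduction in cumulative incidence")
```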

Annual testing, however, has been common. In the mid-1980s, a survey of recently trained gynecologists found that 97% recommended a Pap test at least once a year. 91 The preference of many clinicians for performing annual Pap smears is based on concerns that less frequent testing may result in more harm than good, but reliable scientific data to support these opinions are lacking. Specifically, advocates of annual testing have expressed concerns that data demonstrating little added value to annual testing are based on retrospective studies and mathematical models that are subject to biases and invalid assumptions; that an interval longer than 1 year may permit aggressive, rapidly growing cancers to escape early detection; that the public may obtain Pap smears at a lower frequency than that publicized in recommendations; that a longer interval might affect compliance among high-risk women, a group with poor coverage even with an annual testing policy; that repeated testing may offset the false-negative rate of the Pap smear; that the test is inexpensive and safe; and that a large proportion of women believe it is important to have an annual Pap test and, while visiting the clinician, may receive other preventive interventions. Definitive evidence to support these concerns is lacking.

Women who have never engaged in sexual intercourse are not at risk for cervical cancer and therefore do not require screening. 92-94 In addition, screening of women who have only recently become sexually active (e.g., adolescents) is likely to have low yield. The incidence of invasive cancer in women under age 25 is only about 1-3 per 100,000, a rate that is much lower than that of older age groups. 11 One study found that most women with CIN who had become sexually active at age 18 were not diagnosed with severe dysplasia or carcinoma in situ until age 30. 93

Although invasive cervical cancer is uncommon at young ages, authorities have recommended since the early 1980s that screening should begin with the onset of sexual activity. 82,92,94 This policy is based in part on the concern that a proportion of young women with CIN may have an aggressive cell type that can progress rapidly and go undetected if screening is delayed to a later age. There is some evidence that adenocarcinomas are accounting for a growing proportion of new cervical cancer cases in young women, 95,96 but the exact incidence and natural history of aggressive disease in young women remain uncertain. The Pap smear is also a poor screening test for adenocarcinoma, compared with squamous cell carcinoma. Another reason given for early screening is the concern that the incidence of cervical dysplasia occurring in young women appears to be on the rise, coincident with the increasing sexual activity of adolescents. On these grounds, testing should begin by age 18, since many American teenagers are sexually active by this age. Screening in the absence of a history of sexual intercourse may be justified if the credibility of the sexual history is in question.

When screening is initiated, it is frequently recommended that the first two to three smears be obtained 1 year apart as a means of detecting aggressive tumors at a young age. There is little evidence to suggest, however, that young women whose first two tests are separated by 2 or 3 years, rather than 1 year, have a greater mortality or person-years of life lost. 78 Recommendations to perform these first tests annually are based primarily on expert opinion.

Elderly women do not appear to benefit from Pap testing if repeated cervical smears have consistently been normal. 97,78 Modeling data suggest that continued testing of previously screened women reduces the risk of dying from cervical cancer by only 0.18% at age 65 and 0.06% at age 74. 80 Many older women have had incomplete screening, however. A reported 17% of women over age 65 and 32% of poor women in this age group have never received a Pap test. 98 In a study of elderly minority women with an average age of 75 years, the mean reported number of Pap smears received since age 65 was 1.7. 99 Further screening in this group of older women is important, 78,100 and some studies suggest that it is cost-effective. 101 Women who have undergone a hysterectomy in which the cervix was removed do not benefit from Pap testing, unless the hysterectomy was performed because of cervical cancer. Post-hysterectomy screening has the potential to detect vaginal cancer, but the yield and predictive value are likely to be very low. Women whose hysterectomies left the cervix intact probably still require screening.

The effectiveness of cervical cancer screening is more likely to be improved by extending testing to women who are not currently being screened and by improving the accuracy of Pap smears than by efforts to increase the frequency of testing. Studies suggest that those at greatest risk for cervical cancer are the very women least likely to have access to testing. 102,103 Incomplete Pap testing is most common among blacks, the poor, uninsured persons, the elderly, and persons living in rural areas. 98,104-106 In addition, many women who are tested receive inaccurate results due to interpretative or reporting errors by cytopathology laboratories or specimen collection errors by clinicians. The failure of some physicians to provide adequate follow-up for abnormal Pap smears is another source of delay in the management of cervical dysplasia. 107 Finally, a large proportion of patients with abnormal smears (30% in studies of poor, elderly black women 108 ) do not return for further evaluation. Various techniques may enhance physician and patient compliance with screening, follow-up of abnormal results, and patient compliance with rescreening. 109-112

CLINICAL INTERVENTION

Regular Pap tests are recommended for all women who are or have been sexually active and who have a cervix ("A" recommendation). Testing should begin at the age when the woman first engages in sexual intercourse. Adolescents whose sexual history is thought to be unreliable should be presumed to be sexually active at age 18. There is little evidence that annual screening achieves better outcomes than screening every 3 years. Pap tests should be performed at least every 3 years ("B" recommendation). The interval for each patient should be recommended by the physician based on risk factors (e.g., early onset of sexual intercourse, a history of multiple sex partners, low socioeconomic status). (Women infected with human immunodeficiency virus require more frequent screening according to established guidelines. 113 ) There is insufficient evidence to recommend for or against an upper age limit for Pap testing, but recommendations can be made on other grounds to discontinue regular testing after age 65 in women who have had regular previous screening in which the smears have been consistently normal ("C" recommendation). Women who have undergone a hysterectomy in which the cervix was removed do not require Pap testing, unless the hysterectomy was performed because of cervical cancer or its precursors. Patients at increased risk because of unprotected sexual activity or multiple sex partners should receive appropriate counseling about sexual practices (see Chapter 62).

The use of an endocervical brush increases the likelihood of obtaining endocervical cells, but there is conflicting evidence that sampling these cells improves sensitivity in detecting cervical neoplasia. Physicians should submit specimens to laboratories that have adequate quality control measures to ensure optimal accuracy in the interpretation and reporting of results. Thorough follow-up of test results should also be ensured, including repeat testing and referral for colposcopy as indicated. Physicians should consider providing patients with a pamphlet or other written information about the meaning of abnormal smears to help ensure follow-up and minimize anxiety over false-positive results.

There is insufficient evidence to recommend for or against routine cervicography or colposcopy screening for cervical cancer in asymptomatic women, nor is there evidence to support routine screening for HPV infection ("C" recommendation). Recommendations against such screening can be made on other grounds, including poor specificity and costs.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Steven H. Woolf, MD, MPH.

10. Screening for Prostate Cancer

Burden of Suffering

Prostate cancer is the most common noncutaneous cancer in American men. 1 After lung cancer, it accounts for more cancer deaths in men than any other single cancer site. Prostate cancer accounted for an estimated 244,000 new cases and 40,400 deaths in the U.S. in 1995. 1 Risk increases with age, beginning at age 50, and is also higher among African American men. Because it is more common in older men, prostate cancer ranks 21st among cancers in years of potential life lost. 2 The age-adjusted death rate from prostate cancer increased by over 20% between 1973 and 1991. 3 The lifetime risk of dying from prostate cancer is 3.4% for American men. 3 The reported incidence of prostate cancer has increased in recent years by 6% per year, a trend attributed to increased early detection efforts. 4 Because local extension beyond the capsule of the prostate rarely produces symptoms, an estimated one third to two thirds of patients already have local extracapsular extension or distant metastases at the time of diagnosis. 5 Ten-year survival rates are 75% when the cancer is confined to the prostate, 55% for those with regional extension, and 15% for those with distant metastases. 6 The potential morbidity associated with progression of prostate cancer is also substantial, including urinary tract obstruction, bone pain, and other sequelae of metastatic disease.

Accuracy of Screening Tests

The principal screening tests for prostate cancer are the digital rectal examination (DRE), serum tumor markers (e.g., prostate-specific antigen [PSA]), and transrectal ultrasound (TRUS). The reference standard for these tests is pathologic confirmation of malignant disease in tissue obtained by biopsy or surgical resection. The sensitivity and specificity of screening tests for prostate cancer cannot be determined with certainty, however, because biopsies are generally not performed on patients with negative screening test results. False-negative results are unrecognized unless biopsies are performed for other reasons (e.g., abnormal results on another screening test, tissue obtained from transurethral prostatic resection). The resulting incomplete information about the number of true- and false-negative results makes it impossible to properly calculate sensitivity and specificity. Only the positive predictive value (PPV) -- the probability of cancer when the test is positive -- can be calculated with any confidence.
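The limitation described above can be made concrete with a schematic 2 x 2 table. In the sketch below the screen-positive cell counts are hypothetical, chosen only to illustrate which quantities are and are not calculable when screen-negative men are not biopsied.

```python
# Schematic screening 2x2 table when negatives are not biopsied.
# The two known counts are hypothetical illustrations, not study data.
true_positive = 30    # screen-positive, cancer confirmed on biopsy
false_positive = 70   # screen-positive, biopsy negative

false_negative = None  # screen-negative men are not biopsied, so cancers
true_negative = None   # missed and correct negatives are both unknown

ppv = true_positive / (true_positive + false_positive)
print(f"PPV = {ppv:.0%}")  # calculable from screen-positives alone

# Sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP) both
# depend on the unknown cells and therefore cannot be computed.
```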

Even the PPV is subject to uncertainty because of the inaccuracies of the usual reference standard. Needle biopsy, the typical reference standard used for calculating sensitivity and specificity, has limited sensitivity. One study suggested that as many as 19% of patients with an initially negative needle biopsy (but abnormal screening test results) had evidence of cancer on a second biopsy. 7 Moreover, studies vary in the extent to which the gland is sampled during needle biopsy. Recent studies, in which larger numbers of samples are obtained from multiple sections of the gland, provide a different reference standard than the more limited needle biopsies performed in older studies. These methodologic problems account for the large variation in the reported sensitivity, specificity, and PPV of prostate cancer screening tests and the current controversy over their true values.

DRE is the oldest screening test for prostate cancer. Its sensitivity is limited, however, because the examining finger can palpate only the posterior and lateral aspects of the gland. Studies suggest that 25-35% of tumors occur in portions of the prostate not accessible to the examining finger. 8 In addition, Stage A tumors, by definition, are nonpalpable. Most recent studies report that DRE has a sensitivity of 55-68% in detecting prostate cancer in asymptomatic men, 9,10 but values as low as 18-22% have also been reported in studies using different screening protocols. 11,12 The DRE also has limited specificity, producing a large proportion of false-positive results. The reported PPV in asymptomatic men is 6-33% 10,13-15 but appears to be somewhat higher when performed by urologists rather than by general practitioners. 16

Elevations in certain serum tumor markers (e.g., PSA and prostatic acid phosphatase) provide another means of screening for prostate cancer. In screening studies, a PSA value greater than 4 ng/mL has a reported sensitivity of over 80% in detecting prostate cancer in asymptomatic men, 10 although a sensitivity as low as 29% has also been reported in studies using different screening protocols. 11 Prostatic acid phosphatase has a much lower sensitivity (12-20% for Stage A and B disease) and PPV (below 5%) than PSA, 17 and its role in screening has largely been replaced by PSA. PSA elevations are not specific for prostate cancer. Benign prostatic conditions such as hypertrophy and prostatitis can produce false-positive results; about 25% of men with benign prostatic hypertrophy (BPH) and no malignancy have an elevated PSA level. 18

In most screening studies involving asymptomatic men, the reported PPV of PSA in detecting prostate cancer is 28-35%. 10,19-21 In many instances, however, other screening tests (e.g., DRE) are also positive. The PPV of PSA when DRE is negative appears to be about 20%. 22 It is unclear whether the same PPV applies when screening is performed in the general population. Participants in most screening studies are either patients seen in urology clinics or volunteers recruited from the community through advertising. Studies suggest that such volunteers have different characteristics than the general population. 23 For example, in one screening study, 53% of the volunteers had one or more symptoms of prostatism. 10 Since PPV is a function of the prevalence of disease, routine PSA testing of the general population, if it had a lower prevalence of prostate cancer than volunteers, would generate a higher proportion of false-positive results than has been reported in the literature. A significant difference in prevalence in the two populations has not, however, been demonstrated.
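The dependence of PPV on prevalence follows from Bayes' theorem, and the sketch below makes the point quantitatively. The sensitivity and specificity used here are illustrative assumptions, not measurements from the cited studies; only the direction of the effect matters.

```python
# Shows how the PPV of a fixed test falls as disease prevalence falls.
# Sensitivity and specificity are assumed values for illustration.
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

sens, spec = 0.80, 0.90   # assumed test characteristics
for prevalence in (0.10, 0.05, 0.02):
    print(f"prevalence {prevalence:.0%}: PPV = {ppv(sens, spec, prevalence):.0%}")
```

At low prevalence, PPV falls roughly in proportion to prevalence, so a screened population with less underlying disease yields proportionally more false-positive results.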

Several techniques have been proposed to enhance the specificity and PPV of the PSA test. The serum concentration of PSA appears to be influenced by tumor volume, and some investigators have suggested that PSA density (the PSA concentration divided by the gland volume as measured by TRUS) may help differentiate benign from malignant disease. 24-26 According to these studies, a PSA density greater than 0.15 ng/mL may be more predictive of cancer. Other studies suggest that the rate of change (PSA velocity), rather than the actual PSA level, is a better predictor of the presence of prostate cancer. An increase of 0.75 ng/mL or higher per year has a reported specificity of 90% and 100% in distinguishing prostate cancer from BPH and normal glands, respectively. 27 PSA values tend to increase with age, and investigators have therefore proposed age-adjusted PSA reference ranges. 28,29 Current evidence is inadequate to determine the relative superiority of any of these measures or to prove conclusively that any is superior to absolute values of PSA. 30 The most effective method to increase the PPV of PSA screening is to combine it with other screening tests. In a large screening study, the combination of an elevated PSA and abnormal DRE achieved a PPV of 49%. Even with this improved accuracy, however, combined DRE and PSA screening led to the performance of needle biopsies on 18% of the screened population, 10 raising important public policy issues (see below).

A large proportion of cancers detected by PSA screening may be latent cancers, indolent tumors that are unlikely to produce clinical symptoms or affect survival. Autopsy studies indicate that histologic evidence of prostate cancer is present in about 30% of men over age 50. The reported prevalence at autopsy among men not known to have had prostate cancer during their lifetimes is 10-42% at age 50-59, 17-38% at age 60-69, 25-66% at age 70-79, and 18-100% at age 80 and older. 31-37 Recent autopsy studies have even found evidence of carcinoma in 30% of men aged 30-49. 38 Although patients who undergo autopsy may not be entirely representative of the general population, these prevalence rates, combined with census data, 39 suggest that millions of American men have prostate cancer. Fewer than 40,000 men in the U.S. die each year from prostate cancer, however, suggesting that only a subset of cancers in the population are clinically significant. Natural history studies indicate that most prostate cancers grow slowly over a period of many years. 40 Thus, many men with early prostate cancer (especially older men) will die of other causes (e.g., coronary artery disease) before their cancer becomes clinically apparent. Because a means of distinguishing definitively between indolent and progressive cancers is not yet available, widespread screening is likely to detect a large proportion of cancers whose effect on future morbidity and mortality is uncertain.

Recent screening studies have suggested, however, that cancers detected by PSA screening may be of greater clinical importance than latent cancers found on autopsy. Studies of asymptomatic patients with nonpalpable cancers detected through PSA screening have reported extracapsular extension, poorly differentiated cell types, tumor volumes exceeding 3 mL, and metastases in 31-38% of cancers that were pathologically staged. 20,41-43 In a retrospective review of radical prostatectomies performed on patients with nonpalpable prostate cancer detected by PSA screening, 65% had a volume greater than 1 mL, and surgical margins were positive in 26% of cases. 44 In a similar series, the mean tumor volume was 7.4 mL and 30% of the tumors had penetrated the capsule. 45

The sensitivity of PSA for clinically important cancers was examined in a recent nested case-control study among 22,000 healthy physicians participating in a long-term clinical trial. 46 Archived blood samples collected at enrollment were compared for 366 men who were diagnosed clinically with prostate cancer during a 10-year follow-up period and 1,098 matched controls without cancer. PSA was elevated (>4 ng/mL) in 46% of the men who subsequently developed prostate cancer and 9% of the control group (i.e., sensitivity 46%, specificity 91%). For cancers diagnosed within the first 4 years of follow-up, the sensitivity of PSA was 87% for aggressive cancers but only 53% for nonaggressive cancers (i.e., small, well-differentiated tumors), suggesting that PSA is more sensitive for clinically important disease. Given the low incidence of aggressive prostate cancer in this study (1% over 10 years), the reported specificity of 91% would generate a PPV (10-15%) that is lower than that reported from studies using routine biopsies (28-35%). 10 Furthermore, this study could not address the central question of whether PSA would have identified aggressive cancers at a potentially curable stage.

TRUS is a third means of screening for prostate cancer, but its performance characteristics limit its usefulness as a screening test. In most studies, TRUS has a reported sensitivity of 57-68% in detecting prostate cancer in asymptomatic men. 9,10 Because TRUS cannot distinguish between benign and malignant nodules, its PPV is lower than that of PSA. Although a PPV as high as 31% has been reported for TRUS, 47 its reported PPV when other screening tests are normal is only 5-9%. 15,19 Even when cancers are detected, the size of tumors is often underestimated by TRUS. The discomfort and cost of the procedure further limit its role in screening.

Effectiveness of Early Detection

There is currently no evidence that screening for prostate cancer results in reduced morbidity or mortality, in part because few studies have prospectively examined the health outcomes of screening. A case-control study found little evidence that DRE screening prevents metastatic disease; the relative risk of metastatic prostate cancer for men with one or more screening DREs compared with men with none was 0.9 (95% confidence interval, 0.5-1.7). 48 A cohort study also reported little benefit from DRE screening, 49 but its methodologic design has been criticized. Randomized controlled trials of DRE and PSA screening, which are expected to provide more meaningful evidence than is currently available, are now under way in the U.S. and Europe. 50 The results of these studies, however, will not be available for over a decade. Therefore, recommendations for the next 10 years will depend on indirect evidence for or against effectiveness.

Indirect evidence that early detection of prostate cancer improves outcome is limited. Survival appears to be longer for persons with early-stage disease: 5-year survival is 87% for Stage A (nonpalpable) tumors, 81% for Stage B (palpable, organ-confined cancer), 64% for Stage C (local extracapsular penetration), and 30% for Stage D (metastatic). 5 Due to recent screening efforts, prostate cancer is now increasingly diagnosed at a less advanced stage. As with survival advantages observed with other cancers, however, it is not known to what extent lead-time and length biases account for differences in observed survival rates (see Chapter ii). The frequently indolent nature of prostate cancer makes length bias a particular problem in interpreting stage-specific survival data. Successful treatment of indolent tumors may give a false impression that "cure" was due to treatment. Prostate cancers detected through screening are more likely to be organ-confined than cancers detected by other means. 20 Proponents of radical prostatectomy often argue that such cancers are potentially curable by removing the gland. As already noted, however, current evidence is inadequate to determine with certainty whether these organ-confined tumors are destined to progress or affect longevity; thus the need for treatment is often unclear.

Even if the need for treatment is accepted, the effectiveness of available treatments is unproven. Stage C and Stage D disease are often incurable, and the efficacy of treatment for Stage B prostate cancer is uncertain. Currently available evidence about the effectiveness of radical prostatectomy, radiation therapy, and hormonal treatment derives largely from case-series reports without internal controls, usually involving carefully selected patients and surrogate outcome measures for monitoring progression (e.g., PSA levels). 51-55 Although men treated for organ-confined prostate cancer have a normal life expectancy, it is not clear how much their prognosis owes to treatment. The only randomized controlled trial of prostate cancer treatment, which compared radical prostatectomy with expectant management, reported no difference in cumulative survival rates over 15 years, but the study was conducted in the 1970s and suffered from several design flaws. 56,57 Randomized controlled trials to evaluate the effectiveness of current therapies for early disease are being launched in the U.S. and Europe, but results are not expected for 10-15 years. 58,59

Some observational studies suggest that survival for early-stage prostate cancer may be good even without treatment. A Swedish population-based cohort study of men with early-stage, initially untreated prostate cancer found that, after 12.5 years, 10% had died of prostate cancer and 56% had died of other causes. The 10-year disease-specific survival rate (adjusted for deaths from other causes) for the study population was 85%. Cancer-related morbidity was significant, however. Over one third of the cancers progressed through regional extension, and 17% metastasized. The patient's age and the tumor stage did not significantly influence survival rates, but tumor grade (degree of differentiation) did affect survival; the 5-year survival rate was only 29% for poorly differentiated tumors. 59-61 Critics of the study have argued that the high survival rates were due to the relatively large proportion of older men and of tumors detected incidentally during transurethral prostatic resection, and that Swedish data are not generalizable to the U.S. 22,62 Other studies have reported similar results; in one series of selected men with well- and moderately differentiated cancer and extracapsular (nonmetastatic) extension, 5- and 9-year survival rates were 88% and 70%, respectively, without treatment. 63 Reported 10-year disease-specific survival for expectant management of palpable but clinically localized prostate cancer is 84-96%. 64-66 Finally, it is unclear whether reported survival rates in these studies, in which many cancers were detected without screening, are generalizable to screen-detected cancers.

Reviewers have attempted to compare the efficacy of treatment and watchful waiting by pooling the results of uncontrolled studies. An analysis of six studies concluded that conservative management of clinically localized prostate cancer (delayed hormone therapy but no surgical or radiation therapy) was associated with a 10-year disease-specific survival rate of 87% for men with well- or moderately differentiated tumors and 34% for poorly differentiated tumors. 67 The assumptions used in the model are not universally accepted, however. 68,69 A structured literature review concluded that the median annual rates of metastatic disease and prostate cancer mortality were 1.7% and 0.9%, respectively, without treatment. 70 This study was criticized for including a large proportion of patients with well-differentiated tumors and those receiving early androgen deprivation therapy. 71 Another review concluded that the annual rates for metastasis and mortality were higher (2.5% and 1.7%, respectively), but the review was limited to patients with palpable clinically localized cancers and excluded studies of cancers found incidentally at prostatectomy. In this population, disease-specific survival was estimated to be 83% for deferred treatment, 93% for radical prostatectomy, and 74% for external radiation therapy. 72 Thus, the effectiveness of treatment when compared with watchful waiting remains uncertain.

Uncertainties about the effectiveness of treatment are important because of its potentially serious complications. Needle biopsy, the diagnostic procedure performed on about 20% of men screened with DRE and PSA, 10 is generally safe but results in infection in 0.3-5% of patients, septicemia in 0.6% of patients, and significant bleeding in 0.1% of patients. 19,73-75 The potential adverse effects of radical prostatectomy are more substantial. Although urologists at specialized centers report operative mortality rates of 0.2-0.3%, 55,76 published rates in clinical studies and national databases range between 0.7% and 2%. 6,70,77-79 An examination of Medicare claims files estimated that the 30-day mortality rate was 0.5%. 80 The reported incidence of impotence varies between 20% and 85%, 11,51,70,79,81,82 depending on definitions for impotence and whether bilateral nerve-sparing techniques are used. Other complications of prostatectomy include incontinence (2-27%), urethral stricture (10-18%), thromboembolism (10%), and permanent rectal injuries (3%). 11,51,70,77,83-87 A study of Medicare patients who underwent radical prostatectomy in the late 1980s reported a 30-day operative mortality rate of 1% and a 4-5% incidence of perioperative cardiopulmonary complications. Over 30% wore pads to control wetting, 6% underwent corrective surgery for incontinence, and 2% required the use of an indwelling catheter. Over 60% reported partial erections, and 15% underwent treatment for sexual dysfunction; 20% had dilatations or surgical procedures for strictures. 88 Studies of generally healthy and younger patients who have undergone radical prostatectomy in recent years have noted considerably fewer complications. 55

Complications of radiation therapy include death (about 0.2-0.5%), acute gastrointestinal and genitourinary complications (8-43%), chronic complications requiring surgery or prolonged hospitalization (2%), impotence (40-67%), urethral stricture (3-8%), and incontinence (1-2%). 89 Three-dimensional conformal radiotherapy, a recently introduced technique for more precise, high-dose treatment, is reported to produce acute and chronic gastrointestinal or genitourinary complications in 55-76% and 11-12% of patients, respectively. 90 Complication rates in studies of radiation therapy cannot be compared with confidence to reported complication rates for surgery because of differences in study designs and patient populations.

Recent decision analyses have combined current estimates of the benefits and harms to predict whether early treatment improves survival. A frequently cited decision analysis for men aged 60-75 concluded that, in most cases of clinically localized prostate cancer, neither surgery nor radiation therapy significantly improved life expectancy. 91 According to the model, treatment generally results in less than 1 year of improvement in quality-adjusted survival. In men over age 70, the analysis suggested that treatment was more harmful than watchful waiting. The study has been criticized because the subjects consisted largely of older men with low-volume, low-grade tumors and because the probability estimates used in the model may be incorrect. 71,92 Defenders of the study note that the data were adjusted for age and tumor grade (but not stage). Retrospective quality-of-life analyses have reported similar findings, noting that men who have undergone radical prostatectomy or radiation therapy for localized prostate cancer generally report lower quality of life than untreated men due to impaired sexual, urinary, and bowel function, even after controlling for the sexual and urinary dysfunction that is common in this age group. 93

Other decision analyses have examined whether screening itself improves survival. Although older analyses suggested a modest benefit from screening, 94,95 more recent models have reached more pessimistic conclusions when quality-of-life adjustments are incorporated. One analysis concluded that screening and treatment result in an average loss of 3.5 quality-adjusted months of life. 96 Another decision analysis concluded that one-time screening of men aged 50-70 with either DRE or PSA would increase life expectancy by 0-0.2 days and 0.6-1.6 days, respectively, but quality-adjusted life would be decreased by 1.8-7.1 days and 2.1-9.5 days, respectively, per patient screened. 97 The assumptions and calculations used in this model have also been criticized. 98 A recent analysis of annual screening after age 50 concluded that screening would result in an average loss of 0.7 quality-adjusted life-years per patient screened. 98a
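The arithmetic underlying such quality-adjusted estimates is a simple expected-value calculation. The following sketch (in Python) illustrates the logic only; the probabilities and utilities shown are purely hypothetical assumptions, not the published parameters of the models cited above.

def expected_qaly_change(p_cancer, p_benefit, years_gained, p_complication, qaly_loss):
    # Expected change in quality-adjusted life-years per man screened:
    # benefit accrues to cancers that are detected and helped by treatment;
    # harm accrues to treated men who suffer lasting complications.
    gain = p_cancer * p_benefit * years_gained
    loss = p_cancer * p_complication * qaly_loss
    return gain - loss

# Hypothetical inputs: 1% of screened men harbor an important cancer;
# treatment extends life by 2 years in 20% of them; 40% of treated men
# lose the equivalent of 1.5 quality-adjusted years to complications.
delta = expected_qaly_change(0.01, 0.20, 2.0, 0.40, 1.5)
print(f"{delta * 365:+.1f} quality-adjusted days per man screened")   # -0.7

With these inputs the expected harms slightly outweigh the expected benefits, mirroring the direction (though not necessarily the magnitude) of the published results.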

Recommendations of Other Groups

The American Cancer Society 99 recommends an annual DRE for both prostate and colorectal cancer, beginning at age 40. It recommends that the annual examination of men age 50 and older should include a serum PSA measurement and that PSA screening should begin at age 40 for African American men and those with a family history of prostate cancer. 100 Similar recommendations have been issued by the American Urological Association 101 and the American College of Radiology. 102 In 1994, the Food and Drug Administration expanded the licensure for the PSA test to include screening. 103 The Canadian Task Force on the Periodic Health Examination (CTF) recommended against the routine use of PSA or TRUS as part of the periodic health examination; while recognizing the limitations of DRE, they concluded that the evidence was not sufficient to recommend that physicians discontinue use of DRE in men aged 50-70. 104 A 1995 report by the Office of Technology Assessment concluded that research to date had not determined whether or not systematic early screening for prostate cancer with PSA or DRE would save lives, and that the choice to have screening or forgo it would depend on patient values. 105 The recommendations of the American College of Physicians and American Academy of Family Physicians are currently under review. In 1992, the American Urological Association concluded that the value of TRUS as an independent screening procedure had not been established and that TRUS should be reserved for patients with an abnormal DRE or PSA. 106

Discussion

In summary, prostate cancer is a serious public health problem in the United States, accounting for 35,000-40,000 deaths each year and substantial morbidity from disease progression and metastatic complications. Autopsy studies indicate, however, that these cases arise from a much larger population of latent prostate cancers that are present in over nine million American men. Although screening tests such as PSA have adequate sensitivity to detect clinically important cancers at an early stage, they are also likely to detect a large number of cancers of uncertain clinical significance. The natural history of prostate cancer is currently too poorly understood to determine with certainty which cancers are destined to produce clinical symptoms or affect survival, which cancers will grow aggressively, and which will remain latent. Prostate cancer has a complex biology with many unanswered questions about heterogeneity, tumor-host interactions, and prognostic stratification.

More fundamentally, there is no evidence to determine whether or not early detection and treatment improve survival. For men with well- and moderately differentiated disease, treatment appears to offer little benefit over expectant management, whereas the most aggressive tumors may have spread beyond the prostate by the time they are detected by screening. Observed survival advantages for men with early-stage disease may be due to length bias and other statistical artifacts rather than an actual improvement in clinical outcome. Although it is possible that treatment is beneficial for an unknown proportion of men with early prostate cancer, definitive evidence regarding effectiveness will not be available for over a decade, when ongoing randomized controlled trials are completed. In the interim years, during which thousands of deaths from prostate cancer are predicted, screening might be justified for its potential benefit were it not for its potential harms. Widespread screening will subject many men to anxiety from abnormal test results and the discomfort of prostate biopsies; aggressive treatment for screen-detected cancers will expose thousands of men to the risks of incontinence, impotence, death, and other sequelae without clear evidence of benefit. Decision-analysis models suggest that the negative impact of these complications on quality of life may outweigh the potential benefits of treatment, but the designs and assumptions of these models are controversial. The absence of proof that screening can reduce mortality from prostate cancer, together with the clear potential that screening will increase treatment-related morbidity, argues against a policy of routine screening in asymptomatic men.

The economic implications of widespread prostate screening, although not a principal argument against its appropriateness, also warrant attention. A full discussion of the cost effectiveness of prostate screening is beyond the scope of this chapter. Moreover, cost effectiveness cannot be properly determined without evidence of clinical effectiveness. Nonetheless, it is clear that routine screening of the 28 million American men over age 50, 39 as recommended by some groups, would be costly. Researchers have predicted that the first year of mass screening would cost the country $12-28 billion. 6,11 This investment might be worthwhile if the morbidity and mortality of prostate cancer could be reduced through early detection -- given certain assumptions, prostate cancer screening might even achieve cost-benefit ratios comparable to breast cancer screening 107 -- but there is currently little evidence to support these assumptions. The cost of this form of screening, with its emphasis on older men, is likely to increase in the future with the advancing age of the United States population: the number of American men over age 55 is expected to nearly double in the next 30 years, from 23 million in 1994 to 44 million by 2020. 39
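The per-man cost implied by these projections follows directly from the figures above. A minimal sketch (in Python), dividing the projected first-year cost range evenly across the 28 million men; the even division is a simplifying assumption for illustration only:

men_over_50 = 28_000_000           # American men over age 50 (figure above)
low_cost, high_cost = 12e9, 28e9   # projected first-year cost range in dollars

print(f"${low_cost / men_over_50:,.0f} to ${high_cost / men_over_50:,.0f} per man screened")
# -> $429 to $1,000 per man screened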

There is some evidence that the recent increase in prostate screening may be generating a poorly controlled expansion in the performance of radical prostatectomies, creating unnecessary iatrogenic morbidity in a growing population of surgical patients. The rising incidence of prostate cancer due to increased screening has been accompanied by a tripling in rates for radical prostatectomy in the U.S. 4 If early detection and treatment are effective, they are most likely to benefit men under age 70 rather than older men. As already noted, 10-year survival for early-stage prostate cancer approaches 90%. Thus, most men over age 70, who face a life expectancy of just over 10 years, are more likely to die of other causes than of prostate cancer. Subjecting these men to the risks of biopsy and treatment is often unwarranted, and many proponents of prostate screening therefore recommend against screening after age 70. Nonetheless, studies indicate that radical prostatectomy rates for men aged 70-79 increased 4-fold between 1984 and 1990, and the trend appears to be continuing in this decade. Population-based rates for prostatectomy in men aged 70-79, many of whom are unlikely to benefit from the procedure, appear to be the same as in men aged 60-69. 78 According to an American College of Surgeons survey, one out of three men undergoing radical prostatectomy in 1990 was age 70 or older. 79

The lack of evidence regarding the benefits of prostate screening and the considerable risks of adverse effects make it important for clinicians to inform patients who express an interest in screening about the consequences of testing before they consent to screening. Although such counseling is proper for all forms of screening, the need for informed consent is especially important for prostate cancer screening because of current uncertainty about its effectiveness and because the proper choice for an individual is highly dependent on personal preferences. Screening is more likely to be chosen by men with strong fears of prostate cancer and by those who can accept the risks of incontinence, impotence, and other treatment complications. Screening is less likely to be chosen by men who are skeptical of the risks of cancer and the effectiveness of treatment and who have strong fears that treatment complications will jeopardize their quality of life.

CLINICAL INTERVENTION

Routine screening for prostate cancer with DRE, serum tumor markers (e.g., PSA), or TRUS is not recommended ("D" recommendation). Patients who request screening should be given objective information about the potential benefits and harms of early detection and treatment. Patient education materials that review this information are available. 108 If screening is to be performed, the best-evaluated approach is to screen with DRE and PSA and to limit screening to men with a life expectancy greater than 10 years. There is currently insufficient evidence to determine the need and optimal interval for repeat screening or whether PSA thresholds must be adjusted for density, velocity, or age.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Steven H. Woolf, MD, MPH. See also the relevant background paper: U.S. Preventive Services Task Force. Screening for prostate cancer: commentary on the recommendations of the Canadian Task Force on the Periodic Health Examination. Am J Prev Med 1994;10:187-193.

11. Screening for Lung Cancer

Burden of Suffering

Cancer of the lung is the leading cause of death from cancer in both men and women in the U.S. An estimated 172,000 new cases will be diagnosed in 1995, with an estimated 153,000 deaths. 1 Lung cancer has one of the poorest prognoses of all cancers, with a 5-year survival rate of less than 13%. 1 Important risk factors for lung cancer include tobacco use and certain environmental carcinogen exposures. Tobacco is associated with 87% of all cases of cancer of the lung, trachea, and bronchus. 2

Accuracy of Screening Tests

The chest radiograph (x-ray) and sputum cytomorphologic examination (cytology) lack sufficient accuracy to be used in routine screening of asymptomatic persons. The accuracy of the chest x-ray is limited by the capabilities of the technology and observer variation among radiologists. Suboptimal technique, insufficient exposure, and poor positioning and cooperation of the patient can obscure pulmonary nodules or introduce artifacts. 3 Radiologists frequently disagree on the interpretation of chest x-rays (interobserver variability). In one study, over 40% of these disagreements were considered potentially significant. 4 Most errors are false-negative interpretations, and pulmonary and hilar masses are among the most commonly missed diagnoses. From 10% to 20% of the incorrect radiologic diagnoses or indeterminate results require follow-up testing for clarification. 4 Interpretation of chest x-rays by primary care physicians is less accurate than interpretation by radiologists. Discrepancies were identified in 58% of chest x-rays read by both family physicians and radiologists. 5 Current radiographic technologies require greater than 20 doublings of tumor size to reach the 1-cm³ volume at the lower limit of chest imaging sensitivity. By the time lung cancer is suspected on chest x-ray, micrometastatic dissemination has often occurred, limiting the effectiveness of early detection. 6
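The doubling arithmetic behind this detection threshold can be made explicit. The sketch below (in Python) assumes a starting cell diameter of about 10 microns, which is an illustrative assumption; under it, roughly 30 volume doublings separate a single malignant cell from a 1-cm³ tumor, consistent with the "greater than 20 doublings" figure above.

import math

cell_diameter_cm = 10e-4    # ~10-micron diameter of a single cell (assumed)
volume = (math.pi / 6) * cell_diameter_cm ** 3   # spherical volume in cm^3

doublings = 0
while volume < 1.0:         # ~1 cm^3, the imaging detection threshold above
    volume *= 2             # each doubling doubles tumor volume
    doublings += 1
print(f"~{doublings} volume doublings from a single cell to 1 cm^3")   # ~31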

Furthermore, the yield of screening chest radiography is low, largely due to the low prevalence of lung cancer in asymptomatic individuals, even those at high risk. Of the initial 31,360 screening x-rays of asymptomatic smokers in the National Cancer Institute (NCI) Cooperative Early Lung Cancer Detection Program, 256 (0.82%) were interpreted as "suspicious for cancer," and only 121 (0.39% of those screened) were diagnosed with lung cancer. 7 Other studies have confirmed the low yield of chest x-rays in asymptomatic persons. 8,9
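These figures translate directly into rates per person screened and per suspicious film. A minimal sketch of the arithmetic (in Python), treating the 121 cancers as arising from the 256 suspicious x-rays, as the report implies:

screened = 31360      # initial screening x-rays in the NCI program
suspicious = 256      # films read as "suspicious for cancer"
cancers = 121         # lung cancers ultimately diagnosed

print(f"Suspicious films: {suspicious / screened:.2%}")               # 0.82%
print(f"Cancers per person screened: {cancers / screened:.2%}")       # 0.39%
print(f"Cancers per suspicious film: {cancers / suspicious:.0%}")     # 47%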

Sputum cytology is an even less effective screening test, largely due to its low sensitivity compared to chest x-ray. 6 Of the 160 lung cancers detected by dual screening in the NCI study, 123 (77%) would have been detected by chest x-ray alone and 67 (42%) would have been detected by cytologic examination alone. 7 The majority of incident cases detected in subsequent screenings were detected by chest x-ray. 10 In other trials using dual screening, sensitivity of chest x-ray ranges from 40% to 50%, versus 10% to 20% for sputum cytology. 11 Mass screening for lung cancer with tests that lack high sensitivity is inefficient. 12

Effectiveness of Early Detection

Lung cancer is usually asymptomatic until it has reached an advanced stage, when the treatment outcome is poor. Five-year survival for all stages combined is 11-14%; for Stage I disease it is 42-47%. 1 Under optimal conditions, survival can be higher. 10,12,13 Early detection of Stage I cases through screening might be expected to improve survival, but the small amount of available evidence does not show that screening reduces lung cancer mortality.

The efficacy of chest radiographic screening for lung cancer was first investigated in the 1960s. A controlled prospective study involving over 55,000 persons found that those receiving chest x-rays every 6 months had a larger proportion of resectable tumors, but mortality for lung cancer remained the same when compared with controls who received examinations only at the beginning and end of the trial. 14 Similar findings were reported in the Philadelphia Pulmonary Neoplasm Research Project 15 and, more recently, in a case-control study. 16 In addition, the results of one of the three centers participating in the NCI Cooperative Early Lung Cancer Detection Program provide indirect evidence of the limited efficacy of radiographic screening. In this study, persons receiving chest x-rays and sputum cytology every 4 months had the same lung cancer mortality as persons advised to obtain annual testing. 17

No prospective randomized study with adequate follow-up time has compared radiographic screening with no screening. A case-control study in Japan compared the screening histories of 273 fatal cases of lung cancer to 1,269 controls, and although the data suggest a trend toward a decreased risk of lung cancer mortality in those screened with chest x-rays (with or without sputum cytologic tests), the difference was not statistically significant. 18

Three large clinical trials published by the NCI Cooperative Early Lung Cancer Detection Program examined the efficacy of dual screening (chest x-ray and sputum cytology) in over 30,000 male smokers aged 45 or older. 7,10,19-23 Two trials comparing annual dual screening with annual radiographic screening tested the incremental benefit of adding sputum cytology to radiographic screening. 20,21 The third trial, which compared dual screening every 4 months with advice to receive the same tests annually, examined the benefit of frequent dual screening compared to usual medical care. 22 In each study, lung cancer mortality did not differ between experimental and control groups. Although early-stage, resectable tumors were more common and 5-year survival significantly higher in groups receiving regular dual screening, lead-time and length biases may have been responsible for these findings. A randomized prospective trial of dual screening in Czechoslovakia produced similar results. 24 The investigators found no substantial difference in the number or causes of death between study groups.

The NCI is currently conducting the multicenter PLCO (prostate, lung, colorectal, and ovarian cancers) Trial, which will compare annual chest radiographic testing with usual care in both men and women. 25

Recommendations of Other Groups

No organizations currently recommend routine screening of either the general population or of smokers for lung cancer with either chest x-rays or sputum cytology. 26-31

Discussion

Lung cancer is the leading cause of cancer mortality. Although screening may increase early detection of resectable early cancers, controlled trials have provided no evidence that lung cancer screening significantly reduces mortality from this disease. To the weakness of the evidence for screening must be added the substantial costs of routine testing, 9 including false-positive results that lead to unnecessary expense and morbidity from follow-up procedures. 32 Current research and clinical trials of chemoprevention, 33 as well as research in early detection markers such as monoclonal antibodies, 6,34 may improve efficacy in the prevention or early identification of lung cancer. Primary prevention -- mainly through discouraging tobacco use -- is a more effective strategy than screening to reduce lung cancer morbidity and mortality. 11 Unless ongoing trials find a benefit of periodic chest x-rays, the cost, inconvenience, and potential harms of screening cannot be justified.

CLINICAL INTERVENTION

Routine screening of asymptomatic persons for lung cancer with chest radiography or sputum cytology is not recommended ("D" recommendation). All patients should be counseled against tobacco use (see Chapter 54).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Kathlyne Anderson, MD, MOH, and Donald M. Berwick, MD, MPP.

12. Screening for Skin Cancer --- Including Counseling to Prevent Skin Cancer

Burden of Suffering

Over 800,000 new cases of skin cancer are diagnosed each year. 1 More than 95% of these are basal cell (BCC) and squamous cell (SCC) carcinomas, also referred to as nonmelanomatous skin cancers (NMSC). These are highly treatable and rarely metastasize, but local tissue destruction may cause disfigurement or functional impairment if these tumors are not detected early. 2 They account for approximately 2,100 deaths each year. 1 The risk of NMSC is increased by a personal history of NMSC; older age; light eyes, skin, or hair; poor ability to tan; and substantial cumulative lifetime sun exposure. 3-5

Malignant melanoma (MM) is less common than NMSC but is far deadlier. An estimated 34,100 new cases and 7,200 deaths (2.2/100,000 population) from MM occurred in the U.S. in 1995. 1,6 The incidence rate varies by race: 9.2/100,000 in whites, 1.9/100,000 in Hispanics, and 0.7-1.2/100,000 in blacks and Asians. 7 In the past two decades, increases of 4%/year in MM incidence and nearly 2%/year in mortality have been reported. 6,8 With a median age at diagnosis of 53 years, 9 MM ranks second among adult-onset cancers in years of potential life lost per death. 10 Significant risk factors for MM besides white race include melanocytic precursor or marker lesions (e.g., atypical moles, certain congenital moles), increased numbers of common moles, immunosuppression, and a family or personal history of skin cancer, especially MM. 11-23 Fewer than 5% of the population have melanocytic precursor lesions, which have a high malignant potential and may account for as many as 40% of melanomas. 24 For persons with the rare familial atypical mole and melanoma (FAM-M) syndrome, the MM risk is increased 100-fold or more, 11-13 and the cumulative lifetime risk may approach 100%. 11 Persons with intermittent intense sun exposure or severe sunburns in childhood also appear to have an increased risk that varies by MM subtype. 17,19,20,25-27 Persons with poor tanning ability, freckles, or light skin, hair, and eye color may have a small increased risk of MM. 17-20,28

Accuracy of Screening Tests

The principal screening test for skin cancer is physical examination of the skin by a clinician. Detection of a suspicious lesion constitutes a positive screening test, which then should be confirmed by skin biopsy. The true sensitivity and specificity of the skin examination are unknown. 29 In virtually all studies evaluating the accuracy of the skin examination, only clinically suspicious lesions were biopsied and only screen-positive persons were followed; therefore, sensitivity and specificity cannot be determined accurately. One study of persons presenting to free skin cancer screening clinics estimated the sensitivity of examination by dermatologists, using population incidence rates to estimate false-negative rates; the resulting sensitivities were 97% for MM, 94% for BCC, and 89% for SCC. 30 Two or more risk factors for skin cancer were present in 78% of those screened, however, so sensitivities may have been overestimated. 31 Among persons with positive screening clinic examinations, the likelihood of histologic confirmation has been reported to be 40% for MM, 43% and 57% for BCC, and 14% and 75% for SCC. 30,32 For persons presenting for skin examination to skin clinics, the likelihood of histologic confirmation given a clinical diagnosis of MM is 38-64% for dermatologists and 72-84% for skin cancer specialists. 33-35 Among patients biopsied by dermatologists who had histologically confirmed MM, the diagnosis was suspected in 62-85% of cases. 34,36 In a randomized community study evaluating screening by expert dermatologists, histologic examination confirmed the clinical diagnosis of SCC in 38% of cases and of BCC in 59%. 37 In vivo epiluminescence microscopy appears to improve dermatologists' diagnostic accuracy for skin lesions, 38,39 but it is not a practical screening tool for primary care physicians.
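The indirect sensitivity estimate described above can be made explicit: because false-negative cases are never observed directly in such studies, the number of cancers expected from population incidence rates stands in for the true number of cases. A minimal sketch (in Python), with hypothetical counts chosen only to reproduce the 97% MM figure above:

# False negatives are not observed directly, so the number of cancers
# expected from population incidence rates serves as the denominator.
detected_mm = 29       # melanomas found by screening (hypothetical count)
expected_mm = 30.0     # cases predicted by incidence rates (hypothetical)

print(f"Estimated sensitivity for MM: {detected_mm / expected_mm:.0%}")   # 97%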

Primary care physicians and others lacking specialized training in dermatology would be expected to have greater difficulty in evaluating skin lesions. Several studies have reported that, compared to dermatologists, nondermatologists make significantly fewer correct diagnoses of skin lesions (including MM and BCC) from color photographs. 40-42 In one such study, at least five of six photographs of MM were correctly identified by 69% of the dermatologists but by only 12% of the nondermatologists; at least one of two atypical moles was recognized by 96% of the dermatologists but by only 42% of the nondermatologists. 40

One factor affecting the yield of screening for skin cancer is the proportion of the body surface examined. Only 20% of MM occur on normally exposed body surfaces, in contrast to 85-90% of NMSC. 9,27 Dermatologists estimate that detection of MM is 2-6 times more likely with a total-body skin examination (TSE). 43,44 A second factor that affects yield is the frequency of examination. If the interval between examinations is too long, new cancers may not be detected before they have progressed to an advanced stage. There are no published data available, however, with which to determine the optimal frequency of examination in the general population; annual or biennial intervals have been recommended on the basis of clinical judgment. Poor patient compliance with recommendations for yearly total skin examinations may reduce the effectiveness of this intervention; in one study, only 22/524 (4.2%) of patients returned for the yearly TSE that was recommended on the first visit. 45

In terms of risk to the patient, no serious adverse effects associated with TSE and follow-up biopsy have been reported, and experts view it as acceptable and safe. 33 Embarrassment may be an adverse effect, 46 because modesty is one of the main reasons given for refusing a TSE. 44 Medical expenses may also be increased because office visits must be lengthened to accommodate complete undressing, "chaperoning," examination, and redressing, 46 and because more frequent referrals and biopsies are likely to result. There are no controlled studies evaluating any adverse effects of TSE.

Patient self-examination would be expected to be less accurate than physician examination in evaluating skin lesions. One study evaluated patients' ability to apply a seven-point checklist to the skin lesion that prompted their referral to a dermatologist. 35 The patient checklist had a sensitivity of 71%, specificity of 99%, and positive predictive value of 7% for MM diagnosis, using the dermatologist's clinical diagnosis as the "gold standard." The sensitivity and specificity using histologic diagnosis as the reference standard would likely be lower. No data were found evaluating the ability of patients to detect suspicious lesions, the accuracy of periodic skin self-examination, or the efficacy of self-examination instructions in reducing errors.

Effectiveness of Early Detection

Early treatment might reduce morbidity and disfigurement for patients with BCC and SCC, 2 but no studies were found that have evaluated whether such cancers discovered by screening have a better outcome than those that present clinically.

For MM, there have also been no controlled trials evaluating the impact of screening on morbidity or mortality. A time-series study of an educational campaign to encourage MM screening by primary providers in Scotland found a trend toward a reduction in both thick tumors (p < 0.05) and mortality (not statistically tested) in women (but not men) after the campaign. 47 Women were overrepresented in the screened population, which may explain the difference in mortality by sex. No control group was included, so differences due to historical trends or other factors cannot be excluded. The authors noted that in Denmark, which has comparable incidence rates, the MM mortality in women rose during this period.

More data are available on the effect of screening by dermatologists on MM thickness. In two large case series of persons with atypical moles who were screened regularly by dermatologists, all MM detected were either thin (< 0.89 mm) or in situ. 13,15 Time series in the general population, and cohort studies in FAM-M syndrome kindreds and in persons with a prior MM, have reported that screening by dermatologists detected significantly thinner tumors when compared to historical population, kindred, or personal index cases. 47-51 Several countries have reported a consistent decline over the past 3-4 decades in median thickness of MM, although this decline has not been directly linked to screening programs. 52,53 None of these studies used concurrent unscreened controls to differentiate the effects of screening programs from historical trends or lead-time and length biases.

If clinician screening does in fact result in detection of significantly thinner MM, mortality might be reduced. Case series and a prediction model (validated on subsequent incident cases) have reported that survival is directly related to lesion thickness at the time of resection. 9,39,54-56 For example, 5-year survival is 95-99% for persons with lesions <= 0.75 mm, 66-77% for 1.51-4.0 mm, 41-51% for 4.76-9.75 mm, and 5% for those with disseminated MM. The likelihood of recurrence after resection also correlates with lesion thickness. A MM < 1 mm thick is associated with an 8-year disease-free survival rate of 90%, compared with 74% for lesions 1-2 mm thick. 57 Although it is possible that lead-time and length biases account for some of these differences, these data suggest that persons in whom thinner MM are detected experience a better outcome than those detected with more advanced disease.

Data on the effectiveness of early detection by skin self-examination are limited. Preliminary analyses from a population-based case-control study retrospectively evaluating the efficacy of skin self-examination in patients with MM suggest a protective effect of skin awareness and self-examination, 58,59 but final results from this study have not yet been published.

Primary Prevention

Primary prevention of skin cancer may involve limiting exposure to solar radiation (by limiting sun exposure, avoiding tanning facilities, and wearing protective clothing) or applying sunscreen preparations. Although the effectiveness of these maneuvers has not been evaluated in clinical trials, avoiding sun exposure or using protective clothing is likely to decrease the risk of MM and NMSC, since both types of cancer have been associated with sun exposure in numerous cohort and case-control studies. 3,4,17,19,20,25-27 Use of tanning facilities has not been directly linked to cancer risk, but skin damage after use is common. 60,61 Many adolescents report using such facilities, 61 and severe sunburns occurring at a young age may increase the risk of subsequent melanoma. 17,19,20,26,27 The principal adverse effect associated with avoiding exposure to ultraviolet and other solar radiation is failure to acquire a suntan, which may be perceived as undesirable by some. 48,62,63

The evidence that sunscreens prevent skin cancer is less clear. Sunscreen agents are formulated and tested for their ability to prevent the acute effects of solar ultraviolet radiation (i.e., sunburn). 64 Most currently available sunscreens block ultraviolet B (UVB) wavelengths, and a few block ultraviolet A (UVA) rays. 65 Only the physical sunblocks (e.g., zinc oxide, talc) block all solar rays. A randomized controlled trial evaluated the regular use of UVA- and UVB-blocking sunscreens by persons >=40 years of age with previous solar keratoses (which are precursors of SCC, although their risk of malignant transformation is low). 66 The development of solar keratoses over a 6-month period was significantly reduced, implying that the risk of SCC may also be reduced. The generalizability of the results achieved by these highly motivated volunteers is unknown, and the study did not adequately describe investigator blinding, lesion classification, or the adequacy of randomization. Studies in albino laboratory rodents have also reported that sunscreens can reduce the incidence of tumors resembling human SCC after UV radiation. 64,67-69 Animal data are more limited for MM, but a recent study in mice reported that sunscreen failed to protect against UV radiation-induced increases in melanoma incidence, although it did prevent sunburn. 70 In a fish model, both UVA and visible light, which are not blocked by many currently available sunscreens, were highly effective in inducing melanomas. 71 Several case-control and cohort studies found either no effect or a significantly increased risk of BCC 72 and MM 73,74 in sunscreen users, after adjusting their risk estimates for phenotype (e.g., hair color, tendency to sunburn). The increased risk found in several of these studies may be due to residual confounding, since in all studies adjustment for phenotype reduced the crude risk estimates. It is also possible that sunscreens may increase skin cancer risk by encouraging susceptible persons to prolong exposure of greater skin surface areas to solar rays that are not blocked by most currently used sunscreens. There is as yet no direct evidence that sunscreens prevent skin cancer in humans, but clinical trials of sunscreen in humans are unlikely to be conducted due to cost and time constraints. Sunscreens are associated with mild to moderate side effects in 1-2% of users, including contact and photocontact dermatitis, contact urticaria, and comedogenicity, although these are readily reversible when use is discontinued. 65,75,76

There are few data examining the effectiveness of counseling patients to protect themselves from sunlight. A case series evaluating counseling given at the time of removal of a skin cancer, and on a yearly basis thereafter, reported increased use of protective clothing and sunscreen and reduced deliberate tanning at 2-6-year follow-up. 77 This study included only the two-thirds of patients who complied with follow-up and was not able to determine how much of the effect seen was due to the surgery alone. There is also evidence from case series that public education can increase knowledge and beliefs about the health risks of sun exposure, 48,78 but cross-sectional surveys give conflicting results about whether knowledgeable persons act on this information. 62,63,79 Community and worksite educational interventions to reduce the risk of skin cancer, including one with a concurrent control group, have demonstrated significantly increased use of sun protection measures, such as hats, shirts, and staying in the shade, after the intervention. 80,81 Whether the results of such educational interventions can be generalized to clinician counseling is not known. No studies on the effectiveness of counseling in reducing skin cancer incidence or mortality were found.

Recommendations of Other Groups

The American Cancer Society recommends monthly skin self-examination for all adults 8 and physician skin examination every 3 years in persons 20-39 years old and annually in persons >= 40 years old. 82 The American Academy of Dermatology 2,83 and a National Institutes of Health (NIH) Consensus Panel 84 recommend regular screening visits for skin cancer and patient education concerning periodic skin self-examinations. The NIH Consensus Panel also recommended that some family members of patients with MM be enrolled in surveillance programs. 84 The Canadian Task Force on the Periodic Health Examination does not recommend for or against routine screening for skin cancer or periodic skin self-examination, but suggests that TSE for a very select subgroup of individuals at high risk (e.g., those with familial atypical mole and melanoma syndrome) may be prudent. 85 The American Academy of Family Physicians recommends complete skin examination for adolescents and adults with increased recreational or occupational exposure to sunlight, a family or personal history of skin cancer, or evidence of precursor lesions; these recommendations are under review. 86 The American Cancer Society, 8 the American Academy of Dermatology, 2,83 the American Medical Association, 87 and the NIH Consensus Panel 84 all recommend patient education concerning sun avoidance and sunscreen use. The American Academy of Family Physicians recommends skin protection from ultraviolet light for all persons with increased exposure to sunlight. 86 The Canadian Task Force recommends avoidance of sun exposure and use of protective clothing, but it does not recommend either for or against sunscreen use for the prevention of skin cancer. 85 The American Academy of Dermatology, 88 the American Medical Association, 87 the American Cancer Society, 89 and the NIH Consensus Panel 84 have recommended avoiding artificial tanning devices.

Discussion

Basal cell and squamous cell skin carcinomas are very common but are slow-growing and rarely metastasize. It is unlikely that population screening would substantially improve the already excellent outcome of persons with these tumors. The principal potential benefit of periodic skin examination lies in discovering early MM. The sensitivity and specificity of skin examination by primary physicians, and the optimal frequency of such examinations, are unknown, however. MM is, in addition, uncommon in the general population (lifetime risk of about 1.0%). 90 Since 99% of patients who would be examined annually under a policy of routine screening would never have MM, it is also important to consider the potential adverse effects as well as the cost/benefit ratio of skin cancer screening. Neither of these has been adequately evaluated. 33,39 No controlled studies have demonstrated that screening for MM by primary providers improves outcome, although a time-series study suggests a possible mortality benefit. There is thus weak evidence that screening by primary clinicians is effective in improving clinical outcome. In persons at very high risk for MM (i.e., those with melanocytic precursor or marker lesions), referral to skin cancer specialists for evaluation may be justified based on high burden of suffering, minimal adverse effects of TSE, and greater accuracy of the TSE by such specialists; however, there is no direct evidence that screening this population reduces mortality. There is currently only limited evidence of the efficacy of skin self-examination in reducing melanoma mortality, but preliminary results from a population-based case-control study appear promising.

There is fair evidence of the efficacy and safety of sun avoidance and use of protective clothing for the prevention of skin cancer, and weaker evidence to support avoiding artificial tanning devices. There is also fair evidence from one randomized controlled trial, supported by animal data, that sunscreens that block UVA and UVB rays are efficacious in preventing squamous cell cancer precursors, but data are limited on the efficacy of sunscreens in preventing skin cancer. There is also good evidence of mild, reversible adverse effects of sunscreens. Community or worksite educational interventions may increase the use of these sun protection measures, but the effectiveness of clinician counseling in modifying such behaviors is not established.

CLINICAL INTERVENTION

There is insufficient evidence to recommend for or against routine screening for skin cancer by primary care providers using total-body skin examination ("C" recommendation). Clinicians should remain alert for skin lesions with malignant features (i.e., asymmetry, border irregularity, color variability, diameter > 6 mm, or rapidly changing lesions) 84 when examining patients for other reasons, particularly patients with established risk factors. Such risk factors include clinical evidence of melanocytic precursor or marker lesions (e.g., atypical moles, certain congenital moles), large numbers of common moles, immunosuppression, a family or personal history of skin cancer, substantial cumulative lifetime sun exposure, intermittent intense sun exposure or severe sunburns in childhood, freckles, poor tanning ability, and light skin, hair, and eye color. Appropriate biopsy specimens should be taken of suspicious lesions.

Persons with melanocytic precursor or marker lesions (e.g., atypical moles [also called dysplastic nevi], certain congenital nevi, familial atypical mole and melanoma syndrome) are at substantially increased risk for MM. A recommendation to consider referring these patients to skin cancer specialists for evaluation and surveillance may be made on the grounds of patient preference or anxiety due to high burden of suffering, the greater accuracy of TSE when performed by such specialists, and the relatively limited adverse effects from TSE and follow-up skin biopsy, although evidence of benefit from such referral is lacking.

There is also insufficient evidence to recommend for or against counseling patients to perform periodic self-examination of the skin ("C" recommendation). Clinicians may wish to educate patients with established risk factors for skin cancer (see above) concerning signs and symptoms suggesting cutaneous malignancy and the possible benefits of periodic self-examination.

Avoidance of sun exposure, especially between the hours of 10:00 am and 3:00 pm, 65 and the use of protective clothing such as shirts and hats when outdoors are recommended for adults and children at increased risk of skin cancer (see above) ("B" recommendation). Counseling such patients to avoid excess sun exposure and use protective clothing is recommended, based on the established efficacy of risk reduction from sun avoidance, the potential for large health benefits, low cost, and low risk of adverse effects from such counseling, even though the effectiveness of such counseling is less well established ("C" recommendation).

There is insufficient evidence to recommend for or against counseling patients to use sunscreens to prevent skin cancer ("C" recommendation). The routine use of sunscreens that block both UVA and UVB radiation may be appropriate for persons who have previously had solar keratosis and who cannot avoid sun exposure, in order to prevent additional solar keratoses, which have a small malignant potential.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Carolyn DiGuiseppi, MD, MPH.

13. Screening for Testicular Cancer

Burden of Suffering

Testicular cancer is a relatively uncommon disease, with an overall annual incidence of about 4/100,000 men. 1 It is, however, the most common form of cancer in young men between ages 20 and 35, 2 accounting for an estimated 7,100 new cases and 370 deaths in the U.S. in 1995. 3 The peak annual incidence ranges from 8 to 14/100,000 men between 20 and 35 years of age, with a smaller peak in early childhood. 4 The incidence in black men is less than one fifth that of white men. 4 The major predisposing risk factor is cryptorchidism. 1 In men with a history of cryptorchidism, 80-85% of testicular tumors occur in the cryptorchid testicle, while 15-20% occur in the contralateral testicle. Other risk factors include previous cancer in the other testicle, a history of mumps orchitis, inguinal hernia, or hydrocele in childhood, and high socioeconomic status. 1

Ninety-six percent of testicular cancers are of germ cell origin, of which seminoma is the most common type. Prognosis and treatment depend on the cell type and stage of disease; however, recent advances in treatment have resulted in a 92% overall 5-year survival. 3 Even among the small proportion of patients (12%) with advanced disease at diagnosis, 5-year survival is close to 70%. 4

Accuracy of Screening Tests

The two screening tests proposed for testicular cancer are physician palpation of the testes and self-examination of the testes by the patient. Detection of a suspicious testicular mass constitutes a positive test, and the diagnosis is confirmed by biopsy and histologic examination of tissue. There is no information on the sensitivity, specificity, or positive predictive value of testicular examination in asymptomatic persons, whether performed by providers or by patients. Even if they were known, measures of sensitivity and specificity for palpation of the testes might not be very meaningful because of the low incidence of testicular cancer and the high cure rate. If sensitivity is defined as the probability that disease, when present, is detected at a curable stage, then sensitivity is probably high because the overall cure rate (in the absence of systematic screening) is 92%. The negative predictive value is probably also quite good due to the low incidence of the disease. The positive predictive value of palpation of the testes, however, is probably very low due to the low incidence of disease and the large number of other causes of scrotal masses.
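The effect of low prevalence on positive predictive value follows directly from Bayes' theorem. A minimal sketch (in Python), assuming a generous 95% sensitivity and specificity and using the peak annual incidence of roughly 10/100,000 quoted above as a crude stand-in for prevalence; both assumptions are illustrative only:

def ppv(prevalence, sensitivity, specificity):
    # Positive predictive value via Bayes' theorem.
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

# Generous 95% sensitivity and specificity, with the peak annual
# incidence (~10/100,000) standing in crudely for prevalence.
print(f"PPV ~ {ppv(10 / 100_000, 0.95, 0.95):.2%}")   # ~0.19%

Under these assumptions, fewer than 1 in 500 positive examinations would represent cancer.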

There is evidence from older literature that between 26% and 56% of patients presenting initially to their physician with testicular cancer are first diagnosed as having epididymitis, testicular trauma, hydrocele, or other benign disorders, 6-8 and these patients often receive treatment for these conditions before the cancer is diagnosed. 7,9,10

There have been few studies of whether counseling men to perform self-examination motivates them to adopt this practice or to perform it correctly. Research to date has demonstrated only that education about testicular cancer and self-examination may enhance knowledge and self-reported claims of performing testicular examination. 11,12 One study found that men who reviewed an educational checklist on how to perform self-examination were able to demonstrate greater skill when self-examination was performed moments later; they were also able to recall the contents of the checklist in a telephone survey months later. 13 Few studies, however, have examined whether education or self-examination instructions actually increase the performance of self-examination. It is also unclear whether persons who detect testicular abnormalities seek medical attention promptly. Patients with testicular symptoms may wait as long as several months before contacting a physician. 6

Finally, no studies have been conducted to test whether persons who perform testicular self-examination are more likely to detect early-stage tumors or have better survival than those who do not practice self-examination. 5 Published evidence that self-examination can detect testicular cancer in asymptomatic persons is limited to a small number of case reports. 14

Tumor markers, including alpha-fetoprotein and human chorionic gonadotropin, are useful in following nonseminomatous testicular cancers but are not useful for early detection or screening. 1,15

Effectiveness of Early Detection

The prognosis for advanced stages of testicular cancer has improved dramatically in the past decade with the introduction of better chemotherapy. Current cure rates are greater than 80%. 5,16 Survival, however, is still better for patients with Stage I cancer than for those with more advanced disease, and the treatment of early cancer has less cost and morbidity. Treatment for all types and stages of testicular cancer includes removal of the involved testicle. The current 5-year survival for Stage I seminoma treated with radiotherapy is 97%. 3 Stage I nonseminomatous cancers (e.g., teratoma, embryonal carcinoma, choriocarcinoma) treated with radical retroperitoneal lymph node dissection have a reported 3-5-year survival approaching 90%. 17 With the advent of cisplatin-based chemotherapeutic regimens, a 3-year survival of 90-100% has been reported. Reported survival in patients with disseminated testicular cancer, however, is lower (about 67-80%), and these persons require intensive treatment with chemotherapeutic agents that produce a variety of systemic side effects. 3,5,16

Although lead-time and length biases may account for part of the improved survival observed for persons with early-stage testicular cancer, it is likely that the prognosis is better for persons with less advanced disease. No studies have been done to determine whether screening increases the proportion of cancers diagnosed at early stages, or improves outcomes. Even without screening, 60-80% of seminomas are Stage I at diagnosis. 17 There is evidence that once testicular symptoms have appeared, diagnostic delays are associated with more advanced disease and lower survival. 6,7,18

The appropriate management and follow-up of patients with a history of an undescended testicle is controversial. 19,20 It is known that orchiopexy at puberty does not reduce malignant transformation. It is uncertain whether earlier orchiopexy prior to school age, which is now common practice, will prevent development of testicular cancer. 19 Giwercman et al. found carcinoma in situ in 2% of men with a history of cryptorchidism who had testicular biopsies. 20 They predicted 50% of these lesions would progress to invasive cancer and recommended that testicular biopsy be offered to all men with a history of cryptorchidism. Many experts recommend that intraabdominal testes should be removed. 1 The survival for patients with a history of cryptorchidism who develop testicular cancer is excellent, as it is in noncryptorchid patients. No studies have been done to evaluate the benefits of formal screening of men with a history of cryptorchidism.

Recommendations of Other Groups

The American Cancer Society recommends a cancer checkup that includes testicular examination every 3 years for men over 20 and annually for those over 40. 21 No recommendation is given for testicular self-examination. The American Academy of Family Physicians recommends a clinical testicular examination for men aged 13-39 years who have a history of cryptorchidism, orchiopexy, or testicular atrophy; this policy is currently under review. 22 The American Academy of Pediatrics recommends testes self-examination beginning at age 18 years. 23 The Canadian Task Force on the Periodic Health Examination concluded that there is insufficient evidence to include or exclude routine screening for testicular cancer by palpation in the periodic health examination. 24

Discussion

There is no direct experimental evidence on which to base a recommendation for or against screening for testicular cancer by either physician examination or patient self-examination, since no studies of screening have been done. It seems unlikely that screening would substantially improve the already favorable outcome in this uncommon disease. If a population of 100,000 men aged 15-35 years were screened with a 100% sensitive test, at most 10 cancers would be detected. At least nine of these would be expected to be cured in the absence of a formal screening program. It is unknown whether the tenth patient would also be cured as a result of the cancer being detected by screening. A primary care physician with 1,500 males in his/her practice could expect to detect one testicular cancer every 15-20 years. The vast majority of men screened by either physician or self-palpation would have normal examinations; of those with suspicious masses, most would have benign disease (false positives). Many of these cases, however, would require referral to urologists, radiographic studies, or invasive procedures (e.g., biopsy or inguinal exploration) before malignancy could be ruled out. 17 These interventions would incur considerable costs and possible morbidity.

Men with a history of undescended testes or testicular atrophy have a much greater incidence of testicular cancer. Although screening in this population has also not been shown to improve outcome, it would be expected to have a much higher yield than screening in the general population.

CLINICAL INTERVENTION

There is insufficient evidence to recommend for or against routine screening of asymptomatic men for testicular cancer by physician examination or patient self-examination ("C" recommendation). Patients with an increased risk of testicular cancer (those with a history of cryptorchidism or atrophic testes) should be informed of their increased risk of testicular cancer and counseled about the options for screening. Such patients may then elect to be screened or to perform testicular self-examination. Adolescent and young adult males should be advised to seek prompt medical attention if they notice a scrotal abnormality.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Paul S. Frame, MD.

14. Screening for Ovarian Cancer

Burden of Suffering

Ovarian cancer is the fifth leading cause of cancer deaths among U.S. women and has the highest mortality of any of the gynecologic cancers. 1 It accounted for an estimated 26,600 new cases and 14,500 deaths in 1995. 1 The lifetime risk of dying from ovarian cancer is 1.1%. 1a The overall 5-year survival rate is at least 75% if the cancer is confined to the ovaries and decreases to 17% in women diagnosed with distant metastases. 2,3 Symptoms usually do not become apparent until the tumor compresses or invades adjacent structures, ascites develops, or metastases become clinically evident. 4 As a result, two thirds of women with ovarian cancer have advanced (Stage III or IV) disease at the time of diagnosis. 2,5,6 Carcinoma of the ovary is most common in women over age 60. 7 Other important risk factors include low parity and a family history of ovarian cancer. 8-10 Less than 0.1% of women are affected by hereditary ovarian cancer syndrome, but these women may face a 40% lifetime risk of developing ovarian cancer. 11

Accuracy of Screening Tests

Potential screening tests for ovarian cancer include the bimanual pelvic examination, the Papanicolaou (Pap) smear, tumor markers, and ultrasound imaging. The pelvic examination, which can detect a variety of gynecologic disorders, is of unknown sensitivity in detecting ovarian cancer. Although pelvic examinations can occasionally detect ovarian cancer, 12,13 small, early-stage ovarian tumors are often not detected by palpation, 14,15 due to the deep anatomic location of the ovary. Thus, ovarian cancers detected by pelvic examination are generally advanced 7,16-18 and associated with poor survival. 16 The pelvic examination may also produce false positives when benign adnexal masses (e.g., functional cysts) are found. 16,19 The Pap smear may occasionally reveal malignant ovarian cells, 20 but it is not considered a valid screening test for ovarian carcinoma. 16-18,21 Studies indicate that the Pap smear has a sensitivity for ovarian cancer of only 10-30%. 16

Serum tumor markers are often elevated in women with ovarian cancer. Examples of these markers include carcinoembryonic antigen, ovarian cystadenocarcinoma antigen, lipid-associated sialic acid, NB/70K, TAG 72.3, CA15-3, and CA-125. CA-125 is elevated in 82% of women with advanced (Stage III or IV) ovarian cancer, 22 and it is also elevated, although less frequently, in women with earlier stage disease. 23 In studies of women with known or suspected ovarian cancer, the reported sensitivities of CA-125 in detecting Stage I and Stage II cancers are 29-75% and 67-100%, respectively. 24-30 These cases may not be representative of asymptomatic women in the general population, however. In screening studies, including a recent study of more than 22,000 women, the reported sensitivity was 53-85%. 13,31 Evidence is limited on whether tumor markers become elevated early enough in the natural history of occult ovarian cancer to provide adequate sensitivity for screening. Studies of stored sera have found that about one half of women who developed ovarian cancer had elevated CA-125 levels (>35 U/mL) 18 months 23 to 3 years 32 before their diagnosis. Further research is needed, however, to provide more reliable data on the sensitivity of this and other tumor markers in detecting early-stage ovarian cancer in asymptomatic women.

Tumor markers may have limited specificity. It has been reported that CA-125 is elevated in 1% of healthy women, 6-40% of women with benign masses (e.g., uterine fibroids, endometriosis, pancreatic pseudocyst, pulmonary hamartoma), and 29% of women with nongynecologic cancers (e.g., pancreas, stomach, colon, breast). 22,33 Reported specificity in screening studies is about 99%. 13,31 It may be possible to improve the specificity of CA-125 measurement by selective screening of postmenopausal women, 34 modifying the assay technique, 35 adding other tumor markers to CA-125, 36 requiring a higher concentration or persistent elevation of CA-125 levels over time, or combining CA-125 measurement with ultrasound (see below). Prospective studies involving asymptomatic women are needed, however, to provide definitive data on the performance characteristics of these techniques when used as screening tests.

Ultrasound imaging has also been evaluated as a screening test for ovarian cancer, since it is able to estimate ovarian size, detect masses as small as 1 cm, and distinguish solid lesions from cysts. 17,37 Transvaginal color-flow Doppler ultrasound can also identify vascular patterns associated with tumors. 38,39 In screening studies, the reported sensitivity and specificity of transabdominal or transvaginal ultrasound are 50-100% and 76-97%, respectively, 3,14,15,40-43 but small sample sizes, limited follow-up, and outdated techniques may limit the validity of the data. Studies have shown that routine ultrasound testing of asymptomatic women has a low yield in detecting ovarian cancer and generates a large proportion of false-positive results that often require diagnostic laparotomy or laparoscopy. In one study, ultrasound screening of 805 high-risk women led to 39 laparotomies, which revealed one ovarian carcinoma, two borderline tumors, one cancer of the cecum, and five cystadenomas. 40 A transvaginal ultrasound study of 600 patients with previous breast cancer revealed 18 patients with complex cysts or enlarged ovaries. Laparotomy was performed on 21 patients, four of whom had ovarian cancer (positive predictive value of 22%); the use of color-flow imaging appeared to increase the positive predictive value. 44

In a larger study, ultrasound was performed routinely on 5,678 asymptomatic female volunteers over age 45 or with a history of previous breast or gynecologic cancer. 44a Two Stage I ovarian cancers were detected in a total of 6,920 scans performed over 2 years. Another report from the same center indicated that 14,356 ultrasound examinations performed over 3 years on 5,489 asymptomatic women over age 45 detected five ovarian cancers. 45 Although the sensitivity and specificity of the test were excellent (100% and 94.6%, respectively), the positive predictive value in this low-risk study population was only 2.6% and follow-up was of short duration. It has been calculated from these results and other data that ultrasound screening of 100,000 women over age 45 would detect 40 cases of ovarian cancer, but at a cost of 5,398 false positives and over 160 complications from diagnostic laparoscopy. 46
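
The discrepancy between excellent test characteristics and a poor positive predictive value is a direct consequence of Bayes' rule at low disease prevalence. A rough sketch using the reported sensitivity and specificity and an assumed prevalence of about 1 cancer per 1,000 women screened (in the range implied by the study); it reproduces the order of magnitude of the reported 2.6% figure, with the exact value depending on how scans and women are counted:

    # PPV from sensitivity, specificity, and prevalence via Bayes' rule.
    # The prevalence is an assumption, not a figure from the study report.
    def ppv(sensitivity, specificity, prevalence):
        true_pos = sensitivity * prevalence
        false_pos = (1 - specificity) * (1 - prevalence)
        return true_pos / (true_pos + false_pos)

    print(f"PPV: {ppv(1.00, 0.946, 0.001):.1%}")  # about 2%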

It may be possible to improve accuracy by combining ultrasound with other screening tests, such as the measurement of CA-125. This approach has been examined as a method of discriminating between benign and malignant adnexal masses in preoperative patients. 47 Further research is needed, however, to determine the sensitivity, specificity, and positive predictive value of performing these tests in combination to screen asymptomatic women. One prospective study 12 screened 1,010 asymptomatic postmenopausal women over age 45 with pelvic examination and CA-125 measurement; those with abnormal results received an ultrasound examination. Although one ovarian cancer was detected (all three screening tests were positive in this woman), the study demonstrated poor positive predictive value with each of the three screening tests. No abnormality was discovered in 28 of the 31 women with elevated CA-125. Fibroids and benign cysts were responsible for over half of the 28 abnormal pelvic examinations. There were 13 abnormal ultrasound examinations; 12 of these women consented to laparotomy, which revealed six benign ovarian cysts, two fimbrial cysts, two women with no surgical findings, one woman with adhesions, and the ovarian cancer. A more recent report from the same center found that the combination of abdominal ultrasound and sequential CA-125 measurements had a sensitivity of 58-79%, a specificity of about 100%, and a positive predictive value of 27%. 31 Another program that screened 597 women with transvaginal color-flow Doppler ultrasound and CA-125 measurements detected abnormalities in 115 patients, only one of whom had ovarian cancer. 48

Effectiveness of Early Detection

There is no direct evidence from prospective studies that women with early-stage ovarian cancer detected through screening have lower mortality from ovarian cancer than do women with more advanced disease. A large body of indirect evidence, however, suggests that this is the case. Although lead-time and length biases may be responsible, it is known that survival from ovarian cancer is related to stage at diagnosis. The 5-year survival rate is 89% for localized disease, 36% for women with regional metastases, and 17% for women with distant metastases. 1 Studies have shown that the most important prognostic factor in patients with advanced ovarian cancer is the size of residual tumor after treatment. 4,7 Surgical debulking and chemotherapy for ovarian cancer appear to be more effective in reducing the size of residual tumor when ovarian cancer is detected early. 4 Although these observations provide suggestive evidence that early detection may be beneficial, conclusive proof will require properly conducted prospective studies comparing long-term mortality from ovarian cancer between screened and nonscreened cohorts. A large clinical trial to obtain this evidence has recently been launched by the National Cancer Institute. 49 Under the most optimistic assumptions (100% sensitivity, 30% reduction in 5-year mortality with screening, no lead-time bias), annual pelvic examinations of 40-year-old women would reduce 5-year mortality from ovarian cancer in the population by less than 0.0001%. 50 Modeling studies that have examined annual CA-125 testing or a single screening with transvaginal ultrasound and CA-125 measurement have found that either approach would increase life expectancy by an average of less than 1 day per woman screened. 51,52

Recommendations of Other Groups

There are no official recommendations to screen routinely for ovarian cancer in asymptomatic women by performing ultrasound or serum tumor marker measurements. The American College of Physicians (ACP), 53 the Canadian Task Force on the Periodic Health Examination, 54 and the American College of Obstetricians and Gynecologists 55 recommend against such screening. A National Institutes of Health Consensus Conference on Ovarian Cancer recommended taking a careful family history and performing an annual pelvic examination on all women. 56 The pelvic examination, including palpation of the adnexae, is mentioned in a recommendation on Pap testing issued by the American Cancer Society, National Cancer Institute, American College of Obstetricians and Gynecologists, American Medical Association, American Nurses Association, American Academy of Family Physicians, and the American Medical Women's Association. 57 Specifically, the pelvic examination (and Pap smear) is recommended annually for all women who are or have been sexually active or have reached age 18. Although Pap testing may be performed less frequently once three annual smears have been normal, the American Cancer Society specifies that the pelvic examination be performed with the Pap test every 1-3 years in women aged 18-40 years and annually thereafter. 58

The NIH Consensus Conference concluded that women with presumed hereditary cancer syndrome should undergo annual pelvic examinations, CA-125 measurements, and transvaginal ultrasound until childbearing is completed or at age 35, at which time prophylactic bilateral oophorectomy was recommended. 56 The ACP recommends counseling high-risk women about the potential benefits and harms of screening. 53 The Canadian Task Force on the Periodic Health Examination found insufficient evidence to recommend for or against screening for ovarian cancer in high-risk women. 54

Discussion

The sensitivity and specificity of available screening tests for ovarian cancer in asymptomatic women are uncertain and require further study. Although various tests can detect occasional asymptomatic tumors, there is currently no evidence that routine screening will improve overall health outcomes. The large majority of women with abnormal screening test results do not have cancer, yet will require invasive procedures (laparoscopy or laparotomy) to rule out malignancy. Given the risks, inconvenience, and substantial costs of follow-up testing, and the current lack of evidence that screening reduces morbidity or mortality from ovarian cancer, routine screening cannot be recommended. Trials to determine the benefits and risks of ovarian cancer screening are under way. There is also no evidence to support routine screening in women with a history of ovarian cancer in a first-degree relative. Although such women are at increased risk and stand to benefit more from interventions that reduce ovarian cancer mortality, the effectiveness of screening has yet to be determined for any group of women. Referral to a specialist may be appropriate for women whose family history suggests hereditary ovarian cancer syndrome, due to the very high risk of cancer in this disorder.

CLINICAL INTERVENTION

Screening asymptomatic women for ovarian cancer with ultrasound, the measurement of serum tumor markers, or pelvic examination is not recommended ("D" recommendation). There is insufficient evidence to recommend for or against the screening of asymptomatic women at increased risk of ovarian cancer ("C" recommendation).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Steven H. Woolf, MD, MPH, based in part on a paper prepared for the Clinical Efficacy Assessment Panel of the American College of Physicians by Karen J. Carlson, MD, et al. See relevant background paper: Carlson KJ, Skates SJ, Singer DE. Screening for ovarian cancer. Ann Intern Med 1994;121:124-132.

15. Screening for Pancreatic Cancer

Burden of Suffering

Cancer of the pancreas is the fifth leading cause of cancer deaths in the U.S., accounting for an estimated 27,000 deaths in 1995 (8.4 deaths/100,000 persons). 1,1a Worldwide, the age-adjusted incidence and mortality of pancreatic cancer have been increasing since the 1930s, 2,3 although in the U.S. these rates have declined since the early 1970s. 1,1a Incidence rates may be overestimated, because an important proportion of pancreatic cancer diagnoses (as many as half in some studies) are not histologically confirmed. 4 Pancreatic cancer is more common in men, blacks, cigarette smokers, and older persons (the majority of cases being diagnosed between ages 65 and 79). 1,2,3,5 The risk of pancreatic cancer is increased in patients with diabetes, including those with long-standing (>=5 years) diabetes. 5a Familial aggregations of pancreatic cancer are rare but have been described. 2,6

Since initial symptoms are usually nonspecific (e.g., abdominal pain and weight loss) and are frequently disregarded, some 80-90% of patients have regional and distant metastases by the time they are diagnosed. 5,7 Only 3% of the 24,000 patients annually diagnosed with pancreatic cancer live more than 5 years after diagnosis. 1,1a Of pancreatic adenocarcinomas, which account for more than 90% of all pancreatic neoplasms, 5 only about 4-16% are resectable at diagnosis, 4,8-12 and the 5-year survival rate is less than 1%. 7 In addition, 5-year survival does not indicate cure, since further decrements in survival occur after 5 years. 13,14

Accuracy of Screening Tests

Adenocarcinoma is the principal form of pancreatic neoplasm for which screening has been considered; in this chapter, "pancreatic cancer" refers to adenocarcinoma. There are no reliable screening tests for detecting pancreatic cancer in asymptomatic persons. The deep anatomic location of the pancreas makes detection of small localized tumors unlikely during the routine abdominal examination. Even in patients with confirmed pancreatic cancer, an abdominal mass is palpable in only 15-25% of cases. 4,5,15,16

Imaging procedures such as magnetic resonance imaging and computed tomography are too costly to use as routine screening tests, while more accurate tests such as endoscopic retrograde cholangiopancreatography and endoscopic ultrasound are inappropriate for screening asymptomatic patients due to their invasiveness. 17,18 Abdominal ultrasonography is a noninvasive screening test, but there is little information on the efficacy of abdominal ultrasound as a screening test for pancreatic cancer in asymptomatic persons. In symptomatic patients with suspected disease it has a reported sensitivity of 40-98% and a specificity as high as 90-94%. 15,19,20 Conventional ultrasonography is limited by visualization difficulties in the presence of bowel gas or obesity and by its range of resolution (2-3 cm). 7,18,20 Even tumors <2 cm in diameter are frequently associated with metastatic disease, 8,11,21 thus limiting the ability of ultrasound to detect early disease.

Most persons with pancreatic malignancy have elevated levels of certain serologic markers such as CA19-9, peanut agglutinin, pancreatic oncofetal antigen, DU-PAN-2, carcinoembryonic antigen, alpha-fetoprotein, CA-50, SPan-1, and tissue polypeptide antigen. 22-25 None of these markers is, however, tumor specific or organ specific; 25 elevations of various serologic markers also occur in significant proportions of persons with benign gastrointestinal diseases or malignancies other than pancreatic cancer. 17,22,24,25 Most of these markers have been studied exclusively in high-risk populations, such as symptomatic patients with suspected pancreatic cancer. CA19-9 has probably achieved the widest acceptance as a serodiagnostic test for pancreatic carcinoma in symptomatic patients, with an overall sensitivity of approximately 80% (68-93%) and specificity of 90% (73-100%); sensitivity was highest in patients with more advanced disease. 23,24 Among healthy subjects, CA19-9 has good specificity (94-99%) 26-28 but nevertheless generates a large proportion of false-positive results due to the very low prevalence of pancreatic cancer in the general population. 29 A mass screening study of more than 10,000 asymptomatic persons for pancreatic cancer in Japan, 30 using either ultrasonography alone or CA19-9 plus elastase-1, found the likelihood of pancreatic cancer given a positive screening test to be 0.5%; only one of the four cancers discovered could be curably resected.

The predictive value of a positive test could be improved if a population at substantially higher risk could be identified. Diabetes mellitus in older adult patients might be useful as a marker for a population at high risk of having pancreatic cancer. 3,31,32 Cohort studies have reported incidences of pancreatic cancer among diabetic patients ranging from 51 to 166/100,000 person-years. 5a Studies evaluating screening efficacy might therefore be warranted in this population.
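
The same Bayes'-rule arithmetic shows how much the predictive value could improve in such a high-risk group. A sketch using illustrative CA19-9 operating characteristics within the ranges quoted above and the incidence figures just cited (all values are assumptions for illustration):

    # PPV of a hypothetical CA19-9 screen at two assumed prevalences:
    # the general population vs. an older diabetic cohort.
    SENS, SPEC = 0.80, 0.97  # illustrative points within quoted ranges

    def ppv(prevalence):
        tp = SENS * prevalence
        fp = (1 - SPEC) * (1 - prevalence)
        return tp / (tp + fp)

    for label, prev in [("general population", 10 / 100_000),
                        ("older diabetic cohort", 150 / 100_000)]:
        print(f"{label}: PPV = {ppv(prev):.2%}")  # ~0.3% vs. ~4%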

Effectiveness of Early Detection

Evidence that early detection can lower morbidity or mortality from pancreatic cancer is not conclusive. The reported 5-year survival for localized disease based on 1983-1990 national data is only 9%, not substantially higher than the 5-year survival with regional (4%) and distant (2%) metastases. 1a A comprehensive review of published reports on surgical resection of pancreatic cancer estimated an overall 5-year survival rate of 8% for small tumors without evidence of local or distant spread. 8 In part, this low rate may reflect the fact that a proportion of patients with localized disease cannot be operated on because of concomitant medical problems, advanced age, or other reasons. 10,11,13 Patients who have small localized tumors that are resected for attempted cure, which account for only 4-16% of the total, may have better 5-year survival rates (as high as 37-48% in the most experienced centers), 8-13,21,33 although the designs of most studies of surgical outcome suffer from lead-time, length, and selection biases. The morbidity associated with surgical resection is high (15-53%), but perioperative mortality is now less than 7% in the hands of experienced surgeons. 8,9,13,33,34

Reports on the effectiveness of adjuvant external beam and/or intraoperative radiotherapy in improving survival among curatively resected patients, using historical controls, have yielded inconsistent results. 35-37 In one small randomized controlled trial, 38 corroborated by a subsequent case series by the same authors, 39 an adjuvant treatment program using combined radiation and chemotherapy following curative resection was associated with a significant median survival advantage of 9 months and a 5-year survival advantage of 14.5% in treated versus control cases; however, the study was closed early due to poor subject accrual, and it did not control for the substantially greater frequency of clinic visits by cases. Adverse effects of combined radiation and chemotherapy include leukopenia and gastrointestinal toxicity. 38,40 Intraoperative radiotherapy frequently causes gastrointestinal bleeding, which may be life-threatening. 37 Additional randomized controlled trials of adjuvant therapy are needed to confirm its effectiveness in improving survival in patients with early pancreatic carcinoma. New modalities being explored include immunotherapy 41 and hormonal therapy. 42

Primary Prevention

Cigarette smoking has been consistently associated with a modestly increased risk of pancreatic cancer in numerous cohort and case-control studies of populations in the U.S., Canada, Europe, and Japan. 3,43-46 A clear dose-response relationship has not been demonstrated, however, nor have the biologic mechanisms underlying this association been adequately delineated. Cohort and case-control studies suggest that former smokers have a decreased risk of pancreatic cancer compared with current smokers, 43,44,47-49 but estimates of the duration of abstinence required to show a reduction in risk have varied from as few as 1-3 years to as many as 10-20 years, and some studies have found no risk reduction at all associated with smoking cessation. 43,45 In addition, a number of these studies suffer from selection, misclassification, and other biases. Although the causal relationship between smoking and pancreatic cancer requires further study, counseling patients to discontinue smoking (see Chapter 54) is easily justified by its established efficacy in preventing other malignancies (e.g., lung cancer), coronary artery disease, and other serious disorders.

Several cohort studies and many population-based case-control studies have reported positive associations between pancreatic cancer and dietary factors such as meat, eggs, carbohydrates, refined sugar, cholesterol, fat, and total calorie intake, as well as negative (protective) associations with intake of vegetables and fruits. 3,46,48,50-56 However, study results are inconsistent; many studies suffer from selection, misclassification, and other biases; and large numbers of comparisons make significance testing problematic. Further research to define nutritional risk factors for pancreatic cancer is therefore needed. Studies of the relationship between increased alcohol consumption and pancreatic cancer have yielded inconsistent results; 3,45-48,57,58 few have adequately assessed level and duration of intake, or evaluated the possibility of a link between alcohol, pancreatitis, and pancreatic cancer. Current epidemiologic evidence does not support an association between pancreatic cancer and coffee consumption. 3,58,59

Recommendations of Other Groups

No groups recommend routine screening for pancreatic cancer in asymptomatic persons. The Canadian Task Force on the Periodic Health Examination recommends against such screening. 60

Discussion

Given the lack of evidence for improved outcome with early detection of pancreatic cancer, the invasive nature of diagnostic tests likely to follow a positive screening test (e.g., endoscopic ultrasound, laparotomy), and the fact that most positive screening tests would be false positives, screening for pancreatic cancer cannot be recommended at this time. Primary prevention of pancreatic cancer may be possible through clinical efforts directed at reducing the use of tobacco products.

CLINICAL INTERVENTION

Routine screening for pancreatic cancer in asymptomatic persons, using abdominal palpation, ultrasonography, or serologic markers, is not recommended ("D" recommendation). All patients should be counseled regarding use of tobacco products (see Chapter 54). Counseling to reduce fat and cholesterol intake and to increase intake of fruits and vegetables may be recommended on other grounds (see Chapter 56).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Carolyn DiGuiseppi, MD, MPH.

16. Screening for Oral Cancer

Burden of Suffering

The term "oral cancer" includes a diverse group of tumors arising from the oral cavity. Usually included are cancers of the lip, tongue, pharynx, and oral cavity. The annual incidence of oral cancer in the U.S. is about 11/100,000 population, with a male/female ratio greater than 2:1. 1 Oral cancer is responsible for 2% of all cancer deaths in the U.S., and it is projected to account for over 28,000 new cases and about 8,400 deaths in 1995. 2

Fifty-three percent of oral cancers have spread to regional or distant structures at the time of diagnosis. 1 Overall 5-year survival is 52%, but it ranges from 79% for localized disease to 19% if distant metastases are present. 1 The natural history of each type of cancer can be quite different. Cancer of the lip accounts for 11% of new cases of oral cancer but only 1% of deaths. In contrast, cancer of the pharynx accounts for 31% of new cases of oral cancer but 50% of deaths. 1 The median age at diagnosis of oral cancers is 64 years, and 95% occur in persons over age 40. About half of all oropharyngeal cancers and the majority of deaths from this disease occur in persons over age 65. 1

Use of tobacco in all forms and, to a lesser extent, alcohol abuse are the major risk factors for the development of oral cancer. 3 The risk of oral cancer is increased 6-28 times in current smokers, 4 and the effects of tobacco and alcohol account for 90% of oral cancer in the U.S. 5 In parts of India and Asia where chewing tobacco or betel nut is very common, the incidence of oral cancer is 3 times higher than in the U.S. 5 In several areas of India, oral cancer accounts for 40% of all female cancer deaths. 5 Other risk factors for oral cancer include occupational exposures, solar radiation (for cancer of the lip), and the presence of premalignant lesions such as leukoplakia or erythroplakia. 3 Depending on the degree of histologic abnormality, up to 18% of cases of leukoplakia may develop into invasive cancers over long-term follow-up. 5 Patients infected with human immunodeficiency virus are at increased risk of oral cancers, most commonly Kaposi's sarcoma and non-Hodgkin's lymphoma. 6

Accuracy of Screening Tests

The principal screening test for oropharyngeal cancer in asymptomatic persons is inspection and palpation of the oral cavity. Studies indicate that many oral cancers occur on the floor of the mouth, the ventral and lateral regions of the tongue, and the soft palate, anatomic sites that may be inaccessible to routine visual inspection. 7 The recommended examination technique involves a careful visual examination of the oral cavity and extraoral areas using a dental mirror, retracting the tongue with a gauze pad to visualize hard-to-see areas. It also includes digital palpation with a gloved hand for masses. Complete descriptions of the recommended techniques have been published. 8 There is little information, however, on the sensitivity of this procedure in detecting oral cancer or on the frequency of false-positive results when a lesion is found. The abbreviated oral inspection that is more typical of the routine physical examination is also of unknown accuracy and predictive value. Studies in India and Sri Lanka have shown that nonphysician basic health care workers, given a short course on screening for oral cancer, can identify oral cancers and their precursors. 9,10 Mehta found a 59% sensitivity and 98% specificity for lesions appropriately referred to dentists by the basic health care workers. 9 No outcome data were reported in these studies, and it is unclear how these findings relate to the very different, lower prevalence population of the United States.

Some studies suggest that dentists are more effective than physicians in routinely performing a complete mouth examination and detecting early-stage oral cancer. 11 Older Americans, the population at greatest risk for oral cancer, visit the dentist infrequently; physician visits, however, are much more frequent in older persons. 12 No studies of the sensitivity and specificity of screening for oral cancer by dentists have been reported.

Alternative screening tests for oral cancer have been proposed, such as tolonium chloride rinses to stain suspicious lesions, 13,14 but further research is needed to evaluate the accuracy and acceptability of these techniques before routine use in the general population can be considered.

Effectiveness of Early Detection

No controlled trials of screening for oral cancer that include data on clinical outcomes have been reported. There is consistent evidence that persons with early-stage oral cancer have a better prognosis than those diagnosed with more advanced disease. 1,2 Because of the possible effects of lead-time and length bias, however, these observational data are not sufficient to prove that screening and earlier detection improve the prognosis in patients with oral cancer. Some authors have questioned the effectiveness of early detection in improving prognosis. 15 Prospective trials of screening for oral cancer, although difficult and expensive to conduct in the general population, might be feasible in high-risk populations in which the incidence of oral cancer is substantially greater.

Several studies have examined treatment of oral leukoplakia, a form of premalignancy, as a means of preventing oral cancer. Primary treatment of oral leukoplakia and prevention of second primary lesions in patients with treated oral cancer have been studied in several randomized, placebo-controlled chemoprevention trials of high-dose isotretinoin (13-cis-retinoic acid). 16-18 These studies demonstrated that isotretinoin was effective in promoting remission of leukoplakia and preventing the occurrence of second primary oral cancers. 17 Leukoplakia relapsed in a majority of cases within 3-6 months after discontinuation of therapy, however, and the rate of toxicity of treatment was high (mild to moderate side effects in up to 79% of patients). A trial of alternate maintenance therapies after isotretinoin induction for leukoplakia suggested that low-dose isotretinoin was more effective in maintaining remissions than beta-carotene and caused fewer side effects than high-dose therapy: 12% of participants experienced severe toxicity and 42% had moderate toxicity from low-dose isotretinoin, including dry skin, cheilitis, and conjunctivitis. 18

Uncontrolled trials using beta-carotene demonstrated variable reductions (up to 71%) in the occurrence of oral leukoplakia and mucosal dysplasia. 19-21 In a randomized trial, however, the majority of patients with leukoplakia progressed during beta-carotene treatment. 18 Although side effects of beta-carotene are minimal, older male smokers who took beta-carotene for 5-8 years experienced slightly higher rates of lung cancer and overall mortality in a recently completed trial in Finland. 22 Research is currently in progress on alternative agents (e.g., vitamin E) and combinations of therapies. 23

Recommendations of Other Groups

The American Cancer Society recommends a cancer checkup that includes oral examination every 3 years for persons over age 20 and annually for those over age 40. 24 The Canadian Task Force on the Periodic Health Examination concluded that there was insufficient evidence to include or exclude screening for oral cancer in the periodic health examination of persons in the general population, but suggested that annual oral examination by a physician or dentist should be considered for persons over 60 with risk factors for oral cancer (e.g., smokers and regular drinkers). 25 Although the National Institutes of Health no longer issue specific clinical guidelines regarding screening for oral cancer, both the National Cancer Institute and the National Institute of Dental Research support efforts to promote the early detection of oral cancers during routine dental examinations. 8,26

Discussion

Primary prevention strategies, such as counseling patients regarding the use of tobacco and alcohol, may have a greater impact on the morbidity and mortality associated with oral cancer than measures aimed at early detection. There is good evidence that tobacco use and excessive consumption of alcohol are both independent and synergistic risk factors for oral cancer. 3 Over 90% of oropharyngeal cancer deaths are associated with smoking. 5 In addition to smoking and alcohol, oral cancer is also associated with the use of snuff and chewing tobacco. 27

Oral cancer is a relatively uncommon cancer in the United States. Even among high-risk groups such as smokers, oral cancer accounts for a relatively small proportion (<2%) of all deaths. 4 Available screening tests for oral cancer are limited to the physical examination of the mouth, a test of undetermined sensitivity, specificity, and positive predictive value. Despite the strong association between stage at diagnosis and survival, there are few controlled data to determine whether routine screening in the primary care setting leads to earlier diagnosis or reduced mortality from oral cancer. Given the significant morbidity and mortality associated with advanced oral cancer and its treatment, clinicians may wish to include careful examinations for oral cancer in asymptomatic persons at significantly increased risk for the disease (see Clinical Intervention); direct evidence of a benefit of screening in any group, however, is lacking. It is also appropriate to refer patients for regular visits to a dentist, for whom complete examination of the oral cavity is often more feasible (see Chapter 61).

CLINICAL INTERVENTION

There is insufficient evidence to recommend for or against routine screening of asymptomatic persons for oral cancer by primary care clinicians ("C" recommendation). Although direct evidence of a benefit is lacking, clinicians may wish to include an examination for cancerous and precancerous lesions of the oral cavity in the periodic health examination of persons who chew or smoke tobacco (or did so previously), older persons who drink regularly, and anyone with suspicious symptoms or lesions detected through self-examination. All patients, especially those at increased risk, should be advised to receive a complete dental examination on a regular basis (see Chapter 61). All adolescent and adult patients should be asked to describe their use of tobacco (Chapter 54) and alcohol (Chapter 52). Appropriate counseling should be offered to those persons who smoke cigarettes, pipes, or cigars, those who use chewing tobacco or snuff, and those who have evidence of alcohol abuse. Persons with increased exposure to sunlight should be advised to take protective measures when outdoors to shield their lips and skin from the harmful effects of ultraviolet rays (see Chapter 12).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Paul S. Frame, MD, based on materials prepared for the Canadian Task Force on the Periodic Health Examination by Carl Rosati, MD, FRCSC.

17. Screening for Bladder Cancer

Burden of Suffering

Bladder cancer is an important cause of morbidity and mortality in the U.S., primarily in older men. Over 50,000 new cases and over 11,000 deaths due to bladder cancer are predicted to occur in 1995 in the U.S. 1 Risk rises steeply with age; over half of all deaths from bladder cancer occur after age 70. The incidence of bladder cancer is 3-4 times higher in men than women, and roughly twice as high in white men compared to black men. 2,3 Among white men, the annual incidence of bladder cancer after age 65 is approximately 2/1,000 persons (vs. 0.1/1,000 under 65), and the lifetime probability of developing bladder cancer is over 3%. 3 The probability of dying from bladder cancer is much smaller, however -- less than 1%. Cigarette smoking markedly increases the risk for bladder cancer (relative risk among smokers vs. nonsmokers = 1.5-7); 4,5 nearly half of all new cases of bladder cancer occur in current or former smokers. 2 Occupational exposure to chemicals used in the dye, leather, and tire and rubber industries has also been associated with increased risks of bladder cancer. 4 Despite initial reports, positive associations between bladder cancer and consumption of coffee or artificial sweeteners have not been confirmed. 2,6,7

Accuracy of Screening Tests

Early asymptomatic bladder cancer may be associated with occult bleeding (microscopic hematuria) or the presence of dysplastic cells in the urine. The definition of significant hematuria varies, but more than 3-5 red blood cells (RBCs) per high-powered field in microscopic analysis of the urine sediment is usually considered abnormal. 8 Urine dipsticks, which detect peroxidase activity of hemoglobin, provide a quick, inexpensive, and sensitive test for hematuria, and have largely supplanted microscopic urinalysis for screening in asymptomatic patients. Depending on the reference standard used (>2 or >5 RBCs per high-powered field on microscopy), dipstick urinalysis has a sensitivity of 91-100% and a specificity of 65-99% for detecting microscopic hematuria. 9-16 Dipsticks facilitate testing of serial urine specimens at home, which increases the detection of intermittent hematuria. False-positive dipstick results may be produced by myoglobin in the urine, and false-negative results may result from high concentrations of ascorbic acid, or from prolonged exposure of dipsticks to air. 8
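
The gain from serial home testing can be illustrated with elementary probability: if bleeding is intermittent and each specimen independently shows detectable blood with probability p, the chance that at least one of n specimens tests positive is 1 - (1 - p)^n. A sketch with purely illustrative values of p (the 14-specimen protocol echoes the study described below):

    # Cumulative probability of detecting intermittent hematuria over
    # repeated dipstick tests. Both p values are illustrative assumptions,
    # and independence between specimens is also an assumption.
    def detected(p, n):
        return 1 - (1 - p) ** n

    for p in (0.2, 0.5):
        print(f"p={p}: 1 test -> {detected(p, 1):.0%}, "
              f"14 tests -> {detected(p, 14):.0%}")
    # p=0.2: 1 test -> 20%, 14 tests -> 96%
    # p=0.5: 1 test -> 50%, 14 tests -> 100%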

Although dipsticks are reasonably accurate for detecting hematuria, microscopic hematuria is not specific for bladder cancer or other urologic cancers. Of the various other causes of microscopic hematuria in asymptomatic patients, most are either benign (e.g., benign prostatic hypertrophy (BPH), exercise, renal cysts, urethral trauma, menstrual bleeding) or of questionable importance (bladder stones, dysplasia, asymptomatic infection). In three separate studies, 46-55% of men with asymptomatic hematuria had no identifiable source of bleeding. 17-19

Two large outpatient studies have used urine dipsticks to screen for hematuria in asymptomatic populations at increased risk of bladder cancer. In a study by Messing, older men (mean age 65) screened their urine daily for 14 days. Of 1,340 men completing screening, 21% had at least one positive screen and 16 (1.2%) had urologic cancers (9 bladder, 1 renal, and 6 prostate). 18 Britton recruited 3,152 male outpatients over 60 years old to test their urine 10 times (daily or weekly). At least one screen was positive in 20% of men (12% on initial screen), and 22 (0.5%) had cancer (17 bladder and 5 prostate). 19 Among men undergoing full evaluation for hematuria, the positive predictive values (PPV) of serial dipstick screening for malignancy in these two studies were 8% and 6%, respectively; one third of men with hematuria refused or had an incomplete workup in each study. Similar results have been reported in other populations: among 272 Japanese men with 5 or more RBCs on urinalysis, 6% had urologic cancers. 17 Hematuria has a higher PPV (26-33%) if other urologic disorders are included as useful outcomes of screening, 18,20 but the benefit of early detection of many of these conditions (bladder stones, mild obstruction, urinary tract infection) remains unproven in asymptomatic individuals (also see Chapter 31).

The yield of onetime screening for bladder cancer in the general outpatient population appears to be much lower. In a retrospective review of over 20,000 men over 35 and women over 55 receiving a personal health appraisal, dipstick screening detected only three cases of cancer (one bladder, two prostate). 21 Prevalence of positive dipstick results ranged from 3-9% over a 7-year period. In a second study of almost 2,700 outpatients, 13% of screened men and women had hematuria (at least one RBC on urine sediment), but only 2% of those with microscopic hematuria had serious urologic disease. 22,23 In each of these studies, only 0.5% of all patients (3-4% of men over age 55) with asymptomatic hematuria were diagnosed with urologic cancers within 3 years of a positive screen.

Urine cytology is more specific but less sensitive than microscopic hematuria as a screen for early bladder cancer. Because cytology is technically difficult and significantly more expensive than dipstick urinalysis, its use as an initial screening test has been limited to high-risk occupational screening programs. Specificity for cytology has been estimated to be as high as 95%, 24 and sequential screening combining urine dipstick and urine cytology may be able to reduce the false-positive rate of screening while maintaining sensitivity for clinically important cancers. Among men with dipstick hematuria in one screening study, urine cytology detected 10 of 17 patients with bladder cancer with a specificity of 96%; 6 of the 7 cases missed were well-differentiated, superficial lesions with a good prognosis. 19 Rapid tests based on other tumor markers are under investigation. 25
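
The logic of sequential screening can be sketched algebraically: if cytology is performed only on dipstick-positive specimens, a false positive must slip through both tests, so the false-positive rates multiply, while overall sensitivity (assuming the tests err independently) is the product of the two sensitivities. A sketch with illustrative dipstick values and cytology figures echoing the study above:

    # Two-stage (serial) screening: cytology only on dipstick positives.
    # Dipstick values are assumptions; cytology values echo the cited
    # study (10 of 17 cancers detected, specificity 96%). Independence
    # of errors between the two tests is also an assumption.
    dip_sens, dip_spec = 0.95, 0.80
    cyt_sens, cyt_spec = 10 / 17, 0.96

    serial_sens = dip_sens * cyt_sens                  # must pass both tests
    serial_spec = 1 - (1 - dip_spec) * (1 - cyt_spec)  # false positives must pass both

    print(f"Serial sensitivity: {serial_sens:.0%}")    # ~56%
    print(f"Serial specificity: {serial_spec:.1%}")    # 99.2%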

Effectiveness of Early Detection

Survival in patients with bladder cancer is strongly associated with stage at diagnosis. Although most cancers are superficial at time of diagnosis, currently 10-20% of all cases of bladder cancer have invaded the muscular wall of the bladder when first diagnosed, with a much worse prognosis. Five-year survival for patients with superficial disease is over 90%, but falls to less than 50% with invasive disease. 1 The rationale for screening is that detecting and treating early asymptomatic bladder cancers may prevent progression to invasive disease, or allow for more effective treatment of noninvasive tumors, which have a high rate of recurrence. Many cases detected on screening, however, are low-grade transitional cell cancers with low propensity for invasion; in contrast, since aggressive cancers may invade early, periodic screening may have a limited potential for detecting lethal bladder cancers at an early, treatable stage. 26

In the prospective screening studies cited above, all 26 cases of bladder cancer detected by screening were early tumors confined to superficial areas of the bladder (Stage T0 or T1). 18,19 Compared to outcomes of cancers developing in the general population, cases detected by screening appeared to be less likely to progress over 3 years 27 or lead to death within 2 years. 28 Because of lead-time and length biases (see Chapter ii, Methodology), however, comparing case-survival is not sufficient to establish a benefit of screening, without information on rates of cancer and death in a comparable unscreened population. The incidence of invasive and fatal bladder cancer among screened men is very low, and it is also quite low in the general population of older men (<1/1,000 per year). Larger studies, with a comparable unscreened group and longer follow-up, are needed to determine whether screening improves the outcome of bladder cancer in high-risk populations. Despite early detection and treatment, 10 of 16 cancers detected by screening recurred within 3 years in one study. 27

Recommendations of Other Groups

No major organization recommends screening for bladder cancer in asymptomatic adults. The Canadian Task Force on the Periodic Health Examination recommends against routine screening in asymptomatic individuals and concludes that there is insufficient evidence for or against screening in specific high-risk groups. 29 The American Cancer Society has not issued any specific guidelines on screening for bladder cancer.

Discussion

Dipstick and microscopic urinalysis are simple and sensitive tests for detecting hematuria from early tumors, but they are not sufficiently specific to be practical for screening for bladder cancer in the general population. Even among older high-risk populations, the predictive value of a positive screening test is low (5-8%). As a result, many persons without cancer will require diagnostic workups for false-positive test results and will be subjected to the costs, discomforts, and risks of cystoscopy and intravenous pyelography. More important, there is no proof that early detection significantly improves the prognosis for the small minority of patients found to have urologic malignancies. Most of the bladder cancers detected have a good prognosis in the absence of screening: 5-year survival for all bladder cancer is currently close to 80%. 1 Due to the frequent multifocal nature of bladder cancer, recurrences are common despite early detection and treatment. Conversely, the most lethal tumors become invasive early in the course of disease, and the potential to detect them at an earlier stage may be limited. Only a prospective study that includes an unscreened comparison group can determine whether screening is effective in reducing morbidity or mortality from bladder cancer (or other urologic cancers), and whether the benefits are sufficient to justify the costs and risks of screening and early treatment. In the absence of such evidence, routine screening cannot be recommended, due to the high rate of false-positive results, and the possibility of harm to asymptomatic patients, few of whom have cancer. Primary prevention may offer a safer and more effective strategy than screening for reducing mortality from urologic cancer, since smoking accounts for nearly half of all deaths from cancers of the bladder and kidney. 2

CLINICAL INTERVENTION

Routine screening for bladder cancer with microscopic urinalysis, urine dipstick, or urine cytology is not recommended in asymptomatic persons ("D" recommendation). Persons working in high-risk professions (e.g., dye or rubber industries) may be eligible for screening at the worksite, although the benefit of this has not been determined. Men and women who smoke cigarettes should be advised that smoking significantly increases the risk for bladder cancer, and all smokers should be routinely counseled to quit smoking (see Chapter 54).

The draft update for this chapter was prepared for the U.S. Preventive Services Task Force by David Atkins, MD, MPH, with contributions from materials prepared by Sarvesh Logsetty, MD, for the Canadian Task Force on the Periodic Health Examination.

18. Screening for Thyroid Cancer

Burden of Suffering

Thyroid cancer accounts for an estimated 14,000 new cancer cases and more than 1,000 deaths in the U.S. each year. 1 The annual incidence is about 4/100,000 population. 2 Women account for 77% of new cases and 61% of deaths. 1 Current overall 5-year survival with treatment is 95% in whites and 90% in blacks, 1 but is much lower with some histologic types (e.g., medullary, anaplastic). 3-5 Among those at substantially increased risk for thyroid cancer are persons exposed to external upper body (especially head and neck) irradiation during infancy or childhood, and individuals with a family history of thyroid cancer or multiple endocrine neoplasia type 2 (MEN 2) syndrome. 6-10 The risk of radiation-induced thyroid nodularity and cancer increases with radiation dose and decreases with increasing age at which irradiation occurred. 7-12 In several large cohort studies, the absolute excess risk of thyroid cancer associated with low-dose irradiation given before 18 years of age ranged from 0.3 to 12.5/10,000 person-years per Gy (gray, a unit of radiation dose). 8-10 Medullary thyroid cancer, which comprises about 10% of all thyroid malignancies, is inherited in one fourth of cases as part of the autosomal dominant MEN 2 syndrome. 6,13

Accuracy of Screening Tests

Common screening tests for thyroid cancer include neck palpation and ultrasonography to detect nodules. Screening for medullary thyroid cancer as part of the autosomal dominant syndrome MEN 2 is performed at specialized centers 13a rather than by primary care providers and will not be considered further in this chapter. Diagnostic procedures such as scintigraphy and fine-needle aspiration with cytology are generally reserved for persons with evidence of nodular disease or goiter.

The accuracy of neck palpation as a screening test varies with the examiner's technical skill and the size of the mass. 14 Among asymptomatic persons, palpation had a reported sensitivity of 38% for any thyroid disease (compared to examination at surgery for hyperparathyroidism), and 4 of 6 thyroid tumors were missed by palpation. 15 Compared to ultrasonographic examination, the sensitivity of neck palpation for detection of solitary thyroid nodules was only 15%, although no nodules were malignant. 16 Of 821 patients with no history of thyroid abnormalities and without clinically palpable nodules at autopsy, 100 had solitary nodules on macroscopic inspection of serially sectioned glands, and 17 had primary thyroid cancer. 17 Direct palpation of the excised thyroid gland at autopsy had a sensitivity for solitary nodules of only 24% compared to macroscopic inspection. These studies suggest a negative examination does not appreciably decrease the probability of thyroid nodules or cancer.

Neck palpation for thyroid nodularity has a high specificity in asymptomatic persons (93-100%), but routine palpation for thyroid nodules as a screening test to detect thyroid cancer generates many false-positive results because only a small proportion of nodular thyroid glands are neoplastic. 14,17,18 Periodic screening by palpation of almost 77,000 Japanese women detected thyroid cancer in 0.14%; the likelihood of thyroid cancer in the presence of a palpable thyroid abnormality was 6%. 18 A similarly high false-positive rate has been reported when ultrasonography is used as a screening test for thyroid abnormalities. In Finnish studies, ultrasound screening of 354 asymptomatic persons detected 56 solitary nodules, but none was malignant on biopsy. 16,19 Those falsely identified as positive by screening tests must undergo the inconvenience, expense, and anxiety of needless additional testing, including invasive tests such as biopsy, to rule out cancer.

Persons exposed to upper body irradiation in childhood have a higher prevalence of thyroid cancer, but also have a higher prevalence of thyroid nodularity. 8-11 It is unclear what effect a history of irradiation has on the likelihood of thyroid cancer in the presence of palpable thyroid nodularity. Periodic screening by neck palpation of 1,500 patients with a history of thyroid irradiation detected carcinoma in 1%. 20 The likelihood of thyroid cancer given detection of any nodular disease was only 3%, but the likelihood of malignancy among patients with nodular disease that was suspicious for cancer (e.g., solitary nodules) was not reported. Only 17% of the patients with nodularity actually underwent surgery, of whom 20% had thyroid cancer.

Effectiveness of Early Detection

The benefits of early detection of thyroid cancer in the general population are not well defined. For all histologic types, 5-year survival is significantly better with earlier stage at diagnosis. 3 A cohort study of mass screening found a significantly higher 7-year cumulative survival rate in patients whose cancer was detected by screening (98%) when compared with those presenting with symptoms (90%). 18 Cancers detected by screening were significantly more likely to have a favorable histology, however, and both lead-time and length biases are likely in this study. There have been no controlled trials demonstrating that asymptomatic persons detected by screening have a better outcome than those who present with clinical symptoms or signs. In addition, not all cancers detected through screening are likely to present clinically during the patient's lifetime. In autopsy studies in the U.S., the prevalence of occult thyroid carcinoma in adults ranges from 2-13%; 17,21,22 in contrast, the annual incidence of thyroid carcinoma is only about 4/100,000 population. 2
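
The scale of this reservoir of clinically silent disease can be made concrete with a quick calculation from the figures just quoted: the prevalent pool of occult carcinoma is several hundred to a few thousand times the annual clinical incidence, implying that most occult tumors detectable by screening would never surface clinically.

    # Occult thyroid carcinoma prevalence (autopsy studies) vs. annual
    # clinical incidence, using the figures quoted in the text.
    annual_incidence = 4 / 100_000  # clinical cases per person per year

    for occult_prev in (0.02, 0.13):  # 2-13% autopsy range
        ratio = occult_prev / annual_incidence
        print(f"Occult prevalence {occult_prev:.0%}: "
              f"{ratio:,.0f} x the annual clinical incidence")
    # Occult prevalence 2%: 500 x the annual clinical incidence
    # Occult prevalence 13%: 3,250 x the annual clinical incidence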

Recommendations of Other Groups

The American Cancer Society recommends screening for thyroid cancer by palpation every 3 years in persons aged 21-40 years and annually in those more than 40 years old. 23 The American Academy of Family Physicians recommends palpation for thyroid nodules in adults with a history of upper body irradiation; 24 this recommendation is currently under review. The Canadian Task Force on the Periodic Health Examination concluded that there was poor evidence for either inclusion or exclusion of screening for thyroid cancer in the periodic health examination. 25

Discussion

Given the lack of evidence that early detection of thyroid cancer by screening improves outcome, the high prevalence and uncertain clinical significance of occult thyroid carcinoma, the poor sensitivity of neck palpation in the detection of thyroid nodules, the fact that most positive screening tests would be false positives, and the invasive nature of diagnostic tests (e.g., biopsy) likely to follow a positive screening test, routine screening for thyroid cancer cannot be recommended at this time. For persons irradiated in childhood, the greater likelihood of having both thyroid nodules and malignancy means that the yield from screening is likely to be higher. The clinical benefits of such screening have not been established, however.

CLINICAL INTERVENTION

Screening asymptomatic adults or children for thyroid cancer using either neck palpation or ultrasonography is not recommended ("D" recommendation). Although there is insufficient evidence to recommend for or against such screening in asymptomatic persons with a history of external upper body (primarily head and neck) irradiation in infancy or childhood, recommendations for periodic palpation of the thyroid gland in such persons may be made on other grounds, including patient preference or anxiety regarding their increased risk of cancer ("C" recommendation).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Carolyn DiGuiseppi, MD, MPH.

19. Screening for Diabetes Mellitus

Burden of Suffering

Approximately 14 million persons in the U.S. have diabetes mellitus. 1 Non-insulin-dependent diabetes mellitus (NIDDM) or Type II diabetes accounts for 90-95% of all cases of diabetes in the U.S., while insulin-dependent diabetes mellitus (IDDM) or Type I diabetes accounts for the remaining 5-10%. 1-4 An estimated half of all persons with diabetes (primarily patients with NIDDM) are currently unaware of their diagnosis. 2 Diabetes may cause life-threatening metabolic complications, and is the seventh leading cause of death in the U.S., contributing to roughly 160,000 deaths each year. 1,3 It is also an important risk factor for other leading causes of death such as coronary heart disease and cerebrovascular disease. 4 Diabetes is the most common cause of polyneuropathy, with approximately 50% of diabetics affected within 25 years of diagnosis, 5 and is responsible for over 50% of the 120,000 annual nontraumatic amputations in the U.S. 6 Diabetic nephropathy is now the leading cause of end-stage renal disease in the U.S. 7 and, if current trends continue, will soon account for 50% of all patients with renal failure. 8 Diabetes is the leading cause of blindness in adults ages 20-74 and accounts for over 8,000 new cases of blindness each year. 9 Infants born of diabetic women are at increased risk of fetal malformation, prematurity, spontaneous abortion, macrosomia, and metabolic derangements. 10,11 Compared to persons without diabetes, diabetic patients have a higher hospitalization rate, longer hospital stays, and increased ambulatory care visits. 3,12 The total annual economic burden of diabetes is believed to approach $100 billion in the U.S. 13

The onset of NIDDM is usually after age 30, and the prevalence steadily increases with advancing age. It is estimated that nearly 20% of the U.S. population aged 65-74 has diabetes. 2 The prevalence of NIDDM is markedly increased in Native Americans and is also higher among black and Hispanic populations. 3 The prevalence of NIDDM is greater than 70% in Pima Indians 55 years of age and older. 14 Other risk factors for diabetes include family history, obesity, and a previous history of gestational diabetes or impaired glucose tolerance. IDDM has an earlier onset (usually before age 30), a much shorter asymptomatic period, and a more severe clinical course than NIDDM.

Gestational diabetes mellitus (GDM), the development of glucose intolerance during pregnancy, occurs in 3-5% of all pregnancies and is the most common medical problem of pregnancy. 3,15 Risk factors for GDM include obesity, increased maternal age, hypertension, glucosuria, a family history of diabetes, and a history of a macrosomic, stillborn, or congenitally malformed infant. GDM is a risk factor for fetal macrosomia and is associated with other neonatal complications, such as hyperbilirubinemia and hypoglycemia. Macrosomia -- most commonly defined as birth weight above 4,000 or 4,500 g -- is not itself a morbid condition but is associated with increased risk of operative delivery (cesarean section or vacuum or forceps delivery) and birth trauma (e.g., clavicular fracture, shoulder dystocia, and peripheral nerve injury). 16-19 In some series, the incidence of shoulder dystocia in infants over 4,000 g is close to 2%. 20 Women with a history of GDM are also at increased risk for developing NIDDM later in life. 21

Accuracy of Screening Tests

The diagnosis of diabetes in many nonpregnant patients is based on typical symptoms (polyuria, polydipsia) in association with clear elevation of glucose (fasting plasma glucose > 140 mg/dL [7.8 mM]). Many asymptomatic persons, however, may have abnormal glucose metabolism and be at increased risk for complications of diabetes.

Diagnosis of Diabetes in Asymptomatic Persons.

The National Diabetes Data Group (NDDG) 22 and World Health Organization (WHO) 23 have issued similar criteria for diagnosing diabetes in asymptomatic persons, based on elevated fasting plasma glucose (>140 mg/dL) or an abnormal plasma or serum glucose following a 75 g oral glucose tolerance test (OGTT). NDDG criteria for a positive OGTT (glucose > 200 mg/dL both at 2 hours and at an earlier time point) differ slightly from WHO criteria (glucose > 200 mg/dL at 2 hours alone). Abnormal glucose measurements on more than one occasion are required for a diagnosis of diabetes. 22,23 The complex diagnostic criteria reflect both the difficulty in distinguishing diabetic from nondiabetic patients on the basis of a single measurement, and the substantial test-retest variability of the OGTT. The coefficient of variation for the OGTT ranges from 20% to 35%. 24,25 To improve reliability of the OGTT in nonpregnant adults, the American Diabetes Association (ADA) recommends that patients eat an unrestricted diet for 3 days preceding the test and fast overnight before the test. 26
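To make this variability concrete (with illustrative numbers, not values from the cited studies): a coefficient of variation of 25% around a true mean 2-hour glucose of 160 mg/dL corresponds to a standard deviation of roughly 40 mg/dL, so repeat tests in the same patient can readily fall on either side of the 200 mg/dL diagnostic cutpoint. This is one reason abnormal results on more than one occasion are required for diagnosis.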

Both the NDDG and WHO recognize an intermediate form of disordered glucose metabolism, impaired glucose tolerance (IGT), based on intermediate results of the OGTT (140-200 mg/dL). 22,23 Patients with IGT are at increased risk of developing frank diabetes, but rates of progression are highly variable. IGT is also a risk factor for cardiovascular disease. 25 A significant number of individuals diagnosed with IGT revert to normal on repeat testing, 25 and the treatment implications of IGT alone are uncertain.

Diagnosis of Gestational Diabetes.

The diagnosis of GDM is traditionally based on two or more abnormal values during a 3-hour glucose tolerance test using 100 g glucose. 22,27 NDDG diagnostic criteria are based on extrapolations from standards for whole blood glucose originally derived by O'Sullivan 28 to identify mothers at risk of developing diabetes in long-term follow-up. The conversion factor used to develop criteria for plasma glucose measurements may have been incorrect, however, 29 and others have proposed modified criteria with lower thresholds as more sensitive predictors of adverse pregnancy outcomes. 30 Outside of North America, the diagnosis of GDM is usually based on WHO criteria using a 2-hour 75 g glucose tolerance test. 23 The prevalence of GDM varies considerably depending on whether WHO, NDDG, or modified criteria of Carpenter and Coustan 30 are used. 15,31 In addition to poorly standardized criteria for a positive OGTT in pregnancy, the lack of studies on the reproducibility of the 100 g glucose tolerance test contributes to ongoing controversy over the diagnosis of GDM. 32-34

Because diagnostic glucose tolerance testing is too time-consuming and expensive for routine screening, various blood or urine tests have been examined for their ability to identify three distinct at-risk populations among asymptomatic persons: persons with undiagnosed NIDDM, pregnant women with GDM, and individuals at high risk of developing IDDM.

Screening for Non-Insulin-Dependent Diabetes.

The most commonly used screening tests for NIDDM include measurement of serum or plasma glucose in fasting or postprandial specimens, measurement of glycosylated proteins in blood, and detection of glucose in urine. The sensitivity and specificity of the fasting plasma glucose (compared to diagnostic oral glucose tolerance testing) depend on the threshold set to define an abnormal screening result. A single fasting glucose above 140 mg/dL is specific for diabetes (>99%), but sensitivity varies widely among different populations (21-75%). 35-40 Using a lower threshold (>123 mg/dL) improves sensitivity (40-88%) while maintaining reasonably high specificity (97-99%). 35-40 A random (i.e., nonfasting) plasma glucose greater than 140 mg/dL has a sensitivity of 45% and a specificity of 86%. 41 The ADA recommends that a fasting plasma glucose greater than 115 mg/dL, or a random glucose greater than 160 mg/dL, be considered a positive screen to be confirmed with the OGTT. 26

The nonenzymatic attachment of glucose to circulating proteins, primarily hemoglobin and albumin, reflects overall metabolic control in diabetic populations. A number of studies have evaluated HbA1c and serum fructosamine as screening tests for diabetes. 40,42-45 Test characteristics are more variable than those of fasting plasma glucose, with sensitivity ranging from 15% to 93% and specificity from 84% to 99%.

Presence of glucose in the urine is fairly specific but less sensitive than most blood tests for NIDDM. In population-based screening using semiquantitative urine dipstick, a "trace positive" dipstick result or greater has a reported sensitivity of 23-64% and specificity of 98-99%. 40,46 In a high-risk population, quantitative assays of urine glucose achieved high sensitivity (81%) with high specificity (98%), comparable to both fasting plasma glucose and glycosylated protein assays. 40

Sensitivity of all screening tests increases with the severity of hyperglycemia among the diabetic population. 40 Both the sensitivity and positive predictive value of screening tests will be highest in high-risk populations such as Native Americans and African Americans, where undiagnosed diabetes and severe hyperglycemia are more prevalent. 40 In the asymptomatic general population, where the prevalence of undiagnosed diabetes is only 1-3%, a greater proportion of diabetic patients may be missed by screening, and many persons with a positive screening test will not have diabetes. Screening asymptomatic persons may also have some harmful effects, including an increase in false-positive diagnoses. In a review of 112 patients being treated for diabetes in a general practice, nine (8%) patients, all without classic symptoms, were found not to have diabetes on further evaluation. 47 Even a true-positive diagnosis could have adverse consequences for an asymptomatic person if it causes "labeling" effects 48 or difficulty obtaining insurance.
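The dependence of predictive value on prevalence can be made explicit with Bayes' theorem. The following minimal Python sketch is illustrative only: the function name is ours, and the input values are plausible figures drawn from the ranges quoted above rather than from any single cited study.

    # Positive predictive value (PPV) of a screening test via Bayes' theorem.
    # All arguments are proportions between 0 and 1; values are illustrative.
    def ppv(prevalence, sensitivity, specificity):
        true_positives = prevalence * sensitivity
        false_positives = (1 - prevalence) * (1 - specificity)
        return true_positives / (true_positives + false_positives)

    # General population: undiagnosed NIDDM prevalence ~2%; fasting glucose
    # >140 mg/dL with sensitivity ~50% and specificity ~99% (see text).
    print(ppv(0.02, 0.50, 0.99))   # ~0.51: about half of positives are true cases

    # High-risk group: prevalence ~10% with the same test characteristics.
    print(ppv(0.10, 0.50, 0.99))   # ~0.85: predictive value rises with prevalence

Lowering the threshold to gain sensitivity costs specificity, and in a low-prevalence population even a small loss of specificity sharply increases the share of false-positive screens.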

Screening for GDM.

The Third International Workshop Conference on Gestational Diabetes has recommended screening pregnant women at 24-28 weeks of gestation with a 50 g 1-hour oral glucose challenge test, performed in the fasting or nonfasting state. 27 Patients with a plasma glucose of 140 mg/dL (7.8 mM) or greater at 1 hour should undergo a diagnostic 3-hour OGTT. There is no single threshold that accurately separates normal from abnormal results on the glucose challenge test, however. 27 Estimates of the sensitivity of screening under this protocol range from 71% to 83%, with a specificity of 78-87%. 30,49,50 Sensitivity is increased by using a lower threshold for a positive screen 30,51 and by testing in the fasting state. 49 A large prospective study of nearly 4,300 pregnant women reported that using higher cutpoints (142-149 mg/dL) and adjusting for time since the last meal could reduce the misclassification of patients based on initial screening tests. 52 Reproducibility of the 1-hour glucose challenge test is only fair, 53 but it improves with advancing gestational age. 54 In an unselected pregnant population (prevalence of GDM approximately 3%), fewer than one in five women with a positive glucose challenge test will meet criteria for gestational diabetes on a full OGTT. 52
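The Bayes sketch shown earlier for NIDDM screening reproduces this figure with illustrative values from this paragraph (prevalence 3%, sensitivity 80%, specificity 85%): ppv(0.03, 0.80, 0.85) evaluates to approximately 0.14, consistent with fewer than one in five screen-positive women having GDM confirmed on the full OGTT.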

The elevations in plasma glucose in GDM are less pronounced than in IDDM or NIDDM. As a result, neither serum glycosylated proteins 51,55-58 nor urine glucose 34 is sufficiently sensitive for detecting GDM. In addition, glucosuria is common among nondiabetic pregnant women. Random blood glucose has been advocated as a simpler and less costly screening test for GDM, 59,60 but its test performance has not been fully evaluated. A large prospective study is comparing fasting and random plasma glucose to the oral glucose challenge for detecting GDM and for predicting adverse perinatal outcomes. 52

Screening for Patients at Risk for IDDM.

A growing body of evidence indicates that IDDM is a genetically linked autoimmune disorder, in which progressive destruction of insulin-producing pancreatic islet cells eventually leads to complete dependence on exogenous insulin. 61 Islet cell autoantibodies and insulin autoantibodies are present in the majority of patients with newly diagnosed IDDM, 62 and may precede the onset of clinical symptoms by months to years. Immunoassays for islet cell antibodies remain difficult to standardize, 63 however, and appear to be of limited value for screening in the general population. In individuals without a family history of IDDM, the prevalence of islet cell autoantibodies ranges from 0.3% to 4.0%, and the chance of developing IDDM in antibody-positive individuals is estimated to be less than 10%. 64 The potential value of immune markers is greater in high-risk individuals (i.e., first-degree relatives of affected patients). Several studies report that a combination of immune markers and measures of insulin responsiveness can identify a population at very high risk (up to 70%) of developing IDDM. 62,63,65 This high risk may make such persons appropriate candidates for experimental interventions to reduce the risk of progression to IDDM. Only 10% of all cases of IDDM, however, occur in persons with a positive family history.

Effectiveness of Early Detection

Asymptomatic NIDDM.

Up to 20% of patients with newly diagnosed NIDDM already have early retinopathy, suggesting that the onset of diabetes may occur many years (an estimated 9-12 years) before clinical diagnosis, and that microvascular changes may precede overt symptoms in many patients. 66 Earlier detection through screening might provide an opportunity to reduce the progression of microvascular or macrovascular disease due to asymptomatic hyperglycemia. Animal models of diabetes suggest that hyperglycemia is the underlying cause of microvascular complications, 67 and numerous epidemiologic studies confirm that the degree of hyperglycemia and duration of disease are associated with microvascular complications such as nephropathy, retinopathy, and neuropathy. 5,68-72 Direct evidence that improving glucose control reduces the incidence of these complications has only recently become available, and only for patients with IDDM. In the Diabetes Control and Complications Trial (DCCT), over 1,400 subjects with IDDM were randomized to intensive insulin therapy versus conventional treatment. Intensive insulin therapy improved average blood glucose, significantly reduced progression of existing retinopathy, and significantly lowered the incidence of retinopathy, neuropathy, and nephropathy in all patients. 73,74

The DCCT study is generally regarded as providing strong evidence of the role of hyperglycemia in diabetic microvascular disease, but questions remain about extrapolating its results to the management of patients with NIDDM. 75 The incidence of microvascular complications is lower in NIDDM than IDDM, and the largest controlled trial to date of treatment of NIDDM (the University Group Diabetes Program [UGDP] study) found no effect of improved glucose control with insulin or drug therapy on retinopathy. 76 More definitive results may come from the U.K. Prospective Diabetes Study (UKPDS), which randomized 2,520 patients with newly diagnosed NIDDM controlled with diet to diet alone or to additional therapy with chlorpropamide, glibenclamide, metformin, or insulin. 77 Three-year results indicated that patients receiving drug or insulin therapy had significantly better glucose control but greater weight gain and more frequent episodes of hypoglycemia. 78 Data on other clinical outcomes are not yet available.

Patients with diabetes are at significantly increased risk for coronary heart disease, stroke, and peripheral vascular disease; cardiovascular diseases combined account for the majority of deaths in diabetic patients. The risk of cardiovascular disease, however, is not clearly associated with either disease duration or degree of glycemic control. The rate of increase in coronary heart disease risk over time is similar in patients with NIDDM and in nondiabetic patients. 79,80 In 8-year follow-up of almost 500 diabetic men and women, disease duration was associated with risk of ischemic heart disease in patients with IDDM but not in those with NIDDM, 81 and there was no correlation between cerebrovascular and peripheral vascular events and diabetes duration. Detecting such an association may be complicated by difficulty in accurately ascertaining the onset of diabetes in patients with NIDDM. Insulin resistance and hyperinsulinemia may be more important determinants of macrovascular complications than degree of glucose control. 79,82 In the UGDP study, neither cardiovascular disease nor mortality was reduced by improved glucose control in the intervention groups, 76 but the interpretation of these findings has been criticized. 83 Drug therapy for NIDDM carries the risk of hypoglycemia. In the UKPDS study, the annual incidence of hypoglycemia was 28% for patients on glibenclamide and 33% for those on insulin; episodes requiring medical therapy occurred in 1.4% of subjects each year. 78

The majority of individuals in the U.S. who have disordered glucose metabolism have IGT. 84 Untreated, most persons with IGT do not develop diabetes, but the reported cumulative incidence of diabetes at 10 years has varied from 15% to 61%. 25 Progression to diabetes is highest in some Native American populations. 85 There is little direct evidence of a benefit of detecting and treating IGT. 86,87 Prospective studies of interventions to prevent progression to frank diabetes in patients with IGT have produced conflicting results. One trial of dietary and pharmacologic treatment 88 and a nonrandomized trial of diet and physical activity training 89 each reported a reduced incidence of diabetes, whereas other prospective studies have reported no effect on the rate of progression to diabetes. 90-92

Gestational Diabetes.

GDM is associated with increased risk of fetal macrosomia, birth trauma, neonatal hypoglycemia, and perinatal mortality. 93-96 No properly controlled trial has examined the benefit of universal or selective screening compared to routine care without screening. In two retrospective analyses, no significant difference in macrosomia or in birth trauma was found in women screened for GDM compared to unscreened control populations. 97,98 Because women screened for GDM are more likely to be at high risk, such studies cannot reliably exclude a benefit of screening. 98

The clearest benefit of screening is the potential for treatment to reduce the incidence of fetal macrosomia in women with GDM. Although modified diet can reduce hyperglycemia in GDM, only one controlled trial has examined the effect of dietary therapy on clinical outcomes in GDM. 99 A total of 158 women with mild GDM (positive by NDDG criteria but not WHO criteria) were randomized to diet treatment or no therapy; there were no significant differences in perinatal outcomes, although slightly fewer infants over 4,000 g were born to diet-treated mothers (3 vs. 5). 100 Several randomized controlled trials have demonstrated that treatment with diet and insulin (compared to diet alone) results in improved glucose control and a reduced incidence of macrosomia in women with GDM. 94,101,102 Macrosomia was not significantly reduced in a fourth trial, but 15% of the women assigned to diet therapy received insulin because glucose control was inadequate. 103 An overview of four randomized trials estimated that treatment of GDM with diet and insulin, compared to diet alone, reduced the incidence of macrosomia by two thirds (6% vs. 17%). 104 Despite this reduction in macrosomia, however, there were no significant differences in rates of cesarean section, forceps delivery, or birth trauma between treated and control groups in any of the prospective trials. There was only one reported instance of shoulder dystocia among 140 births in the two trials reporting this outcome. 101,104 In a retrospective analysis of 445 gestational diabetics, women who received both insulin and dietary treatment had significantly lower rates of birth trauma and operative delivery than women who received dietary treatment alone or no intervention. 96 Since treatment was not randomly assigned, factors other than treatment may have contributed to the differences in outcomes.

The benefit of improved glucose control on other outcomes in GDM, including perinatal mortality, remains uncertain. Although several case series have reported marked improvements in perinatal death rates with treatment of GDM, 95,97,105-107 none of these studies employed an appropriate control group. The use of historical controls (i.e., outcomes of prior pregnancies) or general population controls is likely to exaggerate the apparent benefits of treatment. In an overview of five randomized trials, there was no significant difference in perinatal mortality between women treated with diet and insulin (2.7%) and those treated with diet alone (3.2%). 104 Moreover, in trials conducted after 1975, there were no perinatal deaths in treated or control groups. 100,104 In one trial, insulin treatment was associated with lower rates of neonatal jaundice and nonsignificant reductions in admissions to the neonatal ICU. 108 At the same time, treatment of GDM may have adverse effects for some women. In one retrospective analysis, women with GDM who maintained tight glucose control (mean glucose < 87 mg/dL) had a higher incidence of small-for-gestational-age infants than nondiabetic controls. 109

Degrees of hyperglycemia milder than those defining GDM may also result in increased maternal and neonatal complication rates. 110-112 The incidence of macrosomia and preeclampsia/eclampsia is higher in women who demonstrate at least one abnormal result among the four points on a glucose tolerance test. The prevalence of mildly hyperglycemic pregnant women who do not meet the criteria for GDM but are at increased risk during pregnancy is unknown.

Although treatment of GDM can reduce macrosomia, the impact of widespread screening and treatment on the overall incidence of macrosomia and dystocia may be quite small. The reported incidence of macrosomia in the general population varies from 1% to 8%, 93,113 and most macrosomic infants are born to women without GDM. 114 Gestational diabetes was responsible for only 5% of infants over 4,500 g in one study, 115 and it is estimated to account for only 5% of shoulder dystocia cases in this country. 116 Other factors such as maternal obesity, gestational weight gain, and maternal age may be more important determinants of macrosomia and adverse outcomes. 117 In a prospective study of GDM controlled with diet, the only significant predictor of birth weight was maternal weight at delivery; plasma glucose levels were poor predictors of birth weight. 118

Persons at Risk for IDDM.

Earlier diagnosis of IDDM could be of considerable benefit if treatment could arrest the disease process before severe insulinopenia and hyperglycemia had developed. A number of recent trials have examined whether immunosuppressive agents can delay disease progression in patients with new-onset IDDM. 61 Although some patients have experienced prolonged remissions, the benefit has not been sustained in most patients, and the serious adverse effects of immunosuppressive agents are likely to preclude their use in completely asymptomatic persons. There have been several promising small trials of other interventions to prevent IDDM in high-risk asymptomatic persons, enrolling individuals identified by autoantibody levels and other physiologic measures. 63,119,120 Multicenter randomized clinical trials are currently underway to determine whether prophylactic regimens of insulin or nicotinamide can prevent progression to IDDM in such high-risk subjects. 61

Recommendations of Other Groups

The Canadian Task Force on the Periodic Health Examination (CTF), 121 the American College of Physicians (ACP), 122 and the American Academy of Family Physicians (AAFP) 123 recommend against routine screening for diabetes among asymptomatic nonpregnant adults; each of these organizations concluded that selective screening may be reasonable among individuals at high risk of developing diabetes (e.g., older obese persons, those with a strong family history). AAFP policy is currently under review. The ADA recommends screening all individuals with a careful history and measuring fasting glucose in those with identified risk factors for developing diabetes, including obesity, family history, history of GDM, selected medical conditions, or selected ethnic background. 124 A 1994 report of the WHO concluded that population screening for NIDDM was not justified, but that opportunistic screening of high-risk persons may be useful to permit earlier intervention. 125

The ACP, 122 the ADA, 124 and the Third International Workshop Conference on Gestational Diabetes 27 recommend universal screening for GDM in pregnant women between weeks 24 and 28 using a 1-hour glucose challenge test. The American College of Obstetricians and Gynecologists and the American Academy of Pediatrics do not recommend universal screening in pregnancy but strongly recommend screening pregnant women in certain high-prevalence populations (e.g., Native Americans) and those with specific risk factors (age over 30, family history of diabetes, previous macrosomia, malformed or stillborn infants, hypertension, or glucosuria). 126,127 The CTF concluded that there was insufficient evidence to recommend for or against universal screening for GDM, but suggested close monitoring of women with risk factors for GDM. 121

Discussion

Screening for diabetes in asymptomatic adults suffers from two important limitations: the lack of a practical screening test that is both sensitive and specific, and insufficient evidence that detection of diabetes in the asymptomatic period significantly improves long-term outcomes. Even if improving glucose control can reduce long-term complications of NIDDM, many other factors must be considered in determining the likely benefits and risks of screening in asymptomatic persons: the efficacy of diet or medications in reducing glucose levels; compliance of asymptomatic persons with lifestyle advice; possible risks of drug or insulin therapy; the inconvenience and costs of screening, follow-up, and treatment; and the potential adverse effects of screening (false-positive diagnoses, "labeling" of asymptomatic persons). Targeting screening to high-risk groups (certain ethnic populations, older overweight subjects) and emphasizing interventions that are inexpensive and safe (exercise, prudent diet, and weight loss) are likely to minimize the potential adverse effects of screening. Since most of these interventions are recommended for all adults, the additional benefit of screening to promote lifestyle interventions remains uncertain. If the ongoing UKPDS trial demonstrates important clinical benefits from more intensive interventions (i.e., drug or insulin therapy) in patients with minimally symptomatic NIDDM, this would provide stronger support for screening for diabetes among asymptomatic adults.

The value of widespread screening for GDM is also unproven. Important questions remain about the diagnostic gold standard, the optimal screening test, and the appropriate management of GDM. Although there is good evidence that insulin treatment can reduce the incidence of macrosomia in GDM, evidence of a benefit on clinically important perinatal outcomes (birth trauma, operative delivery, neonatal metabolic derangements, or perinatal mortality) is much weaker. The high risk associated with GDM in earlier cohorts primarily reflects adverse outcomes in women who were older, overweight, or otherwise at increased risk. Universal screening is likely to have only a small impact on the overall incidence of macrosomia and birth trauma and may subject many low-risk women to the inconvenience, costs, and possible risks of follow-up testing, dietary restriction, or insulin management. A 1988 study estimated that universal screening would cost $8,000 per case of macrosomia prevented. 122 By one estimate, however, up to 10,000 women would need to be screened to prevent 50 cases of macrosomia, 6 cases of shoulder dystocia, and 1 case of shoulder girdle injury (few of which cause lasting problems). 128 Targeting screening to women with risk factors for GDM (including older age), with emphasis on dietary management of GDM, is likely to minimize the adverse effects and costs of screening. Direct evidence of a benefit of screening on important clinical outcomes is not available for any group, however.
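Put in number-needed-to-screen terms (simple arithmetic on the estimate just quoted), screening 10,000 women to prevent 50 cases of macrosomia works out to roughly 200 women screened per case of macrosomia prevented, and about 1,700 per case of shoulder dystocia averted.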

Immune markers are not sufficiently specific to recommend their use in the general population at this time. Screening persons with a family history of IDDM using immune markers and physiologic measurements can identify a small number of persons at very high risk of developing IDDM. Patients with a family history account for only 10% of all cases of IDDM, however, and trials of interventions to prevent IDDM in high-risk patients have not yet been completed.

Primary prevention may be a more effective means to reduce diabetes-associated morbidity than widespread screening. Diet, exercise, and weight reduction can safely improve glucose tolerance and are likely to have independent benefits on other important chronic diseases (see Chapters 55 and 56). Whether diabetes screening improves compliance with generally recommended lifestyle interventions has not been determined.

CLINICAL INTERVENTION

There is insufficient evidence to recommend for or against routine screening for NIDDM in nonpregnant adults ("C" recommendation). Although evidence of a benefit of early detection is not available for any group, clinicians may decide to screen selected persons at high risk of NIDDM on other grounds, including the increased predictive value of a positive test in individuals with risk factors and the potential (although unproven) benefits of reducing asymptomatic hyperglycemia through diet and exercise. Individuals at higher risk of diabetes include obese men and women over 40, patients with a strong family history of diabetes, and members of certain ethnic groups (Native Americans, Hispanics, African Americans). In persons without risk factors, screening for asymptomatic disease is much less likely to be of benefit, due to the low burden of disease and the poor predictive value of screening tests in low-risk persons. Measurement of fasting plasma glucose is recommended by experts as the screening test of choice; the frequency of screening is left to clinical discretion.

There is also insufficient evidence to recommend for or against routine screening for GDM ("C" recommendation). Although a clear benefit of screening on perinatal morbidity has not been demonstrated for any group, clinicians may decide to screen high-risk pregnant women on other grounds, including the higher burden of disease and the potential clinical benefits from reducing macrosomia due to GDM. Risk factors for GDM include obesity, older maternal age, a family history of diabetes, and a history of macrosomia, fetal malformation, or fetal death. The 1-hour 50 g glucose challenge test, with confirmation of abnormal results by a 3-hour 100 g oral glucose tolerance test, is the screening test recommended by expert panels in the U.S.

Screening with immune markers to identify asymptomatic individuals at risk for developing IDDM is not recommended in the general population ("D" recommendation).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by M. Carrington Reid, MD, PhD, Harold C. Sox, Jr., MD, Richard Comi, MD, and David Atkins, MD, MPH.

20. Screening for Thyroid Disease

Burden of Suffering

Hyperthyroidism and hypothyroidism together account for considerable morbidity in the U.S. The total prevalence of these two disorders in adolescents and adults is estimated to be 1-4%; prevalence is higher in women and persons with Down syndrome, and increases with increasing age. 1-10 The annual incidence in adults has been estimated to be 0.05-0.1% for hyperthyroidism and 0.08-0.2% for hypothyroidism, with the higher cited incidences occurring in elderly women. 2,11 In adolescents, an incidence of 0.06%/year for these two disorders has been reported. 5 Symptoms of thyroid dysfunction, involving the nervous, cardiovascular, and gastrointestinal systems, may have an important impact on health and behavior. 12 Rarely, fatalities may occur due to thyroid storm in hyperthyroidism and myxedema coma in hypothyroidism. 13 Thyroid dysfunction during pregnancy is associated with an increased risk of adverse maternal and fetal outcomes. 14-18 Most patients with thyroid dysfunction will present with typical clinical symptoms and signs within a few months of disease onset, although overt disease may occasionally be overlooked. Studies in which asymptomatic adults of all ages were screened in the clinical setting, often using older, less sensitive tests, detected previously unsuspected thyroid disease in 0.6-0.9% of persons screened. 2,3,19,20

The clinical diagnosis of thyroid dysfunction can be more difficult in certain high-prevalence populations, including the elderly, those with Down syndrome, and postpartum women, possibly delaying treatment and risking complications. 13 Older persons may experience apathetic hyperthyroidism, without the goiter, ophthalmopathy, and signs of sympathetic nervous system hyperactivity typically seen in younger persons. 21 Typical symptoms and signs of hypothyroidism, such as fatigue, constipation, dry skin, and poor concentration, may be confused with symptoms of aging, 22 and may also occur less frequently in elderly hypothyroid patients. 23 Screening persons 60 years or older in the clinical setting detects previously unsuspected hyperthyroidism in 0.1-0.9% and hypothyroidism in 0.7-2.1%. 2,19,24-27 The clinical diagnosis of hypothyroidism may also be overlooked in patients with Down syndrome because some symptoms and signs, such as slow speech, thick tongue, and slow mentation, are typical findings of both conditions. 6,7 Screening persons with Down syndrome has detected previously unrecognized thyroid disease, primarily hypothyroidism, in 2.9% (range 0-6.5%). 6,7,28 Screening reveals thyroid dysfunction, primarily thyroiditis, in 4-6% of postpartum women. 29-32 This dysfunction is sometimes accompanied by nonspecific symptoms, such as fatigue, palpitations, impaired concentration, or depression, 31,33,34 that may be mistakenly attributed to the postpartum condition. Women with a personal or family history of thyroid or autoimmune disease, or with thyroid antibodies, are at increased risk for postpartum thyroid dysfunction. 16,30,35-37 Postpartum thyroid dysfunction is usually transient, but it may require short-term treatment to control symptoms.

Subclinical thyroid dysfunction as typically defined in the literature is a biochemical abnormality, 38 characterized by an abnormal level of thyroid-stimulating hormone (TSH) with otherwise normal thyroid tests and no clinical symptoms. Subclinical hypothyroidism, recognized by an elevated TSH level, is seen in 6-8% of adult women and 3% of adult men. 1 As with overt disease, the prevalence is higher in the elderly and in persons with Down syndrome. 6,7,24,25,27,28,39-41 Progression to overt hypothyroidism occurred in <= 2% of patients who had subclinical hypothyroidism without evidence of thyroid autoimmunity or prior thyroid-related disorders and who were followed for 2-15 years; in those with thyroid antibodies, however, progression occurs in about 5-7%/year, and in as many as 20-24%/year in elderly patients with antibodies. 25,28,39,42-45 Other than the risk of developing overt hypothyroidism, the importance of subclinical hypothyroidism is unknown. Case series have suggested adverse effects of an isolated elevated TSH level on blood lipid profile, myocardial function, and neuropsychiatric function, 22,46-48 but controlled observational studies have reported conflicting evidence regarding an association between subclinical hypothyroidism and any of these adverse effects. 49-54
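As a side note on the progression rates quoted above, annual rates of this kind compound quickly (an illustrative calculation, not from the cited studies): at 5%/year, the cumulative probability of progression over 10 years is 1 - 0.95^10, or about 40%, which is why antibody-positive patients are regarded as a distinctly higher-risk group.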

Subclinical hyperthyroidism, recognized by a subnormal TSH level, is seen in 0.2-5% of the elderly population; <= 1%/year progress to overt disease. 11,40,55,56 Subnormal levels of TSH are often transient, returning to normal without intervention. 3,25,57 There is limited evidence of risk from subclinical hyperthyroidism except when it is due to excessive thyroxine replacement. Case series have reported subclinical hyperthyroidism in a number of patients with atrial fibrillation. 58-61 Older controlled observational studies found no association between subclinical hyperthyroidism and atrial fibrillation, 62,63 but a significant association was reported in one carefully controlled cohort study that used a sensitive TSH assay. 57 One controlled study reported a significantly lower total cholesterol level in patients with a subnormal TSH level, 51 suggesting a possible benefit of this condition.

Accuracy of Screening Tests

Thyroid function tests to detect thyroid disease, including total thyroxine (TT4), free thyroxine (FT4), and TSH, are influenced by a variety of diagnostic and biologic factors that may affect their accuracy. For example, while TT4 is usually elevated in hyperthyroidism, it misses the 5% of cases that are due to triiodothyronine (T3) toxicosis. 64 TT4 concentration is strongly influenced by the concentrations and binding affinities of thyroxine-binding globulin and other thyroid-binding proteins. 38 Falsely abnormal TT4 results often occur with conditions that affect these proteins, such as pregnancy, use of certain drugs, and nonthyroidal illness. 38,64 FT4 has the advantage over TT4 of being independent of thyroid-binding protein concentrations. Equilibrium dialysis (ED), regarded as the reference method for FT4, is not suitable for routine screening due to its high cost. 38,65 Immunoassay or free thyroxine index (FTI) methods to estimate FT4 are simpler and less expensive than ED, and have specificities of 93-99% compared to ED; 20,66-68 these methods are not always independent of thyroid-binding protein concentrations, however, 38,69 and they may show substantial interlaboratory variation. 65 TT4 and FT4 cannot be reliably measured in ill patients, because a substantial proportion will have abnormal thyroid function in the absence of true thyroid disease, due to "sick euthyroid syndrome." 69-71 Screening with TT4 or FT4 will generate many false-positive results in healthy populations. 2,19,20,71 With test specificities in the mid-90% range or lower, the low prevalence of previously unsuspected thyroid disease means that the likelihood of disease given an abnormal test will be quite low. In one study, thyroid disease requiring treatment was found in only 13% of those with abnormal FTI results. 20 Because TT4 and FT4 are normal by definition in subclinical thyroid dysfunction, they are not useful as screening tests for this condition.

The immunometric ("sensitive") TSH (sTSH) assays detect low as well as high serum TSH levels, and have become the standard for detecting hyperthyroidism and hypothyroidism. They therefore offer promise as first-line thyroid screening tests. In unselected populations, sTSH has a sensitivity of 89-95% and specificity of 90-96% for overt thyroid dysfunction, as compared to reference standards incorporating clinical history, examination, repeat measurement, and/or additional testing including thyrotropin-releasing hormone tests. 3,40,68 In an asymptomatic older population, the likelihood of thyroid disease given an abnormal sTSH was only 7%, however, reflecting the low prevalence of disease in healthy people. 40 Acutely ill patients, pregnant women, and persons using certain drugs such as glucocorticoids may have false-positive sTSH results, 38,72,73 although specificity is better than for TT4 and FT4 when the three tests have been directly compared. 38,74 Newer sTSH assays reduce but do not eliminate false-positive diagnoses in such patients. 75-77 sTSH may respond slowly to abrupt changes in thyroid function, 38 such as those that occur after treatment for hyperthyroidism, but such changes are not generally relevant to the screening of asymptomatic patients.

Effectiveness of Early Detection

Screening for occult thyroid dysfunction in adults would be valuable if there were clinical benefits of early treatment, including relief of previously unrecognized symptoms. We found no studies evaluating the treatment of hyperthyroidism detected by screening in asymptomatic persons, or of subclinical hyperthyroidism in persons with atrial fibrillation. Uncertainties about the benefits of treating hyperthyroidism detected by screening are particularly important because of the costs and potential adverse effects (e.g., agranulocytosis, induced hypothyroidism, surgical complications) of treatment with antithyroid medications, radioactive iodine ablation, or subtotal thyroidectomy. 78,79

Several studies have evaluated the effectiveness of treating patients with subclinical hypothyroidism. Most of the subjects had previously identified thyroid disease, however, and the results may not apply to asymptomatic patients identified only by screening. In a randomized placebo-controlled trial of 33 women with subclinical hypothyroidism, all with a past history of treated hyperthyroidism, there were significant improvements in myocardial contractility and in previously unrecognized symptoms, but no significant changes in basal metabolic rate, pulse, body weight, skin texture, or serum lipid levels. 80 The long-term clinical importance of subtle changes in myocardial contractility is unknown. An uncontrolled experiment in 17 women identified by screening found a significantly improved mean clinical symptom score after treatment, mixed effects on myocardial function, and no effect on cholesterol, resting heart rate, body mass, or blood pressure. 81 Methodologic flaws make it difficult to interpret the results of this study. Uncontrolled experiments in adult patients, mostly women, with subclinical hypothyroidism due to previously identified thyroid disease have reported variable improvement in myocardial function but little effect on lipoproteins with thyroxine treatment. 82-89 A randomized controlled trial measuring the effects of thyroid replacement on quality of life, lipids, neuropsychological function, bone mineral density, and myocardial function in elderly patients with subclinical hypothyroidism is ongoing (personal communication, Dr. R. Jaeschke, St. Joseph's Hospital, Hamilton, Ontario, August 4, 1995).

In children and adults with Down syndrome and subclinical hypothyroidism, a double-blind crossover placebo-controlled trial failed to document any cognitive, social, or physical changes attributable to 8-14 weeks of thyroxine treatment, 41 although treatment duration may have been inadequate to effect change. There is otherwise little evidence regarding the benefits of early intervention in these individuals.

Thyroxine replacement therapy can have adverse effects with even moderate degrees of overtreatment (as detected by low TSH or high TT4 levels), including decreased bone density compared to matched controls. 90-93 Reduced bone density could increase the risk of fractures in the elderly, but one large series found no significant difference in risk for fractures (or for ischemic heart disease) in treated patients with normal TSH levels compared to those with suppressed TSH due to overtreatment. 94 The fracture rate in the two groups was the same as in the general population, while the risk of ischemic heart disease was higher in treated patients irrespective of TSH levels. The study was not designed to determine whether the latter finding was due to treatment or to the underlying disease, however. A small randomized controlled trial in postmenopausal women with subclinical hypothyroidism found no reduction in bone density after 14 months of appropriate thyroxine treatment. 95 The evidence therefore argues against an adverse impact of appropriately dosed thyroxine treatment.

Recommendations of Other Groups

No organizations recommend routine screening for thyroid disease in the general population, except screening newborns for congenital hypothyroidism (see Chapter 45). The American Academy of Family Physicians (AAFP) 96 and the American Association of Clinical Endocrinologists 97 recommend measuring thyroid function periodically in all older women. The policy of the AAFP is currently under review. The Canadian Task Force on the Periodic Health Examination recommends maintaining a high index of clinical suspicion for nonspecific symptoms consistent with hypothyroidism when examining perimenopausal and postmenopausal women. 98 The American College of Physicians recommends screening women over age 50 with one or more general symptoms that could be caused by thyroid disease. 99 The American College of Obstetricians and Gynecologists recommends that physicians and patients be aware of the symptoms and risk factors for postpartum thyroid dysfunction, and evaluate patients when indicated. 100 The American Academy of Pediatrics recommends that children with Down syndrome have thyroid screening tests at 4-6 and 12 months of age, and annually thereafter. 101 The American Thyroid Association recommends screening thyroid function in elderly patients, postpartum women, and all patients with autoimmune disease or with a strong family history of thyroid disease, using serum TSH measurement. 64,102

Discussion

The prevalence of unsuspected thyroid disease in healthy people in the general population is very low. Despite the high specificity of thyroid function tests such as the newer TSH assays, their routine use in the asymptomatic general population results in many false-positive results. Because of the low prevalence of unsuspected disease, only 1 in 5-10 persons with abnormal screening tests will prove to have thyroid disease. Given the low risk, the lack of evidence that treatment of subclinical thyroid disease identified by screening results in important health benefits, and the potential adverse effects of treatment, screening the asymptomatic general population is not recommended.
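The Bayes arithmetic behind this figure is straightforward (illustrative numbers only): with an unsuspected-disease prevalence of about 1%, a test sensitivity of 90%, and a specificity of 95%, the positive predictive value is (0.01 x 0.90) / (0.01 x 0.90 + 0.99 x 0.05), or roughly 0.15 -- about 1 true case for every 7 abnormal results, in line with the 1-in-5-to-10 range cited above.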

The prevalence of thyroid disease is higher in certain populations, including elderly persons (particularly women), persons with Down syndrome, and postpartum women, and these patients might be candidates for thyroid function testing if the results could provide an explanation for nonspecific and insidious symptoms, such as fatigue, memory impairment, or depression, that might otherwise be attributed mistakenly to other medical or psychiatric causes. Clinicians examining high-risk patients should therefore maintain a high index of suspicion for thyroid disease when such nonspecific symptoms are present. There is, however, little evidence that routinely screening high-risk patients results in important clinical benefits.

CLINICAL INTERVENTION

Routine screening for thyroid disease with thyroid function tests is not recommended for asymptomatic children or adults ("D" recommendation). This recommendation does not mean that clinicians should not monitor thyroid function in patients with a previous history of thyroid disease. There is insufficient evidence to recommend for or against screening for thyroid disease with thyroid function tests in high-risk patients, including elderly persons, postpartum women, and persons with Down syndrome, but recommendations may be made on other grounds, such as the higher prevalence of disease and the increased likelihood that symptoms of thyroid disease will be overlooked in these patients ("C" recommendation). Clinicians should remain alert for subtle or nonspecific symptoms of thyroid dysfunction when examining such patients, and maintain a low threshold for diagnostic evaluation of thyroid function. Examples of such symptoms include easy fatiguability, weight gain, dry skin or hair, cold intolerance, difficulty concentrating, depression, nervousness, and palpitations. If screening is performed, the preferred test is measurement of thyroid-stimulating hormone (TSH) using a sensitive immunometric or similar assay, because of its superior sensitivity and specificity. Screening for congenital hypothyroidism is discussed in Chapter 45.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Carolyn DiGuiseppi, MD, MPH.

21. Screening for Obesity

Burden of Suffering

Obesity is an excess of body fat. 1 Most epidemiologic studies rely on indices of relative weight, such as the body mass index (BMI), which normalizes body weight for height, to estimate the prevalence of obesity. [a] For example, the National Center for Health Statistics currently uses the 85th percentile sex-specific values of BMI for persons aged 20-29 (>=27.8 kg/m2 for men and >=27.3 kg/m2 for women) from the second U.S. National Health and Nutrition Examination Survey (NHANES II) as a cutoff to define overweight in adults. 2

Approximately one third of adult Americans aged 20 and older are estimated to be overweight, based on data from NHANES III. 3 Using 1990 census figures, this corresponds to 58 million people. The prevalence of overweight in the United States has increased dramatically during the past 15 years in men and women of all age and ethnic groups, and remains disproportionately high among black and Hispanic women. 3 Other groups that have a high prevalence of obesity include Asian and Pacific Islanders, Native Americans and Alaska Natives, and Native Hawaiians. 4 The prevalence of overweight among adolescents has also increased. 5 Based on NHANES III data, about one fifth of adolescents aged 12-19 are overweight. 5 The prevalence of obesity among younger children is uncertain, but is estimated to be between 5% and 25%, 6 and may also be increasing. 7

Increased mortality in adults has been clearly documented as a result of morbid obesity, defined as weight at least twice the desirable weight. 8,9 Less severe obesity (e.g., BMI as low as 26.4-28.5 kg/m2) has also been associated with increased mortality in large prospective cohort studies. 10-13 Although some studies have reported greater mortality among the thinnest individuals, 14 a 1993 prospective cohort study that carefully controlled for smoking and illness-related weight loss found a linear relationship between BMI and mortality. 15 Two cohort studies suggest that overweight children and adolescents may have increased mortality as adults. 16,17 Childhood obesity may be a significant risk factor for adult obesity, with adolescent obesity being a better predictor than obesity at younger ages. 8,18,19

Persons who are overweight are more likely to have adult-onset diabetes, hypertension, and risk factors for other diseases. 8,20 The prevalence of diabetes and hypertension is 3 times higher in overweight adults than in those of normal weight. 21 Observational studies have established a clear association between overweight and hypercholesterolemia and suggest an independent relationship between overweight and coronary artery disease. 8,10,11,20-23 Being overweight has also been associated with several cardiovascular risk factors in children and adolescents, including hypercholesterolemia and hypertension. 24-26 An elevated waist/hip circumference ratio (WHR), which may indicate central adiposity, has been shown to correlate with the presence of these conditions independent of BMI, 27-35 and may predict the complications of obesity in adults better than BMI does. 35,36 Obesity has also been associated with an increased risk of certain cancers (including those of the colon, rectum, prostate, gallbladder, biliary tract, breast, cervix, endometrium, and ovary), and with other disorders such as cholelithiasis, obstructive sleep apnea, venous thromboembolism, and osteoarthritis. 8,20,37 Finally, obesity can affect the quality of life by limiting mobility, physical endurance, and other functional measures, 8 as well as through social, academic, and job discrimination. 38-40


[a] Overweight refers to an excess of body weight relative to height that includes all tissues and therefore may reflect varying degrees of adiposity. Despite the distinction between obesity and overweight, the majority of overweight persons are also obese, and these terms tend to be used interchangeably in the medical literature.

Accuracy of Screening Tests

Extremely overweight individuals can be identified easily in the clinical setting by their physical appearance. More precise methods may be necessary, however, to evaluate persons who are mildly or moderately overweight. The complications of obesity occur among those with elevated body fat composition, which is most accurately measured by underwater (hydrostatic) weighing, isotopic dilution measures, and other sophisticated techniques that are not suited to clinical practice. 41 Bioelectric impedance, which provides an estimate of total body water from which the percentage of body fat can be calculated, is not widely available in clinical practice. This method has been reviewed elsewhere. 42

The most common clinical method for detecting obesity is the evaluation of body weight and height based on a table of suggested or "desirable" weights (e.g., 43-45). These tables generally reflect the weight at which mortality is minimized, and they only approximate the extent of fatness. The criteria for healthy body weight are a matter of controversy among experts and vary considerably as presented in different weight-for-height tables. 46,47 Weights for children and adolescents are typically evaluated in relation to average weight for age, height, and gender. This information can be obtained from growth charts that are based on percentile distributions of body size attained at specific ages. 2 An alternative to weight-for-height tables or growth charts is the BMI, a weight-height index calculated by dividing the body weight in kilograms by the square of the height in meters (kg/m2). The BMI is easily calculated, is highly reliable, 48 and has a correlation of 0.7-0.8 with body fat content in adults. 49-52 BMI also correlates with body fat content in children and adolescents. 50,51,53 In adults, overweight has been defined by the National Center for Health Statistics as a BMI >= 27.8 for men and >= 27.3 for women (the 85th percentile values for persons aged 20-29 in NHANES II); 2 a BMI at this level has been associated with increased risk of morbidity and mortality. 8 In adolescents, a BMI exceeding the 85th percentile for age and gender has been suggested as one definition for overweight 2 or for those at risk of overweight. 54

Other anthropometric methods that may be useful in the clinical setting include the measurement of skinfold thickness and the indirect assessment of body fat distribution. Skinfold thickness is a more direct measure of adiposity than BMI and correlates well with body fat content in both adults and children, but this technique requires training and has lower intra- and interobserver reliability than the height and weight measurements used to calculate BMI. 55,56 The WHR -- the circumference of the waist divided by the circumference of the hips -- may be a better predictor of the sequelae associated with adult obesity than BMI and can also be measured in the clinical setting. The reliability of the WHR is comparable to that of BMI. 57 A WHR greater than 1.0 in men and 0.8 in women has been shown to predict complications from obesity, independent of BMI, 36 although the WHR has not been evaluated in all ethnic groups.
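A minimal computational sketch of the two indices described above may help; the function names and example values are ours (hypothetical), while the cutoffs are the NCHS and WHR thresholds quoted in the text.

    # Body mass index: weight in kilograms divided by the square of height in meters.
    def bmi(weight_kg, height_m):
        return weight_kg / height_m ** 2

    # Waist/hip ratio: waist circumference divided by hip circumference
    # (any unit, as long as both measurements use the same one).
    def whr(waist, hip):
        return waist / hip

    # NCHS adult overweight cutoffs (85th percentile values from NHANES II).
    def overweight(weight_kg, height_m, male):
        return bmi(weight_kg, height_m) >= (27.8 if male else 27.3)

    # Hypothetical example: a 97 kg, 1.80 m man has BMI ~29.9 -> overweight.
    print(round(bmi(97, 1.80), 1), overweight(97, 1.80, male=True))
    # A WHR above 1.0 in men (0.8 in women) predicts obesity complications.
    print(round(whr(102, 98), 2))   # ~1.04, elevated for a man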

Effectiveness of Early Detection

The purpose of screening for obesity is to assist the obese individual to lose or at least maintain weight and thereby prevent the complications of obesity. Such screening may also assist with counseling other patients regarding maintaining a healthy weight. Most studies of interventions for obesity involve subjects who are overweight; we found no studies evaluating interventions for persons identified solely on the basis of an elevated WHR. Although there is little evidence from prospective studies that weight loss by obese individuals improves their longevity, there is evidence that obesity is associated with increased mortality 8-13 and that weight loss in obese persons reduces important risk factors for disease and mortality. 8,58 Prospective cohort studies 59,60 and randomized clinical trials 61-64,66 have demonstrated that caloric restriction or weight loss reduces systolic and diastolic blood pressures as well as the requirements for antihypertensive medication in obese adults with hypertension. These effects were independent of sodium restriction. In controlled 67 and uncontrolled trials 68,69 of low-calorie diets in obese diabetic patients, weight reduction was associated with improved glycemic control and reduced need for oral hypoglycemic agents and insulin. Weight loss generally improves the blood lipid profile 70-72 and can reduce symptoms related to obstructive sleep apnea. 73,74 To benefit from the detection of obesity, however, patients must be motivated to lose weight, must have access to an efficacious method of reducing body weight, and must maintain the resulting weight loss.

Various weight-reducing regimens are available, but many have only short-term efficacy and fail to achieve long-term weight loss. 6,9,75,76 Research to explain the difficulty in achieving long-term weight loss is ongoing. One theory is that obesity is related to an internal "set-point" that maintains excess body fat in certain individuals. 77 Some evidence suggests that energy expenditure decreases to compensate for reduced body weight, 78 which would tend to return body weight to the usual weight. Such a decrease in energy expenditure could contribute to the failure of most weight-reducing regimens to achieve long-term benefits.

Dietary modification is the most commonly used weight-loss strategy, and can achieve weight reduction over the short-term in both adults and children. 6,76,79 Very-low-calorie diets (< 800 kcal/day), which have been used for moderately to severely obese adults who have failed more conservative approaches, 80 produce greater short-term weight loss than standard low-calorie diets of 1,000-1,500 kcal/day. 76,79,80 Long-term results, however, are similar with both types of programs: the majority of participants eventually return to their pre-treatment weight within 5 years, 76,79 although sustained weight loss may be achieved by some patients. 81-85 Cohort studies and randomized controlled trials of behavioral modification, often combined with dietary therapy, have shown modest long-term benefits in adults 86-88 and children. 89,90 The results of the intensive dietary and behavioral interventions evaluated in these studies may not necessarily be applicable to the type of counseling likely to be given in a busy clinical primary care practice, and referral to other qualified providers or to qualified weight-management programs 1 may be necessary to achieve similar results. The amount of weight loss that can be achieved with exercise, either alone or in combination with other methods, is relatively limited in adults 76,91-93 and children, 94-96 but physical activity may be beneficial in maintaining weight loss 76,92,97,98 and reducing the WHR 92,99 in adults. Numerous randomized clinical trials have shown that various appetite-suppressant drugs can be effective in producing short-term weight loss in adults. 100-109 The effects, however, are limited to periods when the drug is taken, and some studies have shown a plateauing or gradual regain of weight with prolonged use. 76,100,101,104,105,107,110,111 Surgical techniques such as vertical band gastroplasty and gastric bypass may benefit selected adults who are morbidly obese, 112-114 but other procedures such as intragastric balloon insertion have not been shown to be effective. 115-118

Certain weight reduction methods may cause important adverse effects. Very-low-calorie diets can cause fatigue, hair loss, dizziness, and symptomatic cholelithiasis. 76,119 Pharmacologic agents may cause palpitations, dizziness, insomnia, headache, and gastrointestinal discomfort. 120 Surgical therapies such as gastroplasty and balloon insertion can lead to gastric ulceration, perforation, and bowel obstruction. 121 Some cohort studies have reported that weight change or fluctuation in weight (weight cycling) among adults is associated with increased cardiovascular morbidity and mortality, but a review by the National Task Force on the Prevention and Treatment of Obesity concluded there is insufficient evidence that weight cycling is associated with adverse effects. 122 There is conflicting evidence regarding the potential adverse effects of caloric restriction and weight loss on growth velocity and development in obese children and adolescents. 123-127

Recommendations of Other Groups

The American Academy of Family Physicians, 128 the American Heart Association, 129 the Institute of Medicine, 130 the American Academy of Pediatrics, 131 the Bright Futures guidelines, 132 and the American Medical Association guidelines for adolescent preventive services (GAPS) 133 all recommend measurement of height and weight as part of a periodic health examination for patients. Bright Futures and GAPS also recommend the determination of BMI for all adolescents. 132,133 The Canadian Task Force on the Periodic Health Examination concluded that there is insufficient evidence to recommend the inclusion or exclusion of height, weight, BMI, or skinfold measurement to screen for obesity in a routine health examination of either children or adults. 134 The Canadian Task Force does, however, recommend measuring and plotting the height and weight of infants and children in order to identify those who are failing to thrive.

Discussion

Evidence is limited that screening for obesity and implementing weight-reducing or weight maintenance strategies are effective in decreasing long-term morbidity and mortality. This is unlikely to improve in the near future due to the difficulty and cost of conducting controlled trials of weight loss with these outcome measures and of separating the effect of obesity from that of other risk factors. An additional obstacle is the low rate of long-term success in maintaining weight loss. Obesity is a chronic disorder that requires continuing treatment, which could explain the failure of short-term interventions in achieving long-term success. Although losing weight has not been proven to reduce morbidity and mortality, it is clear that weight loss reduces an individual's risk for major chronic diseases such as hypertension and coronary artery disease, and it also improves the management of both hypertension and diabetes. Periodic height and weight measurements are inexpensive, rapid, reliable, and require minimal training to perform. They may also be useful for the detection of medical conditions causing unintended weight loss or weight gain, such as cancer or thyroid disorders, and the detection of growth abnormalities in childhood. Once height and weight have been determined, the BMI or standard height and weight tables may be used as a means of evaluating adolescents and adults for obesity. In addition, determination of the WHR may be useful for assessing some adults, particularly those whose weight or BMI is borderline for classification as overweight and who have personal or family medical histories placing them at increased health risk. There are inadequate data to determine the optimal frequency of obesity screening, and this is best left to clinical discretion.

CLINICAL INTERVENTION

Periodic height and weight measurements are recommended for all patients ("B" recommendation). In adults, BMI (body weight in kilograms divided by the square of height in meters) or a table of suggested weights (e.g., 43-45) may be used, along with the assessment of other factors such as medical conditions or WHR, as a basis for further evaluation, intervention, or referral to specialists. In adolescents, a BMI exceeding the 85th percentile for age and gender may be used as a basis for further assessment, treatment, or referral. 54 The height (or length if appropriate) and weight of infants and children may be plotted on a growth chart (e.g., 2) or compared to tables of average weight for height, age, and gender to determine the need for further evaluation, treatment, or referral. The optimal frequency for measuring height and weight in the clinical setting has not been evaluated and is a matter of clinical discretion. There is insufficient evidence to recommend for or against determination of the WHR as a routine screening test for obesity ("C" recommendation).
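For readers who wish to operationalize the BMI arithmetic described above, a minimal sketch follows (Python; the function name and example values are illustrative and are not part of the guideline, and evaluating the adolescent 85th-percentile criterion would additionally require the age- and gender-specific reference tables cited in the text):

    def body_mass_index(weight_kg, height_m):
        # BMI is body weight in kilograms divided by the square of height in meters
        return weight_kg / height_m ** 2

    # Example: an adult 1.75 m tall weighing 85 kg
    print(round(body_mass_index(85.0, 1.75), 1))  # 27.8 kg/m^2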

All patients should receive appropriate counseling to promote physical activity (see Chapter 55) and a healthy diet (see Chapter 56).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Barbara Albert, MD, MS, and Carolyn DiGuiseppi, MD, MPH, based in part on background papers written for the Canadian Task Force on the Periodic Health Examination by James Douketis, MD, William Feldman, MD, FRCPC, and Brenda Beagen, MA.

22. Screening for Iron Deficiency Anemia --- Including Iron Prophylaxis

Burden of Suffering

Anemia is defined by the presence of a hemoglobin level that is below the normal range of values for the population (see Accuracy of Screening Tests). 1-4 U.S. populations with a high prevalence of anemia include blacks, Alaska Natives and Native Americans, immigrants from developing countries, and individuals of low socioeconomic status. 5-7 Anemia may be due to a variety of underlying conditions. Iron deficiency is an important cause among young children and women of reproductive age in the U.S. 8-10 The prevalence of iron deficiency anemia in U.S. children has declined in recent years 6,11-14 and in 1993 was estimated to be at or below 3% for children aged 1-5. 15 In low-income populations and certain ethnic groups, such as Alaska Natives, the prevalence of iron deficiency anemia in children <5 years of age may be substantially higher, ranging from 10-30% (unpublished data, Centers for Disease Control and Prevention, 1993). 6 Among middle-class children, on the other hand, anemia is uncommon and tends to be mild (i.e., within 1% of the hematocrit level defining anemia). 13,14 The exact prevalence of iron deficiency anemia among pregnant women is uncertain, but national data suggest that <2% of nonpregnant women aged 20-44 years have iron deficiency anemia. 5 Among low-income pregnant U.S. populations, a low hemoglobin level and/or low hematocrit is present in 6% of white women and 17% of black women during the first trimester, and in 25% of white women and 46% of black women during the third trimester. 7 The high rates of anemia in pregnant women may not be attributable to iron deficiency, however. In a large cohort of urban, low-income, mostly minority pregnant women, only 12.5% of anemic women were iron deficient. 16

As early as the 1960s, researchers demonstrated that, in general, decreased hemoglobin alone does not have readily apparent adverse effects unless it is below 10 g/dL (100 g/L). 17-19 Clearly, persons with markedly reduced hemoglobin levels are at risk for cardiopulmonary and other complications. Reduced work productivity, endurance, and exercise capacity have been associated with anemia or iron deficiency anemia in adults, most of whom were from developing countries. 20-26 Iron deficiency and iron deficiency anemia during infancy and early childhood have been associated with abnormal infant behavior, growth, and development, although it is unclear how much of this association is actually attributable to other factors often associated with iron deficiency (e.g., poor nutrition, low socioeconomic status). 27-35 Hemoglobin levels well below what is considered normal for pregnancy have been associated with increased risk of low birth weight, preterm delivery, and perinatal mortality. 16,36-42

Accuracy of Screening Tests

The hemoglobin concentration and hematocrit are the principal screening tests for detecting anemia. The World Health Organization hemoglobin cut-points for diagnosing anemia in adults have been widely adopted: men, <13 g/dL; menstruating women, <12 g/dL; pregnant women, <11 g/dL. 1 The Centers for Disease Control and Prevention (CDC) has also produced criteria for anemia: in infancy and childhood, <11 g/dL for 0.5-4.9 years and <11.5 g/dL for 5.0-11.9 years; in pregnancy, <11 g/dL during the first and third trimesters and <10.5 g/dL in the second trimester. 2 Studies have shown that automated electronic cell counters and chemical analyzers provide accurate and reliable data on red blood cell number and size and on the concentration of hemoglobin. 43,44 Although sampling of capillary blood is more convenient in ambulatory practice and is especially useful for infant testing, results obtained from capillary blood specimens are less reliable than those from venous blood. 45,46 One study found the capillary microhematocrit to have a sensitivity of 90% and a specificity of 44% when compared with values obtained from venous blood with an automated cell counter. 46
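The WHO and CDC cut-points above amount to a simple decision rule. A minimal sketch follows (Python); the function and group names are illustrative only, and the cut-points are those quoted in this chapter rather than a substitute for the cited references:

    # Hemoglobin cut-points (g/dL) below which anemia is diagnosed,
    # as quoted above from WHO (adults) and CDC (children, pregnancy)
    HGB_CUTOFFS = {
        "man": 13.0,
        "menstruating_woman": 12.0,
        "child_0.5_to_4.9_years": 11.0,
        "child_5.0_to_11.9_years": 11.5,
        "pregnant_trimester_1": 11.0,
        "pregnant_trimester_2": 10.5,
        "pregnant_trimester_3": 11.0,
    }

    def is_anemic(hgb_g_dl, group):
        # True when the measured hemoglobin falls below the group's cut-point
        return hgb_g_dl < HGB_CUTOFFS[group]

    print(is_anemic(10.7, "pregnant_trimester_2"))  # False: 10.7 >= 10.5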

While sensitive for iron deficiency anemia, hemoglobin is not sensitive for iron deficiency because mild deficiency states may not affect hemoglobin levels. 47 Hemoglobin is also nonspecific, since many cases of anemia are due to causes other than iron deficiency. 9,10,16 Reported sensitivity of low hemoglobin for detecting iron deficiency ranges from 8-90% and specificity from 65-99%, depending on the population, reference standard, and cut-point used. 47-50 In a national sample, detection of a hemoglobin <12 g/dL had a sensitivity of 90% and specificity of 78% for iron deficiency in black women, whereas sensitivity and specificity were 36% and 95% in white women. 50 Other tests (i.e., total iron binding capacity, serum iron, transferrin saturation, erythrocyte protoporphyrin, mean cell volume, red blood cell distribution width, and serum ferritin) may be more accurate for the detection of iron deficiency, but they are poor screening tests for iron deficiency anemia. 48,51-59 Among these tests, serum ferritin has the best sensitivity and specificity for diagnosing iron deficiency in anemic patients. 60
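To see how such sensitivities and specificities translate into screening yield, the standard predictive-value arithmetic can be applied. The sketch below (Python) uses the figures from reference 50; the 10% prevalence of iron deficiency is assumed purely for illustration, since the underlying prevalence is not given here:

    def positive_predictive_value(sensitivity, specificity, prevalence):
        # Fraction of positive screens that are true positives
        true_pos = sensitivity * prevalence
        false_pos = (1.0 - specificity) * (1.0 - prevalence)
        return true_pos / (true_pos + false_pos)

    # Hemoglobin <12 g/dL as a test for iron deficiency (reference 50),
    # assuming a 10% prevalence for illustration only
    print(round(positive_predictive_value(0.90, 0.78, 0.10), 2))  # 0.31 (black women)
    print(round(positive_predictive_value(0.36, 0.95, 0.10), 2))  # 0.44 (white women)

Under these assumptions, most positive screens would be false positives, which is consistent with the chapter's emphasis on confirmatory evaluation (e.g., serum ferritin) of anemia detected by screening.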

Effectiveness of Early Detection

Evidence is limited that, in the asymptomatic general adolescent or adult U.S. population, early detection and treatment significantly reduce morbidity from anemia, iron deficiency, or the conditions that cause them. 17-19,61 This evidence is further limited by the fact that studies often use inconsistent or vague definitions of anemia and iron deficiency. Observational studies in developing countries have reported decreased physical endurance and maximal exercise capacity in association with iron deficiency anemia. 21-24,26 The extent to which this might affect daily activities that do not involve maximal exercise capacity is unknown. Clinical trials and case series from developing countries have reported conflicting results regarding a benefit of iron supplementation on work productivity in anemic or iron-deficient workers. 22,25,62-64 There is little evidence evaluating adverse effects from the mild degree of anemia that is most often detected by screening asymptomatic persons in developed countries. In a Swedish cohort, anemic women (hemoglobin <12 g/dL) reported no more infections, fatigue, or other symptoms than nonanemic women, but were significantly more likely to report low work productivity. 20 In a small, randomized placebo-controlled trial of Welsh women with anemia (hemoglobin <10.5 g/dL) detected by population-based screening, iron therapy did not result in clinically or statistically significant improvements in psychomotor function tests, symptoms, or subjective well-being, despite increased hemoglobin concentrations. 65 Trials evaluating the effects of iron supplementation on physiologic outcomes such as running speed and maximum running time in nonanemic, iron-depleted runners have been inconclusive. 66,67 Although the evaluation of anemia may disclose underlying diseases (e.g., occult malignancies) that benefit from early detection, 68 there are no data to suggest that testing for anemia is an effective means of screening for these conditions.

A number of trials have evaluated whether infants with iron deficiency anemia benefit from early treatment. Randomized controlled trials have demonstrated the efficacy of iron supplementation in correcting iron deficiency anemia in infants and children, but its effect on clinical outcomes is less clear. 27,28,69,70 Four relatively large, generally well-conducted, randomized controlled trials in developing countries have evaluated the effects of iron supplementation on the behavior and development of anemic infants. 27,28,31,70 Three trials failed to show a significant effect of treatment on standardized developmental test scores after short-term (6-10 days) iron therapy. 27,28,31 In one trial, 31 an additional 3 months of iron therapy for all infants corrected their anemia but did not significantly improve developmental test scores. Another of the trials 27 reported that after 3 months of iron therapy, infants whose iron deficiency was completely corrected (36%) had developmental scores similar to iron-sufficient subjects, mainly because the scores of the latter group declined. In the remaining 64%, treatment corrected anemia but not iron deficiency, and test scores remained lower in this group. This trial did not have a true placebo group, and other causes of anemia (such as vitamin A deficiency) were not adequately excluded. There was no delayed benefit of iron therapy in this trial; children with hemoglobin <=10 g/dL as infants still had lower developmental scores at school entry 5 years later. 33 The largest and most recent trial, which enrolled 12-18-month-old infants with hemoglobins <=10.5 g/dL, reported sizable, statistically significant improvements in both mental and motor development after 4 months of oral therapy. 70 A small randomized double-blind placebo-controlled trial 71 in the United Kingdom evaluated 2 months of iron supplementation in urban, underprivileged toddlers (aged 17-19 months) with mild to moderate anemia, using the Denver developmental screening test to assess outcome; this test is not as well standardized or validated as the tests used in the other trials. Effects on developmental outcomes were inconsistent, but iron supplements significantly increased the rate of weight gain. Long-term results were not evaluated. It is unclear why the results of these trials differ, although adequacy and duration of therapy may account for some of the differences.

Clinical trials and a cohort study in older children have also demonstrated improved iron and hemoglobin status with therapy, 35,69,72 but evidence for a clinical benefit from treatment of iron deficiency anemia is limited. A double-blind randomized controlled trial in 1,358 9-11-year-old Thai children failed to show any effect of iron treatment on intelligence test scores in anemic children, despite the large sample size. 73 A series of small randomized controlled trials conducted in India did report small improvements in IQ with iron treatment, 74,75 however, and two small randomized, controlled trials suggested a benefit of oral iron on some tests of learning in anemic school-aged Indonesian children. 76,77 All of these studies suffered from important design limitations, such as use of unvalidated tests, multiple significance testing, high dropout rates, and addition of folic acid to the treatment regimen. Conflicting results have been reported concerning the effect of iron supplementation on infection rates in children. 35,78-80 On the other hand, improved growth and weight gain with 3-6 months of iron supplementation have been reported consistently in placebo-controlled trials of anemic, malnourished children in developing countries. 34,35,81,82 Another controlled trial reported a significant benefit from iron treatment on physical performance and submaximal work capacity in anemic Indian boys. 83 It is unclear whether the results of these studies are generalizable to U.S. children, who are likely to be healthy and otherwise adequately nourished.

Early detection and treatment of iron deficiency anemia in pregnancy have been assumed to be beneficial because moderate to severe anemia (i.e., <9.0-10.0 g/dL) has been associated with a 2-3-fold increased risk of low birth weight, preterm delivery, and perinatal mortality in numerous cross-sectional and longitudinal observational studies in industrialized countries. 16,36-42,84 The consistency of these results across different study designs and population samples is noteworthy, although such studies do not conclusively prove that anemia directly influences pregnancy outcomes. Many of the studies did not control for other factors that may have had adverse effects (e.g., smoking, maternal malnutrition), or for increases in hemoglobin and hematocrit that occur as gestation approaches term; most did not differentiate iron deficiency anemia from anemia due to other causes. A large body of data suggests that iron supplements are effective in improving the hematologic indices of pregnant women, 85-91 but there is limited evidence that improving hematologic indices in anemic women results in improved clinical outcomes for the mother, fetus, or newborn. Most published trials evaluating the effects of iron supplementation on pregnancy outcomes systematically excluded anemic women (i.e., those with hemoglobins <10 g/dL or hematocrits <=0.3) and are therefore not necessarily relevant to pregnant women with iron deficiency anemia. They are described later in the chapter (see Primary Prevention). One controlled trial enrolled Indian women with hemoglobins as low as 7.0 g/dL who attended rural health centers, randomizing the women by health center to receive either iron and folic acid supplements for 100 days or no supplements. 92 The trial reported a significantly higher mean birth weight and lower rate of low birth weight infants in women who completed 100 days of supplements compared to controls. Among those receiving supplementation, the increase in birth weight was significantly related to the rise in hemoglobin. A large (n = 601), retrospective cohort study from Kenya compared women with severe anemia (hemoglobin <=8.8 g/dL) who were or were not treated with ferrous sulfate. 93 Treatment was associated with markedly reduced preterm delivery and stillbirth rates, and increased mean birth weight, but there was no change in neonatal death rates. Women with "mild anemia" (hemoglobin >=8.9 g/dL) who received iron therapy before 30 weeks also had lower preterm delivery and perinatal mortality rates compared to those receiving no iron or iron after 30 weeks. Neither significance testing nor adjustment for covariates was performed, limiting the conclusions that can be drawn from these data. The detection of anemia and the determination of its etiology may also lead to the discovery of other correctable obstetrical risks (e.g., poor nutritional status, medical illness) that might otherwise escape detection, but the effectiveness of anemia screening in improving outcomes related to these risks has not specifically been evaluated in developed countries.

Adverse effects of iron therapy include unpleasant gastrointestinal symptoms (e.g., nausea and constipation) that are dose-related and, at normal doses, reversible. 94-97 Iron therapy can cause complications of excessive iron storage in patients with an underlying iron storage disorder (e.g., idiopathic hemochromatosis). 98,99 A potential hazard of iron supplements is unintentional overdosage by children in the home; 20,330 cases of ingestion of iron or iron-containing vitamins by children under 6 years, including 3 fatalities, were reported to poison control centers in 1993. 100 Iron supplements accounted for 30% of fatal pediatric pharmaceutical overdoses occurring between 1983 and 1990. 101 Other potential adverse effects of iron mentioned in the literature (e.g., birth defects, cancer, heart disease, infection, metabolic imbalances of other minerals, and harmfully high hemoglobin levels) 109 have not been proven.

Primary Prevention (Iron Prophylaxis)

Studies of the effects of iron fortification of formula and cereal on healthy, nonanemic infants have focused primarily on laboratory rather than clinical outcomes. Randomized and nonrandomized controlled trials, observational studies, and time series studies have demonstrated substantial reductions in the incidence of iron deficiency and iron deficiency anemia in healthy infants fed iron-fortified formula, iron-fortified cereal, or breast milk (with iron-fortified cereal added at 4-6 months), compared to infants fed cow's milk or unfortified formula. 12,102-107 Evidence is more limited regarding clinical benefits from iron fortification of infant diets. A cohort study reported that infants fed iron-fortified formula beginning at age 6 months had significantly fewer diarrheal episodes compared to infants fed whole cow's milk (0.16 vs. 0.30 per child in the second 6 months of life); the incidence of other medical conditions (e.g., otitis media, dermatitis, wheezing) did not differ. 106 This study did not have standardized criteria for diagnosing medical conditions and had limited control for potentially confounding variables, however. A controlled trial randomized healthy infants at a mean age of 1.3 months to either iron-fortified or nonfortified milk-based formula. 108 Infants in the fortified group had a significantly greater height (by 0.9 cm) and growth rate at 12 months, but this group was also significantly taller at birth. At 12 months, there were no other statistically significant differences in clinical outcomes, such as weight, number of acute illnesses, or psychomotor development. Intake of the iron-fortified formula may have been insufficient to produce a clinical effect, however; by 8 months of age, 92% of study infants and 85% of controls were drinking cow's milk rather than study formula. In another randomized controlled trial in healthy infants from very low income families, infants randomized to nonfortified formula had significantly worse psychomotor development (Bayley Scales) at 9 and 12 months of age compared to those given iron-fortified formula. 108a Differences were no longer significant at 15 months, although sample size may have been inadequate: only half the sample was assessed at 15 months. There were no differences at any age in standardized tests of cognitive development or behavior. These results suggest a clinical benefit from iron-fortified formula, but further trials are needed to confirm these results and determine their long-term impact.

Evidence is limited that iron supplementation in healthy pregnant women with mild or no anemia results in important clinical benefits. 109 Clinical trials have reported that iron supplements in healthy pregnant women with initial hemoglobins >=10 g/dL are efficacious in correcting red cell indices and iron stores, but they do not improve birth weight, length of gestation, or other outcome measures when compared to placebo or to no supplements. 110-116 Few of these trials had sufficient statistical power to detect small positive effects of iron supplementation, however. One small randomized controlled trial in young pregnant women suggested a modest beneficial effect of routine iron supplementation on some tests of psychomotor function, but it had important discrepancies in reported data and analyses. 118 In a large randomized controlled trial of healthy, nonanemic pregnant women, routine iron therapy was compared to selective iron therapy given only when a confirmed hemoglobin level below 10 g/dL was detected after 14 weeks. 119 Women in the selective group had poorer self-reported overall health and increased rates of transfusion and operative delivery, although differences were small and may have been due to nonblinding. The routine supplementation group had more subjective side effects attributable to iron, more postdate gestations, and higher perinatal mortality; the latter difference was probably attributable to chance, given small numbers and multiple comparisons. Evidence thus does not confirm important clinical benefits from routine iron supplementation in nonanemic pregnant women.

Cohort studies have reported no important adverse effects with iron-fortified formula, 120,121 nor were serious side effects reported in the clinical trials of iron-fortified food or formula previously cited. Routine iron supplementation may produce mild, reversible gastrointestinal symptoms similar to those seen with iron therapy (see above). In one small, randomized controlled trial, oral iron supplements in iron-sufficient children resulted in significantly less weight gained after 4 months of treatment, 122 but additional studies are needed to confirm these results. Trials of routine iron supplementation in pregnancy have not reported adverse effects on pregnancy outcome. The doses of oral iron typically offered for routine iron supplementation (e.g., in pregnancy) are unlikely to cause complications of excessive iron storage. 98,99 As with iron therapy, an important hazard of iron prophylaxis is unintentional overdosage by children in the home; many reported cases of iron ingestion involve iron-containing prenatal vitamins. 100

Recommendations of Other Groups

A number of organizations recommend some form of anemia screening during infancy and pregnancy. The Institute of Medicine (IOM), the Canadian Task Force on the Periodic Health Examination, and the Bright Futures report recommend screening high-risk infants (e.g., preterm or low birth weight, low socioeconomic status, those fed with cow's milk or nonfortified formula before 12 months of age) between 6 or 9 and 12 months of age; the IOM recommends screening before 3 months in preterm infants. 15,123,124 The American Academy of Pediatrics (AAP) and the American Academy of Family Physicians (AAFP) recommend that all infants receive a hemoglobin or hematocrit measurement once during infancy; the AAP recommends that it be done at or before 9 months of age. 125,126 The recommendations of the AAFP are currently under review. Prenatal screening for anemia is recommended by the Canadian Task Force, 127 the American College of Obstetricians and Gynecologists (ACOG), 128 and the IOM. 15 ACOG recommends measuring a hemoglobin or hematocrit at the earliest prenatal visit and again early in the third trimester; 128 the IOM recommends measuring hemoglobin once in each trimester. 15

Routine screening of older children or nonpregnant adolescents and adults is not advocated by most organizations. 15,124,126,128,129 Some organizations recommend screening selectively in specific high-risk populations: adolescents at increased risk due to heavy menses, chronic weight loss, nutritional deficit, or athletic activity; 124 recent immigrants from undeveloped countries; 129 the institutionalized elderly; 129 and nonpregnant women aged 15-25 or otherwise at increased risk (e.g., with large menstrual blood loss, high parity, poverty, recent immigration). 15 The AAP recommends at least one measurement of hemoglobin or hematocrit for all menstruating adolescents, preferably at age 15 years. 125

Primary prevention of iron deficiency anemia in infancy by breastfeeding, feeding iron-fortified formula if not breastfeeding, and feeding iron-fortified cereal after 4-6 months of age, is recommended by the Canadian Task Force, 123 IOM, 15 AAP, 130,131 and Bright Futures. 124 The AAFP recommends counseling parents of children under 6 years of age and women of childbearing age on the benefits of iron-enriched foods and iron intake. 126 The AAP recommends iron supplements in breastfed term infants who do not receive iron-fortified cereal beginning at 4 months of age. 131 The Canadian Task Force found insufficient evidence to recommend for or against the routine use of iron supplements in pregnant women. 123 The IOM does not recommend routine iron supplements in nonanemic pregnant women. 15 ACOG and AAP recommend dietary supplements including iron during pregnancy if dietary intake is inadequate to meet needs, or if there are other risk factors for iron deficiency. 128

Discussion

The burden of suffering from iron deficiency anemia in the general child, adolescent, and adult populations in the U.S. is low. Although it is prevalent in certain high-risk groups, mild iron deficiency anemia in the absence of symptoms appears to have only subtle health consequences in these individuals. Trials of iron therapy in school children, adolescents, and adults have not proven important clinical benefits in well-nourished populations in developed countries. Thus, there is little evidence to suggest that early detection of iron deficiency anemia in these groups is beneficial. Treatment of some forms of anemia not caused by iron deficiency (e.g., vitamin B12 or folate deficiency), and of some medical disorders that cause anemia, can produce dramatic results, and these conditions would also be detected if hemoglobin measurement were performed routinely. They are too rare in most subgroups of the population to justify mass screening, however. There is therefore little basis for large-scale efforts to screen for anemia in the general population.

There is fair evidence to support screening for anemia in pregnant women, based on numerous observational studies reporting an association between severe to moderate anemia (hemoglobin <9-10 g/dL) and poor pregnancy outcome, and weak evidence from a nonrandomized controlled trial and a cohort study that iron treatment of anemic women improves obstetric outcomes. Women of low socioeconomic status and immigrants from developing countries, among whom iron deficiency anemia is more common, are most likely to benefit from such screening. Because hemoglobin measurement is a nonspecific test for iron deficiency, further evaluation should be performed to identify the etiology of anemia detected by screening. Serum ferritin appears to have the best sensitivity and specificity for diagnosing iron deficiency in anemic patients. Although routine iron supplementation improves hematologic indices and iron status, there is at present insufficient evidence from published clinical research to suggest that routine iron supplementation of healthy pregnant women with hemoglobins >=10 g/dL is beneficial in improving clinical outcomes for the mother, fetus, or newborn. 109

The prevalence of iron deficiency anemia in the general infant population is low, and when it occurs in low-risk populations, it tends to be mild. In healthy, low-risk populations, there are few observational data showing adverse effects of iron deficiency anemia, nor have there been trials of early detection and correction of iron deficiency anemia. There is therefore little evidence to support routine hemoglobin measurement in infancy for the detection of iron deficiency anemia. On the other hand, multiple observational studies in high-risk populations (i.e., low socioeconomic status or developing countries) have found an association between iron deficiency anemia in childhood and abnormal growth and development. The largest and most recent trial 70 of iron therapy in a high-risk population showed an important effect of iron therapy on development, while several trials in high-risk, often malnourished, infants and children have found beneficial effects of iron on growth and growth rates. In the U.S., certain infants (e.g., recent immigrants from developing countries, those of low socioeconomic status, members of certain minority and ethnic groups, preterm infants, those who begin cow's milk before 12 months) have a substantially higher prevalence of iron deficiency anemia and may also be more likely to suffer from general malnutrition. There is therefore fair evidence to support screening high-risk infants and toddlers for iron deficiency anemia, using hemoglobin (or hematocrit). As for pregnant women, evaluation to determine the etiology of anemia is appropriate. Both breastfeeding and the feeding of iron-fortified formula and cereal are effective in the primary prevention of iron deficiency anemia. Given the absence of known adverse effects from such dietary interventions, and the other important benefits of breastfeeding (see Chapter 56), evidence supports encouraging mothers to breastfeed and to include iron-enriched foods in the diet of infants and young children.

CLINICAL INTERVENTION

A hemoglobin analysis or hematocrit is recommended for pregnant women at their first prenatal visit ("B" recommendation). There is insufficient evidence to recommend for or against repeated prenatal testing for anemia in asymptomatic pregnant women lacking evidence of medical or obstetrical complications ("C" recommendation). Screening for anemia with hemoglobin or hematocrit in high-risk infants, preferably at 6-12 months of age, is also recommended ("B" recommendation). Examples of high-risk infants include infants living in poverty, blacks, Native Americans and Alaska Natives, immigrants from developing countries, preterm and low birth weight infants, and infants whose principal dietary intake is unfortified cow's milk. Although capillary blood specimens are easier to obtain in infants, a venous blood count provides more accurate and reliable data. Serum ferritin testing may be useful as an additional screening test in selected high-risk infants. There is currently insufficient evidence to recommend for or against periodic screening for high-risk infants not found to be anemic at initial screening ("C" recommendation). There is also insufficient evidence to recommend for or against routine testing for anemia in other asymptomatic persons, but recommendations against such screening may be made on the grounds of low prevalence, cost, and potential adverse effects of iron therapy ("C" recommendation).

Guidelines for normal hemoglobin ranges for infants and pregnant women have been published. 1-4 Appropriate hematological studies and nutrition counseling should be provided for patients found to have anemia. Compared to other diagnostic tests, serum ferritin has the best sensitivity and specificity for detecting iron deficiency in patients found to be anemic. Screening for hemoglobinopathies is discussed in Chapter 43.

Encouraging mothers to breastfeed their infants and advising parents to include iron-enriched foods in the diet of infants and young children is recommended for the primary prevention of iron deficiency anemia ("B" recommendation). There is also good evidence to recommend breastfeeding based on proven benefits unrelated to iron deficiency (see Chapter 56). Pregnant women should receive specific nutritional guidance to enhance fetal and maternal health (see Chapter 56). There is currently insufficient evidence to recommend for or against the routine use of iron supplements for healthy infants or pregnant women who are not anemic ("C" recommendation).

See the relevant background paper: U.S. Preventive Services Task Force. Routine iron supplementation during pregnancy. JAMA 1993;270:2846-2854.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Carolyn DiGuiseppi, MD, MPH, and was based in part on a previously published background paper by the U.S. Preventive Services Task Force (review article by Steven H. Woolf, MD, MPH) and on materials prepared for the Canadian Task Force on the Periodic Health Examination by John W. Feightner, MD, MSc, FCFP.

23. Screening for Elevated Lead Levels in Childhood and Pregnancy

Burden of Suffering

Prevalence.

The prevalence of elevated blood lead levels in the U.S. population has declined 78% in the past decade, 1 due primarily to marked declines in lead in gasoline, soldered cans, and air. 1-6 In a 1988-1991 national survey of children aged 1-5 years, 9% and 0.5% had blood lead levels >=10 micro-g/dL and >=25 micro-g/dL, respectively, down from 88% and 9% a decade before. 7 (The units micro-g/dL will be used throughout this chapter; to convert to micro-mol/L, divide by 20.72. For example, 10 micro-g/dL is approximately 0.48 micro-mol/L.) Prevalence varies widely among different communities and populations, however, with studies reporting 2-41% of children having blood lead levels >=10 micro-g/dL, 0-3% >=25 micro-g/dL, and 0-0.5% >=40 micro-g/dL. 8-19 Current national data for pregnant women have not been published, but only 0.5% of U.S. women aged 12-49 years have blood lead levels >=10 micro-g/dL. 7 Two large surveys of low-income pregnant women found 0% 20 and 6% 21 with blood lead levels >15 micro-g/dL.

Risk Factors for Elevated Lead Levels.

The highest mean blood lead levels in the U.S. occur in children aged 1-2 years (mean 4.1 micro-g/dL) and in adults >=50 years of age (4.0 micro-g/dL), with the lowest in adolescents (1.6 micro-g/dL). 7 Among adults, geometric mean levels are significantly higher in males than in females. Correlates of higher blood lead levels at all ages include minority race/ethnicity, central city residence, low income, low educational attainment, and residence in the Northeast region of the U.S. 7,22 These factors are associated with increased exposure to important lead sources, including dilapidated pre-1950 housing with lead-based paint, lead-soldered pipes, and household lead dust, as well as lead in dust and soil from heavy traffic and industry. 22-27 Other potential sources of household lead exposure include clothing or waste material brought home by workers in lead-based industries or hobbies, lead-based paint and dust contamination in pre-1950 housing that is undergoing remodeling or renovation, dietary intake from lead-soldered cans and lead-based pottery, and traditional ethnic remedies. 23,24,28

Neurotoxic Effects of Lead Exposure in Children.

Very high levels of inorganic lead exposure can produce serious neurologic complications, which may result in death or long-term sequelae. 23 A growing number of studies have reported associations between neurotoxic effects and blood lead levels once thought to be harmless. Adequately designed and conducted prospective cohort studies from a broad range of child populations have reported that a rise in blood lead from 10 to 20 micro-g/dL is associated with a likely decrement of about 2 points (reported range -6 to +1) in intelligence test scores (IQ). 29-35 In these studies, the mean blood lead levels at age 1-2 years (7.7-35.4 micro-g/dL) were higher than the current U.S. mean for this age group (4 micro-g/dL), but most levels were below 35 micro-g/dL. A meta-analysis 36 that included the five oldest of these cohort studies concluded that a doubling of blood lead levels from 10 to 20 micro-g/dL measured at age 2 years was associated with a statistically significant mean reduction of 1-2 IQ points; evidence was inconclusive regarding an association of IQ with mean postnatal blood lead levels. Although most cross-sectional studies evaluating the association of tooth and blood lead with IQ suffer from methodologic problems such as selection bias and limited adjustment for covariates, they have been generally consistent in reporting small negative effects of elevated lead levels on IQ. [e.g., in 36, 37] A meta-analysis that included studies of whole tooth lead published since 1979 reported a statistically significant 1-point reduction in IQ associated with a doubling of tooth lead from 5 to 10 micro-g/g. 36 Evidence is not sufficient to quantify the exact nature of the relationship between IQ and higher blood lead levels (i.e., 40-100 micro-g/dL). Cross-sectional studies 38-42 have consistently reported small, inverse associations between blood or tooth lead and reaction (attentional) performance, but studies evaluating the effect of mildly elevated lead levels on other measures of neurodevelopmental function (e.g., behavior, learning disorders, auditory function) have produced inconclusive results. These have been less thoroughly evaluated than IQ, however.

In most studies, the estimated effects of lead on IQ are reduced in size when adjusted for potentially confounding variables, 36 suggesting that some of the observed association may be due to imperfectly measured or unmeasured covariates. Studies in rodents and primates, however, which can avoid most of the methodologic weaknesses of observational studies in humans, report cognitive, attentional, and behavioral deficits, as well as auditory and visual dysfunction, with mildly elevated blood lead levels, 43-45 supporting a causal relationship between low-level lead exposure and neurotoxic effects in children. Studies demonstrating laboratory abnormalities (e.g., impaired vitamin D metabolism) in persons with blood lead levels as low as 10-15 micro-g/dL 23,46-48 also support a causal relationship.

Adverse Effects of Lead Exposure on Pregnancy Outcomes.

The effects of very high blood lead levels during pregnancy on reproductive outcomes such as abortion and stillbirth have been recognized for many years. 23 Observational studies in pregnant women with blood lead levels <30 micro-g/dL have reported associations between elevated levels and birth weight, length of gestation (including preterm delivery), and neonatal head circumference. 49-56 The associations have been small, variable in direction of effect, and not statistically significant in most studies. These studies failed to detect important effects on other reproductive outcomes. Inconsistent results may be due in part to imprecise measures of fetal lead exposure. 55-59 All but one 34 of six previously cited cohort studies, 29-34 as well as the meta-analysis described above, 36 reported no association between antenatal or perinatal maternal blood lead levels and full-scale IQ measured at preschool or school age. Although very high lead levels in pregnancy are clearly hazardous, the adverse effects on the fetus of antepartum lead levels in the range typically found in the U.S. are not established.

Other Adverse Effects of Lead Exposure.

Lead exposure affects many organ systems, including cardiovascular, renal, and hepatic, but most clinically apparent (i.e., symptomatic) effects occur with blood lead levels >=50 micro-g/dL. 23,60-63 Small increases in systolic blood pressure have been associated with mildly elevated blood lead levels (i.e., 1-3 mm Hg for a rise in blood lead from 10 to 20 micro-g/dL) in most large, population-based, cross-sectional studies evaluating nonpregnant adults and pregnant women. 64-70 In children, evidence of blood pressure effects is more limited: one cross-sectional study found no association between elevated blood lead levels (range 7-70 micro-g/dL) and elevated blood pressure. 71 Adverse effects on height from lead levels well below 40 micro-g/dL have been suggested by analyses of national cross-sectional data, 72,73 but cohort studies with more extensive covariate adjustment report either transient or no effect of elevated lead levels (peak sample means 11-17 micro-g/dL) on growth. 35,74,75

Accuracy of Screening Tests

Screening tests considered for detecting lead exposure include blood lead and free erythrocyte (or zinc) protoporphyrin levels. Blood lead concentration is the more sensitive of the two for detecting modest lead exposure, but its accuracy, precision, and reliability can be affected by environmental lead contamination during blood collection, day-to-day biologic variability, and laboratory analytic variation. Lead contamination of collecting equipment, and skin contamination during capillary sampling, may each positively bias blood lead levels by up to 1.0 micro-g/dL, on average, although individual effects of skin contamination may be much greater. 76-80 Studies defining abnormal results as blood lead levels above 10 or 20 micro-g/dL have reported false-positive rates of 3-9% for capillary sampling compared to simultaneously collected venous blood lead. 77,78 Day-to-day biologic variability and trends over time contribute to higher false-positive rates for initial capillary samples when compared to results from venous testing done at a later date. 77,81 False-negative rates with capillary sampling appear to be lower, reported in one study as 1-8% compared to venous blood. 78 In published surveys, 76,82 80-90% of clinical laboratories participating in proficiency testing programs met performance criteria for blood lead (within +/-4 micro-g/dL of target values, for values <40 micro-g/dL 82 ); unpublished national data show >95% of participating laboratories meeting these criteria and >80% achieving accuracy to within +/-2 micro-g/dL of target values (unpublished data, Centers for Disease Control and Prevention, November 1993). Nonparticipating laboratories are likely to be less proficient. Reported blood lead values may differ by as much as 5 micro-g/dL from true values due to these sources of variability and bias, which may affect the predictive value of a positive test. Results from capillary samples may vary even more, although recent studies suggest the positive bias can be reduced with increased attention to reducing skin lead contamination. 77,78
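The practical import of measurement error of this size can be illustrated with a small simulation. The sketch below (Python) is purely illustrative; the bias and variability parameters, and the uniform distribution of true levels, are assumptions made for the exercise rather than values taken from the cited studies:

    import random

    def observed_capillary(true_level, bias=1.0, sd=2.5):
        # Observed value = true level + average contamination bias + random
        # biologic/analytic variation (all parameter values are assumptions)
        return true_level + bias + random.gauss(0.0, sd)

    random.seed(0)
    true_levels = [random.uniform(4.0, 16.0) for _ in range(10000)]
    below_cutoff = [t for t in true_levels if t < 10.0]
    false_pos = sum(observed_capillary(t) >= 10.0 for t in below_cutoff)
    print(f"False positives at the 10 micro-g/dL cutoff: {false_pos / len(below_cutoff):.1%}")

The exact false-positive proportion depends heavily on how many true levels lie near the cutoff; the point of the exercise is simply that positive bias and random variation of a few micro-g/dL can misclassify many children whose true levels sit just below the threshold.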

The erythrocyte protoporphyrin (EP) test, an indirect measure of lead exposure based on lead's effects on the hematopoietic system, is unaffected by contamination with environmental lead and is easily performed on capillary blood specimens, making it more acceptable for use with young patients. Erythrocyte (or zinc) protoporphyrin is insensitive, however, to modest elevations in blood lead levels. 21,83-89 The test also lacks specificity, 21,83,84,86,87,90 thus limiting its predictive value. In one study, EP measurements were taken on 47,230 suburban and rural children; although 4.7% of the children had an elevated erythrocyte protoporphyrin level, only 0.6% had elevated blood lead levels. 91 In that sample, even if every child with an elevated blood lead level also had an elevated EP, the positive predictive value of the EP test could have been no higher than about 13% (0.6/4.7).

In communities where there is a low prevalence of lead levels requiring individual intervention with chelation or residential lead hazard control, blood lead screening will have a low yield, and many unaffected children will be tested at potentially high cost and inconvenience. A questionnaire that can predict those at high risk for elevated lead levels would allow targeted screening in low-prevalence areas, increasing the yield of blood testing by increasing the pretest probability of elevated lead levels in those who are tested. Cross-sectional studies 13-15,92-93a in urban and suburban, mostly midwestern, populations have shown that one or more positive responses to five questions (about exposures to deteriorated paint from older or renovated housing, to other lead-poisoned children, or to lead-related hobbies or industry) 128 detect 64-87% of children with blood lead levels >=10 micro-g/dL. Three studies reported higher sensitivities (81-100%) for blood lead levels >=15-20 micro-g/dL. 15,92,93a None of these studies evaluated the ability of questionnaires to detect levels above 20 micro-g/dL, in part because so few patients had levels that high. Specificity among the studies ranged from 32% to 75%. In the samples with a lower prevalence (2-7%) of levels >=10 micro-g/dL, the proportion of those with a negative questionnaire who had elevated blood lead levels was predictably low (0.2-3.5%), but increased to 19% when the population prevalence of elevated lead levels was higher (17-28%).
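The prevalence dependence described above follows directly from Bayes' rule. The sketch below (Python) shows the arithmetic, using mid-range sensitivity and specificity values chosen for illustration from the ranges reported in the cited studies, not values from any single study:

    def prob_elevated_given_negative(sensitivity, specificity, prevalence):
        # Probability that a child with a negative questionnaire nonetheless
        # has an elevated blood lead level (1 minus the negative predictive value)
        false_neg = (1.0 - sensitivity) * prevalence
        true_neg = specificity * (1.0 - prevalence)
        return false_neg / (false_neg + true_neg)

    # Assumed mid-range test characteristics: sensitivity 75%, specificity 55%
    for prevalence in (0.02, 0.07, 0.25):
        p = prob_elevated_given_negative(0.75, 0.55, prevalence)
        print(f"prevalence {prevalence:.0%}: {p:.1%} of questionnaire-negative children elevated")

With these assumed values, the output (roughly 0.9%, 3.3%, and 13.2%) brackets the 0.2-3.5% figure reported for low-prevalence samples and approaches the 19% figure for high-prevalence ones, illustrating why a negative questionnaire is most reassuring where elevated levels are uncommon.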

Effectiveness of Early Detection

Detection of lead exposure before the development of potentially irreversible complications permits the clinician to recommend environmental interventions to limit further exposure and, when necessary, to begin medical treatment with chelating agents. Early detection may also result in interventions that prevent exposure of other children to lead (the child with an elevated blood lead level acting as a sentinel for a hazardous environment). There is relatively little convincing evidence that these interventions improve health, however. One issue is that most available studies in asymptomatic children evaluate the effects of various interventions on blood lead levels rather than on clinical outcomes. A second is that blood lead levels typically decline with the passage of time: on average, blood lead levels in childhood decrease with age after peaking at about 2 years of age, even without intervention. 7 Longitudinal studies of asymptomatic children with elevated lead levels have shown reductions in blood lead levels after short- and long-term follow-up in the absence of any intervention, 94,95 a result attributable at least in part to regression to the mean, random variation, and laboratory error. To evaluate adequately the effects of interventions on blood lead levels, studies must take into account these changes over time, preferably by the use of controls who do not receive the intervention.

Effect of Screening on Clinical Outcomes.

Evidence is not available to demonstrate that universal screening for blood lead results in better clinical outcomes than either screening targeted to high-risk persons or individualized testing in response to clinical suspicion. Several older studies reported that, compared to historical results from individualized testing, intensive screening programs targeted to children in high-risk neighborhoods reduced case fatality rates, mortality rates, and proportions of children detected with very high blood lead levels or who developed symptomatic lead poisoning. 96-98 In the absence of concurrent controls, it is not clear whether the reported reductions in mortality and case fatality rates were due to screening, or to improvements in medical care over time. Reductions in mean lead levels may also have been due to secular trends, changes in screening tests, and to screening greater numbers of children, including many at low risk for severe lead poisoning. Thus, the available evidence regarding the efficacy of screening programs is weak.

Effect of Interventions to Lower Blood Lead Levels on Clinical Outcomes.

In contrast to substantial evidence that chelating agents benefit children with symptomatic lead poisoning, few studies have compared potential clinical benefits of chelation therapy with its adverse effects in asymptomatic children. Ethical considerations preclude such trials for children with blood lead levels above 45 micro-g/dL. A large randomized controlled trial assessing the effect of chelation therapy on IQ in young children with venous blood lead concentrations of 20-45 micro-g/dL is currently under way (G. Rhoads, personal communication, Environmental and Occupational Health Sciences Institute, Piscataway, NJ, January 1994). An observational study 99,100 compared children with blood lead levels between 13 and 46 micro-g/dL (median 30 micro-g/dL), who did and did not receive EDTA chelation therapy depending on the results of a lead mobilization test. There was no effect of chelation on IQ at either 7 weeks or 6 months follow-up after controlling for age and initial IQ. Changes in concentrations of blood lead, bone lead, and EP also did not differ significantly between chelated and unchelated children. The greatest reductions in blood lead were associated with the highest initial lead levels, independent of chelation. The method of treatment assignment (i.e., based on a positive mobilization test) was most likely to have biased the study toward finding an effect of chelation, yet no effect was observed. There is thus little evidence presently available to confirm a clinical benefit from chelation therapy for children with lead levels <45 micro-g/dL. A comprehensive literature review found no studies evaluating clinical effects of residential lead hazard control. Their effects on blood lead levels are reviewed below.

Effects of Chelation Therapy on Blood Lead Levels.

In uncontrolled experiments and case series in asymptomatic children with initial blood lead levels ranging from 40 to 471 micro-g/dL, chelating agents reduced blood lead levels substantially, to levels <40-70 micro-g/dL (varying with initial levels); these reductions were maintained for weeks to years after therapy was discontinued. 101-105 Most of these children were also returned to homes that had undergone lead hazard reduction, and the effect of this additional intervention was not specifically evaluated. Chelating agents have caused short-term reductions in blood lead levels in children whose pretreatment values ranged from 20 to 49 micro-g/dL in nonrandomized comparative trials, cohort studies, and uncontrolled experiments; these reductions have not been sustained over longer periods in the absence of repeated or continuing chelation therapy or environmental interventions. 104,106-109 Most of these studies did not report whether chelation therapy was combined with environmental interventions. With such weak evidence, including the previously cited cohort study reporting no effects of chelation on IQ, 99 it is difficult to make a convincing argument that chelation therapy to lower moderately elevated blood lead levels has a long-term benefit.

Effect of Residential Lead Hazard Control on Blood Lead Levels.

For most asymptomatic children with elevated lead levels, the primary goal of intervention is to reduce exposure to lead-contaminated paint, dust, and soil in the child's home environment, since these sources account for most excessive lead exposure. Residential lead-based paint hazard control methods have become increasingly effective for reducing exposure to lead paint and lead-contaminated dust. 25,110,111 These new techniques are now replacing the older strategies, which often created lead dust during the intervention process, but there are currently few published studies of their effect on blood lead levels.

Because most published studies used older, less effective techniques, the effects of residential interventions on blood lead reported in the literature, and outlined below, probably indicate the minimum possible benefit of residential lead-paint and lead-contaminated dust hazard control. In an early cohort study 112 of 184 children with initial blood lead levels >=50 micro-g/dL, children discharged after chelation therapy to lead-free (i.e., new or completely gutted and renovated) housing had significantly lower mean blood lead levels when compared to children exposed to "legally abated" or to inadequately abated housing (28.8 micro-g/dL vs. 38.5 and 57 micro-g/dL, respectively). Children in lead-free housing also had fewer recurrences of levels >=50 micro-g/dL at 12 and 24-30 months of follow-up. A nonrandomized trial 113 of households with children having initial blood lead levels >29 micro-g/dL compared more intensive experimental lead-reduction procedures with the lead-reduction procedures commonly in use in the study community. Neither intervention had any effect on mean dust or blood lead levels as tested 6 months after abatement, but an untreated control group was not included. Published 114 and unpublished 115 retrospective cohort studies suggest that residential lead paint hazard control is associated with modest declines (4-10 micro-g/dL) in mean blood lead levels in children with initial blood lead levels >=25 micro-g/dL, although in one study 114 those with initial blood lead levels <35 micro-g/dL benefited little from intervention. Case series and uncontrolled experiments, both weak study designs, have also evaluated lead-paint hazard control efforts in children with initial blood lead levels of 25-55 micro-g/dL; 100,116-119 several were published only as abstracts or summaries. 115,120 These studies reported statistically significant declines in mean blood lead levels, ranging from 2.5 to 10.2 micro-g/dL, 6-12 months after residential lead-based paint hazard control. All of the studies cited suffer from important design flaws, such as substantial drop-out rates and inadequate control for confounding variables such as season and age. Despite their flaws, the consistency of the results from these studies suggests a small, beneficial effect of lead-based paint hazard control on blood lead levels. As noted, there are as yet no published studies evaluating the effects on blood lead levels of newer residential lead hazard control techniques.

There are important problems with using one-time residential lead-paint hazard control as the sole method to reduce lead exposure in children. 121 Poor, inner-city families tend to move frequently, so that treating the current residence may have limited long-term benefit to the child, although benefit may accrue to other children moving into that residence (see below). Residential lead-paint hazard control is costly and labor-intensive, resulting in low rates of intervention, especially in poor communities. 24,122 Lead dust is ubiquitous and highly mobile, so that recontamination by nearby lead sources, including soil lead, may occur after lead-paint hazard control efforts take place in a dwelling. 110,123,124 These problems indicate a need for additional individual interventions, as well as more comprehensive community-based interventions, to reduce household lead exposure.

The small effect noted in studies evaluating lead-paint hazard control methods may be attributable in part to recontamination of the dwelling by nearby lead sources and to subsequent deterioration of painted surfaces. 110,123,124 Several studies have evaluated measures designed to reduce ongoing lead-dust contamination from lead-contaminated paint and soil. In a nonrandomized controlled trial among children with blood lead levels of 30-49 micro-g/dL, having a research team wet-mop all lead-contaminated interior surfaces twice a month with a high-phosphate detergent cleanser resulted in significantly greater adjusted declines in mean blood lead levels of children in intervention households compared to children in control households (6.9 vs. 0.7 micro-g/dL) at 1-year follow-up. 125 There have been no controlled studies to evaluate whether counseling families to perform similar cleaning would be equally effective in reducing blood lead levels. In one uncontrolled experiment, the families of 78 children with blood lead levels of 10-35 micro-g/dL, who were living in the vicinity of a defunct lead smelter, received intensive (30-45 minutes) in-home education and literature on prevention of lead exposure. 126 The mean blood lead levels in the 51 (65%) children who had follow-up blood lead levels at 4 months declined from 15.0 to 7.8 micro-g/dL (and maximum levels from 35.0 to 12.7 micro-g/dL). Without concurrent controls, it is not possible to determine how much regression to the mean and seasonal and age variations contributed to these reductions in blood lead levels. There is also evidence that clinician counseling at the worksite to reduce lead dust ingestion by workers (e.g., through personal hygiene practices) can significantly reduce mean blood lead levels at 1-year follow-up, 127 but this study also lacked controls and may not be generalizable to the residential setting.

A third focus of residential lead hazard control is exposure to soil lead. In a randomized controlled trial 123 of young children with initial blood lead levels of 7-24 micro-g/dL, extensive soil abatement, one-time dust abatement, and removal of loose interior paint resulted in a statistically significant reduction in mean blood lead levels of 1.2-1.3 micro-g/dL compared to loose paint removal alone. This clinically insignificant decline was associated with a substantial reduction in soil lead from a median 2,000 to 105 ppm. Preliminary results of the U.S. Environmental Protection Agency's Three City Urban Soil Lead Abatement Demonstration Project similarly suggest that substantial declines in soil lead cause only modest reductions in mildly elevated blood lead concentrations. 124 The small effect was due at least in part to rapid recontamination with dust lead in households undergoing soil abatement. Among children living near a closed lead smelter, only 3% of the variance in blood lead levels was attributable to soil lead. 127a

An important potential benefit of residential lead hazard control is its effect on the lead levels or clinical outcomes of other children who live in the same household as a child identified with elevated lead levels, or who subsequently move into the remediated residence. The literature review revealed no published evidence evaluating the effect of residential lead hazard control measures on such children. Based on the biokinetics of lead, 23 it is reasonable to believe that environmental interventions conducted before children are exposed are likely to prevent increases in blood lead levels more effectively than the same interventions in children who have already been exposed.

Effect of Nutritional Interventions on Blood Lead Levels.

In most settings, neither residential lead-based paint and dust hazard control nor chelation therapy is routinely offered to children with blood lead levels <20 micro-g/dL, but some experts have recommended offering these children dietary counseling to reduce their blood lead levels. 128 Diets deficient in calories, calcium, and zinc have been associated with increased gastrointestinal absorption of lead, 129,130 but there is only limited evidence that counseling to correct such nutritional inadequacies will reduce blood lead levels or prevent further increases. Results of experimental studies of the effects of iron deficiency on lead absorption and retention in adult humans have been equivocal. 129,131 In a cohort study of children with initial blood lead levels of 13-46 micro-g/dL, 99 all children who were iron deficient or depleted were prescribed iron supplementation. Although most children were still iron deficient at the end of the study, there were improvements in ferritin level that were not associated with either declines in blood lead or improvements in cognitive function. Cross-sectional and cohort studies have failed to establish a clear association between mean blood lead levels and measures of iron status in women at midpregnancy or delivery, in newborns (cord blood), or in children. 57,99,131-134

Adverse Effects of Screening and Intervention.

The most common adverse effects of screening for elevated lead levels are false-positive fingerstick results, and the anxiety, inconvenience, work or school absenteeism, and financial costs associated with return visits and repeat tests. An EDTA lead mobilization test, used for some children with blood lead levels of 30-44 micro-g/dL, 135 requires intramuscular or intravenous infusion, a stay at the clinical center for at least 8 hours, and, for young children, application of urine collection bags. 136 Residential lead-based paint and dust hazard control, when improperly done, 25 may produce acute increases in blood lead levels in resident children and abatement workers, occasionally necessitating hospitalization and chelation therapy. 113,116,137-139 Currently recommended techniques for lead hazard reduction are likely to reduce these adverse effects. 25 Chelating agents for asymptomatic lead poisoning have also been associated with important adverse effects. EDTA and dimercaprol (BAL) have transient renal, hepatic, and other toxicity, require intravenous or intramuscular injection, and generally require hospitalization for administration. 128,140,141 Common adverse effects of d-penicillamine are penicillin-like sensitivity reactions and transient nephrotoxicity; rare life-threatening reactions also occur. 96,105,107,128 Succimer (meso-2,3-dimercaptosuccinic acid, or DMSA) causes mild gastrointestinal and systemic symptoms, rashes, and transient elevations in liver function tests in up to 10% of cases. 104,106,108,142

Recommendations of Other Groups

Several states mandate either universal screening for lead exposure or selective screening of populations at high risk for lead exposure. 143 Periodic screening of children with blood lead measurement is also required for Medicaid's Early and Periodic Screening, Diagnostic, and Treatment Program. 144 The American Academy of Pediatrics 145 and the Bright Futures guidelines 146 recommend: (a) screening all children for lead exposure at about 12 months of age, and possibly again at about 24 months of age; (b) taking a history of lead exposure (using questionnaires provided with the guidelines) between the ages of 6 months and 6 years to identify high-risk children who should be screened earlier or more frequently; and (c) providing education to parents on safe environmental, occupational, nutritional, and hygiene practices to protect their children from lead exposure. Follow-up screening intervals should be based on risk assessment and previous blood lead levels. The Centers for Disease Control and Prevention (CDC) recommends screening all children at 12 months of age using a blood lead test, except in communities where no childhood lead poisoning problem exists; high-risk children require earlier and more frequent screening. 128 The American Academy of Family Physicians (AAFP) 147 and the Canadian Task Force on the Periodic Health Examination 148 recommend screening all children who are at high risk of lead exposure (e.g., due to exposure to heavy traffic and industry, or to dilapidated older housing). The recommendations of the AAFP are currently under review. The American Medical Association recommends regularly screening all children under the age of 6 years for lead exposure through history-taking and, when appropriate, blood lead testing. 149 They recommend that the decision to employ universal or targeted screening be made based on prevalence studies of blood lead levels in the local pediatric population.

No major organizations currently recommend screening pregnant women for elevated lead levels.

Discussion

There is fair evidence that screening for elevated lead levels in asymptomatic children at increased risk for lead exposure will improve clinical outcomes. Because there have been no controlled trials directly evaluating screening for elevated lead levels, this conclusion is based on a chain of evidence constructed from studies of weaker design. First, in young asymptomatic children, blood lead levels as low as 10 micro-g/dL are associated with measurable neurodevelopmental dysfunction. Second, although the national prevalence of elevated lead levels has declined substantially in the past decade, a high prevalence persists in some communities, particularly poor urban communities in the northeastern U.S. Third, measurement of venous blood lead concentration is a convenient, reliable, precise, and reasonably valid screening test for assessing lead exposure. Fourth, current interventions, including residential lead hazard control and chelation therapy, can reduce blood lead levels in children identified with levels >=25 micro-g/dL, although the quality of evidence supporting their effectiveness is weak and a beneficial effect on IQ or other clinical outcomes has not yet been demonstrated. There is also weak evidence that screening high-risk children for elevated lead levels results in improved clinical outcome compared to historical controls identified by case-finding. Based on this evidence of the current burden of suffering and the effectiveness of early detection, the Task Force recommends screening children at increased risk for lead exposure.

While no studies have evaluated a specific age at which to screen, the natural history of blood lead levels in children, which increase most rapidly between 6 and 12 months and peak at age 18-24 months, suggests that screening at about 12 months of age is likely to be most effective for the early detection of elevated lead levels.

For those children who are screened and found to have initial blood lead levels <25 micro-g/dL, there is as yet little evidence regarding the effectiveness of early detection and intervention, or of repeated screening to detect further increases in blood lead. Longitudinal and cross-sectional studies suggest that in children >=2 years, most such levels will decline naturally with time, but elevated levels may persist in children who are chronically exposed. 101

There is no direct evidence comparing the outcomes of universal screening with the outcomes from targeted screening for elevated lead levels. Recent studies indicate that the prevalence of elevated blood lead levels in the U.S. has declined dramatically in the past decade, but that local prevalence is highly variable, with more than 10-fold differences between communities. In a community with a low prevalence of elevated blood lead levels, universal screening may result in disproportionate risks and costs relative to benefits. The prevalence level at which targeted screening can replace universal screening is a public health policy decision requiring consideration of factors in addition to the scientific evidence for effectiveness of early detection, such as available resources, competing public health needs, and costs and availability of alternative approaches to reducing lead exposure. Good-quality analyses are needed to determine the population prevalence below which universal lead screening is not cost-effective. Clinicians can consult with their local or state health department regarding appropriate screening policy for the local child population.

In communities where data suggest that universal screening is not indicated, there may nevertheless be some children who are at increased risk of blood lead levels in the range for which individual intervention by chelation therapy or residential lead hazard control has been demonstrated to be effective. These children may have had exposure to lead sources such as lead-based hobbies or industries, traditional ethnic remedies, or lead-based pottery. Selective blood lead screening of such high-risk children is appropriate even in low prevalence communities. There is fair evidence that a validated questionnaire of known and acceptable sensitivity and specificity can identify those at high risk. In several studies, the CDC 128 and similar questionnaires correctly identified 64% to 87% of urban and suburban children who had blood lead levels >=10 micro-g/dL. These questionnaires have not been adequately evaluated as a screening tool to detect higher blood lead levels (e.g., >=20-25 micro-g/dL), or to detect exposure in other populations (e.g., migrant workers, rural communities). Locale-specific questionnaires that inquire about likely local sources of lead exposure may lead to improved prediction.
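
To make the practical meaning of these sensitivity figures concrete, the following minimal sketch shows the expected yield of questionnaire-targeted blood lead testing. The sensitivities (0.64 and 0.87) come from the studies cited above; the population size and 5% prevalence are hypothetical values chosen for illustration.

    # Illustrative sketch: expected yield of questionnaire-targeted blood
    # lead testing. Sensitivities (0.64 and 0.87) are from the studies
    # cited above; population size and prevalence are hypothetical.
    def questionnaire_yield(population, prevalence, sensitivity):
        """Return (children with elevated levels, children the questionnaire flags)."""
        elevated = population * prevalence
        flagged = elevated * sensitivity
        return elevated, flagged

    for sens in (0.64, 0.87):
        elevated, flagged = questionnaire_yield(1000, 0.05, sens)
        print(f"sensitivity {sens:.2f}: {flagged:.0f} of {elevated:.0f} "
              f"children with levels >=10 micro-g/dL identified")

At the lower end of the cited range (64% sensitivity), roughly one third of children with elevated levels (18 of 50 in this example) would go undetected, which is one reason the choice between universal and targeted strategies depends on local prevalence.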

As is the case in children, there are no controlled trials evaluating screening for elevated lead levels in pregnant women, nor are there sufficient data to construct an adequate chain of evidence demonstrating benefit. The prevalence of levels >15 micro-g/dL appears to be quite low in pregnant women. There is fair evidence that mildly elevated lead levels during pregnancy are associated with small increases in antepartum blood pressure, but limited evidence that these levels have important adverse effects on reproductive or other outcomes, including intelligence of offspring. An extensive literature search failed to identify studies evaluating screening or intervention for lead exposure in pregnant women. There are potentially important adverse effects of chelation therapy on the fetus, and of residential lead hazard control on both the pregnant woman and fetus if they are not performed according to established standards. Removal to a lead-free environment would theoretically be effective in reducing lead exposure but has not been specifically evaluated in pregnancy. There is thus insufficient evidence to recommend for or against screening pregnant women for the detection of elevated lead levels.

Population-based interventions for the primary prevention of lead exposure are likely to be more effective, and may be more cost-effective, than office-based screening, treatment and counseling. Community, regional, and national environmental lead hazard reduction efforts, such as reducing lead in industrial emissions, gasoline, and cans, have proven highly effective in reducing population blood lead levels. 1-6,150,151 Remaining important sources of lead (e.g., lead paint and pipes in older homes, lead-contaminated soil) are, however, more difficult to address on a population-wide basis. Studies of community-based efforts to reduce lead exposure from these and other sources in order to prevent the occurrence of elevated lead levels are ongoing. 25,110,152 Evaluation of the effectiveness of community-based interventions, and recommendations regarding their use, are beyond the scope of this document.

CLINICAL INTERVENTION

Screening for elevated lead levels by measuring blood lead at least once at age 12 months is recommended for all children at increased risk of lead exposure ("B" recommendation). All children with identifiable risk factors should be screened, as should children living in communities in which the prevalence of blood lead levels requiring individual intervention, including chelation therapy or residential lead hazard control, is high or is undefined. If capillary blood is used, elevated lead levels should be confirmed by measurement of venous blood lead. The optimal frequency of screening for lead exposure in children, or for repeated testing of children previously found to have elevated blood lead levels, is unknown and is left to clinical discretion; consideration should be given to the degree of elevation, the interventions provided, and the natural history of lead exposure, including the typical peak in lead levels at 18-24 months of age.

In communities where the prevalence of blood lead levels requiring individual intervention is low, a strategy of targeted screening, possibly using locale-specific questionnaires of known and acceptable sensitivity and specificity, can be used to identify high-risk children who should have blood lead testing. Examples of individual risk factors include: (a) living in or frequently visiting an older home (built before 1950) with dilapidated paint or with recent or ongoing renovation or remodeling, (b) having close contact with a person who has an elevated lead level, (c) living near lead industry or heavy traffic, (d) living with someone whose job or hobby involves lead exposure, (e) using lead-based pottery, or (f) taking traditional ethnic remedies that contain lead. 128 There is currently insufficient evidence to recommend an exact population prevalence below which targeted screening can be substituted for universal screening. The results of cost-benefit analyses, available resources, and public health priorities are among the determinants of the prevalence below which targeted screening is recommended for a community. Clinicians can seek guidance from their local or state health department.

There is insufficient evidence to recommend for or against routine screening for lead exposure in asymptomatic pregnant women ("C" recommendation). Recommendations against such screening may be made on the grounds of limited and conflicting evidence regarding the current burden of suffering, high costs, and the potential for adverse effects from intervention.

There is insufficient evidence to recommend for or against trying to prevent lead exposure by counseling families to control lead dust by repeated household cleaning, or to optimize caloric, iron, and calcium intake specifically to reduce lead absorption ("C" recommendation). For high-risk individuals or those living in high-prevalence communities, such recommendations may be made on other grounds, including minimal risk of adverse effects from the cleaning or the dietary advice, and the additional, unrelated benefits from optimizing nutrition (see Chapter 22, Screening for Iron Deficiency Anemia, and Chapter 56, Counseling to Promote a Healthy Diet).

Recommendations regarding community- or population-based interventions for the primary prevention of lead poisoning, assessment of community lead contamination, or the setting of community priorities for lead hazard reduction, are beyond the scope of this document.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Carolyn DiGuiseppi, MD, MPH.

24. Screening for Hepatitis B Virus Infection

Burden of Suffering

Each year in the U.S., an estimated 200,000-300,000 persons become infected with HBV, more than 10,000 require hospitalization, and 250 die of fulminant disease. 1,2 The greatest reported incidence occurs in adults aged 20-39. 3,4 Of note, the number of reported cases peaked in 1985 and has shown a continuous gradual decline since that time. 4 While most infections resolve with time, it is estimated that 1.0-1.25 million individuals have chronic, asymptomatic HBV infection (i.e., chronic carriers). 5 This places them at risk for developing chronic active hepatitis, cirrhosis, and primary hepatocellular carcinoma (PHC). In the population as a whole, 50-67% of acute HBV infections are asymptomatic, whereas over 90% of early childhood infections are asymptomatic. 3,6 Individuals with asymptomatic infections are still at risk for the development of chronic HBV infection and its sequelae. An estimated 22,000 births occur to HBV-infected women each year in the U.S. 6 Infants whose mothers are positive for hepatitis B e antigen (HBeAg) have a 70-90% chance of becoming infected perinatally. 6-8 Infections during infancy, while estimated to represent only 1-3% of cases, account for 20-30% of chronic infections. 6 The risk of developing a chronic HBV infection (i.e., carrier state) is inversely related to age at the time of infection. 6,9,10 This risk is 85-90% for infected infants and rapidly decreases to a steady risk of 6-10% in older children and adults. The risk of developing PHC or cirrhosis depends on the length of time that an individual has been chronically infected. It is estimated that infants who become chronically infected have a 25% lifetime risk and adults have a 15% lifetime risk of PHC or cirrhosis. 3 An estimated 5,000 hepatitis B-related deaths occur each year as a result of cirrhosis and PHC, with the median age of death occurring in the fifth decade of life. 1,11-13

The principal risk factors for HBV infection in the U.S. are injecting illicit drugs; heterosexual contact with HBV-infected persons or with persons at high risk for HBV infection (e.g., injection drug users); sexual contact with multiple sex partners; and male homosexual activity. 14-17 In 1990, heterosexual activity accounted for 27% of cases, homosexual activity for 11%, and injection drug use for 14%. 15 No associated risk factor can be identified in over 30% of patients with HBV infection. 6 In recent years, a growing number of injection drug users have become infected; currently, between 60% and 80% of persons who use illicit drugs parenterally have serologic evidence of HBV infection. 1 In a study of inner-city pregnant women, those who presented for delivery without prenatal care, with a positive drug screen, or with a past history of any illicit drug use were at increased risk for HBsAg positivity. 18 For example, those with no prenatal care and positive urine drug screens were 29 times more likely to be seropositive than those without these risk factors. Alaska Natives, Pacific Islanders, immigrants and refugees from HBV endemic areas (including Asia, Africa, and Eastern Europe), hemodialysis patients and staff, and residents and staff in institutions for the developmentally disabled are also at increased risk. 1,19

Accuracy of Screening Tests

The principal screening test for detecting current (acute or chronic) HBV infection is the identification of HBsAg. Immunoassays for detecting HBsAg have a reported sensitivity and specificity of greater than 98%. 20-20d Spontaneous clearance of HBsAg occurs each year in 1% of persons with chronic HBV infection. 21

Effectiveness of Early Detection

There is good evidence that early detection of HBsAg in pregnant women can prevent infection in the newborn. Controlled trials, 22-30a a cohort study, 31 and multiple time series 8,32,33 have shown that hepatitis B vaccine alone and in combination with hepatitis B immune globulin (HBIG) is effective in preventing the development of chronic HBV infection in infants born to HBsAg-positive mothers. Vaccine, in combination with a single dose of HBIG given within 12 hours of birth, is 75-95% efficacious in preventing chronic HBV infection, 22,25-27,30-31 whereas vaccine alone has an efficacy of 65-96%. 22,26,27,29,30a Although the ranges of efficacy overlap, the efficacy of hepatitis B vaccine in combination with HBIG was generally greater than that of vaccine alone in studies that directly compared the two strategies, with the difference reaching statistical significance in two studies. 24,31

In the past, prenatal testing for HBsAg was recommended only for pregnant women at high risk of having acquired HBV infection. 34 Recent studies in urban and minority populations have shown that only 35-65% of HBsAg-positive mothers are identified when testing is restricted to high-risk groups. 35-39 It is thought that many women at risk are not tested because their sexual and drug-related histories are not discussed with clinicians or because their clinicians are unfamiliar with perinatal transmission of HBV and recommended preventive measures. 40 In addition, many women who have asymptomatic chronic HBV infection may not acknowledge having risk factors even when a careful history is taken.

Detecting acute or chronic HBV infection may also be important in preventing virus transmission to others besides newborns. Screening tests coupled with counseling have the potential to influence certain behaviors (e.g., having sex with multiple partners, sharing needles among injection drug users, donating blood products) in infected persons, and thereby prevent transmission. Sexual contacts and persons with possible percutaneous exposure may also be identified in the process and offered vaccination (see Chapter 67). The effectiveness of routine screening of asymptomatic persons in the clinical setting as a means of reducing HBV transmission needs further study, however. Routine counseling on preventive behaviors to reduce the risk of infection and transmission, and appropriate vaccination, may be more effective strategies (see Chapters 62,65, and 66).

There is little evidence that early detection of asymptomatic HBV infection reduces the risk of developing chronic liver disease or its complications. Interferon eliminates HBsAg positivity in some individuals with a diagnosis of chronic hepatitis B, 41-45 but whether this results in a reduction in long-term morbidity and mortality has not been adequately evaluated.

A strategy of targeting high-risk populations for screening and immunizing those found to be seronegative has been ineffective in reducing the population incidence of HBV infection. 5,16 This approach has failed as a public health strategy because a high percentage (>30%) of patients have no identifiable risk factors 6 and because high-risk individuals (e.g., injection drug users) may not have access to screening and vaccination services. Nevertheless, individuals known to be at high risk who are found to be seronegative on screening can be immunized and thus protected from HBV infection (see Chapters 65 and 66).

Recommendations of Other Groups

The Advisory Committee on Immunization Practices (ACIP), 5 the American College of Obstetricians and Gynecologists, 46,47 the American Academy of Pediatrics, 47,48 and the American College of Physicians (ACP) 49 recommend that all pregnant women be tested for HBsAg during an early prenatal visit. The test may be repeated in the third trimester if acute hepatitis is suspected, an exposure to hepatitis has occurred, or the woman practices a high-risk behavior such as injection drug use. No major organizations recommend universal screening of nonpregnant individuals for HBV infection. ACIP and ACP recommend making decisions to test potential vaccine recipients for prior infection on the basis of cost-effectiveness, and state that testing in groups with the highest risk of HBV infections (i.e., HBV marker prevalence >20%) 1 is usually cost-effective. 1,49

Discussion

Because many HBsAg-positive women are not detected during pregnancy when only high-risk women are screened, routine HBsAg testing of all pregnant women is a more effective strategy for the prevention of perinatal HBV transmission. It has been calculated that screening all of the more than four million pregnant women each year in the U.S. would detect about 22,000 HBsAg-positive mothers, and treatment of their newborns would prevent the development of chronic HBV infection in an estimated 6,000 neonates each year. 50 Several studies have demonstrated that the long-term benefits of preventing chronic liver disease make routine prenatal HBsAg testing as cost-effective as other widely implemented prenatal and blood donor screening practices. 39,51-53 Despite current recommendations for universal vaccination of newborns against HBV (see Chapter 65), screening all pregnant women for HBV infection is recommended as an effective intervention because the vaccine alone appears to be less efficacious than the combination of vaccine and HBIG in preventing HBV infection of infants exposed to HBsAg-positive mothers.
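
The relationship among these projections can be checked with rough arithmetic. In the sketch below, the 4 million annual births and 22,000 HBsAg-positive mothers come from the text; the intermediate rates (the share of exposed infants who would become chronic carriers without prophylaxis, averaged across maternal HBeAg status, and the efficacy of combined HBIG and vaccine) are assumed for illustration and are not taken from the source.

    # Rough consistency check of the cited projections. The assumed
    # carrier rate (averaged across maternal HBeAg status) and the
    # prophylaxis efficacy are illustrative, not from the source.
    births = 4_000_000
    hbsag_positive_mothers = 22_000
    assumed_carrier_rate = 0.35   # infants becoming chronic carriers if untreated
    assumed_efficacy = 0.85       # combined HBIG plus vaccine

    prevented = hbsag_positive_mothers * assumed_carrier_rate * assumed_efficacy
    print(f"screening prevalence: {hbsag_positive_mothers / births:.2%}")  # ~0.55%
    print(f"chronic infections prevented: ~{prevented:,.0f}")              # ~6,500

Under these assumed rates the arithmetic yields roughly 6,500 chronic infections prevented per year, of the same order as the published estimate of 6,000 cited above.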

A recommendation for universal screening of the general population would require proof that intervention reduces the morbidity and mortality associated with asymptomatic chronic HBV infection, or that it reduces or prevents HBV transmission. While interferon therapy appears promising as an intervention, current data are insufficient to recommend its use in asymptomatic HBV-infected persons in order to improve clinical outcome. Similarly, there is little evidence to support screening and counseling seropositive persons as an effective intervention to prevent HBV transmission. Given the low burden of suffering, lack of evidence of benefit, and the costs and inconvenience associated with testing, universal screening in the nonpregnant population cannot be recommended at this time.

Routine vaccination with hepatitis B vaccine is discussed in Chapters 65 and 66. Prevaccination screening is likely to be cost-effective and should be considered in high-risk groups where the rate of previous infection is high (e.g., >20-40%), in order to avoid vaccinating immune individuals or persons with chronic HBV infections. 54-56 In these populations, screening with antibody to hepatitis B core antigen (anti-HBc), which identifies all previously infected individuals, including those with chronic HBV infection, may be preferable. 5 In other high-risk adolescents and adults, routine vaccination without screening may be more cost-effective (see Chapters 65 and 66).

CLINICAL INTERVENTION

Screening with hepatitis B surface antigen (HBsAg) to detect active (acute or chronic) HBV infection is recommended for all pregnant women at their first prenatal visit ("A" recommendation). The test may be repeated in the third trimester if the woman is initially HBsAg-negative and engages in high-risk behavior such as injection drug use or if exposure to hepatitis B virus during pregnancy is suspected. Infants born to HBsAg-positive mothers should receive hepatitis B immune globulin (HBIG) (0.5 mL) intramuscularly within 12 hours of birth. Hepatitis B vaccine, at the appropriate dosage, should be administered intramuscularly concurrently with HBIG (at a different injection site). The second and third doses of vaccine should be given 1 and 6 months after the first dose. Depending on the brand of vaccine utilized, the dosage of vaccine given to an infant born to an HBsAg-positive mother may differ from that given routinely to infants born to HBsAg-negative mothers. For neonates born to women whose HBsAg status is unknown at the time of delivery, administering vaccine within 12 hours of birth, using the same dosage as that for infants whose mothers are HBsAg-positive, is recommended. Maternal testing for HBsAg should be performed at the same time. If the mother is found to be HBsAg-positive, HBIG should be administered to her infant as soon as possible and within 7 days of birth. Contacts (sexual or household) of HBsAg-positive pregnant women should be either vaccinated or tested to determine susceptibility to HBV and vaccinated if susceptible (see also Chapter 67). The decision to do prevaccination testing may be made based on cost-effectiveness analysis.

Routine screening for HBV infection in the general population is not recommended ("D" recommendation). There is insufficient evidence to recommend for or against routinely screening asymptomatic high-risk individuals for HBV infection in order to determine eligibility for vaccination, but recommendations for screening may be made based on cost-effectiveness analyses ("C" recommendation). Such analyses suggest that screening is usually cost-effective in groups with an HBV marker prevalence >20%. 1,49 See Chapters 65 and 66 for further recommendations on hepatitis B vaccination and Chapter 67 for information about passive and active immunization of persons with possible exposure to HBV-infected individuals or blood products. Counseling on preventive behaviors to reduce the risk of HBV infection and transmission is discussed in Chapter 62.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Peter W. Pendergrass, MD, MPH, and Carolyn DiGuiseppi, MD, MPH.

25. Screening for Tuberculous Infection -- Including Bacille Calmette-Guérin Immunization

Burden of Suffering

About 10-15 million persons in the United States are infected with Mycobacterium tuberculosis. 1 More than 24,000 reported cases of tuberculosis (TB) occurred in the U.S. in 1994. 1a,2 This disease is associated with considerable morbidity from pulmonary and extrapulmonary pathology. Pulmonary symptoms are progressive and include cough, hemoptysis, dyspnea, and pleuritis. Extrapulmonary TB can involve the bones, joints, pericardium, and lymphatics, and it can cause spinal cord compression from Pott's disease. Death is more common in older patients and infants, with estimated case-fatality rates ranging from 0.3% in adolescents to 18.5% in the elderly. 3 Newborns and infants also experience significant morbidity from this disease.

The incidence of TB is greatest in Asians, Pacific Islanders, blacks, American Indians, Alaska Natives, and Hispanics. About one third of all reported cases in the U.S. occur in blacks, 20% occur in Hispanics, 14% occur in Asians and Pacific Islanders, and 1% occur in American Indians and Alaska Natives. 2 About 30% of new cases occur in foreign-born immigrants. 2 The prevalence in homeless persons is 1-7% for clinically active TB and 18-51% for asymptomatic M. tuberculosis infection. 4

After experiencing a steady decline from 1963 to 1984, reported TB cases increased by 20% from 1985 to 1992. 2 A disproportionately large number of new cases are occurring among black and Hispanic persons, among whom there was a 41% increase in reported TB cases between 1985 and 1992. 2 Infection with human immunodeficiency virus (HIV) is a major contributor to the recent increase in TB cases. Persons infected with HIV are more than 100 times more likely to develop active TB than are persons with competent immune systems, and the onset of the disease is often more rapid. 5 Reports of multidrug-resistant TB have also increased in recent years. Nationally, the proportion of new cases resistant to both isoniazid (INH) and rifampin increased from 0.5% in 1982 to 3.3% in the first quarter of 1991. 6 In New York City, as many as 33% of cases are resistant to INH or rifampin. 7 Reported case-fatality rates in patients with multidrug-resistant TB, most of whom are infected with HIV, have been as high as 72-89% (the rate is about 30-40% for immunocompetent individuals). 5 The latest data indicate a 9% decrease in annually reported TB cases from 1992 to 1994, in part reflecting intensified federal, state, and local TB control efforts. 1a

Accuracy of Screening Tests

Tuberculin skin testing is the principal means of detecting M. tuberculosis infection in asymptomatic persons. Although some authors recommend chest radiography as a first-line test in high-risk populations, 8 roentgenography is generally considered to be inappropriate as the initial screening test for detecting tuberculous infection in asymptomatic persons. It is important, however, as a follow-up test to identify active pulmonary TB in infected persons identified through tuberculin testing. The most accurate tuberculin skin test is the Mantoux test, in which 5 units (5 TU) of tuberculin purified protein derivative (PPD) are injected intradermally to detect delayed hypersensitivity reactions within 48-72 hours.

The frequency of false-positive and false-negative tuberculin skin tests depends on a number of variables, including immunologic status, the size of the hypersensitivity reaction, and the prevalence of atypical mycobacteria. In certain geographic areas, cross-reacting atypical mycobacteria (as well as previous BCG vaccination) can produce intermediate size reactions, thereby limiting the specificity of the test. 9-11 False-positive results can also be produced by improper technique (e.g., measuring erythema rather than induration), hypersensitivity to PPD constituents, an Arthus reaction, and cellulitis. Prior BCG vaccination may produce false-positive indurations, but these are generally less than 10 mm in diameter. Moreover, because many BCG vaccinees either lose their immunity over time or do not convert, large indurations cannot be confidently attributed to prior BCG vaccination. 12,13 False-negative reactions, which are estimated to occur in about 5-10% of patients, can be observed early in infection before hypersensitivity develops, in anergic individuals and those with severe illnesses (including active TB), in newborns and infants less than 3 months of age, and as a result of improper technique in handling the PPD solution, administering the intradermal injection, and interpreting the results. 11 Other limitations of the Mantoux test include the time and skill required for proper administration and variability among clinicians in interpreting results. 14

Multiple puncture tests (e.g., tine, Mono-Vacc) are less expensive and easier to administer than the Mantoux test. Studies evaluating the accuracy of these devices, however, have produced inconsistent results. In general, the evidence suggests that multiple puncture tests have poor specificity and may have inadequate sensitivity when compared with the Mantoux test. 15-17 Some of this inaccuracy is due to inconsistencies in the dose of injected tuberculin delivered by multiple puncture tests. Patient compliance can also affect the effectiveness of tuberculin skin testing because patients must return to the clinician 48-72 hours after the injection to have the test interpreted. Studies in pediatric patients report noncompliance rates of 28-82%. 18-20

Persons who are tuberculin test negative may need repeat testing, but there are inadequate data from which to determine the optimal frequency of PPD screening. In the absence of such data, clinical decisions regarding the need for repeat testing and its frequency should be based on the likelihood of further exposure to TB and the clinician's level of confidence in the accuracy of the test results. Some negative reactions to tuberculin skin tests require immediate retesting (two-step testing) to help determine whether future positive reactions are due to the booster phenomenon or to new conversion. A positive result on the second test, typically performed 1-3 weeks later, suggests that the patient has been previously infected (boosted reaction), whereas a negative result on the second test followed by a positive result on subsequent testing suggests recent conversion. Two-step testing has become more common in screening health care workers 21 and other population groups (e.g., elderly nursing home residents) for tuberculous infection.
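
The two-step interpretation described above can be summarized schematically. The sketch below captures only the sequence of interpretations; actual readings depend on induration size, risk group, and the 1-3-week retest interval, none of which are modeled here.

    # Schematic of two-step tuberculin testing as described above.
    # Booleans stand in for reactive/nonreactive results; real
    # interpretation uses induration diameters, not simple booleans.
    def interpret_two_step(first_reactive, second_reactive, later_reactive=False):
        if first_reactive:
            return "reactive on initial test"
        if second_reactive:
            return "boosted reaction: suggests previous infection"
        if later_reactive:
            return "recent conversion: suggests new infection"
        return "nonreactive"

    print(interpret_two_step(False, True))         # boosted reaction
    print(interpret_two_step(False, False, True))  # recent conversion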

Effectiveness of Early Detection

The early detection of tuberculin reactivity is of potential benefit because chemoprophylaxis with INH is an effective means of preventing the subsequent development of active TB. 22 A review of 14 controlled trials found that efficacy in preventing clinical disease ranges between 25% and 88% among persons assigned to a 1-year course of INH. 22 Among individuals who complete the course of chemoprophylaxis, efficacy is greater than 90%. 23,24 Some studies suggest that 6 months of INH therapy in adults is nearly as effective as 12 months of treatment. 23,26 Preventive INH therapy is also of potential public health value in preventing future disease activity and transmission of the organism to household members and other close contacts.

A number of factors, however, limit the effectiveness of INH chemoprophylaxis. Some organisms are resistant to INH and other agents. 5 Patient compliance with a 6-12-month regimen is often difficult. The most important limitation of INH is its potential hepatotoxicity. INH-induced hepatitis occurs in about 0.3-2.3% of patients, 27 the frequency increasing with age and other factors (e.g., alcohol use). The condition can be fatal, but the exact frequency of fatal INH-induced hepatitis is uncertain. Mortality rates for persons with INH-related hepatitis were reported to be as high as 4-7% in one major study, with risk increasing directly with age (zero for persons less than 20 years of age, 0.3% for persons 20-34 years of age, 1.2% for persons 35-49 years of age, 2.3% for persons 50-64 years of age). 28 These data may overestimate the actual mortality from INH-induced hepatitis because the local incidence of cirrhosis-related deaths was increased in one of the communities participating in the study. 29 More recent analyses of published and unpublished data have estimated that the incidence of fatal INH-induced hepatitis is about 1-14/100,000 persons started on preventive therapy. 30,31 The risk may be lowered by performing periodic liver function tests while patients take INH. In persons who develop complications from INH, the resulting interruption of INH therapy before completion of the 1-year course may also lower the effectiveness of TB prevention. 24

Although the benefits of INH probably outweigh its side effects in persons at high risk for developing active TB (see Clinical Intervention for description of high-risk groups), it is uncertain from available data whether low-risk, asymptomatic persons with a reactive tuberculin skin test are at sufficient risk of developing TB to justify the risks of INH-induced hepatitis. Epidemiologic calculations suggest that the annual incidence of TB in a low-risk population is less than 0.1%, 32,33 and that the lifetime probability of developing active TB ranges from 1.2% at age 20 to 0.37% at age 80. 34 Depending on the risk of INH-induced hepatitis, it is possible for complications from INH treatment to be more likely than the development of TB. In the absence of definitive clinical studies to clarify this issue, investigators have used decision analysis techniques to compare the benefits and risks of INH in tuberculin skin reactors of different ages. The results of these analyses have been inconsistent. One group concluded that benefits outweigh risks until the patient exceeds age 45; 35 another found that treatment was beneficial at all ages; 27 and another analysis concluded that INH should be withheld at all ages in the absence of other risk factors. 34 A decision analysis in young adults concluded that treatment was not beneficial in this age group. 32 An analysis for elderly tuberculin skin reactors concluded that INH would neither improve nor worsen 5-year survival but would decrease the risk of developing active disease. 33 An analysis for HIV-infected injection drug users concluded that, with the exception of black women, such patients would benefit from INH therapy even in the absence of tuberculin skin testing. 36
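
The tradeoff these decision analyses weigh can be illustrated with the figures cited above. The sketch below is deliberately simplified (the published analyses model age-specific disease rates, compliance, hepatitis case-fatality, and quality of life); it shows only why the conclusion is sensitive to the assumed hepatitis risk.

    # Simplified sketch of the INH risk-benefit comparison for a low-risk
    # 20-year-old reactor, using figures cited above. Published decision
    # analyses are far more detailed; this shows only the sensitivity of
    # the conclusion to the assumed hepatitis risk.
    lifetime_tb_risk = 0.012          # cited lifetime risk of active TB at age 20
    inh_efficacy = 0.90               # cited efficacy with completed therapy
    hepatitis_risks = (0.003, 0.023)  # cited range of INH-induced hepatitis

    tb_cases_averted = lifetime_tb_risk * inh_efficacy  # ~0.011 per person treated
    for risk in hepatitis_risks:
        relation = "exceeds" if risk > tb_cases_averted else "is below"
        print(f"hepatitis risk {risk:.1%} {relation} "
              f"TB cases averted ({tb_cases_averted:.1%})")

At the low end of the cited hepatitis risk, treatment averts more TB than it causes hepatitis; at the high end, the reverse holds, which is consistent with the conflicting conclusions of the published analyses.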

Bacille Calmette-Guérin (BCG) Vaccination

Primary prevention through vaccination represents an alternative approach to the prevention of TB. BCG, a live vaccine derived from attenuated Mycobacterium bovis, has been used worldwide for more than 50 years to prevent TB. Clinical trials of the efficacy of BCG have yielded inconsistent results since the early 1930s, however, with reported levels of protection ranging from -56% to 80%. 37,38 Observational studies have shown that the incidence of the disease is lower in vaccinated children than in unvaccinated controls. 39-43 Factors contributing to the wide variation in results in BCG vaccine efficacy include genetic changes in the bacterial strains as well as differences in production techniques, methods of administration, and the populations and environments in which the vaccine has been studied. 44 A meta-analysis of 14 trials and 12 case-control studies concluded that BCG offered 50% protection against TB overall and 64-71% protection against TB meningitis and TB-related death. 45

The potential adverse effects of BCG vaccination include prolonged ulceration and local adenitis, which occur in about 1-10% of vaccinees. The risk varies with the type of vaccine used, the population, and the methods used to measure complications. Osteomyelitis and death from disseminated BCG infection are estimated to occur in one case per million doses administered. 44

In the U.S., where the risk of becoming infected with M. tuberculosis is relatively low, the disease can currently be controlled most successfully by screening and early treatment of infected persons. However, BCG vaccination may have a role in the U.S. for persons with special exposures to individuals with active TB, such as uninfected children who are at high risk for continuous or repeated exposure to infectious persons who are undetected or untreated, 44 or a future role in light of escalating multidrug resistance.

Recommendations of Other Groups

The Centers for Disease Control and Prevention (CDC), American Thoracic Society (ATS), and other members of the Advisory Committee for Elimination of Tuberculosis recommend screening the following groups for tuberculous infection: persons infected with HIV; close contacts of persons with TB; persons with medical risk factors associated with TB; immigrants from countries with high TB prevalence; medically underserved low-income populations; injection drug users; and residents and employees of high-risk facilities. 46 Similar recommendations have been issued by the American Academy of Family Physicians. 55 Although the Canadian Task Force on the Periodic Health Examination recommends screening high-risk groups, it gave an "E" recommendation (good evidence against performing the maneuver in the periodic health examination) to screening low-risk persons. 46a The American Academy of Pediatrics (AAP) recommends against routine annual skin testing of children who lack risk factors and live in low-prevalence communities. The AAP does recommend annual Mantoux testing of high-risk children, as well as consideration of less frequent periodic testing (e.g., at ages 1, 4-6, and 11-16 years) of low-risk children who live in high-prevalence communities or have unreliable histories. 47 The Bright Futures guidelines recommend annual testing for persons of low socioeconomic status, those in high-prevalence areas, those exposed to TB, and immigrants. 48 The American Medical Association's Guidelines for Adolescent Preventive Services (GAPS) recommend annual testing for adolescents in high-risk settings including those in homeless shelters, correctional institutions, and health care facilities. 49

Recommendations on how to perform tuberculin skin testing have been issued by the CDC and ATS. 11 The CDC has recently issued guidelines on preventing transmission in health care facilities, which include specific recommendations on the categories of health care workers to include in skin-testing programs and the frequency with which they should be tested. 50 Guidelines for the treatment of converters have been issued in a joint statement by the ATS, AAP, CDC, and Infectious Diseases Society of America. 51,52 The CDC has also issued recommendations on multidrug preventive therapy for converters with suspected contact with drug-resistant TB. 5 Screening certain populations for tuberculous infection is required by law in 44 states. 7

Recommendations on BCG vaccination have been issued in a joint statement by the Immunization Practices Advisory Committee and the Advisory Committee for Elimination of Tuberculosis. 44 They recommended limiting BCG vaccination in the U.S. to tuberculin-negative infants and children who cannot be placed on INH and who have continuous exposure to persons with active disease, those with continuous exposure to patients with organisms resistant to INH or rifampin, and those belonging to groups with a rate of new infections greater than 1% per year and for whom the usual surveillance and treatment programs may not be operationally feasible.

CLINICAL INTERVENTION

Screening for tuberculous infection by tuberculin skin testing is recommended for all persons at increased risk of developing tuberculosis (TB) ("A" recommendation). Asymptomatic persons at increased risk include persons infected with HIV, close contacts of persons with known or suspected TB (including health care workers), persons with medical risk factors associated with TB, immigrants from countries with high TB prevalence (e.g., most countries in Africa, Asia, and Latin America), medically underserved low-income populations (including high-risk racial or ethnic minority populations), alcoholics, injection drug users, and residents of long-term care facilities (e.g., correctional institutions, mental institutions, nursing homes). The Mantoux test involves the intradermal injection of 5 units of tuberculin PPD and the subsequent examination of the injection site 48-72 hours later. Current minimum criteria for a positive skin test, based on observational data and expert opinion, are 15-mm diameter for low-risk individuals, 10-mm diameter for high-risk individuals (e.g., immigrants, medically underserved low-income populations, injection drug users, residents of long-term care facilities, persons with conditions that increase TB risk, infants, and children less than 4 years of age), and 5-mm diameter for persons at very high risk (e.g., persons infected with HIV, persons with abnormal chest radiographs, recent contacts of infected persons). Prior BCG vaccination is not currently considered a valid basis for dismissing positive results. Persons with negative reactions who are at increased risk of anergy (e.g., HIV-infected individuals) can be skin-tested for anergy, 53 but this procedure is now considered optional in current CDC guidelines. 46 Treatment decisions in HIV-infected anergic patients should be made on an individual basis. 54 The frequency of tuberculin skin testing is a matter of clinical discretion.
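
The risk-stratified induration criteria above amount to a simple decision rule. The sketch below records only the cutoffs; assigning a patient to a risk tier is itself a clinical judgment and is not modeled here.

    # Sketch of the risk-stratified Mantoux criteria described above.
    # Assigning a patient to a risk tier is a clinical judgment.
    CUTOFFS_MM = {"low": 15, "high": 10, "very high": 5}

    def mantoux_positive(induration_mm, risk_tier):
        """Apply the minimum induration criterion for the given risk tier."""
        return induration_mm >= CUTOFFS_MM[risk_tier]

    print(mantoux_positive(12, "low"))        # False: below the 15-mm cutoff
    print(mantoux_positive(12, "high"))       # True: meets the 10-mm cutoff
    print(mantoux_positive(6, "very high"))   # True: meets the 5-mm cutoff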

Persons with a positive PPD test should receive a chest x-ray and clinical evaluation for TB. Those lacking evidence of active infection should receive INH prophylaxis if they meet criteria defined in recent guidelines. 52 Briefly, these criteria recommend INH prophylaxis in persons under 35 years of age who are from high-prevalence countries; medically underserved, low-income, high-prevalence populations; or long-term care facilities. It is also recommended in persons of any age with HIV infection or increased risk of HIV infection, other medical conditions that increase the risk of TB, or close contact with patients with newly diagnosed TB or skin test conversion. Screening for HIV infection may be indicated in recent converters (see Chapter 28). Patients with possible exposure to drug-resistant TB should be treated according to current recommendations for multidrug preventive therapy. 5 Directly observed therapy -- observation of the patient by a health care worker as the medication is taken -- may be indicated in patients who are unlikely to be compliant.

BCG vaccination against TB should be considered only for tuberculin-negative infants and children who cannot be placed on INH and who have continuous exposure to persons with active disease, those with continuous exposure to patients with organisms resistant to INH or rifampin, and those belonging to groups with a rate of new infections greater than 1% per year and for whom the usual surveillance and treatment programs may not be operationally feasible ("B" recommendation). These groups may also include persons with limited access to or willingness to use health care services.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Steven H. Woolf, MD, MPH.

26. Screening for Syphilis

Burden of Suffering

Syphilis is caused by infection with the bacterium Treponema pallidum, which can be transmitted congenitally or by sexual contact. In 1994, 20,627 cases of primary and secondary syphilis were reported in the United States. 1 Primary syphilis produces ulcers of the genitalia, pharynx, or rectum, and secondary syphilis is characterized by contagious skin lesions, lymphadenopathy, and condylomata lata. 2 Systemic spread, including invasion of the central nervous system, can occur early in infection and may be symptomatic during early or late stages of syphilis. The disease then evolves into a latent phase in which syphilis is clinically inapparent. If left untreated, as many as one third of patients progress to have potentially severe late gummatous, cardiovascular, and neurologic complications. 3 Cardiovascular syphilis produces aortic disease (insufficiency, aneurysms, aortitis), and neurosyphilis can result in meningitis, peripheral neuropathy (e.g., tabes dorsalis), meningovascular brain lesions, and psychiatric illness. Persons with tertiary syphilis may have decreased life expectancy, and they often experience significant disability and diminished productivity as a result of their symptoms. Long-term hospitalization is often necessary for patients with severe neurologic deficits or psychiatric illness. Syphilis has been associated epidemiologically with acquisition and transmission of infection with human immunodeficiency virus (HIV). 4,5

The incidence of syphilis has decreased by 50% since 1990, but it is still high and now approximates 1970 rates. 1 A growing proportion of cases is being reported among commercial sex workers and persons who use illicit drugs, especially those using crack cocaine and those who exchange sex for drugs. 6,7 There are pronounced geographic differences in the incidence of syphilis in different communities. In recent data, nearly all counties with a high incidence of reported syphilis cases (more than 10/100,000 persons) were in large metropolitan areas or in southern states; nearly two thirds of all counties in the U.S. reported no cases of primary or secondary syphilis in the most recent year. 1 The incidence of reported infections among Hispanics and blacks is 5-60 times higher than that in non-Hispanic whites. 1 Individual communities may experience substantial fluctuations in incidence rates independent of national trends.

The incidence of congenital syphilis rose sharply over the preceding 15 years but has fallen since 1991. 1 Congenital syphilis results in fetal or perinatal death in 40% of affected pregnancies, as well as in an increased risk of medical complications in surviving newborns. 8 The incidence of congenital syphilis increased steadily in the United States from 1978 to 1991, 9 reaching 108 cases per 100,000 live births in 1991. 1 (The reporting definition changed in 1989 to reflect both confirmed cases and infants at high risk of infection.) The rate dropped from 1991 to 1994, to 56 cases per 100,000 live births. 1

Accuracy of Screening Tests

Serologic tests are currently the mainstay for syphilis diagnosis and management. Nontreponemal tests are used to screen patients for the presence of nonspecific reagin antibodies that appear and rise in titer following infection. Although VDRL (Venereal Disease Research Laboratory) and RPR (rapid plasma reagin) are the most commonly used nontreponemal tests, others are available. The sensitivity of nontreponemal tests varies with the levels of antibodies present during the stages of disease. In early primary syphilis, when antibody levels may be too low to detect, results may be nonreactive, and the sensitivity of nontreponemal tests is 62-76%. 10 Antibody levels rise as disease progresses; titers usually peak during secondary syphilis, when the sensitivity of nontreponemal tests approaches 100%. In late syphilis, titers decline, and previously reactive results revert to nonreactive in 25% of patients; in untreated late syphilis, test sensitivity averages only 70%. 10 Nontreponemal test titers decline or revert to normal after successful treatment.

Nontreponemal tests can produce sustained or transient false-positive reactions due to preexisting conditions (e.g., collagen vascular diseases, injection drug use, advanced malignancy, pregnancy) or infections (e.g., malaria, tuberculosis, viral and rickettsial diseases), or due to laboratory-associated errors. 10-12 The specificity of nontreponemal tests is 75-85% in persons with preexisting diseases or conditions, and it approaches 100% in persons without them. 10,13 Because nontreponemal serodiagnostic tests may be falsely positive, all reactive results in asymptomatic patients should be confirmed with a more specific treponemal test such as fluorescent treponemal antibody absorption (FTA-ABS), which has a sensitivity of 84% in primary syphilis and almost 100% for other stages, and a specificity of 96%. 14 Two less expensive and easier to perform confirmatory tests are the MHA-TP (microhemagglutination assay for antibodies to Treponema pallidum) and HATTS (hemagglutination treponemal test for syphilis). 13

Treponemal tests should not be used as initial screening tests in asymptomatic patients, as they are considerably more expensive and remain reactive in patients with previous, treated infection. Used in concert with nontreponemal tests, however, the positive predictive value of treponemal tests is high, and reactive results are likely to represent true infection with syphilis. Treponemal tests may also be useful in patients with suspected late syphilis and nonreactive nontreponemal tests, since declining antibody titers may produce false-negative nontreponemal tests. All test results should be evaluated in concert with a clinical diagnosis and history.
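To see how confirmatory testing raises the predictive value, a worked example may help. The following minimal sketch (in Python) applies Bayes' theorem twice, using the FTA-ABS figures quoted above; the assumed 1% prevalence in a high-risk screening population, the 98% nontreponemal specificity (a blend of the 75-85% and near-100% figures), and the assumption that errors of the two tests are independent are illustrative choices, not figures from this chapter.

    def ppv(prevalence, sensitivity, specificity):
        # Positive predictive value via Bayes' theorem.
        true_pos = prevalence * sensitivity
        false_pos = (1 - prevalence) * (1 - specificity)
        return true_pos / (true_pos + false_pos)

    prev = 0.01                       # assumed prevalence in a high-risk group
    ntt_sens, ntt_spec = 0.75, 0.98   # nontreponemal test (e.g., RPR), assumed
    fta_sens, fta_spec = 0.84, 0.96   # FTA-ABS figures quoted above

    ppv_screen = ppv(prev, ntt_sens, ntt_spec)
    # A reactive screen becomes the prior probability for the confirmatory test:
    ppv_confirmed = ppv(ppv_screen, fta_sens, fta_spec)

    print(f"PPV, nontreponemal test alone:  {ppv_screen:.0%}")     # ~27%
    print(f"PPV after FTA-ABS confirmation: {ppv_confirmed:.0%}")  # ~89%

Under these assumptions, roughly three quarters of unconfirmed reactive screens would be false positives, while nearly nine in ten confirmed results would represent true infection, which is the quantitative rationale for the two-stage strategy.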

Infection with HIV may alter the clinical presentation and performance of serologic tests for syphilis. Co-infection with HIV and syphilis does not generally impair the sensitivity of syphilis testing, although there are sporadic reports of absent or delayed response to nontreponemal tests. 14,15 In contrast, HIV infection may reduce the specificity of syphilis testing; several studies have noted increased reactivity to nontreponemal tests among HIV-infected persons without syphilis. 15,16 Persistence of elevated nontreponemal titers after treatment for syphilis has also been reported in some HIV-infected persons, making it difficult to confirm the adequacy of treatment. 17,18 At the same time, treponema-specific tests may become nonreactive after treatment of syphilis in HIV-infected persons, limiting the ability to document past infection. 14,19,20

Effectiveness of Early Detection

Early detection of syphilis in asymptomatic persons permits the initiation of antibiotic therapy to eradicate the infection, thereby preventing both clinical disease and transmission to sexual contacts. Antibiotic therapy with penicillin G benzathine (or tetracycline hydrochloride if neurosyphilis has been excluded) has been shown to be highly effective in eliminating T. pallidum. Early detection and penicillin treatment during pregnancy have the added benefit of reducing the risk to the fetus of acquiring congenital syphilis. 9 Prenatal antibiotic therapy is effective in preventing congenital syphilis when the mother is treated with penicillin early in pregnancy (desensitization for penicillin allergy may be required). 21 Failures can occur, however, if women are treated with erythromycin, an antibiotic with limited efficacy in preventing congenital syphilis, or if antibiotic therapy is not started until the third trimester. 21

Recommendations of Other Groups

The American College of Obstetricians and Gynecologists and the American Academy of Pediatrics recommend routine prenatal screening for syphilis at the first prenatal visit, after exposure to an infected partner, and in the third trimester for patients at high risk. 22,23 In the event of incomplete or equivocal data on maternal serology or treatment, neonatal testing is recommended. The American Academy of Family Physicians 24 and the American College of Physicians 25 recommend serologic screening for syphilis in high-risk adults (prostitutes, persons who engage in sex with multiple partners in areas in which syphilis is prevalent, contacts of persons with active syphilis). The American Academy of Family Physicians, 24 American Academy of Pediatrics, 22,26 American Medical Association, 27 and Bright Futures 28 all recommend routine syphilis screening for sexually active adolescents at increased risk. The Centers for Disease Control and Prevention recommends obtaining serology for syphilis from all women at the first prenatal visit. 21 In communities and populations with high syphilis prevalence or for patients at high risk, serologic testing should be repeated during the third trimester and again at delivery. 21 The Canadian Task Force on the Periodic Health Examination recommends testing for syphilis in pregnant women and sexually active persons in high-risk groups. 29

Discussion

Since the annual incidence of syphilis is less than 10 cases per 100,000 persons, 1 routine screening of the general population is likely to have low yield. Populations at increased risk due to high-risk sexual activities include commercial sex workers, persons who exchange sex for drugs, persons with other sexually transmitted diseases (STDs) including HIV, and contacts of persons with active syphilis. The value of screening for asymptomatic infection in other persons will depend both on individual risk factors (e.g., the number and nature of sex partners) and on local epidemiology. Experience with HIV and other STDs demonstrates that sexual history is not sufficiently sensitive to identify infected persons in high-risk communities; some persons may not report risk factors, and even monogamous patients may be at risk from an infected partner. Conversely, in communities where syphilis is uncommon, screening asymptomatic persons is likely to detect few cases of syphilis, even when patients have high-risk behaviors.

Routine screening in both high- and low-risk areas is justified among pregnant women, because of the severe neonatal morbidity and mortality associated with congenital syphilis, as well as its potential preventability. Determination of sexual risk factors is often insensitive in pregnant women, who may be reluctant to admit some behaviors or unaware of risk factors in their partners. 10 Several studies have demonstrated that prenatal screening for syphilis is cost-effective, even when the prevalence of the disease among pregnant women is as low as 0.005%. 30,31 Currently, congenital syphilis occurs in 0.05% of all live births. 1
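The arithmetic behind this threshold is easy to check. A minimal sketch (in Python) using the prevalence figures quoted above; the 100,000-pregnancy cohort is a hypothetical round number, and test sensitivity is treated as perfect for simplicity.

    # Expected yield of prenatal syphilis screening, using the prevalence
    # figures quoted above; the cohort size is hypothetical.
    cohort = 100_000
    scenarios = [
        ("cost-effectiveness threshold", 0.005 / 100),     # 0.005% prevalence
        ("current congenital syphilis rate", 0.05 / 100),  # 0.05% of live births
    ]
    for label, prevalence in scenarios:
        print(f"{label}: ~{cohort * prevalence:.0f} cases per {cohort:,} pregnancies")
    # -> ~5 vs. ~50 cases per 100,000 pregnancies

Since not every maternal infection produces a reported congenital case, the true prevalence among pregnant women is at least as high as the congenital rate, so current rates sit roughly an order of magnitude above the break-even threshold for screening.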

CLINICAL INTERVENTION

Routine serologic testing for syphilis is recommended for all pregnant women and for persons at increased risk for infection, including commercial sex workers, persons who exchange sex for money or drugs, persons with other STDs (including HIV), and sexual contacts of persons with active syphilis ("A" recommendation). The local incidence of syphilis in the community and the number of sex partners reported by an individual should also be considered in identifying persons at high risk of infection. The optimal frequency for such testing has not been determined and is left to clinical discretion.

All pregnant women should be tested at their first prenatal visit. For women at high risk of acquiring syphilis during pregnancy (e.g., women in the high-risk groups listed above), repeat serologic testing is recommended in the third trimester and at delivery. Follow-up serologic tests should be obtained to document decline in titers after treatment. They should be performed using the same test initially used to document infection (e.g., VDRL or RPR) to ensure comparability.

See Chapter 62 for recommendations on counseling to prevent sexually transmitted diseases.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by James G. Kahn, MD, MPH, and A. Eugene Washington, MD, MSc.

27. Screening for Gonorrhea --- Including Ocular Prophylaxis in Newborns

Burden of Suffering

Nearly 420,000 N. gonorrhoeae infections were reported in the U.S. in 1994; 1 the actual number of new infections is estimated to be closer to 800,000 per year, due to incomplete reporting. 2 Between 10% and 20% of untreated gonococcal infections in women lead to pelvic inflammatory disease (PID), which may require hospitalization or surgery. 3 PID is an important cause of chronic pelvic pain, ectopic pregnancy, and infertility; approximately one out of four women with a prior history of PID is unable to conceive. 4 Pregnant women with gonococcal infections are at increased risk for obstetric complications (e.g., stillbirth, low birth weight). 5,6 Infected women can give birth to infants with gonococcal conjunctivitis (ophthalmia neonatorum), a condition that often produces blindness if not treated. 5 Gonococcal infections produce urethritis, epididymitis, and proctitis in men, but few long-term complications. Disseminated gonococcal infection can cause tenosynovitis, septic arthritis, endocarditis, and meningitis, especially in persons with complement disorders.

The overall incidence of gonorrhea has steadily declined in the U.S. since 1975, 1,7 but rates of infection have remained very high in some groups of young men and women. 8 Over 60% of gonococcal infections occur in persons under age 25; 7 adolescents (ages 15-19) have rates of infection comparable to young adults (ages 20-24). A number of demographic and behavioral characteristics are associated with higher reported rates of gonorrhea: being unmarried, urban residence, low socioeconomic status, early sexual activity, multiple sex contacts, and a prior history of gonorrhea. 10,11 Rates of gonorrhea are highest in poor, minority communities in large cities and in the rural Southeast. Black male and female adolescents have a 10-20-fold higher rate of infection than their Hispanic or white counterparts. 7,8,11 Geographic variation is substantial, with the highest rates of infection in the Southeastern states. 1,7 Since most data come from public health clinics, the demographics of reported disease may not be entirely representative of the true distribution of infection.

Up to 80% of women infected with gonorrhea are asymptomatic, 12 and asymptomatic men and women comprise an important reservoir for new infection. Nearly half of all male partners of infected women, and over three quarters of female partners of infected men, are infected. 13,14 While the majority of infected men eventually develop symptoms, initial asymptomatic periods may last up to 45 days. 15 The prevalence of asymptomatic gonorrhea in high-risk communities is generally higher in women (4-5%) 16,17 than in men (1.5-2.5%). 15,18,19 Asymptomatic gonorrhea was uncommon, however, among women at a university health clinic (0.4%), 17 private practice patients in Montreal (0.4%), 20 and non-Medicaid patients in Boston (<1%). 21 Among asymptomatic male adolescents screened at urban teen clinics or detention centers, up to 5% are infected with N. gonorrhoeae. 22-25

The majority of pharyngeal infections are asymptomatic, but infection may be transmitted to genital sites through oral sex 26 or progress to disseminated gonococcal infection. 27 Pharyngeal gonorrhea, however, usually occurs in association with anogenital infection, and it responds to usual treatment regimens for anogenital gonorrhea (i.e., broad-spectrum cephalosporins and fluoroquinolones). 9 Persons with gonorrhea may be infected with other sexually transmitted diseases (chlamydia, syphilis, HIV); up to 50% of persons with gonorrhea have a coexistent chlamydial infection. 12

The frequency of antibiotic-resistant N. gonorrhoeae has steadily increased in the U.S. Recent surveillance data estimate that 32% of gonorrhea isolates nationwide are resistant to penicillin or tetracycline. 28 These organisms are currently sensitive to broad-spectrum cephalosporins such as ceftriaxone and several other antibiotics, but the emergence of new resistance remains a concern. 9

Accuracy of Screening Tests

The most sensitive and specific test for detecting gonococcal infection in asymptomatic persons is direct culture from sites of exposure (urethra, endocervix, throat, rectum). Under quality-controlled conditions, the sensitivity of culture is high for both male and female anogenital gonorrhea, and for pharyngeal gonococcal infections. In women, a single endocervical culture is estimated to have a sensitivity of 80-95%. 29,30 Sensitivity of cultures may be limited by inadequate clinical specimens; improper storage, transport, or processing; and inhibition of growth by antibiotics in selective culture media. 31

Microscopic examination of Gram-stained urethral or cervical specimens can detect infection with N. gonorrhoeae. The sensitivity of Gram-stained urethral specimens is higher in symptomatic men (90-95%) than in asymptomatic men (70%). 31 The Gram stain is less sensitive for cervical infections in women (30-65%), and it is not useful for diagnosing pharyngeal or rectal infections. The specificity of stained smears is high in men (97-99%), but lower in women (90-97%), due to the presence of vaginal flora. 31

In clinical settings where handling and storage of culture medium is difficult, other methods of testing have become increasingly popular. DNA probes and enzyme immunoassays (EIA) are currently the most widely used nonculture diagnostic tests. Compared to culture, the sensitivity, specificity, and positive predictive value (PPV) of EIA are generally high using urethral specimens from symptomatic men (>95%). 32,33 Accuracy of EIA is significantly lower in endocervical specimens, however: sensitivity 60-100%, specificity 70-98%, and PPV 78-85%. 32-35 Among patients in sexually transmitted disease (STD) clinics (prevalence of gonorrhea 9-10%), a DNA probe had a very high sensitivity and specificity (97-99%) and high PPV (>90%) and was more sensitive than a single culture. 36,37 The accuracy of nonculture tests has not been adequately studied in an asymptomatic, primary care population, however. Among asymptomatic persons, in whom the prevalence of gonorrhea is often very low, a substantial proportion of positive EIA or DNA probe results may be false positives. 32,34 Serology is neither sufficiently sensitive nor specific for use in screening. None of the nonculture tests provides information on antibiotic susceptibility.
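The sharp dependence of predictive value on prevalence can be made concrete with a short calculation. A minimal sketch (in Python): the 98% sensitivity and specificity stand in for the 97-99% DNA probe figures above, and the 0.5% prevalence for a low-risk primary care population is our illustrative assumption.

    def ppv(prevalence, sensitivity, specificity):
        # P(infected | positive test), via Bayes' theorem.
        true_pos = prevalence * sensitivity
        false_pos = (1 - prevalence) * (1 - specificity)
        return true_pos / (true_pos + false_pos)

    sens = spec = 0.98  # representative of the DNA probe studies cited above

    for prevalence in (0.10, 0.005):  # STD clinic vs. assumed low-risk setting
        print(f"prevalence {prevalence:.1%}: PPV = {ppv(prevalence, sens, spec):.0%}")
    # -> prevalence 10.0%: PPV = 84%
    # -> prevalence 0.5%:  PPV = 20%

The same test that is trustworthy in an STD clinic thus yields mostly false positives when the underlying prevalence falls below about 1%.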

Screening for gonorrhea in asymptomatic men has been limited by the discomfort and inconvenience of obtaining urethral specimens. 24 EIA using urine specimens produces accurate results in symptomatic men, 38 but when used to screen a low-prevalence population, the majority of positive EIA results have been false positives. 39 Urine dipstick for leukocyte esterase (LE) is an inexpensive, rapid, and noninvasive test for urethritis in men. Among asymptomatic high-risk young men (ages 15-25, prevalence of infection 3%), urinary LE had a sensitivity of 46-60% and a specificity of 93-96% for gonorrhea. 24,25 The PPV of LE in an asymptomatic population is low (30-43%), although some false-positive results may be due to other infections that require treatment (e.g., chlamydia).
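Those figures follow directly from the quoted sensitivity, specificity, and prevalence. A quick check (in Python) of one scenario; the 1,000-man cohort is hypothetical, and the 60%/96% pairing takes the favorable end of the ranges quoted above.

    # LE dipstick screening outcomes per 1,000 asymptomatic young men,
    # using figures quoted above (prevalence 3%, sensitivity 60%,
    # specificity 96%); the cohort size is hypothetical.
    n, prevalence, sens, spec = 1000, 0.03, 0.60, 0.96

    infected = n * prevalence                 # 30 infected men
    true_pos = infected * sens                # 18 detected
    missed = infected - true_pos              # 12 missed
    false_pos = (n - infected) * (1 - spec)   # ~39 false alarms
    ppv = true_pos / (true_pos + false_pos)

    print(f"{true_pos:.0f} true positives, {false_pos:.0f} false positives, "
          f"{missed:.0f} missed; PPV = {ppv:.0%}")   # PPV ~32%

The low PPV arises because the roughly 39 false alarms among 970 uninfected men swamp the 18 true detections, even at 96% specificity.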

Information from the sexual history and clinical examination has been used to improve screening strategies. In one study of 1,441 women in Boston undergoing routine pelvic examinations, five factors were independently associated with gonococcal infection: partners with gonorrhea or urethral discharge, endocervical bleeding induced by swab, age at first intercourse less than 16, payment by Medicaid (a proxy measure for low socioeconomic status), and low abdominal or pelvic pain. 21 The prevalence of infection among women with one or more risk factors was 2.5%, compared to 0.2% for women with no risk factors. In a second study, young age (under 20), vaginal discharge, or a sex partner suspected of having gonorrhea identified all infected women in a low-income, urban population (prevalence of gonorrhea 3%). 17 The efficiency such stratification can buy is sketched below.
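A small calculation (in Python) illustrates the gain. The 2.5% and 0.2% prevalences come from the Boston study above; the assumption that 40% of women carry at least one risk factor is ours, chosen only for illustration.

    # Yield of selective screening under risk-factor stratification.
    # Prevalences are from the Boston study quoted above; the 40% share
    # of women with >=1 risk factor is a hypothetical illustration.
    n = 10_000
    frac_high, prev_high, prev_low = 0.40, 0.025, 0.002

    cases_high = n * frac_high * prev_high        # 100 cases in flagged group
    cases_low = n * (1 - frac_high) * prev_low    # 12 cases in unflagged group
    found = cases_high / (cases_high + cases_low)

    print(f"screening {frac_high:.0%} of women finds {found:.0%} of infections")
    # -> screening 40% of women finds 89% of infections

Under these assumptions, testing fewer than half the women captures nearly nine of every ten infections, which is the usual argument for targeted screening.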

Effectiveness of Early Detection

Early detection and treatment of gonococcal infection in asymptomatic persons offers the potential benefits of preventing future complications of infection, reducing transmission to uninfected partners, and identifying sexual contacts who are likely to be infected. Due to ethical considerations that preclude placebo-controlled trials of treatment, the benefits of early detection are based largely on indirect evidence: the effectiveness of antibiotic treatment and the high morbidity of untreated gonorrhea. The decline in the reported incidence of gonorrhea over the last two decades and the decline in hospitalizations for PID may also indicate a benefit of current screening strategies, but other factors (increased use of condoms) have presumably had an impact as well. 7 Due to high rates of reinfection among those at greatest risk, screening may be of little benefit to some individuals unless it is accompanied by measures to prevent future infections. Early detection and treatment of gonorrhea during pregnancy has the potential to decrease morbidity from the obstetric complications of gonococcal infections, although this benefit has never been tested in a controlled trial.

Ocular Prophylaxis of Newborn Infants

Between 30% and 50% of infants exposed to gonococci will develop ophthalmia in the absence of treatment. 40 Gonococcal ophthalmia can cause severe conjunctivitis and lead to corneal scarring, abscess, eye perforation, and permanent blindness. 41 Blindness due to ophthalmia neonatorum declined dramatically with the institution of widespread prophylaxis of infants with silver nitrate. Studies of ocular prophylaxis in developing countries, using historical controls, report 80-90% reductions in the transmission of gonococcal ophthalmia neonatorum with silver nitrate, tetracycline, or erythromycin prophylaxis. 42,43 In a U.S. study, failure rates were similar after prophylaxis with silver nitrate, erythromycin, or tetracycline (0.03-0.1%). 44 A recent controlled trial in Kenya reported that povidone-iodine, erythromycin, and silver nitrate were each effective in preventing conjunctivitis due to N. gonorrhoeae. 45 Tetracycline-resistant strains of gonorrhea have been reported in the U.S. and other countries. 45 The optimal prophylactic agent against penicillinase-producing strains of N. gonorrhoeae (PPNG) has not been determined.
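The absolute impact implied by those figures can be sketched quickly (in Python); the 1,000-infant exposed cohort is hypothetical, and the attack rates and reductions are the ranges quoted above.

    # Expected cases of gonococcal ophthalmia per 1,000 exposed newborns,
    # using the ranges quoted above (30-50% attack rate untreated,
    # 80-90% reduction with prophylaxis); the cohort size is hypothetical.
    exposed = 1000

    for attack_rate in (0.30, 0.50):
        untreated = exposed * attack_rate
        for reduction in (0.80, 0.90):
            treated = untreated * (1 - reduction)
            print(f"attack rate {attack_rate:.0%}, reduction {reduction:.0%}: "
                  f"{untreated:.0f} -> {treated:.0f} cases")
    # -> prophylaxis spares roughly 240-450 of every 1,000 exposed infants

Magnitudes of this size help explain why universal prophylaxis remains standard even where prenatal screening is widespread.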

Recommendations of Other Groups

The Canadian Task Force on the Periodic Health Examination advises against routine screening for gonorrhea in the general population but recommends screening of high-risk patients: individuals under 30 years, particularly adolescents, with at least two sex partners in the previous year; prostitutes; sexual contacts of individuals known to have an STD; and persons under age 16 years at first intercourse. 46 The Centers for Disease Control and Prevention (CDC) recommends screening asymptomatic women in the following priority groups: all pregnant women, sexually active adolescents, and women with multiple sex partners. 9,12 The American Academy of Family Physicians recommends screening prostitutes, persons with multiple sex partners or whose sex partner has multiple sex contacts, sexual contacts of persons with culture-proven gonorrhea, or persons with a history of repeated episodes of gonorrhea. 47 These recommendations are under review. Bright Futures, 48 the American Medical Association Guidelines for Adolescent Preventive Services (GAPS), 49 and the American Academy of Pediatrics (AAP) 50 recommend annual screening of sexually active adolescents. The AAP recommends dipstick urinalysis for leukocytes in all adolescents. 51 An expert panel convened in 1994 by the Institute of Medicine, National Academy of Sciences, is developing recommendations for public health strategies to control STDs, including gonorrhea.

The American College of Obstetricians and Gynecologists (ACOG) recommends obtaining endocervical cultures in pregnant women during their first prenatal visit only if they are in one of the high-risk categories for gonorrhea. 10 ACOG and CDC recommend repeating culture late in the third trimester for high-risk women. 10,12 ACOG recommends that all cases of gonorrhea should be diagnosed or confirmed by culture to facilitate antimicrobial susceptibility testing. 10 The American Academy of Pediatrics, 52 American Academy of Family Physicians, 47 and CDC 2 recommend administering ointment or drops containing tetracycline, erythromycin, or 1% silver nitrate solution to the eyes of all infants shortly after birth (within 1 hour). In the absence of universal prenatal screening for gonorrhea, the Canadian Task Force also recommends universal ocular prophylaxis with any of these antibiotic agents. 46

Discussion

Gonorrhea remains an important public health problem and a major source of morbidity for women. Although definitive proof (e.g., controlled trials) that screening reduces future morbidity is not available, selective screening of high-risk women can be justified by the substantial prevalence of asymptomatic infection, the availability of accurate screening tests and effective treatments, and the high morbidity of untreated gonorrhea in women. Early identification and treatment of asymptomatic individuals is also likely to reduce transmission of gonorrhea. The benefits of early detection may be reduced by the high likelihood of reinfection unless effective measures are taken to identify and treat sex partners.

Performing routine cultures for gonorrhea on all sexually active adults would be inefficient due to the low prevalence of infection in the general population and the wide variation in the rates of gonorrhea in different communities. Clinicians must base their decision to screen for gonorrhea on both individual risk factors and the local epidemiology of disease, realizing that sexual history is often an unreliable indicator of actual risk of infection. Therefore, routine screening of all sexually active young women may be more effective in communities where gonorrhea is prevalent, while more selective screening is appropriate when the rate of infection is known to be low.

In men, the prevalence of asymptomatic infection and the morbidity from gonorrhea are much lower than in women. Asymptomatic men, however, represent an important reservoir for transmitting infection, and opportunities to identify and treat infected men are often limited. Screening asymptomatic men with currently available urine tests may generate a large proportion of false-positive results. As more reliable, noninvasive methods for testing men become available (e.g., polymerase or ligase chain reaction assays of urine), screening young men may help reduce the incidence of gonorrhea in high-risk communities. The effectiveness of such a strategy, however, deserves to be tested in a prospective study.

The primary rationale for screening all pregnant women has been the prevention of ophthalmia neonatorum. Due to the low prevalence of gonorrhea in average-risk pregnant women, and the efficacy of universal ocular prophylaxis with antibiotic ointment, the benefits of screening for gonorrhea in all pregnant women are uncertain. Screening pregnant women at high risk for gonorrhea, however, may also help prevent other complications associated with gonococcal infection during pregnancy.

CLINICAL INTERVENTION

Routine screening for gonorrhea is recommended for asymptomatic women at high risk of infection ("B" recommendation). High-risk groups include commercial sex workers (prostitutes), persons with a history of repeated episodes of gonorrhea, and young women (under age 25) with two or more sex partners in the last year. Actual risk, however, will depend on the local epidemiology of disease. Clinicians may wish to consult local health authorities for guidance in identifying high-risk populations in their community. In communities with high prevalence of gonorrhea, broader screening of sexually active young women may be warranted. Clinicians should remain alert for findings suggestive of cervical infection (e.g., mucopurulent discharge, cervical erythema or friability) during routine pelvic examinations.

Screening is recommended at the first prenatal visit for pregnant women who fall into one of the high-risk categories ("B" recommendation). An additional test in the third trimester is recommended for those at continued risk of acquiring gonorrhea. There is insufficient evidence to recommend for or against universal screening of pregnant women ("C" recommendation). Erythromycin 0.5% ophthalmic ointment, tetracycline 1% ophthalmic ointment, or 1% silver nitrate solution should be applied topically to the eyes of all newborns as soon as possible after birth and no later than 1 hour after birth ("A" recommendation).

There is insufficient evidence to recommend for or against screening high-risk men for gonorrhea ("C" recommendation). In selected clinical settings where asymptomatic infection is highly prevalent in men (e.g., adolescent clinics serving high-risk populations), screening sexually active young men may be recommended on other grounds, including the potential benefits of early treatment for preventing transmission to uninfected sex partners. Screening men with urine LE dipstick is convenient and inexpensive, but requires confirmation of positive results. Routine screening of men or women is not recommended in the general population of low-risk adults ("D" recommendation). The optimal frequency of screening has not been determined and is left to clinical discretion.

Culture of endocervical specimens is the preferred method for screening asymptomatic women. When EIA or DNA probe tests are used for initial screening, verification of positive results may be necessary, depending on the underlying risk in the patient and potential adverse consequences of a false-positive result. Treatment should employ regimens effective against penicillin- and tetracycline-resistant organisms and should include treatment for co-infection with chlamydia and treatment of sex partners. 9 All sexually active individuals should be counseled about effective means of preventing STDs (see Chapter 62). Clinicians should follow local gonorrhea disease-reporting requirements.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Mark S. Smolinski, MD, MPH, and David Atkins, MD, MPH, based in part on materials prepared for the Canadian Task Force on the Periodic Health Examination by Brenda Beagan, MSc, Elaine E. L. Wang, MD, CM, FRCPC, and Richard B. Goldbloom, MD, FRCPC.

28. Screening for Human Immunodeficiency Virus Infection

Burden of Suffering

It is estimated that 0.8-1.2 million persons in the U.S. are infected with the human immunodeficiency virus (HIV-1) and that 40,000-80,000 new infections occur each year. 1,2 Most people infected with HIV eventually develop the acquired immunodeficiency syndrome (AIDS), defined by opportunistic infections or severe immune dysfunction. 3 Within 10 years of infection with HIV, about 50% of persons develop clinical AIDS, and another 40% or more develop other illnesses associated with HIV infection. 4 A small proportion (5-10%) of persons remain well 10-15 years after HIV infection, 5 but there is currently no available curative treatment for AIDS. Of the 476,899 cases of AIDS reported to the Centers for Disease Control and Prevention (CDC) through June 1995, 62% had died, including over 90% of those diagnosed before 1988. 6 HIV infection is now the leading cause of death among men ages 25-44, and the fifth leading cause of years of potential life lost before age 65. 7,8 By the end of 1995, it is projected that there will be 130,000-205,000 persons living with AIDS in the U.S., at an annual cost of treatment in excess of $15 billion. 9

High-Risk Groups.

Men who have sex with men and injection drug users (IDUs) together accounted for over 80% of AIDS cases reported in 1994. 6 HIV infection is widely prevalent in these groups. 10 In seroprevalence surveys conducted in 1991-1992 at STD clinics and drug treatment facilities, the median prevalence of HIV infection was 26% (range 4-47%) among men having sex with men [a] and ranged from 2-7% among IDUs in Western cities to 12-40% among IDUs on the East Coast. 10 HIV infection is also prevalent among heterosexual persons with other STDs (median 0.6%), prisoners (range 1-15%), and residents of homeless shelters (range 1-21%). 10 Very high rates of HIV infection (up to 30%) have been detected among inner-city young adults who smoke crack, especially among women who exchange sex for drugs. 11 More than 10,000 cases of AIDS have been attributed to transfusion of infected blood or blood components (e.g., clotting factor) between 1977 and 1985, but the current risk of becoming infected from blood or tissue products is extremely low. 12


[a] Data from patients at STD clinics, by selecting for high-risk sexual practices, will overstate the rate of infection in the general population of homosexual and bisexual men.

Pregnant Women.

By July 1995, over 5,900 cases of AIDS had been attributed to perinatal infection. 6 An estimated 0.17% of all childbearing women in the U.S. were infected with HIV in 1991-1992, but prevalence varied from 0-0.05% in 16 states to 0.6% in New York State. Of 144 urban family planning clinics conducting blinded surveillance, 13 reported a prevalence of HIV infection over 1%. 10 Of the estimated 7,000 infants born to infected mothers each year, 80% are born in 20 states where prevalence of antibody-positive newborns was 0.1% or greater. 13 The probability of vertical transmission from mother to infant is between 13% and 35%, 14-16 increasing with severity of disease in the mother. 17

General Population.

The distribution of new AIDS cases may not reflect the changing pattern of the HIV epidemic due to the long delay between infection and clinical AIDS. Heterosexual transmission is the most rapidly growing source of new AIDS cases, accounting for 10% of new AIDS cases in 1994 (up from 2% in 1985). 6 It is the leading cause of new HIV infections in American women. 6,18 Young black and Hispanic women in large East Coast cities are at greatest risk (due to the higher prevalence of HIV in their male partners and the more efficient transmission of virus from men to women during intercourse), 19 but women of all races experienced comparable increases in heterosexually acquired AIDS cases between 1992 and 1994. 18 Large surveys in 1991-1992 of Job Corps applicants ages 16-21 (seroprevalence 0.27%), military recruits (seroprevalence 0.06%), and patients at urban hospitals (range 0.1-5.8%) indicate that the prevalence of HIV infection varies markedly among different demographic groups and geographic areas. 10 The rate of AIDS and the prevalence of HIV are substantially higher in Atlantic Coast states than in Midwest and Mountain states, in large metropolitan areas (population 500,000 or greater) than in smaller cities or rural areas, and in black and Hispanic young persons than in whites, Native Americans, or Asians/Pacific Islanders. 6,10,20 Blinded screening of over 20,000 primary care patients in smaller cities and rural areas detected HIV in 0.15% of all patients without known disease, however. 10 The regional variations in HIV prevalence generally mirror the local prevalence of infection among drug users. IDUs account for a large proportion of new HIV infections 21 and are a leading source of heterosexual and perinatal transmission of HIV. 6

Accuracy of Screening Tests

The initial screening test to detect antibodies to HIV is the enzyme immunoassay (EIA). Commercially available EIAs use antigens from whole disrupted virus (first generation), recombinant viral proteins (second generation), or chemically synthesized peptides (third generation). 22 EIA results are considered "reactive" only when a positive result has been confirmed in a second test of the original sample. In subjects with clinical AIDS, the sensitivity of EIA is close to 100%, but newly infected individuals may not develop detectable antibodies for periods of weeks to months after infection. 23 The median interval between infection and seropositivity has been estimated at 3 months with earlier EIAs, with 95% seroconverting within 6 months. 24 Estimates based on more sensitive third-generation EIAs, which may be positive within 1 week of peak antigen levels, suggest that the "window" period with current tests is substantially shorter (3-4 weeks). 25,25a Specificity of EIA is above 99.5% with most tests. Third-generation EIAs had specificities of 99.7-99.9% when tested against uninfected controls. 25-27 False-positive results can be caused by nonspecific reactions in persons with immunologic disturbances (e.g., systemic lupus erythematosus or rheumatoid arthritis), multiple transfusions, or recent influenza or rabies vaccination. 28,29 Food and Drug Administration (FDA) approval has been granted to rapid tests (<15 minutes) using colorimetric assays, 30 and to an EIA-based test using oral fluid samples collected with a cotton pad. 31,32 Although these tests are sensitive and specific (>99%), neither method is recommended for the definitive diagnosis of HIV infection. 33

To prevent the serious consequences of a false-positive diagnosis of HIV infection, confirmation of positive EIA results is necessary, using an independent test with high specificity. The Western blot (WB) is the most commonly used confirmatory test in the U.S. Indirect immunofluorescence assay (IFA), which is less expensive and less time-consuming, is also approved for confirmatory testing. 34 The specificity of WB is close to 100% under controlled settings, 35 but it is dependent on the skill and experience of the laboratory and on criteria used to determine positive WB results. 33,35 The sensitivity of WB was 98.5% and specificity 92.5% in a 1989 CDC quality control survey of 140 laboratories; 26 most errors involved misclassification of positive or negative samples as "indeterminate." Criteria for interpreting WB have been developed to improve accuracy in clinical testing. 35 Many laboratories doing a large volume of HIV testing achieved sensitivities and specificities of 100%. 36

The precise false-positive rate of current HIV testing is not known. A false-positive rate of less than 1 in 100,000 was reported in experienced, centralized laboratories using careful quality control procedures. 37,38 In practice, false-positive diagnoses can result from contaminated or mislabeled specimens, 39 cross-reacting antibodies, 33 failure to perform confirmatory tests, 40 misinterpretation of WB patterns, or misunderstanding of reported results by clinicians or patients. 41 In one study, 8 of 900 women referred for treatment of HIV were not infected on repeat testing. 40 To prevent errors due to mistakes in specimen handling or WB interpretation, a consensus laboratory panel recommended confirming all new diagnoses of HIV with tests on a freshly obtained specimen. 42
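A reported false-positive rate of 1 in 100,000 implies very different screening yields at different prevalences, as a short calculation (in Python) shows; the two prevalence scenarios are illustrative assumptions, and sensitivity is treated as essentially 100%.

    # True vs. false positives per 1,000,000 persons screened for HIV,
    # assuming the 1-in-100,000 false-positive rate reported above;
    # the prevalence scenarios are illustrative.
    n = 1_000_000
    fp_rate = 1 / 100_000

    for label, prevalence in [("high-risk clinic (5%)", 0.05),
                              ("low-risk population (0.01%)", 0.0001)]:
        true_pos = n * prevalence                  # assumes ~100% sensitivity
        false_pos = n * (1 - prevalence) * fp_rate
        print(f"{label}: {true_pos:,.0f} true vs {false_pos:,.0f} false positives")
    # -> 50,000 true vs 10 false in the high-risk setting;
    #    100 true vs 10 false in the low-risk setting

Even at this excellent specificity, roughly 1 in 11 positive results in the low-prevalence setting would be a false positive, which is one reason confirmation on a freshly obtained specimen is recommended.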

Indeterminate Western Blot Results.

Indeterminate WB results, due to antibody patterns that do not meet full criteria for a positive test, occur in 3-8% of EIA-positive specimens. 33,43 In a survey of over 1 million newborn specimens for maternal HIV antibody, <1 in 4,000 screened samples produced an indeterminate WB result. 43a An indeterminate WB result may indicate evolving antibody response in recently infected subjects, but in low-risk persons it usually represents the presence of nonspecific antibodies. Follow-up of nearly 700 EIA positive/WB indeterminate blood donors documented seroconversion in only eight subjects, all but one of whom reported a high-risk behavior. 43-45 An indeterminate WB is more significant in high-risk subjects, but the risk of seroconversion is variable (13-28%). 44,46 Antibodies to p24 antigen were present in 29 of 30 subjects who subsequently converted. 43,44 A new indeterminate WB in subjects who were previously seronegative is more likely to represent recent infection. 47 Seroconversion usually occurs within 2-3 months (range 2-16 weeks in one study). 44 A stable indeterminate WB after 6 months can be assumed to be due to nonspecific antibody reaction rather than HIV infection. 33,48

Viral Culture and Polymerase Chain Reaction Assays.

Viral culture is the most specific test for HIV infection, but it is time-consuming, expensive, technically difficult, and insufficiently sensitive for use as a screening test. 22 Polymerase chain reaction (PCR) can detect viral genetic material in subjects who have not yet developed antibodies, but trace levels of contamination can produce false-positive results. 22 In proficiency testing of five experienced laboratories, sensitivity of PCR was 98-100% and specificity was 96-100%. 49 With newer, more sensitive EIAs, the additional value of screening with PCR is small in adults. In over 250 high-risk persons with nonreactive EIAs, PCR detected only two confirmed cases of HIV. 50-53 Since PCR is not 100% specific, a large proportion of positive PCR results in seronegative patients may be false positives. Despite their limitations as screening tests, viral culture, PCR, and viral antigen assays are useful for evaluating patients with symptoms of acute HIV infection.

Diagnosis of HIV in Infants.

Diagnosing infection in infants born to HIV-infected mothers is difficult, since maternal antibodies to HIV are present in both infected and uninfected infants. Uninfected infants serorevert an average of 10 months after birth, but maternal antibody may persist up to 18 months. 54 Viral culture and PCR are highly specific for infection in infants, although PCR is occasionally positive in infants who eventually serorevert and are presumed not to be infected. 55 Reported sensitivity is 60-90% for culture 22,56 and 84-98% for PCR, increasing after 3 months. 57,58 Using culture and/or PCR, an estimated 50% of infected infants can be identified at birth, and up to 90% by 3 months. 59 Other tests for perinatal infection include assays for viral antigen (sensitivity 30-60%) 56,59 and IgA (sensitivity 50-80%). 59,60

Screening by Risk Factor Assessment.

Patient history is an important but imperfect way to assess risk for HIV infection. Patients may conceal high-risk behaviors, and others (especially women in high-risk areas) may be unknowingly at risk from an infected sex partner. In high-prevalence family planning clinics, testing women who reported drug use or an IDU partner detected only 41-57% of all cases. 61,62 Offering testing routinely to all women resulted in greater acceptance (up to 96%), 61 and detected 87% of all HIV infections. 62 Even in low-prevalence areas such as Sweden, 37% of infected pregnant women did not report clear risk factors for HIV. 63 There are few data comparing the sensitivity of targeted versus routine screening in male patients. In a seroprevalence study in primary care practices, physicians were not aware of HIV risk factors in roughly one third of infected patients. 10

Consent, Confidential Versus Anonymous Testing, and Partner Notification.

There is general consensus that informed consent should be obtained prior to HIV testing. 14,64 The diagnosis of HIV has serious consequences, and compulsory testing may discourage persons from seeking care. Alternate forms of consent -- right of refusal (i.e., passive consent) versus explicit consent, active recommendation versus nondirective counseling -- have been considered to facilitate screening during pregnancy. 13,65 Universal newborn screening without consent of the mother was considered but rejected by the New York State legislature. 66

Half of all states require confidential reporting of HIV-infected persons by name to state or local health departments, and all require reporting of AIDS patients. 6,67 Partner notification (contact tracing) can alert exposed persons to the need to be tested. Between 50% and 90% of sex partners can be identified through organized partner notification programs, 68 and costs per case identified compare favorably to screening high-risk groups. 69

In one trial, offering the option of anonymous testing resulted in higher rates of testing. 70 Two thirds of all patients at a public clinic (and a majority of seropositive patients) chose anonymous testing over confidential testing. 71 Test kits that would allow individuals to submit specimens (blood or saliva) collected at home for anonymous testing, using coded identifiers, are being considered by the FDA. 71a

Frequency of Testing.

The appropriate frequency of HIV screening is not known, but it depends in part on the incidence of new infections. Rates are highest among IDUs in Northeastern cities (3-6 infections/100 patient-years [py]), 21 compared to less than 1/100 py in homosexual men in most cities, 72 and 0.04/100 py among military personnel aged 25-29. 73 Due to the low incidence of new infection and increased sensitivity of new tests, repeat testing simply to confirm an initial negative EIA is rarely indicated.

Effectiveness of Early Detection

Detection of asymptomatic HIV infection permits early treatment to slow disease progression, interventions to reduce perinatal transmission, and counseling to prevent transmission of virus to uninfected sex partners or persons sharing injection needles.

Effectiveness of Early Therapy in Asymptomatic Adults.

Antiretroviral medications (e.g., zidovudine [ZDV or AZT], didanosine, zalcitabine) reduce mortality in AIDS patients and delay progression to AIDS in symptomatic HIV infection. 74,75 Due to the eventual resistance developed by HIV, 76 however, there is no clear benefit of initiating antiretroviral therapy (with current drugs as monotherapies) before patients become symptomatic. 74,77,78 In an overview of six randomized, controlled trials (RCTs) of ZDV for asymptomatic HIV infection, there was no long-term benefit of early treatment versus deferred treatment (begun at onset of symptoms) on survival or progression to AIDS. 79 For asymptomatic patients with more advanced immunodeficiency (CD4 200-500/micro-L), early ZDV delayed progression to AIDS or AIDS-related complex over the short term but did not improve outcomes beyond the first 1-2 years. 74,79,80 In Concorde, the largest and longest trial -- over 1,700 men and women with asymptomatic HIV infection followed over 3 years -- there was no significant difference between patients receiving early versus deferred treatment in the combined incidence of AIDS or death (18% in each group), or in total mortality (8% and 6%, respectively). 81 The early benefits from delaying disease progression are offset by the adverse effects of ZDV (e.g., nausea, headache, and fatigue). 82 Early combination antiretroviral therapy, by reducing drug resistance, may be more effective than monotherapy in reducing viral burden and delaying immunodeficiency, but long-term clinical trials of these approaches have not yet been completed. 83,84

Chemoprophylaxis can reduce the risk of Pneumocystis carinii pneumonia (PCP) in patients with more advanced immunodeficiency. The annual incidence of PCP rises to 18-25% in persons with CD4 < 200/micro-L, 85,86 and half of all HIV-infected persons are still asymptomatic at this stage of disease. 85,87,88 In retrospective analyses, prophylaxis with trimethoprim-sulfamethoxazole (TMP-SMX) or inhaled pentamidine reduced the incidence of primary PCP by 67-83%, 86,89-91 delayed onset of AIDS by 6-12 months, and prolonged survival in those with low CD4 count by almost 1 year. 92-94 In a recent 3-year trial, TMP-SMX, dapsone, and inhaled pentamidine were equally effective as initial therapy for primary prophylaxis against PCP. 95 TMP-SMX was more effective in shorter studies 89,90,96 and for patients with CD4 < 100, 95 and it is recommended as the preferred prophylactic agent; 97 long-term therapy is limited, however, by the high incidence of adverse reactions (e.g., leukopenia, fever, rash, or gastrointestinal symptoms). TMP-SMX and dapsone also provide protection against toxoplasmosis, 98,99 an uncommon complication of asymptomatic HIV infection in the U.S. 95 Long-term benefits of chemoprophylaxis are limited by the continuing decline in immune function. 92

HIV-infected persons infected with Mycobacterium tuberculosis are at increased risk of developing active tuberculosis, primarily at CD4 counts below 500/micro-L. In a randomized trial among asymptomatic HIV-infected persons in Haiti, an endemic tuberculosis area, chemoprophylaxis with isoniazid reduced the incidence of tuberculosis nearly 75% and delayed progression to AIDS. 100 In a U.S. cohort study, isoniazid prophylaxis reduced tuberculosis among HIV-positive, PPD-positive drug users. 101,102 Decision analyses suggest that the benefits of prophylaxis may outweigh risks in both PPD-positive and anergic patients with HIV when exposure to tuberculosis is prevalent (e.g., in IDUs, homeless persons, and immigrants from endemic areas; see Chapter 25). 103 Chemoprophylaxis with rifabutin is also effective against Mycobacterium avium complex (MAC), but few asymptomatic patients have sufficiently advanced disease (CD4 < 75/micro-L) to warrant MAC prophylaxis. 97

Effectiveness of Early Therapy in Asymptomatic Children.

There are few randomized trials of interventions in asymptomatic HIV-infected children, but TMP-SMX is safe and effective for PCP prophylaxis in other immunocompromised children. 104 PCP may be the first indication that infants are infected with HIV: 44% of the children who developed PCP in one study had never been evaluated for HIV. 105 Between 7% and 20% of children with HIV develop PCP within the first year of life (peak incidence between 3 and 9 months), 105,106 and mortality is high (up to 30%). 107 Adverse reactions (rash, cytopenia) from TMP-SMX occur in up to 15% of HIV-infected children. New CDC guidelines for PCP prophylaxis in infants recommend prophylaxis for all HIV-infected or possibly infected infants during the first year of life. 107 Studies are currently under way to evaluate the benefit of different antiretroviral therapies in asymptomatic children.

A variety of precautions are routinely recommended for HIV-infected children and adults, due to the increased susceptibility to viral and bacterial infections: 108-110 vaccination against influenza, pneumococcus, hepatitis B, and Haemophilus influenzae; 111 avoidance of oral (live virus) polio vaccine in children; 112 attention to nutrition; 113 avoidance of uncooked foods and high-risk sexual practices; and more frequent Papanicolaou (Pap) screening for women (due to an increased risk of invasive cervical cancer). 67,114 The benefit of these interventions in persons with asymptomatic HIV infection has not been determined.

Effectiveness of Early Intervention in Asymptomatic Pregnant Women.

A randomized placebo-controlled trial (ACTG 076) demonstrated that a regimen of ZDV begun between weeks 14 and 34 of pregnancy and continued through delivery and for 6 weeks in newborn infants significantly reduced perinatal HIV infection (8.3% vs. 25.5%) among infants born to seropositive mothers with mildly symptomatic HIV infection (CD4 > 200/micro-L). 16 ZDV is associated with a low incidence of severe side effects in mothers or infants, but the long-term effects on the health of infants, and on the course of HIV disease in mothers, are not known. Cesarean delivery is also associated with lower rates of vertical transmission; a trial of operative versus vaginal delivery in HIV-infected pregnancies is under way in Italy. 115
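The size of the ACTG 076 effect is easier to appreciate as an absolute risk reduction and number needed to treat, computed here (in Python) from the transmission rates quoted above.

    # Effect size of the ACTG 076 ZDV regimen, from the rates quoted above.
    risk_placebo, risk_zdv = 0.255, 0.083

    arr = risk_placebo - risk_zdv    # absolute risk reduction
    rrr = arr / risk_placebo         # relative risk reduction
    nnt = 1 / arr                    # number needed to treat

    print(f"ARR {arr:.1%}, RRR {rrr:.0%}, NNT {nnt:.1f}")
    # -> ARR 17.2%, RRR 67%, NNT 5.8: treating about 6 seropositive pregnant
    #    women with the regimen prevents one perinatal infection

An intervention that prevents one transmission for roughly every six women treated is unusually powerful, which underlies the rationale for prenatal screening discussed later in this chapter.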

Identifying seropositive women during pregnancy may have other important benefits: some women may be candidates for PCP prophylaxis; male partners can be advised to be tested and to use condoms; infants can be monitored for evidence of HIV infection and started on appropriate therapy; and early involvement of social services may facilitate care for infected mothers and infants. Infected mothers are advised to avoid breastfeeding: a meta-analysis of cohort studies estimated that breastfeeding increases vertical transmission by 14%. 116 The extent to which early detection actually leads to these benefits is difficult to estimate. There is no clear effect of testing and counseling on fertility decisions: in two U.S. studies, pregnancy rates were similar for HIV-positive and HIV-negative women. 117 Infected women were more likely to choose abortion than uninfected women in one study, 118 but the large majority of women with HIV chose to continue the pregnancy. 63

Effectiveness of Testing and Counseling to Prevent Transmission of HIV.

Determining whether HIV testing and counseling reduces HIV transmission is complicated by many factors: 119 a paucity of well-controlled trials; variable quality of counseling interventions; reliance on intermediate endpoints, such as self-reported changes in behavior; differences between screened and unscreened patients in observational studies; population-wide changes in high-risk behavior; and the uncertain importance of testing versus counseling. The effect of testing and counseling varies with the specific behavior and population being targeted. 117 Programs that targeted couples in which one member was infected provide the strongest evidence of the benefits of testing and counseling: compared with historical controls, counseled couples increased regular condom use and significantly reduced the rate of seroconversion in partners. 120-123

Among homosexual men, high-risk sexual practices and new HIV infections have declined substantially since the development of HIV tests. Community-wide changes may have been more important than identification of seropositive persons, however, and there are worrisome signs of persisting unsafe practices among younger gay men. 124 Condom use is generally higher among seropositive men than seronegative or untested men, but longitudinal studies suggest increasing use of condoms across all groups. 125-127 Unprotected anal intercourse and number of sex partners have declined over time in both tested and untested men. The absolute change in risk may be greater in seropositive men, who engage in higher-risk behavior at baseline. 128 A substantial proportion of seropositive men continue to have sex with multiple partners and engage in oral intercourse; 5-15% continue unprotected anal intercourse. 128

The effects of screening in injection drug users are less consistent. Among drug users in treatment, drug use and needle sharing declined after HIV counseling and testing, but they also declined among unscreened patients. 117,129 In a community survey, IDUs who had received testing and counseling were half as likely to share needles as untested subjects. 130 Testing has inconsistent effects on high-risk sexual behavior among drug users. Seropositive IDUs are more likely to report condom use in some cross-sectional studies, but 30-70% use condoms only occasionally or not at all. 117,131

Counseling and testing patients at STD clinics has had variable results. The rate of recurrent STDs was unchanged 1 year after testing in one study, 132 was lower among HIV-positive than HIV-negative subjects in another (15% vs. 23%), 133 and declined among seropositive patients while rising among seronegative patients in a third. 134 In one randomized trial, testing and counseling reduced unprotected intercourse more than counseling alone. 135 Other longitudinal studies suggest little change in sexual behavior in seronegative subjects after testing and counseling. Routine testing and counseling among students at a college health clinic did not improve low rates of condom use. 136 Among women tested in community health clinics, seronegative women did not reduce sexual risk factors after testing and counseling, compared with untested controls. 137

Efforts to modify high-risk behaviors in infected and high-risk persons are often hindered by substance abuse, poverty, limited education, denial, or economic necessity (e.g., prostitution). A minority of seropositive persons abstain completely from sex or drug use after diagnosis. Those who do not may have trouble obtaining condoms or clean needles, or may not inform sex partners that they are infected. 138 Female partners of infected men may minimize risk or be unable to get their male partner to consistently use a condom. 139

Adverse Effects of Testing and Counseling.

The diagnosis of HIV infection can have significant adverse effects, among them intense anxiety, depression, somatization, or anger. 140,141 Among 1,718 newly diagnosed patients with HIV, 21% met criteria for depression at first visit, but psychological distress was related more to symptoms than diagnosis and diminished with time. 142 Testing reduces anxiety in high-risk persons who are seronegative. Despite recent efforts to improve public perceptions and attitudes, the stigma associated with the diagnosis of HIV/AIDS is still significant. 143 Disclosure of test results can result in disrupted personal relationships, domestic violence, social ostracism, and discriminatory action, such as loss of employment, housing, health insurance, and educational opportunities. 144 Recent legislation and the expanded case definitions for AIDS may help prevent some forms of discrimination and help ensure medical care for those with more advanced infection. Information on the frequency or consequences of false-positive diagnoses is largely anecdotal. Although retesting and follow-up can resolve most errors, misdiagnosis may cause irreparable harm (divorce, abortion, etc.). Finally, negative test results may provide false reassurance unless patients are counseled about their continuing risk of infection from drug use and high-risk sexual activity.

Recommendations of Other Groups

Counseling and HIV testing of high-risk individuals are recommended by the CDC 14 and numerous medical organizations: the Canadian Task Force on the Periodic Health Examination (CTF), 145 the American Academy of Family Physicians, 146 the American Medical Association (AMA), 147 the American College of Obstetricians and Gynecologists (ACOG), 114 the American College of Physicians, and the Infectious Diseases Society of America. 148 High-risk individuals include men who have sex with men; persons seeking treatment for STDs; injection drug users and their sex partners; recipients of transfusions between 1978 and 1985; and persons who have had multiple sex partners or exchanged sex for money or drugs. Bright Futures 149 and the AMA Guidelines for Adolescent Preventive Services (GAPS) 150 recommend offering HIV testing to all at-risk adolescents, including those with more than one sex partner in the last 6 months.

The CDC recommends that health care facilities where the prevalence of infection exceeds 1%, or the AIDS diagnosis rate is greater than 1/1,000 hospital discharges, consider routine, voluntary screening among patients aged 15-54 years. 151 The AMA approves of routine HIV testing in the clinical setting, based on local considerations such as planned medical procedures and local seroprevalence. 147 GAPS 150 and AAFP 146 recommend offering screening to sexually active adolescents and adults from high-prevalence communities.

The U.S. Public Health Service (PHS) released new guidelines in 1995, recommending that all pregnant women be routinely counseled and encouraged to have HIV testing. 152 Similar policies have been approved by the American Academy of Pediatrics (AAP) 65 and ACOG. 153 Pregnant women should be screened as soon as pregnancy is known; repeat testing may be indicated near delivery for women at high risk of infection. The CTF (prior to ACTG 076) concluded there was insufficient evidence to recommend for or against routine screening in pregnancy, due to the low rate of infection in Canada. 145 In cases where the HIV serostatus of the mother is not known, PHS and AAP guidelines suggest that health care providers educate the mother about benefits to her infant and encourage her to allow testing for the newborn. 65,152 A 1991 Institute of Medicine task force (before ACTG 076) recommended voluntary screening of all pregnant women in high-prevalence areas and of high-risk women in other areas, but it found insufficient evidence to recommend routine newborn screening. 64

Both the AMA and the AAP have endorsed alternate procedures for pretest counseling and consent, including right of refusal, to facilitate routine testing in specific situations. Mandatory HIV testing is currently required on entrance to the military and for donors of blood, organs, and tissue; federal prisoners; and persons seeking to immigrate to the U.S. Individual state laws vary regarding mandatory testing, confidentiality of results, informed consent, and reporting of seropositive persons to public health officials. 67 The CDC recommends that seropositive persons be instructed how to notify their partners, but suggests that physicians or health department personnel should use confidential procedures to ensure that partners are notified. 14

Discussion

Early detection of asymptomatic HIV infection can reduce morbidity and mortality in infected persons, but the long-term benefits are currently limited by the relentless course of disease in most patients. The most compelling argument for early detection is the potential to prevent transmission of HIV, which may occur over many years before infected persons develop symptoms of HIV infection. Although there is only indirect evidence that screening reduces the incidence of new HIV infections, even small changes in transmission will have important public health benefits.

Screening is most important in the high-risk groups that currently account for the large majority of AIDS cases and new HIV infections. 154 The ability of ZDV to reduce perinatal transmission of HIV provides strong evidence of the benefit of screening during pregnancy. In high-risk communities, offering HIV testing to all pregnant women is more acceptable to patients, easier to implement, and more sensitive than screening on the basis of self-reported risk factors. In areas where HIV infection is uncommon, however, the choice between targeted screening and universal screening is a policy decision. Universal screening may detect occasional cases of HIV infection among women without reported risk factors, but it will subject many women at negligible risk to the potential harms from a false-positive or indeterminate test result. If specificity of testing is 99.98%, routine screening in low-risk populations (e.g., pregnant women in low-prevalence states) will generate one false-positive and many additional indeterminate results for every true-positive. Although indeterminate and false-positive results can generally be resolved with follow-up testing, some women may decide to terminate the pregnancy before infection status can be definitively determined. Counseling and testing many thousands of low-risk women to detect a single case of HIV may also divert resources from more important issues. If routine testing can be provided with the combination of accuracy and low costs achieved by large, centralized screening programs, 37,63,155 the justification for universal screening would be stronger.
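
To make this arithmetic concrete, the following sketch computes the expected yield of screening 100,000 low-risk pregnant women. It is illustrative only: the prevalence and sensitivity values are assumptions chosen for this example, and only the specificity figure comes from the text.

    # Expected yield of universal HIV screening in a low-prevalence population.
    # Assumptions: prevalence of 2 per 10,000 and near-complete sensitivity are
    # illustrative values; the specificity is the figure cited in the text.
    population = 100_000
    prevalence = 0.0002       # assumed: 2 infections per 10,000 women
    sensitivity = 0.99        # assumed
    specificity = 0.9998      # cited in the text

    infected = population * prevalence
    true_positives = infected * sensitivity                        # ~20
    false_positives = (population - infected) * (1 - specificity)  # ~20

    print(f"true positives:  {true_positives:.0f}")
    print(f"false positives: {false_positives:.0f}")
    # Under these assumptions, roughly one false-positive result is expected
    # for every true-positive, before counting indeterminate Western blots.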

A growing number of HIV infections occur in persons who are infected through heterosexual contact. The risk of heterosexual transmission varies widely among different communities, with the highest risk among poor minority women in large cities and the rural South. In high-risk communities or clinics, routine screening of sexually active young women and men may be appropriate. In populations where prevalence of infection is low, more selective screening is likely to be more efficient: 5% of women and 12% of men report multiple sex partners within the last 12 months without consistent condom use. 156 Deciding what constitutes a "high-risk" community is a policy decision, 64 depending in part on available data and resources for testing. A prevalence of 1/1,000 among newborns, 157 or an AIDS diagnosis rate of more than 1/1,000 hospital discharges, 151 has been used to justify routine screening in other settings. Community prevalence provides only a crude measure of individual risk, however, and should not replace careful assessment of high-risk behaviors in each patient.

Screening for HIV in high-risk groups is cost-effective under a wide range of assumptions. In an analysis of federally funded programs (where HIV prevalence was 2.6%), testing and counseling activities saved money even if only one new infection was avoided for every 100 seropositive subjects identified. 158 In Sweden, where the prevalence of infection is only 0.01% in pregnant women, routine prenatal HIV testing costs approximately $10 per patient and $100,000 per case identified. 63 Screening is less cost-effective in asymptomatic adults if only the benefits from early treatment are considered. 154 Revised cost-effectiveness analyses of alternate screening strategies for asymptomatic persons are needed, incorporating new data on antiretroviral therapy in pregnant and nonpregnant adults, regional variations in prevalence, targeted versus universal screening, and the costs of testing and counseling in the primary care setting.
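
A short calculation confirms that the Swedish figures are internally consistent; the sketch below simply re-derives the cost per case identified from the cited prevalence and per-patient cost.

    # Re-deriving the cited Swedish cost per HIV case identified.
    prevalence = 0.0001        # 0.01% of pregnant women, as cited
    cost_per_patient = 10.0    # approximate cost per woman tested, as cited

    women_tested_per_case = 1 / prevalence                     # 10,000 women
    cost_per_case = women_tested_per_case * cost_per_patient
    print(f"cost per case identified: ${cost_per_case:,.0f}")  # $100,000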

CLINICAL INTERVENTION

Clinicians should assess risk factors for HIV infection in all patients by obtaining a careful sexual history and inquiring about drug use. Counseling and testing for HIV should be offered to all persons at increased risk for infection: those seeking treatment for sexually transmitted diseases; men who have had sex with men after 1975; past or present injection drug users; persons who exchange sex for money or drugs, and their sex partners; women and men whose past or present sex partners were HIV-infected, bisexual, or injection drug users; and persons with a history of transfusion between 1978 and 1985 ("A" recommendation).

Pregnant women in these categories, and those from communities (e.g., states, counties, or cities) where the prevalence of seropositive newborns is increased (e.g., >=0.1%), should be counseled about the potential benefit to their infant of early intervention for HIV and offered testing as soon as the woman is known to be pregnant ("A" recommendation). Repeat testing may be indicated in the third trimester of pregnancy for women at high risk of recent exposure to HIV. There is insufficient evidence to recommend for or against universal prenatal screening for HIV in low-prevalence communities ("C" recommendation). A policy of offering screening to all pregnant women may be recommended on other grounds, including patient preference, easier implementation, and increased sensitivity compared to screening based on community prevalence and reported risk factors. Careful quality control measures and patient counseling are essential to limit the potential adverse effects from indeterminate and false-positive test results during pregnancy. Testing infants born to high-risk mothers, with the mother's permission, is recommended when the mother's antibody status is unknown ("B" recommendation).

There is insufficient evidence to recommend for or against routine HIV screening in persons without identified risk factors ("C" recommendation). Recommendations to screen sexually active young women and men in high-risk communities can be made on other grounds, based on the increasing burden of heterosexual transmission and the insensitivity of screening based on self-reported risk factors. Similarly, routine HIV screening may be reasonable in groups such as prisoners, runaway youth, or homeless persons, where the prevalence of high-risk behaviors and HIV is generally high. The definition of a high-risk community is imprecise. Clinicians should consult local public health authorities for advice and information on the epidemiology of HIV infection in their communities. More selective screening may be appropriate in low-risk areas. Testing should not be performed in the absence of informed consent and pretest counseling, which should include the purpose of the test, the meaning of reactive and nonreactive results, measures to protect confidentiality, and the need to notify persons at risk. Patients who wish to be tested anonymously should be advised of appropriate testing facilities.

A positive test requires at least two reactive EIAs and confirmation with WB or IFA, performed by experienced laboratories that receive regular external proficiency testing. A separate sample should be submitted for persons found to be seropositive for the first time, to rule out possible error in specimen handling. Patients with indeterminate WB results should be evaluated individually to determine whether findings are likely to represent recent seroconversion. Repeat testing should be performed 3-6 months after indeterminate test results, or sooner if recent seroconversion is suspected. A stable indeterminate WB pattern is not indicative of HIV infection.
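
The confirmation sequence described above can be summarized schematically. The sketch below encodes only the decision logic stated in this paragraph; the function and result labels are invented for illustration and are not a clinical or laboratory protocol.

    # Schematic sketch of the interpretation logic described in the text.
    # All function and label names are invented for illustration.
    def interpret_hiv_screen(eia_reactive_count, confirmatory_result):
        # eia_reactive_count: number of reactive EIA runs on the specimen.
        # confirmatory_result: "positive", "negative", or "indeterminate"
        # from Western blot (WB) or IFA at an experienced laboratory.
        if eia_reactive_count < 2:
            return "nonreactive: report negative"
        if confirmatory_result == "positive":
            # First-time seropositives: a separately drawn sample should be
            # submitted to rule out specimen-handling error.
            return "seropositive: confirm on a separate sample"
        if confirmatory_result == "indeterminate":
            # Repeat at 3-6 months, or sooner if recent seroconversion is
            # suspected; a stable indeterminate pattern is not HIV infection.
            return "indeterminate: repeat testing in 3-6 months"
        return "reactive EIA, negative confirmation: evaluate individually"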

Seropositive patients should receive information regarding the meaning of the results, the distinctions between casual nonsexual contact and proven modes of HIV transmission, measures to reduce risk to themselves and others, symptoms requiring medical attention, and available community resources for HIV-infected persons. Clinicians should explore potential barriers to changing high-risk behavior in seropositive and seronegative individuals. Guidelines for HIV counseling have been published by the PHS. 159 Seropositive persons should be evaluated for severity of immune dysfunction and screened for other infectious diseases such as tuberculosis (see Chapter 25). Guidelines for the management of early HIV infection and prevention of opportunistic infections have been published by the Agency for Health Care Policy and Research 67 and the CDC. 97,107,111 Arrangements for follow-up medical care are especially important for drug users, who may require assistance in gaining entrance to a drug treatment program (see Chapter 53). All seropositive individuals should be encouraged to notify sex partners, persons with whom injection needles have been shared, and others at risk of exposure. Seropositive cases should be reported confidentially or anonymously to public health officials in accordance with local regulations.

Persons with nonreactive test results should be informed that subsequent HIV infection can be prevented by maintaining a mutually monogamous sexual relationship with an uninfected partner. Other measures to reduce the risk of infection (consistent use of condoms, etc.) should be specifically mentioned (see Chapter 62). The frequency of repeat testing of seronegative individuals is a matter of clinical discretion. Periodic testing is most important in patients who continue high-risk activities. In patients with recent high-risk exposure (e.g., sex with an HIV-infected partner), repeat testing at 3 months may be useful to rule out an initial false-negative test.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by David Atkins, MD, MPH.

29. Screening for Chlamydial Infection --- Including Ocular Prophylaxis in Newborns

Burden of Suffering

Infection with C. trachomatis is the most common bacterial sexually transmitted disease (STD) in the U.S., affecting an estimated 4 million persons at a cost of $2.4 billion each year. 1,2 The medical consequences and costs of infection are greatest in women, who may develop urethritis, cervicitis, or pelvic inflammatory disease (PID, i.e., salpingitis or endometritis). Chlamydial infections are responsible for 25-50% of the 2.5 million cases of PID that are reported annually in the U.S. 3 PID is an important cause of infertility and ectopic pregnancy in American women and may lead to chronic pelvic pain. Data from other countries suggest that infection with chlamydia may be a cofactor in heterosexual transmission of HIV infection. 4 In men, chlamydia is responsible for 30-40% of the 4-6 million visits each year for nongonococcal urethritis and half of over 150,000 cases of acute epididymitis. 1

Up to 25% of men and 70% of women with chlamydial infection are asymptomatic. 5 Immunologic surveys suggest that chlamydial infection increases the risk of infertility and ectopic pregnancy even in women who never develop clinical PID, most likely because the symptoms of salpingitis may be mild or nonspecific. 1 Asymptomatic infections in men and women also serve as an important reservoir for new infections.

Age is the strongest demographic predictor of chlamydial infection. Men and women under 25 account for the large majority of cases, 6 and prevalence of infection is highest among young women age 15-19. Although risk factors for chlamydia are similar to those for other STDs, chlamydia is distinct in that the prevalence of infection is substantial (>5%) among sexually active female adolescents in general, regardless of race, place of residence, or socioeconomic status. 1,7 For example, infection was present in 5-8% of North American female college students at student health clinics 8,9 and 8-26% of teenage girls attending adolescent clinics. 10,11 The high risk in young women probably reflects both behavioral and physiologic factors (increased exposure of cervical columnar epithelium in young women). 12 Other important risk factors for chlamydial infection include having multiple sex partners, a new sex partner, or an infected sex partner; inconsistent use of barrier contraceptives; and cervical ectopy on examination. 1,7,13-18 Among 1,800 women ages 15-34 screened in a health maintenance organization, marital status was the single strongest predictor of infection: prevalence was less than 1% among married women, 7% among single women, and 3-4% among those divorced or living as married. 15 Chlamydial infection is more prevalent among blacks than among whites or Hispanics. 15,19 In routine screening, women with vaginal discharge, cervicitis, or cervical friability (i.e., bleeding induced by swab) were more likely to be infected. 7,15 Chlamydial infection is common among women with other STDs, incarcerated women, 20 and women seeking abortions. 21 In high-risk urban communities, chlamydia was detected in 6-11% of asymptomatic, sexually active male adolescents. 22,23

The overall prevalence of chlamydial infection among pregnant women in the U.S. is estimated to be about 5%, but it varies widely (0-37%), depending on age and other risk factors. 24 Many sites serving younger women and high-risk urban communities have reported a substantially higher prevalence of infection (10-25%). 25,26 Infection during pregnancy increases the risk of endometritis, both after delivery and after elective abortion. 1,27 Each year more than 155,000 infants are born to chlamydia-infected mothers, and the organism is transmitted to the fetus in over half of deliveries. 24 Neonatal infection can result in ophthalmia neonatorum and pneumonia.

Accuracy of Screening Tests

The most specific test for chlamydial infection in asymptomatic persons is culture. Urethral and endocervical cultures have been estimated to have a sensitivity of about 70-90% and a specificity of 100%. 1,22 In addition to its variable sensitivity, culture is expensive, not uniformly available, requires careful handling of specimens, and takes 3-7 days for results. In one study, one fourth of women with positive cultures did not return for therapy. 28 In men, screening with culture requires obtaining specimens with urethral swabs, which is unacceptable to many asymptomatic men. 22

A variety of nonculture tests are now available, offering the advantages of easier handling and processing, lower costs, wider availability, and more timely results. Commercially available tests employ enzyme immunoassay (EIA), direct fluorescent antibody (DFA), DNA probe, polymerase chain reaction (PCR), or solid-phase colorimetric assays 28 to detect chlamydia in urethral or cervical specimens. Tests using ligase chain reaction (LCR) are awaiting Food and Drug Administration (FDA) approval. 29 Of these tests, EIA and DFA tests have been most widely evaluated, with reported sensitivities of 70-90% and high specificity (97-99%). 1 False-positive EIA results may result from cross-reaction with other vaginal flora or urinary pathogens, but confirmation of positive tests using blocking antibody increases specificity to close to 100%. Studies in STD clinics indicate that DNA probe, PCR, and LCR can each be very sensitive and specific (>95%). 30,31 Sensitivity of commercial PCR and DNA probe kits was significantly lower (60-75%) in some studies, 32,33 however, and the performance of these assays for screening asymptomatic patients needs further evaluation. The arrival of competitively priced, commercial kits is likely to make these increasingly popular alternatives to chlamydia culture.

The ability to detect chlamydial infection in centrifuged, first-void urine specimens may make screening asymptomatic men more feasible. 22-24 Urine dipsticks can detect leukocyte-esterase (LE) activity, an indicator of urethritis or upper urinary infections. However, the sensitivity of LE testing for chlamydial infection is variable (40-100%), 1 and the low predictive value of LE in asymptomatic young men (11% in one study 22 ) necessitates use of confirmatory tests. Testing urine specimens with EIA is more sensitive (77-91%) and specific (97-100%), but it substantially increases the cost per confirmed case. 22,34 PCR and LCR assays appear to have the highest sensitivity and specificity (95-99%) for chlamydia using urine specimens. 23,29,35 A recent study reported that LCR assays of urine were also very sensitive and specific for chlamydial infection in women (94% and 99.9%, respectively). 36

Even with highly specific tests, the likelihood that a positive result indicates true infection varies with the prevalence of infection in the population being screened. Assuming a sensitivity of 80% and specificity of 98%, the positive predictive value of a test will range from 82% when prevalence of chlamydia is high (10%), to only 45% when prevalence is low (2%). As a result, independent confirmation of positive results from some nonculture tests may be necessary to prevent false-positive results in low-risk patients.
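
These figures follow directly from Bayes' theorem, as a minimal calculation with the stated operating characteristics shows.

    # Positive predictive value (PPV) at the sensitivity (80%) and
    # specificity (98%) given in the text, at high and low prevalence.
    def ppv(prevalence, sensitivity=0.80, specificity=0.98):
        true_pos = sensitivity * prevalence
        false_pos = (1 - specificity) * (1 - prevalence)
        return true_pos / (true_pos + false_pos)

    print(f"PPV at 10% prevalence: {ppv(0.10):.0%}")   # ~82%
    print(f"PPV at  2% prevalence: {ppv(0.02):.0%}")   # ~45%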

In prospective studies of screening in low-risk populations, risk scores based on age, other risk factors, and findings on physical examination successfully identified a subpopulation of high-risk women (prevalence 6% or higher) who accounted for the large majority of all infections. 15,16,37

Effectiveness of Early Detection

Early detection of chlamydial infections in asymptomatic persons permits initiation of antibiotic therapy to eradicate infection. The benefits of detecting and treating asymptomatic infection in pregnancy have been demonstrated in several large cohort studies of high-risk women screened at the first prenatal visit. 26,38 Infected women who received erythromycin had significantly lower rates of preterm delivery, rupture of membranes, and low birth weight compared to infected women who were untreated or treatment failures. In one study, treatment was associated with lower perinatal mortality among children. 26 Some of the benefit may have been due to effects of erythromycin on pathogens other than Chlamydia or to underlying differences between treated and untreated women.

Eradication of asymptomatic infection is also likely to reduce the complications of chlamydial infection in nonpregnant women. Proving a benefit on long-term sequelae of infection (e.g., infertility and ectopic pregnancy) is difficult, but a recent trial in a large health maintenance organization demonstrated that at-risk women randomized to receive routine chlamydia screening were less than half as likely to develop PID over the next year (1% vs. 2.2%). 37 Hospitalizations for PID also declined in Sweden in association with increased chlamydia screening, but other changes in sexual behavior are likely to have contributed to this trend. 39 Treatment effectively eradicates chlamydial infection, but it has traditionally required an extended course of medication. A 7-day course of tetracycline or doxycycline results in a short-term cure in 92-100% of women and 97-100% of men. 1 Single-dose therapy with azithromycin is as effective as doxycycline and may be a suitable alternative when noncompliance is a concern. 40 The benefits of early detection are limited by high rates of reinfection or treatment failure in some populations. 1 In follow-up studies of adolescent women treated for chlamydia, 26-39% are infected 2-5 years later. 41,42 Treatment failures are usually due to failure to treat sex partners, noncompliance with therapy, or reinfection. Referral of sex partners of cases is important, since up to one third of male partners, and a majority of female partners, are infected. 5

Chlamydia may cause epididymitis, but serious complications of chlamydial infection are uncommon in men. Although screening and treating high-risk young men has the potential to reduce the incidence of chlamydia, the impact of routine screening in men has not been examined prospectively, or compared to the current strategy of screening women and treating male partners. A variety of other factors will influence whether screening men will significantly reduce the incidence of new infections: duration of the asymptomatic period, rates of transmission from asymptomatic men to their female partners, compliance with treatment, and rates of reinfection in young men.

Ocular Prophylaxis in Newborns

Between 20% and 50% of all infants born to infected mothers develop chlamydial conjunctivitis, but there is conflicting evidence of the benefit of universal ocular prophylaxis with topical antibiotics (erythromycin, tetracycline, or silver nitrate) after birth to reduce the incidence of chlamydial ophthalmia neonatorum. 43-45 In a recent trial in Kenya, where maternal chlamydial infection is common, povidone-iodine was significantly more effective than erythromycin or silver nitrate for preventing chlamydial conjunctivitis in newborns. 46 The failure rate of ocular prophylaxis for chlamydia has been estimated to be 7-19%, and chlamydial ophthalmia (unlike gonococcal ophthalmia) is rarely associated with serious ocular complications. 45 In a trial among infants born to low-risk American women, prophylaxis with silver nitrate or erythromycin reduced the incidence of conjunctivitis compared to placebo (8-9% vs. 15%); regardless of treatment, however, most cases were mild and due to organisms other than chlamydia. 47

Recommendations of Other Groups

Screening for chlamydia in asymptomatic sexually active female adolescents (under 20 years old), and in other women with risk factors for infection, is recommended by the Centers for Disease Control and Prevention (CDC), 1 the American College of Obstetricians and Gynecologists (ACOG), 48 the American Academy of Pediatrics (AAP), 49 Bright Futures, 50 the American Medical Association, 51 the American Academy of Family Physicians (AAFP), 52 and the Canadian Task Force on the Periodic Health Examination (CTF). 53 AAFP recommendations are under review. Some of these organizations also make these recommendations for adolescent males and young men at high risk. Risk factors cited by various organizations include age under 25, new or multiple sex partners in the past 3 months, inconsistent use of barrier contraception, the presence of mucopurulent cervicitis or cervical friability, the diagnosis of other STDs, and others. An expert panel convened in 1994 by the Institute of Medicine, National Academy of Sciences, is developing recommendations for public health strategies to control STDs, including chlamydia.

The CTF recommends that all pregnant women be screened for asymptomatic chlamydial infection. 53 Both ACOG and CDC recommend screening with chlamydial culture in high-risk pregnant women (including those under age 25), at the initial prenatal visit and/or in the third trimester. 1,48 No major organization recommends routine screening of the general population. The CDC, CTF, AAP, and AAFP all recommend routine ocular antibiotic prophylaxis for all newborns, primarily to prevent ophthalmia neonatorum due to Neisseria gonorrhoeae rather than Chlamydia (see Chapter 27). 1,45,49,52 Ocular prophylaxis is required by law in most states in the U.S.

Discussion

The substantial long-term morbidity from chlamydia in women, the high prevalence of asymptomatic infection, and the availability of reliable screening tests and effective treatments all suggest that screening for asymptomatic chlamydial infection may be a useful strategy. There is now preliminary evidence from one trial that screening high-risk asymptomatic, nonpregnant women can reduce the incidence of PID. Screening and treatment of infected women and their partners is also likely to reduce the incidence of new infections, although conclusive proof of this is not available. While high-risk sexual behavior is an important determinant of risk of chlamydial infection, the generally high prevalence of chlamydia among sexually active female adolescents supports routine screening in this population. 7

The optimal criteria for screening other women depend on the local burden of disease and resources available for screening. Risk of infection depends on both individual sexual behavior and the prevalence of chlamydia in the community. Where the prevalence of infection is documented to be low (<5%), targeting screening to women with multiple risk factors for infection may be most efficient. Because self-reported sexual history is often an unreliable indicator of risk, however, broader screening of young women may be preferable in practices or communities where chlamydia is highly prevalent.

There is fair evidence that treatment of chlamydial infection during pregnancy is associated with improved outcomes for both infants and mothers. Due to the low prevalence of infection in women who are older or married, universal screening is not indicated in pregnancy. Since the primary benefit of treatment in pregnancy is to prevent perinatal and postpartum complications, screening high-risk women in the third trimester is likely to be effective and reduce the opportunity for reinfection prior to delivery. Although ocular prophylaxis appears to reduce the risk of chlamydial ophthalmia neonatorum, screening and treating high-risk mothers may be a more effective means of preventing chlamydial infections in newborn infants.

Screening is less likely to benefit asymptomatic men, but screening young men using urine-based tests (LE, EIA, or PCR) may be a useful strategy to prevent spread of infection in communities where chlamydia is common. Whether routine screening in men is effective in reducing the incidence of chlamydial infection deserves further study, however.

Nonculture methods are appropriate alternatives to cell culture for diagnosis of infection. The choice of optimal testing strategy will depend on available resources, the prevalence of chlamydia, and the potential adverse consequences of false-positive diagnoses. Newer methods such as PCR or LCR are likely to offer advantages due to improved sensitivity and specificity; further evaluations of new commercial test kits are needed in asymptomatic populations before specific recommendations can be made.

Cost-effectiveness analyses have concluded that screening for chlamydia with nonculture tests is cost-effective during routine gynecologic visits 54-56 and during pregnancy 57 when prevalence of infection exceeds 6-8%. Others have suggested that screening is cost-effective at even lower prevalence. 13 Screening asymptomatic adolescent males with urine-based tests was calculated to be cost-saving, primarily by reducing infections in female partners, but has not been compared to the current strategy of screening women. 34
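
As a rough illustration of why prevalence drives these conclusions, the sketch below computes the testing cost per infection detected. The per-test cost and sensitivity are assumed, illustrative values; the cited analyses used far more complete models (sequelae averted, partner treatment, and so on), so only the basic prevalence effect is shown.

    # Testing cost per chlamydial infection detected, as prevalence varies.
    # The per-test cost ($15) and sensitivity (80%) are assumed values.
    def cost_per_case_detected(prevalence, test_cost=15.0, sensitivity=0.80):
        cases_found_per_test = prevalence * sensitivity
        return test_cost / cases_found_per_test

    for p in (0.02, 0.06, 0.10):
        print(f"prevalence {p:.0%}: ${cost_per_case_detected(p):,.0f} per case")
    # The cost per case detected falls in inverse proportion to prevalence,
    # which is why threshold prevalences figure in the cited analyses.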

CLINICAL INTERVENTION

Routine screening for asymptomatic infection with Chlamydia trachomatis during pelvic examination is recommended for all sexually active female adolescents and for other women at high risk for chlamydial infection ("B" recommendation). Patient characteristics associated with a higher prevalence of infection include: history of prior STD, new or multiple sex partners, age under 25, inconsistent use of barrier contraceptives, cervical ectopy, and being unmarried. Actual risk will depend on the number of risk factors present and the local epidemiology of chlamydial infection. Clinicians may wish to consult local public health authorities for guidance in identifying high-risk populations within their community. Algorithms to identify high-risk women have been published. 15,16 In clinical settings where the prevalence of infection is known to be high (e.g., some urban family planning clinics), routine screening of all women is appropriate. Clinicians should remain alert for findings suggestive of chlamydial infection (e.g., mucopurulent discharge, cervical erythema, or cervical friability) during pelvic examination of asymptomatic women.

Pregnant women at high risk of infection (including age under 25) should be tested for chlamydia ("B" recommendation). The optimal timing of screening in pregnancy is uncertain. There is insufficient evidence to recommend for or against screening all women during pregnancy ("C" recommendation).

There is insufficient evidence to recommend for or against routine screening in high-risk men ("C" recommendation). In clinical settings where asymptomatic infection is highly prevalent in men (e.g., urban adolescent clinics), screening sexually active young men may be recommended on other grounds, including the potential to prevent transmission to uninfected sex partners. Routine screening for chlamydia is not recommended in the general population of low-risk adults ("D" recommendation).

In women, endocervical specimens should be obtained for cell culture or nonculture assays. Verification of positive nonculture results may be necessary, depending on the underlying risk in the patient and potential adverse consequences of a false-positive result. The choice of screening test for asymptomatic men is left to clinical discretion. Urine LE dipstick is much less expensive than urine assays using EIA, PCR, or LCR, but it is also less sensitive and specific for asymptomatic chlamydial infection. The optimal frequency of testing has not been determined for women or men and is left to clinical discretion.

Routine ocular antibiotic prophylaxis with silver nitrate, erythromycin, or tetracycline is recommended for all newborn infants to prevent ophthalmia neonatorum due to gonorrhea and is required by law in most states (see Chapter 27). There is insufficient evidence to recommend for or against universal ocular prophylaxis of newborns solely for the prevention of chlamydial conjunctivitis ("C" recommendation).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by David Atkins, MD, MPH, based in part on background materials prepared by H. Oladele Davies, MD, MSc, FRCPC, and Richard B. Goldbloom, MD, FRCPC, for the Canadian Task Force on the Periodic Health Examination.

30. Screening for Genital Herpes Simplex

Burden of Suffering

Primary genital herpes simplex virus (HSV) infection occurs in approximately 200,000-500,000 Americans each year, 1 mostly in adolescents and young adults. Between 25 and 31 million individuals are chronically infected. 1,2 Both HSV types 1 (HSV-1) and 2 (HSV-2) can infect the genitalia, but HSV-2 causes the majority of primary and recurrent genital herpes infections. 3 Most HSV-2 infections are asymptomatic, detected only by seroconversion; 4,5 16% of the adult population is HSV-2 seropositive. 2 In symptomatic genital herpes, the chief clinical morbidity is painful, pruritic vesicles that may coalesce into large ulcerative lesions. 3 Systemic symptoms, such as fever, headache, myalgia, and malaise, are reported by two thirds of patients with primary first-episode genital herpes, and serious complications such as meningitis (reported in 8%) may ensue. 3 After initial infection, the virus enters a latent state in spinal cord ganglia. Infected persons may periodically experience viral reactivations that can be asymptomatic, characterized by viral shedding alone, or symptomatic, marked by a recurrence of signs and symptoms that are less severe than those of primary genital herpes. 3 The sexual contacts of individuals with either symptomatic or asymptomatic disease are at risk of becoming infected. 6

In pregnant women, the rate of genital herpes reported as a maternal risk factor on birth certificates is 8.0/1,000 live births. 7 Pregnant women with genital HSV infection can transmit the virus to their newborns. The majority (82-87%) of neonatal infections occur during delivery, but some also occur in utero or postnatally. 8-10 In 1984 it was estimated that the minimum annual incidence of neonatal HSV infections in the United States, based on voluntary reporting, was 4/100,000 live births; 11 intensive local surveillance in one county in Washington found a rate of 12/100,000 live births. 12 One fourth of HSV-infected neonates develop disseminated disease and one third have encephalitis. 13,14 Even with antiviral treatment, the mortality rate is 57% among infants with disseminated disease and 15% among those with encephalitis. 13 Severe neurologic impairment occurs in about one third of those who survive encephalitis or disseminated disease. 13,14 Among infants with infection apparently limited to mucocutaneous involvement, death or severe impairment is rare but other complications such as visual impairment or seizures occur in about 5%. 13,14

Accuracy of Screening Tests

History and physical examination are not adequate screening tests for either active (i.e., transmissible) or latent genital HSV infection, because most infected persons are asymptomatic; 4,5,15 their clinical manifestations may resemble a number of other causes of genital ulcerations; 16 and viral shedding in association with recurrent disease may be asymptomatic. 3,17

The most commonly used test for detecting active genital HSV infection is viral culture. The sensitivity of this test is variable, however, depending upon the viral titer present, ranging in one study from 93% for vesicles to 72% for ulcers and 27% for crusted lesions, and from 82% for ulcerative lesions in first episodes to 43% for ulcerative lesions in recurrent episodes. 16,18 Since the viral titer in asymptomatic shedding is 10-100 times less than that in symptomatic episodes, 17 the sensitivity of viral culture for detecting HSV infection in asymptomatic individuals is likely to be low. In addition, conventional viral culture is time-consuming and technically demanding, and only 40-48% of positive results are available within 24 hours. 19-21 Viral culture techniques may be modified to produce final test results within 16-24 hours, but sensitivity is then reduced by 5-20% compared with final conventional culture results in symptomatic patients; 20-24 sensitivity is likely to be reduced further in asymptomatic patients.

Other rapid screening methods, such as cytology and direct fluorescent antibody staining, are widely available but are substantially less sensitive than is conventional viral culture. 6 Newer methods, not yet licensed for clinical diagnostic testing for HSV, include enzyme immunoassay (EIA), polymerase chain reaction (PCR), and DNA hybridization. EIA and PCR show concordance of >93% with the results of conventional viral culture in symptomatic women. 25-28 In one study in asymptomatic pregnant women, PCR had a reported concordance of 100% with conventional culture. 27 EIA, however, had a concordance of only 59% with conventional viral culture in a large study of samples taken primarily from "presumed asymptomatic" pregnant women. 25 EIA can provide results within several hours, whereas even with automated techniques PCR currently requires more than a day to process and is extremely labor intensive. Both EIA and PCR may react with nonviable virus or viral particles, 25-28 thus overestimating the risk of infectivity. Results from studies of DNA hybridization appear less promising, with a sensitivity of 25% and specificity of 88% compared to conventional culture, 22 and 43% and 71%, respectively, compared to cytology, 29 on samples from symptomatic patients.

An estimated 35-80% of infants with neonatal herpes are born to women with no known history of genital herpes or physical signs of infection at delivery. 10,12,13 Therefore, screening asymptomatic pregnant women has the potential of identifying unrecognized active HSV infections. In order for routine screening at the onset of labor to be useful for clinical decision making regarding surgical or medical intervention to prevent neonatal herpes, rapid and accurate methods of detecting asymptomatic HSV infections likely to be transmitted would be needed. The yield of routine screening by viral culture in asymptomatic pregnant women is quite low; large-scale screening studies have isolated HSV by culture from only 0.20-0.35% at the time of delivery. 30-32 A positive viral culture does not necessarily mean an infant will become infected during delivery. The risk of acquiring neonatal herpes infection from an asymptomatic pregnant woman with active viral shedding from reactivated disease is less than 5%, whereas from a woman with first-episode genital disease the risk is 33%. 31 A negative culture, however, does not eliminate an infant's risk of infection. In one large cohort study, the mothers of 30% (3 of 10) of the infected newborns were culture negative at the time of delivery, 31 and in a case series of infants with neonatal herpes, 61% (54 of 89) of the pregnant women had negative cultures within the 2 weeks before delivery. 33 In asymptomatic women with a history of recurrent herpes, surveillance cultures during the 4 weeks before delivery did not correlate with viral shedding at delivery. 34 Thus, screening near term is not adequate to predict accurately the likelihood of HSV transmission from asymptomatic pregnant women to their offspring.
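
The low expected yield can be quantified from the rates cited above; the sketch below is illustrative only and takes values near the upper end of the cited ranges.

    # Expected yield of universal viral culture at delivery, per 100,000
    # asymptomatic pregnant women, using rates near the upper ends of the
    # ranges cited in the text.
    deliveries = 100_000
    culture_positive_rate = 0.003   # 0.20-0.35% cited
    transmission_risk = 0.05        # <5% cited for reactivated shedding

    culture_positive_mothers = deliveries * culture_positive_rate     # 300
    infected_neonates = culture_positive_mothers * transmission_risk  # ~15

    print(f"culture-positive mothers: {culture_positive_mothers:.0f}")
    print(f"neonatal infections among them: at most {infected_neonates:.0f}")
    # Even a perfectly effective intervention applied to every culture-positive
    # woman would avert at most about 15 infections per 100,000 deliveries,
    # while many infected newborns are born to culture-negative mothers.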

Antibody testing can accurately distinguish HSV-seropositive from HSV-seronegative persons and therefore may be useful to detect asymptomatic carriers at potential risk for transmitting disease, as well as persons susceptible to primary infection. Commercial assays are insensitive to recent infections, however, and they are unreliable for distinguishing HSV-2 from HSV-1 antibodies. 35,36 Antibody test results do not indicate whether the virus is currently capable of being transmitted.

Effectiveness of Early Detection

The detection of HSV infection in asymptomatic, nonpregnant individuals would be useful if treatment were available to either eradicate latent HSV infection or to prevent transmission to sex partners by eliminating or reducing viral shedding. There is currently no effective treatment for eradicating latent herpes infection. Both episodic and continuous oral acyclovir reduce viral shedding, lesion healing time, and local and systemic symptoms during symptomatic primary first-episode and recurrent genital HSV infections. 37-42 When used continuously for up to 4 years, oral acyclovir produces only minor side effects and minimal emergence of resistant strains in immunologically normal individuals. 41-43 The beneficial effects of acyclovir on lesion healing and viral shedding in symptomatic individuals have not been documented to prevent or reduce transmission to sex partners, however. Based on a single, small before-after study, oral acyclovir does not appear to prevent asymptomatic viral shedding, 44 and no studies have evaluated its ability to decrease infectivity and disease transmission during episodes of asymptomatic shedding.

Routine screening for HSV-2 antibodies may be useful to identify persons with previously unrecognized infection, 4 who could then be instructed in the recognition of recurrent episodes. Such instruction results in recognition of clinically symptomatic genital herpes on follow-up in 50% of seropositive persons with previously unrecognized infection. 45 Counseling seropositive persons to avoid sexual activity or to use condoms during symptomatic episodes may reduce transmission of herpes to their sex partners. 45,46 Among a series of 144 couples with one partner with recurrent herpes and one without antibody, all of whom were advised to abstain from skin-to-skin contact during active episodes and about the risks of transmission during asymptomatic periods, acquisition of genital herpes occurred in 6% who used barrier contraception and 14% who did not (p = 0.19), but only 15% of couples used condoms routinely. 47 Although not specifically designed to evaluate counseling, this study suggests a limited benefit from knowledge of susceptibility. The effectiveness of this strategy in preventing HSV-2 transmission has not been evaluated adequately; it may not provide any incremental benefit over routine counseling of all sexually active adults regarding prevention of sexually transmitted diseases. 48

The early detection of active HSV infection may be of greater importance during pregnancy because cesarean delivery can be performed. This has the potential to reduce the exposure of the neonate to virus in the birth canal that occurs during vaginal delivery, although the evidence for the effectiveness of this intervention is limited. Small, uncontrolled case series of symptomatic women with positive genital cultures during the 1-2 weeks before delivery 49,50 or with positive cervical cultures at the time of delivery 51 suggest a protective effect of cesarean delivery; no controlled trials have evaluated this intervention. None of these studies differentiated primary from recurrent infections, which have different rates of HSV transmission. Cesarean delivery is clearly not completely effective, since large case series of newborns infected with HSV reveal that 19-33% of them were delivered by cesarean. 11,33,52 Information concerning the effectiveness of cesarean delivery in preventing neonatal herpes transmission by asymptomatic pregnant women comes from a large cohort study that screened such women by viral culture during early labor. 31 In this study, 8% (1 of 13) of infants delivered by cesarean to culture-positive women became infected, compared to 14% (6 of 43) of infants delivered vaginally to culture-positive women. Drawing conclusions from this study is difficult, however, because the sample size was insufficient to establish statistical significance; reasons for selection of vaginal delivery are not given; and differences between the two groups in the proportions of primary versus recurrent infections, site of positive culture (i.e., cervical vs. other), and duration of rupture of membranes are not delineated. Thus, the benefit of cesarean delivery in either symptomatic or asymptomatic culture-positive women is not established.

Even if cesarean delivery does offer some benefit in preventing the transmission of HSV to newborns, more definitive studies would be needed to determine the proper indications for abdominal delivery. For example, it is not clear whether cesarean delivery would be indicated when the risk of herpes transmission is low, e.g., in the setting of asymptomatic viral shedding, recurrent symptomatic disease, or when labial but not cervical cultures are positive. 31,34,51,53 In these relatively low-risk situations, the potential benefit to the fetus of averting HSV infection may not outweigh the known risk of complications in the mother and infant due to cesarean delivery. In cohort studies, cesarean delivery has been associated with increases in both maternal morbidity and mortality compared to vaginal delivery, 54-56 even when stratified by maternal diagnosis. A 1993 decision analysis model calculated that cesarean delivery for herpes lesions at delivery in women with recurrent genital HSV leads to 1,580 excess (i.e., performed solely to prevent HSV transmission) cesarean deliveries for every neonate saved from death or neurologic sequelae, and 0.57 maternal deaths for every neonatal death prevented; total costs were $2.5 million per case of HSV averted, and $203,000 per quality-adjusted life-year (QALY) gained. 57 These estimates are sensitive to the risk of vertical transmission (estimated to be 1%) and to the efficacy of cesarean delivery (estimated to be 80%); reductions in either of these could result in maternal deaths exceeding neonatal mortality. The decision analysis results change dramatically if only women with primary HSV infections are entered. In women with herpes lesions at delivery but no previous history of genital HSV, nine excess cesarean deliveries would be performed for every neonate saved, with 0.004 maternal deaths per neonatal death prevented, at a total savings of more than $38,000, saving $2,600 per QALY gained. Net benefits persisted across all likely ranges of values entered into the model.
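
A radically simplified version of this trade-off can be written in a few lines. The sketch below uses the transmission-risk and efficacy estimates quoted above, but it omits the cost, maternal-mortality, and outcome structure of the published model; it therefore computes excess cesareans per neonatal infection averted, not the published figure of cesareans per neonate saved from death or sequelae.

    # Toy model: excess cesarean deliveries per neonatal HSV infection averted.
    # Parameter values are the point estimates quoted in the text.
    def cesareans_per_infection_averted(transmission_risk, cesarean_efficacy):
        averted_per_cesarean = transmission_risk * cesarean_efficacy
        return 1 / averted_per_cesarean

    # Recurrent disease with lesions at delivery: risk ~1%, efficacy ~80%.
    print(f"recurrent: {cesareans_per_infection_averted(0.01, 0.80):.0f}")  # 125
    # First-episode (primary) infection at delivery: risk ~33%.
    print(f"primary:   {cesareans_per_infection_averted(0.33, 0.80):.0f}")  # ~4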

Serologic screening may prove useful for the prevention of primary HSV-2 infections in pregnancy. One study screened pregnant women and their partners for type-specific antibodies to herpes, and found that 10% (18 of 190) of the women were seronegative with seropositive partners, and therefore were at risk of contracting a primary HSV-2 infection during pregnancy; 7 of 18 couples continued to have unprotected intercourse after being informed of their serologic status, and 1 of the 7 seroconverted during pregnancy. 5 Studies evaluating the effectiveness of counseling such couples to abstain from sexual intercourse or to use condoms regularly during pregnancy to prevent neonatal herpes transmission have not been performed.

Another potential strategy for preventing the transmission of HSV to newborns is offering prophylactic acyclovir to pregnant women with recurrent herpes. A case series of 15 pregnant women with recurrent genital herpes demonstrated that suppressive treatment with acyclovir after 38 weeks of gestation was well tolerated with no toxicity to the mothers or infants. 58 None of the women experienced new symptomatic recurrences or asymptomatic viral shedding after beginning treatment and none of their infants developed neonatal infection. In a pilot randomized controlled trial, women with recurrent herpes who received acyclovir continuously at least 1 week before expected term had significantly fewer HSV recurrences/positive cultures and a significantly lower rate of cesarean delivery for herpes. 59 Four randomized controlled studies are currently being conducted, in the United States, Norway, and England, to determine the effectiveness and safety of prophylactic acyclovir in reducing the risks of asymptomatic shedding, cesarean delivery, and neonatal transmission when given in late pregnancy to women with histories of recurrent herpes (H. Watts, personal communication, July 1995; L. Scott, personal communication, July 1995). 60,61 Although acyclovir has not been found to be teratogenic in standard animal testing, and no recognizable pattern of birth defects has been detected among 601 reported cases of exposure during pregnancy, current data are only sufficient to exclude a teratogenic risk of at least 2-fold over the 3% baseline risk of birth defects. 62,63

Recommendations of Other Groups

The American College of Obstetricians and Gynecologists, 64 the American Academy of Pediatrics, 65 the Canadian Task Force on the Periodic Health Examination, 66 and the Infectious Disease Society of America 67 recommend against surveillance cultures for herpes infections in asymptomatic pregnant women. All four groups suggest careful examination of all women at the time of delivery and culture of active lesions, with cesarean delivery for women with positive findings on clinical examination. 64-67 No organizations currently recommend screening for genital herpes simplex virus or antibody in the asymptomatic general population.

Discussion

There are currently no commercially available tests that are adequate to detect latent HSV-2 infections in asymptomatic patients. Even if accurate type-specific serology becomes widely available, there is no proven treatment to eradicate latent infection or to eliminate viral shedding in order to prevent disease transmission. Similarly, there is limited evidence that counseling persons known to have HSV offers any benefit over routine counseling of all sexually active adults to prevent sexually transmitted diseases. The evidence therefore does not support screening the asymptomatic general population for HSV infection.

For pregnant women (and those planning conception), the potential benefit of detecting asymptomatic and unrecognized HSV infection is the prevention of neonatal HSV transmission. The risk of transmitting HSV to their infants is slightly increased in pregnant women with asymptomatic shedding of HSV due to reactivated disease at delivery, and it is substantially increased in women with primary HSV infection at delivery. Culture results at the onset of labor are rarely available in time to affect clinical decision making, and there is good evidence that positive viral cultures in the weeks prior to delivery do not accurately predict the risk of neonatal HSV transmission. More rapid tests that could be performed at the onset of labor are either substantially less sensitive than culture or not yet widely available. Women with primary first-episode HSV infection at delivery are more likely to present with symptoms and signs detectable by physical examination, but such examinations have not been shown to be sensitive or specific. Even if the diagnosis of HSV is made by physical examination during labor, the evidence supporting the effectiveness of cesarean delivery in preventing neonatal HSV transmission is of poor quality, while there is fair evidence that cesarean delivery increases risk to the mother and fetus compared to vaginal delivery. A recent decision analysis predicts that if cesarean delivery prevents 85% of neonatal HSV infections that occur following vaginal delivery, a physical examination at labor for symptoms or signs of genital herpes would minimize the ratio of excess cesarean deliveries to cases of neonatal HSV infection averted, compared to other screening methods or no screening. 68 Another model that evaluated performing a physical examination at delivery, followed by cesarean delivery for women with genital herpes lesions, found clear evidence of benefit only for women with no history of genital herpes. 57 For women with recurrent herpes, the risk to the mother may outweigh that to the neonate, depending on assumptions made about the efficacy of cesarean delivery and the likely HSV transmission rate. The use of acyclovir in pregnancy to reduce neonatal HSV has not been adequately evaluated, but trials are ongoing.

Although a history of genital herpes does not accurately predict HSV seropositivity, if the pregnant woman who lacks such a history has a partner known to have genital herpes, counseling to prevent HSV transmission to the woman could prevent primary HSV infection, thereby preventing neonatal HSV at little cost or risk to the patient. When commercially available, HSV serotyping at the first prenatal visit with serotesting of the partners of those who are HSV-2 seronegative would allow more accurate detection of pregnant women at risk for primary HSV infection. The effectiveness of counseling such women regarding primary prevention has not been demonstrated, however.

CLINICAL INTERVENTION

Routine screening for genital herpes simplex in asymptomatic persons, using culture, serology, or other tests, is not recommended ("D" recommendation). See Chapter 62 for recommendations on counseling to prevent sexually transmitted diseases.

Routine screening for genital herpes simplex infection in asymptomatic pregnant women, by surveillance cultures or serology, is also not recommended ("D" recommendation). Clinicians should take a complete sexual history on all adolescent and adult patients (see Chapter 62).

As part of the sexual history, clinicians should consider asking all pregnant women at the first prenatal visit whether they or their sex partner(s) have had genital herpetic lesions. There is insufficient evidence to recommend for or against routine counseling of women who have no history of genital herpes, but whose partners do have a positive history, to use condoms or abstain from intercourse during pregnancy ("C" recommendation); such counseling may be recommended, however, on other grounds, such as the lack of health risk and potential benefits of such behavior.

There is also insufficient evidence to recommend for or against the examination of all pregnant women for signs of active genital HSV lesions during labor and the performance of cesarean delivery on those with lesions ("C" recommendation); recommendations to do so may be made on other grounds, such as the results of decision analyses and expert opinion. There is not yet sufficient evidence to recommend for or against routine use of systemic acyclovir in pregnant women with recurrent herpes to prevent reactivations near term ("C" recommendation).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Paul Denning, MD, MPH, and Carolyn DiGuiseppi, MD, MPH.

31. Screening for Asymptomatic Bacteriuria

Burden of Suffering

Asymptomatic bacteriuria is defined as a significant bacterial count (usually >=10^5 or 10^6 organisms/mL) present in the urine of a person without symptoms. Asymptomatic bacteriuria may precede symptomatic urinary tract infection, characterized by dysuria, frequency, pain, fever, etc., which accounts for over 6 million outpatient visits each year. 1 Urinary tract infection may be associated with renal insufficiency and increased mortality in adults, but these complications rarely occur among those without underlying structural and functional diseases of the urinary tract. 2 In both institutionalized and noninstitutionalized elderly, urinary tract infection is the most common cause of bacteremia, which may be associated with a 10-30% case fatality rate. 3,4 Most such bacteremia occurs in residents with indwelling catheters or urinary tract abnormalities, however. Similarly, most of the 300,000 hospitalizations each year for urinary tract infections 1 involve patients with indwelling urethral catheters.

In children, asymptomatic bacteriuria may be a sign of underlying urinary tract abnormalities. About 10-35% of infants and children with asymptomatic bacteriuria have vesicoureteral reflux and 6-37% have renal scarring or other abnormalities (the lower prevalences generally reflecting more stringent definitions of abnormality), 2,5-8 whereas such abnormalities are uncommon in the general population of children. 2,9 Children with major structural abnormalities, chronic pyelonephritis, or severe vesicoureteral reflux are at increased risk of renal scarring, obstructive renal atrophy, hypertension, and renal insufficiency. 2 Pyelonephritis, reflux nephropathy, and urinary tract malformations may cause as much as one fifth of cases of renal failure in children. 10 In pregnancy, 13-27% of untreated women with asymptomatic bacteriuria develop pyelonephritis, usually requiring hospitalization for treatment. 11-14 Bacteriuria in pregnant women increases the risk for preterm delivery and low birth weight about 1.5-2-fold, and may also increase the risk of fetal and perinatal mortality. 15-23

The risk of acquiring bacteriuria varies with age and sex. Asymptomatic bacteriuria in term infants is more common in males (estimated prevalence of 2.0-2.9% vs. 0.0-1.0% in females), but it is considerably more common in girls after age 1 (0.7-2.7% in girls vs. 0.0-0.4% in boys). 2,5-8,24 Approximately 5-6% of girls have at least one episode of bacteriuria between first grade and their graduation from high school, and as many as 80% of these children experience recurrent infections. 2 Asymptomatic bacteriuria in adulthood is more prevalent in women than men (3-5% vs. <1% in those under 60 years), and its prevalence increases with age. 25-27 Asymptomatic bacteriuria is a common finding in older persons, especially those who are very old (20% of women and 10% of men >80 years old living in the community) or institutionalized (30-50% of women and 20-30% of men). 3,4 Bacteriuria occurs in 2-7% of pregnant women; of those who are not bacteriuric at initial screening, 1-2% will develop bacteriuria later in the pregnancy. 28-30 An increased prevalence of asymptomatic bacteriuria (about 10-20%) has been reported in asymptomatic diabetic women, although several studies have found no increase when compared to matched nondiabetic controls or to expected age- and sex-specific population rates. 2,31-34

Accuracy of Screening Tests

The most accurate test for bacteriuria is urine culture, but laboratory charges make this test expensive for routine screening in populations that have a low prevalence of asymptomatic bacteriuria. The most commonly used tests for detecting bacteriuria in asymptomatic persons are dipstick urinalysis and direct microscopy. The dipstick test is rapid, inexpensive, and requires little technical expertise. The dipstick leukocyte esterase (LE) test, which detects esterases released from degraded white blood cells, is an indirect test for bacteriuria. When compared with culture (at least 100,000 organisms/mL), it has a sensitivity of 72-97% and a specificity of 64-82%. 35-40 The nitrite reduction test, which detects nitrites produced by urinary bacteria (usually limited to Gram-negative bacteria), has variable sensitivity (35-85%) but good specificity (92-100%). 35-39,41-49 In children, dipstick testing for LE and/or nitrites has been found to have sensitivity and specificity of around 80% compared to quantitative culture. 50-57 Among pregnant women, a sensitivity of only 50% for dipstick testing compared to culture has been reported. 30 False-positive and false-negative urinalysis results are due to a variety of factors, including specimen contamination, certain organisms, and the timing of specimen collection. The sensitivity of this test can be improved by obtaining first-morning specimens, preferably on consecutive days, instead of performing random collection. 41 Many of the studies assessing the accuracy of dipstick testing in children and adults do not describe the patients included. A proportion of these patients were undoubtedly symptomatic, possibly leading to bias in the accuracy estimates. In one study, dipstick sensitivity was significantly lower (56% vs. 92%) and specificity significantly higher (78% vs. 42%) in patients with few symptoms and a low prior probability of bacteriuria, compared to patients with a high prior probability of bacteriuria (i.e., those with dysuria, frequency, etc.). 58

Examination of the sediment by microscopic urinalysis to detect bacteria and white blood cells has also been evaluated as a screening test for bacteriuria. In children (including symptomatic patients), microscopy performs similarly to dipstick testing for detection of bacteriuria. 50 In pregnant women, microscopic analysis, with either bacteriuria or pyuria indicating a positive test, had a sensitivity of 83% but a specificity of only 59%. 30 In hospitalized adults, only 3% of urine specimens that were macroscopically (including dipstick) negative had clinically significant abnormalities detected by routine microscopic examination. 59,60 Microscopy has limited value as a screening test for asymptomatic persons because of the cost, time, and technique required. 30

In populations with a low prevalence of urinary tract disorders, most positive screening tests are falsely positive. Thus, in asymptomatic men, and in asymptomatic women under age 60, a dipstick test has a positive predictive value for significant bacteriuria of less than 10% (assuming a sensitivity of 85% and a specificity of 70%). 20,25,43 In children, the likelihood of bacteriuria in the presence of a positive dipstick screening test has been estimated at 0.1% for boys and 4% for girls. 57 In groups at increased risk for urinary tract infection, the positive predictive value of dipstick tests is higher: 13% in pregnant women, 18% in women over age 60, 33% in diabetic women, and 44% in institutionalized older persons. 20,25,29,32,41,43,61-64 The predictive value of bacteriuria found on microscopic urinalysis among pregnant women was 4.2-4.5%. 30
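
The predictive values quoted above follow directly from Bayes' theorem. The following Python sketch is illustrative only and is not part of the original report: the 85% sensitivity and 70% specificity are the assumptions stated in the text, and the prevalences are illustrative values drawn from the ranges cited earlier in this chapter (3-5% in women under 60, <1% in men).

    def ppv(sensitivity, specificity, prevalence):
        # P(bacteriuria | positive dipstick), by Bayes' theorem.
        true_pos = sensitivity * prevalence
        false_pos = (1 - specificity) * (1 - prevalence)
        return true_pos / (true_pos + false_pos)

    # Dipstick test characteristics assumed in the text.
    SENS, SPEC = 0.85, 0.70

    # Illustrative prevalences within the ranges quoted in this chapter.
    print(f"Women <60: PPV = {ppv(SENS, SPEC, 0.03):.1%}")   # about 8%
    print(f"Men:       PPV = {ppv(SENS, SPEC, 0.005):.1%}")  # about 1%

Both values fall below 10%, consistent with the figure given above; applying the same formula with the higher prevalences of the high-risk groups reproduces the larger predictive values cited.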

Urine screening tests are generally performed on a clean-catch specimen. In infants and young children, collection of a "clean" urine specimen is difficult, and as a result few studies of the accuracy of screening tests have included infants. Adhesive polyethylene bag specimens are the most acceptable choice, but these may have a significant contamination rate (false positives). Compared to suprapubic aspiration, positive results on bag specimens indicate true bacteriuria in only 7.5% of specimens. 65 The collection of confirmatory sterile culture specimens by suprapubic aspiration or urethral catheterization is too invasive and costly to be considered in a screening protocol for asymptomatic infants, as is routine screening by urethral catheterization.

Effectiveness of Early Detection

The early detection of asymptomatic bacteriuria may reduce the rate of bacteriuria and prevent symptomatic infection and its complications. Some observational studies suggest that persons with untreated asymptomatic bacteriuria are at increased risk of developing symptomatic urinary tract infection 66,67 and other complications (e.g., structural damage, renal insufficiency, hypertension, or mortality). 41,61-64,68-71 Evidence is not conclusive, however, that these clinical outcomes are caused by bacteriuria (especially in the absence of a structural abnormality), or that early treatment results in important clinical benefits. A randomized placebo-controlled trial of conventional treatment for asymptomatic bacteriuria in both young and middle-aged women (ages 20-65) reported no significant differences in the prevalence of bacteriuria or incidence of symptomatic urinary tract infection at 1-year follow-up. 66 Another randomized controlled trial (available only in abstract form) among women ages 16-69 years with asymptomatic bacteriuria reported significant reductions in bacteriuria at 1 and 3 years with vigorous individualized antimicrobial therapy, but did not report on clinical outcomes. 72 Among a cohort of middle-aged women (38-60 years) screened for asymptomatic bacteriuria, the prevalence of asymptomatic bacteriuria at 6-year follow-up in women identified with asymptomatic bacteriuria and appropriately treated remained significantly higher than in the nonbacteriuric group (23% vs. 5%), and 58% of the treated women had recurrent or persistent infection within 2 years of treatment. 25 Studies evaluating the treatment or natural history of asymptomatic bacteriuria are not available for young or middle-aged men.

Randomized controlled trials in institutionalized elderly women 73 and men 74 found no decreases in genitourinary morbidity with treatment of asymptomatic bacteriuria despite a reduced prevalence of bacteriuria. In both studies, life-table analyses suggested a survival advantage for the untreated group, but the differences were not statistically significant. In women, treatment was associated with an increased incidence of adverse antimicrobial drug effects and increased reinfections. 73

Among noninstitutionalized ambulatory elderly women, a randomized controlled trial reported that treatment significantly reduced the prevalence of bacteriuria at 6-month follow-up. 67 Symptomatic urinary tract infection and mortality rates were 16.4% and 4.9%, respectively, without treatment, compared to 7.9% and 3.2%, respectively, with treatment, but these differences were not statistically significant; sample size may have been inadequate to detect a difference, however. In a nonrandomized controlled trial in noninstitutionalized elderly women, treatment of asymptomatic bacteriuria did not significantly reduce mortality (adjusted relative risk 0.92, 95% confidence interval, 0.57 to 1.57), although wide confidence intervals do not exclude the possibility of a substantial benefit. 75 A large cohort study from the same center reported no association between asymptomatic bacteriuria and mortality in ambulatory elderly women after control for confounding, even though the cure rate with treatment was 83% compared to a 16% spontaneous remission rate in untreated patients. 75 It is not clear whether the possible but unproven benefits from treatment of such women justify routine screening or the potential adverse effects of antibiotic therapy, including drug toxicity and the development of resistant organisms while treating recurrent infections. No controlled trials of therapy for asymptomatic bacteriuria in noninstitutionalized elderly men have been reported. In a prospective cohort study of 234 elderly men followed for up to 4.5 years, 29 (12%) had asymptomatic bacteriuria at initial screening, and 20 (8%) became positive in follow-up. 76 Of untreated bacteriuric subjects, 76% spontaneously cleared. Only five bacteriuric subjects were treated for symptomatic infection, with prompt recurrence of asymptomatic bacteriuria in three; no adverse outcomes from symptomatic infection were reported. Cohort and cross-sectional studies that have included elderly ambulatory men have reported no differences in mortality, chronic genitourinary symptoms, or systemic symptoms such as anorexia, fatigue, or malaise between those with and without asymptomatic bacteriuria, after adequate adjustment for confounding variables. 77-79

Although some trials of elderly patients may have included persons with diabetes, we found no controlled clinical trials specifically evaluating the effectiveness of early detection of asymptomatic bacteriuria in diabetics for improving clinical outcome. Case series suggest treatment of asymptomatic bacteriuria usually clears bacteriuria and may reduce clinical symptoms, but bacteriuria recurs in more than two thirds of treated patients. 80-83 Continuous suppressive antibiotic therapy in diabetic patients can prevent re-infection but provides no posttreatment benefit. 80,81 The long-term consequences of asymptomatic bacteriuria in this population are undefined, although in one series persistent bacteriuria did not appear to contribute to renal damage. 83

The early detection of asymptomatic bacteriuria is of greater potential value for pregnant women, in whom bacteriuria is an established risk factor for serious complications, including acute pyelonephritis, preterm delivery, and low birth weight. Randomized controlled trials, cohort studies, and a meta-analysis of 8 randomized clinical trials have shown that treatment of asymptomatic bacteriuria during pregnancy can significantly reduce the incidence of symptomatic urinary tract infection, low birth weight, and preterm delivery. 12-14,18,20,28,84 There is little evidence regarding the optimal periodicity of screening in pregnancy. A urine culture obtained at 12-16 weeks of pregnancy will identify 80% of women who will ultimately have asymptomatic bacteriuria in pregnancy, 85 with an additional 1-2% identified by repeated monthly screening.

In children, detection of bacteriuria might lead to the identification of correctable abnormalities of the urinary tract and the prevention of renal scarring, obstructive atrophy, hypertension, and renal insufficiency. However, in three randomized controlled trials in girls aged 5-15 years, treatment of asymptomatic bacteriuria did not significantly reduce emergence of symptoms, pyelonephritis, renal scarring, or persistence of vesicoureteral reflux. 86-88 In two of these trials, 86,87 sample sizes may have been too small to detect important differences, but adverse outcomes were rare in both groups. Treated and control subjects had similar growth, blood pressure, renal growth, and concentrating capacity at the end of follow-up, ranging from 12 to 48 months. In longitudinal studies from the Oxford-Cardiff Cohort screening program, girls with asymptomatic bacteriuria in childhood had an increased prevalence of asymptomatic bacteriuria in pregnancy, and among those with asymptomatic bacteriuria and renal scarring, increased preeclampsia, hypertension, and obstetric interventions. 89,90 On the other hand, in pregnant women with a history of symptomatic urinary tract infection in childhood, there were no differences in preeclampsia or operative delivery, although asymptomatic bacteriuria was again more common. 91 All pregnancies in these studies had satisfactory maternal and fetal outcomes.

Most of the complications from urinary tract abnormalities are thought to occur before children reach school age, 2 and therefore screening might be more effective in younger children. There have been no studies, however, proving that preschool urinalyses result in lower morbidity from recurrent infection or in less renal damage. 2,92 Several studies have evaluated the natural history of asymptomatic bacteriuria detected in infancy and followed through the preschool years. In a Swedish cohort of 3,581 screened newborns, 50 infants were identified with asymptomatic bacteriuria, of whom 3 (<0.1%) were treated for underlying renal or urologic abnormalities and 2 were treated for pyelonephritis that occurred within 2 weeks of testing. 8,93 All 45 infants with untreated asymptomatic bacteriuria followed for up to 7 years cleared either spontaneously (80%) or after antibiotic treatment for other conditions (20%). Three subsequently developed cystitis and 20% had recurrences of asymptomatic bacteriuria, but none had major renal or urologic abnormalities as measured by concentrating capacity and urography at a median follow-up of 32 months. Forty infants developed symptomatic urinary tract infection in the first year of life, but only 2 (5%) had evidence of bacteriuria on previous screening. In another cohort of 1,617 healthy infants followed for 5 years, screening for asymptomatic bacteriuria detected 5 cases (0.3%) with high-risk lesions (such as obstructive uropathy, vesicoureteral junction ectopia, etc.). 94 Whether early detection of bacteriuria improved prognosis was not established by this study. In 113 infants less than 1 year old undergoing urologic evaluation, the proportion of abnormal kidneys on dimercaptosuccinic acid (DMSA) scan did not differ between those with and without urinary tract infection (33% vs. 28%), suggesting that renal scarring from reflux may occur independently of bacteriuria. 95 Renal abnormalities detectable by ultrasound are found in 1.4% of infants who are considered normal, 2,96 compared to 6% of infants with asymptomatic bacteriuria. 8 However, these infants might have been detected outside the screening program as their symptoms developed.

The effectiveness of detecting asymptomatic bacteriuria in patients with indwelling or intermittent urethral catheterization, of periodic screening in patients with known urologic structural abnormalities, or of follow-up of symptomatic urinary tract infection with repeat cultures, is not discussed in this report. These forms of testing are considered within the domain of diagnostic studies for patients with existing medical or surgical conditions, rather than a part of routine screening tests for asymptomatic persons.

Recommendations of Other Groups

The American Academy of Family Physicians (AAFP) recommends periodic screening by dipstick combining leukocyte esterase and nitrite tests to detect bacteriuria in preschool children, those who are morbidly obese, persons with diabetes or a history of gestational diabetes, and persons aged 65 years and older. 97 The recommendations of the AAFP are currently under review. The American College of Physicians recommends against routine screening of adults for asymptomatic bacteriuria with urinalysis or urine culture. 98 The Canadian Task Force on the Periodic Health Examination recommends against routinely screening asymptomatic infants, children, elderly men, or institutionalized elderly women for bacteriuria, and found insufficient evidence to recommend for or against screening noninstitutionalized elderly women. 99 Bright Futures does not recommend routine urinalyses in infants, children, or adolescents. 100 The American Academy of Pediatrics (AAP) recommends routine urinalysis at age 5, and dipstick urinalysis for leukocytes for all adolescents, preferably at age 15 years. 102

The American College of Obstetricians and Gynecologists and the AAP recommend a urinalysis, including microscopic examination and infection screen, at the first prenatal visit, with the need for additional laboratory evaluations including urine culture determined by findings obtained from the history and physical examination. 101 The Canadian Task Force recommends a urine culture at 12-16 weeks of pregnancy. 99

Discussion

Screening for asymptomatic bacteriuria is important during pregnancy, where there is strong evidence that treatment is efficacious in improving outcome. Given the benefits of detecting asymptomatic bacteriuria in pregnancy, prenatal testing should be carried out by urine culture (rather than by urinalysis) to reduce the risk of false negatives. A specimen obtained at 12-16 weeks will detect most cases of asymptomatic bacteriuria. There are, however, inadequate data to determine the optimal frequency of subsequent urine testing during pregnancy.

Screening for asymptomatic bacteriuria in school-age girls has been shown to produce little clinical benefit in controlled trials. The effectiveness of screening school-age boys for asymptomatic bacteriuria has not been evaluated, but because the prevalence is extremely low in this population and the specificity of screening tests is only about 80% in children, most positive tests will be false positives (estimated at 99.9% in one overview 57), with the potential for consequent adverse effects including unnecessary antibiotic therapy and invasive testing. Screening in infants, toddlers, and preschool children might be beneficial in preventing renal damage, but its effectiveness has not been established and cohort studies suggest little risk from untreated asymptomatic bacteriuria. In addition, no accurate and noninvasive screening test is available for infants or toddlers in diapers. Given an 80% sensitivity and specificity of current screening methods, and a 1% prevalence of asymptomatic bacteriuria in girls and 0.03% in boys, screening 100,000 children is estimated to result in 19,897 false-positive tests, or nearly 1 in 5 children screened. 57
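
The false-positive estimate above can be reproduced with simple arithmetic. The sketch below is illustrative only and is not from the report: it uses the 80% specificity and the 1% (girls) and 0.03% (boys) prevalences stated in the text, plus the additional assumption that the 100,000 children screened are half girls and half boys.

    def expected_false_positives(n, prevalence, specificity):
        # Children without bacteriuria who nonetheless test positive.
        return n * (1 - prevalence) * (1 - specificity)

    girls = expected_false_positives(50_000, 0.01, 0.80)    # 9,900
    boys = expected_false_positives(50_000, 0.0003, 0.80)   # 9,997
    print(round(girls + boys))                              # 19897, matching the figure cited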

Trials of routine screening have shown no benefit for institutionalized elderly persons and suggest the occurrence of adverse consequences such as unintended drug effects and increased reinfection rates. Screening is therefore not justified in this population. Screening urinalysis might be appropriate in certain high-risk groups, such as diabetic and noninstitutionalized elderly women, but firm evidence of benefit is not available. Several trials in ambulatory elderly women have found no clinical benefit from screening for asymptomatic bacteriuria, but sample sizes were small and do not exclude the possibility of important benefits. Potential benefits must be balanced against the high likelihood of reinfection after treatment in these groups and the adverse effects associated with antibiotic use. Screening is not justified in the general adolescent and adult population, or in ambulatory elderly men, because unrecognized, serious urinary tract disorders are uncommon, the positive predictive value of screening urinalysis is low, and the effectiveness of early detection and treatment is unproven.

CLINICAL INTERVENTION

Screening for asymptomatic bacteriuria with urine culture is recommended for pregnant women at 12-16 weeks of gestation ("A" recommendation). The optimal frequency for subsequent periodic urine cultures during pregnancy has not been determined and is left to clinical discretion. The urine specimen should be obtained in a manner that minimizes contamination. Routine screening for asymptomatic bacteriuria with leukocyte esterase or nitrite testing in pregnant women is not recommended because of poor test characteristics compared to urine culture ("D" recommendation).

There is currently insufficient evidence to recommend for or against routine screening for asymptomatic bacteriuria with leukocyte esterase or nitrite testing in ambulatory elderly women or in women with diabetes ("C" recommendation), but recommendations against such screening may be made on other grounds, including a high likelihood of recurrence and the potential adverse effects of antibiotic therapy. Routine screening for bacteriuria with leukocyte esterase or nitrite testing is not recommended for other asymptomatic persons, including school-aged girls ("E" recommendation), institutionalized elderly ("E" recommendation), and other children, adolescents, and adults ("D" recommendation). Screening for asymptomatic bacteriuria with microscopy testing is not recommended ("D" recommendation).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Carolyn DiGuiseppi, MD, MPH, based in part on materials prepared for the Canadian Task Force on the Periodic Health Examination by Michael B.H. Smith, MB, BCh, CCFP, FRCPC, and Lindsay E. Nicolle, MD.

32. Screening for Rubella --- Including Immunization of Adolescents and Adults

Burden of Suffering

Rubella is generally a mild illness; when contracted by pregnant women, however, especially those in the first 16 weeks of pregnancy, it frequently causes serious complications including miscarriage, abortion, stillbirth, and congenital rubella syndrome (CRS). 1,2 The 1964 rubella pandemic in the U.S. caused over 12 million infections, 11,000 fetal losses, and 20,000 cases of CRS in infants. 3 The most common manifestations of CRS are hearing loss, developmental delay, growth retardation, and cardiac and ocular defects. 1,2 The lifetime costs of treating a patient with CRS were estimated in 1985 to exceed $220,000. 3

Since 1969, when rubella vaccine became available in the U.S. and universal childhood immunization was initiated, no major periodic rubella epidemics have occurred. The incidence of reported cases has declined dramatically, to an estimated incidence rate of 0.1/100,000 population (192 cases) and an indigenous CRS incidence rate of 0/100,000 live births (no cases reported) in 1993. 4 Outbreaks of rubella infection have continued to occur, however; in 1991, for example, 1,401 rubella infections were reported (0.6/100,000), one third of which occurred among adolescents and young adults (ages 15-29 years), resulting in 31 cases of CRS (0.8/100,000). 4,5 Most recent outbreaks have occurred in settings where many unvaccinated children and young adults are gathered (e.g., religious communities that refuse vaccination, colleges, prisons, and work places), and among persons in specific racial/ethnic groups (e.g., Asians/Pacific Islanders and Hispanics) who are often unvaccinated. 4,6,7 The highest risk for CRS occurs in Amish women, for whom the rate in one Pennsylvania county was 14/1,000 live births in 1991, compared to 0.006/1,000 for the general U.S. population. 4

Accuracy of Screening Tests

One way to prevent rubella infection in adults is to screen for susceptibility, by serologic tests for antibodies or by vaccination history, and to administer vaccine to susceptible persons. Vaccine trials and cohort studies have shown that most patients with hemagglutination-inhibition (HI) antibody are protected from clinical disease. 8-10 HI is a labor-intensive test, however, and it can be associated with both false-positive and false-negative results. 1,8,11 Faster, more convenient laboratory methods (e.g., enzyme immunoassay and latex agglutination) have now replaced HI in most laboratories. 1,12 Using HI as the comparison standard, these tests have sensitivities of 92-100% and specificities of 71-100%. 11,13-15 The apparently low specificities of some newer methods are due to their ability to detect low levels of rubella antibody that are undetectable by HI methods and are therefore reported as "false positives." 1,16,17 There have been no controlled trials to determine if these low levels confer immunity against wild virus, 1 but other clinical and in vitro evidence suggests that they are protective. 16,18-22 These newer tests, therefore, appear to be both more accurate and more convenient than HI when performed in laboratories with demonstrated proficiency.

A history of rubella vaccination can identify many who may be protected. Despite a variety of design flaws in some of the available studies (such as selection biases and small sample sizes), most demonstrate that persons with a positive history of having received rubella vaccine are significantly more likely to be seropositive (median 92%, range 82-97%) than those without such a history (median 74%, range 62-83%). 18,23-30 A positive rubella vaccination history documented by vaccination card, school record, or medical record is more likely to be associated with seropositivity than is an undocumented history (although this difference was not statistically significant in some studies), 18,25-27 and it is therefore preferred. A positive history of rubella infection is substantially less likely to correctly predict rubella immunity than is a positive history of vaccination; 18,23-25 therefore, a history of infection is not adequate for determining susceptibility.

Effectiveness of Early Detection

Rubella vaccine, once administered, is efficacious. Efficacy studies in healthy vaccinees show that >=90% have protection against clinical rubella illness, 31-35 and seropositivity is long-lasting. 36-39 After the initiation of universal child immunization in 1969, the incidence of both rubella and CRS dropped markedly (see above). 1,4 Adverse reactions from the RA27/3 live attenuated rubella vaccine (the only rubella vaccine currently licensed in the U.S.) are generally mild in children. 40,41 Joint symptoms after vaccination are common in adults but rarely persist; the incidence is higher in women than men and increases with increasing age at vaccination. 1,9,42,43 Vaccination of persons who are already immune rarely induces the joint symptoms seen with primary immunization of susceptible adults. 44,45

Because an estimated 6-12% of the young adult population is seronegative, 30,46 and because CRS continues to occur in the U.S. despite recommendations for universal childhood vaccination (see Chapter 65), 4 it has been recommended by some authorities that clinicians also direct efforts toward vaccinating susceptible adolescents and young adults, particularly women of childbearing age. 1 Several factors may reduce the effectiveness of a strategy to prevent CRS by screening (with history of vaccination or serology) and vaccinating susceptibles. The screening test may falsely identify some susceptible persons as immune: of 21 infants with CRS in 1990, 71% of their mothers had had a positive serologic test and 43% gave a history of vaccination. 47 Persons correctly identified as susceptible may not be offered or may not accept the vaccine; vaccination rates after serologic screening in different populations have ranged from 37% to 88%. 18,24,26,27,48-56 Seronegative women are more likely than are seronegative men to accept immunization, 55,57 with the highest rates of follow-up vaccination (78-87%) occurring in susceptible postpartum women. 52-54

The effectiveness of a strategy of screening and follow-up vaccination to prevent CRS may be assessed by its effect on the incidence of CRS and of rubella infection and susceptibility in women of childbearing age. No controlled studies have evaluated the effectiveness of screening and vaccinating susceptible persons in reducing the incidence of CRS. CRS occurrence has decreased over time in some, but not all, countries that have employed selective vaccination of susceptible adolescent and adult females as their sole strategy to reduce CRS. 58-60 Evidence that screening and follow-up vaccination can reduce the likelihood of rubella infection was provided by a severe rubella outbreak in Iceland, where nearly identical rates of protection from infection occurred in screened and immunized (98.5%) and in naturally immune (99%) schoolgirls. 61 Evidence regarding rubella susceptibility is supplied by a cohort study from Scotland. Six to seven years after a screening program for schoolgirls took place, 98.7% of girls who had originally been naturally immune had circulating antibodies, compared to 95.1% of those who had been vaccinated as susceptibles and 42.8% of a small group of susceptibles who had refused vaccination. 62 Case series from Iceland 61,63 and cross-sectional studies from Great Britain 52,64 also show a reduction in susceptibility among women of childbearing age using this strategy. There is thus fair evidence that screening and immunizing susceptible females of childbearing age reduces both rubella susceptibility and infection and, by inference, CRS.

An alternative strategy to prevent rubella infection in women of childbearing age is routine vaccination without screening. In addition to protecting those who have not been previously vaccinated, such a strategy would eliminate most susceptibility due to primary vaccine failure (failure to develop antibodies after initial vaccination). Primary vaccine failure occurs in 2-5% of RA27/3 vaccine recipients, 65-70 and a second rubella vaccination results in seroconversion in most cases. 9,18 Antibodies have been found in 99.2% of schoolchildren after two doses of rubella vaccine, compared to 94.6% after one dose. 28 In Sweden and Finland, vaccine programs in which all adolescent girls are routinely immunized (as well as all children at age 14-18 months) have been associated with substantially reduced occurrence of both seronegativity and of rubella infection in female compared to male adolescents and adults. 71,72 These data provide fair evidence for routine vaccination of all nonpregnant women of childbearing age to reduce rubella susceptibility and infection and, therefore, CRS.

The rubella vaccine is contraindicated during pregnancy because of the theoretical possibility of teratogenicity, although there have been no reported cases of rubella vaccine-related birth defects in the United States after inadvertent vaccination of 321 susceptible pregnant women within 3 months of conception. 1 Similarly reassuring results have been reported from Great Britain and Germany. 73,74 Based on reported data, the true risk for CRS in susceptible women vaccinated during pregnancy using the RA27/3 vaccine may be zero, and the probability is 95% that the true risk is less than 1.7%. 75 Because a measurable iatrogenic risk cannot be excluded, however, vaccination of susceptible women who are known to be pregnant should be postponed until the postpartum period. 75 The virus has been isolated in breast milk and in breast-fed infants after postpartum vaccination, 76 but no adverse consequences from such exposure have been reported. 76,77 A greater disadvantage of postpartum immunization is that it often occurs too late to prevent CRS; 61% of reported cases have occurred with the first live birth. 78

In settings where large numbers of young adults are gathered (e.g., military bases and colleges), outbreaks of rubella are not uncommon, and males and females are infected at similar rates. 79-82 Rubella screening or routine vaccination of young men in such settings might reduce the risk of spreading rubella to susceptible pregnant women. There is weak evidence from a single before-after study that universal rubella screening and follow-up vaccination of military recruits is effective in preventing rubella infection and eliminating epidemic rubella. 83 A small cohort study using the older Cendehill vaccine found that routine vaccination of young male military recruits reduced rubella susceptibility, clinical disease, and viral shedding. 10 In a before-after study of 256 college athletes (62% male) screened serologically with follow-up vaccination of susceptibles, the proportion with documented immunity by serology increased from 93% to 96%, and 8 of the remaining 9 seronegative students were vaccinated but did not receive follow-up testing. 84 There is, however, no direct evidence that either screening or routine vaccination of males in these settings reduces CRS. For young men not living in such settings, no evidence was found to support either screening or routine vaccination in reducing susceptibility, infection, or CRS.

There are few data concerning rubella screening or vaccination in older men or in women past childbearing age. Because men ages 40 years and older and postmenopausal women account for only a small proportion (<10%) of recent rubella cases, 5,85 have a high rate of natural immunity (85-95%), 59,86 have a greater likelihood of postvaccine joint reactions, 9 and are at little direct risk if they do become infected, routine screening or vaccination of this population does not seem to be justified despite the fact that these persons might, on rare occasions, transmit rubella to susceptible women of childbearing age.

Recommendations of Other Groups

The American Academy of Pediatrics (AAP), 87 American College of Obstetricians and Gynecologists (ACOG), 88 American College of Physicians, 89 and Advisory Committee for Immunization Practices (ACIP) 1 recommend vaccinating all adolescents and adults (particularly women and persons in colleges, health care settings, and military institutions) who have no contraindications and who lack documented evidence of either rubella immunization on or after the first birthday or of serologic evidence of immunity. Routine serologic testing of men and nonpregnant women is not recommended by these organizations. The American Medical Association 90 and Bright Futures 91 recommend rubella vaccination (as measles-mumps-rubella [MMR]) for all adolescents who have not had two previous MMR vaccinations. The American Academy of Family Physicians recommends rubella antibody testing in all women of childbearing age who lack evidence of immunity. 92 AAP, 87 ACOG, 89 and ACIP 1 recommend routine prenatal or antepartum serologic screening of all pregnant women not known to be immune, and postpartum vaccination of those found to be susceptible. The Canadian Task Force on the Periodic Health Examination recommends serologic screening of women of childbearing age, with vaccination of seronegative nonpregnant women immediately and seronegative pregnant women after delivery. They also recommend universal vaccination of women of childbearing age without screening as an acceptable alternative. The Canadian Task Force does not recommend for or against universal vaccination of young men in settings where large numbers of young persons are gathered. 93

Discussion

When administered to children, the current rubella vaccine is efficacious in the induction of rubella immunity and in the prevention of rubella infection and CRS. Recent cases of rubella and CRS have been associated with outbreaks among groups of unvaccinated persons, leading to infections of unvaccinated pregnant women. 4,7 The added coverage provided by the two MMR vaccinations many will receive during childhood to meet current recommendations for measles immunization (see Chapter 65) should eliminate most primary vaccine failures, and will increase the rate of primary immunization among women of childbearing age. Therefore, the incidence of CRS will probably decline as the current cohort of highly immunized female children and adolescents enters its childbearing years.

In the intervening years, however, many women of childbearing age will remain unimmunized and, therefore, susceptible to rubella infection. Universal screening and follow-up vaccination of susceptible females would reduce rubella susceptibility, infections, and CRS; however, the effectiveness of this strategy in the clinical setting may be limited by incomplete screening, imperfect screening tests, and failure to vaccinate susceptibles. Routine vaccination of all women of childbearing age, without screening, also seems to be effective in reducing rubella infections; it avoids the problem of noncompliance with return visits, and if given as MMR also provides immunity to other infectious diseases, but it results in vaccination of many women who are already immune. Because the adverse effects of vaccinating immune persons appear to be minimal, cost and convenience are likely to be the determining factors in deciding which strategy should be used. In one study, the most cost-effective strategy was record review followed by vaccination, if at least 75% of patients had records available; otherwise, vaccination of all persons without screening was most cost-effective. 23 On the other hand, a study from Iceland found that serologic screening of females ages 12-40 followed by vaccination of seronegatives and follow-up retesting was more cost-effective than routine vaccination. 94 These estimates are sensitive to the prevalence of immunity, compliance with follow-up, and the costs of screening, vaccine, and follow-up.
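
To make this tradeoff concrete, the following Python sketch compares the cost per cohort and the number of women left susceptible under the two strategies. Every parameter value below is hypothetical and chosen only for illustration; the cited studies' actual costs and compliance rates are not reproduced in this chapter.

    def screen_then_vaccinate(n, p_immune, p_return, c_test, c_vaccine):
        # Test everyone; vaccinate susceptibles who return for follow-up.
        susceptible = n * (1 - p_immune)
        vaccinated = susceptible * p_return
        cost = n * c_test + vaccinated * c_vaccine
        return cost, susceptible - vaccinated  # (total cost, women left unprotected)

    def vaccinate_all(n, c_vaccine):
        # Vaccinate everyone without screening; no one is left susceptible.
        return n * c_vaccine, 0.0

    # Hypothetical inputs: 1,000 women, 88% already immune, 60% follow-up
    # return rate, $10 per serologic test, $25 per vaccination.
    print(screen_then_vaccinate(1000, 0.88, 0.60, 10.0, 25.0))  # (11800.0, 48.0)
    print(vaccinate_all(1000, 25.0))                            # (25000.0, 0.0)

As the chapter notes, which strategy wins depends on the immunity prevalence, follow-up compliance, and relative costs; with a cheaper vaccine or poorer follow-up, vaccinating without screening becomes more attractive.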

Whether either strategy (screening for susceptibility or routine vaccination of women of childbearing age) is justified by expected benefits compared to costs is not clear. An analysis of a premarital rubella screening program found that costs did not justify benefits unless at least 85% of seronegatives were vaccinated. 95,96 Variation in the cost of the screening tests and vaccines, the prevalence of immunity, and the likelihood of rubella exposure will influence these results, however. The impact and benefit-cost ratio of strategies to reduce rubella susceptibility are likely to be greatest in settings where many women are unvaccinated (and are therefore at higher risk for acquiring rubella), such as certain religious communities and communities with many unimmunized immigrants from developing countries. Cost-benefit analyses concerning rubella screening and vaccination of women in various settings are needed.

CLINICAL INTERVENTION

All children without contraindications should receive MMR vaccine at age 12-15 months and again at age 4-6 years (see Chapter 65). To reduce further the incidence of CRS, screening for rubella susceptibility by history of vaccination or by serology is recommended for all women of childbearing age at their first clinical encounter ("B" recommendation). A documented history of vaccination is more accurate than an undocumented history in determining rubella immunity and is therefore preferred. All susceptible nonpregnant women of childbearing age should be offered vaccination. Susceptible pregnant women should be vaccinated in the immediate postpartum period. An equally acceptable alternative for nonpregnant women of childbearing age is to offer vaccination against rubella without screening ("B" recommendation). The decision of which strategy to use should be tailored to the individual clinician's practice population, depending on the availability of vaccination records, the reliability of the vaccination history, the rate of immunity, the cost of serologic testing, and the cost and likelihood of follow-up vaccination for susceptible persons identified by serologic testing.

There is insufficient evidence to recommend for or against routine screening or vaccination of young men to prevent CRS in settings where large numbers of susceptible young adults of both sexes congregate, such as military bases and colleges ("C" recommendation). Recommendations to give MMR vaccine in these settings may be made on other grounds, however, such as prevention of measles (see Chapter 66). Routine screening or vaccination of other young men, of older men, or of postmenopausal women, is not recommended ("D" recommendation).

Guidelines for the administration of MMR vaccine, and its contraindications, have been published by ACIP. 1 The National Childhood Vaccine Injury Act requires that the date of administration, the manufacturer and lot number, and the name, address, and title of the person administering the vaccine be recorded in the patient's permanent medical record (or in a permanent office log or file). 97

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Carolyn DiGuiseppi, MD, MPH.

33. Screening for Visual Impairment

Burden of Suffering

Preschool Children.

Undetected vision problems are common in preschool children, with an estimated prevalence of 5-10%. 1 About 2-5% suffer from amblyopia ("lazy eye," a loss of vision due to disuse) and strabismus (ocular misalignment) which, aside from congenital conditions, usually develop between infancy and ages 5-7. 2-4 In the newborn, risk factors for developing strabismus or amblyopia include a family history of ocular malformations, anisometropia (a large difference in refractive power between the two eyes, more than 4 diopters in sphere and/or 2 diopters in astigmatism), congenital cataracts, ocular tumors, premature birth, or birth to a mother who suffered from infection such as rubella, genital herpes, or toxoplasmosis during pregnancy. Since normal vision from birth is necessary for normal binocular development, failure to detect and treat amblyopia, marked anisometropia, or strabismus at an early age may result in irreversible visual deficits. Resulting permanent amblyopia and cosmetic defects may lead to later restrictions in educational and occupational opportunities. 5,6 Patients with amblyopia are at increased risk of blindness from loss of vision in their good eye. 6a

School-Aged Children.

Data are limited regarding the prevalence of uncorrected refractive errors and previously undiagnosed vision problems in elementary school-aged and adolescent children. A community-based examination of all first- to third-grade children in 1984 found visual acuity of 20/30 or better in the better eye in 94-95% of the schoolchildren; 7%, 9%, and 9% of children in first, second, and third grades, respectively, had glasses prescribed. Two percent of children for whom glasses were prescribed were not wearing them. 7 Refractive errors, which often become manifest during school age, rarely carry any serious prognostic implications. Experts disagree on whether an uncorrected refractive error that would be detected by screening has any adverse effects on academic performance in school-aged children. 7,8

Adolescents and Adults.

Refractive errors are the most common visual disorder in the adolescent and adult population. In a study of undetected eye disease in a primary care population (94% African-American), 21% of patients ages 40-59 were diagnosed with an eye disease of which they were not aware. 9 The majority of these cases, however, were not detected by acuity screening (e.g., glaucoma or diabetic retinopathy), most were mild or previously diagnosed, and few required immediate treatment. There are no data to determine the incremental benefit of routine screening of adults to detect early refractive errors compared to waiting for patients to present with complaints of vision problems.

Elders.

Visual impairment is a common and potentially serious problem among older people. Personal safety may be compromised; the risk of falling is increased. 10 The rate ratio for fatal car crashes in the elderly is lower in states where vision testing is required for persons over 65 than in states where it is not required. 11 While a reduction in visual acuity may be noticed by an individual, underreporting is common. One small study of patients attending a geriatric day care center showed that one third had unrecognized severe visual loss. 12 Surveys have revealed that up to 25% of older people are wearing inappropriate visual correction. 13 The Baltimore Eye Survey reported that more than half of the 5,300 persons screened had improved vision after refraction and appropriate corrective lenses. 14 In the Beaver Dam Eye Study, visual acuity with current correction was worse than 20/40 in 5% of persons aged 65-74, and worse than 20/40 in 21% of those 75 years of age or older; the proportion with correctable poor acuity was not reported. 15 A 1995 study found that uncorrected vision problems are common among nursing home residents. Among 499 residents, 17% had bilateral blindness (acuity <=20/200) and 19% had impaired vision (<20/40); a substantial proportion of vision problems in this population could have been remedied by adequate refractive correction or treatment of cataracts. 15a

The most common causes of visual impairment in the elderly include presbyopia, cataract, age-related macular degeneration (ARMD), and glaucoma (see Chapter 34). In persons over age 75 years, 5% have exudative macular degeneration, and 5% have glaucoma. 16-18 The prevalence of cataract increases with age. In persons aged 55-64 years, the Beaver Dam Eye Study found 33% with early cataract and 6% with late cataract; in persons over 75 years, these prevalences were 37% and 52%. 17 The frequency of visually significant cataract is higher in women than in men. 17 The causes of blindness vary by race, with whites being more commonly afflicted with macular degeneration and blacks having a higher prevalence of untreated cataract and open-angle glaucoma. 19

Accuracy of Screening Tests

Preschool Children.

Despite the importance of early childhood screening for strabismus and amblyopia, detecting occult visual disorders by screening tests in children under 3 years of age has generally been unsuccessful. Obstacles to screening include the child's inability to cooperate, the time required for testing, and inaccuracy of the tests. 20,21 Some of the techniques for this age group, such as preferential looking, grating acuity cards, refractive screening, and photographic evaluation, have not yet been proven effective. 22,23

Screening tests for detecting strabismus and amblyopia in the 3-5-year-old child include simple inspection, cover test, visual acuity tests, and stereo vision assessment. Although it is widely recommended, 24 reports of the sensitivity and specificity of the cover test performed by primary care providers are not available. Visual acuity tests for children include the Snellen chart, the Landolt C, the tumbling E, the Allen picture cards, grating cards, and others. 25 The specificity of any acuity test for detecting strabismus or amblyopia is imperfect, as other conditions may be the cause of the diminished acuity. Snellen letters are estimated to have a sensitivity of only 25-37%. 26 Refractive screening is not a test for strabismus or amblyopia per se, but may be used to identify amblyogenic risk factors (e.g., anisometropia, or severe hyperopia [farsightedness]). 27

The Modified Clinical Technique (MCT) includes retinoscopy, cover testing, quantifying ocular misalignment, Snellen acuity, color vision assessment, and external observation. 28 Preferential looking (PL) has been substituted for Snellen acuity in the MCT without loss of predictive power, and with an increase in the percentage of young children able to complete the test. 23 The MCT, despite a high sensitivity and specificity, cannot be used routinely by primary care physicians for screening because it takes on average about 12 minutes to perform and requires skills and instrumentation not typically found in this setting.

Stereograms such as the Random Dot E (RDE) have been proposed as more effective than visual acuity tests in detecting strabismus and amblyopia. 25,29 The test, in which the child wears Polaroid glasses while viewing the test cards, takes about 1 minute. The RDE has an estimated sensitivity of 54-64%, specificity of 87-90%, positive predictive value of 57%, and negative predictive value of 93%. 30,31

An evaluation of a preschool vision screening program comprising visual inspection, acuity assessment, and evaluation of stereoacuity, found a combined negative predictive value of 99% for amblyopia, strabismus, and/or high refractive errors. 32 A similar program, evaluated with limited use of definitive examinations, reported a positive predictive value of 72% for screening. 33 A positive screening test does not ensure adequate follow-up. In one practice-based study, nearly half of parents of children who had a positive screen were unaware of that result 2 months later; 15% of children referred to a specialist did not make or keep the subsequent appointment. 34

School-Aged Children.

The public school system in most states has taken on the responsibility of vision screening in school-aged children and making referrals to eye care specialists. In 1992, all but 12 states had mandatory or regulated screening of elementary school-aged children. Screening of visual acuity is generally accomplished with standard Snellen vision charts. Although referral criteria and procedures vary widely, school screening may have a false-positive rate of 30% or more. 35,36

Elders.

Asking screening questions about visual function has yielded mixed results when compared to use of a Snellen acuity chart. The question "Do you have difficulty seeing distant objects?" had sensitivity of 28% in detecting visual acuity worse than 20/40. 37 "(When wearing glasses) Can you see well enough to recognize a friend across the street?" had sensitivity of 48%. 38 A similar question showed lower sensitivity for visual impairment as part of the HANES 1971-72 survey. 39 A brief questionnaire using an additive score formed from three similar questions was found to have sensitivity of 86% and specificity of 90% for visual acuity worse than 20/40 in a combined sample of 248 persons aged 45 years and older selected at random from a community population, and a convenience sample of 118 diabetics from the Wisconsin Epidemiologic Survey of Diabetic Retinopathy. 40

Impaired visual acuity is readily detected by use of a Snellen chart. Cataracts are detectable by ophthalmoscopy, even by relatively inexperienced health professionals. There are few data on sensitivity and specificity of these examinations in the primary care setting. Funduscopy may reveal characteristic changes of ARMD. While these abnormalities are readily recognized by ophthalmologists and optometrists trained in funduscopy, no studies of the sensitivity of funduscopy by primary care physicians were found in a computerized literature search. 41 Case reports support the usefulness of the Amsler grid to detect early detachment of the retinal pigment epithelium at a point when immediate treatment may be beneficial, but compliance with testing is poor. 42,43

Effectiveness of Early Detection

Preschool Vision Problems.

There is fair evidence based on animal models, and case series and case-control studies in humans, that early detection and treatment of amblyopia and strabismus in infants and young children improves the prognosis for normal eye development. 24,44-49 The success of intervention may be dependent on age, with increased likelihood of attaining normal or near-normal vision with earlier detection and treatment; the older the patient, the longer the duration of treatment needed. In a prospective study of visual acuity screening in matched cohorts of over 700 preschool children, those who were screened had significantly less visual impairment than the controls when reexamined 6-12 months later. 50

Vision Problems in School-Aged Children, Adolescents, and Nonelderly Adults.

There is little evidence that early detection of refractive errors is associated with important clinical benefits, compared with testing based on symptoms. A common justification for regular screening in school-aged children is the concern that undetected vision problems are an important cause of academic difficulty, but there is no evidence that routine screening has important benefits in terms of academic performance. 51,52

Vision Problems in Elders.

Refractive errors are readily correctable with eye glasses or contact lenses. Following refraction and correction, 54% of subjects in the Baltimore Eye Survey improved their visual acuity by at least one line on the Snellen chart and 8% improved by three lines or more. While the impact on physical and social function of these improvements is unknown, it has been demonstrated that restoration of vision following cataract surgery leads to subjective improvements in a variety of vision-related functions, as well as improvements in objective measures of physical and intellectual function. 53

Although ophthalmologists use differing criteria to determine the optimal time to remove cataracts, a general rule is that surgery should be considered when an otherwise well patient feels that there is a significant impairment to daily life caused by the vision loss. While there are theoretical reasons to believe that earlier referral to an ophthalmologist is desirable for assessment of retinal disease prior to obliteration of the view of the fundus by advancing cataract, in practice most individuals will complain of visual loss and be treated before this occurs.

Randomized clinical trials have shown a beneficial effect of argon laser photocoagulation of choroidal neovascular membranes in selected cases of ARMD. 54 Controlled trials with other wavelengths of light (e.g., krypton) are currently underway. Medical therapy for ARMD, with zinc supplements or interferon, has been reported as case series, but it has not yet been evaluated more rigorously. 55,56

Recommendations of Other Groups

The Canadian Task Force on the Periodic Health Examination (CTF) 57 concluded that there is fair evidence to recommend visual acuity testing of preschool children. The American Academy of Ophthalmology (AAO), 58 American Optometric Association, 59 American Academy of Pediatrics (AAP), 60 and Bright Futures 61 each recommend examining newborns and infants for ocular problems and screening visual acuity and ocular alignment at age 3 or 4 in children, and every 1-2 years thereafter through adolescence. New guidelines for vision screening in children, outlining which tests to use and criteria for referral, have been developed by the AAP Section on Ophthalmology, in conjunction with the AAO and the American Association for Pediatric Ophthalmology and Strabismus. 62 The American Academy of Family Physicians (AAFP) recommends that all children be screened for eye and vision abnormalities at 3-4 years of age and that clinicians remain alert for vision problems throughout childhood and adolescence. 63

Periodic comprehensive eye examinations including acuity testing are recommended for all adults by the American Optometric Association and by Prevent Blindness America (formerly National Society to Prevent Blindness) 59 and for adults over age 40 by the American Academy of Ophthalmology. 64 The CTF 41 and the AAFP 63 advise routine screening of visual acuity only for individuals age 65 and over. AAFP recommendations on vision screening are currently under review.

Discussion

No prospective trial has directly assessed the benefits of routine preschool vision screening, but animal models and observational studies provide fair evidence that earlier detection and treatment improve the outcomes in children with strabismus and amblyopia. Screening and early referral are recommended for infants and preschool children in the primary care setting. The optimal age for screening cannot be determined from direct evidence. The recommendation to screen at ages 3-4 years is based primarily on expert opinion, and reflects a compromise between the inability of younger children to cooperate fully with screening and the goal to detect and treat the conditions as early as possible.

Screening older children, adolescents, and adults is less likely to detect vision problems that require early intervention. Although routine screening in asymptomatic persons may detect some persons with early refractive errors, these are readily corrected when patients become symptomatic. It is not certain that the incremental benefit of early detection (compared to evaluation when patients complain of change in vision) is sufficient to justify the costs and inconvenience of routine testing. Any patient with ocular symptoms, however, should be advised to see an eye care specialist.

Vision problems are more prevalent in persons over 65, and they are more likely to lead to serious consequences such as accidental injuries. Questioning elderly patients about vision problems is less sensitive than directly assessing visual acuity. Although the effect on functional outcomes of periodic screening with Snellen chart acuity testing in the elderly has not been directly assessed, there is fair evidence that routine screening leads to improvements in measured acuity, and there is little chance of serious harm from screening. The role of routine screening with funduscopy by the primary care provider is less certain. Funduscopy is likely to be more sensitive than acuity testing for detecting persons with exudative ARMD, especially those with early disease, who may benefit from photocoagulation therapy. The sensitivity and specificity of funduscopy by primary care providers for ARMD is unknown, however.

CLINICAL INTERVENTION

Vision screening for amblyopia and strabismus is recommended for all children once before entering school, preferably between ages 3 and 4 years ("B" recommendation). Clinicians should be alert for signs of ocular misalignment when examining all infants and children. Stereoacuity testing may be more effective than visual acuity testing in detecting these conditions.

There is insufficient evidence to recommend for or against routine screening for diminished visual acuity among asymptomatic schoolchildren and nonelderly adults ("C" recommendation). Recommendations against such screening may be made on other grounds, including the inconvenience and cost of routine screening, and the fact that refractive errors can be readily corrected when they produce symptoms.

Routine vision screening with Snellen acuity testing is recommended for elderly persons ("B" recommendation). The optimal frequency for screening is not known and is left to clinical discretion. Selected questions about vision may also be helpful in detecting vision problems in elderly persons, but they do not appear as sensitive or specific as direct assessment of acuity. There is insufficient evidence to recommend for or against routine screening with ophthalmoscopy by the primary care physician in asymptomatic elderly patients ("C" recommendation).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Joseph N. Blustein, MD, MS, and Dennis Fryback, PhD, based in part on materials prepared for the Canadian Task Force on the Periodic Health Examination by Christopher Patterson, MD, FRCP, and John W. Feightner, MD, MSc, FCFP.

34. Screening for Glaucoma

Burden of Suffering

Glaucoma is a disorder defined by slowly progressive loss of vision in association with characteristic signs of damage to the optic nerve. Selective death of retinal ganglion cells leads to the gradual enlargement of the optic cup and loss of vision (beginning with peripheral vision) that are typical of glaucoma. 1 Increased intraocular pressure (IOP) is common in glaucoma and is believed to contribute to the damage to the optic nerve, but it is no longer considered a diagnostic criterion for glaucoma. Glaucoma is the second leading cause of irreversible blindness in the U.S., and the leading cause among African Americans. 2,3 Of the various forms of glaucoma (e.g., congenital, open-angle, closed-angle, secondary), primary open-angle glaucoma (POAG) is the most common in the U.S. (80-90% of cases) 4 and is estimated to be responsible for impaired vision in 1.6 million Americans and blindness in 150,000. 1,4 Annual office visits for glaucoma increased from roughly 2 million in 1975 to almost 9 million in 1992. 4a POAG is usually asymptomatic until irreversible visual field loss has occurred. One study reported that over the course of 20 years, blindness may develop in up to 75% of persons with glaucoma. 5 There are few data, however, on the natural history of disease in persons with mild visual field defects detected by screening.

The prevalence of glaucoma is 4-6-fold higher in blacks than in whites, and it increases steadily with age: among whites, glaucoma is present in 0.5-1.5% of persons under age 65 and 2-4% of those over 75; 6,7 among blacks, 1.2% of 40-49-year-olds and 11.3% of those over 80 have glaucoma. 8 Prevalence of glaucoma is increased in patients with diabetes mellitus, myopia, and a family history of glaucoma. 1 A much larger number of persons have ocular hypertension (usually defined as an IOP > 21 mm Hg), which is a strong risk factor for developing glaucoma. Ocular hypertension is present in 7-13% of the general population, with prevalence increasing with age. 3 In the Framingham Study, one fourth of men and women over age 65 had ocular hypertension. 9 The risk of progressing to glaucoma varies directly with the level of IOP and the duration of follow-up: the proportion of persons developing visual deficits within 5 years was less than 1% for normal IOP (<21 mm Hg), 3-10% for IOP >= 21 mm Hg, 6-16% for IOP > 25 mm Hg, and 33% for IOP > 30 mm Hg. 10 Untreated individuals with moderate ocular hypertension (mean IOP 24-26 mm Hg) developed new visual deficits (based on sensitive measures) at a rate of 3-4% per year in recent trials. 11-13 Among patients with untreated ocular hypertension followed for 17-20 years in older series, over 30% developed clinical glaucoma. 14,15
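
The annual rates observed in these trials are broadly consistent with the multiyear risks cited above if one assumes an approximately constant annual rate of progression. The following sketch (an illustration under that assumption, not a calculation from the cited studies) shows how a 3-4% annual rate compounds over 5 years:

    # Illustration only: compound an assumed constant annual rate of new
    # visual deficits into a cumulative 5-year risk.
    annual_rate = 0.035  # midpoint of the 3-4% per year reported in recent trials
    years = 5
    cumulative_risk = 1 - (1 - annual_rate) ** years
    print(f"Approximate 5-year risk: {cumulative_risk:.1%}")  # about 16%

The result, roughly 16%, falls near the upper end of the 6-16% 5-year risk cited for IOP > 25 mm Hg.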

Accuracy of Screening Tests

There are two potential targets for screening among asymptomatic persons: individuals who have normal vision but are at increased risk for developing glaucoma (i.e., "glaucoma suspects"), and those who have undetected visual field defects (i.e., undiagnosed glaucoma). Up to 50% of persons with glaucomatous visual deficits detected by screening are unaware of their diagnosis. 8

The three most common screening tests for glaucoma are tonometry, ophthalmoscopy, and perimetry. Tonometers, which include Schiötz, applanation, and noncontact (air puff) devices, are used to measure intraocular pressure. The accuracy and reliability of tonometry are affected by the choice of device, the experience of the examiner, and physiologic variables in the patient. 10,16 The more fundamental problem with tonometry as a screening test is the limited sensitivity and specificity of elevated IOP for current or future cases of glaucoma. Many patients with ocular hypertension (perhaps more than 70%) will never develop vision problems due to glaucoma. 14,15 Isolated measurements of IOP are also insensitive for glaucoma: only half of all patients with documented glaucoma have IOP greater than 21 mm Hg on random measurement, due in part to fluctuations in IOP over time. 4,17 There is no single cutoff value of IOP that provides an acceptable balance of sensitivity and specificity for screening. 1 In the Baltimore Eye Survey, a cutoff of IOP > 18 mm Hg had a sensitivity and specificity of 65% for definite or probable glaucoma; raising the cutoff to 21 mm Hg improved specificity to 92%, but lowered sensitivity to 44%. 4 In population screening, where the prevalence of glaucoma is relatively low, less than 5% of those with ocular hypertension will have documented glaucoma. 18
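
The implication of these figures for population screening can be made concrete with a simple predictive-value calculation. The sketch below uses the Baltimore Eye Survey estimates at the 21 mm Hg cutoff and an assumed glaucoma prevalence of 1%, a hypothetical value chosen only for illustration:

    # Sketch: positive predictive value (PPV) of an IOP > 21 mm Hg screen,
    # using the cited sensitivity and specificity and an assumed 1% prevalence.
    sensitivity = 0.44
    specificity = 0.92
    prevalence = 0.01  # illustrative assumption

    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    ppv = true_positives / (true_positives + false_positives)
    print(f"PPV: {ppv:.1%}")  # roughly 5%

Under these assumptions, only about 1 in 20 positive screens represents documented glaucoma, consistent with the figure cited above.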

A second screening test for POAG is direct ophthalmoscopy or slit-lamp examination, which can detect the changes in the optic nerve head (e.g., cupping, pallor, hemorrhage) that often precede the development of visual deficits in glaucoma. Examining the optic disk to screen for glaucoma in the primary care setting is limited by considerable interobserver variation in interpretation of funduscopic findings, even among experts using standardized criteria. 19 Ophthalmologists using direct ophthalmoscopy alone detected fewer than one half of all cases of glaucoma. 20 Primary care clinicians with less skill in ophthalmoscopy and less time to dilate pupils would be expected to have poorer accuracy. Qualitative evaluation of stereoscopic photographs of the optic disk is more sensitive, 21 and disk photography allows for precise measures of disk parameters, which may provide evidence of glaucomatous nerve damage (e.g., vertical and horizontal cup-disk ratios, neuroretinal rim width). No combination of parameters, however, adequately discriminates patients with glaucoma from normal subjects. In the Baltimore survey, various combinations of disk parameters, IOP, and family history had only moderate sensitivity (49-66%) and specificity (79-87%) for glaucoma. 4 Neither slit-lamp examination nor optic disk photography is routinely available in the primary care setting.

The third method of screening for POAG is perimetry, in which patients respond to visual stimuli of varying brightness presented in various locations in their visual field. Reproducible visual field defects currently represent the "gold standard" for diagnosing glaucoma, but diagnostic testing with automated perimetry may take more than 45 minutes and is not feasible for screening. 1 Modified testing strategies can reduce the time needed for screening, but they are less sensitive and specific for glaucoma. Evaluations of these devices report a sensitivity in excess of 90% and a specificity of 70-88%. 17,22,23 False-positive results can be caused by visual disorders other than glaucoma and by unfamiliarity of patients with the testing process. Due to expense and technical difficulties, automated perimetry is not practical for routine use in the primary care setting. Moreover, visual field loss is often a late event in the natural history of glaucoma: by the time visual deficits are evident, up to 50% of nerve fibers may have been lost. 24 Newer techniques (e.g., computer-assisted imaging, specialized photographic methods) for assessing changes in the optic nerve may prove more sensitive for early injury, but they are currently too complicated or expensive to be used for routine screening. 1,17

Effectiveness of Early Detection

Visual deficits due to glaucoma are not generally reversible, but early treatment is widely believed to prevent or delay the progression to more serious vision problems. The assumption that lowering intraocular pressure improves outcome in patients with glaucoma is based primarily on indirect evidence, however: the strong association between level of intraocular pressure and risk of POAG, the deleterious effects of raised IOP in secondary glaucoma and in animal models, and the progressive nature of untreated glaucoma. Controlled studies of treatment of glaucoma have generally compared different modes of therapy with each other, rather than comparing treatment to no treatment. 25 The majority of patients experience continuing loss of vision despite treatment, however, and change in IOP does not reliably distinguish patients who progress on treatment from those with stable disease. 26,27 A few observational studies have reported a higher incidence of disease progression in those receiving treatment than in untreated patients, but these findings are probably biased by more severe disease in treated subjects. 10 Some indirect evidence of treatment effectiveness is provided by a report from Denmark of declining incidence of blindness due to glaucoma over the past 30 years. 28 The disparity in the rates of glaucoma and glaucoma blindness among white and black Americans may also reflect greater access to effective treatment among whites, although the higher prevalence of glaucoma among blacks may have a biologic basis as well. 2,29 Nonetheless, for many patients who would be detected by screening, especially those with mild visual field defects and moderate elevations of intraocular pressure, the natural history of disease and the benefits of early treatment remain uncertain. The Early Manifest Glaucoma Trial, currently under way in Sweden, is randomizing such patients to early treatment with medications and laser therapy or no initial treatment (M.C. Leske, personal communication, Stony Brook, NY, March 1995).

A larger number of controlled studies has been conducted among patients with elevated IOP but no visual deficits. Early trials suffered from various methodologic problems, including small size, insufficient follow-up, or use of less reliable methods for determining visual changes. 30-33 Three recent, well-designed studies have compared ocular timolol treatment to no treatment (or placebo) in patients with normal visual fields and moderate elevations of IOP (<35 mm Hg, mean 24-26 mm Hg). These studies each enrolled larger numbers of patients, followed subjects for 4-8 years, and used automated perimetry to detect or confirm new visual field deficits. A study by Kass et al. randomized one eye to active treatment (and one to placebo) in 62 patients. After 5 years of treatment, new visual deficits developed in 4 timolol-treated eyes and 10 placebo-treated eyes, a result of borderline statistical significance. 11 Systemic effects of timolol on placebo-treated eyes may have diminished the apparent benefit of treatment. A second study by Epstein et al. randomized 107 patients to timolol or placebo: 9 patients on placebo (vs. 4 on active treatment) developed new visual field defects. The benefits of treatment were of borderline significance (p = 0.07) using a combined endpoint of visual field changes, increase in cup-disk ratio, or progression to more severe intraocular hypertension (IOP > 32 mm Hg). 12 In contrast, Schulzer et al. found no benefit of timolol treatment, despite enrolling more subjects and more effectively lowering IOP than previous trials (mean reduction 4.5 mm Hg). 13 Over a 6-year study, there were no differences between treated and untreated subjects in the progression to new visual field deficits, disk hemorrhage, or change in photographic appearance of the optic disk. Neither mean IOP nor change in IOP predicted progression of disease in subjects using timolol. The power of each of these trials was reduced by substantial dropout rates among treated subjects (up to 25%).

A meta-analysis of these three trials estimated that treatment reduces the proportion of patients who develop new visual deficits by 25%, but it could not rule out a possible harmful effect of treatment. 25 The difficulty in demonstrating a significant effect in previous clinical studies may be due in part to variations among individuals in their sensitivity to raised IOP, modest effects of treatment on IOP, and poor long-term compliance with therapy. Due to continuing uncertainty about the benefits of treating moderate, isolated intraocular hypertension, a new, large randomized trial is now under way. 34

The adverse effects of glaucoma treatment are potentially significant. Antiglaucoma medications must be taken for life and are accompanied by a variety of side effects. Eye drops containing cholinergic agonists (e.g., pilocarpine, carbachol, and echothiophate) and adrenergic agonists (epinephrine and dipivefrin) can cause ocular and systemic side effects; topical beta blockers (e.g., timolol, levobunolol, metipranolol, and betaxolol) can cause bradycardia, bronchospasm, or worsening of congestive heart failure; and oral carbonic anhydrase inhibitors (e.g., acetazolamide, methazolamide) can cause malaise, anorexia, and other adverse systemic effects. 35,36

Argon laser trabeculoplasty appears to be a relatively safe alternative to medication, but it is expensive and its long-term effectiveness remains uncertain. 1,37 Although laser treatment lowered IOP more effectively than medications in one trial, more than half of laser-treated eyes required medications to control IOP, and no difference in progression to visual deficits was noted in 2-year follow-up. 37 Filtering surgery, which is usually reserved for patients unresponsive to other treatment, achieves greater reductions in IOP but carries a higher risk of serious postoperative ophthalmologic complications, including permanent loss of vision. 1 Trials of surgery as initial treatment for glaucoma are under way. 36

Recommendations of Other Groups

The American Academy of Ophthalmology recommends a comprehensive eye examination by an ophthalmologist (including examination of the optic disc and tonometry) for all adults beginning around age 40, and periodic reexamination thereafter. Periodic examination every 3-5 years is also recommended for younger black men and women (age 20-39), due to their higher risk of glaucoma. 38 The American Optometric Association recommends regular optometric evaluations (including tonometry) for all adults, and advises primary care clinicians to screen for glaucoma (with ophthalmoscopy and/or tonometry) in high-risk groups, including persons over 50, blacks, diabetics or hypertensives, relatives of glaucoma patients, and others with specific health concerns or medical conditions. 39 Prevent Blindness America (formerly the National Society to Prevent Blindness) recommends that asymptomatic individuals have periodic comprehensive eye examinations beginning at age 20, with increasing frequency for African Americans and others at high risk. 40 A 1988 review by the Office of Technology Assessment of the U.S. Congress concluded that the benefits of screening for glaucoma or ocular hypertension among the elderly were uncertain. 10 The Canadian Task Force on the Periodic Health Examination concluded there was insufficient evidence to recommend for or against screening for glaucoma in the periodic health examination, but stated that referral of high-risk persons to a specialist with access to automated perimetry was "clinically prudent." 41

Discussion

Glaucoma remains an important cause of blindness and impaired vision in older Americans, especially among blacks. Treatment of glaucoma with medications or surgery to lower intraocular pressure has been the standard of care for many years, and it remains prudent for patients with more severe visual deficits or extreme elevations in intraocular pressure. Definitive evidence to support the benefit of treating persons with early, mild disease is not yet available, however. Controlled treatment trials currently under way may help resolve the questions about early intervention in persons with mild disease and those at increased risk for glaucoma.

Despite a potential benefit of early treatment, the current evidence is not sufficient to recommend for or against routine screening for glaucoma in the primary care setting. There is currently no efficient and reliable method for primary care clinicians to detect patients who have early glaucoma or who are likely to develop glaucoma. While patients with elevated intraocular pressure are at increased risk of developing glaucoma, the majority may never develop significant vision problems, and the benefit of early treatment for such patients remains unproven.

Accurate glaucoma screening is best performed by eye specialists with access to specialized equipment for assessing the appearance and function of the optic nerve (e.g., slit-lamp, automated perimetry). Even experts, however, face limitations in screening patients for early disease. Of the three methods currently available for screening (tonometry, examination of the optic disk, and measurement of visual fields), only the last is sufficiently sensitive and specific for glaucoma. Perimetry, however, is relatively expensive and time-consuming for use in routine screening; it detects patients relatively late in the disease process; and older patients may have difficulty adequately completing the examination.

Assuming that treatment of early glaucoma is effective, screening will be most useful in populations with an increased prevalence of glaucoma. If newer methods prove able to detect early and specific evidence of glaucoma (e.g., optic nerve damage), routine screening for early disease may become more feasible.

CLINICAL INTERVENTION

There is insufficient evidence to recommend for or against routine screening by primary care clinicians for elevated intraocular pressure or early glaucoma ("C" recommendation). Effective screening for glaucoma is best performed by eye specialists who have access to specialized equipment to evaluate the optic disc and measure visual fields. Recommendations may be made on other grounds to refer high-risk patients for evaluation by eye specialists. This recommendation is based on the substantial prevalence of unrecognized glaucoma in these populations, the progressive nature of untreated disease, and expert consensus that reducing intraocular pressure may slow the rate of visual loss in patients with early glaucoma or severe intraocular hypertension. Populations in whom the prevalence of glaucoma is greater than 1% include blacks over age 40 and whites over age 65. Patients with a family history of glaucoma, patients with diabetes, and patients with severe myopia are also at increased risk and may benefit from screening. The optimal frequency for glaucoma screening has not been determined and is left to clinical discretion.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by David Atkins, MD, MPH, with contributions from materials prepared by Christopher Patterson, MD, FRCP, for the Canadian Task Force on the Periodic Health Examination.

35. Screening for Hearing Impairment

Burden of Suffering

Prevalence estimates for hearing impairment vary depending on age and the criteria used to define the various causal conditions. 1 For severe congenital and prelingually acquired losses, estimates range from 1 to 3/1,000 live births. 1-5,12,22,30 Moderate and severe hearing losses in early infancy are clearly associated with impaired language development. 6-8 Factors that increase the risk for congenital or delayed-onset sensorineural hearing impairment include family history of hearing impairment, congenital or central nervous system infections, ototoxic drug exposure, prematurity, congenital head and neck deformities, trauma, and several other factors associated with admission to an intensive care nursery. 2,5,9 Chronic and recurrent acute otitis media is commonly associated with temporary hearing loss in infants and school-aged children. Prevalence rates for otitis media are 12% before age 3, 4-18% for ages 4-5, and 3-9% for ages 6-9 years. 10 At any given time, about 5-7% of children ages 5-8 have a 25-dB hearing loss, usually a self-limited complication of otitis media with effusion. 11 Only a small proportion of episodes of otitis media occurring in school-aged children result in serious long-term complications, usually due to chronic middle ear effusion or previously undetected sensorineural deficits. 11 The uncertainties of the population occurrence rates and causes of infant and childhood hearing loss have been emphasized. 81

Hearing impairment creates further difficulties in adulthood. Adult hearing impairment has been correlated with social and emotional isolation, clinical depression, and limited activity. 2,12,13,16 Hearing loss acquired between adolescence and age 50 may be due to relatively uncommon causes such as Ménière's disease, trauma, otosclerosis, ototoxic drug exposure, and eighth cranial nerve tumors. Noise-induced hearing loss is a common cause of sensorineural hearing impairment in this age group. This is particularly true for the estimated 5 million Americans with occupational exposure to hazardous noise levels. 14 The prevalence of hearing impairment increases after age 50 years, with presbycusis being the most important contributor to this increase. Approximately 25% of patients between ages 51 and 65 years have hearing thresholds greater than 30 dB (normal range being 0-20 dB) in at least one ear. 15 An objective hearing loss can be identified in over 33% of persons aged 65 years and older and in up to half of patients aged 85 years and older. 16,17 Older persons with hearing impairment are particularly prone to suffering the associated social and emotional disabilities described earlier. 18,19

Accuracy of Screening Tests

Multiple methods of audiologic testing are potentially suitable for evaluating possible hearing deficits. Test selection is usually dictated by patient age and occasionally by type of hearing loss in question (i.e., conductive vs. sensorineural). Cooperative children and adults are usually tested with pure-tone audiometry. With pure-tone thresholds in audiometric test booths used as a reference criterion, this technique has a reported sensitivity of 92% and a specificity of 94% in detecting sensorineural hearing impairment. 37 Comparable results have been obtained in recent studies using hand-held audiometers. 16,37 Audiometric results are, however, subject to error due to improper technique, background noise in the test area, and unintentional or intentional misreporting by the subject. 11,38 Efforts have been made to devise a sufficiently accurate test utilizing the pure-tone audiometer that is briefer and less costly than standard pure-tone audiometry, but clinical efficacy is not yet confirmed. 35

Evaluation of neonates and infants below the age of 2-3 years with audiometry is more difficult or not feasible because it depends on developmental ability; it therefore usually requires some form of electrophysiologic and/or behavioral testing. Auditory brainstem response testing (ABR) is currently viewed as the standard for physiologic testing in infancy and the most accurate method available for determining hearing function. 1,2,5,20 Sensitivity rates have been reported to be 97-100% and specificity rates to be 86-96% in comparison with behavioral testing measures. 2,5,21

In order to detect congenital or postnatally acquired hearing loss, some form of newborn screening performed prior to hospital discharge has been recommended as most efficacious for ensuring early identification and proper follow-up and treatment of hearing loss. 5,6,20,22 As a universal screening test, ABR (or modified ABR) is probably unsuitable because of the need for costly equipment and trained operators in all community hospitals and birthing centers. Another screening modality for neonates is the high-risk register (HRR), a specific list of clinical risk factors associated with higher rates of neonatal and infant hearing impairment. 23,24 Those who meet criteria then undergo more objective hearing evaluation, usually ABR. The HRR identifies 50% or more of unselected infants with hearing loss 24 and 75-80% of hearing-impaired neonates in the intensive care nursery. 5 Behavioral testing techniques have also been used for infant hearing screening, including the "crib-o-gram," auditory response cradle, and distraction testing. 29-32 The limited specificity and sensitivity of behavioral testing, as well as specialized equipment and training requirements, render these methods less desirable than physiologic testing procedures.

Evoked otoacoustic emission (EOE) testing is a relatively new screening method suitable for neonatal and infant screening. 22,25-28 Otoacoustic emissions are sounds generated by normal cochlear hair cells and detectable with relatively simple instrumentation. 67 Data concerning normative standards and reproducibility are now becoming available. 68-73 Using a cutoff of 30 dB to designate hearing impairment, EOE testing has an overall agreement rate with ABR of 91%, with a sensitivity of 84% and specificity of 92%. 74,75 Statewide neonatal auditory screening programs have been devised using EOE, and the logistical issues of operating such a program have been described. 86 Studies of EOE testing suggest a high rate of false-positive screens relative to true-positive results, which would be expected when testing for a low-prevalence condition, and some failures of testability, necessitating retesting with EOE and ABR. 83 In one screening study, only 15% of positive EOE screening tests were confirmed on repeat EOE testing 4-6 weeks later; the proportion of infants with confirmed screening tests who actually had hearing loss is unknown, since the results of diagnostic follow-up tests were not available. 88 Based on the authors' estimates of true population prevalence, more than 90% of the positive neonatal EOE screening tests were false positives. Problems such as ambient noise in the newborn nursery and other factors that affect the technical conduct of EOE require solution before this technique can be applied widely. 83,84
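
The high false-positive burden follows directly from the low prevalence of congenital hearing loss. A rough calculation (using the cited 84% sensitivity and 92% specificity, and an assumed prevalence of 2/1,000 births, chosen for illustration from the range cited earlier) shows the expected yield of screening 10,000 newborns:

    # Sketch: expected results of EOE screening in 10,000 newborns, assuming
    # the cited accuracy figures and a prevalence of 2/1,000 (illustrative).
    n = 10000
    prevalence = 0.002
    sensitivity = 0.84
    specificity = 0.92

    affected = n * prevalence                        # 20 infants
    true_pos = sensitivity * affected                # about 17 detected
    false_pos = (1 - specificity) * (n - affected)   # about 798 false alarms
    print(f"True positives:  {true_pos:.0f}")
    print(f"False positives: {false_pos:.0f}")
    print(f"Positives that are false: {false_pos / (true_pos + false_pos):.0%}")

Under these assumptions, nearly 98% of positive screens are false positives, in line with the experience described above.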

The majority of children with congenitally or neonatally acquired losses are identified by age 4-5 years. 1 Hearing loss in the preschool and school-aged group is largely related to acute or chronic otitis media with effusion (OME), of which the majority of cases resolve uneventfully. 11 Routine audiometry can often detect the mild conductive hearing loss associated with OME. 33 Accuracy for detecting hearing loss associated with OME by audiometry may be variable in this age group, however, because of the mild and changing nature of the conductive loss, varying patient cooperation, conditions that make testing difficult (e.g., mental retardation), and the fact that middle ear conduction deficits may be superimposed on previously undetected sensorineural hearing loss due to other conditions.

Routine screening of working-age adolescents and adults is usually limited to those in high-risk occupations involving exposure to excessive noise levels. Among older persons, however, in whom the rate of hearing impairment is high, recommended screening methods for detecting hearing loss have included written patient questionnaires, clinical history-taking and physical examination, audiometry with a hand-held device, and simple clinical techniques designed to assess for the presence of hearing impairment. 15,16,35,36 These screening tests have not been fully evaluated, however. For example, the whispered voice test is one simple clinical technique used to assess hearing. Reported sensitivities and specificities have been 70-100% using pure-tone audiometry as the reference standard, but there are inadequate data on interobserver variability. 16 The free-field voice, tuning fork, and finger rub tests have been criticized on similar grounds. 16 Self-assessment questionnaires to identify hearing impairment probably represent the most rapid and least expensive way to screen for hearing loss in the adult. Depending on audiometric criteria, these questionnaires are reported to be 70-80% accurate for identifying patients with hearing loss defined by pure-tone audiometry. 16,36,82

Effectiveness of Early Detection

Assessing the effectiveness of screening for hearing impairment depends upon the evidence that (a) hearing loss leads to decreased function and affects the quality of life, (b) screening leads to earlier detection of hearing abnormalities than spontaneous clinical presentation or observation, (c) various forms of hearing loss can be treated effectively, and (d) effective treatment leads to improved function and well-being.

Theoretically, the greatest benefit from hearing screening comes from detection of moderate to severe hearing impairment between birth and age 3 years. Auditory stimuli during this period appear to be critical to development of speech and language skills, 2,46 although other factors undoubtedly also play an important role. If screening for hearing deficits is performed near the time of birth, followed by definitive diagnosis, the choice of treatment and treatment success will depend on the etiology of the hearing loss. For sensorineural impairment, depending on the degree of loss, treatment may range from amplification in the majority of cases to cochlear implantation in profoundly deaf children. In both cases, speech and hearing therapy has been promoted as a key component of treatment and the efficacy of such therapy has been claimed. 78,79 Cochlear implant technology continues to evolve for treatment of profound deafness in children. Several studies have demonstrated improved language development and communication skills in deaf infants following cochlear implantation. 47,48 Several nonrandomized, prospective studies have also demonstrated superior communication performance in prelingually deafened children who received implants as compared to similar children using more traditional tactile or acoustic hearing aids. 49,75,76

Although the benefits of various treatments for hearing loss seem manifest, no controlled clinical trials have evaluated the effect of early screening on long-term functional and quality-of-life outcomes. Rather, studies of treatment efficacy are generally observational and retrospective, consisting of clinical series or case-control studies of highly selected patients, often with heterogeneous causes of hearing loss, and incompletely defined treatment regimens or protocols of uncertain compliance. Additionally, important confounders such as other patient characteristics (e.g., race or ethnic group, socioeconomic status, level and laterality of hearing loss, the presence of co-morbidity, disability, or developmental delay due to various causes), family characteristics, and the presence and nature of other therapeutic interventions are often not considered in the analysis. Thus, despite widespread professional opinion of general treatment efficacy, much more information is needed on the existence and magnitude of treatment efficacy. In many instances, however, it may understandably be deemed inappropriate to withhold any customary type of treatment in the research setting despite the limited evidence of treatment efficacy. 85

Conductive hearing loss in the preschool-age group is most commonly due to self-limited cases of otitis media with effusion. Multiple studies have concluded that hearing impairment in infancy due to chronic or recurrent otitis media with effusion can impair language development. 39-41 Although these studies have come under methodologic criticism, 42,43 several authors believe that available evidence is adequate to substantiate this relationship. 8,44 Auditory thresholds in hearing-impaired children can be improved through amplification with hearing aids and frequency modulation radio devices. Auditory and language training can also improve communication skills. 12,75,76 While early detection and treatment of such hearing loss would therefore appear to be beneficial, there are no controlled studies comparing outcome of hearing-impaired persons identified through screening to those not screened. The fewer than 5% of infants with chronic otitis who do not respond spontaneously or with medical management are at further risk for more significant pathology including middle ear fibrosis or adhesions and cholesteatoma. 11 Myringotomy and pressure-equalizing tube placement can resolve the conductive loss and prevent reaccumulation of middle ear effusion. 50 No randomized or otherwise well-controlled study exists, however, demonstrating that infants or young children screened with routine hearing tests for chronic middle ear disease have a better outcome than those not screened in this manner. Nevertheless, if hearing loss is detected as part of the routine diagnosis or management of chronic OME, management of either sensorineural or conductive losses by standard regimens is indicated.

In older children, otitis media with effusion is responsible for the majority of hearing loss identified through screening. 3,11 As is the case in infants and toddlers, however, there is little evidence that asymptomatic children receiving hearing screening have better functional outcomes than those not screened. In fact, several studies of preschool and school-aged children who underwent audiometric screening demonstrated no significant difference in future audiometric performance between screened and unscreened children, 51 nor any preventive benefit from screening. 4 Most hearing loss detected under these circumstances is self-limited and related to acute otitis media with effusion that resolves spontaneously within 6-8 weeks. 3,11 Since the critical period of language development has passed at this age, these individual episodes would appear to have little impact on educational performance. Studies have been unable to provide consistent evidence that clinical interventions for chronic OME (e.g., antibiotics, myringotomy, tympanostomy tubes) are able to achieve sufficient long-term improvement in hearing and language skills to justify the risk of complications. 41,43,51,53 A small proportion of children routinely screened for hearing loss will demonstrate a protracted hearing impairment due to previously undetected, less severe, sensorineural losses as well as chronic and recurrent middle ear disease. These children may be at risk for educational and language problems, 1,52,53 although the evidence for this contention has been challenged. 42

For adults between the ages of approximately 18 and 50 years, unrecognized hearing impairment is uncommon except for high-risk groups such as persons in occupations at risk for noise-induced hearing loss. 54,55 The incidence of hearing impairment, predominantly due to presbycusis, rises quickly beyond age 50, however. No controlled study has proven the effectiveness of screening for hearing impairment in the adult population. 16 Two reviews cite numerous studies documenting the benefits of hearing amplification in these patients. 16,55 A 1990 randomized controlled trial demonstrated a measured improvement in social, cognitive, emotional, and communication function from hearing aid use in a group of elderly veterans with previously documented hearing loss. 56 The issue of patient compliance with recommendations to obtain hearing amplification has been raised as it relates to hearing screening, 15,55 but compliance rates of close to 40-60% can be achieved in some settings. 16 Patients receiving hearing aids have demonstrated improvement in communication and social function, as well as emotional status. 56

Recommendations of Other Groups

The Joint Committee on Infant Hearing 1994 Position Statement, developed and approved by the American Speech-Language-Hearing Association (ASHA), American Academy of Otolaryngology-Head and Neck Surgery, American Academy of Audiology, American Academy of Pediatrics (AAP), and Directors of Speech and Hearing Programs in State health and welfare agencies, endorses the goal of universal detection of infants with hearing loss before 3 months of age. 59 When universal screening is not available, the committee recommends testing infants with indicators associated with sensorineural and/or conductive hearing loss by 3 months of age. The high-risk indicators are similar to those described under Clinical Intervention (see below). The Bright Futures guidelines recommend hearing screening for all newborns prior to 3 months of age. 60 The National Institutes of Health recommends universal screening of all infants before age 3 months using evoked otoacoustic emission testing. 77 The Canadian Task Force on the Periodic Health Examination recommends regular assessment of hearing during well-baby visits during the first 2 years of life using parental questioning and the clap test. 80 The American Academy of Family Physicians (AAFP) recommends screening high-risk infants for hearing impairment; high-risk criteria are similar to those described under Clinical Intervention (see below). 65 The recommendations of the AAFP are currently under review.

The AAP recommends periodic historical inquiry regarding hearing throughout infancy and childhood and objective testing at ages 3, 4, 5, 10, 12, 15, and 18. 61 The Bright Futures guidelines recommend hearing screening at ages 3-6, 8, and 10, and yearly from ages 11-21 if the adolescent is exposed to loud noises, has recurring ear infections, or reports problems. 60 In 1990, ASHA reaffirmed its recommendation for annual audiometry for all children functioning at a developmental level of 3 years through grade 3 and for all children in high-risk groups. 62,63 ASHA also added tympanometry to their screening protocol for this age group as well as for any other patient undergoing screening audiometry up to age 40. The Canadian Task Force on the Periodic Health Examination recommends against routine preschool screening for hearing problems. 80 The AAFP does not recommend routine hearing screening in children after age 3 years; 65 this recommendation is under review.

Recommendations for adults vary and also depend on age. Although ASHA proposes a screening protocol applicable to young adults, no guidelines are given regarding exactly who should be screened or the optimal timing of screening. 62 In the U.S., federal law mandates baseline and annual audiometry for workers of any age exposed to hazardous noise levels. 14 The Canadian Task Force recommends risk assessment for hearing loss by history and physical examination at age 16 and thereafter during clinical visits for any other reason. 64 The AAFP recommends screening for hearing impairment in adolescents and adults regularly exposed to excessive noise in recreational or other settings; 65 this recommendation is under review.

The Institute of Medicine recommended audiometric testing once each during ages 40-59, 60-74, and 75 and over. 66 The Canadian Task Force on the Periodic Health Examination recommends screening the elderly for hearing impairment, using a single question about hearing difficulty, whispered voice out of the field of vision, or audioscope. 80 The AAFP recommends evaluation of hearing in persons aged 65 years and older, and hearing aids for patients found to have hearing deficits; 65 this recommendation is under review.

Discussion

While congenital hearing loss is a serious health problem associated with developmental delay in speech and language function, there is little evidence to support the use of routine, universal screening for all neonates. Although screening methods have reasonable sensitivity and specificity, a substantial number of infants will be misclassified because the prevalence of hearing impairment is low. Also, screening technology is evolving, and the costs and feasibility for universal application are not fully known. Most importantly, the evidence for efficacy of early intervention is incomplete. There have been no controlled clinical trials designed to test whether devices or complex protocols lead to superior speech and language outcomes in screened children. For older children, good-quality evidence suggests little benefit from screening, while for adolescents and young and middle-aged adults there is limited evidence evaluating screening for and treatment of hearing impairment. Many older adults with clinical complaints of hearing loss or documented hearing deficits, however, benefit from hearing aids or other forms of amplification.

Treating deaf children with modalities such as cochlear implants has stimulated ethical concerns from some advocates for the deaf, a full discussion of which is beyond the scope of this chapter. Attitudes held by both physicians and by society toward deaf individuals have changed over time, and various associations now offer support for individuals affected by deafness, promote their full participation in society, and seek to preserve and expand deaf awareness, deaf culture, and deaf heritage efforts. 87

CLINICAL INTERVENTION

Screening older adults for hearing impairment by periodically questioning them about their hearing, counseling them about the availability of hearing aid devices, and making referrals for abnormalities when appropriate, is recommended ("B" recommendation). The optimal frequency of such screening has not been determined and is left to clinical discretion. An otoscopic examination and audiometric testing should be performed on all persons with evidence of impaired hearing detected by patient inquiry. Although hand-held devices for audiometry testing (audioscopes) are also sensitive screening tools for hearing deficits, patient inquiry is likely to be a more rapid and less expensive way to screen for hearing loss in older adults. There is therefore insufficient evidence to recommend for or against routinely screening older adults for hearing deficits using audiometry testing ("C" recommendation).

There is insufficient evidence to recommend for or against routinely screening asymptomatic adolescents and working-age adults for hearing impairment ("C" recommendation). Recommendations against such screening, except for those exposed to excessive occupational noise levels, may be made on other grounds, including low prevalence, high cost, and the likelihood that hearing deficits in these individuals will present clinically. Screening of workers for noise-induced hearing loss should be performed in the context of existing worksite programs and occupational medicine guidelines.

Routine hearing screening of asymptomatic children beyond age 3 years is not recommended ("D" recommendation). It is recognized, however, that such testing often occurs outside the clinical setting. When this occurs, abnormal test results should be confirmed by repeat testing at appropriate intervals, and all confirmed cases identified through screening referred for ongoing audiologic assessment, selection of hearing aids, family counseling, psycho-educational management, and periodic medical evaluation.

There is insufficient evidence to recommend for or against routine screening of asymptomatic neonates for hearing impairment using evoked otoacoustic emission (EOE) testing or auditory brainstem response (ABR) ("C" recommendation). Recommendations to screen high-risk infants may be made on other grounds, including the relatively high prevalence of hearing impairment, parental anxiety or concern, and the potentially beneficial effect on language development from early treatment of infants with moderate or severe hearing loss. For many high-risk conditions, hearing testing is commonly considered to be part of diagnostic evaluation and management. Risk factors for congenital or perinatally acquired hearing loss include: family history of hereditary childhood sensorineural hearing loss; congenital perinatal infection with herpes, syphilis, rubella, cytomegalovirus, or toxoplasmosis; malformations involving the head or neck (e.g., dysmorphic and syndromal abnormalities, cleft palate, abnormal pinna); birth weight below 1,500 g; bacterial meningitis; hyperbilirubinemia requiring exchange transfusion; severe perinatal asphyxia (Apgar scores of 0-4 at 1 minute or 0-6 at 5 minutes, absence of spontaneous respirations for 10 minutes, or hypotonia at 2 hours of age); ototoxic medications; and findings associated with a syndrome known to include hearing loss. ABR testing may be useful for all infants who meet at least one of these high-risk criteria or for those who fail EOE testing. High-risk infants should ideally be screened prior to leaving the hospital after birth, but those not tested at birth should be screened before age 3 months, with the goal being to initiate rehabilitation by age 6 months as clinically indicated. Clinicians examining any infant or young child should remain alert for symptoms or signs of hearing impairment, including parent/caregiver concern regarding hearing, speech, language, or developmental delay.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Robert Wallace, MD, MSc, and John Laurenzo, MD.

36. Screening Ultrasonography in Pregnancy

Burden of Suffering

Ultrasonography is widely used in pregnancy in the U.S. According to 1992 U.S. natality data, 58% of mothers who had live births received ultrasonography in pregnancy, compared to 48% in 1989. 1 The highest rates occurred in white women and those ages 25-39 years. For asymptomatic low-risk women, a single scan in the second trimester may be used to estimate gestational age in women with unreliable dates of last menses and to detect multiple gestation and fetal malformations. A third-trimester scan may be used to screen for intrauterine growth retardation (IUGR) and fetal malpresentation as well as previously undetected multiple gestations and malformations. 2

These conditions may be associated with increased maternal or perinatal morbidity and mortality. Inaccurate estimation of gestational age may lead to repeated testing of fetal well-being and induction of labor in pregnancies erroneously thought to be postterm. 3,4 About 25-45% of women are unable to provide an accurate menstrual history; 3,5,6 the estimated date of confinement derived from the last menstrual period differs by more than 2 weeks from the actual date of birth in nearly one quarter of pregnancies. 5 Multiple gestation is associated with increased perinatal mortality, preterm delivery, and other obstetric complications, 7 and it is more likely to result in cesarean delivery (56% compared to a baseline rate of 23%). 8 The ratio of multiple gestation births to all births (currently 24/1,000 live births) has risen steadily since 1972 and is the highest reported in the past 50 years. 1 Congenital anomalies are the leading cause of death before 1 year of age in the U.S., with a mortality rate of 1.7/1,000 live births, and are also important contributors to childhood morbidity and shortened life expectancy. 10,11 Fetal growth retardation has been associated with poor pregnancy outcomes, including fetal and neonatal death, reduced intelligence, seizures, and cerebral palsy, although most term growth-retarded infants develop normally. 12 Breech and other malpresentations may be associated with poor outcome and result in cesarean delivery in 84% of cases. 8 Malpresentation occurs in 38/1,000 live births, with the risk increasing with maternal age. 1

Accuracy of the Screening Test

Real-time ultrasound consists of high-frequency sound waves that allow two-dimensional imaging of both structural and functional characteristics of the fetus, as well as the location and morphology of the placenta. 2 (This chapter will not address the topic of umbilical Doppler ultrasound. 13) Ultrasound is the recommended test for determination of gestational age in women with uncertain menstrual dates because measurement of the biparietal diameter, when performed early in the second trimester, has been shown to be accurate in determining gestational age. 5,6 Ninety percent of patients deliver within 2 weeks of the due date when gestational age is determined by early second-trimester ultrasound. 5
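
For reference, the menstrual-history estimate against which ultrasound dating is compared is conventionally derived by Naegele's rule: adding 280 days (40 weeks) to the first day of the last menstrual period. A minimal sketch, using a hypothetical date chosen only for illustration:

    from datetime import date, timedelta

    # Naegele's rule: estimated date of confinement = LMP + 280 days.
    lmp = date(1995, 1, 10)  # hypothetical first day of last menstrual period
    edc = lmp + timedelta(days=280)
    print(f"Estimated date of confinement: {edc}")  # 1995-10-17

It is this estimate that is unreliable or unavailable in the 25-45% of women without an accurate menstrual history, and that early second-trimester ultrasound can confirm or replace.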

Ultrasound can also detect multiple gestations, which are missed by clinical examination in nearly one third of cases. 14 One center that provided the only maternity services in its community reported that 98% of all twins were detected antenatally when routine ultrasound screening was performed. 4,6 The average gestational age at detection fell from 27 weeks to 20 weeks. Randomized controlled trials of routine ultrasound before 20 weeks found higher rates of early detection of multiple gestation with screening (83-100%) compared to unscreened controls (60-76%). 15-19 False-positive ultrasound diagnoses also occur, however, primarily in the first trimester; over 20% of multiple fetuses identified in the first trimester are either artifacts or die early in pregnancy. 20

Many fetal structural malformations, including cardiac, gastrointestinal, renal, limb, and neural tube defects, can also be detected by current ultrasound techniques (for detailed discussion of ultrasound screening to detect chromosomal abnormalities and neural tube defects, see Chapters 41 and 42, respectively). Detection rates depend on the quality of the equipment and the expertise of the ultrasonographer. In a trial in low-risk pregnant women, routine serial ultrasonography at 15-22 and 31-35 weeks of gestation had a sensitivity of 35% for detecting fetuses with at least one major anomaly before delivery but only 17% for detection before the typical gestational-age limit for legal abortion (<24 weeks). 21 Only 4% of missed cases occurred in women who did not comply with scheduled screening ultrasounds. The sensitivity of selective ultrasound, performed only for obstetric or medical indications, was significantly lower than that of routine ultrasound: 11% before delivery and 5% before 24 weeks. In this study, the sensitivity of routine midtrimester ultrasound was significantly higher at tertiary compared to other scanning facilities (35% vs. 13%). False-positive diagnoses were reported for 7 cases, or 0.9/1,000 pregnant women scanned before 24 weeks, with most reported from other than tertiary facilities. In another trial, the rates of detection of major malformations by screening before 20 weeks (confirmed at abortion or delivery) were 36% and 77% at two hospitals. 19 Ten of the thirty cases with suspected major malformations were judged normal at follow-up ultrasound examinations at 20-36 weeks, and an 11th was found to have only a minor anomaly at delivery; 2.7/1,000 pregnant women received a false diagnosis of a major fetal malformation. Large case series evaluating routine ultrasound in low-risk women have reported sensitivities ranging from 21% to 74% for detecting major fetal abnormalities prior to 22-24 weeks among women who were scanned in the second trimester. 22-24 False-positive rates of 0.2-1.0/1,000 women scanned were reported; in one study, 6 of 8 initially false-positive diagnoses were corrected on follow-up evaluation. Direct comparisons of the trials and series results are hampered by varying definitions of "fetal malformation."

The ultrasound examination is the most accurate means of detecting IUGR, although the lack of consensus on standards for the definition or diagnosis of IUGR 12 makes evaluating screening tests for this condition difficult. Measurements of the fetal abdomen and head, and indices that compare the relative sizes of these structures, are accurate in assessing fetal growth. 25-32 A small abdominal circumference, for example, the most commonly affected anatomic measurement, 3 has a sensitivity of 80-96% and a specificity of 80-90% in detecting growth-retarded fetuses in the third trimester. 3,26,33,34 The product of the crown-rump length and the trunk area has a sensitivity of 94% and a specificity of 90%. 35 Because of the relatively low risk of IUGR in the general population, however, the likelihood that an abnormal test indicates IUGR is relatively low. For example, an abnormal abdominal circumference at 34-36 weeks' gestation indicates IUGR in only 21-50% of cases. 26,33,36 The generalizability of these studies has also been questioned; many had small samples, used only expert ultrasonographers, and/or suffered from methodologic limitations. 26,37 In addition, the definitions commonly used in these studies may cause normal but constitutionally small fetuses to be labeled as IUGR.
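
As with the other screening tests discussed in this chapter, the predictive value of an abnormal finding depends heavily on the prior probability of IUGR in the population tested. The sketch below uses midrange values of the cited sensitivity and specificity for abdominal circumference; the two prevalences are assumed only for illustration:

    # Sketch: PPV of an abnormal abdominal circumference at two assumed
    # prior probabilities of IUGR, using midrange cited accuracy figures.
    sensitivity = 0.90
    specificity = 0.85

    for prevalence in (0.05, 0.10):
        tp = sensitivity * prevalence
        fp = (1 - specificity) * (1 - prevalence)
        print(f"Prevalence {prevalence:.0%}: PPV {tp / (tp + fp):.0%}")

The resulting values, roughly 24% and 40%, fall within the 21-50% range reported in the studies cited above.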

Effectiveness of Early Detection

For routine ultrasonographic screening to be proven beneficial, evidence is needed that interventions in response to examination results lead to improved clinical outcome. Twelve randomized controlled trials have examined the effectiveness of routine ultrasound screening in improving maternal or neonatal outcomes. Four of these evaluated a single ultrasound before 20 weeks, 15-17,19 three trials assessed serial ultrasound at 18-20 weeks and 31-35 weeks, 18,21,38-40 three trials evaluated one or two ultrasounds between 32 and 37 weeks when all subjects received one ultrasound before 24 weeks, 35,41,42 and two tested multiple scans (plus Doppler flow studies in one trial) every 3-4 weeks beginning at 24-28 weeks, with all subjects receiving a single midtrimester scan. 43,44 In a 13th trial, all subjects received three ultrasounds, but the results of placental grading at 34-36 weeks were reported only for the experimental group. 45 In addition, four meta-analyses have been published, none of which included the U.S. RADIUS trial, the most recent and largest to date. 46-49 In most of the trials, large proportions of the controls also received ultrasound results, although not with the same timing or frequency as in the screened groups.

The most important potential benefit of ultrasound screening is reduced perinatal mortality. Among the seven trials that evaluated an ultrasound before 20 weeks (with or without additional late ultrasound), only the Helsinki trial 19 and a meta-analysis heavily influenced by that trial's results 47 were able to demonstrate a statistically significant benefit in lowering perinatal mortality. Two trials 17,40 showed nonsignificant reductions in mortality while the remaining four trials and another meta-analysis 48 showed no mortality benefit. In the Helsinki trial, the overall perinatal death rate was 4.6/1,000 deliveries (n = 18) in screened women versus 9.0/1,000 deliveries (n = 34) in unscreened women. In the experimental group, 11 induced abortions were performed because of ultrasound findings and two babies died with major anomalies, compared to no abortions and 10 deaths with anomalies in the control group. There was no difference in perinatal mortality when the induced abortions resulting from ultrasound detection of congenital anomalies were included as deaths in the analysis. The meta-analysis 47 that reported a significant mortality reduction included the four then-published trials 16-19 that compared routine to selective ultrasound scanning and that reported number of pregnancies, deliveries, and perinatal deaths. It also evaluated the live birth rate, which takes into account induced abortions for malformations, and found it to be identical in the screened and control groups. The largest trial to date, the RADIUS trial, 38 randomized 15,151 low-risk pregnant women to routine ultrasound scans at 15-22 and 31-35 weeks of gestation or to usual care, which included ultrasounds performed for indications that developed after randomization. The risk of fetal or neonatal death was the same in the screened (0.6%, n = 52) and control (0.5%, n = 41) groups. Including induced abortions for fetal anomalies (9 vs. 5 in the routinely and selectively screened groups, respectively) did not affect these estimates.

Effects on neonatal and maternal morbidity from a single second-trimester scan have also been evaluated. Most of the trials and meta-analyses showed no statistically significant benefit of prenatal ultrasound on neonatal morbidity (including low birth weight, admission to special care nursery, neonatal seizures, mechanical ventilation, and Apgar scores), or on maternal outcomes such as antenatal hospitalization. 15,17-19,46,47 In one randomized controlled trial of early second-trimester ultrasound, 16 babies born to screened women had a significantly greater mean birth weight (3,521 g vs. 3,479 g) than did those born to controls, with most of the benefit accruing to smokers. The Cochrane Database meta-analysis reported significantly fewer low birth weight singleton births and reduced risk of admission to special care nurseries with routine early ultrasound, but no effect on Apgar scores. 48 The RADIUS trial reported a slightly lower rate of tocolysis in screened women (3.4% vs. 4.2%), but no other differences in maternal outcomes (e.g., amniocentesis, external version, cesarean delivery, or days of hospitalization) 39 or in overall or individual indicators of perinatal morbidity. 38

Accurate dates determined by second-trimester ultrasound might help prevent routine tests of fetal well-being and the induction of labor for fetuses thought to be postterm on the basis of erroneous dating. 3,4,26 Rates of induced labor for postterm pregnancy were significantly reduced in three trials 16,39,40 but were unaffected in two others; 17,18 a meta-analysis demonstrated significantly decreased inductions for postterm pregnancy. 48 These trials may have underestimated such effects by including women with reliable dates, who are less likely to benefit from ultrasound dating. Trials and meta-analyses have not established whether overall rates of induced labor are reduced by a second-trimester ultrasound. 15-18,39,40,47,48 In the RADIUS trial, the significant decreases in induced labor for postterm pregnancy were completely offset by significant increases in inductions for IUGR. 39 Two meta-analyses 46,47 reported significant heterogeneity among the trials, suggesting that other factors, such as differences in obstetric management between countries or over time, may also influence this outcome. In one community, the incidence of postterm inductions fell from 8% to 2.6% after ultrasound screening was instituted, 4 but it was not proved that this trend was due specifically to improved accuracy of estimating gestational age. Two trials of second-trimester ultrasound reported other outcomes potentially related to inaccurate dates. The RADIUS trial found no significant effect of ultrasound screening on adverse perinatal outcomes among postdate pregnancies 38 or on the number of tests performed to assess fetal well-being. 39 Another trial reported significantly fewer days of inpatient neonatal care after treatment for "overdue pregnancy" among screened cases. 40

Other potential benefits of prenatal ultrasound, including the early detection of multiple gestations and congenital anomalies, are often cited in support of screening. The early detection of multiple gestation, a risk factor for intrapartum and neonatal complications, 3 might allow improved antenatal surveillance and management, but direct evidence of clinical benefits from early detection, such as improved maternal or neonatal outcome, is lacking. No significant improvements in fetal, neonatal, or maternal outcomes in multiple gestations were reported in any of the screening trials, except for a small reduction in use of tocolytics in the RADIUS trial. 15-19,38,39 Numbers of multiple gestations were small in all trials, however, and power to detect improved outcomes from screening was generally inadequate. There is also no clear evidence that early intervention for identified multiple gestation, including routine hospital admission for bed rest, cervical cerclage, or prophylactic oral tocolysis, results in improved perinatal outcome. 50

While ultrasound before 20 weeks allows earlier detection of fetal structural malformations, it is not clear that this results in improved outcome. In the Helsinki trial, early detection led to an increased rate of elective abortions (2.7/1,000 screened women vs. 0/1,000 control women) and therefore to reduced perinatal deaths (see above). 19 On the other hand, in the RADIUS trial, 38 screening had no statistically significant effect on the rate of induced abortion (n = 9 or 1.2/1,000 screened women compared to n = 4 or 0.5/1,000 controls). Although early detection might theoretically improve survival for infants with fetal anomalies if they could be delivered at tertiary care centers capable of immediate medical and surgical intervention, no significant effects of early detection on overall perinatal mortality, or on survival rates among infants born with acute life-threatening anomalies or with any major anomalies, were seen in the RADIUS trial. 21,38 Other trials of routine ultrasound before 20 weeks have detected too few (i.e., 0-2) malformations to allow meaningful comparisons of outcomes. 15-18,40 None of the trials has evaluated whether routine screening improves outcomes in newborns with nonlethal anomalies.

Eight randomized controlled trials and one meta-analysis have evaluated the effectiveness of routine third-trimester ultrasound focused on fetal anthropometry and morphology in improving outcomes. 18,35,38-44,49 Six trials involved low-risk patients or patients selected from the general population, 18,35,38-40,42,43 while two were restricted to women with suspected IUGR or at increased risk for IUGR or other complications (with results of the scan either released or withheld based on randomization). 41,44 Several of these trials had methodologic problems such as inadequate reporting of results, 40 use of hospital number for randomization, 35 and the revealing of test results for nearly one third of cases in the control group because of obstetrician requests. 41 These studies reported no significant reductions in low Apgar scores, admission to or length of stay in special care nursery, low birth weight or preterm delivery, perinatal morbidity, or perinatal mortality (excluding lethal malformations). There were also no consistent beneficial effects on antenatal hospitalization or induction of labor. The meta-analysis 49 reported that third-trimester ultrasound was associated with a significantly increased risk of antenatal hospital admission. One additional randomized controlled trial in unselected women, all of whom received ultrasounds at midtrimester and twice in the third trimester, evaluated whether reporting the result of placental grading by third-trimester ultrasound to the clinician responsible for care improved neonatal outcome. 45 Reporting the placental grading was associated with significant reductions in meconium staining in labor, low Apgar scores at 5 minutes, and perinatal mortality in normally formed babies. One previously cited trial 43 of serial third-trimester ultrasounds also assessed placental morphology and reported no beneficial effects of ultrasound on perinatal mortality or morbidity, but the method of assessing placental morphology was not described. Additional trials of third-trimester placental grading are needed to assess its effectiveness.

There is no clear evidence of important adverse effects related to screening ultrasonography reported from the published randomized controlled trials, although such effects might be difficult to detect given the small number of ultrasounds (usually one or two per patient) and the fact that the controls in many trials were also scanned, with results concealed. One randomized controlled trial compared routine multiple ultrasound scans plus Doppler flow studies to selective ultrasound for indications, with four or more scans being done in 91% of screened vs. 8% of control women. 43 The screened group had a significantly higher percentage of infants with birth weight below the 3rd and 10th percentiles. Although this was not a primary endpoint of the study, it suggests a possible adverse effect of frequent ultrasound examinations with Doppler studies on fetal growth, which is supported by several studies in mice and monkeys. 51-53 Long-term follow-up of singleton live births to age 8-9 years from the two Norwegian trials (in which only 19% of controls received ultrasound) was performed to evaluate possible adverse effects of ultrasound on neurologic development. 54,55 These two studies, with 83-89% response rates, found no differences between the two groups in school performance; deficits in attention, motor control, or perception (by parent questionnaire); development in infancy; or prevalence of dyslexia. Although false-positive diagnoses of major fetal malformations occurred in both the Helsinki trial (2.7/1,000 women in the screened group) and the RADIUS trial (0.9/1,000), none of these pregnancies was electively aborted as a result. 19,21 Case reports have suggested adverse psychological effects of early and false-positive diagnoses of fetal abnormalities, 56-58 but no controlled studies that evaluated adverse effects of ultrasound diagnosis of fetal anomalies were found.

Recommendations of Other Groups

A National Institutes of Health consensus development conference recommended that ultrasound imaging during pregnancy be performed only for a specific medical indication and not for routine screening. 59 This is also the position of the American College of Obstetricians and Gynecologists. 2 The Canadian Task Force on the Periodic Health Examination found fair evidence to recommend a single second-trimester ultrasound examination in women with normal pregnancies, but concluded that there was insufficient evidence to recommend the inclusion or exclusion of routine serial ultrasound screening for IUGR in normal pregnancies. 60

Discussion

Neither early, late, nor serial ultrasound in normal pregnancy has been proven to improve perinatal morbidity or mortality. Clinical trials show that a single midtrimester ultrasound examination detects multiple gestations and congenital malformations earlier in pregnancy, but there is currently insufficient evidence that early detection results in improved outcomes. In the U.S., it is not clear whether early detection of fetal anomalies by routine ultrasound leads to increased rates of induced abortion. In addition, many of the major fetal anomalies discoverable by routine ultrasound might be detected anyway during screening for Down syndrome (see Chapter 41) or neural tube defects (see Chapter 42). Routine second-trimester ultrasound can lower the rate of induction for presumed postterm pregnancy, a benefit likely to accrue primarily to women with unreliable dates, among whom ultrasound is more accurate than dates for predicting actual date of delivery. Early ultrasound has not been proven to reduce overall rates of induction, however, due to increases in inductions for other indications. It is also unclear whether the likeliest potential benefits of routine second-trimester ultrasound (reduced induction of labor for postterm pregnancy and increased induced abortions for fetal anomalies) would justify the significant economic implications of widespread testing. No benefits of routine ultrasound examination of the fetus in the third trimester have been demonstrated despite multiple randomized controlled trials. Additional trials of third-trimester placental grading are needed to adequately evaluate the potential benefits of screening for placental appearance. Further research to evaluate possible adverse effects of ultrasound and the cost-effectiveness of routine screening is also needed.

CLINICAL INTERVENTION

Routine ultrasound examination of the fetus in the third trimester is not recommended, based on multiple trials and meta-analyses showing no benefit for either the pregnant woman or her fetus ("D" recommendation). There is currently insufficient evidence to recommend for or against a single routine midtrimester ultrasound in low-risk pregnant women ("C" recommendation). These recommendations apply to routine screening ultrasonography and not to diagnostic ultrasonography for specific clinical indications (e.g., follow-up evaluation of elevated maternal serum alpha-fetoprotein). Recommendations regarding screening for Down syndrome appear in Chapter 41, and those for neural tube defects appear in Chapter 42.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Carolyn DiGuiseppi, MD, MPH, based in part on materials prepared for the Canadian Task Force on the Periodic Health Examination by Geoffrey Anderson, MD, PhD.

37. Screening for Preeclampsia

Burden of Suffering

Hypertension is a common medical complication of pregnancy, occurring in about 6-8% of all pregnancies. 1,2 It is seen in a group of disorders that include preeclampsia-eclampsia, latent or chronic essential hypertension, a variety of renal diseases, and transient (gestational) hypertension. The definitions used to distinguish these disorders are a matter of debate, leading to uncertainty about their exact prevalence, natural history, and response to treatment. 3,4 Based on 1992 birth certificate data, pregnancy-associated hypertension was noted in 3% of all pregnancies, and eclampsia in 0.4%. 4a

Preeclampsia and eclampsia, once called toxemias of pregnancy, are the most dangerous of these disorders. Although definitions differ, many describe preeclampsia as acute hypertension (blood pressure greater than 140 mm Hg systolic or 90 mm Hg diastolic or a rise of 30 mm Hg or 15 mm Hg above the usual systolic and diastolic pressures, respectively) presenting after the 20th week of gestation, accompanied by abnormal edema, proteinuria (more than 0.3 g/24 hours), or both. 4 Women with preeclampsia are at increased risk for such complications as abruptio placentae, acute renal failure, cerebral hemorrhage, disseminated intravascular coagulation, pulmonary edema, circulatory collapse, and eclampsia. 5 The fetus may become hypoxic, increasing its risk of low birth weight, premature delivery, or perinatal death. 6 Complications of pregnancy-induced hypertension, including eclampsia (the advanced stage of this disorder characterized by seizures), are major causes of maternal deaths in the U.S. 7 Women with preeclampsia are not at increased risk of developing chronic hypertension. 4 Individuals at increased risk of developing preeclampsia and eclampsia include primigravidas and women with multiple gestations, molar pregnancy or fetal hydrops, chronic hypertension or diabetes, or a personal or family history of eclampsia or preeclampsia. 8-10

Other causes of hypertension during pregnancy include transient and chronic hypertension. Transient (gestational) hypertension is defined as the acute onset of hypertension in pregnancy or the early puerperium without proteinuria or abnormal edema and resolving within 10 days after delivery. 2 Chronic hypertension that had been latent prior to the pregnancy may also become evident during gestation. Pregnant women with latent chronic hypertension are also at increased risk for stillbirth, neonatal death, and other fetal complications, but the risk is much lower than that of women with preeclampsia or eclampsia. Women with transient or latent chronic hypertension are also more likely to develop chronic hypertension in later years. 3,4,8

Accuracy of Screening Tests

Screening tests for preeclampsia are difficult to evaluate due to the absence of a "gold standard" to confirm the diagnosis. Glomerular endotheliosis, the renal lesion characteristic of preeclampsia, is present in only about half of patients who meet the clinical criteria for the disease, 11 and detecting it requires an invasive renal biopsy. In addition, the glomerular lesions of preeclampsia are not specific for preeclampsia, having been observed in association with other conditions, such as abruptio placentae and chronic renal disease. 11,12 For practical reasons, most studies of potential screening tests for preeclampsia have relied on clinical criteria to confirm the diagnosis.

Many proposed screening tests have been found unsuitable for early detection of preeclampsia. The appearance of edema and proteinuria alone is unreliable. Edema is common in normal pregnancies 13,14 and therefore lacks specificity. Measurable proteinuria usually occurs after hypertension is manifested and therefore is not useful for early detection. 2 In a prospective study of women between 24 and 34 weeks of gestation, a urine albumin concentration equal to or greater than 11 micro-g/mL had a sensitivity of 50% in predicting subsequent preeclampsia. 15 The conventional urine dipstick test is unreliable in detecting the moderate and highly variable elevations in albumin that occur early in the course of preeclampsia. 16,17 The definitive test for proteinuria, the 24-hour urine collection, is not practical for screening. 17 Because of these considerations, edema is no longer required to diagnose preeclampsia by some experts, 5,9,14 and the inclusion of proteinuria is being reconsidered as well. Other tests that have been suggested include the angiotensin II infusion test and the supine pressor "rollover" examination, but these have also been found to be unsuitable, as the former is impractical and the latter lacks adequate sensitivity, specificity, and positive predictive value. 1,17

The most promising screening test for preeclampsia is sphygmomanometry to detect elevated blood pressure, although there are several problems in relying on blood pressure readings as an accurate predictor. Common sources of measurement error associated with sphygmomanometry include instrument defects and examiner technique (see Chapter 3). In addition, maternal posture can significantly affect blood pressure in pregnant women; 17 the results can be erroneous, for example, if blood pressure is measured with the woman in the supine position. Measurements should be taken in the sitting position, after the patient's arm has rested at heart level for 5 minutes. 4 Most important, a single elevated blood pressure reading is neither diagnostic of nor a good predictor for preeclampsia. 1,18 Diagnosis utilizing only a change from baseline also has limited sensitivity (21-52% and 7-23% for the diastolic and systolic criteria, respectively) in predicting preeclampsia. 19 A combination of the blood pressure levels and the change from baseline may be more effective in identifying women at risk for preeclampsia, 20 and the trend in blood pressure over time is more important than a single isolated measurement.
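As a concrete illustration of how these combined criteria operate, the sketch below flags a single reading against both the absolute thresholds and the change-from-baseline criteria described in this chapter. It is an illustration of the published cutoffs only, not a validated clinical algorithm, and it deliberately ignores the serial-trend assessment the text emphasizes.

```python
def flag_for_evaluation(systolic, diastolic, baseline_sys, baseline_dia):
    """Flag a prenatal blood pressure reading for further evaluation.

    Combines the absolute cutoff (above 140/90 mm Hg) with the
    change-from-baseline criteria (rise of 30 mm Hg systolic or
    15 mm Hg diastolic) discussed in the text.
    """
    exceeds_absolute = systolic > 140 or diastolic > 90
    exceeds_baseline = (systolic - baseline_sys >= 30
                        or diastolic - baseline_dia >= 15)
    return exceeds_absolute or exceeds_baseline

# A reading of 138/82 in a woman whose baseline was 105/65 is flagged on
# the change-from-baseline criterion even though it is below 140/90.
print(flag_for_evaluation(138, 82, 105, 65))  # True
```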

In the middle trimester of pregnancy, the normal decline in blood pressure is often dampened or absent in women who subsequently develop preeclampsia. 6,21 Some experts therefore recommend using the middle trimester mean arterial pressure (MAP) -- defined as (systolic pressure + [2 * diastolic pressure])/3 -- as a screening test. 6 Studies indicate that a middle trimester MAP above 90 mm Hg has a sensitivity of 61-71% and a specificity of 62-74% in predicting preeclampsia, 6,22 and even higher sensitivity and specificity have been reported by some researchers. 23 Other studies report a much lower sensitivity of this test in detecting preeclampsia (22-35%) and suggest it is of little value in predicting eclampsia itself. 24 One review concluded that, due to inconsistencies in the definition of "preeclampsia" used in most of these studies (e.g., failure to require proteinuria for the diagnosis), elevations in second trimester blood pressure may be a better predictor of transient or chronic hypertension than of true preeclampsia. 25
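The MAP calculation, and the reason a positive result must be interpreted cautiously, can be made concrete with a short sketch. The sensitivity and specificity below are the mid-range values quoted above (66% and 68%); the 5% prevalence of preeclampsia is an assumed figure for illustration only.

```python
def mean_arterial_pressure(systolic, diastolic):
    """Middle trimester MAP as defined in the text."""
    return (systolic + 2 * diastolic) / 3

def positive_predictive_value(sensitivity, specificity, prevalence):
    """Standard Bayes calculation of PPV from test characteristics."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

print(mean_arterial_pressure(120, 80))              # 93.3, above the 90 mm Hg cutoff
print(positive_predictive_value(0.66, 0.68, 0.05))  # ~0.10
# Even with mid-range sensitivity and specificity, roughly 9 of every 10
# positive MAP screens would occur in women who never develop preeclampsia.
```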

Effectiveness of Early Detection

The early detection of hypertension during pregnancy permits clinical monitoring and prompt therapeutic intervention for severe preeclampsia or eclampsia. The delivery of the fetus is considered to be the most definitive method to minimize preeclamptic complications, but other measures (e.g., bed rest and pharmacologic agents) have not been conclusively shown to improve outcome. 17,26 A randomized controlled trial found that antihypertensive therapy and hospitalization, when compared with hospitalization alone, did not improve maternal or fetal outcome. 27 There have been no clinical trials to determine whether hypertensive preeclamptic women treated early in pregnancy have a better prognosis than those who are not detected early.

Clinical experience, however, suggests that early detection and treatment of preeclampsia is beneficial to the patient and fetus. 1,5,9,14 This view is based in part on inferences drawn from the apparent effectiveness of regular prenatal care in reducing the complications of preeclampsia-eclampsia. Studies conducted as early as the 1940s suggested an inverse relationship between the extent of prenatal care and the incidence of eclampsia, perhaps reflecting benefits of early detection. 28 These findings do not provide direct evidence that better outcomes are due solely to blood pressure screening itself, rather than to other components of prenatal care or to the characteristics of women who receive regular prenatal care.

Recommendations of Other Groups

The American College of Obstetricians and Gynecologists recommends blood pressure measurements at the initial visit, every 4 weeks until 28 weeks' gestation, every 2-3 weeks until 36 weeks' gestation, and weekly thereafter. 29 The Canadian Task Force on the Periodic Health Examination recommends that systolic and diastolic blood pressures be measured on all obstetric patients at the first prenatal visit and periodically throughout the remainder of pregnancy. 30 A task force report to the U.S. Public Health Service recommended blood pressure measurements at a preconception visit, at the first prenatal visit (at 6-8 weeks' gestation, ideally) and at each prenatal visit after 24 weeks until delivery. 31

Discussion

The most efficacious screening strategy for preeclampsia is the early detection of an abnormal blood pressure trend over time. Serial measurements during the second and third trimester increase the likelihood that a pathologic pattern or overt blood pressure elevation will be detected. 5,6,18,22,32 Although there is no direct proof that regular screening results in reduced maternal or perinatal morbidity and mortality, it is unlikely that a study will be conducted in which a control group does not receive blood pressure screening or treatment. Because the target condition is a common medical complication of pregnancy and the screening test is simple, inexpensive, and acceptable to patients, screening is indicated on an empirical basis.

Consistent attention should be given to using proper technique for measuring blood pressure. 4 Although the use of isolated specific blood pressure levels (e.g., above 140/90 mm Hg) has an important role in evaluating patients, more definitive data are needed to determine its positive predictive value in the diagnosis of preeclampsia. 20 Measurement of blood pressure and calculation of the MAP during the second trimester may also provide useful information prior to the development of preeclampsia-eclampsia, but more reliable data are needed to determine the positive predictive value of second trimester blood pressure and whether screening based on these criteria results in improved clinical outcome.

Several therapeutic agents are being investigated as preventive measures for preeclampsia. Aspirin prophylaxis for the prevention of preeclampsia and its complications is discussed elsewhere (see Chapter 70). Calcium supplementation is currently being evaluated. 4,5,14

CLINICAL INTERVENTION

Screening for preeclampsia with blood pressure measurement is recommended for all pregnant women at the first prenatal visit and periodically throughout the remainder of pregnancy ("B" recommendation). The optimal frequency for measuring blood pressure in pregnant women has not been determined and is left to clinical discretion; it is most efficient to measure blood pressure on women who are being seen by their clinicians for other reasons. The collection of meaningful blood pressure data requires consistent use of correct technique and a cuff of appropriate size. In addition to the guidelines listed in Chapter 3, the patient should be in the sitting position and the blood pressure should be measured after the patient's arm has rested at heart level for 5 minutes. 4 Further diagnostic evaluation and clinical monitoring, including frequent blood pressure monitoring and urine testing for protein, are indicated if blood pressure does not decrease normally during the middle trimester, if the systolic pressure increases 30 mm Hg above baseline or the diastolic pressure increases 15 mm Hg above baseline, or if the blood pressure exceeds 140/90 mm Hg. Medical interventions should not be prescribed until the diagnosis of preeclampsia is confirmed. See Chapter 70 for recommendations on the use of aspirin prophylaxis in pregnancy.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Michelle Berlin, MD, MPH, and A. Eugene Washington, MD, MSc.

38. Screening for D (Rh) Incompatibility

Burden of Suffering

D incompatibility exists when a D-negative woman is pregnant with a D-positive fetus, which occurs in up to 9-10% of pregnancies, depending on race. 1,2 If no preventive measures are taken, 0.7-1.8% of these women will become isoimmunized antenatally, developing D antibody through exposure to fetal blood; 8-17% will become isoimmunized at delivery; 3-6% after spontaneous or elective abortion; and 2-5% after amniocentesis. 1-3 In subsequent D-positive pregnancies of isoimmunized women, maternal D antibody will cross the placenta into the fetal circulation and hemolyze red cells. 1 Without treatment, 25-30% of these offspring will have some degree of hemolytic anemia and hyperbilirubinemia, and another 20-25% will be hydropic and often will die either in utero or in the neonatal period. 4
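A crude worked example shows the scale of these untreated risks. The sketch below simply chains the upper-bound percentages quoted above for a hypothetical cohort; it ignores birth order, paternal genotype, and other determinants, so it illustrates the arithmetic rather than any real population.

```python
# Hypothetical cohort; all percentages are the upper-bound figures above.
pregnancies = 10_000
at_risk = pregnancies * 0.10              # D-negative woman, D-positive fetus
isoimmunized = at_risk * (0.018 + 0.17)   # antenatal plus delivery, untreated

# In a subsequent D-positive pregnancy of an isoimmunized woman:
anemic = isoimmunized * 0.30              # hemolytic anemia/hyperbilirubinemia
hydropic = isoimmunized * 0.25            # hydrops, often fatal untreated
print(round(isoimmunized), round(anemic), round(hydropic))  # 188 56 47
```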

Since the introduction of routine postpartum prophylaxis in the 1960s, the crude incidence of D isoimmunization in the U.S. and Canada has fallen from 9.1-10.3 cases to 1.3 cases/1,000 total births. 5-9 Hemolytic disease of the fetus or newborn due to D isoimmunization (also called erythroblastosis fetalis) now accounts for only 4-5 deaths/100,000 total births, 6,10 although this may be an underestimate as early intrauterine deaths are not always reported. 10 Even before the introduction of prophylaxis, however, a decline in fetal and neonatal mortality from D hemolytic disease was occurring due to declines in both incidence and case fatality rates. It has been estimated that 30-40% of the recent decline in disease incidence is attributable to smaller family size, since the incidence of D hemolytic disease increases with increasing birth order. 11 Since the 1940s, the case fatality rate has fallen from about 50% to 2-6%. 8,9 This decline can be attributed in part to the trend toward smaller families, since the first affected infant in a family generally has less severe disease. 1,5,9 The decline has also been associated with the introduction of interventions such as amniotic fluid spectrophotometry, exchange transfusion, amniocentesis, intrauterine fetal transfusion, and improved care of both the mother and the premature erythroblastotic infant. 1,9

Accuracy of Screening Tests

Hemagglutination is the established reference standard for the determination of D blood type. 12 The indirect antiglobulin (Coombs) test (IAGT) is the reference standard for detecting anti-D antibody in women who are sensitized to D-positive blood. 12 The IAGT will also detect other maternal antibodies that may cause hemolytic disease. 13

Effectiveness of Early Detection

The early detection of D-negative blood type in the pregnant woman is of substantial benefit if the patient is not yet isoimmunized and the father is not known to be D-negative. Administration of D immunoglobulin (or Rho (D) immune globulin (human)) to an unsensitized D-negative woman after delivery of a D-positive fetus will prevent maternal isoimmunization and consequent hemolytic disease in subsequent D-positive offspring. The efficacy of D immunoglobulin prophylaxis was convincingly demonstrated in a series of controlled clinical trials in the early 1960s. 14-16 Despite a variety of minor flaws in study design, these trials showed that isoimmunization did not occur in any of the women who received a full dose of D immunoglobulin postpartum and who were unsensitized when it was administered. These findings led to the introduction of routine postpartum prophylaxis following licensure of D immunoglobulin in 1968. Time series studies have since shown a dramatic decline in the incidence of D isoimmunization, from 13-14% in the mid-1960s to 1-2% in the mid-1970s, 7,17 although as described above, at least some of this decline is probably attributable to smaller family size.

The most frequent cause of apparent failure of postpartum prophylaxis is antenatal isoimmunization, which happens in 0.7-1.8% of pregnant women at risk. 1,9,18 Although sample selection and other design features were not optimal, nonrandomized controlled trials have shown that the administration of D immunoglobulin at 28 weeks' gestation, when combined with postpartum administration, reduces the incidence of isoimmunization to <=0.2% of women at risk. 19-21
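One way to read these figures is as a number needed to treat (NNT); this framing is ours, not the cited trials', and it assumes the reported rates apply directly.

```python
# NNT for antenatal prophylaxis, using the risk range quoted above:
# antenatal isoimmunization falls from 0.7-1.8% to <=0.2% of women at risk.
residual_risk = 0.002
for baseline_risk in (0.007, 0.018):
    arr = baseline_risk - residual_risk  # absolute risk reduction
    print(f"baseline {baseline_risk:.1%}: NNT ~ {1 / arr:.0f}")
# Roughly 60-200 at-risk women treated antenatally per case prevented.
```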

Since D isoimmunization during pregnancy is caused by transplacental hemorrhage, the risk of isoimmunization increases whenever such hemorrhage is likely to occur, including after abortion, amniocentesis, chorionic villus sampling (CVS), cordocentesis, ectopic pregnancy, fetal manipulation (e.g., external version procedures) or surgery, antepartum hemorrhage, antepartum fetal death, and stillbirth. 1,22-24 Studies documenting the effectiveness of D immunoglobulin prophylaxis are available for only a few of these indications, however. In a nonrandomized trial of D immunoglobulin after amniocentesis, control D-negative women delivering D-positive infants were more likely to become isoimmunized than were those receiving D immunoglobulin (5.2% vs. 0%), although because of small numbers this difference was not statistically significant. 25 Case series describing D immunoglobulin administration after amniocentesis have demonstrated isoimmunization rates as low as 0-0.5%. 26-28 In a case series of D immunoglobulin after induced abortion, isoimmunization occurred in 0.4%, 29 compared to 2.6% among a series of patients, described by the same authors, who did not receive D immunoglobulin. 30 The preliminary results from a randomized controlled trial of D immunoglobulin after CVS showed that among D-negative women delivering D-positive infants, similar rates of isoimmunization were seen in both intervention (2.3%) and control (1.1%) groups; insufficient details are provided to ensure baseline comparability between the two groups, however. 31 D-negative women who received D immunoglobulin experienced twice as many unintended fetal losses as did controls (6.9% vs. 3.8%), but this difference was not statistically significant. Results of the completed trial confirm the preliminary findings (S. Smidt-Jensen, Rigshospitalet, Copenhagen, Denmark; personal communication, January 1995), but have not yet been published. No studies evaluating the use of D immunoglobulin after other obstetric procedures or after obstetric complications were found.

The standard postpartum dose of D immunoglobulin (300 micro-g) contains sufficient D antibodies to prevent sensitization to at least 15 mL of D-positive fetal red blood cells (RBCs), or approximately 30 mL of fetal blood; 32 a "minidose" (50 micro-g) prevents sensitization to 2.5 mL of D-positive fetal RBCs. For women with transplacental hemorrhages >30 mL of fetal blood, the risk of D isoimmunization developing after the full postpartum D immunoglobulin dose is 30-35%. 3,33 The incidence of fetal-maternal hemorrhage >30 mL is 0.1-0.7% for all D-negative pregnancies, 1,33,34 but it is 1.7-2.5% after complicated vaginal and cesarean deliveries, 34,35 and 4.5% after stillbirth. 1 There are several available methods for detecting excess fetomaternal hemorrhage. Acid elution (Kleihauer-Betke) is both sensitive and specific when done correctly, 1,36 but it is subject to substantial laboratory and technologist error. 36 Flow cytometry is also highly sensitive and specific, but it is technically difficult to perform. 36 The erythrocyte rosette test is simple to perform and highly sensitive (99-100%) for the presence of >=15 mL of D-positive fetal RBCs, 1,36 but its specificity is low, 36,37 so positive results must be confirmed by more specific tests such as acid elution and flow cytometry. 1,36
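The coverage figures above imply a simple dosing calculation for hemorrhages larger than 30 mL. The sketch below is illustrative arithmetic only, assuming one full 300 micro-g dose per 30 mL of fetal whole blood; actual dosing should follow product labeling and local protocols.

```python
import math

def full_doses_needed(fetal_whole_blood_ml):
    """300 micro-g doses to cover an estimated fetomaternal hemorrhage,
    assuming each dose covers 30 mL of fetal whole blood (15 mL of RBCs)."""
    return max(1, math.ceil(fetal_whole_blood_ml / 30))

# A 90-mL hemorrhage, three times the coverage of a single dose, would
# call for roughly three full doses under this simple rule.
print(full_doses_needed(90))  # 3
```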

In clinical practice, combined antenatal and postnatal prophylaxis will prevent isoimmunization in 96% of women at risk. 21 The remaining cases are due to failure to give D immunoglobulin when indicated, isoimmunization that occurred before the widespread availability of D immunoglobulin, administration of an insufficient dose, or treatment failure (i.e., isoimmunization occurring before 28 weeks or transplacental hemorrhage too large or too late in pregnancy to be prevented by the standard antepartum dose). 3,38,39 Human error causes 22-50% of these cases. 6,21,39 While clinicians almost always administer D immunoglobulin postpartum or after induced abortion, administration rates have been documented to be lower for other obstetric procedures and complications: 81-88% after spontaneous abortion, 36-60% after ectopic pregnancy, 31% after antepartum hemorrhage, and 14% after amniocentesis. 2,40,41

D immunoglobulin has few adverse effects. 1,42 Some fetuses will become weakly direct antiglobulin-positive following antenatal administration, but resulting anemia and hyperbilirubinemia in the newborn are very rare. 19 All plasma for D immunoglobulin production is screened for infectious diseases as required by the Food and Drug Administration; no cases of human immunodeficiency virus (HIV) infection from D immunoglobulin have been reported. 43 The evidence is therefore compelling that early detection and prophylaxis of the unsensitized D-negative woman is both safe and effective in preventing isoimmunization and thus in preventing D hemolytic disease.

Early detection is also beneficial for D-negative women who are already isoimmunized and are carrying D-positive offspring, because early intervention may improve clinical outcome. Decisions to intervene depend on the validity of screening tests in predicting the degree of fetal anemia. Obstetric history, maternal antibody titers, and ultrasound are currently used to determine the need for more invasive tests during isoimmunized pregnancies, but in the absence of hydrops none of these reliably distinguishes mild from severe hemolytic disease. 1,4,22,44 Immunologic tests on maternal serum show promise in predicting disease severity. 1,45,46 In the third trimester, serial amniotic fluid spectrophotometry has been found to correctly predict disease severity (i.e., cord hemoglobin and need for neonatal therapy) in 94-99% of cases. 47,48 In the second trimester, however, this test has insufficient sensitivity or specificity for predicting the need for intervention. 4,49,50 Determination of fetal hemoglobin and D blood type by ultrasound-guided cordocentesis, which can be performed in the second trimester, quantifies the degree of anemia, can be followed by transfusion if indicated, and allows referral of those with D-negative babies to routine care. 1,4 Case series, however, have demonstrated complication rates of 2-7% and procedure-related fetal mortality rates of 0.5-1%. 23,51,52 DNA amplification in amniotic cells and chorionic villus samples appears to be effective in determining fetal D blood type early in pregnancy, without the risk associated with invading the fetomaternal circulation. 53

In the presence of severe fetal anemia, early intervention appears to offer substantial improvement in clinical outcome. Current perinatal survival after ultrasound-guided intravascular transfusion at experienced centers is 62-86% for hydropic fetuses and >90% for those without hydrops. 4,54,55 Once pulmonary maturity is established, the fetus can be delivered early and exchange transfusion performed with only 1% mortality risk. 56

Recommendations of Other Groups

The American College of Obstetricians and Gynecologists (ACOG) 22 and the U.S. Public Health Service Expert Panel on the Content of Prenatal Care 57 recommend D blood typing and antibody screening at the first prenatal visit and repeat D antibody screening at 24-28 weeks of pregnancy for D-negative women. Both groups recommend offering D immunoglobulin to all unsensitized D-negative women at 28 weeks of gestation, and to those at increased risk of sensitization because of delivery of a D-positive infant, antepartum hemorrhage, spontaneous or induced abortion, amniocentesis, external version procedures, or ectopic pregnancy, within 72 hours of the event. 22,57 ACOG also recommends D immunoglobulin administration to unsensitized D-negative women who have CVS, cordocentesis, antepartum fetal death, fetal surgery, or transfusion of D-positive blood products. 22 ACOG recommends measuring fetal blood cell levels in the mother when antepartum placental hemorrhage occurs. 22 The Canadian Task Force on the Periodic Health Examination recommends D blood typing and antibody screening at the first prenatal visit, before elective procedures such as amniocentesis and therapeutic abortion in which there is the possibility of a fetal bleed, between 24 and 28 weeks if the mother is D-negative, and within 72 hours of delivery. They recommend administration of D immunoglobulin to unsensitized women at 28 weeks and postpartum, and after amniocentesis or induced abortion. 58

Discussion

Although the burden of suffering from this disease is now low, the incidence was at least 10/1,000 live births before the introduction of preventive measures in the 1960s. 9 There is excellent evidence for the efficacy and effectiveness of blood typing, anti-D antibody screening, and postpartum D immunoglobulin prophylaxis. Although antepartum prophylaxis offers some additional benefit, some critics argue that the total impact of antepartum prophylaxis on the incidence of D disease is relatively small, making it approximately 16 times less cost-effective than a program consisting only of postpartum treatment. 2,59,60 Other studies support the cost-effectiveness of antepartum prophylaxis. 21,61 The cost-effectiveness of D immunoglobulin after obstetric procedures and complications is unknown.

CLINICAL INTERVENTION

D blood typing and antibody testing is recommended for all pregnant women at their first prenatal visit, including visits for elective abortion ("A" recommendation). For purposes of blood typing and prophylaxis, Du- and D-negative blood types should be considered equivalent. 22 Unless the father is known to be D-negative, a repeat D antibody test is recommended for all unsensitized D-negative women at 24-28 weeks' gestation, followed by the administration of a full (300 micro-g) dose of D immunoglobulin if they are antibody-negative ("B" recommendation). If a D- (or Du-) positive infant is delivered, the dose should be repeated postpartum, preferably within 72 hours after delivery ("A" recommendation). Unless the father is known to be D-negative, a full dose of D immunoglobulin is recommended for all unsensitized D-negative women after elective abortion (50 micro-g before 13 weeks) and amniocentesis ("B" recommendation). There is currently insufficient evidence to recommend for or against the routine administration of D immunoglobulin after other obstetric procedures or complications such as chorionic villus sampling, ectopic pregnancy termination, cordocentesis, fetal surgery or manipulation (including external version), antepartum placental hemorrhage, antepartum fetal death, and stillbirth ("C" recommendation).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Carolyn DiGuiseppi, MD, MPH.

39. Intrapartum Electronic Fetal Monitoring

Burden of Suffering

Intrapartum fetal asphyxia is an important cause of stillbirth and neonatal death. In the U.S. in 1993, an estimated 700 infant deaths (17.3/100,000 live births) were attributed to intrauterine hypoxia and birth asphyxia. 1 Some neonates with intrauterine hypoxia require resuscitation and other aggressive medical interventions for such complications as acidosis and seizures. Asphyxia has also been implicated as a cause of cerebral palsy, although most cases of cerebral palsy occur in persons without evidence of birth asphyxia or other intrapartum events. 2-5 Most fetuses tolerate intrauterine hypoxia during labor and are delivered without complications, but assessments suggesting fetal distress are associated with an increased likelihood of cesarean delivery (63% compared to 23% for all births). 6 The exact incidence of fetal distress is uncertain; a rate of 42.9/1,000 live births was reported from 1991 U.S. birth certificate data, with the highest rates in infants born to mothers under age 20 or over age 40, and in blacks. 7

Accuracy of the Screening Test

The principal screening technique for fetal distress and hypoxia during labor is the measurement of fetal heart rate. Abnormal decelerations in fetal heart rate and decreased beat-to-beat variability during uterine contractions are considered to be suggestive of fetal distress. The detection of these patterns during monitoring by auscultation or during electronic monitoring (cardiotocography) increases the likelihood that the fetus is in distress, but the patterns are not diagnostic. In addition, normal or equivocal heart rate patterns do not exclude the diagnosis of fetal distress. 5 Precise information on the frequency of false-negative and false-positive results is lacking, however, due in large part to the absence of an accepted definition of fetal distress. 8,9 For many years, acidosis and hypoxemia as determined by fetal scalp blood pH were used for this purpose in research and clinical practice, but it is now clear that neither finding is diagnostic of fetal distress. 5,10-12

Electronic fetal heart rate monitoring can detect at least some cases of fetal distress, and it is often used for routine monitoring of women in labor. In 1991, the reported rate of electronic fetal monitoring in the U.S. was 755/1,000 live births. 7 The published performance characteristics of this technology, derived largely from research at major academic centers, may overestimate the accuracy that can be expected when this test is performed for routine screening in typical community settings. Two factors in particular that may limit the accuracy and reliability achievable in actual practice are the method used to measure fetal heart activity and the variability associated with cardiotocogram interpretations.

The measurement of fetal heart activity is performed most accurately by attaching an electrode directly to the fetal scalp, an invasive procedure requiring amniotomy and associated with occasional complications. This has been the technique used in most clinical trials of electronic fetal monitoring. Other noninvasive techniques of monitoring fetal heart rate, which include external Doppler ultrasound and periodic auscultation of heart sounds by clinicians, are more appropriate for widespread screening but provide less precise data than the direct electrocardiogram using a fetal scalp electrode. In studies comparing external ultrasound with the direct electrocardiogram, about 20-25% of tracings differed by at least 5 beats per minute. 13,14

A second factor influencing the reliability of widespread fetal heart rate monitoring is inconsistency in interpreting results. Several studies have documented significant intra- and interobserver variation in assessing cardiotocograms even when tracings are read by experts in electronic fetal monitoring. 15-17 It would be expected that routine performance of electronic monitoring in the community setting with interpretations by less experienced clinicians would generate a higher proportion of inaccurate results and potentially unnecessary interventions than has been observed in the published work of major research centers.

Effectiveness of Early Detection

A potentially more important issue is whether electronic evidence of fetal distress during labor results in benefit to either the fetus or mother. Observational studies in the 1960s and 1970s suggested that electronic fetal monitoring during labor reduced the risk of intrapartum stillbirth, neonatal death, and developmental disability, but methodologic problems in these largely retrospective studies left the issue unsettled. 4,8 Ten randomized controlled trials and four meta-analyses of electronic fetal monitoring have since been published, all of which compared electronic monitoring, with or without fetal scalp blood sampling, to active clinical monitoring including intermittent auscultation by trained personnel. Three trials in low-risk women, 18-20 the largest of which involved nearly 13,000 patients, 18 compared continuous electronic monitoring to intermittent auscultation; where described, auscultation was performed at least every 15 minutes during the first stage of labor 18,20 and between each contraction during the second stage. 20 Two trials included scalp blood sampling. 19,20 These trials found no significant differences between the study groups in intrapartum or perinatal deaths, maternal or neonatal morbidity, Apgar scores, umbilical cord blood gases, the need for assisted ventilation, or admission to the special care nursery. The results of one of these trials 19 may have been biased by the method of randomization, however, which resulted in a large disparity in the distribution of primigravidae between the study groups. Similarly, no differences in clinical outcomes were reported in a subgroup analysis of low-risk women enrolled in a prospective study of nearly 35,000 pregnancies in which routine monitoring was compared with selective monitoring of high-risk pregnancies. 21,22 A controlled trial 23 that assigned intervention by week of admission also reported no effect of electronic fetal monitoring on low Apgar scores, admissions to special care nurseries, or neonatal infection. A trial from Greece carried out in predominantly low-risk pregnant women found no differences in most neonatal outcome measures, but reported a significant reduction in perinatal mortality rates (2.6 compared to 13/1,000 total births). 24 This study may not be generalizable to the U.S., however, given higher perinatal mortality and substantially lower cesarean delivery rates (<10%) than are typical in the U.S. In addition, the method of randomization and the large disparity in numbers between study and control group (746 vs. 682 women) raise the possibility of biased randomization.

The potential benefits of electronic fetal monitoring during labor have also been examined in high-risk pregnancies. Four clinical trials in developed countries found that electronic fetal heart rate monitoring in high-risk pregnancies, with or without scalp blood sampling, was of limited benefit when compared with intermittent auscultation during labor. 25-28 Neonatal death, Apgar scores, cord blood gases, and neonatal nursery morbidity were unchanged in three of the trials, 26-28 all of which performed intermittent auscultation systematically in control women: every 15 minutes in the first stage of labor and every 5 minutes in the second stage. The fourth trial found that continuous monitoring was associated with improved umbilical cord blood gases and neurologic symptoms and signs, and decreased need for intensive care. 25 This study has been criticized, however, because monitoring techniques in the control group were poorly described and one physician withdrew his patients from the control group after the trial began. 8,29 Results from a fifth trial in high-risk pregnant women in Zimbabwe are unlikely to be applicable to obstetric care in the U.S. 30

Meta-analyses 31-33 that included all but the two most recently published randomized controlled trials 24,30 cited above reported no effect of electronic fetal monitoring on low Apgar scores, admissions to special care nurseries, or neonatal infection. With electronic fetal monitoring combined with scalp blood sampling, the relative risk of intrapartum death was 0.81 (95% confidence interval, 0.22 to 2.98) and of perinatal death was 0.98 (95% confidence interval, 0.58 to 1.64) when compared to intermittent auscultation. Relative risk of perinatal mortality when electronic fetal monitoring without blood sampling was used was 1.94 (95% confidence interval, 0.2 to 18.62). A meta-analysis of all trials from developed countries also reported no significant effect on overall perinatal mortality (typical odds ratio 0.87; 95% confidence interval, 0.57 to 1.33). 33a The confidence intervals around these point estimates of the risk of perinatal death are wide, indicating that sample size is insufficient to exclude the possibility of clinically important increases or declines in mortality. One meta-analysis reported a significant reduction in perinatal mortality due to fetal hypoxia, but the method for attributing deaths to hypoxia was not standardized. 33a The results appeared to be strongly influenced by the inclusion of one trial with questionable randomization methods and generalizability to the U.S. (see above); 24 a sensitivity analysis to examine the effect of excluding this trial from the meta-analysis was not reported.
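The quoted relative risks and intervals follow the standard log-scale method for a 2 x 2 comparison. The counts in the sketch below are hypothetical, chosen only to show why such rare outcomes produce wide intervals; they are not the actual meta-analysis data.

```python
import math

def relative_risk_with_ci(a, n1, b, n2):
    """RR of an event (a/n1 vs. b/n2) with a 95% CI on the log scale."""
    rr = (a / n1) / (b / n2)
    se = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n2)
    low = math.exp(math.log(rr) - 1.96 * se)
    high = math.exp(math.log(rr) + 1.96 * se)
    return rr, low, high

# Hypothetical: 4 vs. 5 intrapartum deaths in two arms of 7,000 women each.
rr, low, high = relative_risk_with_ci(4, 7000, 5, 7000)
print(f"RR {rr:.2f} (95% CI, {low:.2f} to {high:.2f})")
# With so few events, the interval spans from a large benefit to a large
# harm -- the same pattern as the estimates quoted above.
```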

Although most outcome measures in these studies were not influenced by electronic fetal monitoring, there is evidence that it reduces the incidence of neonatal seizures. This was suggested in early research 25,34 and confirmed in the Dublin trial of low-risk women. 20 This study reported a statistically significant reduction in the rate of neonatal seizures when continuous intrapartum fetal monitoring was compared with intermittent auscultation. Secondary analysis suggested that the reduced risk was limited to labors that were prolonged or induced or augmented with oxytocin. In a meta-analysis of the controlled trials that included scalp blood sampling as an adjunct, the odds of neonatal seizures were reduced by about one half with electronic monitoring. 31 A separate meta-analysis found no effect of electronic monitoring on neonatal seizures when no scalp blood sampling was performed, 32 raising the possibility that the benefit may have been due to the blood sampling rather than the electronic monitoring. What also remains unclear is the extent to which infants benefit from the prevention of neonatal seizures by monitoring. Seizures have been viewed by many as a poor prognostic indicator; in the Dublin trial, death occurred in 23% of the babies who experienced seizures, and autopsy confirmed that at least two thirds of these deaths were due to asphyxia during labor. 20 There are few prospective data on whether the prevention of neonatal seizures reduces the risk of neonatal death or long-term neurologic sequelae. The neonatal seizures prevented by electronic monitoring may not be those associated with long-term impairment. 20,31 At 4-year follow-up of survivors after seizures in the Dublin trial, the total number and rate with cerebral palsy (n = 3 and 0.5/1,000 enrolled subjects) were identical in the monitored and control groups. 35

None of the three trials reporting longer term follow-up found that electronic fetal monitoring improved neurologic or developmental outcomes. A follow-up study of the growth and development at 9 months of age of infants involved in the second Denver trial 27 failed to show any long-term benefits of electronic fetal monitoring; the direction of the effect on mental and psychomotor development scores suggested increased risk in the monitored group. 36 In the Dublin trial, 20 the overall rates of cerebral palsy at 4-year follow-up were 1.8/1,000 in the electronically monitored group and 1.5/1,000 in the auscultation group. 35 Eighteen-month follow-up in a trial in high-risk women 28 revealed little difference in mean mental or psychomotor development scores on the Bayley Scales, but cerebral palsy and low mental development scores were both significantly more common in the electronically monitored group. 37 Cerebral palsy was associated with an increased duration of abnormal fetal heart rate patterns and time to delivery after diagnosis of such patterns in the electronically monitored group. Meta-analyses combining these three studies confirm little benefit from monitoring on adverse neurologic outcomes. 31,32

Any potential benefit of intrapartum monitoring must be weighed against the potential risks associated with both diagnostic procedures and operative interventions for fetal distress. The insertion of fetal scalp electrodes, for example, is generally a safe procedure, but it may occasionally cause umbilical cord prolapse or infection due to early amniotomy; electrode or pressure catheter trauma to the eye, fetal vessels, umbilical cord, or placenta; and scalp infections with Herpes hominis type 2 or group B streptococcus. 10 Concerns have also been raised about the potential for enhancing transmission of human immunodeficiency virus (HIV) infection by the use of scalp electrodes. 38 Meta-analysis of randomized controlled trials indicates no increased risk of neonatal infection from electronic fetal monitoring compared to intermittent auscultation. 33 Perhaps the most important complication of intrapartum electronic fetal monitoring is the increased performance of cesarean delivery, an operation associated with maternal and neonatal morbidity and a small but measurable operative mortality. 39,40 Fetal distress is a common indication for cesarean delivery, and all trials showed a higher cesarean delivery rate in the electronically monitored group. The randomized controlled trials from the 1970s reported that cesarean delivery was performed significantly more frequently in association with electronic fetal monitoring. 18,19,25-27 In recent years, an effort has been made to lower the frequency of cesarean delivery, and four of five trials carried out in developed countries in the 1980s or 1990s reported no significant increase in the overall cesarean delivery rate with electronic fetal monitoring. 20,23,24,28 A fifth trial, comparing routine to selective electronic monitoring, reported a very small increase that was statistically but not clinically significant. 21 On the other hand, operative vaginal (e.g., forceps) deliveries were significantly increased in the newer trials, 20,23,24 suggesting an inverse relationship between cesarean and operative vaginal delivery. The meta-analyses 31,32,33a previously cited reported a 1.3- to 2.7-fold increased likelihood of cesarean delivery and a 2.0- to 4.1-fold increased likelihood of cesarean delivery for fetal distress with continuous electronic fetal monitoring, with lower rates in the meta-analysis of studies that used scalp blood sampling. The likelihood of any operative delivery was increased by about 30% with electronic fetal monitoring. The meta-analyses also reported higher rates of both maternal infection and general anesthesia with electronic monitoring, presumably secondary to the higher rates of operative delivery. 31,32 Electronic monitoring may also have adverse psychological effects. In a comparison of subsamples from the randomized groups in one trial, women who had electronic fetal monitoring reported an increased likelihood of feeling "too restricted" during labor and were also more likely to report feeling left alone, although the latter difference was of only borderline significance. 41 On the other hand, in a subsample from a different trial, there were no differences between women in the two groups in their assessment of their monitoring experience, medical or nursing support, or the labor or delivery experience. 42

Recommendations of Other Groups

The American College of Obstetricians and Gynecologists states that all patients in labor need some form of fetal monitoring, with more intensified monitoring indicated in high-risk pregnancies; the choice of technique (electronic fetal monitoring or intermittent auscultation) is based on various factors, including the resources available. 43 The Canadian Task Force on the Periodic Health Examination advises against routine electronic fetal monitoring in normal pregnancies but found poor evidence regarding the inclusion or exclusion of its routine use in high-risk pregnancies. 44

Discussion

Electronic fetal monitoring has become an accepted standard of care in many settings in the U.S. for the management of labor. 4 Birth certificate data suggest that this technology was used in about three fourths of all live births in 1991; 7 in certain academic centers the rate may be as high as 86-100%. 4 As discussed above, there are important questions regarding the definition of fetal distress, as well as about the accuracy and reliability of electronic fetal monitoring in discriminating between pregnancies with and without this disorder. It is also unclear whether the use of this technology results in significantly improved outcome for the baby when compared to active clinical monitoring. Adequately conducted trials generalizable to obstetric care in the U.S. have not reported a reduction in perinatal mortality, although sample sizes are not adequate to exclude a benefit. Evidence does support a reduced risk of neonatal seizures, but the benefit was mainly seen in women with complicated labors (i.e., induced, augmented with oxytocin, or prolonged), and it is not clear that there are long-term adverse effects associated with the types of seizures prevented. Follow-up of study subjects at 9 months to 4 years of age has not revealed any long-term neurologic benefits from electronic monitoring. If anything, effect estimates suggest an increased risk of cerebral palsy and low developmental scores in electronically monitored infants, possibly due to false reassurance and consequent delayed intervention.

In addition to the maternal risks associated with electronic fetal monitoring, including increased rates of cesarean or operative vaginal (e.g., forceps) delivery, general anesthesia and maternal infection, and the possible increased risk of adverse neonatal neurologic outcome, increased use of this technology is associated with increased costs of labor care. The widespread use of electronic fetal monitoring in low-risk pregnancies in the face of uncertain benefits, and certain maternal risks and costs, has been attributed to concerns about litigation. 8,45 It has been estimated that nearly 40% of all obstetric malpractice losses are due to fetal monitoring problems, 46 and this may be a major motivating factor behind the widespread use of electronic fetal monitoring during labor.

CLINICAL INTERVENTION

Routine electronic fetal monitoring is not recommended for low-risk women in labor when adequate clinical monitoring including intermittent auscultation by trained staff is available ("D" recommendation). There is insufficient evidence to recommend for or against electronic fetal monitoring over intermittent auscultation for high-risk pregnancies ("C" recommendation). For pregnant women with complicated labor (i.e., induced, prolonged, or oxytocin augmented), recommendations for electronic monitoring plus scalp blood sampling may be made on the basis of evidence for a reduced risk of neonatal seizures, although the long-term neurologic benefit to the neonate is unclear and must be weighed against the increased risk to the mother and neonate of operative delivery, general anesthesia, and maternal infection, and a possible increased risk of adverse neurologic outcome in the infant. There is currently no evidence available to evaluate electronic fetal monitoring in comparison to no monitoring.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Carolyn DiGuiseppi, MD, MPH, based in part on materials prepared by Geoffrey Anderson, MD, PhD, for the Canadian Task Force on the Periodic Health Examination.

40. Home Uterine Activity Monitoring

Burden of Suffering

Preterm birth is a leading cause of perinatal morbidity and mortality in the U.S. Preterm neonates account for at least half of the mortality and morbidity among newborns without congenital anomalies. 1 These conditions represent a leading cause of years of potential life lost before age 65. 2 Preterm births generate large societal costs in providing neonatal intensive care and long-term treatment for complications. 3 Both primary and secondary preventive measures have been proposed for the prevention of prematurity. Primary prevention includes efforts to reduce risk factors for prematurity, such as cessation of tobacco, alcohol, and other drug use, and programs to improve nutrition, socioeconomic conditions, and prenatal care. Secondary prevention involves the early detection and treatment of preterm labor.

Accuracy of Screening Tests

The rationale behind screening for preterm labor is the assumption that the risk of preterm birth can be reduced significantly by the prompt initiation of treatment (e.g., rest, hydration, tocolytic therapy). These measures are of potential value in prolonging pregnancy only in those cases involving idiopathic preterm labor and not in medically indicated preterm births (e.g., preterm rupture of membranes, antepartum hemorrhage, fetal distress). Tocolytic medications are generally ineffective after substantial cervical dilation (>2-3 cm) and effacement have occurred. Because patients and physicians may have difficulty in recognizing the early signs of preterm labor, many patients arrive at the hospital with advanced cervical dilation and effacement and/or with ruptured membranes. Such delays in detection are thought to limit the effectiveness of tocolysis.

Screening for earlier detection of preterm labor has therefore been proposed. The principal screening tests are self-palpation and tocodynamometry. Programs to improve the early detection of preterm labor have centered on educating women about the symptoms of preterm labor and on teaching self-palpation to help them detect the increasing rate of uterine contractions that often precedes preterm labor. 4 Studies of these measures have produced mixed results, and some have shown that self-palpation has poor sensitivity in detecting preterm labor. One study reported that only 15% of contractions were detected by patients and that fewer than 11% of pregnant women were able to identify half of their recorded contractions. 5

Although tocodynamometry is usually performed in the hospital, home uterine activity monitoring (HUAM) has been advocated as an ambulatory screening test for preterm labor in high-risk women. The home tocodynamometer consists of a pressure sensor that is held against the abdomen by a belt and a recording/storage device that is carried on a belt or hung from the shoulder. Uterine activity is typically recorded by the patient for 1 hour, twice daily, while she performs routine activities. The stored data are transmitted via telephone to the practitioner's office, where a receiving device prints out the data. Patients are often contacted by, or have access to, personnel who can address monitoring problems.

The sensitivity and specificity of HUAM are uncertain, due to lack of data and the absence of a reference standard. External tocodynamometers, whether in the hospital or home, can produce inconsistent wave amplitudes when measuring uterine contractions, depending on the location of the instrument, the tension on the belt, thickness of adipose tissue, and other factors. Contractions of mild intensity can be confused with background noise. Studies suggest that HUAM performs similarly to monitoring devices used in the hospital, detecting 1.1-2.2 contractions for every contraction detected by conventional devices, 6,7 and there is good correlation between HUAM results and contractions detected by intrauterine pressure catheters. 8 There appears to be substantial variation among physicians in the interpretation of tocodynamometry tracings. 7,9

Effectiveness of Early Detection

A nonrandomized observational study 10 and six randomized controlled trials 11-18 of women at risk for preterm labor have compared birth outcomes with and without the use of HUAM. Three trials 13,15,17 found no significant effect on the incidence of preterm birth or low birth weight, but sample size may have been inadequate to detect a difference. An observational study 10 and four other trials 14-16,18 reported a significant reduction in the incidence of preterm birth, neonatal morbidity and mortality, or low birth weight in pregnancies monitored by HUAM. In four of these studies, 10,14,16,18 HUAM-monitored women received more intensive nursing or telemetry personnel contact than did women in the control groups, making it unclear whether it was the device or the nursing contact that was responsible for the improved outcome. Overall, the studies showing benefit also lacked randomization, had high attrition and exclusion rates, or suffered from other design limitations.

Four studies 10,11,14,16 found that HUAM-monitored women were less likely to experience preterm cervical dilation, effacement, or ruptured membranes and were more likely to be eligible for long-term tocolysis. Reported reductions in these surrogate measures, however, are of uncertain value in inferring an effect on clinical outcomes. Moreover, the overall evidence does not clearly establish that HUAM itself, as distinct from the frequent provider contact that typically accompanies it, produces better outcomes than standard care.

There are no known direct adverse effects from HUAM. The technology involves some inconvenience, and surveys suggest that some women reject the device because of its impact on their lifestyle. 19 There is little evidence of other adverse effects. Available data suggest that HUAM-monitored women make at most one additional physician visit per pregnancy compared with unmonitored women. 10 Another theoretical adverse effect is unnecessary hospitalization or administration of tocolytic drugs to women who have abnormal home tocodynamometry data but are not in preterm labor. Objective evidence regarding the incidence of this problem is unavailable.

Recommendations of Other Groups

In 1989 20 and again in 1992, 21 the American College of Obstetricians and Gynecologists concluded that HUAM should remain investigational and should not be recommended for routine clinical use. That position was maintained in a recent technical bulletin. 21a In 1989, the National Institute of Child Health and Human Development concluded that the existing evidence was not convincing that HUAM, independent of vigorous nursing support and other interventions, was effective in assessing the risk of preterm labor or in preventing preterm birth. 22 In a 1989 survey, 86% of the experts on an American Medical Association Diagnostic and Therapeutic Technology Assessment panel concluded that the effectiveness of HUAM was investigational, indeterminate, or unacceptable. 23 In 1991, the Food and Drug Administration licensed the marketing of a HUAM device for women who have had a previous preterm delivery. 24 A 1992 technology assessment by the Agency for Health Care Policy and Research concluded that current data did not support widespread use of HUAM or suggest its superiority over other methods for reducing the incidence of preterm births. 25

Discussion

The cost implications of HUAM are potentially great but have been incompletely evaluated in published research. Some studies have reported that average charges for HUAM-monitored women are $5,000-$11,000 lower than those for unmonitored women, presumably because of savings achieved by reduced neonatal intensive care. 26-28 The cost-effectiveness of HUAM cannot fully be determined, however, until its clinical effectiveness has been demonstrated. Moreover, it remains unclear whether the money, personnel, and professional time required to provide this technology would divert resources from other potentially effective measures for the primary prevention of preterm births.

CLINICAL INTERVENTION

There is insufficient evidence to recommend for or against HUAM as a screening test for preterm labor in high-risk pregnancies (pregnancies with risk factors for preterm labor), but recommendations against its use may be made on other grounds, including its costs and inconvenience ("C" recommendation). HUAM is not recommended for normal-risk pregnancies (without risk factors for preterm labor) ("D" recommendation).

Note: See also the U.S. Preventive Services Task Force background paper on this topic: U.S. Preventive Services Task Force. Home uterine activity monitoring for preterm labor. JAMA 1993;270:369-376.

The draft of this chapter was prepared for the U.S. Preventive Services Task Force by Steven H. Woolf, MD, MPH, and Douglas B. Kamerow, MD, MPH.

41. Screening for Down Syndrome

Burden of Suffering

Down syndrome, a congenital syndrome caused by trisomy of all or part of chromosome 21, is the most common chromosome abnormality. 1 Population-based surveillance programs have reported a Down syndrome birth prevalence of 0.9/1,000 live births. 2 The incidence of Down syndrome is higher than the birth prevalence, however, since many fetuses are spontaneously aborted, some are recognized in utero and electively aborted, and some cases are not recognized at birth. Affected children are characterized by physical abnormalities that include congenital heart defects and other dysmorphisms, and varying degrees of mental and growth retardation. Although there are therapies for some of the specific malformations associated with Down syndrome, there are no proven therapies available for the cognitive deficits. Life expectancy for infants born with Down syndrome is substantially lower than that of the general population. 3 Based on 1988 cross-sectional data, the lifetime economic costs of Down syndrome have been estimated to be $410,000 per case. 4

The risk for Down syndrome and certain other chromosome anomalies increases substantially with advancing maternal age. 1,5-10 Parents carrying chromosome-21 rearrangements are also at an increased risk of Down syndrome pregnancies, 11-13 with the risk being much higher if the mother carries the rearrangement than if the father does. Also at higher risk are those who have previously had an affected pregnancy, independent of advancing maternal age and chromosome rearrangements. 14,15

Accuracy of Screening Tests

Down syndrome is diagnosed prenatally by determining karyotype in fetal cell samples obtained by amniocentesis or chorionic villus sampling (CVS). Because of their invasiveness, risks, and cost, these procedures are generally reserved for women identified as high-risk either by history (i.e., advanced maternal age, prior affected pregnancy, known chromosome rearrangement) or by screening maneuvers (e.g., serum markers, ultrasound). Chromosome analysis of fetal cells obtained by second-trimester amniocentesis has been demonstrated to be accurate and reliable for prenatal diagnosis of Down syndrome in a randomized controlled trial and several cohort studies. 16-19 CVS, a technique for obtaining trophoblastic tissue, is an alternative to amniocentesis for detecting chromosome anomalies. The advantages of this procedure include the ability to perform karyotyping as early as 10-12 weeks and more rapid cytogenetic analysis. Potential disadvantages of CVS include apparent discrepancies between the karyotypes of the villi and the fetus due to maternal cell contamination or placental mosaicism, and failure to obtain an adequate specimen, resulting in a repeat procedure (usually amniocentesis) in up to 5% of tested women. 20-22 In randomized controlled trials 20-22 and cohort studies 23-29 comparing CVS to amniocentesis, accurate prenatal diagnosis has been obtained in over 99% of high-risk women when CVS is accompanied by both direct and culture methods of cytogenetic examination and when amniocentesis is provided to clarify CVS diagnoses of mosaicism or unusual aneuploidy. Transabdominal CVS has been reported to have comparable accuracy to transcervical CVS in randomized controlled trials. 20,30,31 First-trimester amniocentesis (at 10-13 weeks) has been compared to CVS in one randomized controlled trial. 32 Success rates were the same for the two procedures (97.5%); early amniocentesis failures were primarily due to failed culture. First- and second-trimester amniocentesis have not been directly compared in controlled trials.

For low-risk women, the risks associated with prenatal diagnostic testing (see Adverse Effects of Screening and Early Detection, below) are generally considered to outweigh the potential benefits because of the low likelihood of diagnosing a Down syndrome gestation. If screening tests, such as measurement of maternal serum markers or ultrasound imaging, can identify women who are at high risk for carrying a Down syndrome fetus, the relative benefit of prenatal diagnostic testing increases, potentially justifying the more invasive diagnostic procedures. Reduced levels of maternal serum alpha-fetoprotein (MSAFP) and unconjugated estriol, and elevated levels of human chorionic gonadotropin (hCG), have each been associated with Down syndrome gestations. Intervention studies of screening have not been carried out with unconjugated estriol alone, while cohort intervention studies evaluating MSAFP and hCG have found them to have relatively poor discriminatory power as individual tests. 33-36 Multiple-marker screening uses results from two or three individual maternal serum marker tests, combined with maternal age, to calculate the risk of Down syndrome in the current gestation. 37,38 Amniocentesis and diagnostic chromosome studies are then offered to women whose screening test results suggest a high risk of Down syndrome, with high risk often defined as the same or greater risk of an affected pregnancy as that of a 35-year-old woman (i.e., 1 in 270).
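The risk calculation itself is Bayesian: the age-related risk supplies the prior odds, and the marker profile modifies them. The following minimal sketch illustrates the arithmetic; the likelihood ratio used is a hypothetical value chosen for illustration, not one drawn from any published protocol.

    # Illustrative sketch of how multiple-marker screening combines an
    # age-related prior risk of Down syndrome with the serum-marker
    # results, summarized here as a single likelihood ratio (LR).
    # The LR value is hypothetical; real programs derive it from the
    # joint distribution of MSAFP, hCG, and estriol levels.

    def posterior_risk(prior_risk, likelihood_ratio):
        """Convert risk to odds, apply the LR, convert back to risk."""
        prior_odds = prior_risk / (1.0 - prior_risk)
        posterior_odds = prior_odds * likelihood_ratio
        return posterior_odds / (1.0 + posterior_odds)

    age_risk = 1.0 / 1000.0  # second-trimester risk for a young woman
    marker_lr = 4.0          # hypothetical LR from the marker profile

    risk = posterior_risk(age_risk, marker_lr)
    print(f"post-test risk: 1 in {round(1 / risk)}")
    # prints "post-test risk: 1 in 251", i.e., a risk exceeding the
    # common 1 in 270 threshold, so this result would screen positive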

Six interventional cohort studies that analyzed low-risk women younger than 35 years, 39-41 36 years, 42 37 years, 43 or 38 years, 44 and six that included women of any age desiring screening (90-95% <=35 years), 45-50 have evaluated the proportion of Down syndrome pregnancies identified through double-marker (hCG and either MSAFP or estriol) or triple-marker screening in the midtrimester compared to the total number of such pregnancies identified. Interpretation of sensitivity is affected by incomplete ascertainment of karyotype and incomplete diagnosis at birth in these studies, although most had active surveillance systems for Down syndrome cases born to screened women. The reported sensitivity of multiple-marker screening for Down syndrome ranged from 48 to 91% (median 64.5%) and the false-positive rate (after revision of dates by ultrasound) ranged from 3% to 10%. The likelihood of Down syndrome given a positive screening test result was 1.2-3.8%, depending on the threshold for high risk used to define a positive test result. In these studies, the threshold chosen ranged from a 1 in 125 to a 1 in 380 chance of having an affected pregnancy given a positive test result. A young woman with a prescreen risk of about 1 in 1,000 who tested positive would have a postscreen risk similar to the risk in women of advanced age who are currently offered prenatal diagnosis.
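These predictive values follow from standard screening arithmetic. A minimal sketch, using illustrative inputs chosen from within the ranges reported above:

    # Likelihood of Down syndrome given a positive multiple-marker
    # screen. All inputs are illustrative values within the ranges
    # reported in the cohort studies cited above.
    prevalence = 1.5 / 1000     # affected pregnancies per screened woman
    sensitivity = 0.65          # near the reported median of 64.5%
    false_positive_rate = 0.05  # within the reported 3-10% range

    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    ppv = true_positives / (true_positives + false_positives)
    print(f"P(Down syndrome | positive screen) = {ppv:.1%}")
    # prints about 1.9%, within the reported 1.2-3.8% range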

Multiple-marker screening has also been evaluated in women 35 years of age or older, for whom prenatal diagnosis using amniocentesis or CVS is routinely recommended because of their increased risk of Down syndrome. Studies suggest that multiple-marker screening in these women might reduce the need for more invasive diagnostic tests. In a cohort study of 5,385 women >=35 years of age with no other risk factors, all of whom were undergoing routine amniocentesis and chromosome studies (thus allowing complete ascertainment of chromosome abnormalities), estimates of the individual risk of Down syndrome were calculated based on maternal age in combination with the results of multiple-marker screening using MSAFP, hCG, and unconjugated estriol. 51 If amniocentesis were performed only on older women with at least a 1 in 200 risk of carrying a fetus with Down syndrome based on triple-marker screening, 89% of affected fetuses would have been detected; 25% of women with unaffected fetuses would have been identified by screening as needing amniocentesis. A threshold of 1 in 300 (similar to risk based on age >=35 years alone) did not add sensitivity but did increase the screen-positive rate to 34%. Thus, triple-marker screening could have avoided 75% of amniocenteses in older women, with their attendant risk of fetal loss, at a cost of missing 11% of cases of Down syndrome. In this study, performing amniocenteses only on women with postscreen risks of at least 1 in 200 for Down syndrome would also have detected 47% of fetuses with other autosomal trisomies, 44% of fetuses with sex aneuploidy, and 11% with miscellaneous chromosome abnormalities. In previously cited interventional cohort studies of double- or triple-marker screening that reported separate results for older women, the Down syndrome detection rate was reported as 80-100% for women >=35 years 43,46,47,50 and 100% for women >=36 years, 42,45 with false-positive screening results of 19-27%. Incomplete case ascertainment was possible, however, since screen-negative women rarely had diagnostic chromosome studies.
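A rough calculation per 1,000 women aged 35 and older makes this trade-off concrete. The sketch below assumes an illustrative second-trimester prevalence of 1 in 270 and uses the detection and screen-positive rates reported in the cited study:

    # Amniocenteses per Down syndrome case detected: universal
    # amniocentesis versus triple-marker screening at a 1-in-200 risk
    # cutoff, per 1,000 women aged >=35. The prevalence is an assumed
    # illustrative figure; the other rates are from the cited study.
    prevalence = 1 / 270          # assumed second-trimester prevalence
    detection = 0.89              # sensitivity at the 1-in-200 cutoff
    screen_positive = 0.25        # unaffected pregnancies flagged

    affected = 1000 * prevalence  # about 3.7 affected pregnancies
    amnios_universal = 1000       # every woman undergoes the procedure
    amnios_screened = (affected * detection
                       + (1000 - affected) * screen_positive)

    print(f"universal: {amnios_universal / affected:.0f} amniocenteses per case")
    print(f"screen-first: {amnios_screened / (affected * detection):.0f} "
          f"amniocenteses per case, missing {1 - detection:.0%} of cases")
    # roughly 270 versus 77 procedures per case detected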

Although no controlled trials have directly compared double-marker to triple-marker screening, several cohort studies of triple-marker screening have reported the detection rates for double-marker screening with hCG and MSAFP only. Three markers appear to be somewhat more sensitive than two for detection of Down syndrome; the net difference in sensitivity ranged from -2 to +18% in these studies, depending on the false-positive rate and risk cut-off used. 43,48,50,51

Ultrasonography is another potential screening test for Down syndrome. Abnormalities associated with Down syndrome (including intrauterine growth retardation, cardiac anomalies, hydrops, duodenal and esophageal atresia) and differences in long-bone length and nuchal fold thickness between Down syndrome and normal pregnancies observable on midtrimester ultrasound have been reviewed. 52 In prospective cohort studies of midtrimester ultrasound screening in high-risk women who were undergoing amniocenteses for chromosome studies, nuchal fold thickening identified 75% of Down syndrome fetuses; shortened humerus or femur length detected 31%; and an index based on thickened nuchal fold, major structural defect, and certain other abnormalities identified 69%. 53-55 The likelihood of Down syndrome given a positive result was 7-25% in these high-risk samples, but would be substantially lower in low-risk women. No published cohort studies have evaluated the accuracy of ultrasound screening for detection of chromosome abnormalities in low-risk women, nor have interventional cohort studies evaluated its efficacy as a screening tool in high-risk women. The use of ultrasound as a screening test for Down syndrome is limited by the technical difficulty of producing a reliable sonographic image of critical fetal structures. 56,57 Incorrect positioning of the transducer, for example, can produce artifactual images resembling a thickened nuchal skin fold in a normal fetus. 58 Sonographic indices are therefore subject to considerable variation. Imaging techniques require further standardization before routine screening by ultrasound for Down syndrome can be considered for the general population. 56,59,60 In addition, results obtained by well-trained and well-equipped operators in a research context may not generalize to widespread use. In a multicenter cohort study in high-risk women that involved a large number of ultrasonographers of varying ability, the sensitivity of nuchal fold thickening for Down syndrome was only 38%. 59 The false-positive rate in this study was 8.5%, many times higher than that reported in studies involving expert ultrasonographers. 55,61

Effectiveness of Early Detection

The detection of Down syndrome and other chromosome anomalies in utero provides as its principal benefit the opportunity to inform prospective parents of the likelihood of giving birth to an affected child. Parents may be counseled about the consequences of the abnormality and can make more informed decisions about optimal care for their newborn or about elective abortion. No controlled trials have been performed to assess clinical outcomes for those using screening or prenatal diagnosis for Down syndrome compared to those who do not. Therefore, the usefulness of this information depends to a large extent on the personal preferences and abilities of the parents. 62 Whether or not parents choose to use prenatal screening or diagnosis is related both to their views on the acceptability of induced abortion and their perceived risk of the fetus being abnormal. 63 The perception of the harm or nature of the disability may play a greater role in the decision than the actual probability of its occurrence. 64-67

Induced abortion is currently sought by the majority of women whose prenatal diagnostic studies (i.e., karyotyping) reveal fetuses with Down syndrome. 33-35,39,40,45,48,68 Estimates of the reduction in birth prevalence of Down syndrome associated with offering prenatal diagnosis to women 35 years and older range from 7.3% to 29% in the U.S. and other developed countries. 2,69-73 The effect of this approach on the total number of Down syndrome births is limited because older women have low birth rates and therefore account for a relatively small proportion of affected pregnancies despite their exponentially increased risk for having an affected pregnancy. 74 Limited data are available to estimate the impact of serum-marker screening in younger women on Down syndrome birth prevalence. In England and Wales, the proportion of all cytogenetically diagnosed Down syndrome cases detected prenatally (thus potentially preventable) increased from 31% to 46% after the introduction of screening by maternal serum analysis and ultrasound for low-risk women. 68 In cohort studies evaluating double- or triple-marker screening, when the proportions of screen-positive women who decided not to undergo amniocentesis or induced abortion were taken into account, the proportion of Down syndrome births to screened women that were actually prevented ranged from 36% to 62%. 39,40,45,48 Up to 25% of screen-positive women declined prenatal diagnosis by amniocentesis in these studies. The effectiveness of screening in preventing Down syndrome births may be further reduced by incomplete uptake of screening. In antenatal screening programs in which double- or triple-marker screening was offered to all women and amniocentesis or CVS was offered to women over 35 years of age, nearly 60% of all Down syndrome births were potentially preventable, the remainder either being missed by screening (14-23%) or occurring in women who were not screened (17-27%). 47,49 Neither study evaluated acceptance of induced abortion, however. In another population, offering double-marker screening to all women prevented 59% of all Down syndrome births. 45 This population had high rates of screening (89%), largely due to the fact that pregnant women had to specifically ask to be excluded. There was also high acceptance of amniocentesis in screen-positive women (89%), and of induced abortion of cytogenetically confirmed cases (91%). The birth prevalence of Down syndrome decreased from approximately 1.1/1,000 to 0.4/1,000 after initiation of prenatal screening in this population.

Other potential effects of prenatal detection of Down syndrome have not been adequately explored. In families at high risk of Down syndrome births, such as those with advanced maternal age, a previous affected pregnancy, or known carriage of translocations, the availability of prenatal diagnosis may reduce the induced abortion rate by identifying normal pregnancies that might otherwise be electively aborted. This benefit has been reported with screening for cystic fibrosis, 75 but it has not been evaluated for Down syndrome. The diagnosis of a chromosome abnormality may spare unsuspecting parents some of the trauma associated with delivering an abnormal infant, and may help parents to prepare emotionally. Studies evaluating these potential psychological benefits have not been reported, however. Prenatal diagnosis may also enable clinicians to better prepare for the delivery and care of the baby. Studies are lacking regarding the impact of these measures on neonatal morbidity and mortality.

An indirect benefit of testing to detect Down syndrome is the discovery during testing of abnormalities other than the target condition. Chromosome studies on specimens obtained by amniocentesis or CVS will detect other abnormalities besides Down syndrome. Autosomal trisomies other than Down syndrome are usually spontaneously aborted, so the principal benefit of screening may be avoidance of late fetal death. 76 The health consequences of sex aneuploidy are less significant than trisomies, but about half such pregnancies are nevertheless electively aborted when discovered prenatally. 77,78 Serum marker screening for Down syndrome will also identify some patients carrying fetuses with other chromosome abnormalities (e.g., Turner syndrome, trisomy-13 or -18); sensitivity is low, 51 however, because some of these abnormalities have different effects on serum markers than does Down syndrome and require different risk thresholds. 50,79 Ultrasound screening for Down syndrome leads to a more accurate assessment of gestational age in women with uncertain dates, and some studies suggest that acting on this information may reduce the likelihood of induced labor for erroneously diagnosed postterm pregnancy (see Chapter 36). Multiple gestations and major congenital anomalies, such as diaphragmatic hernia, gastroschisis, nonimmune fetal hydrops, and obstructive uropathy, may also be detected by ultrasound. These discoveries permit antenatal treatment as well as delivery and neonatal care planning. Controlled trials proving that early detection by ultrasound of multiple gestations or congenital anomalies improves outcome have not been published, however (see Chapter 36).

Adverse Effects of Screening and Early Detection

The most important risks of early detection of Down syndrome include those to the fetus from amniocentesis and CVS performed as a primary or follow-up diagnostic test, the psychological effects of a positive test on the parents, and the complications resulting from induced abortion. The risks of amniocentesis include rare puncture of the fetus, bleeding, infection, and possibly isosensitization. 80,81 The procedure-related rate of fetal loss with current technique appears to be about 0.5-0.8%. 16,17,29 The best evidence on amniocentesis risks comes from a randomized controlled trial of screening, 16 which reported a procedure-related risk of fetal loss of 0.8% of pregnancies. This may nevertheless overestimate current rates of loss, as techniques have improved. In a more recent series of patients undergoing amniocentesis as part of a clinical trial, the risk of fetal loss was 0.04%. 22 In a randomized controlled trial, neonatal respiratory distress syndrome and neonatal pneumonia were more frequent after amniocentesis, independent of birth weight and gestational age; the additional risk was about 1%. 16 A similar trend was seen in the Medical Research Council study, 18 but has not been confirmed in other studies. Infection has not been identified as a significant problem in any large studies. No clinically important effects on development, behavior, or physical status were identified in 4-year-old children whose mothers had undergone midtrimester amniocentesis. 83 Case series of women undergoing first-trimester amniocentesis suggest a procedure-related fetal loss rate of 3-7%. 84-87 In a randomized controlled trial, the total fetal loss rate with early amniocentesis was significantly higher than with CVS (5.9 vs. 1.2%). 32

Several randomized controlled trials comparing amniocentesis and CVS have reported significantly higher fetal loss rates with CVS (1.0-1.5%) when compared with second-trimester amniocentesis. 20-22 Inexperience and the use of transcervical CVS appear related to a greater risk of fetal loss, although at least one trial found no significant difference in fetal loss rates between transcervical and transabdominal CVS (2.5% vs. 2.3%). 31 An increased risk of transverse limb reduction anomalies in infants born after CVS has been reported in case-control and case-series studies. 88-93b Conflicting evidence from cohort studies may relate to varying methods of case ascertainment or classification. 94-99a Decreasing risk and a trend from proximal to distal limb damage with increasing gestational age at CVS provide biologic plausibility for a true association with limb reduction defects. 93,99b Current estimates for the overall risk of transverse limb deficiency from CVS range from 0.03% to 0.10% of procedures. 99a Severe maternal complications from CVS are rarely reported, but the Canadian Collaborative Study suggested a higher risk of bleeding requiring intervention for women undergoing CVS compared to amniocentesis. 22 None of the CVS trials has reported increased risks of birth defects or major infant health problems, but sample size is inadequate in these trials to rule out rare adverse effects.

A positive screening test result can produce a harmful psychological effect on parents. This is especially important because the large majority of positive screening tests occur in normal pregnancies. Adverse psychological effects of screening tests include the fear of discovering an abnormal pregnancy as well as anxiety over possible complications from diagnostic and therapeutic procedures. Women who have been identified as being at high risk because of a positive serum-marker screening test may have greater distress than women who are identified as high risk because of advanced age. 100,101 Distress is reduced following a diagnostic procedure confirming a normal pregnancy, but some anxiety related to the false-positive screening test may persist. 102,103 Most women screened will have normal results, however, and this may have psychological benefits for the reassured parents.

The potential complications of induced abortion must also be considered, since this is the outcome of the majority of positive diagnostic test results. Morbidity from first-trimester induced abortion, including infection, hemorrhage, and injury, occurs in 2-3% of procedures, but serious complications are rare; in one series of 170,000 cases, 0.07% required hospitalization and none resulted in death. 104-107 Complication rates, including maternal case-fatality rates, are higher with second-trimester abortions, but remain uncommon. 108-110 The case-fatality rate from legally induced abortion, 0.4/100,000 procedures, is substantially lower than the risk of pregnancy-related death, which is 8-9/100,000 live births. 108,109,111,112 The most serious consequence of false-positive test results, the induced abortion of a normal pregnancy, was not reported in any of the trials, and appears to be rare with current techniques. The likelihood of diagnostic error is slightly higher with CVS than with amniocentesis, but the risk of induced abortion as a consequence has not been fully evaluated.

Recommendations of Other Groups

Most organizations recommend offering amniocentesis or CVS for prenatal diagnosis to all pregnant women who are aged 35 years and older or otherwise at high risk for chromosome abnormalities. 113-115 The Canadian Task Force on the Periodic Health Examination concluded that there is fair evidence to offer second-trimester triple-marker screening to all pregnant women less than 35 years of age, and as an alternative to prenatal diagnosis by karyotyping in women 35 years and older; such offering should be accompanied by education on its limited efficacy, as well as on the risks of second-trimester diagnosis and abortion, and on the psychological implications of screening and of a Down syndrome birth. 114 Offering multiple-marker screening between 15 and 18 weeks of gestation to low-risk women under 35 years of age to assess Down syndrome risk is also recommended by the American College of Obstetricians and Gynecologists (ACOG) and the American College of Medical Genetics (ACMG); neither group recommends a specific multiple-marker protocol. 115,116 Neither ACOG nor ACMG recommends prenatal cytogenetic screening by multiple-marker testing in women 35 years and older; ACOG recommends that multiple-marker testing may be offered as an option for those women who do not accept the risk of amniocentesis or who wish to have this additional information prior to making a decision. No organizations currently recommend routine screening for Down syndrome by ultrasound. ACOG 117 and a National Institutes of Health consensus development conference 118 have recommended that ultrasound imaging be performed during pregnancy only in response to a specific medical indication.

Discussion

Prenatal diagnostic testing is accurate and reliable for detecting Down syndrome, but it is associated with a procedure-related fetal loss risk of about 0.5% for second-trimester amniocentesis and 1-1.5% for CVS, and a measurable risk of transverse fetal limb deficiency after CVS. The currently accepted medical practice of routinely offering amniocentesis or CVS for prenatal diagnosis to pregnant women aged 35 years and older or otherwise at high risk is based on the mother's increased risk of having a fetus with a chromosome abnormality balanced against the risk of fetal loss associated with these procedures, and therefore includes an element of judgment. It can be predicted from available data (odds of Down syndrome during the second trimester) that a program offering amniocentesis to all pregnant women at age 35 has the potential of exposing 200-300 normal fetuses to this procedure for every case detected. 10 With an estimated procedure-related fetal loss rate of 0.5%, one normal fetus would be lost by amniocentesis for every one to two chromosome anomalies detected in such women. For CVS, the number of normal fetuses lost per case detected would be higher, and for first-trimester amniocentesis, it may be higher still. The older the maternal age, the more favorable the ratio of affected fetuses to fetal loss. Most women who request such testing and receive a diagnosis of a Down syndrome pregnancy choose to abort the pregnancy, resulting in a measurable reduction in Down syndrome births. There is little good evidence of the effect on personal and family outcomes, however, or on the balance of risks and benefits for the group as a whole. Nevertheless, those women at high risk who desire prenatal diagnosis of Down syndrome may benefit substantially from it. Thus, there is fair evidence to support offering prenatal diagnosis to high-risk pregnant women who are identified by age, history, or screening tests when a comprehensive prenatal diagnosis program that includes education, interpretation, and follow-up is available.
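The fetal-loss arithmetic in this paragraph can be checked directly; a minimal sketch, assuming 250 normal fetuses exposed per case detected (the midpoint of the cited 200-300 range):

    # Procedure-related losses of normal fetuses per chromosome anomaly
    # detected when amniocentesis is offered to all women at age 35.
    # The 250 figure is an assumed midpoint of the 200-300 range above.
    normal_fetuses_per_case = 250
    procedure_loss_rate = 0.005   # estimated 0.5% fetal loss rate

    losses = normal_fetuses_per_case * procedure_loss_rate
    print(f"{losses:.2f} normal fetuses lost per anomaly detected")
    # prints 1.25, i.e., one procedure-related loss for every one to
    # two chromosome anomalies detected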

In low-risk pregnant women, maternal serum multiple-marker screening in the second trimester can detect nearly two thirds of Down syndrome fetuses, but it will result in a large number of young women being offered amniocentesis who would not otherwise be subjected to its risks. The ratio of affected fetuses detected to procedure-related fetal loss in women with positive multiple-marker screening would be similar to or more favorable than that of women 35 years and older. The risk of fetal loss may be acceptable to parents with strong fears of having an affected child. 64,119-121 There is also evidence that multiple-marker screening in women 35 years and older can detect 80% or more of Down syndrome pregnancies while allowing the majority of such women to avoid the risks associated with invasive diagnostic testing. Multiple-marker screening is not supported by the same strength of evidence as is amniocentesis or CVS, however. Potential problems include the reduced sensitivity for Down syndrome and other chromosome abnormalities, the large proportion of false-positive tests, and the substantial number of women who refuse or do not receive follow-up amniocentesis and chromosome studies. This is of particular concern if such screening is offered to women 35 years and older who might otherwise receive amniocentesis or CVS. Nevertheless, in some older women, particularly those who may have had difficulty conceiving or carrying a pregnancy, the reduced likelihood of amniocentesis or CVS and consequent risk of fetal loss or injury may outweigh the reduced sensitivity of multiple-marker screening. There is therefore fair evidence to support offering multiple-marker screening to pregnant women of all ages when a comprehensive prenatal diagnosis program is available that includes education, interpretation, and follow-up.

There is a lack of sound evidence to support the use of individual maternal serum markers to screen for Down syndrome, and currently available evidence suggests that sensitivity is substantially lower than with multiple-marker screening. Similarly, ultrasonography has not been adequately evaluated as a routine screening test for Down syndrome, and there are important concerns about the measurement reliability and generalizability of this technology to widespread use. Since there is evidence supporting the effectiveness of other screening and diagnostic methods, neither individual serum markers nor ultrasonography can be recommended as screening tests for Down syndrome outside clinical trials.

Identification and selective abortion of Down syndrome pregnancies raises important ethical concerns, a full discussion of which is beyond the scope of this chapter. These concerns include the implicit message that Down syndrome is an undesirable state, the interpretation of induced abortion in eugenic terms by some persons, and societal and economic pressures that may stigmatize families with a Down syndrome member. Attitudes held by both physicians and by society toward individuals with Down syndrome have changed over time, and various Down syndrome associations now offer support for families and individuals with Down syndrome, promote their participation in society, and seek respect for them. 122,123 These issues highlight the importance of offering screening and prenatal diagnosis of Down syndrome in a value-sensitive fashion with emphasis on reliable information about Down syndrome itself as well as about the potential risks and benefits of screening procedures.

In these recommendations, primary consideration has been given to the prenatal detection of Down syndrome. Other chromosome anomalies (e.g., Turner syndrome, trisomy-18) are often detected during prenatal screening and diagnosis and many may consider their detection important. There are few studies directly addressing screening for these conditions, however, and screening protocols have not been sufficiently evaluated to warrant review at this point.

CLINICAL INTERVENTION

The offering of amniocentesis or CVS for chromosome studies to pregnant women aged 35 years and older and to those at high risk of Down syndrome for other reasons (e.g., previous affected pregnancy, known carriage of a chromosome rearrangement associated with Down syndrome) is recommended ("B" recommendation). In some circumstances, depending on resources, preferences, and other factors, the selection of a different age threshold for offering prenatal diagnosis may be considered. Counseling before the procedure should include a comparison of the risks to the fetus from the procedure and the probability of a chromosome defect given the patient's age or other risk factors, as well as a full discussion of the potential outcomes associated with delivering a child with Down syndrome and of aborting a Down syndrome fetus.

The offering of screening for Down syndrome by maternal serum multiple-marker testing at 15-18 weeks of gestation is recommended for all pregnant women who have access to counseling and follow-up services, skilled high-resolution ultrasound and amniocentesis capabilities, and reliable, standardized laboratories ("B" recommendation). There is currently insufficient evidence to recommend a specific multiple-marker screening protocol. Counseling regarding screening should include information on the procedure itself, the likelihood of follow-up testing with amniocentesis and its associated risks, as well as a full discussion of the potential outcomes associated with delivering a child with Down syndrome and of aborting a Down syndrome fetus. Women with a positive screen should receive detailed information comparing the increased risk of trisomy with the risks of fetal loss from amniocentesis. For women aged 35 years and older, the choice of serum multiple-marker screening versus amniocentesis or CVS for chromosome studies depends on patient preferences and therefore requires a detailed discussion of the potential risks and benefits of each procedure. In particular, the patient should understand the reduced sensitivity of multiple-marker screening for Down syndrome and for other chromosome abnormalities compared to prenatal diagnosis by chromosome studies, and the increased risk of fetal loss or injury with amniocentesis and CVS.

There is currently insufficient evidence to recommend for or against routine ultrasound examination or the use of individual maternal serum markers in pregnant women as screening tests for Down syndrome ("C" recommendation). Recommendations against these tests may be made on other grounds, however, including the availability of other screening tests of proven effectiveness.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Carolyn DiGuiseppi, MD, MPH, based in part on material prepared for the Canadian Task Force on the Periodic Health Examination by Paul Dick, MDCM, FRCPC.

42. Screening for Neural Tube Defects --- Including Folic Acid/Folate Prophylaxis

Burden of Suffering

Neural tube defects, including anencephaly, encephalocele, and spina bifida, account for substantial morbidity and mortality. Anencephaly is almost always lethal, usually resulting either in stillbirth or death within hours or days of birth. Spina bifida can range from mild (spina bifida occulta) to severe (myelomeningocele). The manifestations of severe spina bifida may include infectious complications, paraplegia, bladder and bowel incontinence, Arnold-Chiari malformations, hydrocephalus, and, as a complication of hydrocephalus, diminished intelligence. 1 Aggressive surgical and medical care is often necessary for severely affected cases, along with special schooling and rehabilitative services for patients with permanent disabilities. Based on 1988 cross-sectional data, the estimated lifetime cost of spina bifida is $258,000 per case. 2

The birth prevalence of neural tube defects has declined substantially over the past 60 years. 3,4 Neural tube defects are reported in 3.6-4.6/10,000 live births in the United States. 3,5 These rates underestimate true incidence, however, because affected pregnancies may be spontaneously or electively aborted and because not all cases are detected and reported at birth. Population-based active surveillance programs that include prenatal diagnoses have reported neural tube defect rates of 7.2-15.6/10,000 live-born and stillborn infants. 6 A personal or family history of a pregnancy affected by a neural tube defect is associated with an increased risk of having an affected pregnancy, as is maternal insulin-dependent diabetes, but about 90-95% of cases occur in the absence of any positive history. 7-9 The birth prevalence of neural tube defects in the U.S. is higher at younger maternal ages and is more than one third higher for whites than blacks. 5

Accuracy of Screening Tests

Tests for neural tube defects include ultrasound examination and measurement of maternal serum alpha-fetoprotein (MSAFP), amniotic fluid alpha-fetoprotein (AFAFP), and amniotic fluid acetylcholinesterase (AFAChE). The latter two are used primarily as confirmatory tests and should not be regarded as part of routine screening of women at low risk for neural tube defects. Ultrasound examination is used both as a screening test and as a follow-up test after positive MSAFP screening.

An elevated MSAFP measured at 16-18 weeks' gestation is a good predictor of neural tube defects. Depending on the cutoff used to define an elevated level (usually 2-2.5 times the median value for gestational age, reported as multiples of the median or MoM), screening can detect between 56% and 91% of affected fetuses. 10-17 An elevated MSAFP occurs in about 1-5% of pregnant women, 9,14,15,17-20 but for a number of reasons, the likelihood of a neural tube defect given a positive screening test result is small. First, about one third of positive tests are not confirmed by a second MSAFP measurement. 9,19 Second, although the reported specificity of MSAFP when followed by appropriate diagnostic tests (i.e., high-resolution ultrasonography, AFAFP, AFAChE) approaches 100%, 12,18,21 MSAFP assays themselves are relatively nonspecific. About 90-95% of cases with confirmed elevated MSAFP are caused by conditions other than neural tube defects, such as an underestimated gestational age, other congenital anomalies, intrauterine growth retardation, multiple gestations, or fetal demise. 7,9,10,18,20,22 An ultrasound examination is necessary to rule out these explanations for an elevated MSAFP. If ultrasonography does not provide an explanation for the abnormal result (as occurs in about 50%), 10,14,15,17 an amniocentesis should be offered to measure AFAFP and/or AFAChE levels. 8 Fewer than 10% of these amniocenteses lead to the discovery of a neural tube or abdominal wall defect; the majority of the fetuses tested are normal. 9 In comparison with the number of women who must be tested, the actual number of neural tube defects detected through screening the general population is small (0.06-0.16% of pregnancies). 9,10,18,19
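Chaining these step-wise proportions reproduces the reported overall yield. A rough sketch, using illustrative values from within the cited ranges:

    # Expected yield of MSAFP screening per 100,000 low-risk
    # pregnancies, chaining the step-wise figures reported above.
    # Each proportion is an illustrative value within the cited range.
    screened = 100_000
    elevated = screened * 0.03        # 1-5% have an elevated MSAFP
    confirmed = elevated * (2 / 3)    # about 1/3 not confirmed on repeat
    unexplained = confirmed * 0.5     # ultrasound explains about 50%
    defects = unexplained * 0.10      # up to 10% of amniocenteses positive

    print(f"{defects:.0f} defects found, or {defects / screened:.2%} "
          f"of screened pregnancies")
    # prints 100 defects, or 0.10%, within the reported 0.06-0.16%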

Virtually all cases of anencephaly can be detected by ultrasound alone, 24 as can many closed neural tube defects that may escape detection by MSAFP measurement. Current ultrasound techniques are less sensitive, however, in detecting other neural tube defects such as small meningomyeloceles. 11 In addition, although the published sensitivities and specificities of sonographic detection of spina bifida are high (79-96% and 90-100%, respectively), 24-28 investigators have emphasized that these data were obtained from centers with special expertise. 24 They may overestimate the sensitivity that would be expected when prenatal ultrasound is conducted with older equipment or is performed by those with less complete training, 28 which has become increasingly common as more physicians perform their own ultrasound examinations. 29 In addition, many of the published studies were of high-risk women and may not be generalizable to screening in the low-risk population. Nevertheless, recent improvements in ultrasound diagnosis of neural tube defects, with sensitivity and specificity approaching 100% when performed by expert sonographers at major screening centers, have led some experts to recommend the use of ultrasound instead of MSAFP in pregnancies at low risk for neural tube defects. 30,31 Ultrasound and MSAFP have not been directly compared as screening tests for neural tube defects.

Effectiveness of Early Detection

The detection of neural tube defects in utero provides as its principal benefit the opportunity to inform prospective parents of the likelihood of carrying an affected fetus. Parents may be counseled about the consequences of the malformation and can make more informed decisions about optimal care for their newborn or about elective abortion. The antenatal diagnosis of a severe and/or lethal malformation (e.g., anencephaly) may spare parents some of the trauma associated with delivering such an infant. No controlled trials have been performed to prove that those screened for neural tube defects have better outcomes compared to those not screened, however. Therefore, the usefulness of this information depends to a large extent on the personal preferences and abilities of the parents. 32 Whether or not parents choose to use prenatal screening is related both to their views on the acceptability of induced abortion and their perceived risk of the fetus being abnormal. 33

Induced abortion is sought by the majority of women who choose to be screened and whose screening tests reveal neural tube defects, thus leading to a decreased birth prevalence of affected infants among screened women. 10,14,15,34-37 Time series from the United Kingdom, where screening by MSAFP and ultrasound is widespread, have reported a 49-50% decline in the birth prevalence of anencephaly and a 32-38% decline in the birth prevalence of spina bifida attributable to elective abortion for suspected central nervous system malformation. 38,39 The effectiveness of screening in reducing the number of infants born with neural tube defects is decreased by less than universal acceptance of screening, incomplete detection of affected fetuses, and varying decisions about elective abortion following early detection. 35

Several interventional cohort studies have evaluated the effect of MSAFP programs on the birth incidence of neural tube defects. In one such study, MSAFP screening was offered to all pregnant women attending antenatal clinics during the study period (n = 15,687), of whom 70% were actually screened. 35 Of 66 total neural tube defects, 54 occurred in the screened group (4.9/1,000) and 12 occurred in the unscreened population (2.5/1,000). The higher incidence rate in the screened group suggests that women who elected screening may have been at higher risk. Offering screening resulted in the elective abortion of 56% (37/66) of all pregnancies with neural tube defects. Of the 54 neural tube defects in the screened group, 11 (20%) were not detected by screening. Six (11%) were detected but were not aborted because ultrasonography or tests on amniotic fluid mistakenly indicated an unaffected pregnancy. Screening resulted in fewer infants being born with neural tube defects (1.6/1,000 in screened women vs. 2.5/1,000 in unscreened women). In another cohort study of more than 18,000 women, MSAFP screening was accepted in approximately 85% of second-trimester pregnancies and detected 80% of open neural tube defects, all of which were electively aborted. 34 Screening in this population reduced the birth prevalence of anencephaly by 90% and of open spina bifida by 72%. In a third study, screening was performed in 72% of clinic patients and detected 59% of all affected pregnancies, 94% of which were aborted. 36 Offering screening therefore prevented the births of 55% of affected fetuses.
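The figures reported for the first study are internally consistent, as the following check shows; the exact denominator of screened women is not given, so it is approximated here as 70% of the women offered screening:

    # Consistency check on the first MSAFP cohort study described above.
    offered = 15_687
    screened = offered * 0.70       # about 10,981 women actually screened

    defects_in_screened = 54        # 4.9/1,000 screened women
    missed_by_screening = 11        # 20% of the 54
    detected_not_aborted = 6        # follow-up tests mistakenly normal

    aborted = defects_in_screened - missed_by_screening - detected_not_aborted
    births = defects_in_screened - aborted
    print(f"{aborted} aborted (matches the 37/66 reported); "
          f"{births / screened * 1000:.1f} NTD births per 1,000 screened")
    # about 1.5/1,000, close to the reported 1.6/1,000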

There is limited evidence evaluating ultrasound screening for neural tube defects. One randomized controlled trial of routine ultrasound examination in low-risk women reported increased prenatal detection of fetal malformations, but no differences in induced abortion rates, survival rates of anomalous fetuses, or other perinatal outcomes. 28,40 Neural tube defects were included among the anomalies detected, but the study was not designed to evaluate this outcome specifically and only 13 such defects occurred in the entire enrolled population. Of eight neural tube defects occurring in the screened group, seven were detected by screening and electively aborted, reducing the birth prevalence by 88%, to 0.13/1,000 screened women. In controls, three of five neural tube defects were detected prenatally, two of which were aborted, reducing the birth prevalence to 0.39/1,000. The numbers of fetuses affected by neural tube defects were too small for valid statistical comparison. Interventional cohort studies evaluating routine ultrasound examination as a screening test for detection of neural tube defects have not been published, so there is inadequate information available on the acceptability and impact of ultrasound screening, confirmatory tests, and induced abortion in the general population.

Early detection of neural tube defects may also help parents to prepare emotionally, although this potential benefit has not been evaluated. It may enable clinicians to provide more intensive obstetric care and to better prepare for the delivery and care of the baby. Studies are limited, however, regarding the impact of these measures on neonatal morbidity and mortality. In a series of 208 patients aged 2-18 years with meningomyeloceles, there were no statistically significant differences in motor or sensory level, or in ambulatory function, between those delivered vaginally compared to those who had cesarean delivery. 41 On the other hand, in a retrospective population-based study of 160 fetuses with uncomplicated meningomyelocele, cesarean delivery before the onset of labor resulted in better motor function at age 2 years than vaginal delivery or cesarean delivery after a period of labor. 42 Follow-up to 4 years of age was available for 85% of the original cohort. 43 The pre-labor cesarean group continued to have a significantly greater difference between the anatomic and motor spinal cord level compared to the vaginally delivered group. Motor function was not significantly better at 4 years, however. These types of studies have important design limitations. Controlled trials evaluating early cesarean delivery for neural tube defects have not yet been published.

Another potential benefit of MSAFP screening for neural tube defects is the discovery during testing of abnormalities other than the target condition. A raised level of MSAFP, even in the absence of a congenital defect, is a risk factor for low birth weight, preterm labor, preeclampsia, and abruptio placentae; 20,44,45 early obstetric intervention for these problems may be beneficial (see also Chapter 37). Reduced levels of MSAFP are associated with Down syndrome and certain other chromosomal anomalies; MSAFP is one component of multiple-marker screening, which is recommended for the early detection of Down syndrome (see Chapter 41). The ultrasound evaluation that follows the detection of raised MSAFP may lead to a diagnosis of twins or a more accurate assessment of gestational age, and some studies suggest that acting on this information may improve neonatal outcome (see Chapter 36). Other congenital anomalies, such as diaphragmatic hernia, gastroschisis, nonimmune fetal hydrops, and obstructive uropathy, may also be detected. Discovery of a fetus affected by one of these anomalies may be useful for parental decision making, allowing the options of elective abortion or of antenatal treatment if available, as well as planning for delivery and appropriate neonatal care. Controlled trials have not proven that early detection of these anomalies improves outcome, however. Indeed, studies suggest that fetuses with diaphragmatic hernias detected in utero have poorer outcomes than those detected after birth, 46,47 perhaps in part because larger defects are more likely to be detected prenatally.

The potential benefits of early detection of neural tube defects must be weighed against the potential risks of screening. The most important risks include those to the fetus from amniocentesis, the psychological effects on the parents of a positive test, the complications resulting from induced abortion, and the risk of elective abortion of normal pregnancies due to false-positive test results. The risks of amniocentesis include miscarriage, puncture of the fetus, bleeding, infection, and possibly isosensitization. 9 The exact rate of fetal loss due to amniocentesis is uncertain, since women undergoing this procedure are already at increased risk of fetal loss. The procedure-related rate of fetal loss with current technique appears to be about 0.5-0.8%. 48-50 The best evidence on amniocentesis risks comes from a randomized controlled trial of screening, 48 which reported a procedure-related risk of fetal loss of 0.8% of pregnancies. This may nevertheless overestimate current rates of loss, as techniques have improved. In a more recent series of patients undergoing amniocentesis as part of a clinical trial, the risk of fetal loss was 0.04%. 51 In a randomized controlled trial, neonatal respiratory distress syndrome and neonatal pneumonia were more frequent after amniocentesis, independent of birth weight and gestational age; the additional risk was about 1%. 48 A similar trend was seen in the Medical Research Council study, 52 but has not been confirmed in other studies. Infection has not been identified as a significant problem in any large studies. No clinically important effects on development, behavior, or physical status were identified in 4-year-old children whose mothers had undergone midtrimester amniocentesis. 53 Although MSAFP screening will increase the number of women undergoing amniocentesis, in cohort studies of screening programs fewer than 2% of MSAFP-screened women received amniocentesis. 10,14,15,17

Another risk of screening is the harmful psychological effect on parents of a positive test result. This is especially important because the large majority of positive screening tests in low-risk pregnancies are false positives. There is evidence that expectant parents with normal fetuses who are informed of an abnormal MSAFP test suffer substantial anxiety during the weeks of diagnostic testing and waiting for definitive results. 54-57 The anxiety level of these women at delivery was the same as that of women who had normal screening test results, however. No published controlled trials have evaluated whether counseling and education prior to screening alleviate these psychological effects. Elective abortion of a pregnancy because of a fetal anomaly may also have important psychological effects. In one small case-control study, 58 women who aborted fetuses with major malformations (including neural tube defects) experienced grief similar to that of women experiencing spontaneous perinatal loss, but no comparison was made to women delivering an infant with a severe anomaly, who may also grieve for the loss of a normal infant. Most women screened will have normal results, and this may have psychological benefits for the reassured parents.

The potential complications of induced abortion must also be considered, since this is the outcome of the majority of positive diagnostic test results. The maternal case-fatality rate from legal induced abortion is 0.4/100,000 procedures, which is substantially lower than the eight to nine maternal deaths per 100,000 live births due to pregnancy and childbirth. 59-62 Rates of other major maternal complications are also lower than in pregnancy and childbirth, occurring in an estimated 0.1% of legal abortions. 62 All maternal complication rates are higher with second-trimester than with first-trimester abortions.

The most serious consequence of false-positive results, the induced abortion of a normal pregnancy on the basis of erroneous diagnostic test results, appears to be very uncommon with current diagnostic techniques (i.e., high-resolution ultrasound, AFAFP, and AFAChE). Investigators have reported false-positive results leading to elective abortion of normal fetuses in 0.006-0.07% of women screened. 10,17,18,34,38,63,64

Primary Prevention (Folic Acid/Folate Prophylaxis)

Randomized placebo-controlled trials 65,66 and nonrandomized controlled trials 67-69 in women with a prior pregnancy affected by a neural tube defect have demonstrated that folic acid supplements substantially reduce the risk of recurrent neural tube defects. In the international, multicenter British Medical Research Council (MRC) trial, involving nearly 1,200 high-risk women, 4 mg of folic acid daily, taken from at least 1 month before conception through the first trimester, reduced the risk of recurrence of neural tube defects from 3.5% to 1.0%, for a relative risk of 0.28 (95% confidence interval, 0.12 to 0.71). The MRC and two other trials tested folic acid doses of 4-5 mg/day, but the 86% risk reduction seen in one nonrandomized trial 67 that used 0.36 mg of folic acid plus multivitamins daily suggests that lower doses may also be effective.
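
To make the trial arithmetic easy to check, the short Python sketch below recomputes a relative risk and a Wald-type 95% confidence interval from two-arm counts. The 1.0% and 3.5% recurrence rates are taken from the text; the group sizes below are illustrative assumptions chosen only to match those rates, not the MRC trial's actual counts.

    import math

    def relative_risk(events_tx, n_tx, events_ctl, n_ctl):
        # Relative risk with a Wald-type 95% CI computed on the log scale
        rr = (events_tx / n_tx) / (events_ctl / n_ctl)
        se = math.sqrt(1/events_tx - 1/n_tx + 1/events_ctl - 1/n_ctl)
        lo = math.exp(math.log(rr) - 1.96 * se)
        hi = math.exp(math.log(rr) + 1.96 * se)
        return rr, lo, hi

    # Illustrative counts (assumed, not the trial's actual data),
    # matching the reported 1.0% vs. 3.5% recurrence rates:
    rr, lo, hi = relative_risk(6, 600, 21, 600)
    print(f"RR = {rr:.2f}, 95% CI {lo:.2f} to {hi:.2f}")  # RR = 0.29, 95% CI 0.12 to 0.70

With counts of this size, the computed interval closely reproduces the 0.12-0.71 interval reported in the text.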

Several case-control studies have reported a reduced risk of neural tube defects in women without a prior affected pregnancy who took daily multivitamins during the periconceptional period (from 1-3 months before conception to 0.5-3 months after conception). 70-72 One of these studies analyzed the amount of folic acid the multivitamins contained, which was >=0.4 mg for most women. 71 A similar study, on the other hand, reported no protective effect of either folic acid alone or multivitamins with folic acid. 73 Stronger evidence for a benefit of periconceptional multivitamins with folic acid in low-risk women comes from a cohort study of 22,715 pregnant women. 74 The risk of neural tube defects was significantly reduced, from 3.3/1,000 to 0.9/1,000 women, with daily intake of multivitamins containing 0.1-1.0 mg of folic acid during the first 6 weeks of pregnancy. In an unadjusted analysis, taking multivitamins with folic acid both in the 3 months before conception and in the first trimester was also protective against neural tube defects. In a random sample of the multivitamin users, about two thirds consumed multivitamins containing a daily dose of at least 0.4 mg of folic acid, and 95% consumed at least 0.1 mg of folic acid daily. A randomized double-blind controlled trial of the efficacy of daily periconceptional multivitamin-multimineral supplements containing 0.8 mg of folic acid in preventing first occurrences of neural tube defects was conducted in Hungary, enrolling 4,753 women planning pregnancy. 75,76 Full supplementation was defined as taking the supplement from 28 days before conception until at least the second missed menstrual period. The average daily consumption of dietary folate was 0.18 mg, which is similar to the estimated average intake of 0.2 mg/day by women aged 19 to 34 years in the United States. 77 The supplemented group experienced a significantly decreased prevalence of neural tube defects (0 of 2,104 vs. 6 of 2,052), of congenital malformations as a whole (13.3 vs. 22.9/1,000), and of major congenital abnormalities other than neural tube defects and genetic syndromes diagnosed by 8 months of age (14.7 vs. 28.3/1,000). Three observational studies provide limited evidence for the effectiveness of dietary folate at levels higher than 0.1-0.25 mg/day in preventing the occurrence of neural tube defects. 70,71,74 All three studies reported a protective effect of greater dietary folate intake, although not all results were statistically significant or adequately reported.
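
The absolute effect implied by the Hungarian trial counts quoted above can be sketched in a few lines of Python. The counts (0 of 2,104 supplemented vs. 6 of 2,052 unsupplemented) come from the text; the "pregnancies per defect prevented" figure is derived here as an illustration and is not a result reported by the trial.

    # Event rates per 1,000 and absolute risk reduction from the trial counts
    def per_1000(events, n):
        return 1000 * events / n

    unsupp = per_1000(6, 2052)    # ~2.9 per 1,000
    supp = per_1000(0, 2104)      # 0.0 per 1,000
    arr = (unsupp - supp) / 1000  # absolute risk reduction per pregnancy
    print(f"{unsupp:.1f} vs. {supp:.1f} neural tube defects per 1,000")
    print(f"roughly {1/arr:.0f} supplemented pregnancies per defect prevented")  # ~342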

Research on adverse effects from folic acid supplementation is limited. Evidence that folic acid supplements in daily doses of 1-5 mg can mask the hematologic manifestations of vitamin B12 deficiency, possibly delaying its diagnosis and treatment and thereby leading to permanent neurologic consequences, is limited to uncontrolled intervention studies 78-80 and case reports. 81-83 Hematologic improvement in pernicious anemia has also been reported in some patients taking folic acid doses <1 mg, but the response is not consistent, particularly at lower doses. 79,84-86 Nevertheless, this has been advanced as one reason to avoid universal supplementation or food fortification with folic acid. It has also been argued, however, that it is unreasonable to maintain anemia to make it easier to diagnose B12 deficiency while neural tube defects that are potentially avoidable by supplementation continue to occur. 87 Limited evidence supports independent associations of low-normal folate and B12 levels, and of high homocysteine levels, with neural tube defects, 88,89 suggesting that a causal mechanism for these defects may be an abnormality in methionine synthase, a folate- and B12-dependent enzyme. If these results are confirmed, supplementation with both folic acid and B12 may be appropriate to prevent neural tube defects. This could reduce the potential for adverse effects of folate supplementation in B12-deficient patients.

Folic acid supplementation may also reduce intestinal absorption of zinc. 90 A randomized trial in which 50 women received either 10 mg folic acid or placebo daily showed no effect on plasma zinc concentrations after 2 and 4 months, however. 91 One cross-sectional study found a significant correlation between pregnancy complications and high folate and low zinc concentrations in the plasma of 450 pregnant women, 92 but confirmation is needed. Patients taking medications that interfere with folic acid metabolism (e.g., treatments for cancer, asthma, arthritis, AIDS, and psoriasis) may be adversely affected by folic acid supplementation, but this risk has not been adequately assessed. 93 Folic acid supplementation might theoretically provoke convulsions in epileptic women by interfering with the activity of certain anticonvulsants; this potential risk has not been well studied.

None of the trials of healthy pregnant women reported serious adverse effects associated with folic acid supplementation. In the Hungarian trial, 94 infants born to women who received a multivitamin-multimineral supplement with folic acid did not differ from those born to women receiving only trace elements in mortality, somatic development, mental and behavioral development, or total serious or chronic disorders at 8-21 months (mean 11 months) of age. The rate of atopic dermatitis, asthma, and wheezy bronchitis was significantly increased in the group whose mothers received multivitamins (16 vs. 5/1,000), but more affected infants in the supplemented group also had a positive family history for these disorders. This difference may also be a chance effect due to the large number of comparisons made. A series of 91 children born to women who had taken daily multivitamins containing 0.36 mg of folic acid to prevent neural tube defect recurrences revealed no adverse effects on health or on auditory, visual, growth, or developmental status at age 7-10 years, compared with the general population. 95 The study found significant increases in neurotic traits, but whether this was attributable to folic acid or to other causes (e.g., increased parental anxiety related to having had a previously affected pregnancy) is unknown.

Recommendations of Other Groups

The American College of Obstetricians and Gynecologists (ACOG), 8 the American Society of Human Genetics, 96,97 the American Academy of Pediatrics (AAP), 98 the Canadian Task Force on the Periodic Health Examination, 99 and an international expert consensus conference 11 have recommended that MSAFP screening be offered to all pregnant women at 16-18 weeks' gestation, provided that it is accompanied by adequate counseling and follow-up and is performed in areas with qualified diagnostic centers (conventional and high-resolution ultrasound, amniocentesis) and high-quality standardized laboratories. The Canadian Task Force also states that high-resolution ultrasonography alone may be adequate for low-risk women. 99 The AAP and ACOG recommend that patients with a personal or family history of neural tube defects be offered amniocentesis at 15-16 weeks' gestation with AFAFP testing. 100 The recommendations of the American Academy of Family Physicians (AAFP) on screening for neural tube defects are currently under review.

The AAP, 101 Canadian Task Force, 99 and U.S. Public Health Service (USPHS) 102 recommend that all women of childbearing age who are capable of becoming pregnant take 0.4 mg of folic acid daily to reduce the risk of having a pregnancy affected with a neural tube defect. The AAP, 101 Canadian Task Force, 99 ACOG, 103 and the USPHS 104 recommend that patients who have had a previous pregnancy affected by a neural tube defect and who are planning to become pregnant should be offered treatment with 4 mg of folic acid daily starting 1-3 months prior to planned conception and continuing through the first 3 months of pregnancy. The recommendations of the AAFP on folate supplementation for the prevention of neural tube defects are currently under review.

Discussion

MSAFP is a sensitive screening test for neural tube defects. When positive results are followed by appropriate diagnostic tests, such as high-resolution ultrasound, AFAFP, and AFAChE, MSAFP screening is highly specific as well. Early detection leads to a reduced birth prevalence of severely affected fetuses and may reduce complications due to labor and delivery in affected infants. It can also detect several other conditions, for some of which effective interventions exist. While MSAFP screening is a relatively safe procedure, neural tube defects are relatively uncommon, and in certain low-prevalence populations it is possible for the complication rate from screening and its follow-up diagnostic tests to equal or exceed the detection rate for the target condition. Some have expressed concern that the relatively small number of neural tube defects detected through screening may not justify the potential risks of amniocentesis and parental anxiety for the large majority of normal fetuses. 32 The increased risk may, nevertheless, be acceptable to parents with strong fears of having an abnormal child. 105 The decision whether to undergo MSAFP screening therefore depends on the preferences of the individual patient, who must receive adequate counseling regarding the potential risks and benefits of screening in order to make an informed decision. Ultrasonography performed by expert sonographers at major screening centers also appears to be a sensitive and specific screening tool, but these findings may not be generalizable to sonographers in other settings. Ultrasound has not yet been adequately evaluated as a routine screening test for neural tube defects.
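
The trade-off described here can be made concrete with a back-of-the-envelope calculation. The Python sketch below uses two figures quoted earlier in this chapter -- fewer than 2% of screened women undergoing amniocentesis, and a procedure-related fetal loss rate of up to 0.8% -- while the birth prevalence and screening sensitivity are hypothetical inputs supplied purely for illustration.

    def screening_balance(n_screened, prevalence, sensitivity,
                          amnio_rate=0.02, amnio_loss_rate=0.008):
        # amnio_rate (<2% of screened women) and amnio_loss_rate (up to 0.8%)
        # are the upper-bound figures quoted earlier in this chapter;
        # prevalence and sensitivity are assumed, not from the text.
        detected = n_screened * prevalence * sensitivity
        losses = n_screened * amnio_rate * amnio_loss_rate
        return detected, losses

    for prev in (0.001, 0.0001):  # assumed birth prevalences of neural tube defects
        d, l = screening_balance(100_000, prev, 0.80)
        print(f"prevalence {prev}: ~{d:.0f} defects detected, "
              f"~{l:.0f} procedure-related losses")

Under these assumptions, a population with a prevalence of 1 per 1,000 yields roughly 80 detections against 16 procedure-related losses per 100,000 women screened, whereas at a tenfold lower prevalence the losses exceed the detections, which is the situation the paragraph above warns about.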

Identification and abortion of pregnancies affected by neural tube defects raises important ethical concerns, a full discussion of which is beyond the scope of this chapter. These concerns include the implicit message that having a neural tube defect is an undesirable state, the interpretation of induced abortion in eugenic terms by some persons, and societal and economic pressures that may stigmatize families with a member who has a neural tube defect. These issues highlight the importance of offering screening and prenatal diagnosis of neural tube defects in a value-sensitive fashion with emphasis on reliable information about the defects themselves as well as about the potential risks and benefits of screening and diagnostic procedures.

For a woman who has had a previous pregnancy affected by a neural tube defect, there is good evidence that folic acid supplementation begun at least 1 month prior to conception and continued through the first trimester decreases the risk of recurrence. The only dosage adequately studied is a daily supplement of 4 mg, although some evidence suggests that lower dosages may be effective. For low-risk women who are planning pregnancy, a randomized controlled trial and several observational studies indicate that periconceptional intake of multivitamin-multimineral or multivitamin preparations containing 0.4-0.8 mg of folic acid can significantly reduce the risk of first occurrence of neural tube defects. All of these studies indicate the need to start supplementation at least 1 month before conception and to continue daily supplements through the first 2 to 3 months of pregnancy. There is limited evidence that dietary folate intake of greater than 0.1-0.3 mg/day reduces the risk of neural tube defects. No studies have directly compared the effectiveness of multivitamins with folic acid to increased dietary folate intake for the primary prevention of neural tube defects, but the evidence supporting use of multivitamins with folic acid is of higher quality. The current estimated average daily consumption of only 0.2 mg of dietary folate by American women aged 19-34 years 77 suggests that achieving adequate dietary intake may be more difficult for some women than taking supplements. It is unknown whether women who already have a diet that meets or exceeds 0.4 mg/day of folate would gain additional benefit from vitamin supplements. The effort required to assess dietary folate intake adequately may outweigh the costs and potential harms from routine supplementation.

Since half of pregnancies in the U.S. are unplanned, 106 all women capable of becoming pregnant would need to take multivitamins with folic acid (or increase their dietary folate intake) to maximize prevention of neural tube defects. It is likely that in the observational studies evaluating the association between multivitamins with folic acid and reduced risk of neural tube defects, many of the women evaluated had unplanned pregnancies, providing indirect evidence in support of this intervention. The ability of clinicians to convince women not contemplating pregnancy that they should take multivitamins with folic acid in order to prevent neural tube defects is unknown. Many women of childbearing age who are not planning pregnancy may not take supplements or pursue diets adequate in folate, particularly those who are poorer and less educated. Some authorities suggest that food fortification with folate has greater potential to reach the entire population at risk. 101-111

The results of controlled trials indicate that folic acid supplementation will not prevent all neural tube defects. The Centers for Disease Control and Prevention estimates that low-dose folic acid supplementation of all women capable of pregnancy would reduce the incidence of neural tube defects in the U.S. by 50%. 102 Therefore, the use of periconceptional folic acid supplements does not preclude offering screening for neural tube defects, although the cost-effectiveness of such screening is likely to be reduced given a lower risk of occurrence.

CLINICAL INTERVENTION

The offering of screening for neural tube defects by maternal serum alpha-fetoprotein (MSAFP) measurement at 16-18 weeks' gestation is recommended for all pregnant women who are seen for prenatal care in locations that have adequate counseling and follow-up services, skilled high-resolution ultrasound and amniocentesis capabilities, and reliable, standardized laboratories ("B" recommendation). Women with elevated MSAFP levels should receive a second confirmatory test when time allows (i.e., before 18 weeks of gestation), and high-resolution ultrasound examination by an adequately trained and experienced examiner before amniocentesis is performed. Screening with MSAFP may be offered as part of multiple-marker screening (see Chapter 41). There is currently insufficient evidence to recommend for or against the offering of screening for neural tube defects by routine midtrimester ultrasound examination in pregnant women ("C" recommendation). Recommendations may be made against such screening, except when conducted by expert sonographers at major screening centers, based on its unproven accuracy in other settings, the availability and proven effectiveness of MSAFP screening, and cost. See Chapter 36 for additional recommendations regarding routine ultrasound examination in pregnancy. Pregnant women at high risk of neural tube defects (e.g., those with a previous affected pregnancy) should be referred to specialized centers for appropriate diagnostic evaluation, including high-resolution ultrasound and amniocentesis.

Folic acid supplementation at a dose of 4 mg/day beginning 1-3 months prior to conception and continuing through the first trimester is recommended for women planning pregnancy who have previously had a pregnancy affected by a neural tube defect, to reduce the risk of recurrence ("A" recommendation). It is also recommended that all women planning pregnancy take a daily multivitamin or multivitamin-multimineral supplement containing folic acid at a dose of 0.4-0.8 mg, beginning at least 1 month prior to conception and continuing through the first trimester, to reduce the risk of neural tube defects ("A" recommendation). Taking a daily multivitamin containing 0.4 mg of folic acid is also recommended for all women capable of becoming pregnant, to reduce the risk of neural tube defects in unplanned pregnancies ("B" recommendation). Women taking drugs that interfere with folate metabolism (e.g., methotrexate, pyrimethamine, trimethoprim, phenytoin), women at increased risk of vitamin B12 deficiency (e.g., vegans or persons with AIDS), and those with epilepsy whose seizures are controlled by anticonvulsant therapy, should consult with their clinician regarding potential risks and benefits prior to considering folic acid supplementation. There is currently insufficient evidence to recommend for or against counseling women planning or capable of pregnancy to increase their dietary folate consumption to 0.4 mg/day as an alternative to taking multivitamins with folic acid ("C" recommendation). Offering counseling to increase dietary folate intake to women who do not wish to take folic acid supplements may be recommended on other grounds, including low risk, low cost, and likely benefit.

The use of periconceptional multivitamins with folic acid does not necessarily obviate the need to offer screening for neural tube defects during pregnancy, since not all defects will be prevented by prophylaxis.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Carolyn DiGuiseppi, MD, MPH, based in part on background material prepared by Marie-Dominique Beaulieu, MD, MSc, and Brenda Beagan, MA, for the Canadian Task Force on the Periodic Health Examination.

43. Screening for Hemoglobinopathies

Burden of Suffering

Hemoglobin S is formed as the result of a single-gene defect causing substitution of valine for glutamic acid in position 6 of the beta chain of adult hemoglobin. Persons homozygous for hemoglobin S (HbSS) have sickle cell anemia. Under conditions of low oxygen tension, hemoglobin S polymerizes, causing the red blood cells of persons with sickle cell anemia to assume a "sickled" shape. This deformity of red blood cells leads to the symptoms of sickle cell disease. Persons heterozygous for both hemoglobin S and hemoglobin C (HbSC) and persons heterozygous for both hemoglobin S and beta-thalassemia (HbS/beta-thal) also may experience sickle cell disease, although their symptoms tend to be less severe than those of persons homozygous for hemoglobin S. 1 Sickle cell disease affects an estimated 50,000 Americans 2-4 of many racial and ethnic backgrounds. Among infants born in the U.S., sickle cell disease occurs in 1 in every 375 African Americans, 1 in 3,000 Native Americans, 1 in 20,000 Hispanics, and 1 in 60,000 whites. 4

The average life expectancy of patients with sickle cell anemia is decreased by 25-30 years compared with that of blacks in the general population. 5 Symptom severity and life expectancy vary considerably, with some patients surviving beyond middle age and others dying during infancy or childhood. Mortality in children with sickle cell disease peaks between 1 and 3 years of age and is chiefly due to sepsis caused by Streptococcus pneumoniae. 1 Pneumococcal septicemia occurs at a rate of approximately 8 episodes per 100 person-years of observation in children under the age of 3 years with sickle cell disease. 6,7 The case-fatality rate can be as high as 35%. 8 After infancy, patients with sickle cell disease are usually anemic and may experience painful crises and other complications, including acute chest syndrome, priapism, strokes, splenic and renal dysfunction, bone and joint disease, ischemic ulcers, cholecystitis, and hepatic dysfunction associated with cholelithiasis. 4,9 The causes of premature death in adults are varied and include sudden death during acute pain episodes, stroke, infection, and chronic organ failure. 5 Treatment for sickle cell disease can be expensive, and this chronic illness places a large economic and psychosocial burden on patients and their caretakers. 9

About 2 million Americans are heterozygous for hemoglobin S and hemoglobin A (normal adult hemoglobin). This carrier state has been termed sickle cell trait and is present in 8% of the African American population. 3 Except for a slightly increased risk of exercise-related death under extreme conditions, 10 persons with sickle cell trait experience negligible morbidity. 11,12 Parents who are both carriers have a 25% probability with each pregnancy of having a child with sickle cell disease. One in every 150 African American couples in the U.S. is at risk of giving birth to a child with sickle cell disease (about 3,000 pregnancies per year). 13,14
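
The "1 in 150 couples" figure follows from the 8% carrier prevalence under an assumed Hardy-Weinberg model of random mating; the short Python sketch below reproduces the arithmetic. Note that it covers HbSS conceptions only, which is why the resulting per-pregnancy figure is lower than the 1 in 375 birth incidence quoted earlier (that figure also counts HbSC and HbS/beta-thalassemia disease).

    carrier_freq = 0.08                 # sickle cell trait prevalence quoted above
    couple_at_risk = carrier_freq ** 2  # probability both partners are carriers
    hbss_per_pregnancy = couple_at_risk * 0.25  # autosomal recessive: 25% per pregnancy
    print(f"1 in {1 / couple_at_risk:.0f} couples at risk")            # 1 in 156
    print(f"1 in {1 / hbss_per_pregnancy:.0f} pregnancies with HbSS")  # 1 in 625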

Certain thalassemias may also be detected by screening for hemoglobinopathies. Thalassemias result from genetic defects that cause reduced synthesis of the polypeptide globin chains that combine to form hemoglobin. The clinical severity of these syndromes is related to the degree of reduction of alpha- or beta-globin synthesis.

The beta-thalassemias occur primarily among individuals of Mediterranean, African, or Southeast Asian origin. Beta-thalassemia minor occurs in persons heterozygous for a gene causing reduction in beta-globin synthesis. Life expectancy is normal, and the clinical severity of this state is related to the specific defect and its effect on beta-chain synthesis. Beta-thalassemia major occurs in persons homozygous for genetic defects in beta-globin synthesis. Beta-globin synthesis in these individuals is markedly reduced or absent. They suffer from severe anemia and are transfusion dependent. Modern transfusion protocols and iron chelation therapy have greatly improved prognosis, and some patients survive beyond the third decade of life. 15 Beta-thalassemia major affects fewer than 1,000 Americans. 16

Alpha-thalassemias are common in persons of Southeast Asian descent and also occur in persons of African and Mediterranean origin. Alpha-thalassemias result from deletions of one or more of the four genes responsible for alpha-globin synthesis. Patients with a four-gene deletion develop hydrops fetalis secondary to severe anemia and die before or shortly after birth. Mothers of these infants are at risk for toxemia during pregnancy, operative delivery, and postpartum hemorrhage. 17 The three-gene deletion is referred to as hemoglobin H disease and affects about 1% of Southeast Asians. 18 Three- and four-gene deletions are rare in African Americans. Persons with hemoglobin H disease experience chronic hemolytic anemia that is exacerbated by exposure to oxidants and may require transfusion. Persons with a two-gene deletion have microcytic red blood cells and occasionally mild anemia. The one-gene deletion is a "silent" carrier state. These latter two conditions are often called alpha-thalassemia trait. The exact prevalence of alpha-thalassemia trait is uncertain, but it is estimated to be 5-30% among African Americans and 15-30% among Southeast Asians. 18-20

Hemoglobin E trait is the third most common hemoglobin disorder in the world and the most common in Southeast Asia, where its prevalence is estimated to be 30%. 18 Although hemoglobin E trait is associated with no morbidity, the offspring of individuals who carry this hemoglobin variant may exhibit thalassemia major (hemoglobin E/beta-thalassemia) if the other parent has beta-thalassemia trait and contributes that gene. This combination is the most common cause of transfusion-dependent thalassemia in areas of Southeast Asia. 18

Accuracy of Screening Tests

Two-tier hemoglobin electrophoresis (cellulose acetate electrophoresis with confirmation by citrate agar electrophoresis) or thin-layer isoelectric focusing are widely used screening tests for hemoglobin disorders. 4,21 High-performance liquid chromatography (HPLC) is a newer technique that offers high resolution and is in use in at least one screening program. 22 Techniques employing monoclonal antibodies and recombinant DNA technology are not used widely. 23 Blood for screening is collected in heparinized tubes or, in the case of newborn screening, on filter paper (Guthrie paper blotter). 24

Electrophoresis is highly specific in the detection of certain hemoglobin disorders, such as sickle cell disease. In one study, all 138 children with hemoglobin S identified by screening 3,976 African American newborns were found to have a sickling disorder when retested at age 3-5 years. 25 Another study of 131 infants detected by screening found only nine instances in which the sickling disorder required reclassification and no instance in which a child originally diagnosed as having sickle cell disease was found to have sickle cell trait. 26 Ten years' experience with universal screening of Colorado newborns (528,711 infants) using filter paper specimens and two-tier hemoglobin electrophoresis was reported in 1990. 27 Fifty infants with sickle cell diseases (HbSS, HbSC, HbS/beta-thal) and 27 infants with other hemoglobin disorders were identified. Initial screening failed to identify four infants with sickle cell disease, but three of these were diagnosed on routine follow-up testing of infants suspected of having sickle cell trait. There were 32 false-positive results; on follow-up testing, 27 of these infants were confirmed to have a hemoglobinopathy trait, and the remaining five had normal hemoglobin. The test characteristics of HPLC may be superior to those of two-tier electrophoresis, but supporting data have yet to be published.
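
Treating the Colorado counts as a single 2x2 screening table (a simplification, since the 4 missed cases were missed at initial screening while the 50 were confirmed by the full two-tier protocol) gives the test characteristics directly; the Python sketch below does the arithmetic.

    # 2x2 screening table built from the Colorado program figures quoted above
    tp, fn, fp = 50, 4, 32          # identified cases, missed cases, false positives
    n_screened = 528_711
    tn = n_screened - tp - fn - fp  # everyone else, assumed truly negative

    sensitivity = tp / (tp + fn)    # ~92.6%
    ppv = tp / (tp + fp)            # ~61%
    specificity = tn / (tn + fp)    # ~99.99%
    print(f"sensitivity {sensitivity:.1%}, PPV {ppv:.1%}, specificity {specificity:.4%}")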

The yield in screening pregnant women for hemoglobin disorders depends on the risk profile of the population being tested. In one study, electrophoresis in combination with a complete blood count was performed on 298 African American and Southeast Asian prenatal patients. Ninety-four women (31.5%) had a hemoglobin disorder (including sickle cell disease, sickle cell trait, hemoglobin E, alpha-thalassemia trait, beta-thalassemia trait, hemoglobin H, and hemoglobin C). 19 In a larger study in a different community, similar tests were performed on 6,641 prenatal patients selected without regard to race or ethnic origin. 28 One hundred eighty-five women (3%) had sickle cell trait, 68 (1%) had hemoglobin C, 30 (0.5%) had beta-thalassemia trait, and 17 (0.3%) had other disorders (hemoglobin E, alpha-thalassemia trait, hemoglobin H, hemoglobin E/beta-thalassemia disease). These results were obtained by combining electrophoresis with red cell indices. When low mean corpuscular volume (MCV) was used as the only screening test to detect thalassemia, the yield was 0.3-0.5%. 29

Prenatal diagnosis of sickle cell disease and other hemoglobinopathies in the fetus has been aided by advances in techniques of obtaining and analyzing specimens. Early tests involved the analysis of fetal blood obtained by fetoscopy or placental aspiration. 30 Genetic advances, however, have provided a safer 14 and more practical method in which amniocytes are obtained by amniocentesis and gene mutations are identified directly through recombinant DNA technology. 13 These techniques are highly accurate (error rate less than 1%) in detecting sickle cell disease and certain forms of thalassemia. 14,30-33 The principal disadvantage of using amniocentesis to obtain specimens is that it cannot be performed safely until about 16 weeks' gestation, thus delaying diagnosis and potential intervention until late in the second trimester. Chorionic villus sampling (CVS) is a means of obtaining tissue for DNA analysis as early as 10-12 weeks of gestation and is an established technique for prenatal diagnosis (see Chapter 41). 34,35

Effectiveness of Early Detection

Screening for hemoglobin disorders is usually discussed with respect to two target populations: neonates and adults of reproductive age. Newborns with sickle cell disease benefit from early detection through the prompt institution of prophylactic penicillin therapy to prevent pneumococcal sepsis. A multicenter, randomized, double-blind, placebo-controlled trial demonstrated that the administration of prophylactic oral penicillin to infants and young children with sickle cell disease reduced the incidence of pneumococcal septicemia by 84%. 36 Other benefits of identifying newborns with sickle cell disease include prompt clinical intervention for infection or splenic sequestration crises and education of caretakers about the signs and symptoms of illness in these children. A 7-year longitudinal study reported lower mortality in children with sickle cell disease identified in the newborn period than in children diagnosed after 3 months of age (2% vs. 8%), but the investigators did not account for confounding variables in the control group. 37 A briefer longitudinal study (8-20 months of follow-up) reported no deaths in 131 newborns detected through screening. 26 In the Colorado experience described above, 47 of the 50 newborns with sickle cell disease identified through screening remained in the state beyond 6 months of age. None of the 47 died during the period of observation. 27
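
Combining the trial's 84% relative reduction with the baseline rate of roughly 8 episodes per 100 person-years cited earlier gives a rough sense of the absolute benefit. The sketch below is illustrative arithmetic derived here, not a figure reported by the trial.

    baseline = 8 / 100     # septicemia episodes per child-year in children under 3 (from the text)
    rrr = 0.84             # relative risk reduction observed in the trial
    prevented = baseline * rrr
    print(f"~{prevented * 100:.1f} episodes prevented per 100 child-years of prophylaxis")  # ~6.7
    print(f"~{1 / prevented:.0f} child-years of prophylaxis per episode prevented")         # ~15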

Screening older children and adolescents is designed to detect carriers with sickle cell trait, beta-thalassemia trait, and other hemoglobin disorders that have escaped detection during the first years of life. Identification of carriers before childbearing allows them to make informed reproductive choices by receiving genetic counseling about partner selection and the availability of diagnostic tests in the event of pregnancy. There is some evidence that individuals who receive certain forms of counseling retain this information and may encourage other individuals, such as their partners, to be tested. 28,38-40 A prospective study of 142 persons screened for beta-thalassemia trait found that 62 (43%) encouraged other persons to be screened. 38 Compared with controls, those who had received counseling demonstrated significantly better understanding of thalassemia when tested immediately after the session. There is no direct evidence, however, that individual genetic counseling by itself significantly alters reproductive behavior or the incidence of births of infants with hemoglobin disorders. 9,41

Detection of carrier status during pregnancy provides prospective parents with the option of testing the fetus for a hemoglobinopathy. If the test is positive, they have time to discuss continuation of the pregnancy and to plan optimal care for their newborn. Parents appear to act on this genetic information. About half of pregnant women with positive tests for thalassemia refer their partners for testing and, if the father is positive, about 60% consent to amniocentesis. 28 If sickle cell disease is diagnosed in the fetus, about 50% of parents elect therapeutic abortion. 32,42 In a recent study, 18,907 samples from pregnant women were screened for abnormal hemoglobins, including thalassemias and hemoglobin S. In 810 (4.3%), an abnormal hemoglobin was identified; 66% occurred in mothers unaware that they carried an abnormal hemoglobin, and 80% occurred in mothers unaware that they were at risk for giving birth to a child with a serious hematologic disorder. Eighty-six percent of mothers who received counseling said they wanted their partner tested, and 55% of partners were tested. Seventy-seven pregnancies were identified as being at risk because the partner also was a carrier of an abnormal hemoglobin. Of these 77 pregnancies, the gestation was too advanced for prenatal diagnosis in 12 cases, and the condition for which the pregnancy was at risk was too mild for this service to be offered in 12 others. Prenatal diagnosis was offered in the remaining 53 pregnancies and accepted by 25 couples (47%). Of 18 amniocenteses actually performed, 5 fetuses were found to have clinically significant hemoglobinopathies, and one of these pregnancies was terminated. 43
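
The attrition at each step of this screening cascade is easier to follow when laid out explicitly; the Python sketch below simply restates the counts reported in the study and computes each step as a proportion of the one before.

    cascade = [
        ("samples screened", 18_907),
        ("abnormal hemoglobin identified", 810),
        ("pregnancies at risk (partner also a carrier)", 77),
        ("prenatal diagnosis offered", 53),
        ("prenatal diagnosis accepted", 25),
        ("amniocenteses performed", 18),
        ("fetuses with significant hemoglobinopathy", 5),
        ("pregnancies terminated", 1),
    ]
    print(f"{cascade[0][0]}: {cascade[0][1]:,}")
    for (label, n), (_, prev) in zip(cascade[1:], cascade):
        print(f"{label}: {n} ({n / prev:.1%} of previous step)")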

There is evidence from some European communities with a high prevalence of beta-thalassemia that the birth rate of affected infants has declined significantly following the implementation of routine prenatal screening, 30 and there are data suggesting a similar trend in some North American communities that have introduced community education and testing for thalassemia. 16 Time series studies do not, however, prove that such trends are due specifically to the effects of prenatal screening.

Recommendations of Other Groups

Screening for sickle cell disease in all newborns, regardless of their race or ethnic origin, has been recommended by a National Institutes of Health consensus conference 8 and by a guideline panel convened by the Agency for Health Care Policy and Research. 4 Screening infants from high-risk groups (e.g., those of African, Caribbean, Latin American, Southeast Asian, Middle Eastern, or Mediterranean ethnicity) has been recommended by the World Health Organization, 44 the British Society for Haematology, 45 the American Academy of Family Physicians (AAFP), 46 and the Canadian Task Force on the Periodic Health Examination. 47 The recommendations of the AAFP are currently under review. The American Academy of Pediatrics 48 and Bright Futures 49 recommend routine screening for hemoglobinopathies as required by individual states. At present, 29 states, Puerto Rico, and the District of Columbia mandate screening all newborns for hemoglobinopathies, and 12 states offer screening as an option. 50

The American College of Obstetricians and Gynecologists, 51 the British Society for Haematology, 45 and the Canadian Task Force 47 recommend selective prenatal screening and counseling of pregnant women from high-risk ethnic groups. The Canadian Task Force 47 recommends that parents with established positive carrier status be offered prenatal DNA analysis of tissue obtained by amniocentesis or CVS. No major organizations recommend routine screening of adolescents and young adults for carrier status.

Discussion

Hemoglobinopathies occur among all ethnic and racial groups. Efforts at targeting specific high-risk groups for newborn screening inevitably miss affected individuals because of difficulties in properly assigning race or ethnic origin in the newborn nursery. In one study of more than 500,000 newborns, parental race as requested on a screening form was found to be inaccurate or incomplete in 30% of cases. 27 Proponents of selective screening of high-risk populations emphasize that, especially in geographic areas with a small population at risk, cost-effectiveness is compromised and considerable expense is incurred in screening large numbers of low-risk newborns to identify the rare individuals with sickle cell disease or other uncommon hemoglobin disorders. 52 Studies supporting this argument have compared universal screening to no screening, however, not to targeted screening. Recent research that accounts for the additional procedural and administrative costs of targeted screening suggests that universal screening may be the more cost-effective alternative. 4 Whether to screen all infants (universal screening) or only those infants from ethnic groups known to be at relatively high risk of having sickle cell disease (targeted screening) is therefore a policy question to be addressed by individual screening programs, taking into consideration cost-effectiveness analyses, disease prevalence, and available resources.

There has been considerable debate over the value of screening for hemoglobinopathies in persons of reproductive age. Critics cite evidence that sickle cell screening programs in the past have failed to adequately educate patients and the public about the significant differences between sickle cell trait and sickle cell disease. This has resulted in unnecessary anxiety for carriers and inappropriate labeling by insurers and employers. 53 In addition, there is no evidence that counseling, however comprehensive, will be remembered throughout the individual's reproductive life, influence partner selection, alter use of prenatal testing, or ultimately reduce the rate of births of affected children. 9,29 Proponents argue that these outcomes should not be used as measures of effectiveness since the goal of genetic counseling is to facilitate informed decision making by prospective parents. 9,20,29 In this regard, clinicians are responsible for making the individual aware of the diagnosis, the risk to future offspring, and the recommended methods to reduce that risk, regardless of the strength of the evidence that such counseling reduces the number of affected offspring.

CLINICAL INTERVENTION

Screening newborn infants for hemoglobinopathies with hemoglobin electrophoresis or other tests of comparable accuracy on umbilical cord or heelstick blood specimens is recommended ("A" recommendation). In geographic areas with a very low incidence of hemoglobin disorders, selective screening of newborns may be more efficient than universal screening. Infants with sickle cell disease must receive prompt follow-up, including oral penicillin prophylaxis, diagnostic testing, immunizations, and regular evaluations of growth and nutritional status. Their families should receive genetic counseling regarding testing of family members and risks to future offspring, information about the disease, education about early warning signs of serious complications, and referrals for peer support groups and sources of medical and mental health services.

Offering screening for hemoglobinopathies with hemoglobin electrophoresis or other tests of comparable accuracy to pregnant women at the first prenatal visit is recommended ("B" recommendation), especially for those who are members of racial and ethnic groups with a high incidence of hemoglobinopathies (e.g., individuals of African, Caribbean, Latin American, Mediterranean, Middle Eastern, or Southeast Asian descent). Carriers identified through testing should be urged to have the father tested and should receive information on the availability of prenatal diagnosis if the father is positive and the fetus is at risk of having a clinically significant hemoglobinopathy.

There is insufficient evidence to recommend for or against screening for hemoglobinopathies in adolescents and young adults from ethnic and racial groups known to be at increased risk for sickle cell disease, thalassemias, and other hemoglobinopathies, to enable them to make informed reproductive choices ("C" recommendation). Recommendations to offer such testing may be made on other grounds, including burden of suffering and patient preference. If provided, testing should be accompanied by counseling, which should include a description of the significance of the disease, how it is inherited, the availability of a screening test, and the implications for individuals and their offspring of a positive result.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by John Andrews, MD, MPH, and Modena Wilson, MD, MPH.

44. Screening for Phenylketonuria

Burden of Suffering

Phenylketonuria (PKU) is an inborn error of phenylalanine metabolism that occurs in 1 of every 12,000 births in North America. 1,2 In the absence of treatment during infancy, nearly all persons with this disorder develop severe, irreversible mental retardation. Many also experience neurobehavioral symptoms such as seizures, tremors, gait disorders, athetoid movements, and psychotic episodes with behaviors resembling autism. 3 These clinical manifestations of PKU have rarely developed in children born after the mid-1960s, when routine screening was legislated and early treatment for PKU became common. This has resulted in a cohort of healthy phenylketonuric women who have reached childbearing age. If dietary restriction of phenylalanine is not maintained during pregnancy, these women are at increased risk of giving birth to a child with mental retardation, microcephaly, congenital heart disease, and low birth weight. 4 The incidence of this maternal PKU syndrome is 1 of every 30,000-40,000 pregnancies. 5 In the absence of dietary control in women with PKU who become pregnant, it is estimated that exposure of the fetus to the teratogenic effects of maternal hyperphenylalaninemia could return the incidence of PKU-related mental retardation to the level seen before PKU screening was established. 6

Accuracy of Screening Tests

Blood phenylalanine determination by the Guthrie test has been the principal screening test for PKU for three decades. 7 Although well-designed evaluations of the sensitivity and specificity of the Guthrie test have never been performed, 8 sensitivity estimates 9,10 and international experience with its use in millions of newborns suggest that false-negative results are rare. Fluorometric assays, which can detect differences in blood phenylalanine levels as low as 0.1 mg/dL, are alternative forms of testing that also offer excellent sensitivity. 8 Most missed cases of PKU do not appear to be due to false-negative results of the screening tests, but to submission of an inadequate sample, clerical error involving the sample, or failure to follow up positive results. 9 Standards for adequate blood collection on filter paper for neonatal screening programs have been published. 10a

False-positive as well as false-negative results can occur in PKU screening. In certain settings and populations, the ratio of false positives to true positives is as high as 32 to 1. 8 Although false positives have been viewed for many years as less important than false-negative results because they can be corrected easily by repeating the test, recalling patients for a second PKU test may generate considerable parental anxiety. 11,12
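
A false-positive to true-positive ratio of 32 to 1 corresponds to a positive predictive value of about 3%; the one-line Python conversion below makes this explicit.

    fp_per_tp = 32             # worst-case ratio quoted in the text
    ppv = 1 / (1 + fp_per_tp)  # positive predictive value
    print(f"PPV = {ppv:.1%}")  # 3.0%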

The sensitivity of the Guthrie test is influenced by the age of the newborn when the sample is obtained. The current trend toward early discharge from the nursery (resulting in PKU screening being performed as early as 1 to 2 days of age) has raised concerns that test results obtained during this early period may have low sensitivity. This is because the blood level of phenylalanine is typically normal in affected neonates at birth and, with the initiation of protein feedings, increases progressively during the first days of life. With the conventional cutoff of 4 mg/dL, diagnostic levels of phenylalanine may not yet be present in some phenylketonuric newborns tested in the first 24 hours of life. Prospective, longitudinal evaluations of serum phenylalanine levels in infants known to be at risk for PKU have demonstrated a variable rate of false-negative results when screening occurred within the first 24 hours of life. 13,14 The false-negative rates ranged from 2% to 31% for the first day of life, but decreased to 0.6% to 2% on the second day and to 0.3% by the third day. 8,13-16 Current rates may be lower due to the participation of many laboratories in a voluntary proficiency program run by the Centers for Disease Control and Prevention (CDC). The use of fluorometric assays, which offer more precise measurements of blood phenylalanine levels than the Guthrie test, also results in lower false-negative rates. 8 Two additional solutions to improve sensitivity -- repeat testing of all newborns after early discharge and lowering the cutoff value to reduce the false-negative rate -- have encountered criticism for several reasons. Repeat testing would have low yield and reduced cost-effectiveness; 17,18 it has been estimated that detecting even one case of PKU in this manner would require performing an additional 600,000 to perhaps 6 million tests. 8,18 Lowering the cutoff value, on the other hand, improves sensitivity at the expense of specificity, thereby increasing the ratio of false positives to true positives. 8 As of 1991, nine of the 53 screening programs in the U.S. used a cutoff level of greater than 2 mg/dL to define abnormal. 19 The majority of laboratories continue to use a cutoff of 4 mg/dL or greater.

The development of a cloned phenylalanine hydroxylase gene probe has made possible the prenatal diagnosis of PKU in families with previously affected children by analyzing DNA isolated from cultured amniotic cells or samples of chorionic villi. 20-22 Through the use of polymerase chain reaction, 31 alleles of the phenylalanine hydroxylase gene have been identified. 1 This may eventually permit the screening of the general population for carriers of these alleles, thereby detecting at-risk families prior to the birth of an affected child. 21-23

Routine screening of pregnant women for maternal PKU has been recommended as a means of preventing fetal complications. 2,5,24 This disorder is rare in the general population, however, and as a result of screening programs, many women with PKU are aware of their diagnosis. As the cohort of women born before the implementation of routine newborn screening moves out of its childbearing years, the yield from screening all pregnant women should be very low. In Massachusetts, routine screening of cord blood for 10 years detected only 22 mothers with previously undiagnosed hyperphenylalaninemia. 2,25

Effectiveness of Early Detection

Before treatment with dietary phenylalanine restriction was recommended in the early 1960s, severe mental retardation was a common outcome in children with PKU. A review in 1953 reported that 85% of patients had an intelligence quotient (IQ) less than 40, and 37% had IQ scores below 10; less than 1% had scores above 70. 3 Since dietary phenylalanine restriction was introduced, however, over 95% of children with PKU have developed normal or near-normal intelligence. 26-29 A large longitudinal study reports a mean IQ of 100 in children who have been followed to 12 years of age, 30 and other reports show adolescent and young adult patients are functioning well in society. 31 Although the efficacy of dietary treatment has never been proven in a properly designed controlled trial, the contrast between children receiving dietary treatment and historical controls is compelling evidence of its effectiveness. Recognition of this prompted most Western governments to require routine neonatal screening as early as the late 1960s.

It is essential that phenylalanine restrictions be instituted shortly after birth to prevent the irreversible effects of PKU. 26,28,32,33 Traditionally, strict adherence to the diet was recommended for the first 4-8 years of life, after which liberalization of protein intake could occur without damage to the developed central nervous system. 26,28,32-34 Recent data, however, suggest that discontinuation of the diet may result in a deterioration of cognitive functioning, leading many to recommend continuation of the diet through adolescence and into adulthood. 35-37 Even if these precautions are taken, dietary treatment may not offer full protection from subtle effects of PKU. Intelligence scores in treated persons with PKU, although often in the normal range for the general population, are lower, on average, than those of siblings and parents, 26 and mild psychological deficits, such as perceptual motor dysfunction and attentional and academic difficulties, have been reported. 38-40

Early detection of hyperphenylalaninemia in pregnant women may also be beneficial. The incidence of maternal PKU syndrome is increasing with the growing number of healthy phenylketonuric women now of childbearing age. Maternal hyperphenylalaninemia can produce teratogenic effects, even on normal fetuses who have not inherited PKU. If a mother with classic PKU does not follow a restricted phenylalanine diet during pregnancy, there is an overwhelming risk of giving birth to an abnormal child. This risk appears to increase as the average maternal levels of phenylalanine maintained during pregnancy increase. 41,42 Over 90% of these children will have mental retardation, 75% microcephaly, 40-50% intrauterine growth retardation, and 10-25% other birth defects. 4,5 Uncertainties exist, however, as to the extent to which these outcomes can be prevented by instituting treatment with dietary phenylalanine restriction during pregnancy. 4,43 Although some pregnant women under treatment have given birth to normal children, a number of investigators have found that dietary intervention during pregnancy fails to prevent fetal damage. 40,43-47 Preliminary evidence from the North American Collaborative Study of Maternal Phenylketonuria, on the other hand, suggests improved outcome if metabolic control is achieved by the 10th week of pregnancy. 48 Many believe dietary restrictions must be instituted prior to conception to be effective. 2,4,45,49 There are concerns, however, that the low-phenylalanine diet may produce deficiencies in calories, protein, and other nutrients that are needed for proper fetal growth. 5,43

Recommendations of Other Groups

Every state has mandated that screening for PKU be provided to all newborns, but participation is not required by statute in Delaware, Maryland, North Carolina, or Vermont. 19 Testing for blood phenylalanine level after 24 hours of life and before 7 days is recommended by the American Academy of Pediatrics (AAP), 50 the American Academy of Family Physicians, 51 the American College of Obstetricians and Gynecologists (ACOG), 52 Bright Futures, 53 and the Canadian Task Force on the Periodic Health Examination. 54 All of these organizations recommend that infants who are not screened in the nursery should be screened by a physician by 3 weeks of age. As earlier hospital discharges (i.e., <24 hours) become the norm, eight states have mandated a second screening of all newborns between 1 and 6 weeks of age, and most other states recommend repeat specimens if first collected before 24-48 hours of age. 19 The AAP endorses a second test for those infants who were screened before 24 hours of age. 50

No major organization recommends routine prenatal screening for maternal PKU. ACOG recommends taking a history for known inborn errors of metabolism at the initial evaluation of the pregnant woman. 55 The AAP recommends that female patients known to have hyperphenylalaninemia be referred to appropriate treatment centers prior to conceiving. 56

Discussion

There is good evidence that early detection of PKU by neonatal screening substantially improves neurodevelopmental outcomes for affected persons. The evidence is less clear for a benefit from screening pregnant women for PKU. Available evidence has not established that dietary restriction begun during pregnancy starts early enough to prevent fetal damage, and such restriction may have other adverse effects. In addition, the incidence of previously undiagnosed maternal hyperphenylalaninemia is low, since many women who are currently of childbearing age were born after the introduction of widespread PKU screening in the mid-1960s and are likely to have been detected as newborns. The cost-effectiveness of screening during pregnancy has not been established.

CLINICAL INTERVENTION

Screening for phenylketonuria by measurement of phenylalanine level on a dried-blood spot specimen, collected by heelstick and adsorbed onto filter paper, 10a is recommended for all newborns before discharge from the nursery ("A" recommendation). Infants who are tested within the first 24 hours of life should receive a repeat screening test by 2 weeks of age. Premature infants and those with illnesses should optimally be tested at or near 7 days of age, but in all cases before discharge from the newborn nursery. All parents should be adequately informed regarding the indications for testing and the interpretation of PKU test results, including the probabilities of false-positive and false-negative findings.

There is insufficient evidence to recommend for or against routine prenatal screening for maternal PKU ("C" recommendation), but recommendations against such screening may be made on other grounds, including the rarity of previously undiagnosed maternal hyperphenylalaninemia, cost, and the potential adverse effects of dietary restriction.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Robert Baldwin, MD, and Modena Wilson, MD, MPH.

45. Screening for Congenital Hypothyroidism

Burden of Suffering

In the U.S., congenital hypothyroidism occurs in 1 of 3,600-4,000 infants. 1,2 Clinical diagnosis occurs in <5% of newborns with hypothyroidism because symptoms and signs are often minimal. 1 Without prompt treatment, most affected children gradually develop growth failure, irreversible mental retardation, and a variety of neuropsychologic deficits, comprising the syndrome of cretinism. 1,2 These complications have become rare since the introduction in the 1970s of routine neonatal screening and early treatment of congenital hypothyroidism.

Accuracy of Screening Tests

In the U.S., screening for congenital hypothyroidism in neonates is almost always done by radioimmunoassay of serum thyroxine (T4) and thyroid-stimulating hormone (TSH) from dried-blood spot specimens collected by heelstick and adsorbed onto filter paper. 3,4 Laboratories in most of the U.S. measure T4 on all specimens and TSH only if the T4 level is low; in several states, most of Europe, and elsewhere in the world, TSH is the initial screening test. 3,5 Simultaneous measurement of both T4 and TSH has greater sensitivity for congenital hypothyroidism than either of the two sequential methods currently used, 6 but it is not considered cost-effective by most programs at this time. 5,7 The T4 assay appears to have greater precision and reproducibility than the TSH assay, but false-negative rates are similar with the two methods. 6 Both types of screening may miss the 3-5% of cases of congenital hypothyroidism that are caused by pituitary dysfunction, as well as the 3-14% of cases in which patients present with hypothyroxinemia and delayed TSH elevation. 8-10 The primary T4-supplemental TSH approach also misses some patients with residual thyroid tissue (i.e., ectopic gland) that results in initially normal T4 with elevated TSH; sensitivity for these cases can be improved by repeat screening at 2-6 weeks. 11 Using a higher cutoff point to define abnormal T4 results in fewer false negatives due to these biologic factors. In one population, using a cutoff of the lowest 5% of T4 values to define "low" missed 3.5% of cases, whereas using the lowest 10% resulted in 1.5% being missed. 12 Only 0.2% of cases were missed using the lowest 20% as a cutoff, but at substantially increased cost in terms of repeat testing.
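
These cutoff data describe a direct trade-off between missed cases and recall burden. The Python sketch below combines the reported miss rates with the 1 in 3,600-4,000 incidence quoted earlier to estimate, per birth cohort, how many specimens each cutoff flags for follow-up and how many true cases it misses; the 100,000-birth cohort size is an arbitrary illustration, not a figure from the text.

    incidence = 1 / 3_800   # midpoint of the quoted 1 in 3,600-4,000 incidence
    births = 100_000        # illustrative cohort size (an assumption)
    cases = births * incidence  # ~26 expected cases

    # (cutoff percentile, fraction of cases missed) from the study cited above
    for pct, missed in [(0.05, 0.035), (0.10, 0.015), (0.20, 0.002)]:
        flagged = births * pct  # specimens flagged as "low" for further testing
        print(f"lowest {pct:.0%} cutoff: {flagged:>6,.0f} flagged, "
              f"{cases * missed:.2f} expected cases missed")

Moving the cutoff from the lowest 5% to the lowest 20% of T4 values cuts the expected misses from roughly one case to near zero in this cohort, but quadruples the number of specimens requiring repeat testing.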

False negatives also occur due to screening errors. Such failures occur in specimen collection, laboratory procedures (e.g., failure to record an abnormal result), or follow-up. 13 Standards for adequate blood collection on filter paper for neonatal screening programs have been published. 13a Infants at increased risk for false negatives from screening errors include those born at home, ill at birth, or transferred between hospitals early in life. 2 In North America, an estimated 6-12% of neonates with congenital hypothyroidism are not detected in screening programs as a result of biologic factors or screening errors. 8,13

In established U.S. screening programs, there are 4-8 false positives for every proven case, 8 although follow-up testing readily corrects these. Screening with primary TSH instead of T4-supplemental TSH may result in fewer false-positive tests. 8,14 False-positive results are more likely when screening is done in the first 24-48 hours of life, since normal TSH values in the first 2 days of life may exceed the standard cutoff used by most programs. 15 Evidence for long-term adverse psychologic effects from falsely positive screening test results is limited by methodologic flaws. 16-19

Effectiveness of Early Detection

Most cases of congenital hypothyroidism present clinically during the first year of life. 20 Retrospective studies of patients with congenital hypothyroidism have reported that delay of diagnosis and treatment beyond the first 1-3 months of life is likely to result in irreversible neuropsychologic deficits. 21-25 More recent prospective studies show that screening neonates and treating affected infants within the first weeks of life results, on average, in normal or near-normal intellectual performance and growth at ages 5-12 years. 26-34 These children appear to have somewhat lower cognitive and motor development compared to sibling or classmate controls, however, and continue to manifest subtle deficits in language, perception, and motor skills. 27,28,31-35 Both age at onset of therapy and quality of therapeutic control achieved during the first year of life affect long-term intellectual outcome, providing supportive evidence for a benefit from earlier detection and treatment. 27,28,31-33,35 The reduced incidence of severe neuropsychologic effects observed with early, adequate treatment has prompted most Western governments to require routine screening for all neonates.

Screening at birth may, in fact, occur too late to prevent important neurodevelopmental deficits in some infants. Observational studies suggest that infants affected more severely in utero, as evidenced by greater delay in bone age at diagnosis, lower T4 at screening, or a diagnosis of thyroid agenesis, have significantly poorer developmental outcomes compared to those with milder disease or to normal controls, even when detected and treated in the newborn period. 28,31-36

Recommendations of Other Groups

Screening of newborns for hypothyroidism is offered in all states, but mandated in 46 states and the District of Columbia. 3 Five states and the District of Columbia require or strongly recommend a routine second screening test, from 1 to 4 weeks later. 3 Screening is recommended by the Canadian Task Force on the Periodic Health Examination, 37 the American Academy of Family Physicians, 38 Bright Futures, 39 and jointly by the American Academy of Pediatrics and the American Thyroid Association. 5

Discussion

The natural history of congenital hypothyroidism has changed dramatically since newborn screening was instituted in this country. 2 Before screening was available, many children with this disorder were at least moderately, and sometimes profoundly, retarded, while recent prospective studies have demonstrated normal or near-normal intelligence in virtually all of those detected by screening and treated early in life. There is thus good evidence to support screening for congenital hypothyroidism in the newborn.

CLINICAL INTERVENTION

Screening for congenital hypothyroidism with thyroid function tests performed on dried-blood spot specimens is recommended for all newborns, optimally between days 2 and 6, but in all cases before newborn nursery discharge ("A" recommendation). Blood specimens should be collected by heelstick, adsorbed onto filter paper, and air dried using standard technique. 13a The choice of which thyroid function test or tests to perform is generally determined by individual state requirements. 3 Testing procedures and follow-up treatment for abnormal results should follow current guidelines. 5 Care should be taken to ensure that those born at home, ill at birth, or transferred between hospitals in the first week of life are appropriately screened before 7 days of age. Normal newborn screening results should not preclude appropriate evaluation of infants presenting with clinical symptoms and signs suggestive of hypothyroidism.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Carolyn DiGuiseppi, MD, MPH.

46. Screening for Postmenopausal Osteoporosis

Burden of Suffering

An estimated 1.3 million osteoporosis-related fractures occur each year in the U.S. 1 About 70% of fractures in persons aged 45 or older are types that are related to osteoporosis. 2 Most of these injuries occur in postmenopausal women. Over half of all postmenopausal women will develop a spontaneous fracture as a result of osteoporosis. 3 It has been estimated that about one quarter of all women over age 60 develop vertebral deformities and about 15% of women sustain hip fractures during their lifetime. 4,5 The annual cost of osteoporosis-related fractures in the U.S. has been estimated to be over $8 billion in direct and indirect costs. 6 Most fractures in elderly women are due in part to low bone mass; osteoporosis-related fractures commonly involve the proximal femur, vertebral body, and distal forearm. 7 Of these sites, the proximal femur (hip) has the greatest effect on morbidity and mortality; there is a 15-20% reduction in expected survival in the first year following a hip fracture. 8 Hip fractures are also associated with significant pain, disability, and decreased functional independence. 9 Among persons living at home at the time of a hip fracture, about half experience a deterioration in social function within 2.5 years. 10

Low bone density is strongly associated with an increased risk of fracture. 11 By one estimate, a 50-year-old woman in the 10th percentile of bone density has a 25% lifetime risk of hip fracture (vs. 8% for those in the 90th percentile). 12 A World Health Organization study group has recommended that osteoporosis be defined as a bone density more than 2.5 standard deviations (SD) below the mean bone mass of young adult women, and that osteopenia (low bone mass) be defined as a bone density 1-2.5 SD below that mean. 13 Risk of postmenopausal osteoporosis is a function of rate of bone loss as well as peak bone mass. The principal risk factors for osteoporosis are female sex, advanced age, Caucasian race, low body weight, and bilateral oophorectomy before menopause. 1,4 Other historical risk factors such as parity, lactation history, and caffeine intake have been shown to be poor predictors of bone mass. 14-16 Smoking is a probable risk factor for hip fracture, but it is a less reliable predictor of bone mass. 17 The lower weight and poorer health of smokers compared to nonsmokers may be responsible for the associations between smoking and bone mass and fracture risk. 18
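The WHO definitions reduce to a calculation on the standard deviation scale. The sketch below is a minimal illustration, assuming hypothetical young-adult reference values; actual reference data depend on the measurement device, skeletal site, and population studied.

```python
# Sketch of the WHO bone density classification described above.
# The reference mean and SD are hypothetical values for illustration.

YOUNG_ADULT_MEAN_BMD = 1.00   # g/cm2, hypothetical young-adult reference mean
YOUNG_ADULT_SD_BMD = 0.11     # g/cm2, hypothetical young-adult reference SD

def who_classification(bmd):
    """Return the number of SDs below/above the young-adult mean and the WHO category."""
    t_score = (bmd - YOUNG_ADULT_MEAN_BMD) / YOUNG_ADULT_SD_BMD
    if t_score < -2.5:
        return t_score, "osteoporosis"
    if t_score <= -1.0:
        return t_score, "osteopenia (low bone mass)"
    return t_score, "normal"

print(who_classification(0.70))  # about 2.7 SD below the mean -> osteoporosis
```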

Accuracy of Screening Tests

A number of radiologic screening tests have been proposed for both clinical and research purposes to detect low bone mass in asymptomatic persons. These include conventional skeletal radiographs, quantitative computed tomography, single photon absorptiometry, dual photon absorptiometry, and dual energy x-ray absorptiometry. Although skeletal x-rays can detect focal bone disorders and fractures, they do not reliably detect bone loss of less than 20-30%, and they are of limited value in estimating bone mass. 19 The other techniques vary in their availability, cost, and convenience, and provide measures expressed as bone mineral content (BMC) in grams/cm, or as bone mineral density (BMD) in grams/cm².

Single photon absorptiometry (SPA), in which radioisotopes are the photon source, can measure BMC or BMD in cortical bone in the radius or calcaneus. 20 Dual photon absorptiometry (DPA), dual energy x-ray absorptiometry (DXA), and quantitative computed tomography (QCT) provide direct measures of BMD and are most useful in evaluating the trabecular bone density in locations beneath large amounts of soft tissue (e.g., lumbar vertebrae, proximal femur). DPA and DXA use photons at two different energy levels, generated by radioisotopes (DPA) or an x-ray tube (DXA), thereby correcting for the attenuation produced by layers of soft tissue. 20-22 DXA is now widely used in the clinical setting, and provides more reproducible measures of bone density, with shorter examination times (5-10 vs. 20-40 minutes), than DPA. 20-22 The precision of DXA (variation in results on repeated measurement) is about 0.5-2%, compared to 1.5-4.0% for DPA. 23 Current data on the performance of these devices have been obtained primarily at specialized research centers, however. Most experts agree that DXA is a safe, accurate, and precise modality for measuring bone density that may be useful in the clinical setting. 24 Reproducibility of SPA is similar to that of DPA and DXA, but the cost per scan is significantly lower than that of DXA. Evidence suggests that SPA of the radius or calcaneus is also predictive for future risk of nonspine fracture. 25
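The dual-energy correction can be understood as solving two linear equations in two unknowns: the log-attenuation measured at each photon energy is a weighted sum of the bone and soft tissue areal densities. A minimal sketch follows; all coefficient and measurement values are made up solely for illustration.

```python
# Sketch of the dual-energy principle behind DPA/DXA: log-attenuation
# at two photon energies yields two linear equations in two unknown
# areal densities (bone mineral and soft tissue). Values are invented.
import numpy as np

# Hypothetical mass attenuation coefficients (cm2/g) at each energy:
#              bone   soft tissue
mu = np.array([[0.60, 0.25],    # low-energy photons
               [0.30, 0.18]])   # high-energy photons

# Hypothetical measured log-attenuations ln(I0/I) at the two energies:
measured = np.array([0.55, 0.30])

bone_density, soft_density = np.linalg.solve(mu, measured)
print(f"bone mineral areal density: {bone_density:.2f} g/cm2")
print(f"soft tissue areal density:  {soft_density:.2f} g/cm2")
```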

QCT is highly accurate in examining the anatomy and density of transverse sections and trabecular regions within the spine, but it is less practical as a routine screening test due to cost and higher radiation exposure. Ultrasound technology for assessing bone density and architecture is under development and may be of value in the future. Other screening tests under investigation include biochemical markers of bone turnover, which may be able to identify those women who will develop more significant bone loss. 26

Effectiveness of Early Detection

There is little evidence from controlled trials that women who receive bone density screening have better outcomes (improved bone density or fewer fractures) than women who are not screened. The primary argument for screening is based on evidence that postmenopausal women with low bone density are at increased risk for subsequent fractures of the hip, vertebrae, and wrist, 27-35 and that interventions can slow the decline in bone density after menopause.

Prospective cohort studies have demonstrated a dose-response relationship between BMD and fracture risk. 11,36,37 In a 2-year follow-up of 8,134 women over 65, the annual risk of hip fracture for women in the lowest quartile of femoral neck BMD was approximately 1%, almost twice that of women in the second lowest quartile and more than 8 times that of women in the highest quartile. 11 Various studies have estimated that each standard deviation decrease in BMC or BMD is associated with a 1.5-2.8-fold increase in risk of fracture. 38 There are no studies, however, determining how well perimenopausal bone density predicts long-term risk of fracture. Because the rate of postmenopausal bone loss varies among women, bone mass at menopause correlates only moderately with bone mass 10-20 years later, when most fractures occur. 39
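The reported gradient implies a multiplicative relationship between bone density and fracture risk. A brief sketch, assuming an illustrative gradient of 2.0 per SD (a value within, but not derived from, the reported 1.5-2.8 range):

```python
# Sketch of the per-SD gradient of fracture risk cited above: if each
# SD decrease in BMD multiplies risk by a factor g, a woman z SDs below
# the age-matched mean has roughly g**z times the baseline risk.
def relative_risk(z_below_mean, gradient=2.0):  # gradient chosen for illustration
    return gradient ** z_below_mean

print(relative_risk(1))  # 2.0 -> roughly twice the risk at 1 SD below the mean
print(relative_risk(2))  # 4.0 -> roughly fourfold risk at 2 SD below the mean
```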

Randomized trials have demonstrated that calcium supplementation and estrogen are effective in preserving bone density in postmenopausal women. 40-43 Due to the long delay between menopause and fracture, few prospective studies have been able to demonstrate directly that these interventions reduce fractures. Calcium plus vitamin D reduced hip fractures among very elderly women in France (mean age 84). 43 In a randomized trial in healthy postmenopausal women, calcium supplementation slowed bone loss and significantly reduced symptomatic fractures over 4 years. 43a Numerous observational and nonrandomized experimental studies suggest that risk of fracture can be reduced 25-50% by estrogen replacement therapy (see Chapter 68). The benefits of hormone prophylaxis on bone mass and fracture risk appear greatest with treatment begun close to menopause (before the period of rapid bone loss), and continued for longer periods (>5 years). Benefits appear to wane after stopping estrogen. 44 As a result, preventing fractures in older postmenopausal women may require continuing hormone therapy indefinitely. Other agents that inhibit bone resorption (e.g., calcitonin, bisphosphonates) or stimulate bone formation (e.g., sodium fluoride) can preserve or increase bone mass, but their use in asymptomatic persons remains investigational. 40

There is limited evidence that screening influences treatment decisions, and that women appreciate the more precise estimates of risk provided by BMD measurement. Women who had below average bone density were more likely to take calcium, vitamins, or estrogen than those with above average values (84% vs. 38%) in one study. 45 Compared to the low rates of compliance with hormone therapy in average women (see Chapter 68), 60% of women with low bone density detected by screening were still taking hormone therapy 8 months after screening. 46 The effect of BMD screening on long-term compliance is not known.

There are several important limitations to screening as a means of preventing fractures. A single measure of bone density carries a small risk of inaccurate values, and there is no value of BMD that discriminates well between patients who will develop a fracture and those who will not. 44 Other risk factors that independently influence falls or bone strength may be more important than low BMD for identifying older women at high risk of fracture. In a prospective study of over 9,500 women over 65, the presence of multiple risk factors (e.g., age >=80, fair/poor health, limited physical activity, poor vision, prior postmenopausal fracture, psychotropic drug use, among others) was a much stronger predictor of hip fracture than low bone density: the incidence of first hip fracture in women with 5 or more risk factors was 19/1,000 woman-years versus 1.1/1,000 in women with two or fewer risk factors. 18 Screening perimenopausal women is less predictive of risk later in life, and even women with "normal" bone density are likely to benefit from measures to prevent postmenopausal bone loss. Equally important, there is no consensus on what interventions are indicated for any particular level of bone density. Hygienic measures such as adequate calcium and vitamin D intake, exercise, and smoking cessation can be recommended irrespective of bone density. The decision to begin estrogen, in contrast, often depends on factors other than risk of osteoporosis (see Chapter 68).

Screening could have adverse effects if it leads to "labeling" of patients diagnosed with osteopenia or osteoporosis, or to false reassurance in those with normal bone density. In one study of women referred for screening, women with low bone density were more likely to restrict their activities, and those with normal bone density were less likely to follow routine hygienic measures to prevent osteoporosis (e.g., calcium or vitamin D). 45 Interpreting and explaining the values obtained is complex and may require considerable time for patient counseling about the significance of an abnormal bone density. Although the absolute benefit of preserving bone mass may be greatest in women with low bone density, the overall balance of risks and benefits of hormone therapy in an individual patient is likely to depend on other factors. 39 If estrogen therapy is likely to be recommended on other grounds, the clinical usefulness of routine screening is limited. 47 If other more specific and expensive therapeutic modalities (e.g., bisphosphonates, calcitonin) are shown to be effective in reducing fractures in asymptomatic high-risk women, however, this may increase the role of screening to identify appropriate candidates for treatment.

Recommendations of Other Groups

Recommendations against routine radiologic screening for osteoporosis have been issued by the Canadian Task Force on the Periodic Health Examination 18 and the American College of Physicians (ACP); 48 updated ACP guidelines are due out in 1996. Both of these organizations and a World Health Organization study group 49 concluded, however, that bone density measurements may be useful to guide treatment decisions in selected postmenopausal women considering hormone replacement therapy. The American Academy of Family Physicians recommends measuring BMC in women 40-64 years old with risk factors for osteoporosis (e.g., Caucasian race, bilateral oophorectomy before menopause, slender build) and in women for whom estrogen replacement therapy would otherwise not be recommended; these recommendations are under review. 50 The American College of Obstetricians and Gynecologists does not recommend routine screening for osteoporosis. 51 The National Osteoporosis Foundation is in the process of revising its guidelines for screening for osteoporosis. 20

Discussion

Routine bone densitometry of all postmenopausal women is likely to be time-consuming and very expensive. Screening times vary from 5-15 minutes for SPA and DXA to 20-45 minutes for QCT and DPA. 24 Average costs of screening have been estimated to be $75 with SPA, $75-100 with DXA, $100-150 with DPA, and $100-200 with QCT. 23,24 The costs and inconvenience of screening may be justified if screening reduces the burden of osteoporosis, but further research is necessary to demonstrate both the clinical effectiveness and cost-effectiveness of different screening and treatment strategies. 40,52

Although routine screening may not be appropriate for asymptomatic women, measurement of bone density may be useful for identifying persons at high risk of fracture who might not otherwise consider effective treatments such as estrogen. Measures of bone density provide more reliable estimates of risk than clinical assessment, and they may help both the patient and the clinician make more informed decisions about the potential benefits and risks of therapies such as estrogen. 45 Women who have been identified as having low bone density may be more likely to take estrogen and comply with other preventive measures, but the effect of screening on long-term outcomes (compliance with therapy, bone density, or fracture) has not been adequately studied. The net benefit of screening may be small if high-risk women do not continue long-term therapy, or if screening causes those with normal BMD to forego preventive measures. There is little reason for screening if the information is not likely to influence decisions by the patient or provider. For most women, osteoporosis prevention is only one of many factors that go into the decision whether or not to take estrogen.

CLINICAL INTERVENTION

There is insufficient evidence to recommend for or against screening for osteoporosis or decreased bone density in asymptomatic, postmenopausal women ("C" recommendation). Recommendations against routine screening may be made on the grounds of the inconvenience and high cost of bone densitometry, and lack of universally accepted criteria for initiating treatment based on bone density measurements. All perimenopausal and postmenopausal women should be counseled about the potential benefits and risks of hormone prophylaxis (see Chapter 68). Although direct evidence of benefit is not available, selective screening may be appropriate for high-risk women who would consider hormone prophylaxis only if they knew they were at high risk for osteoporosis or fracture.

All women should also receive counseling regarding universal preventive measures related to fracture risk, such as dietary calcium and vitamin D intake (Chapter 56), weight-bearing exercise (Chapter 55), and smoking cessation (Chapter 54). Elderly persons should also receive counseling regarding preventive measures to reduce the risk of falls and the severity of fall-related injuries (Chapter 58).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Robert B. Wallace, MD, MPH, Denise Tonner, MD, and David Atkins, MD, MPH.

47. Screening for Adolescent Idiopathic Scoliosis

Burden of Suffering

Scoliosis, a lateral spinal curve of 11 degrees or greater, affects an estimated 500,000 adults in the United States. 1 Idiopathic scoliosis accounts for about 65% of cases of structural scoliosis, 2,3 and a large proportion of these cases develop during adolescence. A lateral spinal curve of 11 degrees or greater is present in about 2-3% of adolescents at the end of their growth period. Curves greater than 20 degrees occur in less than 0.5% of adolescents. 4 The potential adverse effects of scoliosis include the progressive development of unpleasant cosmetic deformities, back pain, social and psychological problems during both childhood (e.g., poor self-image, social isolation) and adulthood 5 (e.g., limited job opportunities, lower marriage rate), and the financial costs of treatment.

There is little firm evidence that persons with idiopathic scoliosis are at significantly greater risk of experiencing back complaints than is the general population; most existing epidemiologic studies have lost a large proportion of patients to follow-up and lack adequate statistical power to detect a difference. 6,7 Data on the psychosocial effects of scoliosis and poor cosmesis are also limited. Long-term studies suggest a poor correlation between the location or magnitude of curves and the extent of psychosocial complaints. 6 A number of surveys and uncontrolled long-term studies of scoliosis patients have reported low marriage rates in women and high rates of unemployment, disability, and poor self-esteem, 9-12 but these studies lacked internal control groups, and many patients in the cohorts had spinal conditions other than adolescent idiopathic scoliosis. Persons with severe curves are at increased risk of restrictive pulmonary disease and increased mortality, but such curves are usually early-onset and at least 100-120 degrees in magnitude. 6,12-15 Severe curves of this magnitude have become uncommon in the United States and generally occur only as a consequence of severe, early-onset infantile or juvenile scoliosis. 17

Only a subset of curves detected through screening are destined to progress to a point of potential clinical significance. The probability that curves will progress more than 5 degrees can vary from 5% to 90%, depending on the patient's age, sex, and skeletal maturity, and the pattern and magnitude of the curve. 18-21 Progression is less likely in older children with greater skeletal maturity and with smaller curves. Depending on the patient population, between 25% and 75% of curves detected on screening may remain unchanged, and 3-12% of curves may improve. 21-24 The reported probability that curves less than 19 degrees will progress is 10% in girls between age 13 and 15 and 4% in children over age 15. 18,19 In curves that progress, one study found that the probability was 34% that the curves would progress more than 10 degrees, 18% that they would progress more than 20 degrees, and 8% that they would progress more than 30 degrees. 21 Another study of patients with untreated curves found that 25% ceased progression before reaching 25 degrees and that 12% ceased progression before reaching 29 degrees. 22

Accuracy of Screening Tests

The principal screening test for scoliosis is the physical examination of the back, which includes upright visual inspection of the back and the Adams forward-bending test. 25 Patients with abnormal findings on initial physical examination are often referred for a more thorough physical examination. Some physicians also obtain a standing roentgenogram to measure the degree of curvature (e.g., Cobb angle). Roentgenographic findings serve as the reference standard for estimating the sensitivity and specificity of screening tests. The reported 95% confidence interval for intraobserver and interobserver variability in measuring the Cobb angle on radiographs is 3-5 degrees and 6-7 degrees, respectively. 26,27
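One practical implication of this measurement variability is that a small change between two radiographs may reflect measurement error rather than true progression. A minimal sketch, assuming a 5-degree threshold suggested by the intraobserver figures above (an assumption for illustration, not an established clinical rule):

```python
# Sketch: a change in Cobb angle between two radiographs is credible
# evidence of progression only if it exceeds measurement variability.
# The 5-degree default is an assumption based on the intraobserver
# 95% confidence interval cited above.
def apparent_progression(cobb_first, cobb_second, error_margin=5.0):
    change = cobb_second - cobb_first
    return change > error_margin, change

progressed, change = apparent_progression(cobb_first=22, cobb_second=25)
print(progressed, change)  # False, 3 -- within plausible measurement error
```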

A relatively large proportion of children screened in schools are found to be "positive" on initial examination, but only some of these cases are ultimately found to have scoliosis. In studies of school screening, 11-35% of screened children were classified as positive and were referred for further evaluation; in one study, 37% of those referred for orthopedic evaluation were found to have no abnormality. 24,28 The sensitivity and specificity of the physical examination depend on the skills of the examiner and the degree of spinal curve being sought. In one study, public health nurses with special training in school screening were able to detect all children (sensitivity of 100%) with a Cobb angle greater than 20 degrees. The specificity of the examination was 91%. The sensitivity and specificity of the examination in detecting curves greater than 10 degrees were 73.9% and 77.8%, respectively. 29

The positive predictive value (PPV) of visual inspection and the forward-bending test varies with the degree of curvature by which a "true positive" is defined, the prevalence of scoliosis in the screened population, and the skills of the examiners. The magnitude of the PPV is inversely related to the degree of curvature used to define scoliosis, since the prevalence of small curves is greater than that of large curves. In an Australian study, the PPV was 78% for curves greater than 5 degrees in a population with an estimated prevalence for this degree of curvature of 3%. 30 In another study, the PPV was 54% for curves greater than 10 degrees (prevalence of 2%) and 24% for curves greater than 20 degrees (prevalence of 1%). A Canadian study involving specially trained school nurses reported a PPV of 18% in detecting curves greater than 10 degrees (prevalence of 1.7%) and a PPV of 4% in detecting curves greater than 20 degrees (prevalence of 0.3%). 28
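The inverse relationship between PPV and prevalence follows directly from Bayes' rule. The sketch below uses the sensitivity and specificity reported for trained school nurses above (73.9% and 77.8% for curves greater than 10 degrees) with illustrative prevalences; the resulting PPVs will not exactly match the cited studies, which involved different examiners and populations.

```python
# Sketch of why PPV falls as the prevalence of the target curve falls,
# using Bayes' rule. Sensitivity/specificity are the school-nurse
# figures cited above; the prevalences are illustrative.
def ppv(sensitivity, specificity, prevalence):
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

for prev in (0.03, 0.02, 0.01):
    print(f"prevalence {prev:.0%}: PPV = {ppv(0.739, 0.778, prev):.1%}")
# prevalence 3%: PPV = 9.3%; 2%: 6.4%; 1%: 3.3%
```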

Other scoliosis screening tests include the inclinometer 31,32 and Moiré topography. The inclinometer has a reported sensitivity of 96-98%, specificity of 29-68%, and reliability coefficients of 0.86-0.97 in detecting a Cobb angle of 20 degrees or more. 33 In some studies, Moiré topography correlates poorly with the Cobb angle. 34 A study that combined Moiré topography with the forward-bending test found that Moiré topography had a sensitivity of 95% and the forward-bending test had a sensitivity of 46% in detecting curves of 10 degrees or greater. The calculated PPV of the test was 29% (study prevalence of 4%). 35

There is limited information about the value of repeated screening of children who have previously tested negative for scoliosis. Although the probability of false-positive results would be increased by such a practice, repeat screening could potentially detect cases in older adolescents that escaped detection in early puberty or that developed into significant curves after screening was performed. There are few data that confirm these benefits. In one study, 43% of the cases that were detected on screening during tenth grade had previously tested negative 2-3 years earlier. 30

Effectiveness of Early Detection

Direct evidence of the effectiveness of scoliosis screening would require controlled prospective studies demonstrating that persons who receive screening experience better outcomes than those who are not screened. No such studies have ever been published, although there is some evidence that patients with advanced curves may be more likely to fail treatment (to progress further or undergo surgery) than patients with smaller curves. 36

The effectiveness of screening has been inferred from temporal studies that compared outcomes in local communities before and after the institution of large screening programs. These studies reported an increase in the number of referrals to local scoliosis clinics, the proportion of curves detected by screening, and the use of braces; they also reported a decrease in the mean age of referred cases, mean curve size, number of curves progressing to 40 degrees, proportion of cases requiring treatment, and the rate of spine fusions. 37-40 However, most of the studies provided limited information about the comparability of the "before" and "after" groups, making it difficult to determine whether the time trends were due to screening or to other temporal factors.

The rationale behind screening is the assumption that the early detection of curves permits prompt initiation of conservative therapeutic measures that may prevent progression of the curves and thereby avoid the complications of advanced scoliosis. The principal forms of conservative treatment for curves detected through screening include spinal orthoses (braces), electrical stimulation, and exercise therapy. Surgery may also be recommended for cases detected through screening, and it is argued that early surgery for large curves may produce better outcomes than surgery performed at later ages.

Braces are generally effective in providing immediate correction of curves; initial standing roentgenograms often demonstrate a 50-60% correction in the curve. 41 The effectiveness of braces in preventing progression is less certain. There have been no published controlled prospective studies establishing the effectiveness of brace treatment. A multicenter prospective controlled trial of brace therapy has recently been completed, 42 but the results were not published as of this writing. Most existing evidence regarding the effectiveness of brace therapy comes from uncontrolled case series reports. Early series with limited follow-up reported corrections in lateral curvature of as much as 50%. Although gradual loss of correction over the course of treatment was noted, follow-up 1-2 years after discontinuing brace treatment revealed significant improvement over pre-brace values in a large proportion of patients. 43 Mean rates of curve progression in braced patients were lower than rates expected from natural history data. 44 Long-term studies (more than 5 years of follow-up) have since demonstrated that the early post-treatment correction observed in these reports was often temporary. A gradual loss of correction was noted in the years following brace treatment, with mean overall improvement in such studies averaging 0-4 degrees compared with pre-brace values. 45-47

The absence of internal controls in most bracing studies limits inferences about the independent effects of braces on outcomes. Some investigators have relied on historical control groups to infer effectiveness. A recent review of over 1,000 braced patients, for example, concluded that braces altered the natural history of the disease because treatment failures were significantly less common in this series than was observed in a 1984 study by the same authors. 48 A retrospective review that did include a control group of matched, untreated patients reported that braced patients had a lower rate of curve progression and a higher rate of curve regression than untreated patients. 49 The differences were not statistically significant, but the study may have lacked adequate sample size to detect a difference. Another controlled study of similar design reported no statistically significant differences in any parameter of curve progression but also had a small sample size. 50

Outcome measures in most bracing studies relate only to curve correction and provide little information on health outcomes (e.g., back pain, patient feelings about their appearance, psychosocial impact). Available evidence is limited to an uncontrolled study, which found that braced patients noted an improvement in back "surface shape," as determined by a computerized photogrammetric surface mapping procedure. 51 Compliance problems limit the effectiveness of brace treatment. 52 Braces are generally recommended to be worn for 23 hours/day, a regimen that adolescents often find difficult to follow. 53 One study reported that only 15% of patients were highly compliant and that patients wore their braces an average of 65% of the recommended time. 54 Another study reported that complaints were uncommon among adolescents who wore braces. 55

Lateral electrical surface stimulation (LESS), in which surface electrodes are applied to the skin nightly for at least 8 hours until skeletal maturity is attained, 56 has only been evaluated in uncontrolled case series reports. Although early case series reports found low rates of progression (0-5%) in patients who received LESS, 57,58 subsequent studies found that 18-56% of patients progressed more than 10 degrees. 59,60 A chart review of patients who had completed treatment with LESS and were fully compliant found that over two thirds of curves progressed at least 5 degrees; 50% of the patients required fusion or ended treatment with a curve greater than 40 degrees. 61

Exercises have been advocated as prophylactic therapy to prevent the need for more extensive treatment (e.g., braces) and as adjunctive therapy to enhance the effectiveness of braces. 62 Scientific evidence to support either use of exercise therapy is limited. Exercise alone has historically demonstrated poor effectiveness in preventing curve progression, 62,63 although there have been few published studies in this area. A study of a school-based exercise program for adolescents with scoliosis found that curve progression after 1 year was not significantly different between the study group and a matched control group. 63 Supporting evidence includes a small randomized controlled trial (grade I evidence) of adolescents wearing a cast, which showed that exercise was more effective than traction in improving curves on lateral bending; 64 an uncontrolled cohort study that showed improved vital capacity in hospitalized scoliosis patients who received physiotherapy; 65 and an uncontrolled case series report, which found that some braced patients who performed a thoracic flexion exercise had reduced vertebral rotation and thoracic curves after exercise. 66 The latter study lacked controls, follow-up, and an assessment of clinical outcomes.

Surgery is generally not considered unless significant progression has occurred. Few clinical trials have compared surgery with no surgery to assess its efficacy; case series reports provide the largest body of evidence. In these studies, Harrington instrumentation and other surgical techniques appear to be effective in correcting scoliotic curves in the frontal plane -- Cobb angles are corrected by 40-70% 67-71 -- but thoracic hypokyphosis, deviations in axial rotation, and lordosis are often not corrected. 72 Reduced lumbar lordosis ("flat-back" deformity) 73 and "crankshaft" deformities (in skeletally immature patients with posterior arthrodeses) 74,75 can develop over time, although modifications in devices and techniques have reduced the risk of these complications. A small improvement in pulmonary function has also been reported. 76 Cotrel-Dubousset instrumentation appears to achieve correction in the frontal plane while maintaining normal sagittal contour, and some correction of axial rotation with improved cosmesis has also been reported. 77 Spinal decompensation due to torsional changes and spinal cord damage are potential complications of Cotrel-Dubousset instrumentation. 78,79 Another limitation of surgical techniques is loss of fixation, which can result in partial or total loss of correction. There is an estimated 10-25% loss of correction from Harrington instrumentation, but the risk may be lower in patients who are immobilized by a cast or brace. 80,81 Loss of correction appears to be uncommon with Cotrel-Dubousset instrumentation (less than 2%), and this technique does not require immobilization. 82

Few controlled studies have evaluated surgery in terms of clinical outcomes, such as back pain and functional status. Although spinal curves and axial rotations are influenced by surgery, they do not correlate well with the incidence of back pain or other symptoms. 83 Studies that have demonstrated effects on clinical outcomes have suffered from design limitations. An uncontrolled retrospective study of patients who underwent spinal fusion found that complaints of low back pain were lower than reported rates in the general population and in scoliotic patients who do not receive fusion. 84 This study did not include internal controls, and it was performed in the years before spinal instrumentation was introduced. Similarly, a review of 32 patients who underwent fusion reported that the preoperative prevalence of poor self-image (38%), uncomfortable sexual intercourse (35%), and frequent or constant back pain (53%) had decreased to zero when surveyed 24-50 months after surgery. 85 This study also lacked a control group. A retrospective cohort study found that surgically treated patients were less likely than nonsurgically treated patients to report pain and were more likely to be performing manual work. 86 The study and comparison groups were not selected randomly and there were important differences between groups in preoperative characteristics. A survey found that patients who had undergone Harrington instrumentation differed significantly from persons without scoliosis in terms of employment, activity levels, and complaints of back pain. 87 The control group did not consist of persons with scoliosis who did not receive surgery, and thus it is unclear whether observed differences were due to scoliosis or to the effects of surgery.

The adverse effects of screening itself are generally minor, but follow-up testing of abnormal findings may incur anxiety, inconvenience, work and school absenteeism for return visits, financial costs for visits and radiographic tests, and radiation exposure from follow-up roentgenograms (although roentgenograms are not routinely ordered on all follow-up evaluations and, when obtained, radiation exposure can be reduced by modern imaging and shielding techniques). Confirmed or suspected scoliosis may affect future health insurance and work eligibility. These postulated adverse effects have not been proven in controlled studies. Patients who receive treatment may also incur adverse effects, both from follow-up visits (e.g., inconvenience, absenteeism, radiation exposure) and from the treatment itself. Brace wear, for example, may produce skin irritation, disturbed sleep, restrictions on physical and recreational activities, and difficulty in finding clothes, but studies confirming these effects are lacking. Studies have shown an association between brace wear and adverse psychological effects, diminished self-esteem, and disturbed peer relationships. 88,89

The potential adverse effects of surgery can include the general risks of surgery, such as anesthesia risks, pain, and postoperative complications (e.g., bleeding, infection, pulmonary embolism), although these have been reduced by modern surgical and anesthetic techniques. 90 The overall risk of spinal cord damage is about 1-3%, 71,91 but rates are thought to be lower in uncomplicated surgery or when somatosensory evoked potential spinal cord monitoring is performed. 92 Fusion at certain ages during adolescence may affect the longitudinal growth of the spine. 93 Hook dislodgement and laminar fracture are possible. Other adverse effects of surgery include financial costs, inconvenience and lost productivity associated with hospitalization and convalescence, and external immobilization with casts or braces, which may be required for a period of months after surgery. Potential long-term complications occur generally in adults and include the development of pain caudad to the level of fusion, bursitis, pseudo-arthrosis, kyphotic deformities, and loss of normal lumbar lordosis. 91,94 Often these complications require further surgery during adulthood.

Recommendations of Other Groups

The Scoliosis Research Society has recommended annual screening of all children aged 10-14 years. 95 The American Academy of Orthopedic Surgeons has recommended screening girls at ages 11 and 13 years and screening boys once at age 13 or 14 years. 96 The American Academy of Pediatrics has recommended scoliosis screening with the forward-bending test at routine health supervision visits at ages 10, 12, 14, and 16 years; this recommendation is under review. 97 The Bright Futures guidelines recommend noting the presence of scoliosis during the physical examination of adolescents and children >=8 years of age. 98 The Canadian Task Force on the Periodic Health Examination concluded that there was insufficient evidence to make a recommendation. 99 Scoliosis screening is required by law in some states. 100

Discussion

The clinical logic behind screening for adolescent idiopathic scoliosis is based on a series of critical assumptions. The logic assumes that screening tests are accurate and reliable in detecting curves, that early detection of curves results in improved health outcomes, and that effective treatment modalities are available for cases detected through screening. Implicit in this causal pathway are the assumptions that small curves detected through screening are likely to progress to curves of potential clinical significance, that scoliosis causes important health problems, and that the benefits of early detection outweigh the potential adverse effects of screening and treatment. Scientific evidence to support these assumptions is limited.

The principal screening test for scoliosis, the physical examination of the back, has variable sensitivity and specificity, depending on the skills of the examiner and the size of the curve being sought. The positive predictive value in typical screening settings is low, due to the low prevalence of clinically significant curves. There is little evidence about the incremental value of repeat screening in children with previously normal results.

There have been no controlled studies to demonstrate whether adolescents who are screened routinely for idiopathic scoliosis have better outcomes than those who are not screened. Decreased curve size and surgery rates have been observed in communities that have adopted aggressive screening programs, but it is unclear whether the changes were due to screening or to other temporal factors. Beyond temporary correction of curves, there is inadequate evidence that braces limit the natural progression of the disease. The effectiveness of LESS and exercise has not been demonstrated convincingly in currently available research. Surgery is effective in reducing, but not eliminating, the lateral scoliotic curve. The scoliotic curves for which surgery is recommended (e.g., documented progression beyond 40-50 degrees) are more likely to be detected without screening.

The natural history of idiopathic scoliosis is such that most cases detected at screening will not require treatment because they will not progress significantly. Indications for preventive treatment (e.g., braces) are therefore uncertain and can result in unnecessary treatment. Only a small proportion of adolescents with idiopathic scoliosis are currently considered candidates for treatment (e.g., those having progressive curves greater than 30 degrees). Moreover, the burden of suffering associated with adolescent idiopathic scoliosis is uncertain. Cosmetic deformities and associated psychological and social effects have been difficult to evaluate in formal research. It is also unclear whether physical symptoms can be attributed to idiopathic scoliosis, except in severe cases. Finally, screening may result in mislabeling and the inconvenience, cost, and potential radiation exposure of follow-up evaluations. Both conservative treatment (e.g., braces) and surgery can be associated with medical, psychological, and social adverse effects.

In summary, there is insufficient evidence from clinical research that routine screening is effective in changing the outcome of adolescent idiopathic scoliosis. Limitations in the design of existing studies, however, also make it difficult to conclude that screening is ineffective or harmful. If screening for scoliosis is effective, discontinuation of school screening may have a disproportionate impact on disadvantaged adolescents. Adolescents who have access to primary care providers and to periodic health examinations have an opportunity outside the school setting to obtain back examinations. School screening may provide the only opportunity for back inspections of disadvantaged adolescents, including those from minority and low income families, who often lack access to such providers.

CLINICAL INTERVENTION

There is insufficient evidence to recommend for or against routine screening of asymptomatic adolescents for idiopathic scoliosis ("C" recommendation). The evidence does not support routine visits to clinicians for the specific purpose of scoliosis screening or for performing the examination at specific ages during adolescence. It is prudent for clinicians to include visual inspection of the back of adolescents when it is examined for other reasons. Additional specific inspection maneuvers to screen for scoliosis, such as the forward-bending test, are of unproven benefit.

Note: See the relevant background papers: U.S. Preventive Services Task Force. Screening for adolescent idiopathic scoliosis: policy statement. JAMA 1993;269:2664-2666 and U.S. Preventive Services Task Force. Screening for adolescent idiopathic scoliosis: review article. JAMA 1993;269:2667-2672.

The draft of this chapter was prepared for the U.S. Preventive Services Task Force by Steven H. Woolf, MD, MPH.

48. Screening for Dementia

Burden of Suffering

Dementia is usually defined as global impairment of cognitive function that interferes with normal activities. 1,2 Although impaired short- and long-term memory are typical of dementia, deficits in other cognitive functions in addition to memory (e.g., abstract thinking, judgment, speech, coordination, planning or organization) are required for the diagnosis of dementia. 2,3 Alzheimer's disease accounts for most cases of dementia in North America (50-85%), 4,5 with an additional 10-20% attributed to vascular ("multi-infarct") dementia. The relative importance of vascular dementias is higher in populations where hypertension and stroke are more common (Asians, African Americans, persons over 85). 6-8 Other important causes of dementia include alcoholism, Parkinson's disease, metabolic disorders (vitamin B12 deficiency, hypothyroidism), central nervous system infections (e.g., HIV, neurosyphilis), intracranial lesions, and other illnesses. 4,9

The prevalence of dementia increases steadily with age, roughly doubling every 5 years. 10 Studies of community-dwelling elderly in North America have reported dementia in 0.8-1.6% of persons 65-74 years old, 7-8% of persons 75-84 years old, and 18-32% of persons over 85. 5,8,11,12 The substantially higher prevalence reported from a community survey in East Boston 13 -- 19% for ages 75-84, 47% in those over 85 -- may reflect the inclusion of cases with milder impairment. 4 Estimates of the annual incidence of dementia in community-based studies are 0.6-1% for ages 65-74, 2-3% for ages 75-84, and 4-8% for ages 85 or older; 14,14a many incident cases have only mild cognitive impairment, however. Dementia is common among institutionalized elderly 11 and is present in one half to two thirds of the 1.3 million American nursing home residents. 13 Estimates of the number of Americans over 65 currently affected by Alzheimer's disease range from 1.4 to 4 million. 15,16 This number is projected to increase dramatically as the population of older (over 65) and very old (over 85) men and women increases in the U.S. 15
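The "doubling every 5 years" pattern can be expressed as a simple exponential approximation. A rough sketch follows, assuming an illustrative 1% prevalence at age 70 (a midpoint of the reported 0.8-1.6% range for ages 65-74, not a measured figure):

```python
# Sketch of the "prevalence roughly doubles every 5 years" pattern.
# The base prevalence and base age are illustrative assumptions.
def approx_prevalence(age, base_prevalence=0.01, base_age=70):
    return base_prevalence * 2 ** ((age - base_age) / 5.0)

for age in (70, 75, 80, 85, 90):
    print(f"age {age}: ~{approx_prevalence(age):.0%}")
# Yields ~8% at 85 and ~16% at 90 -- the same order of magnitude as
# the community studies cited above, which report 18-32% over age 85.
```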

Family history is consistently associated with an increased risk of Alzheimer's disease, with an estimated 3-fold higher risk among first-degree relatives. 17 Genetic risk factors for Alzheimer's disease have been identified 18,19 but are of uncertain value in clinical practice. A variety of other possible risk factors (e.g., lower educational level, 20 prior head trauma, family history of Down syndrome) and protective factors (e.g., smoking, estrogen replacement) for Alzheimer's disease have been reported, but the nature of each of these associations remains uncertain. 4,5

Alzheimer's disease progresses over a period of 2-20 years, causing increasing functional impairment and disability due to acute medical illnesses, depression, wandering, incontinence, adverse drug reactions, poor personal hygiene, and unintentional injuries (falls, burns, etc.). 21,22 Survival is reduced in patients with Alzheimer's disease 23 and in patients with any cognitive impairment; 24 mortality is strongly associated with severity of dementia. Dementia is estimated to account for about 120,000 deaths annually. 25 Care of the demented patient imposes an enormous psychosocial and economic burden on family and other caretakers. The annual costs of treating Alzheimer's disease alone, including medical and nursing costs and lost productivity, have been estimated to be $67 billion. 15

Accuracy of Screening Tests

Dementia is easily recognized in its advanced stages, but numerous studies indicate that clinicians often overlook the early signs of dementia. 26-29a The significance of early symptoms, whose onset is insidious, may be underestimated; patients and clinicians alike may mistakenly attribute changes to "normal aging." 29 Other patients, fearing a label of Alzheimer's disease, deliberately minimize their symptoms, and patients with more advanced dementia may not be aware of their deficits. Clinicians fail to detect an estimated 21% to 72% of patients with dementia, especially when the disease is early in its course. 26-29a Conversely, clinicians may mistakenly attribute the symptoms of depression or drug toxicity in older subjects to irreversible dementia.

The routine physical examination and patient history are not sensitive for dementia, especially if family members are not present to corroborate patient self-report. Many clinicians include only a cursory examination of mental status as part of the routine history and physical. The inability to recall the correct date or place is reasonably specific (92-100%), but highly insensitive (15-53%), for dementia. 30,31 Neurologic findings, such as release signs, gait disorders, and impaired stereognosis, are usually late findings and are not sufficiently sensitive or specific to screen for dementia. 32

The usual diagnostic standard for dementia consists of detailed assessment of mental status and careful investigation to rule out other causes of cognitive impairment. A variety of abbreviated instruments have been examined for their ability to screen for dementia in the outpatient setting. 33 The most widely studied of these instruments is the Mini-Mental State Examination (MMSE), a short, structured examination that takes 5-10 minutes to administer. 34 The MMSE contains 30 items and is reproducible using a standardized version. 35 Various studies suggest that an MMSE score of less than 24 of 30 has a reasonable sensitivity (80-90%) and specificity (80%) for discriminating between dementia cases and normal controls. 36 There are only limited data, however, on its performance as a screening test for early dementia among a representative population of outpatients. The positive predictive value (PPV) of the MMSE for dementia depends on the definition of an abnormal score and the prevalence of dementia. Based on its performance in one community study, 37 an MMSE score of 20 or less has a PPV of only 48% when the prevalence of dementia is 10% (e.g., a population of 75-84-year-olds), but a much higher PPV (73%) when the prevalence of dementia is 25% (e.g., age over 85). 31 The predictive value of intermediate MMSE scores (21-25) appears to be low (21-44%) for dementia in most populations. 31
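These two PPV figures are mutually consistent under the odds form of Bayes' rule: they imply a positive likelihood ratio of roughly 8.3 for this cutoff, a value back-calculated here from the cited numbers rather than reported directly in the study. A brief check:

```python
# Check of the PPV figures cited above using the odds form of Bayes'
# rule: posttest odds = pretest odds x positive likelihood ratio (LR+).
# The LR+ of ~8.3 is inferred from the 48% PPV at 10% prevalence.
def ppv_from_lr(prevalence, lr_positive):
    pretest_odds = prevalence / (1 - prevalence)
    posttest_odds = pretest_odds * lr_positive
    return posttest_odds / (1 + posttest_odds)

LR_POSITIVE = 8.3
print(f"{ppv_from_lr(0.10, LR_POSITIVE):.0%}")  # ~48%, as cited at 10% prevalence
print(f"{ppv_from_lr(0.25, LR_POSITIVE):.0%}")  # ~73%, as cited at 25% prevalence
```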

Recent data suggest that level of education and cultural differences have important effects on the range of MMSE scores in a given population. Among individuals with only 5-8 years of education versus those with college education, the cutpoints that identified the lowest 25% on the MMSE were 23 and 29, respectively. 38 Spanish-speaking persons scored significantly lower than did English speakers on several MMSE items in one community-based study. 39 These data suggest that applying a uniform MMSE cutoff may miss significant changes among well-educated patients (false-negative result) and generate more frequent false-positive results among persons who are less educated or from different cultures. 40 Shorter screening instruments such as the Short Portable Mental Status Questionnaire 41 and the Clock Drawing Test 42 seem to be reasonably sensitive and specific for moderate to severe dementia, but they have not been adequately studied as screening tests in asymptomatic outpatients. Because they each examine a narrower range of cognitive function, they are not likely to be as sensitive as the MMSE or more comprehensive tests for detecting early dementia.
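A purely illustrative sketch of how such education-adjusted cutpoints might be applied follows, using the cutoffs reported in the study cited above (23 and 29); the intermediate value is an assumption, and this is not a validated decision rule.

```python
# Illustrative only: education-adjusted MMSE cutoffs based on the
# cutpoints reported above (23 for 5-8 years of education, 29 for
# college graduates). Not a validated clinical rule.
def education_adjusted_cutoff(years_of_education):
    if years_of_education <= 8:
        return 23
    if years_of_education >= 16:   # college-educated
        return 29
    return 24                      # assumed intermediate value (conventional cutoff)

def flags_for_followup(mmse_score, years_of_education):
    return mmse_score < education_adjusted_cutoff(years_of_education)

print(flags_for_followup(25, 6))   # False: unremarkable with 5-8 years of education
print(flags_for_followup(25, 16))  # True: the same score is low for a college graduate
```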

An alternative to screening for cognitive problems is to screen for functional impairment, which is a diagnostic criterion for dementia. 2 The Instrumental Activities of Daily Living (IADL) assesses level of function in eight common tasks. 43 When IADL was administered to a random sample of community-living persons over 65 (prevalence of dementia 2%), subjects who reported difficulty using the telephone, using public transportation, taking medications, or handling finances were 12 times more likely to be diagnosed with dementia. 44 The Functional Activities Questionnaire, which scores function in 10 activities, also seems to be useful in measuring impairment and diagnosing dementia. 45,46 While these instruments generally rely on other informants (spouse, etc.), one recent study suggests that patients with mild dementia can reliably describe their functional status. 47 Because nondementing illnesses also interfere with daily activities, neither screen is specific for dementia.

The low predictive value of most screening tests for dementia raises the possibility that unselective screening may have adverse effects. Many asymptomatic patients with abnormal results on the MMSE or other screening tests will not have dementia; these patients may be subjected to further tests (e.g., neuropsychological testing, blood tests, lumbar puncture, computed tomography [CT]) to confirm the diagnosis, rule out other reasons for altered mental status, and assign a cause of dementia. Comprehensive follow-up, although posing little risk to patients, will be time-consuming and expensive. If clinicians make a diagnosis based on screening alone, patients may be incorrectly diagnosed as having a progressive, incurable illness. Nonetheless, in the absence of screening, misdiagnosis of dementia is common in outpatient practice. 26 In one study in which general practice doctors administered a brief (10-15 minutes) standardized assessment to all patients over age 80, they revised their initial impression of cognitive function for 32 of 174 (18%) patients. 48 Interestingly, 16 patients initially diagnosed as "possibly demented" were reclassified as "not demented" after screening.

Effectiveness of Early Detection

There are several potential benefits of detecting dementia before patients are severely impaired: reversible causes of dementia may be identified and treated, treatments to slow the progression of disease can be instituted, measures can be taken to reduce the morbidity associated with dementia, and patients and their family members can anticipate and prepare for problems that will arise as dementia progresses.

Although early reports suggested that a substantial proportion of dementia was potentially reversible, 49 the number of patients who experience long-term improvements is relatively small. An overview of earlier studies concluded that only 11% of dementing illnesses improved in older patients, and only 3% resolved completely. 50 The most common correctable causes were drug intoxication, depression, and metabolic abnormalities. Among 36 cases of dementia evaluated in one community-based screening study, no cases of reversible dementia were found. 51 Among 85-year-old residents of Gothenburg, Sweden, only 3 of 147 cases of dementia were potentially reversible. 7

Various treatments to improve cognitive function in Alzheimer patients have been examined in randomized clinical trials. Drugs that increase central levels of acetylcholine, such as tetrahydroaminoacridine (tacrine), have shown the most promise. Although several studies reported no benefit, 52,53 the three largest trials suggested a significant but modest benefit of tacrine in patients with mild to moderate dementia (average MMSE scores 16-19) over 6-30 weeks. 54-56 In one trial, the benefit of tacrine on cognitive test results was comparable to delaying disease progression by 5 months. 54 Improvements in overall clinical function have been small and inconsistent 54 but increase at higher doses. 55 The usefulness of tacrine is limited by high cost (over $100 per month) and frequent gastrointestinal side effects: up to 25% of patients taking lower doses, and two thirds of those on high doses, stopped therapy due to nausea, vomiting, or elevated liver enzymes. 53-56 Dihydroergotoxine (hydergine) improved some measures of cognitive function in previous trials, but does not produce important clinical benefits. 53,57 Other therapies under investigation include chelation therapy, 58 neuroprotective agents, and growth factors, but consistent evidence of clinical benefits is lacking. 5,53

Early detection of vascular dementia may prompt better control of risk factors for cerebrovascular disease (treatment of hypertension, smoking cessation, aspirin therapy). 59 The effect of these measures on progression of vascular dementia, however, is not known. In a 2-year follow-up of 52 patients with multi-infarct dementia, smoking cessation was associated with improving cognitive function, but low blood pressure was associated with worsening function. 60 About one half of elderly demented patients manifest at least one coexisting illness, and treatment of associated disorders may improve function in patients with dementia. 61,62

Identifying patients with early cognitive problems allows patients and their families to take measures to reduce the medical morbidity caused by progressive dementia. Patients are at increased risk of falls and automobile accidents as dementia progresses. 63 Effective interventions to prevent falls or accidents in patients with dementia have not been determined, however (see also Chapters 57 and 58). Comprehensive geriatric assessment has been shown to increase the number of older patients able to live independently at home, 64 but it is not possible to separate the benefits of cognitive assessment from other components (e.g., medical evaluation, social evaluation, drug management, follow-up).

An early diagnosis also permits care providers, especially family and friends of the patient, to benefit from support and self-help strategies in order to minimize the financial, emotional, and medicolegal pressures that will occur throughout the patient's illness. The psychiatric symptoms (depression, delirium or disruptive behavior) accompanying dementia can be anticipated and treated with psychotropic drugs and/or counseling. 65 Decisions about durable power of attorney and advance directives can be made while the patient is still competent to participate. These benefits of early detection are based on clinical experience, but there are no data to prove that routine screening improves these outcomes.

An early diagnosis of dementia may also have adverse consequences: patients may have difficulty obtaining health or life insurance and may be excluded from retirement communities or long-term care facilities. Negative attitudes toward patients with dementia have been documented among professionals and lay people. 66

Recommendations of Other Groups

There are no formal recommendations for routine screening for cognitive impairment or dementia. The Canadian Task Force on the Periodic Health Examination concluded that there was insufficient evidence to recommend for or against screening for asymptomatic cognitive impairment, but they advised that clinicians should remain alert for clues suggesting deteriorating cognitive function. 67 The American Academy of Family Physicians recommends that physicians include questions about functional status in the patient history of patients over 65, and remain alert for evidence of changes in cognitive function. 68 A National Institutes of Health consensus development conference concluded that no single test can diagnose dementia and urged clinicians to take the time necessary to conduct a thorough clinical evaluation. 1 Guidelines on the recognition and early assessment of dementia prepared by an expert panel convened by the Agency for Health Care Policy and Research (AHCPR), U.S. Public Health Service, are due to be released in 1996.

Discussion

Dementia is responsible for an enormous and growing burden on affected patients, their family members, and the clinicians who care for them. Early signs of dementia are often overlooked in routine encounters, and a variety of brief tests of mental status are available to help clinicians assess cognitive function more accurately in their patients. In the absence of more effective treatments to improve prognosis in patients with dementia, however, it is uncertain whether routine use of these instruments in all older patients will be of sufficient benefit to justify the inconvenience, costs, and possible harms of unselective screening. The predictive value of available screening tests is relatively low in the general population of asymptomatic older adults. Administering tests such as the MMSE to all older patients, and further evaluating those with positive results, will be time-consuming and expensive. Some patients may be incorrectly diagnosed with dementia on the basis of screening tests alone. Although there are many plausible benefits of early detection, there are few studies demonstrating that routine screening actually reduces the medical, psychological, and social consequences of dementia. Other appropriate interventions (treating hypertension, correcting underlying illnesses, and taking precautions to prevent accidents) can be recommended for older patients with or without dementia.

Despite the limitations of unselective screening, clinicians can improve the timely diagnosis of dementia by being alert to suggestive signs and symptoms in their older patients (trouble with daily activities, concerns voiced by family members), and by using standardized instruments to evaluate cognitive function in those suspected of having dementia. A positive screening test is more meaningful in patients when there is prior reason to suspect dementia (due to the higher prevalence of disease), and normal mental status test results may provide reassurance. Screening tests, however, should not be used in isolation to diagnose dementia.

CLINICAL INTERVENTION

There is insufficient evidence to recommend for or against routine screening for dementia in asymptomatic elderly persons ("C" recommendation). Clinicians should periodically ask patients about their functional status at home and at work, and they should remain alert to changes in performance with age. When possible, information about daily activities should be solicited from family members or other persons. Brief tests such as the MMSE should be used to assess cognitive function in patients in whom the suspicion of dementia is raised by restrictions in daily activities, concerns of family members, or other evidence of worsening function (e.g., trouble with finances, medications, transportation). Possible effects of education and cultural differences should be considered when interpreting results of cognitive tests. The diagnosis of dementia should not be based on results of screening tests alone. Patients suspected of having dementia should be examined for other causes of changing mental status, including depression, delirium, medication effects, and coexisting medical illnesses.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by David Atkins MD, MPH, with contributions from materials prepared for the Canadian Task Force on the Periodic Health Examination by Christopher Patterson MD, FRCP and materials prepared for the AHCPR Panel on Recognition and Initial Assessment of Alzheimer's and Related Dementia.

49. Screening for Depression

Burden of Suffering

Depression is a common and costly mental health problem, seen frequently in general medical settings. 1 Major depressive disorder, diagnosed by structured psychiatric interviews and specific diagnostic criteria, is present in 5-13% of patients seen by primary care physicians. 2-7 The prevalence of this disease in the general population is about 3-5%. 8 The annual economic burden of depression in the U.S. (including direct care costs, mortality costs, and morbidity costs) has been estimated to total almost $44 billion. 9 Depression is more common in persons who are young, female, single, divorced, separated, seriously ill, or who have a prior history or family history of depression. 10

Major depressive disorder can result in serious sequelae. The suicide rate in depressed persons is at least 8 times higher than that of the general population. 11 In 1993, 31,230 suicide deaths were reported, although the actual number is probably much higher. 12 Most persons who commit suicide have a mental disorder, with depression associated with about half of suicides. 9,11 The incidence of documented suicides by adolescents and young adults has tripled in the last 25 years, with 5,000 youths committing suicide each year and perhaps as many as 500,000-1,000,000 making an attempt 13 (see Chapter 50).

On a population basis, the most important effect of major depression may be on quality of life and productivity rather than suicide. This effect is widespread and has been shown to be comparable to that associated with major chronic medical conditions such as diabetes, hypertension, or coronary heart disease. 14,15 Also, depressed persons frequently present with a variety of physical symptoms -- three times as many somatic symptoms as controls in one study. 16 If their depression is not recognized, these patients may be subjected to the risks and costs of unnecessary diagnostic testing and treatment. 17,18

Accuracy of Screening Tests

The prevailing standard for the diagnosis of depression is the opinion of an examining psychiatrist that a patient's symptoms meet the criteria described in the fourth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV). 19 For research purposes, psychiatric diagnoses have been operationalized through the development of structured diagnostic interview instruments such as the Diagnostic Interview Schedule (DIS). 20

To aid in the detection of this important disorder, screening questionnaires have been proposed to predict a patient's risk of depression. Several brief (2-5 minutes) questionnaires have been tested for routine use by primary care providers. These include the Beck Depression Inventory (BDI), 21 the Center for Epidemiologic Studies Depression scale (CES-D), 22 and the Zung Self-Rating Depression Scale (SDS). 23 These three instruments have been shown to detect adult patients with depressive symptoms fairly accurately in primary care settings, with sensitivities and specificities that vary depending on the cutoff score selected. For example, when compared to the diagnosis of major depression in primary care patients using a standardized psychiatric instrument such as the DIS, the BDI had a sensitivity of 100% and a specificity of 89% at a cutoff score of 16; 6 the CES-D had a sensitivity of 89% and a specificity of 70% at a cutoff score of 27; 2 and the SDS had a sensitivity of 97% and a specificity of 63% at a cutoff score of 50. 24 A recent meta-analysis of 18 studies that compared various depression screening instruments to accepted diagnostic criteria in primary care patients estimated an overall sensitivity of 84% and specificity of 72% for these tests. 27a The authors calculated that screening 100 primary care patients (prevalence of major depression 5%) would identify 31 patients with a positive screen, 4 of whom actually have major depression.
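
These yield figures follow directly from the definitions of sensitivity, specificity, and prevalence. The minimal sketch below simply re-derives them from the pooled estimates cited above; the function name and structure are ours, for illustration only.

```python
# Expected yield of depression screening in 100 primary care patients,
# using the pooled estimates cited above: sensitivity 84%,
# specificity 72%, prevalence of major depression 5%.

def screening_yield(n, prevalence, sensitivity, specificity):
    diseased = n * prevalence                # 5 patients with the disorder
    well = n - diseased                      # 95 patients without it
    true_pos = diseased * sensitivity        # 4.2 detected cases
    false_pos = well * (1 - specificity)     # 26.6 false alarms
    positives = true_pos + false_pos         # ~31 positive screens
    ppv = true_pos / positives               # positive predictive value
    return positives, true_pos, ppv

positives, true_pos, ppv = screening_yield(100, 0.05, 0.84, 0.72)
print(f"{positives:.0f} positive screens, {true_pos:.1f} true cases, PPV {ppv:.0%}")
# -> 31 positive screens, 4.2 true cases, PPV 14%
```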

Depression screening questionnaires developed specifically for children and adolescents include the Center for Epidemiologic Studies Depression Scale for Children (CES-DC) 25 and the Children's Depression Inventory (CDI). 26 They have been validated against structured diagnostic interview instruments developed for children and adolescents. At a cutoff of 15, the CES-DC had a sensitivity of 71% and a specificity of 57% for major depressive disorder. 25 In addition, the adult CES-D and BDI have been tested on adolescents, with a sensitivity of 84% and a specificity of 75% for the CES-D at a cutoff of 24 and a sensitivity of 84% and a specificity of 81% for the BDI at a cutoff of 11. 27

It should be noted that the usual nomenclature used in the assessment of screening tests for asymptomatic persons is not strictly applicable to depression screening, because the diagnosis of depression is itself based on symptoms. A patient cannot truly be asymptomatic and have major depressive disorder. Thus, these screening questionnaires are actually being evaluated for their ability to detect unrecognized, rather than strictly asymptomatic, depressive symptoms (sleeplessness, loss of appetite, etc.) and disease.

Effectiveness of Early Detection

It has been repeatedly documented that primary care providers do not recognize major depression in approximately half of their adult patients with this disorder. 2,28-31 Because the majority of persons with depression are seen by nonpsychiatrist physicians, 32 and because effective treatments -- drugs, psychotherapy, or a combination of the two -- are available for the treatment of depression, 33 it has been proposed that routine depression screening could result in improved recognition and earlier treatment of depression with improved patient outcome.

Clinical trials have shown that the use of depression screening tests in primary care settings can increase clinician detection of depression. 31,34-36 A randomized controlled trial of screening with SDS found increased recognition and increased treatment of depression in the patients detected by screening. 37 A prospective controlled study found that providing SDS scores to the physician and prescribing a 4-week course of antidepressants to those with elevated scores resulted in lower patient SDS scores than in controls in whom SDS results were withheld and who received unspecified care. 38 This study had several design limitations, however, including confounding variables, different data collection techniques for controls, short follow-up, and the use of questionnaire scores as outcome measures.

These and other studies have established that depression screening can lead to increased recognition and, in some studies, treatment of depression in primary care patients. Separate research has found that the treatment of depressed patients leads to improved outcome. 33 Taken together, however, these studies still constitute insufficient evidence to conclude that routine depression screening is indicated in unselected patients, because it has not been shown that the early detection and treatment of depression in primary care leads to improved outcome when compared to routine diagnosis and treatment of this disorder when symptoms appear and are detected. While there is evidence that the initiation of treatment in the early stages of a recurrent episode of depression in psychiatric settings results in a better outcome than intervention when the traditional symptoms of depression become conspicuous, 39 data are not available demonstrating a similar advantage of early detection and treatment for the initial onset of depression or for the typically less severe depression seen in primary care. No published studies have shown improvement in rigorously assessed psychiatric outcome in primary care patients screened, treated, and compared to controls, although a study of this type is currently under way with an adult population. 40

No studies to date have demonstrated that screening asymptomatic children or adolescents for depression leads to improved outcomes.

Recommendations of Other Groups

The Canadian Task Force on the Periodic Health Examination found fair evidence to exclude the use of depression detection tests from the periodic health examination of asymptomatic people. 41 The Depression Guideline Panel sponsored by the Agency for Health Care Policy and Research recommended that providers maintain a high index of suspicion for depression and evaluate risk factors, detecting depressive symptoms with a clinical interview. 42 The American Academy of Family Physicians advises physicians to remain alert for depressive symptoms in adolescents and adults; 43 this policy is under review. The American Medical Association recommends that all adolescents be asked annually about behaviors or emotions that indicate recurrent or severe depression. 44 Bright Futures recommends annual screening of adolescents for behaviors or emotions that may indicate recurrent or severe depression or risk of suicide. 45

Discussion

At present, available depression questionnaires lack the evidence necessary to support their routine use as screening tools in the periodic health examination for primary care patients. 27a Emerging research, both on current screening questionnaires 40 and on new primary care mental disorder diagnostic tools, 46 may change this situation. The enormous burden of suffering from this disease, its high prevalence in primary care settings, and its frequent presentation with somatic symptoms that lead to extensive medical testing and interventions all argue for better awareness of depressive symptoms by primary care physicians so that fewer cases of depression will escape detection. It is also important that depressed persons who are identified receive adequate follow-up care.

CLINICAL INTERVENTION

There is insufficient evidence to recommend for or against the routine use of standardized questionnaires to screen for depression in asymptomatic primary care patients ("C" recommendation). Clinicians should, however, maintain an especially high index of suspicion for depressive symptoms in adolescents and young adults, persons with a family or personal history of depression, those with chronic illnesses, those who perceive or have experienced a recent loss, and those with sleep disorders, chronic pain, or multiple unexplained somatic complaints. Physician education in recognizing and treating affective disorders is recommended (see Chapter 50). Persons with depressive symptoms should be evaluated further and, if diagnosed with major depressive disorder, either treated or referred for treatment.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Douglas B. Kamerow, MD, MPH.

50. Screening for Suicide Risk

Burden of Suffering

In 1993, the age-adjusted rate of suicide in the U.S. was approximately 11.2/100,000 persons; 31,230 suicide deaths were reported. 1 The actual incidence is uncertain because suicidal intent is often difficult to prove after the fact; uniform criteria for declaring a death due to suicide have only recently been developed. 2 An estimated 210,000 persons attempt suicide each year, resulting in over 10,000 permanent disabilities, 155,500 physician visits, 259,200 hospital days, over 630,000 lost work days, and over $115 million in direct medical expenses. 3 The highest rate of completed suicide is among men aged 65 years and older, but suicide attempts are more commonly reported among women and among men and women aged 20-24 years. 4 The suicide rate in American teenagers has increased substantially in recent years. 5 Suicide is the third leading cause of death in persons 15-24 years old 1 as well as a leading cause of years of potential life lost. 6 Suicides among young persons may also lead to suicide clusters, in which a number of other adolescents in the same community commit suicide. 7

The most important risk factor for suicide is psychiatric illness. The majority of suicide victims have affective, substance abuse, personality or other mental disorders. 8-9a Persons with a history of one or more psychiatric hospital admissions carry a particularly high risk of suicide. 10 Other risk factors for suicide and attempted suicide, particularly in persons with underlying mental or substance abuse disorders, include social adjustment problems, serious medical illness, living alone, recent bereavement, personal or family history of suicide attempt, family history of completed suicide, divorce, separation, and unemployment. 4,8,11,12

Firearms are used in about 60% of all suicides. 1,4,8,9,13 Firearm-related deaths accounted for nearly all the increase in suicide rates during the 1980s. 14 Case-control studies have demonstrated that the risk of suicide is almost five times higher for persons who live in a household where at least one firearm is kept, when compared with persons who live in a household free of guns. 14a-16 The second most common means of suicide among males is hanging, whereas among females it is poisoning (drug overdose). 4 Alcohol intoxication is associated with at least 25-50% of all suicides 9 and is especially common in suicides involving firearms. 4

Accuracy of Screening Tests

About one half to two thirds of persons who commit suicide visit physicians less than 1 month before the incident, and 10-40% visit in the preceding week. 9,17 It is often difficult, however, for physicians to identify suicidal patients accurately. Direct questions about suicidal intent may have low yield; only 3-5% of persons threatening suicide express unequivocal certainty that they want to die. 18 Nearly 30% of American high school students report having seriously thought about committing suicide, 19,20 making it unlikely that suicidal thoughts alone would be a useful index of suspicion in this population. Although the clinician can identify established risk factors in the medical history (e.g., psychiatric illness, prior suicide attempt, access to firearms, substance abuse, recent life event such as death or divorce), the majority of patients with these characteristics do not intend to kill themselves. 21,22 Asking general medical patients about sleep disturbance, depressed mood, guilt, and hopelessness correctly identified 84% of those who had experienced suicidal thoughts within the previous year. 22a The study was not designed to assess actual suicide risk, however, and has not been replicated in the clinical setting. If validated, these questions may identify patients likely to benefit from in-depth evaluation for suicide risk.

Researchers have attempted to identify specific risk factors that are the strongest predictors of suicidal behavior. Many studies have shown, however, that structured instruments to assess these risk factors misclassified many persons as high risk who did not subsequently attempt suicide and (with some instruments) identified many as low risk who did commit suicide. 23-28 For example, one scoring system, 25 based on 4-6 years of longitudinal data from 4,800 psychiatric patients, was able to identify correctly 35 of 63 (56%) subsequent suicides, but it generated 1,206 false positives (positive predictive value less than 3%).
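
The positive predictive value quoted above can be reproduced directly from the reported counts; the few lines below are a minimal check of the arithmetic, not part of the original study.

```python
# Reproducing the positive predictive value from the figures cited
# above: 35 correctly identified suicides vs. 1,206 false positives.
true_positives = 35
false_positives = 1206
ppv = true_positives / (true_positives + false_positives)
print(f"PPV = {ppv:.1%}")  # -> PPV = 2.8% (i.e., less than 3%)
```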

Also, physicians may not effectively assess risk factors for suicide, such as previous suicide attempts or psychiatric illness. In one study of completed suicides, 29 over two thirds of victims had made previous attempts or threats, but only 39% of their physicians were aware of this history. Although psychological autopsy studies (retrospective psychiatric evaluation based on interviews with survivors) reveal that nearly all victims have evidence of previous psychiatric diagnoses (e.g., depression, bipolar disorder, alcohol and other drug abuse, schizophrenia) and previous psychiatric treatment, 17,21,30 many primary care clinicians fail to recognize the presence of mental illness. Several studies have shown that depression is frequently overlooked (see Chapter 49), as is substance abuse (see Chapters 52 and 53). Improved early detection of these conditions might help persons at risk for suicide, but further research is needed to evaluate its effectiveness in reducing suicide rates.

Recent studies have identified evidence of altered serotonin activity in patients who complete suicide, particularly those with depression and schizophrenia. 31-33 No studies have evaluated these biochemical markers as screening tools in the general population.

Effectiveness of Early Detection

Suicide is a relatively rare event, and large samples and lengthy follow-up would be needed for studies to demonstrate significant reduction in suicide rates as a result of a specific intervention such as mental health counseling and hospitalization, limitation of access to potential instruments of suicide, and treatment of underlying conditions. 18 Although these measures seem to be clinically prudent, no direct evidence that they reduce suicide rates was found. Effects on less specific outcome measures, such as feelings of hopelessness, 34 have been reported. Even in the setting of attempted suicide, there is limited and conflicting evidence that intervention is beneficial, but there is also no conclusive evidence that it is not. Surveys indicate that patients receiving psychiatric consultation for attempted suicide find the therapy to be of limited benefit, 35 and 35-40% choose not to remain in treatment. 36,37 One study of hospitalized patients admitted for poisoning or self-inflicted injury reported fewer subsequent suicide attempts in persons who received psychiatric counseling than in controls who were discharged prematurely before seeing a psychiatrist. 38 Another cohort study of patients hospitalized for self-poisoning found no difference in subsequent suicide attempts among patients who attended psychiatric outpatient follow-up and those who did not. 39 Among suicide attempters without immediate psychiatric or medical needs randomized to receive hospital admission or discharge home, there were no differences in psychological testing or further suicide attempts between the two groups at 1 week; long-term follow-up was not evaluated, however. 40 Some selection biases were apparent in all of these studies, thereby limiting the generalizability of their results to all suicide attempters. Findings from these studies may not be applicable to successful suicide because people who attempt and those who complete suicide may differ. Involuntary hospitalization can be of immediate benefit to persons planning suicide and is often required for medicolegal reasons in persons with suspected suicidal ideation, 41,42 but no reliable data on the long-term effectiveness of this measure were found.

Another potential intervention is limiting access to the most common instruments of suicide, such as firearms and drugs. Although there is no direct evidence that removal of firearms can prevent suicide, studies have shown that geographic locations with reduced availability of these weapons have lower suicide rates among adolescents and young adults. 8,43 Studies of deaths by drug overdose have found that, in over half of cases, the ingested drugs were either prescribed by a physician within the preceding week or were provided in a refillable prescription. 44 There is little information, however, on how the physician can best identify persons who require nonlethal quantities of prescription drugs, or whether these measures will prevent subsequent suicide. Legislation in one country restricting the prescription of sedatives may have been associated with a reduced rate of suicide, but the evidence was not conclusive. 45

Since it has been estimated that as many as 90% of persons who commit suicide suffer from psychiatric disorders, it is possible that treatment of these underlying illnesses may prevent suicide. 34 Indirect evidence suggests that patients with affective disorders who receive comprehensive psychiatric care have lower suicide rates than most persons with psychiatric illnesses, 46,47 but studies with control groups are needed to exclude the possibility of selection bias in these results. A Swedish population-based time series study evaluated suicide rates before and 1 year after all postgraduate physicians in a community were trained to recognize and manage affective disorders appropriately. 48 The suicide rate in the community decreased 50% in the year following the program, which was significant compared to previous trends in that community and to national rates in Sweden. Replication of these results in a controlled trial with longer follow-up is needed. As many as 50% of persons who kill themselves are intoxicated with alcohol or other drugs, 9 and a significant proportion also suffer from a substance abuse disorder. 8 Early detection and treatment of alcohol and other drug abuse has the potential to prevent suicide, but firm evidence of this effect is lacking.

Recommendations of Other Groups

The American Academy of Pediatrics recommends asking all adolescents about suicidal thoughts during the routine medical history. 49 The American Medical Association 50 and Bright Futures 51 recommend that providers screen adolescents annually to identify those at risk for suicide.

The Canadian Task Force on the Periodic Health Examination found insufficient evidence to recommend for or against the inclusion of suicide risk evaluation in the periodic health examination. Based on the high burden of suffering, however, they recommend that clinicians routinely evaluate the risk of suicide among persons in high-risk groups, particularly if there is evidence of psychiatric disorder (especially psychosis), depression, or substance abuse, or if the patient has recently attempted suicide or has a family member who committed suicide. 52 The recommendations of the American Academy of Family Physicians are currently under review.

Discussion

Suicide is a leading cause of death in the U.S., but there is no evidence that screening the general population for suicide risk is effective in reducing suicide rates. Routine medical history is often not sufficient to recognize suicide risk or suicidal intent. Several screening instruments have been developed to identify risk factors, but these do not accurately predict the likelihood of suicide. Even when a risk factor or suicidal intent is detected, there is weak evidence that interventions effectively reduce suicide rates. Several studies have evaluated treatment of those who attempt suicide, but results were conflicting, and these studies may not be generalizable to the population of those who complete suicide. Training primary care clinicians to recognize and appropriately treat underlying mental health problems such as depression and substance abuse may be effective, but long-term controlled studies have yet to be performed.

CLINICAL INTERVENTION

There is insufficient evidence to recommend for or against routine screening by primary care clinicians to detect suicide risk in asymptomatic persons ("C" recommendation). Clinicians should be alert to evidence of suicidal ideation when the history reveals risk factors for suicide, such as depression, alcohol or other drug abuse, other psychiatric disorder, prior attempted suicide, recent divorce, separation, unemployment, and recent bereavement. Patients with evidence of suicidal ideation should be questioned regarding the extent of preparatory actions (e.g., obtaining a weapon, making a plan, putting affairs in order, giving away prized possessions, preparing a suicide note). It may also be prudent to question the person's family members regarding such actions. Persons with evidence of suicidal intent should be offered mental health counseling and possibly hospitalization.

The training of primary care clinicians in recognizing and treating affective disorders in order to prevent suicide is recommended ("B" recommendation). Clinicians should be alert to signs of depression (see Chapter 49) and other psychiatric illnesses, and they should routinely ask patients about their use of alcohol and other drugs (see Chapters 52 and 53). Patients who are judged to be at risk should receive evaluation for possible psychiatric illness, including substance abuse, and counseling and referral as needed.

Patients who are recognized as having suicidal ideation, or patients who suspect suicidal thoughts in their relatives or friends, should be made aware of available community resources such as local mental health agencies and crisis intervention centers. Parents and homeowners should also be counseled to restrict unauthorized access to potentially lethal prescription drugs and to firearms within the home (also see Chapters 58 and 59).

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by S. Patrick Kachur, MD, MPH, and Carolyn DiGuiseppi, MD, MPH.

51. Screening for Family Violence

Burden of Suffering

Family violence is a serious public health problem for many Americans. Family violence includes child abuse (physical and sexual abuse), domestic violence (physical or sexual abuse of spouse or intimate partner), and elder abuse (abuse or neglect of older persons). 1 Because many cases of family violence go unreported, the true magnitude of the problem can only be estimated. 2

Child Abuse.

In 1993, child protective service agencies substantiated maltreatment of over 1 million children in the U.S. (a rate of 14/1,000 children); over 1,028 deaths due to child maltreatment were reported in 1993. 3 Intentional injury is the leading cause of injury-related death in children under 1 year of age. 4 Parents or other relatives are responsible for over 90% of reported cases of child maltreatment. 3 In addition to physical injuries, children who have been victims of or witnesses to violence often experience abnormal physical, social, and emotional development; adolescents and adults who were abused as children are more likely to abuse tobacco and alcohol, attempt suicide, and exhibit violent or criminal behavior. 2,5-7

Approximately 140,000 cases of child sexual abuse were reported in 1993, 3 but the true incidence has been estimated to be as high as 450,000 cases per year. 8 In sexual abuse cases where the abuser was known to the child, over two thirds involved abuse by family members. 9 Girls are victims of sexual abuse two and a half times more frequently than boys. 10 Child sexual abuse often results in severe psychological trauma, 11 has been associated with a variety of psychological problems persisting into adulthood, and can cause medical complications such as sexually transmitted diseases. Teens who had been sexually abused were significantly more likely than nonabused controls to be sexually active, to abuse alcohol or drugs, and to have attempted suicide. 7,12

A number of parental and family characteristics have been identified as risk factors or risk markers for child physical abuse -- poor social support, low socioeconomic status, single parent family, and unplanned or unwanted pregnancy 13 -- but abuse is usually the result of multiple interacting factors. 14 Abuse of drugs or alcohol, although not clearly an independent risk factor, often coexists with conditions (poverty, social isolation, etc.) that increase the risk of abuse. 15 Abusive mothers are often themselves victims of physical violence by their spouse or partner, 16 and abusive parents often experienced abuse as children. A poor understanding of normal child development, poor anger control, and use of physical punishment as a discipline technique are more common among abusive parents. 13 In contrast, demographic or family characteristics are of little value in predicting risk of child sexual abuse. 17

Domestic Violence.

Estimates of the prevalence of domestic violence among couples vary depending on the source of data and definition of violence. 18 A national survey of 50,000 households conducted in 1992 and 1993 estimated that over 1 million women (9.3/1,000) and nearly 150,000 men (1.4/1,000) are victims each year of assault, robbery, or rape committed by their spouse, ex-spouse, or intimate partner; 19 over half of these incidents result in minor injury, and 3% in serious injury (broken bones, loss of consciousness, hospitalization, etc.). 20 This estimate may be conservative due to underreporting. In a comprehensive survey of family violence, involving detailed interviews of a total of 8,145 families in 1975 and 1985, 16% of couples reported instances of violence in the previous year (including shoving, slapping, or grabbing); 40% of these episodes involved more serious actions such as kicking, punching, or use of a weapon. 21,22 In recent surveys, 2-3% of women reported being kicked, bitten, or hit with fist or some other object by their partner in the preceding year. 22,23 Family studies indicate that both men and women engage in violence against partners, but women are the primary victims of chronic battering and episodes leading to injury. 24 In 95% of episodes of domestic violence leading to criminal investigation, 20 and 59% of spouse murders, 25 women were the victims. The prevalence of domestic violence is also high among female patients in clinical settings: 15% of women visiting an emergency department 26 and 12-23% of women in family practice settings 27,28 reported having been physically abused or threatened by their partner within the last year. Domestic violence tends to be repetitive -- female victims reported an average of six violent incidents per year. 22 The psychological consequences of abuse can be as important as physical injuries: abused women may suffer from posttraumatic stress disorder, and they are more likely than nonabused women to be depressed, attempt suicide, abuse alcohol or drugs, and transfer their aggression to their children. 29,30

Violence between spouses or partners can occur in families from all demographic and economic strata of society, 22 but risk of physical assault appears higher for some groups of women. Women who are under age 35, have not attended college, are of lower socioeconomic status, or are unmarried are more likely to report being victims of domestic violence. 20 A review of 52 studies found that only one risk marker -- witnessing parental violence as a child or adolescent -- was consistently associated with being a battered spouse. 31 Childhood family violence and alcohol problems are more common among abusive partners. 22 In general, however, the primary care physician is not able to predict reliably which patients are likely to be affected by domestic partner violence. 32

Pregnant women are also at risk from domestic violence. 33,34 In surveys of pregnant women (primarily from urban, public clinics), 7-18% of women reported physical abuse (including forced sexual activity) during the current pregnancy. 35-38 Many studies have reported an association between violence and worse outcomes in pregnancy. Battered women are more likely to register late for care, suffer preterm labor or miscarriage, or have low birth weight infants than nonabused controls. 35-39

Elder Abuse.

Elderly persons are also vulnerable to physical or psychological abuse or neglect by family members or other caregivers. 40,41 Community surveys in Boston and Canada estimated that 3-4% of persons over age 65 are victims of physical abuse, neglect, or regular verbal abuse. 42,43 Factors that appear to increase vulnerability to abuse among older persons include poor or failing health, cognitive impairment, and lack of family, financial, or community support. 41 The abuser is usually a relative, most often the spouse. 44 Family members who have a history of substance abuse, mental illness, or violence, or who are financially dependent on the elder person, are more likely to be abusive. 41 Accurate estimates of the medical consequences of elder abuse (patient visits, hospitalizations, or costs of care) are not available. 42 It is estimated that less than 1 in 5 cases of elder abuse is reported, due to denial or minimization of the problem by the victim, abuser, or health professionals. 45 In one report, up to 60% of elder abuse victims admitted for acute medical care remained permanently institutionalized. 46 The incidence of mistreatment of elders in institutions is not known. A survey of nursing home staff revealed that 36% of the staff had witnessed physical abuse, and 81% had witnessed psychological abuse of patients. 47

Accuracy of Screening Tests

Family violence may come to attention when it results in severe injuries, but ongoing abuse often goes unrecognized in the clinical setting. The clinician can identify victims of domestic violence through the patient interview, use of a standardized questionnaire, or the physical examination.

There are few reliable techniques for screening for child abuse. Questionnaires can identify risk factors for child abuse and neglect, but the potential to falsely label families as "potential abusers" is a limitation to their use in clinical practice. 48 Eliciting evidence of child physical or sexual abuse through patient interview is difficult. Young children may not be able to answer reliably, both child and parent may be ashamed or fearful of admitting to abuse, and some abusive parents may not regard their use of physical punishment as abuse. Most authorities recommend exploring for potential problems with open-ended, nonjudgmental questions about parenting and discipline (e.g., "What do you do when he misbehaves? Have you ever been worried that someone was going to hurt your child?"). 14,49 The value of standardized questions or screening instruments to improve the detection of child abuse is not known. Physical findings suggestive of abuse noted during routine or symptomatic examinations have been described. 50 Burns, bruises, and other lesions can be suggestive due to their appearance (e.g., patterns resembling hands, belts, cords, and other weapons) or location (buttocks, lower back, upper thighs, and face). Multiple traumatic injuries without a plausible explanation are also suspicious. At the same time, accidental injuries may produce similar findings in children, and many abused children (especially victims of sexual abuse) have no obvious physical findings. In a survey of studies of sexually abused children, normal examinations were reported in up to 73% of girls and 82% of boys. 51 Neither the sensitivity nor specificity of screening for abuse with physical examination is known.

Some studies report that less than 10% of battered women are accurately diagnosed by physicians, even in hospitals with an established protocol for this problem. 30,33 The routine patient interview often fails to detect abuse in adult patients, in large part because physicians do not routinely ask about domestic violence. Only a third of physicians in one survey felt that routine questions on abuse should be part of the annual examination. 52 Many physicians are reluctant to ask about abuse, out of fear of offending their patients, inability to "fix" abusive relationships, frustration in dealing with resistant patient behavior, and lack of time to deal with the problem. 53 Both victim and abuser may deny abuse for a variety of reasons -- embarrassment, psychologic repression, or fear of reprisal, abandonment, or legal consequences.

Consistent use of screening protocols significantly improves the detection of abuse as a cause of trauma, 54 and similar measures have been shown to increase the detection of domestic violence affecting pregnant and nonpregnant outpatients. The large majority of abuse victims favored routine questions about abuse, and half indicated that they would volunteer information about domestic violence only if specifically asked. 52 Directly asking individuals about the occurrence of abuse has been shown to elicit more positive reports (29% vs. 7%) than the use of a written self-report. 55 The Abuse Assessment Screen, containing five questions on the frequency and severity of past and current physical abuse and forced sexual activity, has been validated against more comprehensive instruments in pregnant women. 56 Incorporation of this instrument into the standard social service interview of pregnant patients significantly increased the detection of recent abuse compared to historical controls (15% vs. 3%). 35

There are fewer studies on screening for elder abuse. The value of the patient interview may be limited if the abuser is present. A 15-item instrument for detecting elder abuse had a sensitivity of 64% and specificity of 91% in a pilot study, but has not been validated for screening in routine practice. 57

Effectiveness of Early Detection

The repetitive nature of family violence suggests that early detection may be important in preventing future problems from abuse. Specifically, patients can be counseled about the nature and course of family violence, given information about available resources (community counseling and support groups, shelters, protective service agencies, etc.), and counseled about means to prevent further abuse. Psychological counseling, by either the primary care clinician or a mental health professional, may help the patient terminate personal relationships with violent individuals. The clinician may also identify individuals who are at increased risk of committing abuse in the future. Such persons may be referred for psychiatric counseling or family therapy to learn stress management and nonviolent alternatives for conflict resolution. Finally, the clinician is able (in many instances, required) to report suspected cases of abuse and neglect to appropriate protective service agencies for further evaluation and intervention.

Intervention studies in child abuse have concentrated on primary prevention. 48 Two randomized clinical trials have shown that home visits to high-risk families decrease the rate of child abuse and the need for medical visits early in life. 58,59 Interventions may need to be ongoing to retain effectiveness: extended follow-up of one of these trials found no effect of intervention on the rate of abuse and neglect later in life (ages 25-50 months). 60 Unfortunately, most clinicians do not have the option of providing this level of intervention. Studies evaluating the effectiveness of treatments for abused children are limited, and their results have been mixed. 61 Recurrent abuse despite interventions may occur in up to 60% of cases. 62 The effectiveness of treating sexual abusers of children remains controversial; one outpatient program reduced recidivism by half. 63

The effectiveness of early intervention for domestic violence is also difficult to determine. Most interventions for spouse abuse (e.g., shelters, legal action) are crisis oriented and have been directed at women who have already been injured by domestic violence. The options available to women are often limited by associated factors common in abusive relationships: financial dependence on an abusive partner, fear of retribution, alcohol or drug problems, or psychological vulnerability. 22,64 As a result, many abused women decline offers of help. 65 For women who do attempt to terminate an abusive relationship, the available resources to assist them are often limited and temporary. In a controlled study of battered women leaving a shelter, women who received services of an advocate for 4-6 hours per week reported better overall quality of life, but no significant difference in levels of physical abuse, compared to controls. 66 Whether treatment of abusive men is effective in reducing domestic violence remains controversial. A randomized trial of group therapy (vs. standard care) for convicted wife-abusers showed that repeat abuse was significantly lower for the treatment group. 67 Effective approaches to couples who engage in mutual, less severe violence (pushing, shoving, etc.) have not been developed. A large controlled study is under way to examine whether an integrated program to improve detection and management of domestic violence in the primary care setting leads to better clinical outcomes. 68

Effective interventions for elder abuse may also be limited, in large part because the abuser is often the primary caregiver to the victim. 41 If the only alternative is nursing home placement, victims may be reluctant to give up their independence in order to escape abuse. A review of elder physical abuse victims in Illinois reported that most victims received few tangible services from social service agencies other than case management (primarily monitoring). 69 Among abused elders, an advocate program decreased social isolation and improved services, but a reduction in subsequent abuse was not demonstrated. 70

Recommendations of Other Groups

The American Academy of Pediatrics, 71 American Medical Association, 72,73 American Academy of Family Physicians (AAFP), 74 and the Bright Futures guidelines 49 all recommend that physicians remain alert for the signs and symptoms of child physical abuse and child sexual abuse in the routine examination. Bright Futures suggests including questions about child discipline, and abuse of the child or parents, at the discretion of the clinician. The AMA's Guidelines for Adolescent Preventive Services (GAPS) recommend that teens should be asked annually about a history of emotional, physical, and sexual abuse. 75 The use of screening devices to identify families at risk for child maltreatment is not recommended by the Canadian Task Force on the Periodic Health Examination (CTF). 48 Legislation in all states requires health care professionals to report suspected cases of child abuse. 73

The American College of Obstetricians and Gynecologists (ACOG), 76 the U.S. Surgeon General, 77 the American College of Physicians, 78 and the AAFP 74 all recommend that clinicians be alert to the possibility of domestic violence as a causal factor in illness and injury. ACOG and AMA guidelines on domestic violence recommend that physicians routinely ask women direct, specific questions about abuse. 79,80 ACP and AAFP guidelines are currently under review. An expert panel convened by the National Research Council and the Institute of Medicine (Washington, DC) to evaluate the effectiveness of family violence interventions is scheduled to publish its findings in 1996. Healthy People 2000, a report of national health objectives, 81 and the Joint Commission on Accreditation of Healthcare Organizations 82 recommend that all emergency departments use protocols to improve the detection and treatment of victims of domestic violence.

The CTF determined that there was insufficient evidence to include or exclude case-finding for elder abuse as part of the periodic health examination, but recommended that physicians be alert for indicators of abuse and institute measures to prevent further abuse. 44 The AMA recommends that physicians routinely ask elderly patients direct, specific questions about abuse. 83 Many states require reporting of domestic violence 84 and elder abuse. 41

Discussion

Family violence is an important cause of physical and psychological harm in children and adults, yet it often goes undetected by clinicians. Identifying victims of domestic violence provides important information to clinicians and may allow early intervention to reduce the risk from future abuse. Although the benefit of routine screening has not been directly assessed, several factors support greater efforts by clinicians to detect domestic violence between spouses or sexual partners: the substantial prevalence of violent behavior among couples, the repetitive nature of domestic violence, and its high medical and societal costs. 1 Contrary to common perceptions, most patients appreciate being asked about possible abuse, and direct questioning may substantially increase reporting of episodes of domestic violence.

At the same time, clinicians face important obstacles in preventing violence or sexual abuse within the family. The etiology of domestic violence is multifactorial and is a function of social conditions, family conflict, cultural attitudes, and biologic factors. Interventions for physical or sexual abuse, mostly outside of the medical domain, vary greatly in effectiveness. Although crisis interventions (arrests, referral to shelters) are appropriate to protect victims in specific cases, there are few adequately controlled studies to determine the effect of counseling or referral on the long-term outcome of family violence. Appropriate screening methods for child abuse and elder abuse are also uncertain. Screening for abuse through the patient history is problematic with young children, may be unreliable if the abuser is also present, and can be complicated by denial in all age groups. Errors in diagnosing abuse are of great concern because of the serious emotional, legal, and societal implications of either failing to take action in cases of abuse or of incorrectly accusing innocent persons.

Despite the limited and imperfect options for detecting and intervening in domestic violence, the benefits are substantial for those families where the cycle of abuse can be interrupted. It is also important for clinicians to maintain a high index of suspicion when examining other persons at risk of physical or sexual abuse (e.g., children and the elderly), to assess potential risk factors for domestic violence, and to refer abuse victims and perpetrators to other professionals and community services to help prevent future incidents.

CLINICAL INTERVENTION

There is insufficient evidence to recommend for or against the use of specific screening instruments for family violence, but including a few direct questions about abuse (physical violence or forced sexual activity) as part of the routine history in adult patients may be recommended on other grounds ("C" recommendation). These other grounds include the substantial prevalence of undetected abuse among adult female patients, the potential value of this information in the care of the patient, and the low cost and low risk of harm from such screening. All clinicians examining children and adults should be alert to physical and behavioral signs and symptoms associated with abuse and neglect. Various guidelines are available to help clinicians in recognizing abuse and neglect in children, 71-73 spouses/partners, 80 and elders. 81 In all states, suspected cases of child abuse or neglect must be reported to local child protective services agencies. In most states, suspected elder abuse must also be reported. 41 All individuals who present with multiple injuries and an implausible explanation should be evaluated with attention to possible abuse or neglect. Injured pregnant women and elderly patients should receive special consideration for this problem. Suspected cases of abuse should receive proper documentation of the incident and physical findings (e.g., photographs, body maps); treatment of physical injuries; arrangements for counseling by a skilled mental health professional; and the telephone numbers of local crisis centers, shelters, and protective service agencies. The safety of children of victims of abuse should also be ensured.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by Craig F. Thompson, MD, MPH, and David Atkins, MD, MPH, with contributions from materials prepared by Christopher Patterson, MD, FRCPC, Harriet L. MacMillan, MD, FRCPC, James H. MacMillan, MSc, and David R. Offord, MD, FRCPC, for the Canadian Task Force on the Periodic Health Examination.

52. Screening for Problem Drinking

Burden of Suffering

Over half a million Americans are under treatment for alcoholism, but there is growing recognition that alcoholism (i.e., alcohol dependence) represents only one end of the spectrum of "problem drinking." 1 Many problem drinkers have medical or social problems attributable to alcohol (i.e., alcohol abuse or "harmful drinking") without typical signs of dependence, 2,3 and other asymptomatic drinkers are at risk for future problems due to chronic heavy alcohol consumption or frequent binges (i.e., "hazardous drinking"). Heavy drinking (more than 5 drinks per day, 5 times per week) is reported by 10% of adult men and 2% of women. 4 In large community surveys using detailed interviews, 5-8 the prevalence of alcohol abuse and dependence in the previous year among men was 17-24% among 18-29-year-olds, 11-14% among 30-44-year-olds, 6-8% among 45-64-year-olds, and 1-3% for men over 65; among women in the corresponding age groups, prevalence of abuse or dependence was 4-10%, 2-4%, 1-2%, and less than 1%, respectively. Problem drinking is even more common among patients seen in the primary care setting (8-20%). 9

Medical problems due to alcohol dependence include alcohol withdrawal syndrome, psychosis, hepatitis, cirrhosis, pancreatitis, thiamine deficiency, neuropathy, dementia, and cardiomyopathy. 10 Nondependent heavy drinkers, however, account for the majority of alcohol-related morbidity and mortality in the general population. 1 There is a dose-response relationship between daily alcohol consumption and elevations in blood pressure and risk of cirrhosis, hemorrhagic stroke, and cancers of the oropharynx, larynx, esophagus, and liver. 11-13 A number of studies have reported a modest increase in breast cancer among women drinking 2 drinks per day or more, but a causal connection has not yet been proven. 14 Three large cohort studies, involving over 500,000 men and women, observed increasing all-cause mortality beginning at 4 drinks per day in men 11,12 and above 2 drinks per day in women. 15 Women achieve higher blood alcohol levels than do men, due to smaller size and slower metabolism. 11,15 Compared to nondrinkers and light drinkers, overall mortality was 30-38% higher among men who drank 6 or more drinks per day, and more than doubled among women drinking at that level. 11,12 Of the more than 100,000 deaths attributed to alcohol annually, nearly half are due to unintentional and intentional injuries, 16 including 44% of all traffic fatalities in 1993 17 and a substantial proportion of deaths from fires, drownings, homicides, and suicides (see Chapters 50,51,57,58, and 59).

The social consequences of problem drinking are often as damaging as the direct medical consequences. Nearly 20% of drinkers report problems with friends, family, work, or police due to drinking. 10 Persons who abuse alcohol have a higher risk of divorce, depression, suicide, domestic violence, unemployment, and poverty (see Chapters 49-51). 10 Intoxication may lead to unsafe sexual behavior that increases the risk of sexually transmitted diseases, including human immunodeficiency virus (HIV). Finally, an estimated 27 million American children are at risk for abnormal psychosocial development due to the abuse of alcohol by their parents. 25

Moderate alcohol consumption has favorable effects on the risk of coronary heart disease (CHD). 18-23 CHD incidence and mortality rates are 20-40% lower in men and women who drink 1-2 drinks/day than in nondrinkers. 15,21,22 A meta-analysis of epidemiologic studies suggests little additional benefit of drinking more than 0.5 drinks per day. 20 The exact mechanism for the protective effect of alcohol is not known but may involve increases in high-density lipoprotein 23 and/or fibrinolytic mediators. 24

Alcohol Use during Pregnancy.

The proportion of pregnant women who report drinking has declined steadily in the U.S. 26 Recent surveys indicate 12-14% of pregnant women continue to consume some alcohol, 27,28 with most reporting only occasional, light drinking (median: 4 drinks per month). 26 Binge drinking or daily risk drinking (usually defined as 2 drinks per day or greater) is reported by 1-2% of pregnant women, 27-29 but higher rates (4-6%) have been reported in some screening studies. 30,31 Excessive use of alcohol during pregnancy can produce the fetal alcohol syndrome (FAS), a constellation of growth retardation, facial deformities, and central nervous system dysfunction (microcephaly, mental retardation, or behavioral abnormalities). 32 Other infants display growth retardation or neurologic involvement in the absence of full FAS (i.e., fetal alcohol effects [FAE]). 10 FAS has been estimated to affect approximately 1 in 3,000 births in the U.S. (1,200 children annually), making it a leading preventable cause of birth defects and mental retardation. 33,34

The level of alcohol consumption that poses a risk during pregnancy remains controversial. 10,35 FAS has only been described in infants born to alcoholic mothers, but the variable incidence of FAS among alcoholic women (from 3-40%) 33 suggests that other factors (e.g., genetic, nutritional, metabolic, or temporal) may influence the expression of FAS. 10 The reported incidence of FAS is higher in Native Americans and blacks than in whites. 33,36 Most studies report an increased risk of FAE among mothers who consume 14 drinks per week or more, 35,37-39 but the effects of lower levels of drinking have been inconsistent. 35,40,41 Modest developmental effects have been attributed to light drinking (7 drinks per week) in some studies, but underreporting by heavy drinkers and confounding effects of other important factors (nutrition, environment, etc.) make it difficult to prove or disprove a direct effect of light drinking. 10,35,42 Timing of exposure and pattern of drinking may be important, with greater effects proposed for exposure early in pregnancy and for frequent binge drinking. 10

Alcohol Use by Adolescents and Young Adults.

Use of alcohol by adolescents and young adults has declined over the past decade, but remains a serious problem. 43 Among 12-17-year-olds surveyed in 1993, 18% had used alcohol in the last month, and 35% in the last year. 4 In a separate 1993 survey, 45% and 33%, respectively, of male and female 12th graders reported "binge" drinking (5 or more drinks on one occasion) within the previous month. 44 The leading causes of death in adolescents and young adults -- motor vehicle and other unintentional injuries, homicides, and suicides -- are each associated with alcohol or other drug intoxication in about half of the cases. Driving under the influence of alcohol is more than twice as common among adolescents as among adults. 45 Binge drinking is especially prevalent among college students: half of all men and roughly one third of all women report heavy drinking within the previous 2 weeks. 43,46 Most frequent binge drinkers report numerous alcohol-related problems, including problems with school work, unplanned or unsafe sex, and trouble with police. 46

Accuracy of Screening Tests

Accurately assessing patients for drinking problems during the routine clinical encounter is difficult. The diagnostic standard for alcohol dependence or abuse (Diagnostic and Statistical Manual of Mental Disorders [DSM] IV) 2 requires a detailed interview and is not feasible for routine screening. Physical findings (hepatomegaly, skin changes, etc.) are only late manifestations of prolonged, heavy alcohol abuse. 47 Asking the patient about the quantity and frequency of alcohol use is an essential component of assessing drinking problems, but it is not sufficiently sensitive or specific by itself for screening. In one study, drinking 12 or more drinks a week was specific (92%) but insensitive (50%) for identifying patients who met DSM criteria for an active drinking disorder. 48 The reliability of patient report is highly variable and dependent on the patient, the clinician, and individual circumstances. Heavy drinkers may underestimate the amount they drink because of denial, forgetfulness, or fear of the consequences of being diagnosed with a drinking problem.

A variety of screening questionnaires have been developed which focus on consequences of drinking and perceptions of drinking behavior. The 25-question Michigan Alcoholism Screening Test (MAST) 49 is relatively sensitive and specific for DSM-diagnosed alcohol abuse or dependence (84-100% and 87-95%, respectively) 49,50 but it is too lengthy for routine screening. Abbreviated 10- and 13-item versions are easier to use but are less sensitive and specific in primary care populations (66-78% and 80%, respectively). 51,52 The four-question CAGE instrument [a] is the most popular screening test for use in primary care 53 and has good sensitivity and specificity for alcohol abuse or dependence (74-89% and 79-95%, respectively) in both inpatients 54,55 and outpatients. 56-58 The CAGE is less sensitive for early problem drinking or heavy drinking, however (49-73%). 58,59 Both the CAGE and MAST questionnaires share important limitations as screening instruments in the primary care setting: an emphasis on symptoms of dependence rather than early drinking problems, lack of information on level and pattern of alcohol use, and failure to distinguish current from lifetime problems. 52

Some of these weaknesses are addressed by the Alcohol Use Disorders Identification Test (AUDIT), a 10-item screening instrument developed by the World Health Organization (WHO) in conjunction with an international intervention trial. The AUDIT incorporates questions about drinking quantity, frequency, and binge behavior along with questions about consequences of drinking. 60 For the study population in which it was derived, a score of 8 of 40 on the AUDIT had high sensitivity and specificity for "harmful and hazardous drinking" (92% and 94%, respectively) as assessed by more extensive interview. 60 Validation studies have reported more variable performance of the AUDIT. Sensitivity and specificity for current abuse/dependence were high (96% and 96%, respectively) in an inner-city clinic; 61 among rural outpatients, the AUDIT was less sensitive and specific (61% and 90%) for current drinking problems but superior to the Short MAST-13. 51 Because it focuses on drinking in the previous year, however, the AUDIT is less sensitive for past drinking problems. 62 Further validation studies in other populations are under way.
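To make the scoring mechanics concrete, the sketch below implements the AUDIT arithmetic described above: 10 items summed and compared against the cut-off of 8. It assumes the standard WHO convention of 0-4 points per item (maximum 40; the version in Table 52.1 adds one extra point for a high-quantity response, giving a maximum of 41). The function name and example responses are hypothetical, not part of the instrument.

```python
def audit_score(responses):
    """Sum the 10 AUDIT item scores and flag possible problem drinking.

    `responses` holds one integer per AUDIT item; each item is assumed
    to be scored 0-4 (standard WHO convention), for a maximum of 40.
    """
    if len(responses) != 10:
        raise ValueError("the AUDIT has 10 items")
    if any(not 0 <= r <= 4 for r in responses):
        raise ValueError("each item is scored 0-4")
    total = sum(responses)
    # A total of 8 or more suggests harmful or hazardous drinking and
    # indicates the need for a more detailed assessment.
    return total, total >= 8

# Example: mostly low-frequency answers with one binge-related item
total, needs_assessment = audit_score([1, 1, 3, 0, 0, 0, 1, 1, 0, 0])
print(total, needs_assessment)  # 7 False -- below the screening cut-off
```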

Brief screening tests may be less sensitive or less specific in young persons: sensitivity of the CAGE for problems due to alcohol among college freshmen was 42% in men and 25% in women. 63 Only 38% of college students with an AUDIT score of 8 or greater met DSM criteria for abuse or dependence; 64 many of these "false-positive" results were due to drinking patterns (frequent binge drinking) that would be considered hazardous. Alternative screens have been developed for adolescents, such as the Perceived-Benefit-of-Drinking scale 65 and the Problem Oriented Screening Instrument for Teenagers (POSIT), 66 but they have not yet been adequately validated in the primary care setting.

Instruments that focus on alcohol dependency (e.g., CAGE or MAST) are not sensitive for levels of drinking considered dangerous in pregnancy. 67 Women may underreport alcohol consumption while pregnant, 68 and direct questions about drinking may provoke denial. 69 Brief instruments that incorporate questions about tolerance to alcohol ("How many drinks does it take to make you feel high?" or "How many drinks can you hold?") were more sensitive than the CAGE (69-79% vs. 49%) for risk-drinking in pregnancy (2 drinks per day or greater). 30,70 Women who require 3 or more drinks to feel high, or who can drink more than 5 drinks at a time, are likely to be at risk. 71

Laboratory tests are generally insensitive and nonspecific for problem drinking. Elevations in hepatocellular enzymes, such as aspartate aminotransferase (AST), or the erythrocyte mean corpuscular volume (MCV) are found in less than 10% and 30% of problem drinkers, respectively. 72 Serum gamma-glutamyl transferase (GGT) is more sensitive (33-60%) in various studies, 54,55,72 but elevations in GGT may be due to other causes (medications, trauma, diabetes, and heart, kidney, or biliary tract disease). Even when the prevalence of problem drinking is high (30%), the predictive value of an elevated GGT has been estimated at only 56%. 72
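The low predictive value follows directly from Bayes' rule. As a worked illustration (taking sensitivity at the upper end of the cited range, 60%, and assuming a specificity of 80% -- a figure chosen for illustration, not reported in the cited study), a 30% prevalence gives:

$$\mathrm{PPV} = \frac{\mathrm{sens} \times \mathrm{prev}}{\mathrm{sens} \times \mathrm{prev} + (1 - \mathrm{spec})(1 - \mathrm{prev})} = \frac{0.60 \times 0.30}{0.60 \times 0.30 + 0.20 \times 0.70} \approx 0.56$$

At the much lower prevalences typical of unselected screening populations, the same test characteristics would yield a far lower predictive value.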


[a] C: "Have you ever felt you ought to Cut down on drinking?" A: "Have people Annoyed you by criticizing your drinking?" G: "Have you ever felt bad or Guilty about your drinking?" E: "Have you ever had a drink first thing in the morning to steady your nerves or get rid of a hangover (Eye opener)?"

Effectiveness of Early Detection

Numerous studies demonstrate that clinicians are frequently unaware of problem drinking by their patients. 10 Early detection and intervention may alleviate ongoing medical and social problems due to drinking and reduce the future risks from excessive alcohol use.

Nondependent Drinkers.

A number of randomized trials have now demonstrated the efficacy of brief outpatient counseling (5-15 minutes) for nondependent problem drinkers. In four Scandinavian studies, which enrolled patients with elevated GGT and heavy alcohol consumption, brief counseling to reduce drinking and regular follow-up produced significant improvements (decreased GGT and/or decreased alcohol consumption) in treated versus control subjects; 73-76 counseling reduced reported sick days in one study. 74 In the longest of these studies, patients receiving counseling had fewer hospitalizations and 50% lower mortality after 5 years. 73 Some of this benefit, however, may have been due to the close medical follow-up (every 1-3 months) in the intervention group rather than the initial counseling.

Additional trials have demonstrated that brief interventions can reduce alcohol consumption in problem drinkers identified by screening questionnaires or self-reported heavy drinking. 77-79 Most recently, an international WHO study examined the effects of 5 or 20 minutes of counseling about drinking in 1,500 "at-risk" male and female drinkers: >35 drinks per week for men, >21 drinks per week for women, intoxication twice per month, or a self-perceived drinking problem. 80 After 9 months, self-reported alcohol consumption among men was reduced 32-38% in the intervention groups and 10% in controls. Among women, alcohol consumption declined significantly (>30%) among both treated and control groups. A meta-analysis of six brief-intervention trials estimated that interventions reduced average alcohol consumption by 24%. 81 Although self-reported consumption may be subject to bias, reported changes in drinking correlated with objective measures (GGT, blood pressure) in most studies. Two additional studies demonstrated significant reductions in blood pressure as a result of advice to stop drinking or substitution of nonalcoholic beer. 82,83

Pregnancy.

There are no definitive controlled trials of treatments for excessive drinking in pregnancy. 84 In several uncontrolled studies, a majority of heavy-drinking pregnant women who received counseling reduced alcohol consumption, 32,85,86 and reductions in drinking were associated with lower rates of FAS. 32,86 Many women spontaneously reduce their drinking while pregnant, however, and women who continue to drink differ in many respects from women who cut down (e.g., heavier drinking, poorer prenatal care and nutrition). As a result, it is difficult to determine precisely the benefit of screening and counseling during pregnancy. In two trials that employed a control group, the proportions of women abstaining or reducing consumption were similar in intervention and control groups. 87,88

Adolescents.

A 1990 Institute of Medicine (IOM) report concluded that specific recommendations for the treatment of alcohol problems in young persons were impossible, due to disagreement over what constitutes a drinking problem in adolescents, the wide variety of interventions employed, and the absence of any rigorous evaluation of different treatments. 1 Alcohol interventions in adolescents have focused on primary prevention of alcohol use. 10 Recent reviews of school-based programs found that most effects were inconsistent, small, and short-lived; programs that sought to develop social skills to resist drug use seem to be more effective than programs that emphasize factual knowledge. 89,90

Alcohol-Dependent Patients.

Patients with alcohol dependence usually receive more intensive treatment. A 1989 report of the IOM 91 reviewed a variety of alcohol treatment modalities and concluded that various treatments were effective, but there was no single superior treatment for all patients, and few treatments were effective for the majority of patients. They found no evidence that residential versus nonresidential programs, or long- versus short-duration programs, were more effective for the average patient, and no studies existed that adequately evaluated the independent effect of Alcoholics Anonymous (AA). In a subsequent trial among employees referred for alcohol problems, patients who received inpatient treatment and mandatory AA follow-up were more likely to be abstinent at 2-year follow-up (37% vs. 16%) than patients assigned to mandatory AA only. 92

Two short-term (12-week) randomized trials demonstrated a significant benefit of naltrexone, an opioid antagonist, as an adjunct to treatment of alcohol dependence. In one study, patients receiving naltrexone and supportive psychotherapy had significantly higher abstinence rates than did subjects receiving placebo (61% vs. 19%). 93 In the second, men receiving naltrexone reported less alcohol craving and fewer drinking days than placebo-treated men. 94 In both trials, naltrexone significantly reduced the likelihood of relapse (heavy drinking or steady drinking) among subjects who did not achieve complete abstinence. The benefits of alcohol-sensitizing agents, however, remain uncertain. 10 Disulfiram (i.e., Antabuse) did not improve long-term abstinence rates in a controlled trial, but it did reduce drinking days among patients receiving the highest dose. 95

In a 10-year follow-up of 158 patients completing inpatient treatment, 61% reported complete or stable remission of alcoholism. 96 Completing an extended inpatient program was associated with significantly lower mortality among alcoholic patients in a second study. 97 Many such studies of alcohol treatment, however, suffer from important methodologic limitations: inadequate control groups, insufficient or selective follow-up, and selection bias due to the characteristics of patients who successfully complete voluntary treatment programs. 91,98,99 Since spontaneous remission occurs in as many as 30% of alcoholics, 100,101 reduced consumption may be inappropriately attributed to treatment. Successful treatment is likely to represent a complex interaction of patient motivation, treatment characteristics, and the posttreatment environment (family support, stress, etc.). 1,10 The IOM review concluded that treatment of other life problems (e.g., with antidepressant medication, family or marital therapy, or stress management) and empathetic therapists were likely to improve treatment outcomes. 91

Recommendations of Other Groups

There is a consensus among professional groups such as the American Medical Association (AMA) 102 and the American Academy of Family Physicians (AAFP) 103 that clinicians should be alert to the signs and symptoms of alcohol abuse and should routinely discuss patterns of alcohol use with all patients. AAFP recommendations are under review. The Canadian Task Force on the Periodic Health Examination (CTF) 104 and a 1990 IOM panel 1 recommended screening adults for problem drinking, using patient inquiry or standardized instruments, and offering brief counseling to nondependent problem drinkers.

The American Academy of Pediatrics (AAP), 105 AMA Guidelines for Adolescent Preventive Services (GAPS), 106 the Bright Futures guidelines, 107 and the AAFP 103 all recommend careful discussion with all adolescents regarding alcohol use and regular advice to abstain from alcohol. The AAP also advises physicians to counsel parents regarding their own use of alcohol in the home. Recommendations of the U.S. Surgeon General, 108 the American College of Obstetricians and Gynecologists, 109 and the AAP 109,110 advise counseling all women who are pregnant or planning pregnancy that drinking can be harmful to the fetus and that abstinence is the safest policy. The CTF recommends that all women be screened for problem drinking and advised to reduce alcohol use during pregnancy. 104

Several organizations have made recommendations about "safe" levels of alcohol consumption for nonpregnant adults. The National Institute on Alcohol Abuse and Alcoholism, 111 the U.S. Surgeon General, 112 and dietary guidelines produced jointly by the U.S. Departments of Health and Human Services and Agriculture 113,114 recommend no more than 2 drinks per day for men and 1 drink per day for nonpregnant women. Slightly higher limits were proposed by national health authorities in the U.K. 115

Discussion

Alcohol problems are common in the primary care setting, but they often go undetected by clinicians. Although imperfect, asking patients direct questions about the quantity, frequency, and pattern of their drinking is an important way to identify those who are most likely to experience problems due to alcohol. Questions about tolerance to the effects of alcohol may circumvent denial among pregnant women and heavy drinkers. The CAGE and other brief screening instruments are useful supplements to the standard patient history, but they may be less sensitive for early problems and hazardous drinking. The AUDIT may detect a broader range of current drinking problems, but its performance in the primary care setting needs further evaluation. Although laboratory tests such as GGT are not sufficiently sensitive or specific for routine screening, they may be useful in selected high-risk patients to confirm clinical suspicion or to motivate changes in drinking. Neither questionnaires nor laboratory tests should be considered diagnostic of problem drinking without more detailed evaluation (see Clinical Intervention).

Detecting early problem drinkers is important, because they account for a large proportion of all alcohol problems and they are more likely to respond to simple interventions than patients with alcohol dependency. There is now good evidence that brief counseling can reduce alcohol consumption in problem drinkers, and several trials have also reported improved clinical outcomes. Since the risks from alcohol rise steadily at higher levels of consumption, reducing drinking should also benefit heavy drinkers (i.e., hazardous drinkers) who do not yet manifest problems due to drinking. Early attention to problem drinking is especially important in young adults: hazardous drinking is common, adverse effects of alcohol increase with duration of use, and few persons initiate drinking after age 30. 116 Early detection is also important for alcohol-dependent patients, but effective treatment requires more intensive and sustained efforts to promote abstinence.

Uncertainties remain about optimal screening methods and interventions during pregnancy, but screening is justified by the strong evidence of the adverse effects of alcohol on the fetus. Although the risks of occasional, light drinking during pregnancy have not been established, abstinence can be recommended as a prudent approach for pregnant women. At the same time, women concerned about the effects of previous moderate drinking early in pregnancy can be reassured that important harms have not been demonstrated from such limited exposures. Because exposure early in pregnancy may be most important, screening and advice should be directed toward women contemplating pregnancy and those at risk for unintended pregnancy, not just women who are already pregnant.

There is insufficient evidence to make precise recommendations about desirable levels of drinking, but the strong association between heavy alcohol use and risk of future complications justifies advising all drinkers to drink moderately and avoid frequent intoxication, even in the absence of current problems (see below).

Clinical Intervention

Screening to detect problem drinking and hazardous drinking is recommended for all adult and adolescent patients ("B" recommendation). Screening should involve a careful history of alcohol use and/or the use of standardized screening questionnaires. Patients should be asked to describe the quantity, frequency, and other characteristics of their use of wine, beer, and liquor, including frequency of intoxication and tolerance to the effects of alcohol. One drink is defined as 12 ounces of beer, a 5-ounce glass of wine, or 1.5 fluid ounces (one jigger) of distilled spirits. Brief questionnaires such as the CAGE or AUDIT may help clinicians assess the likelihood of problem drinking or hazardous drinking (see Table 52.1). Responses suggestive of problem drinking should be confirmed with more extensive discussions with the patient (and family members where indicated) about patterns of use, problems related to drinking, and symptoms of alcohol dependence. 2 Routine measurement of biochemical markers, such as serum GGT, is not recommended for screening purposes. Discussions with adolescents should be approached with discretion to establish a trusting relationship and to respect the patient's concerns about the confidentiality of disclosed information.
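These servings are roughly interchangeable because each contains a similar amount of absolute alcohol. Assuming typical strengths of about 5% alcohol by volume for beer, 12% for wine, and 40% for distilled spirits (figures not stated in the text), each works out to roughly 0.6 fluid ounces of ethanol:

$$12\ \mathrm{oz} \times 0.05 \approx 5\ \mathrm{oz} \times 0.12 \approx 1.5\ \mathrm{oz} \times 0.40 \approx 0.6\ \mathrm{oz\ of\ ethanol}$$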

All pregnant women should be screened for evidence of problem drinking or risk drinking (2 drinks per day or binge drinking) ("B" recommendation). Including questions about tolerance to alcohol may improve detection of at-risk women. All pregnant women and women contemplating pregnancy should be informed of the harmful effects of alcohol on the fetus and advised to limit or cease drinking. Although there is insufficient evidence to prove or disprove harms from occasional, light drinking during pregnancy, abstinence from alcohol can be recommended on other grounds: possible risk from even low-level exposure to alcohol, lack of harm from abstaining, and prevailing expert opinion ("C" recommendation). Women who smoke should be advised that the risk of low birth weight is greatest for mothers who both smoke and drink.

Patients with evidence of alcohol dependence should be referred, where possible, to appropriate clinical specialists or community programs specializing in the treatment of alcohol dependence. Patients with evidence of alcohol abuse or hazardous drinking should be offered brief advice and counseling. Counseling should involve feedback of the evidence of a drinking problem, discussion of the role of alcohol in current medical or psychosocial problems, direct advice to reduce consumption, and plans for regular follow-up. Problems related to alcohol (e.g., physical symptoms, behavioral or mood problems, or difficulties at work and home) should be monitored to determine whether further interventions are needed. There is no single definition of "hazardous" drinking in asymptomatic persons, but successful intervention trials have generally used limits of 5 drinks per day in men, 3 drinks per day in women, or frequent intoxication to identify persons at risk. Several U.S. organizations have suggested lower limits for "safe" drinking: 2 drinks per day in men and 1 drink per day in women. 18 All persons who drink should be informed of the dangers of driving or other potentially dangerous activities after drinking (see Chapter 57). The use of alcohol should be discouraged in persons younger than the legal age for drinking ("B" recommendation), although the effectiveness of alcohol abstinence messages in the primary care setting is uncertain.

Table 52.1 AUDIT Structured Interview[a]

[a] A score of greater than 8 (out of 41) suggests problem drinking and indicates a need for more in-depth assessment. A cut-off of 10 points is recommended by some to provide greater specificity.

[*] 5 points if response is 10 or more drinks on a typical day.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by David Atkins, MD, MPH, with contributions from materials prepared for the Canadian Task Force on the Periodic Health Examination by Deborah L. Craig, MPH, and Jean L. Haggerty, MSc.

53. Screening for Drug Abuse

Burden of Suffering

The abuse of both illicit and legal drugs remains an important medical problem in the U.S. Although casual (i.e., occasional) use of illicit drugs declined steadily in the general population from 1979 to 1992, drug use appears to have been increasing since then, especially among teenagers and young adults. 1,2 Moreover, there has been little improvement in the numbers of persons using drugs on a regular basis. 3,4 In a 1991 survey of over 8,000 persons aged 15-54 years, 3.6% met diagnostic criteria for drug dependence or drug abuse in the past year, 5 and drug-related emergency visits in the U.S. reached all-time highs in 1993. 6 An estimated 5.5 million Americans, half of whom are in the criminal justice system, are affected by drug abuse or dependence. 7

In a national household survey in 1993, 14% of adults ages 18-25 and 3% of those over 35 reported using illicit drugs within the last month. 2 Occasional use of marijuana accounts for a large proportion of reported drug use, but many drug users also use other illicit drugs (cocaine, heroin, phencyclidine, methaqualone, hallucinogens, etc.), legal drugs not prescribed by a physician (e.g., amphetamines, benzodiazepines, barbiturates, and anabolic steroids), or inhalants (amyl and butyl nitrite, gasoline, nitrous oxide, glue, and other solvents). An estimated 5 million Americans smoke marijuana regularly (at least once a week), almost 500,000 use cocaine weekly, and over 500,000 used heroin or another injectable drug in the past year. 2 Others have estimated that up to 500,000 Americans are addicted to heroin and 1-1.6 million currently use injection drugs. 8 Drug use is more common among men, the unemployed, adults who have not completed high school, and urban residents. The overall prevalence of drug use does not differ greatly among white, African American, and Hispanic populations, but patterns of drug use may differ. 4

Adverse effects of drug use are greatest in heavy users and those dependent on drugs, but some can occur from even occasional drug use. Cocaine can produce acute cardiovascular complications (e.g., arrhythmias, myocardial infarction, cerebral hemorrhage, and seizures), nasal and sinus disease, and respiratory problems (when smoked). 9,10 Dependence on cocaine produces diminished motivation, psychomotor retardation, irregular sleep patterns, and other symptoms of depression. 11 "Crack," a popular and cheaper smokeable form of cocaine, is also highly addictive. Mortality among injection drug users (IDUs) is high from overdose, suicide, violence, and medical complications from injecting contaminated materials (e.g., human immunodeficiency virus [HIV] infection, hepatitis, bacterial endocarditis, chronic glomerulonephritis, and pulmonary emboli); in some cities, up to 40% of IDUs are infected with HIV. 12 Although the extent of adverse effects of marijuana use is controversial, chronic use may be associated with respiratory complications or amotivational syndrome. 13,14 In a 1991 survey, 8% of cocaine users and 21% of marijuana users reported daily use for 2 weeks or more. 15

The indirect medical and social consequences of drug use are equally important: criminal activities related to illegal drugs take a tremendous toll in many communities, use of injection drugs and crack are major factors in the spread of HIV infection 16,17 (see Chapter 28), and drugs play a role in many homicides, suicides, and motor vehicle injuries (see Chapters 50,57, and 59). Nearly half of all users of cocaine or marijuana reported having driven a car shortly after using drugs. 13,15

Drug Use During Pregnancy.

A national probability sample of 2,613 women giving birth in 1992-1993 estimated that 5.5% used some illicit drug during pregnancy: the most frequently used drugs were marijuana (2.9%) and cocaine (1.1%). 18 Anonymous urine testing of nearly 30,000 women giving birth in California in 1992 detected illicit drugs in 5.2%: marijuana (1.9%), opiates (1.5%), and cocaine (1.1%) were the most frequently detected substances. 19 Prevalence of drug use is generally higher among mothers who smoke or drink, are unmarried, are not working, have public or no insurance, live in urban areas, or receive late or no prenatal care. 18-20 Anonymous urine testing detected cocaine use in 7-15% of pregnant women from high-risk, urban communities 21,22 and in 0.2-1.5% of mothers in private clinics and rural areas. 23,24 The most important forms of substance abuse during pregnancy are the use of alcohol and tobacco, however (see Chapters 52 and 54). 25

Drug use during pregnancy has been associated with a variety of adverse outcomes, but problems associated with drug use (e.g., use of alcohol or cigarettes, poverty, poor nutrition, and inadequate prenatal care) may be more important than the direct effects of drugs. 26,27 Regular use of cocaine and opiates is associated with poor weight gain among pregnant women, impaired fetal growth, and increased risk of premature birth; cocaine appears to increase the risk of abruptio placentae. 28 The effects of social use of cocaine in the first trimester are uncertain. 29,30 Cocaine has been blamed for some congenital defects, 27 but the teratogenic potential of cocaine has not been definitively established. Infants exposed to drugs in utero may exhibit withdrawal symptoms due to opiates, or increased tremors, hyperexcitability, and hypertonicity due to cocaine. 27,31 Possible long-term neurologic effects of drug exposure are difficult to separate from the effects of other factors that influence development among disadvantaged children. 27,32,33 The effects of marijuana on the fetus remain controversial. 34-36

Drug Use in Children and Adolescents.

Drug use and abuse remain important problems among adolescents. 37 After more than 10 years of decreasing trends, drug use among high school students increased in 1993 and 1994. 1,38 Use of illicit drugs may interfere with school, increase the risk of injuries, contribute to unsafe sex, and progress to more harmful drug use. Among high school seniors in 1994, 22% reported using an illicit drug in the past month: marijuana (19%), stimulants (4%), inhalants (3%), and hallucinogens (3%) were more common than cocaine (1.5%) or heroin (0.3%). 1 Abuse of inhalants is a leading drug problem in younger adolescents 1 and can cause asphyxiation or neurologic damage with chronic abuse. 39 Abuse of anabolic steroids in adolescent boys and young men can cause psychiatric symptoms and has been associated with hepatic, endocrine, and cardiovascular problems.

Accuracy of Screening Tests

The diagnostic standard for drug abuse and dependence is the careful diagnostic interview. 40 Important information from the patient history includes the quantity, frequency, and pattern of drug use; adverse effects of drugs on work, health, and social relationships; and any symptoms of dependence. 41 Clinicians often have trouble accurately identifying drug use and drug abuse among their patients in routine clinical encounters, however. Time may be too limited to take a careful history, some patients may not acknowledge drug problems due to denial, and many others are reluctant to admit to using drugs, for fear of discrimination by health care providers or concerns about confidentiality. It is common for adolescents to distrust authority figures such as clinicians, and young persons may be especially concerned about their drug use becoming known to family members, school officials, or the police. 42

There are few data to determine whether or not the use of standardized screening questionnaires can increase the detection of potential drug problems among patients. Brief alcohol screening instruments such as the CAGE or MAST (see Chapter 52) can be modified to assess the consequences of drug use in a standardized manner, 41,43 but these instruments have not been compared to routine history or clinician assessment. Questionnaires which include items about personal problems, outlook, and high-risk behaviors can identify adolescents at increased risk for drug use, but they have not been validated in prospective studies. 44 Other instruments such as the Addiction Severity Index 45 are useful for evaluating treatment needs but are too long for screening.

Toxicologic tests can provide objective evidence of drug use. The most common tests employ radioimmunoassays (RIA), enzymatic immunoassay (EIA), fluorescence polarization immunoassay (FPI), or thin-layer chromatography (TLC) to measure concentrations of specific drugs and their metabolites in urine specimens. 46 Sensitivity of these tests is generally above 99% compared with reference standards; 47 sensitivity for detecting drug use in individuals, however, depends directly on the timing of drug use and the urinary excretion of drug metabolites. Marijuana may be detected for up to 14 days after repeated use, but evidence of cocaine, opiates, amphetamines, and barbiturates is present for only 2-4 days after use. Drug users who wish to avoid detection may employ various techniques that further reduce the sensitivity of urine testing: water loading, diuretic use, ingestion of interfering substances, or adulteration of urine samples. Most importantly, toxicologic tests do not distinguish between occasional users and individuals who are dependent on or otherwise impaired by drug use.
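The dependence of sensitivity on timing can be made explicit with a small sketch that encodes only the detection windows quoted above; the dictionary values and function are illustrative, not a clinical standard.

```python
# Approximate upper bounds on urine detection windows, in days, taken
# from the text; the marijuana figure assumes repeated use.
DETECTION_WINDOW_DAYS = {
    "marijuana": 14,     # up to 14 days after repeated use
    "cocaine": 4,        # 2-4 days after use
    "opiates": 4,
    "amphetamines": 4,
    "barbiturates": 4,
}

def negative_screen_informative(drug, days_since_last_use):
    """Return True if a negative urine screen argues against use.

    Outside the detection window a negative result says little,
    because metabolites may already have cleared.
    """
    return days_since_last_use <= DETECTION_WINDOW_DAYS[drug]

print(negative_screen_informative("cocaine", 7))  # False: use a week
                                                  # ago would be missed
```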

False-positive results from urine drug screening are possible due to cross-reaction with other medications or naturally occurring compounds in foods. 48 To prevent falsely implicating persons as users of illegal drugs, screen-positive samples are usually confirmed with more specific (and expensive) techniques such as gas chromatography-mass spectrometry (GC-MS). These procedures reduce, but do not eliminate, the possibility of false-positive results due to cross-reactions, contamination, or mislabeled specimens. Proficiency testing of nearly 1,500 urine specimens sent to 31 U.S. laboratories produced no false-positive results and 3% false-negative results. 49 A similar study of 120 clinical laboratories in the U.K. demonstrated higher error rates (4% false-positive, 8% false-negative), largely due to laboratories that did not use confirmatory tests. 50

Screening Pregnant Women and Newborn Infants.

A careful history taken by trusted clinicians remains the most sensitive means of detecting drug use and abuse, 51,52 but many pregnant women conceal use of illicit drugs, since it may provide grounds for action by child welfare agencies. Clinicians often selectively screen for drug use, based on preconceptions of the typical drug-using mother. Studies using sensitive toxicologic tests suggest that only one in four pregnant women who have used opiates, cocaine, or marijuana is identified as a drug user in the medical record. 51 Patient history identified only 40-60% of pregnant women with urine tests positive for illicit drugs. 21,53 Detection of drug use is increased by use of a standard protocol for assessing drug use in patients, rather than screening based on the discretion of the clinician. 54,55

Testing of newborn specimens can identify infants exposed to drugs in utero. Assays of infant urine are most common but are not sensitive for drug use early in pregnancy. Among mothers admitting drug use during pregnancy, RIA of infant urine had a sensitivity of 52%, versus 88% for RIA of meconium. 51 Among 39 women who used cocaine, RIA of infant hair was more sensitive (78%) than RIA of infant urine (38%) or meconium (52%). 52 These more sensitive tests are not widely available, however, and have not yet been sufficiently validated for screening purposes. 56 Moreover, clinical history may be more useful than toxicologic testing for identifying newborns at risk: among drug-exposed infants identified by meconium testing, adverse outcomes were limited to infants born to mothers who admitted to drug use. 51

Adverse Effects of Screening.

Drug testing is frequently performed without informed consent in the clinical setting on the grounds that it is a diagnostic test intended to improve the care of the patient. Because of the significance of a positive drug screen for the patient, however, the rights of patients to autonomy and privacy have important implications for screening of asymptomatic persons. 57 If confidentiality is not ensured, test results may affect a patient's employment, insurance coverage, or personal relationships. 58 Testing during pregnancy is especially problematic, because clinicians may be required by state laws to report evidence of potentially harmful drug or alcohol use in pregnant patients.

Effectiveness of Early Detection

Early intervention has the potential to avert some of the serious consequences of drug abuse, including injuries, legal problems, and medical complications. Although various treatments have been proven effective in drug-dependent patients (see below), they have largely been studied in patients who have already developed medical, social, or legal problems due to their drug use. There is much less evidence that systematic screening and earlier intervention are effective in improving clinical outcomes among asymptomatic persons, who may be less motivated to undergo treatment than more severely impaired drug users.

The evidence supporting the effectiveness of treatment for drug abuse and dependence was reviewed in 1990 by the Institute of Medicine. 7 The most consistent evidence supports the clinical benefits of methadone maintenance in persons addicted to heroin. Several studies, including two randomized controlled trials, have shown that heroin addicts who remain in methadone maintenance programs have reduced heroin consumption, lower rates of HIV infection, decreased criminality and unemployment, and lower mortality than subjects who are not treated or treated for only short periods. 7,59 Over the short term, methadone treatment is associated with a 95% reduction in self-reported heroin use and a 57-68% reduction in self-reported cocaine use, 60 but some persons switch from heroin to other drugs while on treatment. 61 Moreover, results may be biased due to reliance on patient self-report and loss to follow-up of patients who drop out of treatment. 62

Drug abusers are frequently enrolled in residential treatment programs, often as part of a court order related to drug offenses. Patients entering such programs experience lower rates of drug use, imprisonment, and unemployment than drug users who do not enroll. 7 Longer programs seem to be more effective than short (<3 months) programs. 63 However, attrition rates from residential programs can reach 75%, 64 and selection bias may contribute to the improved outcomes in subjects who complete programs. 65 Less intensive outpatient programs also seem to be effective for drug users, but the wide variation in interventions used limits the conclusions that can be drawn. Attrition is highest in outpatient, nonresidential programs. 7 There are fewer data on long-term (>1 year) outcomes of drug treatment; recidivism is high, and many patients suffer from other problems (psychiatric disorders, unemployment, homelessness) which reinforce drug use and are often not addressed by drug treatment. 7

Treatment of adolescent substance abusers has recently been reviewed in nearly 1,500 primarily middle-class adolescents aged 12-19 years who entered inpatient or residential treatment programs. 66 Compared to use before treatment, there was a significant reduction in regular drug use (weekly or more) 1 year after treatment (85% vs. 29%), and 50% of teens had been abstinent for 6 months. Increasing parental participation in treatment was associated with greater levels of abstinence. High school primary prevention programs which emphasize "life skills" have reduced tobacco or alcohol use over the short term (1 year), 67 but long-term effects on illicit drug use have not been well studied. In a 6-year randomized trial among 3,597 high school students, a prevention curriculum delivered in grades 7-9 significantly reduced smoking and alcohol use, but not marijuana use, in high school seniors; a subgroup of students who received a more complete intervention were less likely to use marijuana regularly (5% vs. 9%). 68

Treatment of Pregnant Drug Abusers.

There are few controlled trials of interventions for pregnant women who use illicit drugs. Women who use crack and other forms of cocaine account for the largest group of pregnancies at risk from illicit drugs, but optimal treatment for cocaine users is uncertain. In two observational studies, risk of low birth weight decreased substantially with increasing number of prenatal visits. 69,70 Women who reduced use of cocaine during pregnancy, or used cocaine infrequently, had outcomes similar to nonusers in several studies. 30,34 Methadone maintenance is the usual treatment for pregnant women addicted to opiates: withdrawal during pregnancy is dangerous, and the regular contact required for methadone treatment may encourage women to receive regular prenatal care. 27 Methadone can be safely withdrawn after delivery, but it prolongs withdrawal in the infant. Because the most seriously impaired drug users often present late for care, if at all, options for improving the course of drug-exposed pregnancies are often limited.

Recommendations of Other Groups

The American Medical Association (AMA) 71 and the American Academy of Family Physicians (AAFP) 72 advise physicians to include an in-depth history of substance abuse as part of a complete health examination for all patients. The AAFP, 72 AMA Guidelines for Adolescent Preventive Services (GAPS), 73 Bright Futures recommendations, 74 and American Academy of Pediatrics 75,76 suggest that clinicians discuss the dangers of drug use with all children and adolescents and include questions about substance abuse as a part of routine adolescent visits. The American College of Obstetricians and Gynecologists recommends that clinicians take a thorough history of substance use and abuse in all obstetric patients, and remain alert to signs of substance abuse in all women. 77,78

The AMA supports drug testing (in conjunction with rehabilitation and treatment) as part of preemployment examinations for jobs affecting the health and safety of others. 71 The AMA and most other medical organizations endorse urine testing when there is reasonable suspicion of substance abuse, but none of these groups recommends routine drug screening in the absence of clinical indications.

Discussion

Many Americans face substantial health risks from illicit drugs and the nonmedical use of other drugs, but questions remain about appropriate methods for screening for drug abuse among asymptomatic patients. The routine use of screening instruments or laboratory tests has not yet been proven effective in reducing harmful drug use. Nonetheless, information about drug use is an important component of the medical interview, especially for adolescents and young adults, and a careful history remains the best way to identify those who need treatment. Despite frequent treatment failures, the medical and social benefits of treating drug abuse are substantial for patients who achieve long-term abstinence. Reducing drug use is also likely to have important benefits to society in reducing criminal activity and the spread of HIV. 7

Urine testing is sensitive and specific for recent drug use but has many limitations as a routine screening test: it does not distinguish occasional use from drug abuse or dependence; sensitivity and specificity vary with timing of drug use; and the effectiveness of early intervention has not been examined in asymptomatic drug users detected by toxicologic screening. 79 Routine screening in asymptomatic individuals also poses important risks: testing without informed consent may violate patient autonomy; the predictive value of positive test results may be low in populations with a low prevalence of drug use; and patients may be discriminated against if confidentiality of results is not ensured.

Efforts to screen for drug use in pregnancy have been prompted by concern about the adverse effects on the developing fetus, the impact of parental drug use on child safety and welfare, and the realization that many drug-using mothers go undetected by routine patient history. Use of standardized clinical assessment in all pregnant women can increase the identification of drug use, but there is little evidence that routine urine screening in asymptomatic women reduces drug use during pregnancy or results in better perinatal outcomes. Treatment services for pregnant, drug-abusing women are often scarce, testing may not identify those pregnancies at highest risk, and positive tests have direct legal and social consequences for the mother and child. 80 Where clinicians must report drug use in pregnancy, routine testing may lead some women to avoid needed prenatal care.

CLINICAL INTERVENTION

There is insufficient evidence to recommend for or against routine screening for drug abuse with standardized questionnaires or biologic assays ("C" recommendation). Including questions about drug use when taking a history from adolescent and adult patients may be recommended on other grounds, including the prevalence of drug use and the serious consequences of drug abuse and dependence. Clinicians should be alert to signs and symptoms of drug abuse and ask about the use of illicit drugs and legal drugs of abuse (e.g., sedatives, stimulants); use of inhalants should be considered in older children, adolescents, and young adults. The quantity, frequency, patterns of consumption, and adverse consequences of drug use (e.g., interference with school or work, evidence of dependence) should be assessed for all patients who report drug use. Clinicians should establish a trusting relationship with patients, approach discussion of drug use in a nonjudgmental manner, and respect the patient's concerns about the confidentiality of disclosed information.

All pregnant women should be advised about the potential risks to the fetus of drug use during pregnancy and the potential to transmit drugs to infants through breastfeeding. Routine drug testing of urine or other body fluids is not recommended as the primary method of detecting drug use in pregnant women or other asymptomatic adults. Selective use of urine testing during pregnancy may be appropriate when the possibility of drug use is suggested by clinical signs and symptoms (e.g., growth retardation, inadequate weight gain, inadequate prenatal care); periodic testing can also help monitor and encourage abstinence in women who have used drugs. Pregnant women who abuse drugs should be advised of the importance of regular prenatal care and be referred for treatment, where available.

Patients should give consent prior to drug testing and be informed of any legal obligations on the part of the clinician to report drug use to child protective agencies or other authorities. Both positive and negative results should be interpreted with understanding of the kinetics of drug metabolism and the limitations of testing methods, and positive screening tests should be confirmed by more reliable methods.

All patients who report potentially harmful use of drugs should be informed of the risks associated with their drug use and advised to cut down or stop. Decisions about treatment should be based on evidence of drug abuse or drug dependence obtained through careful patient interview, including discussion with friends or family members where appropriate. A treatment plan should be developed for the patient and family that is tailored to the drug of abuse and the needs of the patient. Patients with evidence of drug dependence should be referred to appropriate drug-treatment providers and community programs specializing in the treatment of drug dependencies. Persons who continue to inject drugs should be screened periodically for HIV infection and advised of measures that may reduce the risk of infections due to drug use: use a new sterile syringe with each use, never share or re-use injection equipment, use clean (sterile, if possible) water to prepare drugs, clean the injection site with alcohol prior to injection, and safely dispose of syringes after use (see Chapters 28 and 62). Drug-using patients should be informed of available resources for sterile injection equipment.

The draft update of this chapter was prepared for the U.S. Preventive Services Task Force by David Atkins, MD, MPH.
