Topic Title
- Comparative Effectiveness of Diagnosis and Treatment of Obstructive Sleep Apnea in Adults
Executive Summary – Aug. 8, 2011
Diagnosis and Treatment of Obstructive Sleep Apnea in Adults
Background
Obstructive sleep apnea (OSA) is a relatively common disorder in the United States that affects people of all ages, but is most prevalent among the middle-aged and elderly. Affected individuals experience repeated collapse and obstruction of the upper airway during sleep, which results in reduced airflow (hypopnea) or complete airflow cessation (apnea), oxygen desaturation, and arousals from sleep. Adverse clinical outcomes associated with OSA include: cardiovascular disease, hypertension, non-insulin-dependent diabetes, and increased likelihood of motor vehicle and other accidents due to daytime hypersomnolence. Studies estimate the prevalence of OSA at approximately 10 to 20 percent of middle-aged and older adults. Evidence also indicates that these rates are rising, likely due to increasing rates of obesity.
Based on the considerable mortality and morbidity associated with it and its attendant comorbidities, OSA is an important public health issue. Complicating diagnosis and treatment, however, is the great degree of clinical uncertainty that exists regarding the condition, due in large part to inconsistencies in its definition. Ongoing debate surrounds what type and level of respiratory abnormality should be used to define the disorder as well as what is the most appropriate diagnostic method for its detection. In addition, there is no current established threshold level for the apnea-hypopnea index (AHI) that would indicate the need for treatment. By consensus, people with relatively few apnea or hypopnea events per hour (often <5 or <15) are not formally diagnosed with OSA. Also of concern are the high rates of perioperative and postoperative complications among OSA patients, as are the numbers of asymptomatic and symptomatic individuals who remain undiagnosed and untreated.
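To make the threshold issue concrete, the following is a minimal sketch, assuming the usual reading of the AHI as apnea plus hypopnea events per hour of sleep. The function names and the patient data are hypothetical and are not drawn from the review; the cutoffs of 5 and 15 events/hr are the two mentioned above.

```python
def apnea_hypopnea_index(apneas: int, hypopneas: int, sleep_hours: float) -> float:
    """Return apnea plus hypopnea events per hour of sleep."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    return (apneas + hypopneas) / sleep_hours


def meets_threshold(ahi: float, threshold: float) -> bool:
    """True when the AHI reaches the chosen diagnostic cutoff (events/hr)."""
    return ahi >= threshold


# Hypothetical night: 40 apneas and 50 hypopneas over 7.5 hours of sleep.
ahi = apnea_hypopnea_index(40, 50, 7.5)
for cutoff in (5, 15):
    # The same patient is classified differently depending on the cutoff used.
    print(f"AHI {ahi:.1f} events/hr, cutoff {cutoff}: {meets_threshold(ahi, cutoff)}")
```

In this illustration the same recording counts as OSA under a cutoff of 5 events/hr but not under a cutoff of 15 events/hr, which is the kind of definitional inconsistency described above.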
Three main categories of outcomes of interest in comparative effectiveness research are clinical (or health) outcomes (i.e., events or conditions that the patient can feel, such as disability or quality of life or death), intermediate or surrogate outcomes (such as laboratory measurements), and adverse events. Objective clinical outcomes relevant to patients with OSA include comorbidities found to be associated with untreated sleep apnea, primarily cardiovascular disease (including congestive heart failure, hypertension, stroke, and myocardial infarction) and non-insulin-dependent diabetes. In addition, mortality due to cardiovascular disease, diabetes, motor vehicle accidents, and other causes represents an important adverse outcome of OSA. Intermediate outcomes of interest in the management of patients with OSA include sleep study measures (e.g., AHI), blood pressure (an intermediate outcome for cardiovascular disease), and hemoglobin A1c (a measure of control of diabetes mellitus).
All interventions have the potential for adverse events. Therefore, it is important to gather information on both the benefits and harms of interventions in order to fully assess the net comparative benefits. Compliance with continuous positive airway pressure (CPAP) and other devices is an important issue related to the effective treatment of OSA. Interventions that have better compliance or that may improve compliance are clearly of interest. Also of relevance is establishing definitive diagnostic standards and measures that would more clearly identify OSA patients, both symptomatic and asymptomatic. Such standards would serve to reduce OSA-related morbidities as well as related health care costs. Studies have found that prior to diagnosis, OSA patients have higher rates of health care use, more frequent and longer hospital stays, and greater health care costs than after diagnosis. Therefore, this review is of additional interest to the requesting organizations and broadly for the identification of diagnostic tests that would contribute to the early and definitive diagnosis of patients with OSA.
Objectives
In response to several nominations received through the Effective Health Care Web site, which were evaluated and found to meet program criteria, the Agency for Healthcare Research and Quality (AHRQ) requested that the Tufts Evidence-based Practice Center (Tufts EPC) conduct a Comparative Effectiveness Review (CER) of studies regarding the diagnosis and treatment of OSA. Key Questions that are clinically relevant for the diagnosis and treatment of OSA were developed with input from domain experts and other stakeholders and from comments received in response to public review. Seven Key Questions are addressed in this report. Three pertain to diagnosis of and screening for OSA (Key Questions 1–3), two address the comparative effectiveness of treatments (Key Questions 5 and 7), and two address associations between baseline patient characteristics and long-term outcomes and treatment compliance (Key Questions 4 and 6).
Key Questions
Diagnosis
- How do different available tests compare in their ability to diagnose sleep apnea in adults with symptoms suggestive of disordered sleep? How do these tests compare in different subgroups of patients, based on: race, sex, body mass index, existing non-insulin-dependent diabetes mellitus, existing cardiovascular disease, existing hypertension, clinical symptoms, previous stroke, or airway characteristics?
- How does phased testing (screening tests or battery followed by full test) compare to full testing alone?
- What is the effect of preoperative screening for sleep apnea on surgical outcomes?
- In adults being screened for obstructive sleep apnea, what are the relationships between apnea-hypopnea index or oxygen desaturation index and other patient characteristics with respect to long-term clinical and functional outcomes?
Treatment
- What is the comparative effect of different treatments for obstructive sleep apnea in adults?
- Does the comparative effect of treatments vary based on presenting patient characteristics, severity of obstructive sleep apnea, or other pretreatment factors? Are any of these characteristics or factors predictive of treatment success?
- Characteristics: Age, sex, race, weight, bed partner, airway, other physical characteristics, and specific comorbidities
- Obstructive sleep apnea severity or characteristics: Baseline questionnaire (and similar tools) results, formal testing results (including hypoxemia levels), baseline quality of life, positional dependency
- Other: Specific symptoms
- Does the comparative effect of treatments vary based on the definitions of obstructive sleep apnea used by study investigators?
- In obstructive sleep apnea patients prescribed nonsurgical treatments, what are the associations of pretreatment patient-level characteristics with treatment compliance?
- What is the effect of interventions to improve compliance with device use (positive airway pressure, oral appliances, positional therapy) on clinical and intermediate outcomes?
Analytic Framework
To guide the development of the Key Questions for the diagnosis and treatment of OSA, we developed an analytic framework (Figure A) that maps the specific linkages associating the populations and subgroups of interest, the interventions (for both diagnosis and treatment), and outcomes of interest (intermediate outcomes, health-related outcomes, compliance, and adverse effects). Specifically, this analytic framework depicts the chain of logic that evidence must support to link the interventions to improved health outcomes.
Figure A. Analytic framework for the diagnosis and treatment of obstructive sleep apnea in adults
CVD, cardiovascular disease; KQ, Key Question; NIDDM, non-insulin-dependent diabetes mellitus; QoL, quality of life.
Methods
Input from Stakeholders
During a topic refinement phase, the initial questions were refined with input from a panel of Key Informants. The Key Informants included experts in sleep medicine, general internal medicine, and psychiatry; a representative from Oregon Division of Medical Assistance programs; a person with OSA; a representative of a sleep apnea advocacy group; and the AHRQ Task Order Officer.
After a public review of the proposed Key Questions, the clinical experts from among the Key Informants were reconvened to form the Technical Expert Panel, which served to provide clinical and methodological expertise and input to help refine Key Questions, identify important issues, and define parameters for the review of evidence, including study eligibility criteria.
Data Sources and Selection
We conducted literature searches of studies in MEDLINE® (inception–September 2010) and the Cochrane Central Register of Controlled Trials (through 3rd quarter 2010). All English-language studies with adult human subjects were screened to identify articles relevant to each Key Question. The search strategy included terms for OSA, sleep apnea diagnostic tests, sleep apnea treatments, and relevant research designs.
The reference lists of related systematic reviews and selected narrative reviews and primary articles were also reviewed, and relevant articles were screened. After screening of the abstracts, full-text articles were retrieved for all potentially relevant articles and rescreened for eligibility.
Data Extraction and Quality Assessment
Study data were extracted into customized forms. Together with information on study design, patient and intervention characteristics, outcome definitions, and study results, the methodological quality of each study was rated from A (highest quality, least likely to have significant bias) to C (lowest quality, most likely to have significant bias).
Data Synthesis and Analysis
For all Key Questions or specific comparisons of interventions with at least two studies, summary tables present the study and baseline patient characteristics, the study quality, and the relevant study results. For each comparison, separate tables include all the studies that reported specific outcomes. For Key Question 1 (diagnosis), we graphically display the Bland-Altman limits of agreement and the sensitivity and specificity of studies comparing portable monitors to polysomnography (PSG). For Key Question 5 (treatment), when there were three or more similar studies evaluating the same outcome, we performed random effects model meta-analyses of the following: the sleep study measures AHI, arousal index, and minimum oxygen saturation; the standard measure of sleepiness, the Epworth Sleepiness Scale (ESS); the quality-of-life measure Functional Outcomes Sleep Questionnaire (FOSQ); and compliance. We performed subgroup meta-analyses based on study design (parallel or crossover), minimum AHI threshold to diagnose OSA, specific intervention (when appropriate), and other factors. Of note, where interventions (either diagnostic tests or treatments) are not discussed, this does not imply that the interventions were excluded from analysis (unless explicitly stated); instead, no studies of these interventions met eligibility criteria.
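For readers unfamiliar with random effects meta-analysis, the sketch below shows one common way such a pooled estimate can be computed. It assumes the DerSimonian-Laird estimator of between-study variance; this summary does not state which estimator was used, so that choice, the function name, and the example numbers are all illustrative assumptions.

```python
import math


def random_effects_meta(effects, variances):
    """Pool study effects with a DerSimonian-Laird random effects model.

    Returns (pooled estimate, standard error, between-study variance tau^2).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need matching lists covering at least two studies")
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures how much studies disagree beyond sampling error.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    # Random effects weights add tau^2 to each study's own variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, se, tau2


# Hypothetical mean differences in ESS from three trials (not real data).
pooled, se, tau2 = random_effects_meta([-2.0, -3.0, -1.0], [0.25, 0.25, 0.25])
low, high = pooled - 1.96 * se, pooled + 1.96 * se
print(f"pooled = {pooled:.2f}, 95% CI {low:.2f} to {high:.2f}, tau^2 = {tau2:.2f}")
```

When the studies disagree more than chance would predict, the estimated between-study variance is positive and the confidence interval widens relative to a fixed-effect analysis.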
As per the AHRQ updated methods guide series, we assessed the evidence for each question (or comparison of interventions) based on the risk of bias, study consistency, directness of the evidence, and degree of certainty of the findings. Based on these factors, we graded the overall strength of evidence as high, moderate, low, or insufficient.
When there were substantial differences in conclusions for different outcomes within the same comparison, we also described the evidence supporting each outcome as sufficient, fair, weak, limited, or no evidence.
Results
Key Question 1. How do different available tests compare in their ability to diagnose sleep apnea in adults with symptoms suggestive of disordered sleep? How do these tests compare in different subgroups of patients based on: race, sex, body mass index, existing non-insulin-dependent diabetes mellitus, existing cardiovascular disease, existing hypertension, clinical symptoms, previous stroke, or airway characteristics?
Comparison of Portable Devices and Polysomnography
PSG devices are classified as Type I monitors. Portable monitors are classified as either Type II, which record all the same information as PSG; Type III, which do not differentiate between whether the patient is asleep or awake, but have at least two respiratory channels (two airflow channels or one airflow and one effort channel); or Type IV, which fail to fulfill criteria for Type III monitors but usually record more than two bioparameters.
The strength of evidence is moderate, among 15 quality A, 45 quality B, and 39 quality C studies, that Type III and Type IV monitors may have the ability to accurately predict AHI suggestive of OSA with high positive likelihood ratios and low negative likelihood ratios for various AHI cutoffs in PSG. Type III monitors perform better than Type IV monitors at AHI cutoffs of 5, 10, and 15 events/hr. Analyses of difference-versus-average plots suggest that substantial differences in the measured AHI may be encountered between PSG and both Type III and Type IV monitors. Large differences compared with in-laboratory PSG cannot be excluded for all portable monitors. The evidence is insufficient to adequately compare specific monitors to each other.
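The agreement between a portable monitor and PSG is typically summarized with Bland-Altman limits of agreement, which are derived from the paired differences in AHI. The sketch below is illustrative only: it assumes the conventional limits of mean difference plus or minus 1.96 sample standard deviations, and the function name and paired readings are hypothetical.

```python
import statistics


def limits_of_agreement(reference, portable):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Bias is the mean of (portable - reference); the limits are the bias
    plus or minus 1.96 sample standard deviations of the differences.
    """
    if len(reference) != len(portable) or len(reference) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [p - r for r, p in zip(reference, portable)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired AHI readings (events/hr) from four patients.
psg_ahi = [10, 20, 30, 40]
monitor_ahi = [8, 22, 27, 41]
bias, lower, upper = limits_of_agreement(psg_ahi, monitor_ahi)
print(f"bias = {bias:.2f}, limits of agreement = {lower:.2f} to {upper:.2f}")
```

A small bias with wide limits corresponds to the pattern described above: a monitor can be accurate on average while still differing substantially from PSG for individual patients.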
No recent studies compared Type II monitors with PSG. A prior Technology Assessment of home diagnosis of OSA concluded that “based on [three quality B studies], type II monitors [used at home] may identify AHI suggestive of OSA with high positive likelihood ratios and low negative likelihood ratios,” though “substantial differences in the [measurement of] AHI may be encountered between type II monitors and facility-based PSG.”
Comparison of Questionnaires and Polysomnography
Of the six studies reviewed (one quality A, one quality B, four quality C), the strength of evidence is low among three studies supporting the use of the Berlin questionnaire in screening for sleep apnea because of the likely selection biases. The strength of evidence is insufficient to draw definitive conclusions concerning the use of the STOP, STOP-Bang, ASA Checklist, Epworth Sleepiness Scale, and Hawaii Sleep questionnaires to screen for sleep apnea because each questionnaire was assessed in only a single study.
Clinical Prediction Rules and Polysomnography
The strength of evidence is low among seven studies (three quality A, three quality B, and one quality C) that some clinical prediction rules may be useful in the prediction of a diagnosis of OSA. Ten different clinical prediction rules have been described. Nine clinical prediction rules have been used for the prediction of a diagnosis of OSA (using different criteria). The oropharyngeal morphometric model gave near perfect discrimination (area under the curve [AUC] = 0.996) to predict the diagnosis of OSA, and the pulmonary function data model had 100 percent sensitivity with 84 percent specificity to predict diagnosis of OSA. The remaining models reported lower diagnostic sensitivities and specificities. Each model was deemed useful to predict the diagnoses of OSA by the individual study authors. However, while all the models were internally validated, external validation of these predictive rules has not been conducted in the vast majority of the studies.
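Sensitivity and specificity such as those reported for the pulmonary function data model can be converted to positive and negative likelihood ratios with standard formulas. The sketch below is a worked illustration under that assumption; the function name is hypothetical, and only the 100 percent sensitivity and 84 percent specificity come from the text above.

```python
import math


def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (LR+, LR-) from sensitivity and specificity given as fractions.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    A zero denominator is reported as infinity for simplicity.
    """
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity <= 1.0):
        raise ValueError("sensitivity and specificity must lie in [0, 1]")
    lr_pos = sensitivity / (1.0 - specificity) if specificity < 1.0 else math.inf
    lr_neg = (1.0 - sensitivity) / specificity if specificity > 0.0 else math.inf
    return lr_pos, lr_neg


# Pulmonary function data model: 100 percent sensitivity, 84 percent specificity.
lr_pos, lr_neg = likelihood_ratios(1.00, 0.84)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

With these inputs the positive likelihood ratio is about 6.25 and the negative likelihood ratio is 0, although such figures from a single internally validated model may not hold up under external validation.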
Key Question 2. How does phased testing (screening tests or battery followed by full test) compare to full testing alone?
The strength of evidence is insufficient to determine the utility of phased testing, followed by full testing when indicated, to diagnose sleep apnea, as only one study that met our inclusion criteria investigated this question. This prospective quality C study did not fully analyze the phased testing, thus the sensitivity and specificity of the phased strategy could not be calculated due to a verification bias; not all participants received PSG (full) testing.
Key Question 3. What is the effect of preoperative screening for sleep apnea on surgical outcomes?
The strength of evidence is insufficient regarding postoperative outcomes with mandatory screening for sleep apnea. Two quality C prospective studies assessed the effect of preoperative screening for sleep apnea on surgical outcomes. One study found no significant differences in outcomes between patients undergoing bariatric surgery who had mandatory PSG or PSG based on clinical parameters. The second study found that general surgery patients willing to undergo preoperative PSG were more likely to have perioperative complications, particularly cardiopulmonary complications, possibly suggesting that patients willing to undergo PSG are more ill than other patients.
Key Question 4. In adults being screened for obstructive sleep apnea, what are the relationships between apnea-hypopnea index or oxygen desaturation index, and other patient characteristics with respect to long-term clinical and functional outcomes?
The strength of evidence is high from four studies (three quality A, one quality B) indicating that an AHI >30 events/hr is an independent predictor of all-cause mortality, although one study found that this was true only in men under age 70. All other outcomes were analyzed by only one or two studies. Thus, only a low strength of evidence exists that a high AHI (>30 events/hr) is associated with incident diabetes. This association, however, may be confounded by obesity, which may result in both OSA and diabetes. The strength of evidence is insufficient regarding the association between AHI and other clinical outcomes. The two studies of cardiovascular mortality did not have consistent findings, and the two studies of hypertension had unclear conclusions. One study of nonfatal cardiovascular disease found a significant association with baseline AHI (as it did for cardiovascular mortality). One study each found no association between AHI and stroke or long-term quality of life.
Key Question 5. What is the comparative effect of different treatments for obstructive sleep apnea in adults?
- Does the comparative effect of treatments vary based on presenting patient characteristics, severity of obstructive sleep apnea, or other pretreatment factors? Are any of these characteristics or factors predictive of treatment success?
- Characteristics: age, sex, race, weight, bed partner, airway, other physical characteristics, and specific comorbidities
- Obstructive sleep apnea severity or characteristics: baseline questionnaire (and similar tools) results, formal testing results (including hypoxemia levels), baseline quality of life, positional dependency
- Other: specific symptoms
- Does the comparative effect of treatments vary based on the definitions of obstructive sleep apnea used by study investigators?
With some exceptions for studies of surgical interventions, we reviewed only randomized controlled trials (RCT) of interventions used specifically for the treatment of obstructive sleep apnea (OSA).
Comparison of Continuous Positive Airway Pressure and Control
There are 22 trials (11 each of quality B and C) that provide sufficient evidence supporting large improvements in sleep measures with continuous positive airway pressure (CPAP) compared with control. The evidence regarding quality of life, neurocognitive measures, and other intermediate outcomes is weak and demonstrated no consistent benefit. Despite no evidence or weak evidence for an effect of CPAP on clinical outcomes, given the large magnitude of effect on the intermediate outcomes AHI and ESS, the strength of evidence that CPAP is an effective treatment to alleviate sleep apnea signs and symptoms was rated moderate.
Comparison of CPAP and Sham CPAP
There are 24 trials (5 quality A, 13 quality B, 6 quality C) that provide sufficient evidence supporting large improvements in sleep measures with CPAP compared with sham CPAP, but weak evidence of possibly no difference between CPAP and sham CPAP in improving quality of life, neurocognitive measures, or other intermediate outcomes. Despite no evidence or weak evidence for an effect of CPAP on clinical outcomes, given the large magnitude of effect on the intermediate outcomes of AHI, ESS, and arousal index, the strength of evidence that CPAP is an effective treatment for the relief of signs and symptoms of sleep apnea was rated moderate.
Comparison of Oral and Nasal CPAP
Three small trials (one quality B, two quality C) with inconsistent results preclude any substantive conclusions concerning the efficacy of oral (or full face mask) versus nasal CPAP in improving compliance in patients with OSA. Largely due to small sample size, the reported effect estimates in the studies reviewed were generally imprecise. Thus, overall, the strength of evidence is insufficient regarding differences in compliance or other outcomes between oral and nasal CPAP.
Comparison of Autotitrating CPAP and Fixed CPAP
The strength of evidence is moderate that autotitrating CPAP (autoCPAP) and fixed pressure CPAP result in similar levels of compliance (hours used per night) and treatment effects for patients with OSA. Twenty-one studies (1 quality A, 10 quality B, 10 quality C) comprising an experimental population of over 800 patients provided evidence that autoCPAP reduces sleepiness as measured by ESS by approximately 0.5 points more than fixed CPAP. The two devices were found to result in similar compliance and changes in AHI from baseline, quality of life, and most other sleep study measures. However, there is also evidence that minimum oxygen saturation improves more with fixed CPAP than with autoCPAP, although by only about one percent. Evidence is limited regarding the relative effect of fixed CPAP and autoCPAP on blood pressure. There were no data on objective clinical outcomes.
Comparison of Bilevel CPAP and Fixed CPAP
The strength of evidence is insufficient regarding any difference in compliance or other outcomes between bilevel CPAP and fixed CPAP. Five small, highly clinically heterogeneous trials (one quality B, four quality C) with largely null findings did not support any substantive differences in the efficacy of bilevel CPAP versus fixed CPAP in the treatment of patients with OSA. Largely due to small sample sizes, the studies mostly had imprecise estimates of the comparative effects.
Comparison of Flexible Bilevel CPAP and Fixed CPAP
The strength of evidence is insufficient regarding the relative merits of flexible bilevel CPAP and fixed CPAP as there was only one quality B study that investigated this comparison. This study found that flexible bilevel CPAP may yield increased compliance (use ≥4 hr/night) compared with fixed CPAP.
Comparison of C-Flex™ and Fixed CPAP
No statistically significant differences in compliance or other outcomes were found between C-Flex and fixed CPAP. The strength of evidence is low for this finding because of the mixed quality (Bs and Cs) of the four primary studies.
Comparison of Humidification in CPAP
The strength of evidence is insufficient to determine whether there is a difference in compliance or other outcomes between positive airway pressure treatment with and without humidification. Five trials examined different aspects of humidified CPAP treatment for patients with OSA. While some studies reported a benefit of added humidity in CPAP treatment in improving patient compliance, this effect was not consistent across all the studies. Overall, the studies were clinically heterogeneous, small, and of quality B (three studies) or C (two studies).
Comparison of Mandibular Advancement Devices and No Treatment or Inactive Oral Devices
The strength of evidence is moderate to show that the use of mandibular advancement devices (MAD) improves sleep apnea signs and symptoms. Five trials (four quality B, one quality C) compared MAD with no treatment, using a variety of different types of MAD, and found significant improvements with MAD in AHI, ESS, and other sleep study measures. Any differences in quality of life measures or neurocognitive tests were equivocal between treatment groups. No trial evaluated objective clinical outcomes. Another five trials (four quality B, one quality C) compared the effects of MAD with inactive oral devices and reported similar findings.
Comparison of Different Oral Devices
The strength of evidence is insufficient to draw conclusions with regard to the relative efficacy of different types of oral MAD in patients with OSA because the reviewed studies were generally small, and each was concerned with a unique comparison. Five studies (four quality B, one quality C) with unique comparisons found little to no differences between different types and methods of use of MAD or other oral devices in sleep study or sleepiness measures. No study evaluated objective clinical outcomes. Only one study evaluated compliance; no significant differences were observed. One trial found that a greater degree of mandibular advancement resulted in an increased number of patients achieving an AHI <10 events/hr; however, the mean AHI was similar between treatment groups.
Comparison of Mandibular Advancement Devices and CPAP
The strength of evidence is moderate that CPAP is superior to MAD in improving sleep study measures. Ten mostly quality B trials overall found that CPAP resulted in greater reductions in AHI and arousal index, and greater increases in minimum oxygen saturation. The evidence regarding the relative effects on ESS was too heterogeneous to allow conclusions. In a single study, patients were more compliant with MAD than CPAP (hours used per night and nights used). No study evaluated objective clinical outcomes. The strength of evidence is insufficient to address which patients might benefit most from either treatment.
Comparison of Surgery and Control
The strength of evidence is insufficient to evaluate the relative efficacy of surgical interventions for the treatment of OSA. Six trials and one nonrandomized prospective study with unique interventions compared surgery with control treatment for the management of patients with OSA. Three studies were rated quality A, one quality B, and three quality C. The results were inconsistent across studies as to which outcomes were improved with surgery compared with no or sham surgery.
Comparison of Surgery and CPAP
The strength of evidence is insufficient to determine the relative merits of surgical treatments versus CPAP. Of 12 studies (1 quality A, 11 quality C) comparing surgical modalities with CPAP, only two were RCTs, and they compared CPAP with uvulopalatopharyngoplasty (UPPP), removal of the soft tissue at the back of the throat, the uvula, and soft palate. While one of these trials found that CPAP resulted in a higher mortality benefit, the other found no difference between groups. Due to the heterogeneity of interventions and outcomes examined, the variability of findings across studies, and the inherent bias of all but one study regarding which patients received surgery, it is not possible at this time to draw useful conclusions comparing surgical interventions with CPAP in the treatment of patients with OSA. The quality A trial was the only unbiased comparison of surgery and CPAP (patients had previously received neither treatment). It did not find statistically significant differences in ESS and quality of life measures between patients with mild to moderate OSA who had temperature-controlled radiofrequency tissue volume reduction of the soft palate and those who had CPAP at 2 months followup. Likewise, the other trial, comparing maxillomandibular advancement osteotomy and CPAP, did not find statistically significant differences in AHI and ESS in patients with severe OSA. For the nonrandomized studies, comparisons between surgery and CPAP are difficult to interpret since baseline patient characteristics (including sleep apnea severity) differed significantly between groups, particularly with regard to the previous treatments patients had received. The reported findings on sleep study and quality of life measures were heterogeneous across studies.
Comparison of Surgery and Mandibular Advancement Devices
The strength of evidence is insufficient regarding the relative merit of MAD versus surgery in the treatment of OSA, as there was only one study (quality B) that examined this question. A statistically significant improvement in AHI was observed in the MAD group compared with the surgery group. No study evaluated objective clinical outcomes.
Comparison of Other Treatments
The strength of evidence is low to show that some intensive weight loss programs may be effective treatment for OSA in obese patients. Three trials (one quality A, two quality B) compared weight loss interventions with control interventions. All three trials found significant relative reductions in AHI with diet. Other outcomes were inconsistent.
The strength of evidence is insufficient to determine the effects of other potential treatments for OSA. Twenty-one studies evaluated other interventions including atrial overdrive pacing, eight different drugs, palatal implants, oropharyngeal exercises, a tongue-retaining device, a positional alarm, combination tongue-retaining device and positional alarm, bariatric surgery, nasal dilator strips, acupuncture, and auricular plaster. All of these interventions were evaluated by one or two studies only. The findings were heterogeneous. No study evaluated objective clinical outcomes.
Key Question 6. In OSA patients prescribed nonsurgical treatments, what are the associations of pretreatment patient-level characteristics with treatment compliance?
Across five studies (one quality A, one quality B, three quality C), the strength of evidence is moderate that more severe OSA as measured by higher AHI is associated with greater compliance with CPAP use. Each study measured compliance differently, including thresholds of 1, 2, or 3 hours of use per night or as a continuous variable, and undefined “objective compliance” measured by the device. The strength of evidence is moderate that a higher ESS score is also associated with improved compliance. The strength of evidence is low that younger age, snoring, lower CPAP pressure, higher BMI, higher mean oxygen saturation, and the sleepiness domain on the Grenoble Sleep Apnea Quality of Life test are each possible independent predictors of compliance. It is important to note, however, that selective reporting, particularly nonreporting of nonsignificant associations, cannot be ruled out. The heterogeneity of analyzed and reported potential predictors greatly limits these conclusions. Differences across studies as to which variables were independent predictors may be due to the adjustment for different variables, in addition to differences in populations, outcomes, CPAP machines, and CPAP training and followup. One quality C study of mandibular advancement devices failed to identify potential predictors of compliance.
Key Question 7. What is the effect of interventions to improve compliance with device (positive airway pressure, oral appliances, positional therapy) use on clinical and intermediate outcomes?
The strength of evidence is low that some specific adjunct interventions may improve CPAP compliance, but studies are heterogeneous and no general type of intervention (e.g., education, telemonitoring) was more promising than others. The 18 trials (two quality A, eight quality B, and eight quality C) had inconsistent effects across a wide variety of interventions. Studies generally had small sample sizes with less than 1 year of followup. Compared with usual care, several interventions were shown to significantly increase hours of CPAP use per night in some studies. These included intensive support or literature (designed for patient education), cognitive behavioral therapy (given to patients and their partners), telemonitoring, and a habit-promoting audio-based intervention. However, the majority of studies did not find a significant difference in CPAP compliance between patients who received interventions to promote compliance with device use and those who received usual care. No study of nurse-led care (which was not focused primarily on compliance) showed an effect on compliance rates.
Discussion
The findings of the systematic review have been summarized in Table A. Interventions (either diagnostic tests or treatments) that are not discussed lack studies meeting eligibility criteria. Interventions were not excluded from analysis unless explicitly stated as such.
Diagnosis
In theory, obstructive sleep apnea (OSA) is relatively simple to diagnose. However, PSG, the standard diagnostic test, is inconvenient, resource-intensive, and may not be representative of a typical night’s sleep (particularly the first night the test is given). Furthermore, there are variations across laboratories in the definitions of OSA (using different thresholds of AHI, from 5 to 15 events/hr) and in the way that the PSG results are read and interpreted. Moreover, AHI, which is used as the single metric to define OSA, can vary from night to night and does not take into account symptoms, comorbidities, or response to treatment.
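The effect of differing laboratory thresholds can be shown with a short sketch. It assumes the standard definition of AHI as the number of apnea and hypopnea events per hour of sleep and treats an AHI at or above the chosen cutoff as meeting the definition of OSA. The event counts and function names are hypothetical illustrations, not values from the review.

```python
def apnea_hypopnea_index(apneas: int, hypopneas: int, sleep_hours: float) -> float:
    """Return apnea plus hypopnea events per hour of sleep."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    return (apneas + hypopneas) / sleep_hours


def meets_osa_definition(ahi: float, threshold: float) -> bool:
    """Return True if the AHI is at or above the laboratory's cutoff."""
    return ahi >= threshold


# Hypothetical night: 30 apneas and 50 hypopneas over 8 hours of sleep.
ahi = apnea_hypopnea_index(apneas=30, hypopneas=50, sleep_hours=8.0)

# The same recording judged against cutoffs of 5 and 15 events/hr.
results = {t: meets_osa_definition(ahi, t) for t in (5, 15)}
print(f"AHI = {ahi:.1f} events/hr")
for threshold, positive in results.items():
    print(f"threshold {threshold} events/hr: OSA = {positive}")
```

In this example the same recording (AHI of 10.0 events/hr) meets the definition at a cutoff of 5 events/hr but not at 15 events/hr.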
Two approaches have been taken to reduce the resources involved in diagnosing OSA: screening tests (questionnaires and clinical prediction rules) and portable monitors used instead of sleep-laboratory PSG. Five questionnaires and 10 validated clinical prediction rules have been compared with PSG. However, very few of the screening tests have been evaluated by more than one set of researchers, and few have been directly compared with each other. Thus, the strength of evidence is low that the Berlin questionnaire is accurate in its ability to screen for OSA; the commonly used STOP and STOP-Bang questionnaires have not been adequately tested. For such tests to be of clinical value, apart from having very high sensitivity and specificity, they should be easy to administer and require only information from symptoms and signs easily obtainable during a physical examination. The evaluated clinical prediction models were all internally validated, but definitive conclusions on the external validity (i.e., generalizability) of these predictive rules in independent populations cannot be drawn from the available literature. The strength of evidence is low that some clinical prediction rules may be useful in the prediction of a diagnosis of OSA. No study examined the potential clinical utility of applying the questionnaires or prediction rules to clinical practice.
Numerous portable monitors (evaluated in 99 studies) have been developed for use in nonlaboratory settings; these use fewer “channels” (specific physiologic measures) than typical 16-channel PSG. The more recent studies do not substantially change the conclusions from the Tufts Evidence-based Practice Center’s (Tufts EPC) 2007 Technology Assessment on Home Diagnosis of Obstructive Sleep Apnea-Hypopnea Syndrome. Although most of the tested portable monitors fairly accurately predict OSA, it is unclear whether any of these monitors can replace laboratory-based PSG. The evidence suggests that the measured AHI from portable monitors is variable compared with PSG-derived AHI, but the source of this variability is unclear. So far, no studies have evaluated the predictive ability for clinical outcomes or response to treatment by portable monitors. Furthermore, no available studies have evaluated the impact of patient triage via screening tests and/or portable monitors.
The value of preoperative screening for OSA remains poorly defined. The only study that directly addressed this question was a retrospective study of patients undergoing bariatric surgery. It showed better perioperative outcomes from routine PSG. There are also no adequate studies that compared phased testing (simple tests followed by more intensive tests in selected patients) with full evaluation (by PSG).
Apnea-Hypopnea Index as a Predictor of Clinical Outcomes
The strength of evidence is high that a high baseline AHI (>30 events/hr or range) is a strong and independent predictor of all-cause mortality over several years of followup, with the association being strongest among people with severe OSA (AHI >30 events/hr). However, the strength of evidence for the association between baseline AHI and other long-term clinical outcomes is generally insufficient, and thus the association between reductions in AHI by OSA treatment and improvements in long-term outcomes remains theoretical.
Treatment
The strength of evidence is moderate that fixed CPAP is an effective treatment to minimize AHI and improve sleepiness symptoms, as supported by more than 40 trials of patients treated with CPAP or no treatment. However, no trial reported long-term clinical outcomes, and compliance with CPAP treatment is poor. Because patients frequently do not tolerate CPAP, many alternative treatments have been proposed. First, several alternative CPAP machines have been designed to vary the pressure during the patient’s inspiratory cycle or to titrate the pressure to a minimum necessary level. Other modifications include different masks, nasal pads, and added humidification. The large majority of relevant trials have compared autotitrating CPAP (autoCPAP) with fixed CPAP, and the strength of evidence of no clinical differences between them is moderate. The strength of evidence is insufficient for other device comparisons. Overall, the evidence does not support the use of one device for all patients; such decisions should be individualized.
The second therapeutic alternative to CPAP is the use of oral devices, which have been designed with the goal of splinting open the oropharynx to prevent obstruction. The most commonly tested are the mandibular advancement devices (MAD), for which the strength of evidence for their efficacy in sleep outcomes is moderate. Based on direct and indirect comparisons, CPAP appeared to be more effective than MAD. However, given the issues with noncompliance with CPAP, the decision as to whether to use CPAP or MAD will likely depend on patient preference.
The third major alternative for OSA treatment is surgical intervention to alleviate airway obstruction. Given the very few randomized trials and the differences in the populations that choose to undergo surgery versus conservative treatment, the strength of evidence is insufficient to determine the relative value of surgery to no treatment, to CPAP, to MAD, or to alternative types of surgery. Additional interventions were also evaluated in randomized trials (including weight loss programs, atrial overdrive pacing, eight different drugs, and other interventions), but in general the strength of evidence is insufficient to determine the effects of these potential treatments.
For all the treatment comparisons, it is important to identify which subgroups of patients may benefit most from specific treatments. Unfortunately, the trials are nearly silent on this issue. Very few trials reported subgroup analyses based on baseline characteristics, and for most comparisons there were too few studies or the interventions examined were too heterogeneous to analyze potential differences. Such analyses were feasible for the comparison between CPAP and control, where subgroup meta-analyses based on definitions of OSA (different minimum AHI thresholds) failed to demonstrate any difference in effectiveness of CPAP in reducing AHI or ESS. Though statistical heterogeneity existed across the trials, this was primarily attributed to study design factors that have no clinical implications. Despite statistical heterogeneity, and based on the consistency of findings that support CPAP as effective to minimize AHI in all patients with OSA, it is reasonable to conclude that the relative effectiveness in different populations is a moot point. The one exception to this may be patients with mild OSA (with AHI <15 events/hr), since people with low AHI cannot have as large an improvement in their AHI as people with severe OSA. Notably, across interventions there is little evidence supporting the hypothesis that any OSA treatment improves quality of life or neurocognitive function.
The strength of evidence is low regarding the effect of interventions to improve CPAP compliance. The studies were very heterogeneous, with each evaluating a different intervention. Higher baseline AHI and increased sleepiness as measured by the Epworth Sleepiness Scale are both predictors of improved compliance with CPAP (both with moderate strength of evidence). The unsurprising interpretation of this finding is that patients with more severe symptoms are more likely to accept the discomfort or inconvenience of using CPAP overnight.
Limitations
The most important limitations in the evidence were the lack of trials that evaluated long-term clinical outcomes, the sparseness of evidence to address several Key Questions, and the fact that almost no study of diagnostic tests or treatments attempted to assess how results may vary in different subgroups of patients. In general, the intervention trials were of quality B or C, with few quality A studies. Followup durations tended to be very short, and study dropout rates were frequently very high. Other frequent methodological problems with studies included incomplete reporting and/or inadequate analyses, which required estimations of pertinent results by the authors of this systematic review. The heavy reliance on industry support for trials of devices may lead to the concern of publication bias. However, this concern may be reduced since most of our conclusions were that the strength of evidence is either low or insufficient for interventions. Furthermore, the effects of CPAP and MAD on sleep measures are sufficiently large that conclusions about the effectiveness of these devices would be unlikely to change with the addition of unpublished trials.
Implications for Future Research
General Recommendation
- The recurrent problem of high dropout rates as evidenced in the literature we reviewed bears further investigation and is crucial for the conduct of future trials. It is important to understand whether this is a problem peculiar to this field, whether patients’ symptoms interfere with their desire to fulfill their obligations as research participants, whether patients are not well informed about the serious consequences of sleep apnea and therefore are less motivated to comply with followup, or whether the treatments are so onerous that patients are refusing to continue with them.
Diagnostic Tests
- The most clinically useful evaluation of prediction rules and questionnaires (to screen for or diagnose OSA) would be trials to evaluate whether use of the tests improves clinical outcomes. Individual patient-data meta-analysis of measurements with portable monitors would provide insights on the diagnostic information contributed by different neurophysiologic signals. Future studies of the accuracy or bias of diagnostic tests should focus more on head-to-head comparisons of portable monitors, questionnaires, and prediction rules, to determine the optimal tool for use in a primary care setting to maximize initial evaluation of OSA and triage high-risk patients for prompt PSG. Direct comparisons among existing alternatives to PSG are more important than the current focus on developing new diagnostic tests.
- Trials are needed comparing potential phased testing strategies with direct PSG or addressing the value of preoperative screening for OSA. Studies of appropriate tests for patients, based on the type or severity of their symptoms, would be useful.
Treatments
- Only 3 of the 190 studies of treatments reported clinical outcomes; comparative studies focusing on long-term followup and clinical outcomes are needed.
- Fixed CPAP is clearly an effective treatment for OSA, and no further trials are needed to assess its efficacy, with the exception of trials assessing long-term clinical outcomes. All other interventions should be either:
  - directly compared with fixed CPAP, among patients naïve to CPAP, or
  - compared with no treatment or alternative treatment among patients who have failed to comply with CPAP treatment.
- Treatment effect heterogeneity should be investigated.
- The benefit from different degrees of mandibular advancement has to be determined.
- Head-to-head comparisons are needed of alternative treatments for patients who do not tolerate CPAP.
- Rigorously conducted head-to-head comparisons of surgical interventions versus CPAP are needed to overcome limitations of existing observational evidence.
- More studies are needed on the various additional interventions (including weight loss, drugs, and specific exercises), and their incremental benefit to accepted treatments for OSA should be examined.
- Interventions to improve compliance with CPAP and MAD should be tested in direct comparisons.
Predictors of Clinical Outcomes and Compliance
- The question of whether OSA severity is associated with long-term outcomes (beyond all-cause mortality) may be informed by patient-level meta-analyses of available large cohorts.
- Predictive models of compliance and response to treatment are needed.
Table A. Summary of findings
Key Question | Strength of Evidence | Summary/Conclusions/Comments |
---|---|---|
Key Question 1: Diagnosis; Portable monitors vs. PSG | Low (Type II monitors); Moderate (Types III & IV monitors) | |
Key Question 1: Diagnosis; Questionnaires vs. PSG | Low / Insufficient | |
Key Question 1: Diagnosis; Clinical prediction rules vs. PSG | Low | |
Key Question 2: Diagnosis; Phased testing | Insufficient | |
Key Question 3: Diagnosis; Preoperative screening | Insufficient | |
Key Question 4: Predictors; AHI as a predictor of long-term clinical outcomes | Variable (High for all-cause mortality; Low for diabetes; Insufficient for other long-term clinical outcomes) | |
Key Question 5: Treatment; OSA treatments; CPAP vs. control | Moderate | |
Key Question 5: Treatment; OSA treatments; Different CPAP devices vs. each other | Variable (Moderate for autoCPAP vs. CPAP; Low for C-Flex™ vs. CPAP; Insufficient for others) | |
Key Question 5: Treatment; OSA treatments; MAD vs. control | Moderate | |
Key Question 5: Treatment; OSA treatments; Oral devices vs. each other | Insufficient | |
Key Question 5: Treatment; OSA treatments; CPAP vs. MAD | Moderate | |
Key Question 5: Treatment; OSA treatments; Surgery vs. control | Insufficient | |
Key Question 5: Treatment; OSA treatments; Surgery vs. CPAP | Insufficient | |
Key Question 5: Treatment; OSA treatments; Surgery vs. MAD | Insufficient | |
Key Question 5: Treatment; OSA treatments; Other treatments | Variable (Low for weight loss vs. control; Insufficient for others) | |
Key Question 6: Predictors; Predictors of treatment compliance | Variable (see Conclusions) | |
Key Question 7: Treatment; Treatments to improve compliance | Low | |

AHI = apnea-hypopnea index, AUC = area under the ROC curve, autoCPAP = autotitrating CPAP, CI = confidence interval, CPAP = continuous positive airway pressure, ESS = Epworth Sleepiness Scale, HR = hazard ratio, MAD = mandibular advancement device, OSA = obstructive sleep apnea, PSG = polysomnography (sleep-laboratory based), RFA = radiofrequency ablation, ROC = receiver operating characteristics, SF-36 = Short Form Health Survey 36, UPPP = uvulopalatopharyngoplasty. Type II monitors are portable devices that record all the same information as PSG (Type I monitors). Type III monitors are portable devices that contain at least two airflow channels or one airflow and one effort channel. Type IV monitors comprise all other devices that fail to fulfill criteria for Type III monitors. They include monitors that record more than two physiological measures as well as single channel monitors.
Full Report
This executive summary is part of the following document: Balk EM, Moorthy D, Obadan NO, Patel K, Ip S, Chung M, Bannuru RR, Kitsios GD, Sen S, Iovin RC, Gaylor JM, D’Ambrosio C, Lau J. Diagnosis and Treatment of Obstructive Sleep Apnea in Adults. Comparative Effectiveness Review No. 32. (Prepared by Tufts Evidence-based Practice Center under Contract No. 290-2007-100551). AHRQ Publication No. 11-EHC052-EF. Rockville, MD: Agency for Healthcare Research and Quality. July 2011. Available at: www.effectivehealthcare.ahrq.gov/reports/final.cfm.
For More Copies
For more copies of Diagnosis and Treatment of Obstructive Sleep Apnea in Adults: Executive Summary No. 32 (AHRQ Publication No. 11-EHC052-1), please call the AHRQ clearinghouse at 1-800-358-9295 or email ahrqpubs@ahrq.gov.
Notes
a Please refer to the main report for references.
b Criteria for selecting topics for systematic review include appropriateness, importance, lack of duplication, feasibility, and potential value. See http://www.effectivehealthcare.ahrq.gov/index.cfm/submit-a-suggestion-for-research/how-are-research-topics-chosen/.
c Tufts-New England Medical Center EPC. Home diagnosis of obstructive sleep apnea-hypopnea syndrome. Health Technology Assessment Database www.cms.gov/determinationprocess/downloads/id48TA.pdf. 2007;2010.
d Type II monitors are portable devices that record all the same information as PSG (Type I monitors).
e Type III monitors are portable devices that contain at least two airflow channels or one airflow and one effort channel.
f Type IV monitors comprise all other devices that fail to fulfill criteria for Type III monitors. They include monitors that record more than two physiological measures as well as single channel monitors.