AHRQ Evidence Reports, Numbers 1-60
1. Systematic Review of the Literature Regarding the Diagnosis of Sleep Apnea
THIS EVIDENCE REPORT IS OUTDATED AND IS NO LONGER VIEWED AS GUIDANCE FOR CURRENT MEDICAL PRACTICE. IT IS MAINTAINED FOR ARCHIVAL PURPOSES ONLY.
Evidence Report/Technology Assessment Number 1
Prepared for: Department of Health and Human Services
Contract No. 290-97-0016
Investigators
The Agency for Health Care Policy and Research (AHCPR), through its Evidence-based Practice Centers (EPCs), sponsors the development of evidence reports and technology assessments to assist public- and private-sector organizations in their efforts to improve the quality of health care in the United States. The reports and assessments provide organizations with comprehensive, science-based information on common, costly medical conditions and new health care technologies. The EPCs systematically review the relevant scientific literature on topics assigned to them by AHCPR and conduct additional analyses when appropriate prior to developing their reports and assessments. To bring the broadest range of experts into the development of evidence reports and health technology assessments, AHCPR encourages the EPCs to form partnerships and enter into collaborations with other medical and research organizations. The EPCs work with these partner organizations to ensure that the evidence reports and technology assessments they produce will become building blocks for health care quality improvement projects throughout the Nation. The reports undergo peer review prior to their release. AHCPR expects that the EPC evidence reports and technology assessments will inform individual health plans, providers, and purchasers as well as the health care system as a whole by providing important information to help improve health care quality. We welcome written comments on this evidence report. They may be sent to: Director, Center for Practice and Technology Assessment, Agency for Health Care Policy and Research, 6010 Executive Boulevard, Suite 300, Rockville, MD 20852.
Objective. The objective was to establish the evidence base for diagnosing sleep apnea (SA) in adult patients using systematic review methods. Tests covered were sleep monitoring devices, radiologic imaging, laboratory assays, and clinical signs and symptoms posited for use in screening or diagnosing SA. The standard sleep lab polysomnogram (PSG) was the gold standard.
Search strategy. Literature published from 1980 through November 1, 1997 (cutoff) was searched using Medline and Current Contents, supplemented by a manual review of the bibliographies of all accepted papers.
Selection criteria. Studies of at least 10 adult patients suspected of or diagnosed with SA had to report the results of any test to establish or support a diagnosis of SA, relative to a standard PSG-derived apnea index (AI, the number of apneic episodes per hour of sleep); apnea-hypopnea index (AHI, the total apneas plus hypopneas during total time asleep, divided by the number of hours asleep); or respiratory distress index (RDI). Eligible languages were English, German, French, Spanish, or Italian. Diagnostic papers reporting prevalence or clinical comorbidities of SA were also accepted.
Data collection and analysis. Based on scores for study characteristics (e.g., random order test, blinding of test readers, and use of PSG comparison), 147 studies met or exceeded the minimum evidence score. From these, data on study, patient, and test characteristics and on results were collected. Nondiagnostic studies reporting prevalence or clinical comorbidities were separately extracted. Study and patient-level covariates were summarized and the results were analyzed using fixed effects models. Results were evaluated using summary receiver operating characteristic (ROC) curves where data were available.
Main results. In 71 analyzable diagnostic or screening studies (7,572 patients), the sensitivity and specificity of partial channel and partial time PSGs appeared most promising as possible prescreening tests or replacements for full PSG. Prediction models achieved good sensitivity and specificity. Results from studies of portable devices were variable due to study and device heterogeneity. Radiologic studies and several miscellaneous studies of questionnaires, anthropometric signs, and ears/nose/throat (ENT) exams could not be analyzed due to insufficient data. Global clinical impressions and oximetry provided moderate sensitivity and specificity. Least accurate were flow volume loops. The review and analysis were limited by variability in PSG definitions of apnea and hypopnea, and in thresholds for SA diagnosis. For sensitivity and specificity determinations, the lowest AI/AHI threshold for SA diagnosis was used. Necessary components of "standard" PSG were not consistent. SA prevalence studies in different patient populations were reviewed. Few such studies used gold standard PSG to diagnose SA, so most diagnoses were based on unvalidated tests. Such prevalence estimates are suspect.
Conclusions. The best available evidence from literature sources suggests the diagnosis of SA is still best accomplished with full PSG. Progress has been made in establishing reasonable sensitivity and specificity of tests other than full PSG, and future researchers should focus on building this evidence base. Standardization of terms and diagnostic criteria is an absolute requirement to expedite development and enhance the utility of this literature in the future.
Overview
In this study, MetaWorks investigators have developed an evidence base via a systematic review of the literature pertinent to diagnostic testing and screening in sleep apnea in adult patients. Sleep apnea (SA) is a recently recognized disorder of sleep characterized by recurrent apneic and hypopneic episodes.
Apnea was typically defined as complete cessation of airflow, but in some studies, a >80 percent reduction in airflow was used. To define hypopnea, most papers used a reduction in airflow of 50 percent or more, with or without a coincident O2 desaturation of anywhere from 2 percent to 4 percent from some average SaO2 over a preceding interval of time. In view of its high prevalence and serious associated morbidity, SA has recently been described as a major public health concern. A major problem in the field in 1998 is diagnosis: whom to test, how to test, and what the test results imply about the risk of serious clinical sequelae. The gold standard diagnostic method for sleep apnea, overnight full channel polysomnography (PSG) in a sleep lab, is intrusive and costly, and its interpretation can be difficult. A standard PSG typically consists of electroencephalogram (EEG), submental (± tibialis) electromyogram (EMG), electrooculogram (EOG), respiratory airflow (usually by oronasal flow monitors), respiratory effort (usually by plethysmography), and oxygen saturation (oximetry). Electrocardiography (ECG) and body position are also frequently monitored in formal sleep studies, and some groups state that they are standard requirements of PSG. If, however, the estimated prevalence of sleep apnea at 2 percent to 4 percent of middle-aged adults is correct, the costs of full PSGs to diagnose all suspected cases would be prohibitive. The development of simpler and less costly alternatives for diagnostic testing would be highly desirable, as would simpler prescreening tests prior to full PSG.
Diagnostic approaches that might be viewed either as alternatives to PSGs or as screening tests to better select patients for PSG include: partial channel PSGs; partial night or daytime PSGs; portable sleep monitoring devices for use at home; radiologic imaging of the head and neck for anatomic abnormalities predictive of sleep apnea, including cephalometry, magnetic resonance imaging (MRI), and computed tomography (CT) scans; anthropometric measurements, such as neck circumference; nasopharyngeal and laryngeal endoscopic measurements of both structure and function; and focused questionnaires. All such interventions were within the scope of this review, provided they compared results against the gold standard diagnostic test, the standard PSG. Although the type of sleep evaluation study preferred (and reimbursed) varies widely among physicians, sleep centers, and managed care organizations, MetaWorks investigators have avoided making specific recommendations in this review. MetaWorks investigators also did not review technical considerations related to data acquisition, storage, retrieval, and analysis of various devices, which were beyond the scope of this project. Rather, it is intended that this synthesis of the best available evidence will serve as an information resource for local decisionmakers and developers of guidelines/recommendations. It should also serve to highlight gaps in the literature and areas ripe for future research.
Reporting the Evidence
The key questions that guided this review were:
1) What diagnostic and screening tests are presently available?
2) What is the strength of the evidence in support of each?
3) What is the predictive value of these tests in different populations (which requires estimating the prevalence of SA in different populations)?
4) What are the implications of certain PSG results in terms of serious clinical events occurring as comorbidities in association with a diagnosis of SA?
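Key question 3 ties predictive value to prevalence. The minimal Python sketch below illustrates that dependence using Bayes' theorem. The function name `predictive_values` and the 90 percent sensitivity and specificity are hypothetical illustration values, not results from this review; the 4 percent prevalence is the upper end of the 2 percent to 4 percent estimate quoted in the Overview, and the 50 percent figure is an assumed referral population.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' theorem.

    All three arguments are proportions between 0 and 1.
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    # Expected fraction of the population falling in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)

    # Guard against empty rows (e.g., prevalence 0 with perfect specificity).
    ppv = true_pos / (true_pos + false_pos) if (true_pos + false_pos) > 0 else float("nan")
    npv = true_neg / (true_neg + false_neg) if (true_neg + false_neg) > 0 else float("nan")
    return ppv, npv


# Hypothetical test with 90% sensitivity and 90% specificity.
for prevalence in (0.04, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Under these assumed inputs, the same test has a positive predictive value of about 0.27 at 4 percent prevalence and 0.90 at 50 percent prevalence, which is why prevalence estimates matter when judging a screening test.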
Methodology
In general, MetaWorks investigators used systematic review methods derived from the evolving science of review research. The review followed a prospective protocol that was developed a priori and shared with the nominating partners on the project (Blue Cross/Blue Shield [BC/BS] of Massachusetts and the Sleep Disorders Centre of Metropolitan Toronto), a panel of technical experts (with representation from consumer groups and professional specialties: neurology, pulmonology, dentistry, otolaryngology, epidemiology, and nursing), and the Task Order Officers at the Agency for Health Care Policy and Research (AHCPR). The protocol outlined the methods to be used for the literature search, study eligibility criteria, data elements for extraction, and methodological strategies to minimize bias and maximize precision during the process of data collection, extraction, and synthesis.
The published literature was searched from 1980 to present. The search cutoff date was November 1, 1997, and the retrieval cutoff date was January 30, 1998. The search started with a broad Medline search using the terms "sleep apnea syndrome" and "monitoring, physiologic," "sleep apnea syndrome" and "airway resistance," and "human." Also, MetaWorks investigators searched "sleep apnea syndromes," "sleep apnea syndrome" and "index." In addition, the 1997 Current Contents CD-ROM was searched ("sleep apnea") to the same cutoff date. All citations and abstracts were printed and screened at MetaWorks for any mention of diagnostic tests in adults with SA, for which full papers were obtained. The electronic searches noted above were supplemented by a thorough search of the reference lists of all eligible studies and relevant review articles.
To be included in the review, studies had to report results of any diagnostic test or intervention to establish or support a diagnosis of SA in adults, with at least 10 patients as total sample size.
Studies reported in the following Western European languages (English, German, French, Spanish, or Italian) were accepted. All eligible papers were scored on features pertinent to diagnostic test study design, execution, and reporting, with a range of possible scores from 0 to 44. Those falling in the lowest 20 percent of the distribution of actual scores were dropped from data extraction and analysis. Each accepted diagnostic study was extracted in duplicate by investigators, with one extractor using a blinded copy of each study report, masked as to source of financial support, authors, and journal. The agreement between extractors was approximately 78 percent, and differences were resolved by consensus. Key data elements sought for extraction from each study included study level, patient level, and test characteristics. Only clearly reported aggregate results were extracted from studies. Results that were given only for individual patients, and results that would require extrapolations from graphs or derivations from figures or tables, were not captured. For all tests, sensitivity, specificity, positive predictive value, negative predictive value, and correlation coefficients of each test relative to PSG AI or AHI (RDI) results were sought. (Apnea index [AI] is defined as the number of apneic episodes per hour of sleep, and apnea-hypopnea index [AHI] is the total apneas plus hypopneas during total time asleep, divided by the number of hours asleep. The respiratory distress index [RDI] is the same as AHI.) The main objective of the analysis was to evaluate the diagnostic accuracy of alternatives to full PSG for the diagnosis of SA as compared to a full PSG (gold standard). Initially, weighted averages using Mantel-Haenszel fixed effects models combining the comparative summary statistics were calculated and summarized for groups based on diagnostic test category.
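The index definitions and accuracy measures described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the extraction or analysis code used in the review: the names `apnea_index`, `apnea_hypopnea_index`, and `TwoByTwo` are hypothetical, and the event counts and 2x2 cell counts in the usage lines are made-up numbers.

```python
from dataclasses import dataclass


def apnea_index(apneas, hours_asleep):
    """AI: number of apneic episodes per hour of sleep."""
    if hours_asleep <= 0:
        raise ValueError("hours_asleep must be positive")
    return apneas / hours_asleep


def apnea_hypopnea_index(apneas, hypopneas, hours_asleep):
    """AHI: total apneas plus hypopneas divided by hours asleep."""
    if hours_asleep <= 0:
        raise ValueError("hours_asleep must be positive")
    return (apneas + hypopneas) / hours_asleep


def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


@dataclass
class TwoByTwo:
    """Counts for an index test compared against the gold standard (full PSG)."""
    tp: int  # test positive, PSG positive
    fp: int  # test positive, PSG negative
    fn: int  # test negative, PSG positive
    tn: int  # test negative, PSG negative

    def sensitivity(self):
        return _ratio(self.tp, self.tp + self.fn)

    def specificity(self):
        return _ratio(self.tn, self.tn + self.fp)

    def ppv(self):
        return _ratio(self.tp, self.tp + self.fp)

    def npv(self):
        return _ratio(self.tn, self.tn + self.fn)


# Made-up night of sleep: 120 apneas and 90 hypopneas over 7 hours asleep.
print("AI :", round(apnea_index(120, 7.0), 1))
print("AHI:", round(apnea_hypopnea_index(120, 90, 7.0), 1))

# Made-up comparison of a screening test against full PSG in 100 patients.
table = TwoByTwo(tp=40, fp=10, fn=5, tn=45)
print("sensitivity:", round(table.sensitivity(), 3))
print("specificity:", round(table.specificity(), 3))
print("PPV:", round(table.ppv(), 3), " NPV:", round(table.npv(), 3))
```

Because the positive and negative labels depend on the AI or AHI threshold chosen for SA diagnosis, the same patients can yield a different 2x2 table under a different threshold, which is one reason the variability in thresholds noted in this report complicates pooling.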
Study and patient-level covariates and study evidence scores were also summarized for each diagnostic test category. A summary receiver operating characteristic (ROC) curve was calculated for each diagnostic group where data were available. While differences among studies may be an argument against estimating one common sensitivity and specificity using fixed or random effects models, these differences can be described using the summary ROCs, which both display and summarize the heterogeneity. A group of 22 peer reviewers drawn from consumer groups and professional organizations, along with our technical experts and partners, was assembled to review and provide suggestions on the draft final report describing this project. Their feedback, as well as that from AHCPR, was incorporated wherever possible within the original scope of the project.
Findings
All Studies: PSG
Conditions associated with SA:
Future studies of diagnostic test strategies should address the many limitations of the literature noted above. The field could benefit from adoption of a common terminology and definitions for fundamental concepts such as apnea and hypopnea, and the relation between AI and AHI should be established in order to allow conversions and comparisons across studies. Researchers should seek to clarify the frequency of sleep apnea/hypopnea in general populations by gender and age. More naturalistic sleep studies (in the home) are still of interest, as MetaWorks investigators suspect that much of the uncertainty about the nature of SA, its pathophysiology, its risk factors, and its clinical consequences derives from the fact that the phenomenon of SA may be altered by the act of observing it via standard PSG. Long term follow-up studies are recommended to better document the findings of treated vs. untreated SA. Lastly, all sleep monitoring systems that are proposed as prequalifiers or replacements for PSG must be validated in the setting in which they are intended to be used.