The scientific assessment of these Guidelines was based on evidence linkages, or statements regarding potential relationships between clinical interventions and outcomes. The interventions were examined to assess their impact on a variety of outcomes related to obstetric anesthesia.
Initially, each pertinent outcome reported in a study was classified as supporting an evidence linkage, refuting a linkage, or equivocal. The results were then summarized to obtain a directional assessment for each evidence linkage before conducting a formal meta-analysis. Literature pertaining to 11 evidence linkages contained enough studies with well-defined experimental designs and statistical information to support meta-analyses. These linkages were (1) nonparticulate antacids versus no antacids, (2) continuous epidural infusion of local anesthetics with or without opioids versus parenteral opioids, (3) induction of epidural analgesia using local anesthetics with opioids versus equal concentrations of epidural local anesthetics without opioids, (4) maintenance of epidural infusion of lower concentrations of local anesthetics with opioids versus higher concentrations of local anesthetics without opioids, (5) combined spinal–epidural local anesthetics with opioids versus epidural local anesthetics with opioids, (6) patient-controlled epidural analgesia (PCEA) versus continuous infusion epidurals, (7) general anesthesia versus epidural anesthesia for cesarean delivery, (8) combined spinal–epidural anesthesia versus epidural anesthesia for cesarean delivery, (9) use of pencil-point spinal needles versus cutting-bevel spinal needles, (10) ephedrine or phenylephrine reduces maternal hypotension during neuraxial anesthesia, and (11) neuraxial opioids versus parenteral opioids for postoperative analgesia after neuraxial anesthesia for cesarean delivery.
General variance-based effect-size estimates or combined probability tests were obtained for continuous outcome measures, and Mantel-Haenszel odds ratios were obtained for dichotomous outcome measures. Two combined probability tests were used as follows: (1) the Fisher combined test, producing chi-square values based on logarithmic transformations of the reported P values from the independent studies, and (2) the Stouffer combined test, providing weighted representation of the studies by weighting each of the standard normal deviates by the size of the sample. An odds ratio procedure based on the Mantel-Haenszel method for combining study results using 2 × 2 tables was used with outcome frequency information. An acceptable significance level was set at P < 0.01 (one-tailed). Tests for heterogeneity of the independent studies were conducted to check consistency among the study results. DerSimonian-Laird random-effects odds ratios were obtained when significant heterogeneity was found (P < 0.01). To control for potential publication bias, a "fail-safe n" value was calculated. No search for unpublished studies was conducted, and no reliability tests for locating research results were done.
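The pooling procedures named above can be illustrated with a minimal Python sketch. It uses the textbook forms of the Fisher combined test, the sample-size-weighted Stouffer test, and the Mantel-Haenszel pooled odds ratio. The function names, the argument layout, and the (a, b, c, d) ordering of the 2 × 2 cells are assumptions made for illustration and are not taken from the guideline, which does not publish its computations.

```python
import math
from statistics import NormalDist


def fisher_combined(p_values):
    """Fisher combined test: X2 = -2 * sum(ln p_i), chi-square with 2k df.

    Returns (chi-square statistic, combined P value).
    """
    k = len(p_values)
    chi2 = -2.0 * sum(math.log(p) for p in p_values)
    # With an even number of degrees of freedom (2k), the chi-square upper
    # tail has the closed form exp(-x/2) * sum_{j<k} (x/2)^j / j!.
    half = chi2 / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return chi2, math.exp(-half) * total


def stouffer_weighted(p_values, sample_sizes):
    """Stouffer combined test with each normal deviate weighted by sample size.

    Assumes one-tailed P values strictly between 0 and 1.
    Returns (combined z, combined one-tailed P value).
    """
    norm = NormalDist()
    z_scores = [norm.inv_cdf(1.0 - p) for p in p_values]
    numerator = sum(n * z for n, z in zip(sample_sizes, z_scores))
    denominator = math.sqrt(sum(n * n for n in sample_sizes))
    z = numerator / denominator
    return z, 1.0 - norm.cdf(z)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio over 2 x 2 tables.

    Each table is (a, b, c, d), assumed here to mean: events and non-events
    in the first group, then events and non-events in the second group.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator
```

The sketch omits the heterogeneity test, the DerSimonian-Laird random-effects estimate, and the fail-safe n calculation, since the text does not specify which variants were used.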
Meta-analytic results are reported in table 4 of the original guideline document. To be accepted as significant findings, Mantel-Haenszel odds ratios must agree with combined test results whenever both types of data are assessed. In the absence of Mantel-Haenszel odds ratios, findings from both the Fisher and weighted Stouffer combined tests must agree with each other to be acceptable as significant.
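One possible reading of this acceptance rule is sketched below in Python. The function name and arguments are hypothetical. "Agree" is interpreted here as every available test falling below the stated significance level of 0.01, which is an assumption: the text does not define agreement precisely, and the sketch does not check the direction of effect.

```python
from typing import Optional

ALPHA = 0.01  # one-tailed significance level stated in the text


def finding_is_significant(
    fisher_p: Optional[float],
    stouffer_p: Optional[float],
    mh_p: Optional[float],
    alpha: float = ALPHA,
) -> bool:
    """Return True if a linkage would be accepted as a significant finding.

    Pass None for any test that was not performed. This is an illustrative
    interpretation of the rule, not the Task Force's own procedure.
    """
    combined = [p for p in (fisher_p, stouffer_p) if p is not None]
    if mh_p is not None:
        # An odds ratio is available: it must be significant and must agree
        # with whichever combined tests were also run.
        return mh_p < alpha and all(p < alpha for p in combined)
    # No odds ratio: both combined tests must be present and significant.
    return len(combined) == 2 and all(p < alpha for p in combined)
```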
Interobserver agreement among Task Force members and two methodologists was established by interrater reliability testing. Agreement levels using a kappa statistic for two-rater agreement pairs were as follows: (1) type of study design, kappa = 0.83–0.94; (2) type of analysis, kappa = 0.71–0.93; (3) evidence linkage assignment, kappa = 0.87–1.00; and (4) literature inclusion for database, kappa = 0.74–1.00. Three-rater chance-corrected agreement values were (1) study design, Sav = 0.884, Var (Sav) = 0.004; (2) type of analysis, Sav = 0.805, Var (Sav) = 0.009; (3) linkage assignment, Sav = 0.911, Var (Sav) = 0.002; and (4) literature database inclusion, Sav = 0.660, Var (Sav) = 0.024. These values represent moderate to high levels of agreement.
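For readers unfamiliar with the two-rater statistic, the following Python sketch computes Cohen's kappa, the standard chance-corrected agreement measure for a pair of raters. It is assumed, not confirmed by the text, that this is the kappa variant the Task Force used. The function name and the example category labels are hypothetical, and the three-rater Sav statistic is not reproduced because the text does not give its formula.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each classify the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of
    agreement and p_e is the agreement expected by chance from the marginals.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must classify the same non-empty item list")
    n = len(rater_a)
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Counter returns 0 for categories the second rater never used.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one category only")
    return (observed - expected) / (1.0 - expected)


# Hypothetical example: two raters classify the design of four studies.
designs_a = ["randomized", "randomized", "observational", "observational"]
designs_b = ["randomized", "randomized", "observational", "randomized"]
kappa = cohen_kappa(designs_a, designs_b)
```

In this made-up example the raters agree on three of four studies (observed agreement 0.75) against a chance expectation of 0.50, giving a kappa of 0.5.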