Prevalence, Screening Accuracy, and Screening Outcomes
Summary
Evidence Report/Technology Assessment: Number 119
Under its Evidence-based Practice Program, the Agency for Healthcare Research and Quality (AHRQ) is developing scientific information for other agencies and organizations on which to base clinical guidelines, performance measures, and other quality improvement tools. Contractor institutions review all relevant scientific literature on assigned clinical care topics and produce evidence reports and technology assessments, conduct research on methodologies and the effectiveness of their implementation, and participate in technical assistance activities.
Authors: Gaynes BN, Gavin N, Meltzer-Brody S, Lohr KN, Swinson T,
Gartlehner G, Brody S, Miller WC.
Introduction
Depression is the leading cause of disease-related
disability among women.1 In particular,
women of childbearing age are at high risk for
major depression.2-4 Pregnancy and new
motherhood may increase the risk of depressive
episodes. Depression during the perinatal period
can have devastating consequences, not only for
the women experiencing it but also for the
women's children and family.5-8
Perinatal depression encompasses major and
minor depressive episodes that occur either
during pregnancy or within the first 12 months
following delivery. When referring to depression
in this population, researchers and clinicians
frequently have not been clear about whether they
are referring to major depression alone or to both
major and minor depression. Major depression is
a distinct clinical syndrome for which treatment is
clearly indicated,9 whereas the definition and
management of minor depression are less clear.
In this report, we refer to major depression alone
by identifying it discretely as major depression.
Minor depression is an impairing, yet less severe,
constellation of depressive symptoms10 for which
controlled trials have not consistently indicated
whether or not particular interventions are more
effective than placebo.11,12 In this report, we refer
to this grouping as major or minor depression or
by the more general terms "depression" or
"depressive illness."
Perinatal depression, whether
one is referring to major depression alone or to
either major or minor depression, often goes
unrecognized because many of the discomforts of
pregnancy and the puerperium are similar to
symptoms of depression.13,14
Another mental disorder that can occur in the
perinatal period is postpartum psychosis. Unlike
postpartum depression, postpartum psychosis is a
relatively rare event with a range of estimated
incidence of 1.1 to 4.0 cases per 1,000 deliveries.15
The onset of postpartum psychosis is usually
acute, within the first 2 weeks of delivery, and
appears to be more common in women with a
strong family history of bipolar or schizoaffective
disorder.16 Postpartum psychosis is an important
disorder in its own right, but it is not addressed
specifically in this report.
The precise level of the prevalence and
incidence of perinatal depression is uncertain.
Published estimates of the rate of major and
minor depression in the postpartum period range
widely—from 5 percent to more than 25 percent
of new mothers, depending on the assessment
method, the timing of the assessment, and
population characteristics.17-19
In addition, although many screening
instruments have been developed or modified to
detect major and minor depression in pregnant
and newly delivered women, the evidence on
their screening accuracy relative to a reference
standard has yet to be systematically reviewed and
assessed.20 Evidence on the effectiveness of
screening all pregnant women and providing a
preventive intervention to those scoring at high
risk has not been systematically investigated and
evaluated either.20
To address these gaps, the Agency for
Healthcare Research and Quality (AHRQ), in
collaboration with the Safe Motherhood Group
(SMG), commissioned this evidence report from
the RTI International-University of North
Carolina's (RTI-UNC's) Evidence-based Practice
Center (EPC) for a systematic review of the
evidence on three questions related to perinatal
depression. These questions address the
prevalence and incidence of perinatal depression,
the accuracy of screening instruments for perinatal depression, and the effectiveness of interventions for
women screened as high risk for developing perinatal
depression.
The three key questions (KQs) are:
- What are the incidence and prevalence of depression (major and minor) during pregnancy and during the postpartum period? Are they increased during pregnancy and the postpartum period compared to nonchildbearing periods?
- What is the accuracy of different screening tools for detecting depression during pregnancy and the postpartum period?
- Does prenatal or early postnatal screening for depressive symptoms with subsequent intervention lead to improved outcomes?
Methods
In conducting this systematic review, we followed
standardized procedures developed by AHRQ in collaboration
with all its EPCs for such reviews. Throughout the project we
enlisted the assistance of a Technical Expert Advisory Group
(TEAG) to react to work in progress and advise us on
substantive issues and overlooked areas of research. The TEAG
included four individuals who, collectively, have expertise in
obstetrics, psychiatry, psychology, and research methods, along
with clinical and research experience in perinatal depression.
Inclusion and Exclusion Criteria
We made the inclusion and exclusion criteria fairly restrictive
to ensure that our conclusions were based on the highest
quality data available with the lowest risk of bias. Some criteria
were common across the three key questions; others were
specific to the question.
For all key questions, studies had to report on original data,
be in English, and be published from January 1980 through
March 2004. The study also had to be from a developed
country to increase the likelihood of its being generalizable to
the U.S. population. We excluded studies of women with
bipolar disorder, primary psychotic disorders, or maternity
blues (a mild mood disturbance experienced by approximately
half of childbearing women within 3 to 6 days after delivery
that resolves within a few hours to a few days) in which the
outcomes of interest were not distinguishable from those for
women with major or minor depression. For KQs 2 and 3, we
excluded studies that enrolled women with known depressive
disorders at the outset because screening would not be
necessary for a patient already known to have a current
depressive episode.
In addition, studies for all key questions had to assess
women for depression during pregnancy or in the first year
postpartum. Diagnostic confirmation, by means of a clinical
assessment or structured clinical interview, was required for
KQs 1 and 2.
For KQ 1, we excluded studies of the prevalence
and incidence of perinatal depression that relied solely on self-report
screens to identify depression.
In KQ 2, study
investigators used the clinical assessment or structured clinical
interview to assess the properties of the screening instrument.
In KQ 3, we required that patients had to have been
screened, whether by formal instrument or by another type of
screen that identified women as being at risk of having a
depressive illness (e.g., prior history of postpartum depression).
As the screening process was the focus of interest here, for KQ 3, we excluded studies in which a reference standard
confirmation of depression was required for enrollment.
For the first part of KQ 1, we included both prospective and
retrospective studies of the prevalence and incidence of
perinatal depression; for the second part, we included clinical
trials and case-control studies comparing the incidence or
prevalence of depression among pregnant women and newly
delivered mothers to prevalence among women of similar age
during nonchildbearing periods of their lives. For KQs 2 and 3,
we included only prospective studies; for KQ 3, we further
restricted inclusion to controlled trials to provide evidence of
the effectiveness of interventions among women at high risk of
perinatal depression.
Literature Search and Retrieval Process
We used three strategies to identify studies providing
evidence related to the key questions: systematic searches of
electronic databases using both a list of Medical Subject
Heading (MeSH®) search terms and author names, hand
searches of reference lists of included articles, and consultation
with the TEAG. We searched standard electronic databases,
including MEDLINE®, Cumulative Index to Nursing &
Allied Health Literature (CINAHL), PsycINFO, Sociofile, and
the Cochrane Library. We found a total of 837 citations in the
electronic searches and picked up an additional 9 citations
through the hand searches and discussion with the TEAG, for a
total of 846 citations.
Three senior reviewers with clinical expertise in perinatal
depression reviewed the abstracts of articles identified during
the literature search. Two clinicians evaluated each abstract
against the inclusion criteria and resolved any differences in
inclusion by consensus. In several instances, the abstracts did
not provide enough information to make an inclusion decision;
we pulled full articles to review for those studies. Of the 846
articles identified, 729 did not meet the inclusion criteria for
any of the key questions and were therefore excluded, 8 studies
were pulled for background only, and the remaining 109
articles were pulled for a full review.
Among the studies pulled for full review, 50 did not meet
our inclusion/exclusion criteria for any of the three key
questions. The most common reason for exclusion was the
absence of a gold standard (i.e., either a clinical assessment or
structured clinical interview) for assessing depression, which
eliminated 26 studies. We excluded 10 of the studies pulled for
the evaluation of the properties of screening instruments
because they did not report sensitivity and specificity or data
that we could use to compute those measures. Other reasons for exclusion were restriction of the study sample to specific
population subgroups (e.g., teenagers, patients of psychiatric
hospitals), depression assessed after the first year postpartum,
no depression outcome measured, and a retrospective study
design.
The remaining 59 studies were included in the review; some
met the inclusion criteria for more than one key question.
Thirty studies were abstracted for KQ 1; 23, for KQ 2; and 15,
for KQ 3.
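The study-selection counts reported above can be checked with a few lines of arithmetic. The Python sketch below uses only the numbers stated in the text; the variable names are illustrative and are not taken from the report.

```python
# Citation counts as reported in the text above.
electronic_hits = 837
hand_search_and_teag = 9
total_citations = electronic_hits + hand_search_and_teag

# Abstract-level screening.
excluded_at_abstract = 729
background_only = 8
full_review = total_citations - excluded_at_abstract - background_only

# Full-text review.
excluded_at_full_review = 50
included = full_review - excluded_at_full_review

# Studies abstracted per key question (a study may count toward several).
per_question = {"KQ 1": 30, "KQ 2": 23, "KQ 3": 15}
extra_assignments = sum(per_question.values()) - included

print(total_citations, full_review, included, extra_assignments)
```

The per-question counts sum to 68, which is 9 more than the 59 included studies. That difference is consistent with the statement that some studies met the inclusion criteria for more than one key question.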
Data Abstraction and Assessment
The data collection process involved abstracting relevant
information from the eligible articles and generating evidence
tables that present the key details of the study design and the
major findings from the articles. Each article was read and
abstracted by a trained member of the study team; a second
member checked the table entries for accuracy against the
original article.
We also rated the quality of the studies. We developed a
quality rating form for the screening accuracy (KQ 2) articles
from criteria identified by the Cochrane Methods Working
Group on Systematic Review of Screening and Diagnostic
Tests.21 For studies addressing KQ 1 and KQ 3, we modified
the quality rating forms developed by Downs and Black for
randomized controlled trials (RCTs) and observational studies.22
The quality rating forms dealt with the reporting completeness
and clarity, external validity, internal validity, and power or
precision of each study. The senior abstractor completed the
quality rating form for each article; another project team
member then reviewed the completed form for accuracy and
completeness.
In addition to the individual studies, we also rated the
strength of the collective evidence on each key question. We
applied four criteria:
- The number of studies.
- The aggregate sample sizes over the studies.
- The quality of the individual studies.
- The representativeness of the study populations included in the studies.
Meta-Analysis
We conducted a meta-analysis of the different prevalence
and incidence estimates from studies abstracted for KQ 1 to
compute combined prevalence and incidence estimates for
particular periods and points in time. We also conducted
meta-analyses of the different estimates of the receiver operating
characteristic (ROC) curves for screening instruments evaluated
for KQ 2. Because of the diversity of screening instruments
and prevention interventions in the studies found for KQ 3, we
did not conduct a meta-analysis for this key question.
Key Question 1
For KQ 1, we combined all estimates with the same
diagnosis, estimate type, and time period using the meta
command in Stata. This procedure uses the inverse-variance
weighting method to calculate random effects summary
estimates. It also produces Q tests of the homogeneity of the
estimates, forest plots of the individual study estimates, and
combined estimates and their confidence intervals. To satisfy
the normality assumptions of these methods, we first
transformed the prevalence estimates into log odds estimates.
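The pooling step described above can be sketched in Python. This is a minimal illustration, not the Stata code used for the report: it assumes the DerSimonian-Laird estimator for the between-study variance (the text states only that inverse-variance random effects estimates were computed), it uses the usual large-sample variance of a log odds, and the study counts in the usage example are hypothetical. The function name `pool_prevalence` is likewise an assumption.

```python
import math

def pool_prevalence(studies):
    """Random-effects pooling of prevalence on the log-odds scale.

    studies: list of (cases, n) with 0 < cases < n.
    Assumes the DerSimonian-Laird estimate of between-study variance.
    """
    y, v = [], []
    for cases, n in studies:
        y.append(math.log(cases / (n - cases)))       # log odds
        v.append(1.0 / cases + 1.0 / (n - cases))     # approximate variance

    # Fixed-effect (inverse-variance) estimate, needed for the Q statistic.
    w = [1.0 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird between-study variance, truncated at zero.
    df = len(studies) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights and summary estimate.
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def inv_logit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return {
        "prevalence": inv_logit(mu),
        "ci_low": inv_logit(mu - 1.96 * se),
        "ci_high": inv_logit(mu + 1.96 * se),
        "q": q,
        "tau2": tau2,
    }

# Hypothetical (cases, sample size) pairs, for illustration only.
print(pool_prevalence([(12, 150), (9, 98), (30, 240)]))
```

Pooling on the log-odds scale and back-transforming keeps the summary estimate and its confidence limits between 0 and 1, which is one practical reason for transforming before pooling.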
We reviewed the forest plots of the studies in each summary
estimate to determine whether we could identify the source of
any heterogeneity between studies. We then reran the meta-analyses
excluding studies that were obvious outliers and for
which we could identify the source of the bias. The new
summary estimates are considered our best estimates of the
prevalence and incidence of perinatal depression for the general
female population in the United States and other developed
countries.
To further analyze associations between the prevalence of
depression and study characteristics, we conducted cumulative
meta-analysis and a series of meta-regressions on the point
prevalence estimates for major and minor depression together
and major depression alone.
Key Question 2
For KQ 2, our main outcomes of interest were sensitivity
and specificity of the screening approaches or instruments as
described in the selected articles. Sensitivity refers to the
proportion of patients with a disease who test positive ("true
positives"); specificity refers to the proportion of patients
without a disease who test negative ("true negatives").
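These two definitions can be written directly in terms of the four cells of a screening-by-diagnosis table. The Python sketch below is illustrative only; the counts are hypothetical and do not come from any study in this report.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of women with the disorder who screen positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Proportion of women without the disorder who screen negative."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts: 20 women with depression, 180 without.
sens = sensitivity(true_pos=15, false_neg=5)
spec = specificity(true_neg=160, false_pos=20)
print(round(sens, 2), round(spec, 2))
```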
For each reported instrument and associated cutoff, we
calculated sensitivity and specificity from the published data
and constructed 95-percent confidence intervals (CIs) using
exact methods. For instruments with three or more estimates
at a particular cutoff, we created plots of the sensitivity or
specificity with associated 95-percent CIs to provide a graphic
description of the degree of consistency of results. In addition,
where possible, we estimated pooled sensitivity and specificity
values using meta-analytic methods for fixed effects. We
evaluated heterogeneity using the Q statistic test for
homogeneity. In several circumstances, pooled estimates were
not possible to calculate because of perfect estimates of
sensitivity (i.e., 100 percent) with associated variance estimates
equal to zero.
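The difficulty with perfect estimates can be seen in a small Python sketch of fixed-effects pooling. It assumes that proportions are pooled on their original scale with the binomial variance p(1 - p)/n; the report does not state the scale that was used, and the function name `pool_fixed` and the counts are hypothetical.

```python
import math

def pool_fixed(estimates):
    """Fixed-effect inverse-variance pooling of proportions.

    estimates: list of (events, n) for one instrument at one cutoff.
    Returns (pooled proportion, standard error, Q statistic).
    """
    weights, props = [], []
    for events, n in estimates:
        p = events / n
        var = p * (1.0 - p) / n
        if var == 0.0:
            # A proportion of exactly 0 or 1 has zero estimated variance,
            # so its inverse-variance weight is undefined.
            raise ValueError("zero variance: estimate cannot be weighted")
        weights.append(1.0 / var)
        props.append(p)

    total_w = sum(weights)
    pooled = sum(w * p for w, p in zip(weights, props)) / total_w
    se = math.sqrt(1.0 / total_w)
    # Q statistic for homogeneity; compared against chi-square with k - 1 df.
    q = sum(w * (p - pooled) ** 2 for w, p in zip(weights, props))
    return pooled, se, q

# Hypothetical sensitivities of 8/10 and 16/20 pool without difficulty.
print(pool_fixed([(8, 10), (16, 20)]))
```

Calling `pool_fixed([(6, 6), (8, 10)])` raises the error, because the first study's sensitivity of 100 percent has an estimated variance of zero.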
Peer Review
As is customary for all evidence reports and systematic
reviews done for AHRQ, the RTI-UNC EPC requested review
of the draft report from a wide array of outside experts in the
field and from relevant professional societies and public
organizations. AHRQ also requested review from its own staff
and appropriate Federal agencies. We revised this final report
on the basis of that feedback.
Results
Prevalence and Incidence of Depression
We found 30 studies providing estimates of the prevalence of
perinatal depression.14,19,23-49 Some rates were reported as point
prevalences, the percentage of the population with depression at
a given point in time (e.g., at 24 weeks gestational age or 9 weeks postpartum); others were reported as period prevalences,
the percentage of the population with depression over a period
of time (e.g., during pregnancy or from delivery to the end of
the first 3 months postpartum). Only 13 studies provided
estimates of the incidence of the disorder (i.e., the percentage of
the population with depressive episodes that begin within a
given period of time).
The studies were generally of moderate size—too small for
reliable subgroup analyses. Furthermore, the study populations
were typically restricted to a local community or geographic
region served by one provider or a small number of providers of
obstetrical services and were not representative of the racial and
ethnic mix of the countries in which the studies were
conducted. Other confounders included the risk status of
women at study entry, their socioeconomic status, the interview
methods, and the diagnostic criteria used to identify cases.
Our final combined estimates of prevalence and incidence
were somewhat lower than those found in prior systematic
reviews for three reasons. First, we excluded studies that
assessed depression based on self-report screens alone, which
have been found to overestimate prevalence. Second, we
separated out estimates of major and minor depression from
estimates of major depression alone. Third, we included more
recent studies that use more precise criteria to identify major
depression.
For major depression alone, our final combined point
prevalence estimates ranged from 3.1 percent to 4.9 percent at
different times during pregnancy and from 1.0 percent to 5.9
percent at different times during the first postpartum year. For
major and minor depression, our final combined estimates of
point prevalence ranged from 8.5 percent to 11.0 percent at
different times during pregnancy and from 6.5 percent to 12.9
percent at different times during the first year postpartum.
This nearly twofold higher rate suggests that, of the women who
are depressed at any given time, approximately half are
experiencing a major depressive episode and half a
minor depressive episode. Confidence
intervals surrounding all of these estimates remain wide,
suggesting that a fair amount of uncertainty remains in the
combined estimates.
Fewer estimates were available for the incidence of
depression. These limited data suggest that as many as 14.5
percent of pregnant women have a new episode of major or
minor depression during pregnancy and 14.5 percent have a
new episode during the first 3 months postpartum.
Considering only major depression, 7.5 percent may have a
new episode during pregnancy, with 6.5 percent having a new
episode in the first 3 months postpartum.
Prevalence estimates for perinatal depression were not
significantly different from the prevalence of depression among
women of similar age who were not pregnant and had not
recently given birth.45-47 However, Cox et al. found that, in the
first 5 weeks postpartum, the odds of a new episode of major
depression are three times those in a comparison group of
women.46 Thus, data from this one study suggest that, after an
event as psychologically and physiologically stressful as labor
and delivery, the likelihood of a new episode of depression may
be substantially higher than in a likely less stressed group of
women of similar age.
Accuracy of Screening Tools
For our analysis of the accuracy of screening tools (KQ 2),
we identified 10 studies reporting test characteristics for
English-language screeners.27,40,42,50-56 In general, studies were of
fair to good quality, although external validity was only poor to
fair. Specifically, the study populations were nearly entirely
white, so the accuracy of these screeners in other perinatal
populations is not clear. A major limitation in the available
evidence is the very small number of depressed patients
involved, a fact that results in substantial imprecision in the
point estimate of sensitivity and prevented us from reasonably
determining an ideal cutoff point.
For depression during pregnancy, we found only one study
reporting on screening accuracy in a population, with 6
patients with major depression and 14 patients with either
major or minor depression. For major depression, sensitivities
for the Edinburgh Postnatal Depression Scale (EPDS) at all
thresholds evaluated (12, 13, 14, 15) were 1.0, underscoring
the markedly small number of depressed patients involved;
specificities ranged from 0.79 (at EPDS >12) to 0.96 (at EPDS
>15). For major or minor depression, sensitivity was much
poorer (0.57 to 0.71), and specificity remained fairly high (0.72
to 0.95).
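To show how little a sensitivity of 1.0 establishes when only 6 patients have the disorder, the Python sketch below computes an exact binomial confidence interval. It assumes that the "exact methods" named in the Methods section correspond to the Clopper-Pearson interval, and it solves for the limits by bisection using only the standard library; the function names are illustrative.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k + 1))

def _bisect(func, lo=0.0, hi=1.0, iters=100):
    """Root of an increasing function on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k events out of n."""
    if k == 0:
        lower = 0.0
    else:
        # P(X >= k | p) rises with p; find where it equals alpha / 2.
        lower = _bisect(lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    if k == n:
        upper = 1.0
    else:
        # P(X <= k | p) falls with p; find where it equals alpha / 2.
        upper = _bisect(lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper

# All 6 women with major depression screened positive.
print(exact_ci(6, 6))
```

For 6 of 6 the lower limit is about 0.54, so under this assumption the data are compatible with a true sensitivity anywhere from roughly 54 percent to 100 percent.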
For postpartum depression, also, the small number of
depressed patients involved in the studies precluded identifying
an optimal screener or an optimal threshold for screening. Our
ability to combine the results of different studies in a meta-analysis
was limited by the use of multiple cutoffs and other
differences in the studies that would have made the pooled
estimate hard to interpret. Where we were able to combine the
results through meta-analysis, the pooled analysis did not add
to what one could conclude from individual studies.
For women with major depression alone, specificity for all
screeners (the Beck Depression Inventory [BDI], the
Postpartum Depression Screening Scale [PDSS], and the
EPDS) was relatively high and overlapped substantially. This
finding suggests that a positive screen was accurate in ruling
major depression in; that is, the risk that a screen with one of
these instruments would be falsely positive was low. By
contrast, sensitivities varied much more. The EPDS and the
PDSS appeared to be more sensitive (with estimates ranging
from 0.75 to 1.0 at different thresholds) than the BDI
instruments (with estimates from 0.32 to 0.68), but the wide
CIs overlapped nearly completely. Thus, we could not say with
confidence that the sensitivity estimates using the different tools
were different.
The point estimates are consistent with what is reported for
depression screeners in primary care settings.57 Still, the
imprecision is important to clarify. If falsely missing depression
(a false negative) is worse than falsely identifying it (as may be
the case with this disorder), clinicians must be able to feel confident that the screen is usually positive if the disease is
there and that a negative result can help rule out the illness.
For patients with major or minor depression, results were
reported for EPDS, BDI, PDSS, and the Center for
Epidemiologic Studies Depression Scale (CES-D). Specificity
estimates remained relatively high, but sensitivity results were
much lower (ranging from 0.43 to 0.71) than for major
depression alone. This means that the ability of the screening
instrument to score women as positive for this condition when
the disease is present was poorer than for major depression
alone. Again, no particular cutoff or
screening instrument performed differently from the others.
No available comparators were found for primary care
populations.
Our results suggest that various screening instruments can
identify perinatal depression, most accurately major depression,
but clinicians need to know more about precision. If one
assumes that the risk of a false-negative depression screen is
worse than the risk of a false-positive screen, perinatal
depression is a condition in which sensitivity is likely to be
more important than specificity. Whether as a screen for major
depression alone or for major or minor depression, specificities
appear high and relatively precise. By contrast, sensitivity for
identifying either category is imprecise and differs by diagnostic
category. For major depression alone, point estimates are
equivalent to those found in primary care medical settings. For
major or minor depression, however, sensitivity is quite low. At
this time, these screens do not appear to be useful for
identifying patients in this broader category of illness.
Screening With Subsequent Intervention
KQ 3 concerned issues of whether screening ultimately leads
to improved patient outcomes. Although it is the most vital
question from the public health perspective, it is the one with
the most limited evidence. Indeed, the studies that we
identified were not designed to test whether screening for
depression (versus not screening) improved patient outcomes.
Such a design would randomize patients to be screened or not
to be screened and then compare subsequent outcomes. We
found no studies designed in this way.
Instead, we made use of studies in which women were
screened by formal depression screen or the presence of a risk
factor associated with perinatal depression to identify those at
risk of having a depressive illness; then, for those screening
positive, the investigators compared the outcomes of women
receiving a treatment intervention to those in a control group.
This design tests whether, among women identified as at risk of
depression by a screen, an intervention improves outcomes
compared to the outcomes in a control group. This is an
important intermediary step, but it does not directly test
whether screening itself improves outcome compared to not
screening.
For patients whose screening results identified them as at risk
of perinatal depression and for whom a subsequent
intervention was provided, we identified 15 studies. Four small
prenatal studies involved various psychosocial interventions.58-61
Quality was poor for three of these studies and fair for one.
Overall, the effects of the interventions in these prenatal
studies were not consistently superior to those in the control
groups.
The 11 postpartum studies were of overall fair quality and
had larger sample sizes than the prenatal trials.62-72 Study
populations still reflected only a limited racial and ethnic mix,
and both external validity and the power to demonstrate
statistically significant differences were generally poor. Again,
screening tools and interventions varied considerably; the latter
involved both psychosocial and pharmaceutical interventions.
Results were mixed. Of the nine trials that employed a
psychosocial intervention, six studies62-65,67,68 reported significant
benefit for depression outcomes in the experimental group
compared to those in the control group. The one RCT
involving pharmacologic intervention did not show benefit
relative to the control group.72 Overall, the evidence available is
not sufficient to draw conclusions about this key question.
These results, although limited, do suggest that providing some
form of psychosocial support to pregnant women at risk of
having a depressive illness may decrease depressive symptoms.
Discussion
The available research suggests that depression is one of the
most common complications of the prenatal and postpartum
periods, and that fairly accurate and feasible screening measures
are available. The prenatal or postpartum periods are clearly
not times for nonpsychiatric clinicians to ignore depression
screening, which is routinely recommended for patients seen in
primary care settings.73,74
The course of a depressive illness with onset during the
perinatal period, together with the severe physiologic and
psychological challenges unique to this period that complicate
the identification and management of perinatal depression, would
lead one to expect a substantial body of high-quality research on
this topic. We were surprised
by the paucity of such evidence in this area. If one assumes
that perinatal depression is a significant mental health and
public health problem, then larger scale studies are needed that
involve each of these domains. The small number and small
size of relevant studies are not adequate to guide national
policy.
Reflecting on the three key questions addressed in this
report, we have concluded generally that the level of research
warrants both improvement and expansion.
For KQ 1,
prevalence studies need to better reflect the racial and
ethnic mix of the U.S. population. We
do not have good evidence on whether perinatal depression
rates differ among various ethnic groups and, if so, how. The
absence of information on populations other than the white
population was dramatic. A better understanding of racial and
ethnic variations could help clinicians know where to target
screening programs and researchers know where to target
studies on screening tools, and it could help researchers clarify
the need for more nationally representative perinatal depression samples. Furthermore, researchers need to clarify whether the
incidence of perinatal depression is greater than the incidence
of depression in nonchildbearing women of similar ages.
For KQ 2, the quality grades point to several areas in which
improvements in study design and conduct are needed. In
particular, future studies on the test characteristics of screeners
must be designed with sample size estimates that take
prevalence into account and that project a reasonably precise
estimate of sensitivity for the particular illness. Moreover,
samples should more closely mirror the target population;
specifically, subsequent studies need to provide a more
representative racial and ethnic mix. In addition, studies
should incorporate a range of other demographic variables that
could influence screening performance, such as socioeconomic
status measures, and assess the screening tools in these
subpopulations.
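One way to make such a sample size estimate concrete is the standard normal-approximation formula for the precision of a proportion, scaled up by prevalence to give the total number of women who would need to be screened. The Python sketch below is an illustration under that assumption, not a method taken from this report; the function name and all inputs (expected sensitivity, desired half-width of the 95-percent CI, and prevalence) are hypothetical.

```python
import math

def sample_size_for_sensitivity(expected_sens, half_width, prevalence, z=1.96):
    """Cases and total sample needed to estimate sensitivity to a given precision.

    Uses n_cases = z^2 * Se * (1 - Se) / d^2 (normal approximation),
    then divides by prevalence to get the total sample.
    """
    cases = math.ceil(z ** 2 * expected_sens * (1 - expected_sens)
                      / half_width ** 2)
    total = math.ceil(cases / prevalence)
    return cases, total

# Hypothetical planning values: sensitivity 0.85, CI half-width 0.10,
# prevalence of major depression 5 percent.
cases, total = sample_size_for_sensitivity(0.85, 0.10, 0.05)
print(cases, total)
```

With these inputs, 49 women with the disorder are needed, which implies a total sample of 980 at a prevalence of 5 percent. Lower prevalence raises the total sample in inverse proportion.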
Furthermore, as Beck and Gable did,51 future research should
continue to assess and directly compare multiple screening
instruments. This design would provide a head-to-head
comparison to allow an evaluation of which screening
instrument is more accurate in the setting in which the
investigations are carried out. Moreover, studies evaluating the
cost-effectiveness of screening—specifically assessing the relative
costs of false-negative and false-positive designation, the degree
of provider burden, and patient acceptability—are needed to
provide insights on how to consider target sensitivity and
specificity when attempting to maximize cost-effectiveness.
Diagnosis is another area of concern. Subsequent studies
should carefully consider whether to target major depression
alone, for which beneficial treatments clearly exist, or a
combined category of major and minor depression, a
heterogeneous group for which treatment benefit is unclear.
Given that our results suggest that available screening tools
identify major depression alone more accurately, and noting
that the general benefit of interventions is more apparent for
major depression alone, we believe that an evidence-based
public health perspective recommends targeting major
depression alone.
Timing is another factor deserving more thought in future
studies. The issue involves both the need for more
epidemiologic research to confirm prevalence rates at different times
and the need to confirm what time point(s) would identify
the greatest number of depressed women. The bulk of the few
screening studies we identified had been conducted in the first 3 months postpartum. Our best estimates of prevalence
suggest that depression may remain high for several more
months.
More studies are needed to better delineate periods of
peak prevalence and incidence—to include not just 3 months
but also 6 weeks, 6 months, and 12 months—and subsequent
screening studies need to consider testing properties of
screening at these later time periods. The very small number of
adequate studies currently available hampers plans for screening
and intervention programs because the best time for screening,
and hence the best clinic location, is not clear. If peak
prevalence and incidence occur within the first 6 weeks, the
obstetrics clinic is a prime place to target resources for such a
program. If, however, they peak after this time, most postpartum
women will have completed their followup care with an
obstetrician, so programs in an obstetrics clinic may be less
helpful. In this case, it is possible that programs targeting new
mothers in family medicine, internal medicine, or pediatric
clinics might be more effective.
For KQ 3, several similar or related issues emerged as well.
First, studies addressing the relationship between screening and
outcome need to recruit and retain sample sizes that are large
enough to yield adequate power to detect relevant differences.
Second, screening and outcome studies must include
populations with a racial and ethnic mix that is more
representative of the U.S. population than the work we have
seen to date. Third, interventions involved should be more
consistent with what we know as evidence-based treatments for
depression,9 i.e., antidepressant medications75 and/or
psychotherapies such as cognitive behavioral therapy76 or
interpersonal psychotherapy.77
Another major issue is the types of screening measures to be
used henceforth. Of the three KQ 3 studies rated as good,62,65,72
only the one by Dennis and colleagues used a depression
screener (EPDS).65 Researchers should consider developing and
using standard screening measures and using similar cutoff
points, so that some elements of separate studies could more
readily be compared. Screening tools with the best supporting
evidence would seem to be the best candidates. While the
evidence base remains quite limited and any conclusions are
preliminary, at this time those instruments would appear to be
the EPDS or the PDSS. For major depression alone, an EPDS
cutoff of >13 or a PDSS cutoff of >81 is reasonably supported
by the evidence as a threshold to use. For major or minor
depression, we found the results too inconclusive to make even
a preliminary recommendation.
Finally, studies should be designed to address whether the
screening process itself leads to better access to proven
treatment and improved outcome relative to usual care. We
support additional research on interventions per se, but we
conclude that important questions remain about the impact of
the screening element. Reviewing studies that used screening as
a means of identifying women potentially at high risk and
enrolling them in interventional studies is not a sufficient
approach to answering issues about the effectiveness of
screening.