Reliance on Scientific Evidence

The statements and conclusions throughout this report are documented by reference to studies published in the scientific literature. For the most part, this report cites studies of empirical—rather than theoretical—research, peer-reviewed journal articles including reviews that integrate findings from numerous studies, and books by recognized experts. When a study has been accepted for publication but the publication has not yet appeared, owing to the delay between acceptance and final publication, the study is referred to as “in press.” The report refers, on occasion, to unpublished research by means of reference to a presentation at a professional meeting or to a “personal communication” from the researcher, a practice that also is used sparingly in professional journals. These personal references are to acknowledged experts whose research is in progress.

Research Methods

Quality research rests on accepted methods of testing hypotheses. Two of the more common research methods used in the mental health field are experimental research and correlational research. Experimental research is the preferred method for assessing causation but may be too difficult or too expensive to conduct. It strives to discover cause and effect relationships, such as whether a new drug is effective for treating a mental disorder. In an experimental study, the investigator deliberately introduces an intervention to determine its consequences (in this example, the drug’s efficacy). The investigator gives the new drug to one group of people, the experimental group, and a placebo (an inert pill) to another group, the control group. Including a control group helps rule out the possibility that something other than the experimental treatment (here, the new drug) produced the results. The difference in outcome between the experimental and control groups, which in this case may be the reduction or elimination of the symptoms of the disorder, can then be causally attributed to the drug. Similarly, in an experimental study of a psychological treatment, the experimental group is given a new type of psychotherapy, while the control or comparison group receives either no psychotherapy or a different form of psychotherapy. In both pharmacological and psychological studies, the best way to assign study participants, called subjects, to the treatment group or the control (or comparison) group is at random. Randomization reduces bias in the results because chance alone, rather than the choice of an investigator or a participant, determines who receives the treatment. An experimental study in humans with randomization is called a randomized controlled trial.
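The random assignment described above can be illustrated with a minimal sketch. The subject identifiers, the group size, and the function name randomize are hypothetical and chosen only for illustration; actual trials typically use more elaborate allocation procedures.

```python
import random


def randomize(subjects, seed=None):
    """Randomly split subjects into two groups of (nearly) equal size.

    The seed parameter is an illustrative convenience that makes the
    allocation reproducible; it is not part of any standard protocol.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy so the caller's list is not modified
    rng.shuffle(shuffled)       # chance alone determines the order
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical subject identifiers S01 through S20.
subjects = [f"S{i:02d}" for i in range(1, 21)]
groups = randomize(subjects, seed=42)

print("Treatment group:", groups["treatment"])
print("Control group:  ", groups["control"])
```

Each subject lands in exactly one group, and neither the investigator nor the subject influences which one.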

Correlational research is employed when experimental research is logistically, ethically, or financially impossible. Instead of deliberately introducing an intervention, researchers observe relationships to uncover whether two factors are associated, or correlated. Studying the relationship between stress and depression is illustrative. It would be unthinkable to introduce seriously stressful events to see if they cause depression. A correlational study in this case would compare a group of people already experiencing high levels of stress with another group experiencing low levels of stress to determine whether the high-stress group is more likely to develop depression. If it is, the results would indicate that high levels of stress are associated with depression. The limitation of this type of study is that it can establish only associations, not cause and effect relationships. (The positive relationship between stress and depression is discussed most thoroughly in Chapter 4.)
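One simple way to express such an association is a risk ratio, sketched below. The counts are invented purely for illustration and do not come from any study cited in this report; the function name risk_ratio is likewise an assumption.

```python
def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Ratio of the outcome rate in the exposed group to the unexposed group."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    return risk_exposed / risk_unexposed


# Hypothetical counts: people who developed depression in each group.
high_stress_cases, high_stress_total = 30, 100
low_stress_cases, low_stress_total = 10, 100

rr = risk_ratio(high_stress_cases, high_stress_total,
                low_stress_cases, low_stress_total)

# A ratio above 1 indicates that depression is more common in the
# high-stress group. It shows an association only, not a cause.
print(f"Risk ratio: {rr:.2f}")
```

A ratio well above 1 would be consistent with an association between stress and depression, but it could not by itself show that stress caused the depression.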

Controlled studies—that is, studies with control or comparison groups—are considered superior to uncontrolled studies. But not every question in mental health can be studied with a control or comparison group. Findings from an uncontrolled study may be better than no information at all. An uncontrolled study also may be beneficial in generating hypotheses or in testing the feasibility of an intervention. The results presumably would lead to a controlled study. In short, uncontrolled studies offer a good starting point but are never conclusive by themselves.

Levels of Evidence

In science, no single study by itself, however well designed, is generally considered sufficient to establish causation. The findings need to be replicated by other investigators to gain widespread acceptance by the scientific community.

The strength of the evidence amassed for any scientific fact or conclusion is referred to as “the level of evidence.” The level of evidence, for example, to justify the entry of a new drug into the marketplace has to be substantial enough to meet with approval by the U.S. Food and Drug Administration (FDA). According to U.S. drug law, a new drug’s safety and efficacy must be established through controlled clinical trials conducted by the drug’s manufacturer or sponsor (FDA, 1998). The FDA’s decision to approve a drug represents the culmination of a lengthy, research-intensive process of drug development, which often consumes years of animal testing followed by human clinical trials (DiMasi & Lasagna, 1995). The FDA requires three phases of clinical trials (see footnote 3) before a new drug can be approved for marketing (FDA, 1998).

With psychotherapy, the level of evidence similarly must be high. Although there are no formal Federal laws governing which psychotherapies can be introduced into practice, professional groups and experts in the field strive to assess the level of evidence in a given area through task forces, review articles, and other methods for evaluating the body of published studies on a topic. This Surgeon General’s report is replete with references to such evaluations. One of the most prominent series of evaluations was set in motion by a group within the American Psychological Association (APA), one of the main professional organizations of psychologists. Beginning in the mid-1990s, the APA’s Division of Clinical Psychology convened task forces with the objective of establishing which psychotherapies were of proven efficacy. To guide their evaluation, the first task force created a set of criteria that also was used or adapted by subsequent task forces. The first task force actually developed two sets of criteria: the first, and more rigorous, set of criteria was for Well-Established Treatments, while the other set was for Probably Efficacious Treatments (Chambless et al., 1996). For a psychotherapy to be well established, at least two experiments with group designs or similar types of studies must have been published to demonstrate efficacy. Chapters 3 through 5 of this report describe the findings of the task forces in relation to psychotherapies for children, adults, and older adults. Some types of psychotherapies that do not meet the criteria might be effective but may not have been studied sufficiently.

Another way of evaluating a collection of studies is through a formal statistical technique called a meta-analysis. A meta-analysis is a way of combining results from multiple studies. Its goal is to determine the size and consistency of the “effect” of a particular treatment or other intervention observed across the studies. The statistical technique makes the results of different studies comparable so that an overall “effect size” for the treatment can be identified. A meta-analysis determines if there is consistent evidence of a statistically significant effect of a specified treatment and estimates the size of the effect, according to widely accepted standards for a small, medium, or large effect.
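The combining step can be sketched with a fixed-effect, inverse-variance weighted average, which is one common approach among several and is not necessarily the method used in the analyses cited in this report. The effect sizes and variances below are invented for illustration, and the cutoffs of 0.2, 0.5, and 0.8 assume that the "widely accepted standards" are Cohen's conventions for standardized mean differences.

```python
import math


def pooled_effect(effects, variances):
    """Fixed-effect pooled estimate and its standard error.

    Each study is weighted by the inverse of its variance, so more
    precise studies count for more in the overall estimate.
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error


def describe(effect):
    """Label an effect size using Cohen's conventions (an assumption here)."""
    size = abs(effect)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Hypothetical standardized mean differences from three studies.
effects = [0.35, 0.62, 0.48]
variances = [0.04, 0.02, 0.05]

pooled, se = pooled_effect(effects, variances)
low, high = pooled - 1.96 * se, pooled + 1.96 * se  # approximate 95% interval

print(f"Pooled effect size: {pooled:.2f} ({describe(pooled)})")
print(f"Approximate 95% interval: {low:.2f} to {high:.2f}")
```

If the interval excludes zero, the pooled effect is statistically significant at roughly the 5 percent level. A random-effects model would be the usual alternative when the studies differ substantially from one another.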

3 The first phase is to establish safety (Phase I), while the latter two phases establish efficacy through small and then large-scale randomized controlled clinical trials (Phases II and III) (FDA, 1998).

