
What Works Clearinghouse


Key Items To Get Right When Conducting a Randomized Controlled Trial in Education
December 2005

1. Key Items to Get Right in Planning the Study


Decide on (i) the specific intervention to be evaluated, and (ii) the key outcomes to be measured. These should include, wherever possible, the ultimate outcomes the intervention seeks to affect.

For example, a study of a third-grade remedial reading program should, to the extent possible, evaluate the program’s effect on ultimate outcomes such as reading comprehension, and not just surrogate outcomes such as word attack or word identification skills. Similarly, a study of a middle-school substance-abuse prevention program should, wherever possible, evaluate the program’s effect on ultimate outcomes such as initiation of drug, alcohol, or tobacco use, and not just surrogate outcomes such as attitudes toward drugs. The reason is that improvements in surrogate outcomes (e.g., word attack/identification skills, attitudes toward drugs) may not always translate into improvements in the ultimate outcomes of interest (reading proficiency, reduction in drug use).

Decide whether the study should randomly assign individuals (e.g., students), or groups (e.g., classrooms or schools), to determine the intervention’s effect.

Random assignment of individuals is usually the most efficient and least expensive approach. However, it may be necessary to randomly assign groups rather than, or in addition to, individuals in situations such as the following:

  1. The intervention may have sizeable "spillover" effects on individuals other than those who receive it.

    For example, if there is good reason to believe that a school-based substance-abuse prevention program may produce sizeable reductions in drug use not only among the students in the program, but also among their peers within the school (through peer influence), it will probably be necessary to randomly assign whole schools to intervention and control groups to determine the program’s effect. A study that randomizes only individual students within a school will underestimate the program’s effect to the extent that the program also reduces drug use among control-group students in that school, because that spillover narrows the measured difference between the two groups.

    For interventions where this spillover effect is likely to be small, however, random assignment within a school—of individual students and/or of classrooms and teachers—may still be a viable approach, and the more cost-effective one.
     
  2. The intervention is delivered to groups such as classrooms or schools (e.g., a classroom curriculum or schoolwide reform program), and you want to distinguish the effect of the intervention from the effect of other group characteristics (e.g., quality of the classroom teacher).

    For example, in a study of a new classroom curriculum, classrooms in the sample will usually differ in two ways: (i) whether they use the new curriculum, and (ii) who is teaching the class. Therefore, if the study randomly assigns individual students to, say, two classrooms that use the curriculum and two that do not, it will not be able to distinguish the effect of the curriculum from the effect of other classroom characteristics, such as the quality of the teacher. Such a study will therefore probably need to randomly assign whole classrooms and teachers (a sufficient sample of each) to intervention and control groups, to ensure that the two groups are equivalent not only in student characteristics but also in classroom and teacher characteristics.

    For similar reasons, a study of a schoolwide reform program will probably need to randomly assign whole schools to intervention and control groups, to ensure that the two groups are equivalent not only in student characteristics but also school characteristics (e.g., teacher quality, average class size).
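
As a rough illustration of group-level random assignment, the sketch below shuffles a list of whole schools and splits it into intervention and control arms. The function name `assign_groups`, the school identifiers, and the seed value are all hypothetical and chosen only for this example; a real study would follow its own documented assignment procedure.

```python
import random


def assign_groups(group_ids, seed):
    """Randomly split whole groups (e.g., schools) into two arms.

    Every member of a group shares that group's assignment, which is
    what distinguishes this from individual-level random assignment.
    """
    rng = random.Random(seed)      # a fixed seed makes the assignment reproducible
    shuffled = list(group_ids)     # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2      # with an odd count, control gets the extra group
    return {
        "intervention": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }


# Hypothetical sample of 20 schools.
schools = [f"school_{i:02d}" for i in range(1, 21)]
arms = assign_groups(schools, seed=2005)
print(arms["intervention"])
print(arms["control"])
```

Recording the seed alongside the assignment list is one way to let others verify that the allocation was produced by the stated procedure.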

Conduct a statistical analysis to estimate the minimum number of individuals and/or groups to randomize in order to determine whether the intervention has a meaningful effect.1

The purpose of such an analysis—known as a "power" analysis—is to ensure that enough individuals or groups are randomized to be confident that the study will detect meaningful effects of the intervention should they exist. The analysis will require you to make a judgment about the minimum effect size you are seeking to detect—a judgment that you might base on such factors as (i) what previous studies suggest as the intervention’s likely effect size, and (ii) what effect size would justify the intervention’s cost, or make adoption of the intervention attractive to schools seeking a gain in student achievement or other outcomes of a certain magnitude. Useful resources for conducting a power analysis are referenced in the endnote.2
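
To make the idea concrete, here is a minimal sketch of the simplest case: individuals are randomly assigned to two equal-sized groups, and the outcome is compared with a two-sided test. It uses the standard normal-approximation formula n = 2(z_{1-alpha/2} + z_{power})^2 / d^2, where d is the minimum effect size in standard-deviation units. The function name `n_per_group` and the default values for significance level and power are assumptions made for illustration, not recommendations from this guide.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate individuals needed in EACH group.

    effect_size: minimum detectable effect in standard-deviation units.
    alpha:       two-sided significance level (assumed default).
    power:       chance of detecting the effect if it exists (assumed default).
    """
    z = NormalDist()                          # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)        # critical value for the two-sided test
    z_power = z.inv_cdf(power)                # quantile matching the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Smaller effects need much larger samples: halving the effect size
# roughly quadruples the required sample.
for d in (0.50, 0.25):
    print(d, n_per_group(d))
```

Because this is a normal approximation for individual-level assignment only, it tends to be a lower bound; designs that assign groups, or that use small samples, need the adjustments described next.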

It is important that the power analysis take into account key features of the study design, including:

  1. Whether individuals (e.g., students) and/or groups (e.g., classrooms or schools) will be randomly assigned.
     
  2. Whether the sample will be sorted into groups prior to randomization—such as high-achievers, average-achievers, and low-achievers—with the random assignment taking place within each group. (Such sorting is known as "stratification" or "blocking.")
     
  3. Whether the study intends its estimates of the intervention’s effect (i) to apply only to the sites in the study (e.g., schools), or (ii) to be generalizable to a larger population (e.g., all schools participating in a federal program). (The two approaches are known, respectively, as "fixed-effects" and "random-effects" models.)
     
  4. Whether, in analyzing study outcomes, statistical methods (e.g., multivariate regression analysis) will be used to increase the study’s ability to detect meaningful effects of the intervention.
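
The sketch below illustrates how two of these design features can change the required sample, under simplifying assumptions. For group assignment it applies the commonly used "design effect," 1 + (m - 1) x ICC, where m is the number of individuals per group and ICC is the intraclass correlation (the share of outcome variance that lies between groups). For regression adjustment it scales the requirement by (1 - R^2), where R^2 is the share of outcome variance explained by covariates; applying a single R^2 this way is a deliberate simplification, since in practice covariates can explain different amounts of variance at the group and individual levels. All function names, parameter names, and input values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def design_effect(cluster_size, icc):
    """Variance inflation from assigning whole groups instead of individuals."""
    return 1 + (cluster_size - 1) * icc


def clusters_per_arm(effect_size, cluster_size, icc,
                     r_squared=0.0, alpha=0.05, power=0.80):
    """Approximate number of groups needed in EACH arm.

    Simplifying assumptions (illustrative only): equal group sizes, a
    normal approximation, and one R^2 applied to the total variance.
    """
    z = NormalDist()
    # Individuals per arm if individuals were assigned independently.
    n_simple = 2 * (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2 / effect_size ** 2
    # Inflate for clustering, then deflate for variance explained by covariates.
    n_adjusted = n_simple * design_effect(cluster_size, icc) * (1 - r_squared)
    return ceil(n_adjusted / cluster_size)


# Hypothetical inputs: classrooms of 25 students, ICC of 0.15, effect size 0.25.
print(clusters_per_arm(0.25, cluster_size=25, icc=0.15))
print(clusters_per_arm(0.25, cluster_size=25, icc=0.15, r_squared=0.5))
```

With few groups per arm, the normal approximation understates the requirement, so figures like these are best treated as a starting point for a fuller analysis using the resources cited in the endnote.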