What Works Clearinghouse


WWC Tutorial on Mismatch between Unit of Assignment and Unit of Analysis

The mismatch problem of concern here occurs when the units of assignment do not match the units of analysis in a study of an intervention and this feature of the study’s design is ignored in the study’s data analysis. For instance, a study may have assigned entire classrooms (the unit of assignment) to the intervention and control conditions but analyzed the data at the individual student level, rather than at the classroom level or at both the classroom and student levels. Such analyses are common, but they are incorrect on statistical grounds.

This kind of mismatch produces statistics that appear more precise than they actually are, because students are treated as independent units when they are not. By ignoring the design effect due to the clustering of students within classrooms (Kish, 1965), such analyses are likely to yield misleadingly high levels of statistical significance (p values that are too small) and misleadingly narrow confidence intervals for an observed difference between intervention and control conditions. In a well-executed randomized trial, for example, the estimate of the difference will be statistically unbiased, but statements about statistical tests of hypotheses and about one’s confidence in the results may not be correct.
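As a rough illustration of the design effect, assume equal-sized clusters of m students and an intraclass correlation rho. Kish's design effect is then 1 + (m - 1) * rho, and it indicates how far the nominal number of students overstates the information in the data. The Python sketch below is only an illustration under those assumptions; the function names are hypothetical and do not come from the tutorial.

```python
import math


def design_effect(cluster_size, icc):
    """Kish design effect for equal-sized clusters: 1 + (m - 1) * icc."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_total / design_effect(cluster_size, icc)


def se_inflation(cluster_size, icc):
    """Factor by which a student-level standard error is too small."""
    return math.sqrt(design_effect(cluster_size, icc))


if __name__ == "__main__":
    # Hypothetical numbers: 200 students in classes of 20, icc = 0.05.
    print(design_effect(20, 0.05))
    print(effective_sample_size(200, 20, 0.05))
    print(se_inflation(20, 0.05))
```

With these hypothetical inputs the design effect is 1.95, so 200 clustered students carry roughly the information of about 103 independent students, and a standard error computed as if students were independent is too small by a factor of about 1.4.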

In particular, a difference found to be statistically significant under an incorrect mismatch analysis could, under a correct analysis, turn out to be not statistically significant. A difference found to be not statistically significant under an improper mismatch analysis, on the other hand, would generally remain non-significant under a correct analysis.

Calculating effect sizes (including standardized effect sizes), confidence intervals, and p values for statistical tests correctly when groups are the units of assignment requires information that is often not available in original reports in which the study authors analyzed the data incorrectly. In particular, to properly analyze the data, one needs to (a) know the intraclass correlation, which represents the degree to which individuals within a group are dependent on one another, or (b) employ methods such as hierarchical linear modeling that take this dependence into account. The intraclass correlation is rarely reported in studies with the mismatch problem, and hierarchical linear modeling and related approaches usually require access to the original micro-record data and the resources to reanalyze them. These are often not available.
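To make option (a) concrete, the sketch below shows one simple way a reviewer might adjust a reported test statistic when only an assumed intraclass correlation is available. It is a large-sample approximation, assuming equal-sized clusters and a normally distributed test statistic: the reported statistic is divided by the square root of the design effect. It is not the exact small-sample correction discussed in the technical literature, and the function names and input values are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def adjust_for_clustering(z_reported, cluster_size, assumed_icc):
    """Shrink a student-level z statistic by sqrt(design effect).

    assumed_icc is an assumption supplied by the analyst, since the
    original report typically does not provide it.
    """
    deff = 1.0 + (cluster_size - 1) * assumed_icc
    z_adjusted = z_reported / math.sqrt(deff)
    return z_adjusted, two_sided_p(z_adjusted)


if __name__ == "__main__":
    # Hypothetical report: z = 2.5 from a student-level analysis,
    # classes of 20 students, assumed icc = 0.10.
    z_adj, p_adj = adjust_for_clustering(2.5, 20, 0.10)
    print(two_sided_p(2.5), z_adj, p_adj)
```

In this hypothetical case a result that looks clearly significant at the student level (p of roughly .01) is no longer significant at the .05 level after the adjustment (p of roughly .14), which is the pattern described above.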

Example: Consider a study in which 10 classrooms, each containing 20 students, were randomly allocated to intervention and control conditions. Classes were then the units of assignment, with five classes in each condition. If students were independent of one another (i.e., intraclass correlation = 0), a statistical test at the nominal .05 level that used students as the units of analysis would have an actual probability of rejecting a true null hypothesis of .05. If the intraclass correlation among students was .05, the actual probability of rejection would be .16. If the intraclass correlation was .10, the probability of rejection would be .26.
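The rejection probabilities in this example can be approximated with a short calculation. The sketch below assumes equal-sized classes and a large-sample normal approximation: the student-level test statistic has a standard deviation of sqrt(design effect) rather than 1, so the nominal critical value of 1.96 is effectively too small. The function name is hypothetical. Because this is only an approximation, its results are close to, but not necessarily identical with, the figures quoted above (about .16 and .25 for intraclass correlations of .05 and .10).

```python
import math


def actual_rejection_rate(cluster_size, icc, z_crit=1.959964):
    """Approximate true Type I error rate of a nominal two-sided .05 test
    that ignores clustering (normal approximation, equal cluster sizes)."""
    deff = 1.0 + (cluster_size - 1) * icc
    # The naive statistic is sqrt(deff) times too variable, so the
    # critical value is effectively divided by sqrt(deff).
    effective_cutoff = z_crit / math.sqrt(deff)
    return math.erfc(effective_cutoff / math.sqrt(2.0))


if __name__ == "__main__":
    for icc in (0.0, 0.05, 0.10):
        print(icc, round(actual_rejection_rate(20, icc), 3))
```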

Ignoring the intraclass correlation, therefore, will lead to specious declarations of statistical significance. The problem was recognized in the 1980s by Wolins (1982) among others, but its importance has become clear as a consequence of more recent work. See Hedges (2005) for technical detail and discussion of contemporary work.
