
What Works Clearinghouse


Appendices


Appendix A1.1 Study characteristics: Preschool Curriculum Evaluation Research (PCER) Consortium, 2008 (randomized controlled trial)

Characteristic Description
Study citation Preschool Curriculum Evaluation Research (PCER) Consortium. (2008). Chapter 5. Curiosity Corner: Success for All Foundation. In Effects of Preschool Curriculum Programs on School Readiness (pp. 75–83). Washington, DC: National Center for Education Research, Institute of Education Sciences, U.S. Department of Education.
Participants In this study, 18 preschools were randomly assigned to the intervention (10 schools) or comparison (8 schools) condition. Prior to random assignment, schools were sorted into blocks based on several characteristics, including teacher experience, school location, and state report card score; random assignment occurred within each block. From these schools, 31 preschool classrooms participated in the study (14 intervention classrooms and 17 comparison classrooms). Participants included 215 preschool-age children whose parents consented to their participation in the study. At baseline, children were an average of 4.7 years old, half were male, half were African-American, and 14% were reported as having a disability. Although the intervention and comparison groups were similar in race and disability status, the intervention group had a higher proportion of boys (61%) than the comparison group (38%), a difference that was statistically significant. Attrition from the analysis sample (children with parent consent) was low: 2% at baseline, 5% at the end-of-preschool posttest, and 10% at the end-of-kindergarten follow-up. Response rates varied by measure but were comparable across the intervention and comparison groups.
Setting The study was conducted in 18 schools (31 classrooms) in Florida, Kansas, and New Jersey.
Intervention Intervention group children participated in Curiosity Corner. Success for All (SFA) trainers visited each classroom at least three times during the year and rated its implementation using an SFA implementation measure. Fidelity was rated on a four-point scale ranging from “Not at all” (0) to “High” (3). The average fidelity score of the intervention classrooms was 2.0.
Comparison The comparison condition varied across schools. Comparison schools in Florida primarily used the Creative Curriculum. The Kansas comparison schools used a blend of the Preschool and Language Stimulation curriculum and the Animated Literacy curriculum. Comparison schools in New Jersey used a teacher-developed curriculum. Comparison classrooms were visited twice a year by the trainers and rated with the same implementation measure used for the intervention classrooms. The average fidelity score of the comparison classrooms was 1.9.
Primary outcomes and measurement The primary outcome domains assessed were the children’s oral language, print knowledge, phonological processing, and math. Oral language was assessed with the Peabody Picture Vocabulary Test-III (PPVT-III) and the Test of Language Development-Primary III (TOLD-P:3) Grammatic Understanding subtest. Print knowledge was assessed with the Test of Early Reading Ability-III (TERA-3), the Woodcock-Johnson III (WJ III) Letter-Word Identification subtest, and the WJ III Spelling subtest. Phonological processing was assessed with the Preschool Comprehensive Test of Phonological and Print Processing (Pre-CTOPPP) Elision subtest. Math was assessed with the WJ III Applied Problems subtest, the Child Math Assessment-Abbreviated (CMA-A), and the Building Blocks, Shape Composition task. For a more detailed description of these outcome measures, see Appendices A2.1–A2.5.
Staff/teacher training Success for All staff provided an initial training session for the intervention teachers and ongoing implementation support, including three visits a year to conduct observations and provide feedback.


Appendix A1.2 Study characteristics: Chambers, Chamberlain, Hurley, and Slavin, 2001 (quasi-experimental design)

Characteristic Description
Study citation Chambers, B., Chamberlain, A., Hurley, E. A., & Slavin, R. E. (2001). Curiosity Corner: Enhancing preschoolers’ language abilities through comprehensive reform. Paper presented at the Annual Meeting of the American Educational Research Association, Seattle, WA, April 2001.
Participants1 The study began with 448 low-income preschool children who ranged in age from two years, seven months to four years, eleven months. At posttest, 316 children were included in the study, with analysis samples ranging from 311 to 315. The three-year-olds were from private early childhood centers (n = 169), and the four-year-olds were from public preschools (n = 147). In the final sample, 68% of the children were African-American, 16% Caucasian, and 11% Hispanic; 49% were female. Eight preschools (public and private) were assigned to the Curiosity Corner intervention group, and eight preschools (public and private) matched on demographic characteristics were used as the comparison group.
Setting The study took place in 16 preschools (a mix of public and private) in four high-poverty, urban school districts in New Jersey. All of the preschools were in Abbott districts and working to meet Abbott guidelines.
Intervention The intervention group children participated in Curiosity Corner during the pilot year of the curriculum. Curiosity Corner was designed with 38 weekly thematic units. Additional information on duration, frequency, and intensity of implementation was not reported.
Comparison The comparison group children participated in the regular early childhood curriculum at their preschool centers.
Primary outcomes and measurement The primary outcome domains were children’s oral language and cognition. The study used three subtests of a standardized test (the Mullen Scales of Early Learning, American Guidance Services Edition): expressive language, receptive language, and visual reception. The study also used the Early Childhood Environment Rating Scale-Revised (ECERS-R) to evaluate classroom quality, but that measure is not included in this WWC review because it is not relevant to the topic of this review. For a more detailed description of these outcome measures, see Appendices A2.1 and A2.4.
Staff/teacher training The program provided teachers with detailed lesson instructions in the teacher’s manual and materials for instructional activities. Teachers, teaching assistants, and administrators were trained in two-day initial training sessions, followed by six in-class visits by a Success for All Foundation (SFA) trainer. In addition, teachers were observed, mentored, and supported by Curiosity Corner coaches from the school districts, who were trained by SFA staff over a two-year period. Coaches also offered workshops to help teachers implement the curriculum.
1 Information on total sample size and the number of schools in each condition was provided by the study authors upon WWC request.


Appendix A2.1 Outcome measures for the oral language domain

Outcome measure Description
Mullen Scales of Early Learning (MSEL) Expressive Language Scale A scale from a standardized measure of children’s expressive language skills, such as speaking and forming language (as cited in Chambers et al., 2001).
Mullen Scales of Early Learning Receptive Language Scale A scale from a standardized measure of children’s receptive language skills, such as auditory organization, sequencing, and use of spatial concepts (as cited in Chambers et al., 2001).
Peabody Picture Vocabulary Test-3rd Edition (PPVT-III) A standardized measure of children’s receptive vocabulary where children show understanding of a spoken word by pointing to a picture that best represents the meaning (as cited in PCER Consortium, 2008).
Test of Language Development-Primary III (TOLD-P:3) Grammatic Understanding subtest A standardized measure of children’s ability to comprehend the meaning of sentences by selecting pictures that most accurately represent the sentence (as cited in PCER Consortium, 2008).


Appendix A2.2 Outcome measures for the print knowledge domain

Outcome measure Description
Test of Early Reading Ability III (TERA-3) A standardized measure of children’s developing reading skills with three subtests: alphabet, conventions, and meaning (as cited in PCER Consortium, 2008).
Woodcock-Johnson III (WJ III) Letter-Word Identification subtest A standardized measure of identification of letters and reading of words (as cited in PCER Consortium, 2008).
Woodcock-Johnson III Spelling subtest A standardized measure that assesses children’s prewriting skills, such as drawing lines, tracing, and writing letters (as cited in PCER Consortium, 2008).


Appendix A2.3 Outcome measures for the phonological processing domain

Outcome measure Description
Preschool Comprehensive Test of Phonological and Print Processing (Pre-CTOPPP), Elision subtest A measure of children’s ability to identify and manipulate sounds in spoken words, using word prompts and picture plates for the first nine items and word prompts only for later items (as cited in PCER Consortium, 2008).


Appendix A2.4 Outcome measures for the cognition domain

Outcome measure Description
Mullen Scales of Early Learning Visual Reception Scale A scale from a standardized measure of children’s cognitive ability to process visual patterns (as cited in Chambers et al., 2001).


Appendix A2.5 Outcome measures for the math domain

Outcome measure Description
Woodcock-Johnson III (WJ III) Applied Problems subtest A standardized measure of children’s ability to solve numerical and spatial problems, presented verbally with accompanying pictures of objects (as cited in PCER Consortium, 2008).
Child Math Assessment-Abbreviated (CMA-A) Composite Score The average of four subscales: (1) solving addition and subtraction problems using visible objects, (2) constructing a set of objects equal in number to a given set, (3) recognizing shapes, and (4) copying a pattern using objects that vary in color and identity from the model pattern (as cited in PCER Consortium, 2008).
Building Blocks, Shape Composition task Modified for PCER from the Building Blocks assessment tools. Children use blocks to fill in a puzzle and are assessed on whether they fill the puzzle without gaps or hangovers (as cited in PCER Consortium, 2008).


Appendix A3.1 Summary of study findings included in the rating for the oral language domain1

In the table below, the group mean outcomes2 (standard deviations)3 are the authors' findings from the study; the mean difference4, effect size5, statistical significance6, and improvement index7 are WWC calculations.

Outcome measure | Study sample | Sample size (schools/students) | Curiosity Corner group | Comparison group | Mean difference4 (Curiosity Corner – comparison) | Effect size5 | Statistical significance6 (at α = 0.05) | Improvement index7
Chambers et al., 2001 (quasi-experimental design)8
MSEL Expressive Language Scale | 3-year-olds | 16/1679 | 39.26 (5.04) | 37.54 (4.30) | 1.72 | 0.36 | ns | +14
MSEL Expressive Language Scale | 4-year-olds | 12/1469 | 43.58 (4.55) | 43.29 (4.01) | 0.29 | 0.07 | ns | +3
MSEL Receptive Language Scale | 3-year-olds | 16/1689 | 37.76 (4.40) | 37.52 (4.68) | 0.24 | 0.05 | ns | +2
MSEL Receptive Language Scale | 4-year-olds | 12/1479 | 43.10 (4.32) | 42.85 (3.78) | 0.25 | 0.06 | ns | +2
Average for oral language (Chambers et al., 2001)10 | | | | | | 0.13 | ns | +5
PCER Consortium, 2008 (randomized controlled trial)8
PPVT-III | Preschoolers | 18/201 | nr | nr | –0.17 | –0.01 | ns | 0
TOLD-P:3 Grammatic Understanding subtest | Preschoolers | 18/199 | nr | nr | –0.38 | –0.08 | ns | –3
Average for oral language (PCER Consortium, 2008)10 | | | | | | –0.05 | ns | –2
Domain average for oral language across all studies10 | | | | | | 0.04 | na | +2

ns = not statistically significant
na = not applicable
nr = not reported
MSEL = Mullen Scales of Early Learning
PPVT-III = Peabody Picture Vocabulary Test-III
TOLD-P:3 = Test of Language Development-Primary, Third Edition

1 This appendix reports findings considered for the effectiveness rating and the average improvement indices for the oral language domain. Follow-up findings from PCER Consortium (2008) are not included in these ratings but are reported in Appendix A4.1.
2 In the case of Chambers et al. (2001), posttest means are covariate-adjusted means. Chambers et al. (2001) included age and PPVT-III scores at pretest as covariates in the analysis.
3 The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
4 Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group. In the case of PCER Consortium (2008), the mean differences are covariate-adjusted.
5 For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations. In the case of PCER Consortium (2008), the WWC used the effect sizes reported by the study authors.
6 Statistical significance indicates whether a difference as large as the one observed would be unlikely to occur by chance if there were no real difference between the groups (here, at the α = 0.05 level).
7 The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
8 The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. In the case of Chambers et al. (2001), a correction for clustering was needed, so the significance levels may differ from those reported in the original study. In the case of PCER Consortium (2008), no corrections were needed because the analysis corrected for clustering by using HLM and no impacts were statistically significant.
9 The sample size of schools was provided by the study authors at WWC request.
10 The WWC-computed average effect sizes for each study and for the domain across studies are simple averages rounded to two decimal places. The average improvement indices are calculated from the average effect sizes.
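The arithmetic behind the effect size, improvement index, and domain-average footnotes can be sketched in a few lines. This is an illustrative sketch, not WWC code: the pooled-standard-deviation formula below assumes roughly equal group sizes, and the WWC's published values additionally apply a small-sample (Hedges' g) correction with the exact group sizes. The improvement index is the percentile rank of the average intervention-group student in the comparison distribution, minus 50.

```python
import math

def pooled_sd(sd_i, sd_c):
    """Pooled standard deviation, assuming roughly equal group sizes."""
    return math.sqrt((sd_i ** 2 + sd_c ** 2) / 2)

def effect_size(mean_diff, sd_i, sd_c):
    """Standardized mean difference: adjusted mean difference / pooled SD."""
    return mean_diff / pooled_sd(sd_i, sd_c)

def improvement_index(es):
    """Percentile rank of the average intervention-group student in the
    comparison distribution, minus 50; ranges from -50 to +50."""
    percentile = 0.5 * (1.0 + math.erf(es / math.sqrt(2)))  # standard normal CDF
    return round((percentile - 0.5) * 100)

# MSEL Expressive Language Scale, 3-year-olds (Appendix A3.1):
g = effect_size(1.72, 5.04, 4.30)   # ~0.37 here; the WWC's 0.36 also reflects
                                    # exact group sizes and a small-sample correction
print(improvement_index(0.36))      # 14, matching the +14 in the table

# Domain average: simple average of the two study-average effect sizes
domain_avg = (0.13 + (-0.05)) / 2
print(round(domain_avg, 2), improvement_index(domain_avg))  # 0.04 2
```

Applied to the other A3.1 rows, the same function reproduces the reported indices (effect size 0.07 gives +3, –0.08 gives –3), since the improvement index is a deterministic transform of the effect size.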


Appendix A3.2 Summary of study findings included in the rating for the print knowledge domain1

In the table below, the group mean outcomes (standard deviations)2 are the authors' findings from the study; the mean difference3, effect size4, statistical significance5, and improvement index6 are WWC calculations.

Outcome measure | Study sample | Sample size (schools/students) | Curiosity Corner group | Comparison group | Mean difference3 (Curiosity Corner – comparison) | Effect size4 | Statistical significance5 (at α = 0.05) | Improvement index6
PCER Consortium, 2008 (randomized controlled trial)7
TERA-3 | Preschoolers | 18/200 | nr | nr | 0.83 | 0.10 | ns | +4
WJ III Letter-Word Identification subtest | Preschoolers | 18/177 | nr | nr | 2.42 | 0.09 | ns | +4
WJ III Spelling subtest | Preschoolers | 18/194 | nr | nr | 0.97 | 0.04 | ns | +2
Domain average for print knowledge8 | | | | | | 0.08 | na | +3

ns = not statistically significant
na = not applicable
nr = not reported
TERA-3 = Test of Early Reading Ability
WJ III = Woodcock-Johnson III

1 This appendix reports findings considered for the effectiveness rating and the average improvement indices for the print knowledge domain. Follow-up findings from the same study are not included in these ratings but are reported in Appendix A4.2.
2 The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
3 Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group. In the case of PCER Consortium (2008), the mean differences are covariate-adjusted.
4 For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations. In the case of PCER Consortium (2008), the WWC used the effect sizes reported by the study authors.
5 Statistical significance indicates whether a difference as large as the one observed would be unlikely to occur by chance if there were no real difference between the groups (here, at the α = 0.05 level).
6 The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
7 The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. In the case of PCER Consortium (2008), no corrections were needed because the analysis corrected for clustering by using HLM and no impacts were statistically significant.
8 This row provides the study average, which in this instance is also the domain average. The WWC-computed domain average effect size is a simple average rounded to two decimal places. The domain improvement index is calculated from the average effect size.


Appendix A3.3 Summary of study findings included in the rating for the phonological processing domain1

In the table below, the group mean outcomes (standard deviations)2 are the authors' findings from the study; the mean difference3, effect size4, statistical significance5, and improvement index6 are WWC calculations.

Outcome measure | Study sample | Sample size (schools/students) | Curiosity Corner group | Comparison group | Mean difference3 (Curiosity Corner – comparison) | Effect size4 | Statistical significance5 (at α = 0.05) | Improvement index6
PCER Consortium, 2008 (randomized controlled trial)7
Pre-CTOPPP Elision subtest | Preschoolers | 18/204 | nr | nr | –0.04 | 0.18 | ns | +7
Domain average for phonological processing8 | | | | | | 0.18 | na | +7

ns = not statistically significant
na = not applicable
nr = not reported
Pre-CTOPPP = Preschool Comprehensive Test of Phonological and Print Processing

1 This appendix reports findings considered for the effectiveness rating and the average improvement indices for the phonological processing domain. Follow-up findings from the same study are not included in these ratings but are reported in Appendix A4.3.
2 The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
3 Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group. In the case of PCER Consortium (2008), the mean differences are covariate-adjusted.
4 For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations. In the case of PCER Consortium (2008), the WWC used the effect sizes reported by the study authors.
5 Statistical significance indicates whether a difference as large as the one observed would be unlikely to occur by chance if there were no real difference between the groups (here, at the α = 0.05 level).
6 The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
7 The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. In the case of PCER Consortium (2008), no corrections for clustering or multiple comparisons were needed because the analysis corrected for clustering by using HLM and no impacts were statistically significant.
8 This row provides the study average, which in this instance is also the domain average. The WWC-computed domain average effect size is a simple average rounded to two decimal places. The domain improvement index is calculated from the average effect size.


Appendix A3.4 Summary of study findings included in the rating for the cognition domain1

In the table below, the group mean outcomes2 (standard deviations)3 are the authors' findings from the study; the mean difference4, effect size5, statistical significance6, and improvement index7 are WWC calculations.

Outcome measure | Study sample | Sample size (schools/students) | Curiosity Corner group | Comparison group | Mean difference4 (Curiosity Corner – comparison) | Effect size5 | Statistical significance6 (at α = 0.05) | Improvement index7
Chambers et al., 2001 (quasi-experimental design)8
MSEL Visual Reception Scale | 3-year-olds | 16/1659 | 42.32 (3.54) | 42.66 (4.04) | –0.34 | –0.09 | ns | –4
MSEL Visual Reception Scale | 4-year-olds | 12/1469 | 45.49 (3.20) | 45.61 (3.20) | –0.12 | –0.04 | ns | –1
Domain average for cognition10 | | | | | | –0.06 | na | –3

ns = not statistically significant
na = not applicable
MSEL = Mullen Scales of Early Learning

1 This appendix reports findings considered for the effectiveness rating and the average improvement indices for the cognition domain.
2 In the case of Chambers et al. (2001), posttest means are covariate-adjusted means. Chambers et al. (2001) included age and PPVT-III scores at pretest as covariates in the analysis.
3 The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
4 Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group.
5 For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations.
6 Statistical significance indicates whether a difference as large as the one observed would be unlikely to occur by chance if there were no real difference between the groups (here, at the α = 0.05 level).
7 The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
8 The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. In the case of Chambers et al. (2001), a correction for clustering was needed, so the significance levels may differ from those reported in the original study.
9 The number of schools was provided by study authors at WWC request. The sample size for the comparison group of 4-year-olds reported in the original study was incorrect and the correct sample size was provided by the study authors.
10 This row provides the study average, which in this instance is also the domain average. The WWC-computed domain average effect size is a simple average rounded to two decimal places. The domain improvement index is calculated from the average effect size.
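The clustering correction mentioned in footnote 8 deflates a student-level t statistic to account for students being nested in schools or classrooms. The sketch below uses the deflation factor described in WWC technical documentation; treat the exact formula, the 0.20 default intraclass correlation, and the illustrative numbers as assumptions rather than the WWC's actual computation for this study.

```python
import math

def clustering_corrected_t(t, n_total, n_clusters, icc=0.20):
    """Deflate a student-level t statistic for clustering.

    Assumed form of the WWC correction:
        t_corrected = t * sqrt(((N - 2) - 2(m - 1)*icc) /
                               ((N - 2) * (1 + (m - 1)*icc)))
    where N is the total student sample, m the average cluster size, and
    icc the intraclass correlation (0.20 is a commonly used default for
    achievement outcomes)."""
    m = n_total / n_clusters               # average cluster size
    num = (n_total - 2) - 2 * (m - 1) * icc
    den = (n_total - 2) * (1 + (m - 1) * icc)
    return t * math.sqrt(num / den)

# Illustration with a sample structure like this study's (16 schools,
# 167 students): a nominally significant t of 2.1 drops well below the
# usual 1.96 cutoff once clustering is accounted for.
print(round(clustering_corrected_t(2.1, 167, 16), 2))  # 1.22
```

This is why significance levels corrected by the WWC can differ from those in the original study: ignoring clustering overstates the effective sample size, so uncorrected tests are too liberal.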


Appendix A3.5 Summary of study findings included in the rating for the math domain1

In the table below, the group mean outcomes (standard deviations)2 are the authors' findings from the study; the mean difference3, effect size4, statistical significance5, and improvement index6 are WWC calculations.

Outcome measure | Study sample | Sample size (schools/students) | Curiosity Corner group | Comparison group | Mean difference3 (Curiosity Corner – comparison) | Effect size4 | Statistical significance5 (at α = 0.05) | Improvement index6
PCER Consortium, 2008 (randomized controlled trial)7
WJ III Applied Problems | Preschoolers | 18/180 | nr | nr | 1.90 | 0.10 | ns | +4
CMA-A Composite | Preschoolers | 18/204 | nr | nr | 0.00 | 0.01 | ns | 0
Shape Composition | Preschoolers | 18/200 | nr | nr | 0.15 | 0.16 | ns | +6
Domain average for math8 | | | | | | 0.09 | na | +4

ns = not statistically significant
na = not applicable
nr = not reported
WJ III = Woodcock-Johnson III
CMA-A = Child Math Assessment-Abbreviated

1 This appendix reports findings considered for the effectiveness rating and the average improvement indices for the math domain. Follow-up findings from the same study are not included in these ratings but are reported in Appendix A4.4.
2 The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
3 Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group. In the case of PCER Consortium (2008), the mean differences are covariate-adjusted.
4 For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations. In the case of PCER Consortium (2008), the WWC used the effect sizes reported by the authors.
5 Statistical significance indicates whether a difference as large as the one observed would be unlikely to occur by chance if there were no real difference between the groups (here, at the α = 0.05 level).
6 The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
7 The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools and for multiple comparisons. For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. In the case of PCER Consortium (2008), no corrections were needed because the analysis corrected for clustering by using HLM and no impacts were statistically significant.
8 This row provides the study average, which in this instance is also the domain average. The WWC-computed domain average effect size is a simple average rounded to two decimal places. The domain improvement index is calculated from the average effect size.


Appendix A4.1 Summary of kindergarten follow-up findings for the oral language domain1

In the table below, the group mean outcomes (standard deviations)2 are the authors' findings from the study; the mean difference3, effect size4, statistical significance5, and improvement index6 are WWC calculations.

Outcome measure | Study sample | Sample size (schools/students) | Curiosity Corner group | Comparison group | Mean difference3 (Curiosity Corner – comparison) | Effect size4 | Statistical significance5 (at α = 0.05) | Improvement index6
PCER Consortium, 2008 (randomized controlled trial)7
PPVT-III | Kindergarteners | 69/189 | nr | nr | 2.42 | 0.14 | ns | +6
TOLD-P:3 Grammatic Understanding subtest | Kindergarteners | 69/190 | nr | nr | 0.74 | 0.15 | ns | +6

ns = not statistically significant
nr = not reported
PPVT-III = Peabody Picture Vocabulary Test-III
TOLD-P:3 = Test of Language Development-Primary, Third Edition

1 This appendix presents follow-up findings for measures that fall in the oral language domain. Posttest scores for preschoolers were used for rating purposes and are presented in Appendix A3.1.
2 The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
3 Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group. In the case of PCER Consortium (2008), the mean differences are covariate-adjusted.
4 For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations. In the case of PCER Consortium (2008), the WWC used the effect sizes reported by the authors.
5 Statistical significance indicates whether a difference as large as the one observed would be unlikely to occur by chance if there were no real difference between the groups (here, at the α = 0.05 level).
6 The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
7 The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools (corrections for multiple comparisons were not done for findings not included in the overall intervention rating). For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. In the case of PCER Consortium (2008), no correction for clustering was needed. The WWC does not make corrections for multiple comparisons for follow-up findings.


Appendix A4.2 Summary of kindergarten follow-up findings for the print knowledge domain1

In the table below, the group mean outcomes (standard deviations)2 are the authors' findings from the study; the mean difference3, effect size4, statistical significance5, and improvement index6 are WWC calculations.

Outcome measure | Study sample | Sample size (schools/students) | Curiosity Corner group | Comparison group | Mean difference3 (Curiosity Corner – comparison) | Effect size4 | Statistical significance5 (at α = 0.05) | Improvement index6
PCER Consortium, 2008 (randomized controlled trial; kindergarten follow-up)7
TERA-3 | Kindergarteners | 69/188 | nr | nr | 3.50 | 0.43 | Statistically significant | +17
WJ III Letter-Word Identification subtest | Kindergarteners | 69/189 | nr | nr | 11.26 | 0.43 | Statistically significant | +17
WJ III Spelling subtest | Kindergarteners | 69/182 | nr | nr | 5.60 | 0.20 | ns | +8

ns = not statistically significant
nr = not reported
TERA-3 = Test of Early Reading Ability
WJ III = Woodcock-Johnson III

1 This appendix presents follow-up findings for measures that fall in the print knowledge domain. Posttest scores for preschoolers were used for rating purposes and are presented in Appendix A3.2.
2 The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
3 Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group. In the case of PCER Consortium (2008), the mean differences are covariate-adjusted.
4 For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations.
5 Statistical significance indicates whether a difference as large as the one observed would be unlikely to occur by chance if there were no real difference between the groups (here, at the α = 0.05 level).
6 The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
7 The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools (corrections for multiple comparisons were not done for findings not included in the overall intervention rating). For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. In the case of PCER Consortium (2008), no correction for clustering was needed. The WWC does not make corrections for multiple comparisons for follow-up findings.


Appendix A4.3 Summary of kindergarten follow-up findings for the phonological processing domain1

Mean outcomes (with standard deviations) are the authors' findings from the study; the mean difference, effect size, statistical significance, and improvement index are WWC calculations.

Outcome measure | Study sample | Sample size (schools/students) | Curiosity Corner group: mean (SD)2 | Comparison group: mean (SD)2 | Mean difference3 (Curiosity Corner - comparison) | Effect size4 | Statistical significance5 (at α = 0.05) | Improvement index6

PCER Consortium, 2008 (randomized controlled trial; kindergarten follow-up)7
CTOPP Elision subtest | Kindergarteners | 69/193 | nr | nr | 0.94 | 0.25 | ns | +10

ns = not statistically significant
nr = not reported
CTOPP = Comprehensive Test of Phonological Processing

1 This appendix presents follow-up findings for measures that fall in the phonological processing domain. Posttest scores for preschoolers were used for rating purposes and are presented in Appendix A3.3.
2 The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
3 Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group. In the case of PCER Consortium (2008), the mean differences are covariate-adjusted.
4 For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations. In the case of PCER Consortium (2008), the WWC used the effect sizes reported by the study authors. The effect size for the CTOPP measure was calculated using an ANCOVA model and is not comparable to the effect sizes for other outcomes in PCER, which were calculated with repeated measures models.
5 Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups.
6 The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
7 The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools (corrections for multiple comparisons were not done for findings not included in the overall intervention rating). For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. In the case of PCER Consortium (2008), no correction for clustering was needed. The WWC does not make corrections for multiple comparisons for follow-up findings.


Appendix A4.4 Summary of kindergarten follow-up findings for the math domain1

Mean outcomes (with standard deviations) are the authors' findings from the study; the mean difference, effect size, statistical significance, and improvement index are WWC calculations.

Outcome measure | Study sample | Sample size (schools/students) | Curiosity Corner group: mean (SD)2 | Comparison group: mean (SD)2 | Mean difference3 (Curiosity Corner - comparison) | Effect size4 | Statistical significance5 (at α = 0.05) | Improvement index6

PCER Consortium, 2008 (randomized controlled trial; kindergarten follow-up)7
WJ III Applied Problems | Kindergarteners | 69/188 | nr | nr | 5.11 | 0.26 | ns | +10
CMA-A Composite | Kindergarteners | 69/194 | nr | nr | –0.01 | –0.05 | ns | –2
Shape Composition | Kindergarteners | 69/194 | nr | nr | 0.31 | 0.32 | ns | +13

ns = not statistically significant
nr = not reported
WJ III = Woodcock-Johnson III
CMA-A = Child Math Assessment-Abbreviated

1 This appendix presents follow-up findings for measures that fall in the math domain. Posttest scores for preschoolers were used for rating purposes and are presented in Appendix A3.5.
2 The standard deviation across all students in each group shows how dispersed the participants’ outcomes are: a smaller standard deviation on a given measure would indicate that participants had more similar outcomes.
3 Positive differences and effect sizes favor the intervention group; negative differences and effect sizes favor the comparison group. In the case of PCER Consortium (2008), the mean differences are covariate-adjusted.
4 For an explanation of the effect size calculation, see Technical Details of WWC-Conducted Computations. In the case of PCER Consortium (2008), the WWC used the effect sizes reported by the study authors.
5 Statistical significance is the probability that the difference between groups is a result of chance rather than a real difference between the groups.
6 The improvement index represents the difference between the percentile rank of the average student in the intervention condition and that of the average student in the comparison condition. The improvement index can take on values between –50 and +50, with positive numbers denoting results favorable to the intervention group.
7 The level of statistical significance was reported by the study authors or, where necessary, calculated by the WWC to correct for clustering within classrooms or schools (corrections for multiple comparisons were not done for findings not included in the overall intervention rating). For an explanation about the clustering correction, see the WWC Tutorial on Mismatch. For the formulas the WWC used to calculate statistical significance, see Technical Details of WWC-Conducted Computations. In the case of PCER Consortium (2008), no correction for clustering was needed. The WWC does not make corrections for multiple comparisons for follow-up findings.
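Footnote 7's clustering correction can be illustrated. When outcomes are analyzed at the student level but assignment happened by school or classroom, the WWC deflates the t statistic using the intraclass correlation (ICC). The sketch below follows the general form of the handbook adjustment, with an assumed default ICC of 0.20 for achievement outcomes; the function name and defaults are illustrative, not the authoritative formula:

```python
import math

def cluster_adjusted_t(t: float, n_students: int, n_clusters: int,
                       icc: float = 0.20) -> float:
    """Sketch of a WWC-style clustering correction for a student-level t
    statistic when assignment was by cluster (school/classroom).
    m is the average cluster size; icc = 0.20 is a commonly assumed WWC
    default for achievement outcomes.  Consult the WWC procedures handbook
    for the exact formula and current defaults."""
    m = n_students / n_clusters  # average cluster size
    num = (n_students - 2) - 2 * (m - 1) * icc
    den = (n_students - 2) * (1 + (m - 1) * icc)
    return t * math.sqrt(num / den)

# With one student per cluster (m = 1) the correction is a no-op;
# with students nested in classrooms the adjusted t shrinks.
print(cluster_adjusted_t(2.0, 100, 100))
print(cluster_adjusted_t(2.0, 190, 69))
```

The shrinkage grows with cluster size and ICC, which is why student-level significance tests overstate precision in cluster-assigned designs.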


Appendix A5.1 Curiosity Corner rating for the oral language domain

The WWC rates an intervention’s effects for a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.1

For the outcome domain of oral language, the WWC rated Curiosity Corner as having no discernible effects. The remaining ratings (potentially negative effects and negative effects) were not considered, as Curiosity Corner was assigned the highest applicable rating.

Rating received

No discernible effects: No affirmative evidence of effects.

  • Criterion 1: None of the studies shows a statistically significant or substantively important effect, either positive or negative.

    Met. Two studies showed no statistically significant or substantively important effects, either positive or negative.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

    Not met. Two studies showed no statistically significant or substantively important positive effects.

    AND

  • Criterion 2: No studies showing statistically significant or substantively important negative effects.

    Met. Neither of the two studies showed a statistically significant or substantively important negative effect.

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

    Not met. Two studies showed no statistically significant or substantively important positive effects.

    AND

  • Criterion 2: No studies showing a statistically significant or substantively important negative effect, and no more studies showing indeterminate effects than studies showing statistically significant or substantively important positive effects.

    Not met. No study showed a statistically significant or substantively important negative effect, but both studies showed indeterminate effects and none showed a positive effect.

Mixed effects: Evidence of inconsistent effects as demonstrated through EITHER of the following criteria.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect, and at least one study showing a statistically significant or substantively important negative effect, but no more such studies than the number showing a statistically significant or substantively important positive effect.

    Not met. Two studies showed no statistically significant or substantively important effects, either positive or negative.

    OR

  • Criterion 2: At least one study showing a statistically significant or substantively important effect, and more studies showing an indeterminate effect than showing a statistically significant or substantively important effect.

    Not met. Two studies showed no statistically significant or substantively important effects, either positive or negative.

1 For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of potentially positive or potentially negative effects. For a complete description, see the WWC Intervention Rating Scheme.
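The decision rules walked through in Appendices A5.1 through A5.5 can be summarized programmatically. Below is a simplified sketch of that rating logic, assuming study counts by effect direction as inputs; the function name is illustrative, and the WWC Intervention Rating Scheme remains the authoritative definition:

```python
def wwc_domain_rating(n_pos: int, n_neg: int, n_ind: int,
                      strong_pos: bool = False, strong_neg: bool = False) -> str:
    """Simplified sketch of the WWC domain-rating rules in this appendix.
    n_pos / n_neg count studies with statistically significant or
    substantively important positive / negative effects; n_ind counts
    studies with indeterminate effects.  strong_pos / strong_neg flag
    whether a significant effect came from a strong design."""
    if n_pos >= 2 and strong_pos and n_neg == 0:
        return "positive effects"
    if n_pos >= 1 and n_neg == 0 and n_ind <= n_pos:
        return "potentially positive effects"
    if (n_pos >= 1 and 1 <= n_neg <= n_pos) or \
       ((n_pos + n_neg) >= 1 and n_ind > (n_pos + n_neg)):
        return "mixed effects"
    if n_pos == 0 and n_neg == 0:
        return "no discernible effects"
    if n_neg >= 2 and strong_neg and n_pos == 0:
        return "negative effects"
    return "potentially negative effects"

# Oral language domain: two studies, both with indeterminate effects.
print(wwc_domain_rating(n_pos=0, n_neg=0, n_ind=2))  # no discernible effects
```

With zero positive and zero negative studies, every higher rating's criteria fail and the domain lands on "no discernible effects", matching the ratings assigned across Appendices A5.1 through A5.5.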


Appendix A5.2 Curiosity Corner rating for the print knowledge domain

The WWC rates an intervention’s effects for a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.1

For the outcome domain of print knowledge, the WWC rated Curiosity Corner as having no discernible effects. The remaining ratings (potentially negative effects and negative effects) were not considered, as Curiosity Corner was assigned the highest applicable rating.

Rating received

No discernible effects: No affirmative evidence of effects.

  • Criterion 1: None of the studies shows a statistically significant or substantively important effect, either positive or negative.

    Met. One study showed no statistically significant or substantively important effects, either positive or negative.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

    Not met. One study showed no statistically significant or substantively important positive effects.

    AND

  • Criterion 2: No studies showing statistically significant or substantively important negative effects.

    Met. No study showed a statistically significant or substantively important negative effect.

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

    Not met. One study showed no statistically significant or substantively important positive effects.

    AND

  • Criterion 2: No studies showing a statistically significant or substantively important negative effect, and no more studies showing indeterminate effects than studies showing statistically significant or substantively important positive effects.

    Not met. No study showed a statistically significant or substantively important negative effect, but the one study showed an indeterminate effect and none showed a positive effect.

Mixed effects: Evidence of inconsistent effects as demonstrated through EITHER of the following criteria.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect, and at least one study showing a statistically significant or substantively important negative effect, but no more such studies than the number showing a statistically significant or substantively important positive effect.

    Not met. One study showed no statistically significant or substantively important effects, either positive or negative.

    OR

  • Criterion 2: At least one study showing a statistically significant or substantively important effect, and more studies showing an indeterminate effect than showing a statistically significant or substantively important effect.

    Not met. One study showed no statistically significant or substantively important effects, either positive or negative.

1 For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of potentially positive or potentially negative effects. For a complete description, see the WWC Intervention Rating Scheme.


Appendix A5.3 Curiosity Corner rating for the phonological processing domain

The WWC rates an intervention’s effects for a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.1

For the outcome domain of phonological processing, the WWC rated Curiosity Corner as having no discernible effects. The remaining ratings (potentially negative effects and negative effects) were not considered, as Curiosity Corner was assigned the highest applicable rating.

Rating received

No discernible effects: No affirmative evidence of effects.

  • Criterion 1: None of the studies shows a statistically significant or substantively important effect, either positive or negative.

    Met. One study showed no statistically significant or substantively important effects, either positive or negative.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

    Not met. One study showed no statistically significant or substantively important positive effects.

    AND

  • Criterion 2: No studies showing statistically significant or substantively important negative effects.

    Met. No study showed a statistically significant or substantively important negative effect.

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

    Not met. One study showed no statistically significant or substantively important positive effects.

    AND

  • Criterion 2: No studies showing a statistically significant or substantively important negative effect, and no more studies showing indeterminate effects than studies showing statistically significant or substantively important positive effects.

    Not met. No study showed a statistically significant or substantively important negative effect, but the one study showed an indeterminate effect and none showed a positive effect.

Mixed effects: Evidence of inconsistent effects as demonstrated through EITHER of the following criteria.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect, and at least one study showing a statistically significant or substantively important negative effect, but no more such studies than the number showing a statistically significant or substantively important positive effect.

    Not met. One study showed no statistically significant or substantively important effects, either positive or negative.

    OR

  • Criterion 2: At least one study showing a statistically significant or substantively important effect, and more studies showing an indeterminate effect than showing a statistically significant or substantively important effect.

    Not met. One study showed no statistically significant or substantively important effects, either positive or negative.

1 For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of potentially positive or potentially negative effects. For a complete description, see the WWC Intervention Rating Scheme.


Appendix A5.4 Curiosity Corner rating for the cognition domain

The WWC rates an intervention’s effects for a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.1

For the outcome domain of cognition, the WWC rated Curiosity Corner as having no discernible effects. The remaining ratings (potentially negative effects and negative effects) were not considered, as Curiosity Corner was assigned the highest applicable rating.

Rating received

No discernible effects: No affirmative evidence of effects.

  • Criterion 1: None of the studies shows a statistically significant or substantively important effect, either positive or negative.

    Met. One study showed no statistically significant or substantively important effects, either positive or negative.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

    Not met. One study showed no statistically significant or substantively important positive effects.

    AND

  • Criterion 2: No studies showing statistically significant or substantively important negative effects.

    Met. No study showed a statistically significant or substantively important negative effect.

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

    Not met. One study showed no statistically significant or substantively important positive effects.

    AND

  • Criterion 2: No studies showing a statistically significant or substantively important negative effect, and no more studies showing indeterminate effects than studies showing statistically significant or substantively important positive effects.

    Not met. No study showed a statistically significant or substantively important negative effect, but the one study showed an indeterminate effect and none showed a positive effect.

Mixed effects: Evidence of inconsistent effects as demonstrated through EITHER of the following criteria.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect, and at least one study showing a statistically significant or substantively important negative effect, but no more such studies than the number showing a statistically significant or substantively important positive effect.

    Not met. One study showed no statistically significant or substantively important effects, either positive or negative.

    OR

  • Criterion 2: At least one study showing a statistically significant or substantively important effect, and more studies showing an indeterminate effect than showing a statistically significant or substantively important effect.

    Not met. One study showed no statistically significant or substantively important effects, either positive or negative.

1 For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of potentially positive or potentially negative effects. For a complete description, see the WWC Intervention Rating Scheme.


Appendix A5.5 Curiosity Corner rating for the math domain

The WWC rates an intervention’s effects for a given outcome domain as positive, potentially positive, mixed, no discernible effects, potentially negative, or negative.1

For the outcome domain of math, the WWC rated Curiosity Corner as having no discernible effects. The remaining ratings (potentially negative effects and negative effects) were not considered, as Curiosity Corner was assigned the highest applicable rating.

Rating received

No discernible effects: No affirmative evidence of effects.

  • Criterion 1: None of the studies shows a statistically significant or substantively important effect, either positive or negative.

    Met. No study showed a statistically significant or substantively important effect, either positive or negative.

Other ratings considered

Positive effects: Strong evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: Two or more studies showing statistically significant positive effects, at least one of which met WWC evidence standards for a strong design.

    Not met. One study showed no statistically significant or substantively important positive effects.

    AND

  • Criterion 2: No studies showing statistically significant or substantively important negative effects.

    Met. No study showed a statistically significant or substantively important negative effect.

Potentially positive effects: Evidence of a positive effect with no overriding contrary evidence.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect.

    Not met. One study showed no statistically significant or substantively important positive effects.

    AND

  • Criterion 2: No studies showing a statistically significant or substantively important negative effect, and no more studies showing indeterminate effects than studies showing statistically significant or substantively important positive effects.

    Not met. No study showed a statistically significant or substantively important negative effect, but the one study showed an indeterminate effect and none showed a positive effect.

Mixed effects: Evidence of inconsistent effects as demonstrated through EITHER of the following criteria.

  • Criterion 1: At least one study showing a statistically significant or substantively important positive effect, and at least one study showing a statistically significant or substantively important negative effect, but no more such studies than the number showing a statistically significant or substantively important positive effect.

    Not met. One study showed no statistically significant or substantively important effects, either positive or negative.

    OR

  • Criterion 2: At least one study showing a statistically significant or substantively important effect, and more studies showing an indeterminate effect than showing a statistically significant or substantively important effect.

    Not met. One study showed no statistically significant or substantively important effects, either positive or negative.

1 For rating purposes, the WWC considers the statistical significance of individual outcomes and the domain-level effect. The WWC also considers the size of the domain-level effect for ratings of potentially positive or potentially negative effects. For a complete description, see the WWC Intervention Rating Scheme.


Appendix A6 Extent of evidence by domain

Outcome domain | Number of studies | Sample size: schools | Sample size: students | Extent of evidence1
Oral language2 | 2 | 34 | 527 | Medium to large
Print knowledge2 | 1 | 18 | 211 | Small
Phonological processing | 1 | 18 | 211 | Small
Cognition | 1 | 16 | 316 | Small
Math2 | 1 | 18 | 211 | Small
Early reading and writing | 0 | na | na | na

na = not applicable

1 A rating of "medium to large" requires at least two studies and two schools across studies in one domain and a total sample size across studies of at least 350 students or 14 classrooms. Otherwise, the rating is "small."
2 Sample size varies by outcome measure.
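The footnote-1 rule above is a simple threshold test, sketched here for clarity (the function name is illustrative):

```python
def extent_of_evidence(n_studies: int, n_schools: int,
                       n_students: int, n_classrooms: int = 0) -> str:
    """Apply the footnote-1 rule: 'Medium to large' requires at least two
    studies and two schools across studies in a domain, plus a total sample
    of at least 350 students or 14 classrooms; otherwise 'Small'."""
    if n_studies >= 2 and n_schools >= 2 and \
       (n_students >= 350 or n_classrooms >= 14):
        return "Medium to large"
    return "Small"

# Oral language: 2 studies, 34 schools, 527 students.
print(extent_of_evidence(2, 34, 527))  # Medium to large
# Print knowledge: 1 study, 18 schools, 211 students.
print(extent_of_evidence(1, 18, 211))  # Small
```

Only the oral language domain clears both the two-study and 350-student thresholds, which is why it alone is rated "Medium to large" in the table.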


