
Results from the 2003 National Assessment of Educational Progress (NAEP) assessments in reading and mathematics

Dr. Peggy G. Carr Hello, and welcome to today's StatChat on the NAEP 2003 reading and mathematics results for the nation and the states. I hope you've had time to look at the results on the website. I'm sure that you have many questions regarding today's release, so let's get right to them...

Julius from New Hope, PA asked:
Could you please explain the difference between the accommodations-permitted and the accommodations-not-permitted results? Why are there two separate trend lines for each grade?
Dr. Peggy G. Carr: Prior to 1998, NAEP did not allow assessment accommodations for students with disabilities or limited-English-proficient students who normally receive accommodations (such as extra time) on state assessments. Beginning in 1998, NAEP began allowing such accommodations. Because these accommodations allowed more students to be included and changed testing conditions, NAEP felt the most technically sound approach was to start new trend lines for the accommodations-allowed samples, and to overlap the two trend lines in the same years.

Audrey from Newark, DE asked:
Under the federal No Child Left Behind legislation, all states have to measure students' progress. Significant sanctions are triggered if schools do not make AYP (Adequate Yearly Progress). At this point in time, most states have their own tests to measure student achievement against their own set of standards. Instead of everyone using different instruments, could the NAEP be used as a standard measure? The way it is now, it is as if we are taking students' temperatures with a group of thermometers that are not even calibrated.
Dr. Peggy G. Carr: Because states have different curricula and standards, they have different assessments which are, in general, not comparable. In addition, the legislation requires individual scores from at least 95% of students for Adequate Yearly Progress (AYP). Although NAEP produces a standard measure, by law it does not produce individual student scores, and it assesses only a representative sample of about 3,000 students per state.

Jared from Salem, Oregon asked:
Why do we rely on the NAEP assessments when major organizations, such as the National Academy of Sciences (NAS) and the National Academy of Education (NAE), say these tests are flawed, inconsistent, and produce unreasonable results?
Dr. Peggy G. Carr: The NAE and NAS, to our knowledge, have never questioned the quality of NAEP assessments. The NAEP tests are rigorously defined and carefully developed. Indeed, the NAEP tests are frequently recognized as the gold standard in assessment for these grades and subjects. It is true that NAS and NAE have raised questions about the methodology used to set the achievement levels (which represent ranges on the NAEP scale) and the reasonableness of the results based on these levels. NCES is aware of these issues and advises that the achievement levels be used with caution on a trial basis. However, both NCES and the National Assessment Governing Board (NAGB) believe the standards are useful for understanding trends in student achievement, as well as for understanding the types of skills and competencies exhibited by students at various places on the scale.

Jane Stock, Curriculum Specialist, Pearson Digital Learning from Mesa, Arizona asked:
What was the selection process for testing specific objectives? Were any national standards and/or selected state curriculum standards used? Were standardized test objectives used?
Dr. Peggy G. Carr: Each NAEP assessment comprises questions constructed according to a blueprint documented in a Framework and Test Specifications. These documents are produced through a process involving a broadly representative group of experts, with input from state curriculum standards.

Beverly from Chattanooga, Tenn. asked:
1.What good is NAEP state data for local school districts? 2. Has NCLB required all states to participate?
Dr. Peggy G. Carr: NAEP data are not normally available for local school districts. On a trial basis, 2003 NAEP data will be available for nine large urban school districts in addition to the District of Columbia. These data will be released in about a month. No Child Left Behind requires all states to participate in NAEP mathematics and reading assessments at grades 4 and 8 (students are not required to participate). States are not required to participate in other subjects. You can find more information on NCLB requirements for NAEP at http://nces.ed.gov/nationsreportcard/nclb.asp.

Diana Reinhart from San Antonio, TX asked:
On the fourth grade reading test, Texas' four achievement levels total 101. Why?
Dr. Peggy G. Carr: The individual percentages for each achievement level have been rounded to whole numbers. Therefore, the sum may be slightly less or slightly more than 100.
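To illustrate with purely hypothetical percentages (not Texas's actual results), here is a minimal sketch of how rounding alone can push the total to 101:

```python
# Hypothetical achievement-level percentages (not actual NAEP data)
# that sum to exactly 100 before rounding.
levels = {"Below Basic": 30.6, "Basic": 30.6, "Proficient": 30.6, "Advanced": 8.2}

# Round each level to a whole number, as the reports do.
rounded = {name: round(pct) for name, pct in levels.items()}

print(round(sum(levels.values()), 1))  # 100.0 before rounding
print(sum(rounded.values()))           # 101 after rounding (31 + 31 + 31 + 8)
```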

Allan from Centreville, MD asked:
The NAEP tests are being used to make judgments about the rigor of state testing programs. What data is there that confirms the validity of using NAEP test results for this purpose, especially where the NAEP tests are only administered to a sampling of students?
Dr. Peggy G. Carr: NAEP is not being officially used to assess the rigor of state assessments. However, case studies of NAEP state scores and state results show that NAEP can be effectively used to gauge the general progress of a state's performance. Although NAEP is not designed to evaluate state testing programs, it has in the past served, and continues to serve, as a serious discussion tool for how states are performing. NAEP does provide the one common yardstick on which state performances can be compared. NAEP is, as you state, given to a sample of students. However, the samples are rigorously designed to be representative of the public school students in the state.

Twila from Toledo, Ohio asked:
Please address some of the issues which have been brought up in the Thernstroms' newly released book, No Excuses: Closing the Racial Gap...
Dr. Peggy G. Carr: I am familiar with this book, but I just started to read it, and I am not in a position to comment at this time.

Heidi from Boston, MA asked:
Why is it important to have both the NAEP tests and the state-specific assessments?
Dr. Peggy G. Carr: Because state tests are designed to measure state-specific curricula, their results are not comparable across states. NAEP provides a single, common yardstick that allows for comparisons across states and over time. Both sources of information are important.

Diana Reinhart from San Antonio, TX asked:
How can Texas (4th grade reading) not be statistically different with its average score but be statistically different with its achievement levels?
Dr. Peggy G. Carr: I assume the comparison you are referring to is this: Texas scale scores were not significantly different from the national public school scale score results, but the percentage of students at or above Proficient in Texas was smaller than in the nation. The scale score is the average for all students; the achievement level percentage is the percentage of students with scale scores at or above a cut point on the NAEP scale. These two measures focus on different aspects of the student scale score distribution, and they have different standard errors. As a result, it can easily happen that the scale scores and the percentages at or above achievement levels have different statistical significance test results. NAEP provides a variety of metrics through which assessment results can be viewed, in order to focus attention on many aspects of the data.
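To make that concrete with purely hypothetical numbers (none of these are actual Texas or national estimates), here is a minimal sketch of how the two significance tests can disagree, assuming simple z-tests against a 1.96 critical value:

```python
# Hypothetical illustration: the same samples can differ significantly
# on one metric but not the other, because each estimate (an average
# scale score or a percentage above a cut point) has its own standard
# error. The comparison is, roughly, a z-test: the difference between
# two estimates divided by the standard error of that difference.

def z_test(diff, se_diff, critical=1.96):
    z = diff / se_diff
    return round(z, 2), abs(z) >= critical

# Average scale score: a 1-point difference with SE 1.1 is not significant.
print(z_test(216 - 217, 1.1))    # (-0.91, False)

# Percent at or above Proficient: a 3-point difference with SE 1.3 is.
print(z_test(27.0 - 30.0, 1.3))  # (-2.31, True)
```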

Brian from Washington DC asked:
Atlanta, Chicago, DC, Houston, Los Angeles, and New York are already included in the current NAEP Trial Urban District Assessment. Will new urban districts be added in the future, specifically Boston, Long Beach, and Charlotte-Mecklenburg?
Dr. Peggy G. Carr: Boston and Charlotte-Mecklenburg were added to the 2003 Trial Urban District Assessment along with Cleveland and San Diego. These data are scheduled to be released later this year.

Scott from Solebury, PA asked:
Thanks for taking my question. It appears the "big news" here is that Math results are generally up, while Reading results are generally unchanged. Were the same students tested in both subjects? Or are the math results coming from a different bunch of students than the reading results? Thanks!
Dr. Peggy G. Carr: Different students were tested in each subject. In the same test session, one student will be taking a reading test while the student sitting in the next seat will be taking a math test.

Joan from New Hampshire asked:
Which results are you most pleased to see and which are the areas of most concern?
Dr. Peggy G. Carr: We are pleased to see progress wherever it occurs, but clearly in 2003 the progress in grade 4 and 8 mathematics is striking. It is for others to identify policy areas of concern, although relatively few students are reaching the Proficient level.

Peter from Towson, Maryland asked:
This is excellent data for our district to use. Thank you. Will NAEP results come out every year from now on? Or will it go back to every two years?
Dr. Peggy G. Carr: As specified in the No Child Left Behind Act of 2001, grade 4 and 8 mathematics and reading will be assessed every two years starting in 2003. Other subjects, including those assessed at the 12th grade, will be assessed less frequently. The urban districts are expected to continue to be a part of NAEP in 2005.

Dannie from Des Moines, Iowa asked:
What do you believe are the reasons for the considerable growth in mathematics found in many of the states?
Dr. Peggy G. Carr: The NAEP design does not allow us to make cause-and-effect statements regarding student performance. However, I suggest that you examine the wealth of contextual student, teacher, and school background variables that accompany the results at nces.ed.gov/nationsreportcard for viable hypotheses. You can find these data in the NAEP Data Tool.

Maggi from Pittsburgh, PA asked:
Given that No Child Left Behind focuses on students achieving "proficiency," is it more important to look at the "basic or better" results or the "proficient or better" results?
Dr. Peggy G. Carr: NCLB requires states to define their AYP goals in terms of the percentage of students reaching the proficient level on their own state tests. It is generally acknowledged that the states' proficient performance standards differ in nature. Comparing state NAEP results at both the Basic and Proficient achievement levels with the results states find on their own assessments will, we hope, encourage dialogue and efforts that lead to a better understanding of the nature of state performance differences.

Tony from Fairmont, WV asked:
How can I get a copy of the assessment of how students in my county did as opposed to others in my state?
Dr. Peggy G. Carr: NAEP assessments are conducted on a sample basis. We do not assess all students. Thus, we are able to report only at the national and state levels, and for some large urban school districts. Our samples of students are not representative of counties.

Eleanor Chute from Pittsburgh, PA asked:
I see that for eighth-grade reading scores in Pennsylvania, the racial achievement gap between white and black students has narrowed significantly, the only example of a narrowing in the nation. In 2002, Pennsylvania had one of the largest racial achievement gaps. How could it narrow so quickly? What does this mean? The racial achievement gap in the other three categories -- fourth-grade reading and fourth- and eighth-grade math -- has not changed significantly. Thank you.
Dr. Peggy G. Carr: Pennsylvania's score gap between White and Black students did narrow in 8th grade reading, and the narrowing of such gaps is generally good news. However, the ideal narrowing of a gap results from both comparison groups improving, with the lower-performing group improving faster. In Pennsylvania, where there was a large gap in 2002, the Black students gained 7 points. The White students' scale scores, in contrast, declined 3 points. Neither change in subgroup average scores was statistically significant, but the numbers are important to illustrate the rest of the answer. Note that non-significant results are not typically discussed in NAEP reports. While neither the 3-point decline nor the 7-point gain, in isolation, would have significantly closed the gap between the groups, the combined 10-point effect does. In the other cases you cite, the combined changes in group performance (both up, one up and one down, or both down) have not narrowed the gaps enough to be statistically significant. As a technical point, standard errors of gaps are generally larger than those of simple trend comparisons for a single group.
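To sketch that technical point numerically, with hypothetical standard errors rather than the actual Pennsylvania values: the standard error of a gap combines the standard errors of both group estimates, so a combined change can be significant even when neither component change is.

```python
import math

# Hypothetical standard errors for the two subgroup trend estimates
# (not the actual Pennsylvania values).
def is_significant(change, se, critical=1.96):
    return abs(change / se) >= critical

se_black, se_white = 4.0, 2.0

print(is_significant(7, se_black))   # False: 7 / 4.00 = 1.75
print(is_significant(-3, se_white))  # False: 3 / 2.00 = 1.50

# The change in the gap combines both shifts, and its standard error
# combines both standard errors, so it is larger than either alone.
se_gap = math.sqrt(se_black**2 + se_white**2)  # about 4.47
print(is_significant(7 - (-3), se_gap))        # True: 10 / 4.47 = 2.24
```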

Sarah from Seattle asked:
My question has three parts. a) Do you have any ideas as to why students are making more rapid progress in mathematics than reading? b) Is the short time frame responsible for the relatively flat reading scores compared to 2002? c) Do you think the accommodated and non-accommodated samples are comparable? Thank you.
Dr. Peggy G. Carr: a) NAEP does not provide information on why performance has improved in one subject versus another. b) We wouldn't expect to see much change over one year. The data indicate that subgroups of the population, such as Black and Hispanic students, and those eligible for the free/reduced-price lunch program, have made improvements in reading over longer time periods. c) Our research indicates that the accommodated and non-accommodated samples do differ slightly, such that accommodated samples typically score one or two points lower; however, for the most part, they are comparable. Also, please note that only a small proportion of the sample was accommodated.

Thanks for all the excellent questions. Unfortunately, I could not get to all of them, but please feel free to contact me or members of the NAEP staff if you need further assistance. I hope that you found this session to be helpful and the reports to be interesting. Later this winter, we will release results from the NAEP 2003 Trial Urban District Assessment in reading and mathematics.


