Correlation Of NCAP Performance with Fatality Risk in Actual Head-On Collisions
NHTSA Report Number DOT HS 808 061 January 1994

Charles J. Kahane, Ph.D.

Abstract

The New Car Assessment Program (NCAP) has gauged the performance of vehicles in frontal impact tests since model year 1979. In response to Congressional direction, the National Highway Traffic Safety Administration studied the relationship between vehicle test scores in NCAP and the fatality risk in crashes of vehicles on the road. This study is based on head-on collisions, where the effect of crashworthiness can be separated from the effects of extraneous factors that influence fatality rates (drivers, roadways, mileage). Collisions between two 1979-91 passenger cars in which both drivers were wearing safety belts were selected from the Fatal Accident Reporting System. There were 396 collisions (792 cars) in which both cars were identical with or very similar to vehicles which had been tested in NCAP. In the analyses, adjustments were made for the relative weights of the cars, and for the age and sex of the drivers - factors which substantially affect fatality risk.

There are statistically significant correlations between NCAP scores for head injury, chest acceleration and femur loading and the actual fatality risk of belted drivers. A composite NCAP score, based on the test results for all three body regions, has excellent correlation with fatality risk: in a head-on collision between a car with good composite score and a car of equal weight with poor score, the driver of the car with the better NCAP score has, on average, a 20 to 25 percent lower risk of fatal injury. Slightly smaller, but still significant fatality reductions are obtained even when the NCAP scores for just one body region (just HIC, or chest g's, or femur load) are used to partition the fleet into "good" and "poor" performance groups. The borderline between good and poor NCAP scores that optimizes the differences in actual fatality risk is close to the criteria specified in Federal Motor Vehicle Safety Standard (FMVSS) 208 for each of the three body regions.

Cars built from 1979 through 1982 had, on the average, the poorest NCAP scores. Test performance improved substantially in 1983 through 1986 models, and continued to improve in 1987 through 1991 cars. In parallel, fatality risk for belted drivers in actual head-on collisions decreased by 20 to 25 percent in model years 1979-91, with the largest decreases just after 1982. The 35 mph test speed for NCAP is 5 mph higher than the test speed for FMVSS 208. By now, most passenger cars meet the FMVSS 208 criteria at the NCAP test speed. The study shows that achievement of this enhanced level of test performance has been accompanied by a significant reduction in actual fatality risk. However, being a statistical study, it does not address what portion of the fatality reduction was directly "caused" by NCAP. Also, these results do not guarantee that any individual make-model with low NCAP scores will necessarily have lower fatality risk than another make-model with higher NCAP scores.

Summary

The Appropriations Act for Fiscal Year 1992 directs "NHTSA to provide a study to the House and Senate Committees on Appropriations comparing the results of NCAP data from previous model years to determine the validity of these tests in predicting actual on-the-road injuries and fatalities over the lifetime of the models." In December 1993, the agency responded with a Report to the Congress that compared NCAP results and real-world crash experience, based on various analyses of accident data files. One set of analyses demonstrated a statistically significant correlation between NCAP performance and the fatality risk of belted drivers in actual head-on collisions. This technical report provides a more detailed exposition of the data sources, analytic approach and statistical findings in the analysis of head-on collisions.

NHTSA's goal was to see if cars with poor NCAP scores had more belted-driver fatalities than would be expected, given the weights of the cars, and the age and sex of the drivers involved in the crashes. Without adjustment for vehicle weight, driver age and sex, the large diversity of fatality rates in accident data mainly reflects the types of people who drive the cars, not the actual crashworthiness of the cars. For example, "high-performance" cars popular with young male drivers have an exceptionally high frequency of fatal crashes - because they are driven in an unsafe manner - even though they may be just as crashworthy as other models. NHTSA's analysis objective was to isolate the actual crashworthiness differences between cars, removing differences attributable to the way the cars are driven, the ages of the occupants, etc., and then to correlate NCAP performance with crashworthiness on the highway.

Analysis overview

Since NCAP is a frontal impact test involving dummies protected by safety belts, the agency limited the accident data to frontal crashes involving belted occupants. However, NHTSA did not consider all types of frontal crashes, but further limited the data to head-on collisions between two passenger cars, each with a belted driver, which resulted in a fatality to one or to both of the drivers. A head-on collision is a special type of highway crash ideally suited for studying frontal crashworthiness differences between two cars. Both cars are in essentially the same frontal collision. It doesn't matter if one of them had a "safe" driver and the other, an "unsafe" driver; at the moment they collide head-on, how safely they were driving before the crash is nearly irrelevant to what happens in the crash. Which driver dies and which survives depends primarily on the intrinsic relative crashworthiness of the two cars, their relative weights, and the age and sex (vulnerability to injury) of the two drivers.

If car 1 and car 2 weigh exactly the same, and both drivers are the same age and sex, the likelihood of a driver fatality in a head-on collision would be expected to be equal in car 1 and car 2. If car 1 and car 2 have different weights, etc., it is still possible to calibrate formulas predicting the expected fatality risk for each driver in a head-on collision between the two cars, as a function of each vehicle's weight and each driver's age and sex. The formulas measure the relative vulnerability to fatal injury of the two drivers, given that their cars had a head-on collision. The risk is greater in the lighter car than the heavier car, and a female or older driver is more vulnerable to injury than a male or younger driver. For example, given 100 fatal head-on collisions between 3,000-pound cars driven by belted, 20-year-old males and 2,500-pound cars driven by belted, 50-year-old females, these formulas predict 10.8 times as many deaths among the older females in the lighter cars as among the young males in the heavier cars.

Cars with average crashworthiness capabilities will experience an actual number of fatalities very close to what is predicted by these formulas, which are calibrated from the collision experience of production vehicles. If a group of cars, however, consistently experiences more fatalities than expected in their head-on collisions, then the empirical evidence suggests that this group of cars is less crashworthy than the average car of similar mass. The gist of the analyses is to see if groups of cars with poor NCAP scores have significantly more belted-driver fatalities per 100 actual head-on collisions than expected (and there are several ways to define a "poor" score). The analyses measure the reduction in fatality risk, in actual head-on collisions, for a car with good NCAP scores relative to a car with poor NCAP scores. They also measure the overall reduction in fatality risk, for belted drivers in head-on collisions, from model year 1979, when NCAP testing began, to 1991, the latest model year for which substantial accident data were available as of mid-1993.

The analyses require a data file of actual head-on collisions, with both drivers belted, resulting in a fatality to at least one of the drivers, indicating, for both cars, the curb weight, the driver's age and sex, and the HIC, chest g's and femur loads that were recorded for the driver dummy when that car was tested in NCAP. NHTSA's Fatal Accident Reporting System (FARS), complete through mid-1992, provided the basic accident data for the study. The FARS data were supplemented with accurate curb weights, derived from R. L. Polk's files and NHTSA compliance tests. Insufficient NCAP and FARS data were available to include light trucks, vans or sport utility vehicles in the analyses. Thus, the study is limited to collisions between two 1979-91 passenger cars.

NHTSA staff reviewed the cars involved in head-on collisions on FARS and identified, where possible, the NCAP test car that came closest to matching the FARS case. They found 396 head-on collisions, involving 792 cars, in which both drivers were belted and both cars match up acceptably with an NCAP case: (1) The make-models on FARS and NCAP are identical or true "corporate cousins" (e.g., Dodge Omni and Plymouth Horizon). (2) The model years on FARS and NCAP are identical, or the FARS model year is later than the NCAP model year, but that model was basically unchanged during the intervening years. The FARS cases were supplemented with the matching NCAP test results for each car. The sample is large enough for a statistical analysis of NCAP scores and fatality risk.

FARS data do not single out those head-on collisions that closely resemble an NCAP test: perfectly aligned collisions of two nearly identical cars, with minimal offset, a closing speed close to 70 mph, and both drivers 50th-percentile males. In addition, FARS cases may involve injury to the neck or abdomen: the potential for injury to these body regions is not specifically measured in NCAP. It is inappropriate to expect perfect correlation between NCAP test results and actual fatality risk in the full range of head-on collisions represented in the FARS sample. Moreover, if there is any significant correlation between the two, it suggests that the NCAP scores say something about actual crashworthiness in a range of crashes that goes far beyond the specific type tested in NCAP.

Correlation of NCAP scores and fatality risk

The goal of the analysis is to test if cars with poor scores on the NCAP test have higher fatality risk for belted drivers, in actual head-on collisions, than cars with good or acceptable scores. There are many ways to define "poor" and "good" scores and measure the difference in fatality risk. All of the methods tried out by NHTSA staff demonstrate a statistically significant relationship between NCAP scores and actual fatality risk, as shown in the accompanying table.

Collisions of Cars with "Good" NCAP Scores into Cars with "Poor" NCAP Scores
(N of crashes approximately 120 in each analysis)

                                                      Performance in Actual Crashes
Good NCAP Performance     Poor NCAP Performance     N of Crashes    Fatality Reduction for Good Car (%)

Chest g's < 56            Chest g's > 56            125             19*

HIC < 1000                HIC > 1200                113             14*

L Femur < 1600 AND        L Femur > 1600 OR         132             20**
R Femur < 1600 AND        R Femur > 1600 OR
L+R Femur < 2600          L+R Femur > 2600

HIC < 1100 AND            HIC > 1300 OR             125             19*
Chest g's < 60            Chest g's > 60

Chest g's < 56 AND        Chest g's > 60 OR         134             22**
L Femur < 1400 AND        L Femur > 1700 OR
R Femur < 1400 AND        R Femur > 1700 OR
L+R Femur < 2400          L+R Femur > 2700

HIC < 900 AND             HIC > 1300 OR             121             19*
L Femur < 1400 AND        L Femur > 1700 OR
R Femur < 1400 AND        R Femur > 1700 OR
L+R Femur < 2400          L+R Femur > 2700

HIC < 900 AND             HIC > 1300 OR             118             21**
Chest g's < 56 AND        Chest g's > 60 OR
L Femur < 1400 AND        L Femur > 1700 OR
R Femur < 1400 AND        R Femur > 1700 OR
L+R Femur < 2400          L+R Femur > 2700

NCAPINJ < .6              NCAPINJ > .6              117             26**

*  Statistically significant at the .05 level
** Statistically significant at the .01 level

A straightforward way to delineate "poor" and "good" scores is to partition the cars based on their NCAP score for a single body region - chest g's, HIC or femur load - and to consider only a subset of the 396 head-on crashes where one car has a score in the "poor" range and the other car has a score in a good or acceptable range. This subset should contain approximately 120 crashes, which is equivalent to defining the worst 20 percent of cars as "poor" performers and the remaining 80 percent as good or acceptable. Do the cars with the poor NCAP scores have significantly more driver fatalities than expected?
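
The link between a 20 percent "poor" share and a subset of roughly 120 crashes can be checked with a short sketch. It assumes, purely for illustration, that "poor" and "good" cars meet each other at random; that pairing assumption and the function name `expected_mixed_pairs` are not taken from the report. The 396 collisions are the matched sample described earlier.

```python
def expected_mixed_pairs(n_collisions, poor_share):
    """Expected number of two-car collisions with exactly one 'poor' car.

    Assumes (hypothetically) that each car is 'poor' independently with
    probability poor_share, so a mixed pair has probability 2 * p * (1 - p).
    """
    return n_collisions * 2 * poor_share * (1 - poor_share)


# 396 matched head-on collisions, worst 20 percent of cars labeled "poor".
mixed = expected_mixed_pairs(396, 0.20)
print(f"{mixed:.0f} mixed good/poor collisions expected")
```

Under that assumption about 32 percent of the collisions pair a "poor" car with a "good" or acceptable one, which is in line with the sample sizes of roughly 120 reported in the table.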

When chest g's are used to partition the cars into acceptable and poor performance groups, the cars with high chest g's almost always have significantly more fatalities than the cars with acceptable chest g's. For example, there are 125 actual head-on collisions (both drivers belted) in which one of the models had more than 56 chest g's for the driver when it was tested in NCAP, and the other had 56 g's or less. In the 125 cars with chest g's > 56, 80 drivers died, whereas only 68.2 fatalities were expected, based on car weight, driver age and sex. In the 125 cars with chest g's < 56, there were 74 actual and 77.6 expected driver fatalities. That is a statistically significant fatality reduction of

1 - [(74/80) / (77.6/68.2)] = 19 percent

for the cars with the lower chest g's. The relationship between chest g's on the NCAP test and fatality risk over the range of head-on collisions experienced on the highway, although statistically significant, is not perfect. Merely having the lower NCAP score of the two cars in the collision does not guarantee survival, even if the two cars are of the same weight and the drivers of the same age and sex. Yet, on the average, in collisions between cars with < 56 chest g's on NCAP and cars with > 56 chest g's, the driver of the car with the better NCAP score had 19 percent less fatality risk than the driver of the car with the poorer NCAP score, after controlling for weight, age and sex.
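
The arithmetic behind the 19 percent figure can be reproduced with a few lines of code. This is a minimal sketch of the ratio-of-ratios calculation shown above, using the actual and expected fatality counts quoted in the text; the function name `fatality_reduction` is an illustrative choice, and the sketch does not attempt the significance test.

```python
def fatality_reduction(actual_good, actual_poor, expected_good, expected_poor):
    """Fatality reduction for the 'good' group relative to the 'poor' group.

    The ratio of actual deaths (good / poor) is divided by the ratio of
    expected deaths (good / poor), which adjusts for car weight, driver
    age and sex as captured in the expected counts.
    """
    actual_ratio = actual_good / actual_poor
    expected_ratio = expected_good / expected_poor
    return 1.0 - actual_ratio / expected_ratio


# Chest g's example: 74 actual vs. 77.6 expected deaths in the "good" cars,
# 80 actual vs. 68.2 expected deaths in the "poor" cars.
chest_g_reduction = fatality_reduction(74, 80, 77.6, 68.2)
print(f"{chest_g_reduction:.1%}")
```

The result is about 0.187, which rounds to the 19 percent quoted in the text.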

Fifty-six chest g's are just one possible boundary value between "good" and "poor" performance. The fatality reduction for "good" performers can be magnified by using a higher boundary value or by replacing a single boundary value with a gap, putting some distance between the "good" and the "poor" groups. For example, in collisions of cars with chest g's < 60 into cars with chest g's > 60 (the pass-fail criterion in FMVSS 208), the fatality reduction in the "good" performers is 24 percent. However, there are only 92 crashes meeting those criteria. Many other boundary values between low and high chest g's will also produce statistically significant fatality reductions for the group with low chest g's, but the boundary value of 56 maximizes the fatality reduction for an accident sample close to 120 crashes.

The Head Injury Criterion (HIC) can be used to partition the cars into two performance groups. In 113 head-on collisions between a car with HIC < 1000 on the NCAP test and a car with HIC > 1200, the fatality risk was a statistically significant 14 percent lower in the cars with HIC < 1000. The femur loads measured on the NCAP tests can also, by themselves, differentiate safer from less safe cars. The "good" performers are defined to be the cars with < 1600 pounds on each leg, and the sum of the two loads < 2600 pounds. The "poor" performers are those with > 1600 pounds on either leg, or a sum > 2600 pounds. In 132 head-on collisions, the fatality reduction for the "good" NCAP femur load performers was a statistically significant 20 percent.

One reason that chest g's, HIC and femur load all "work" by themselves is that the three NCAP test measurements are not independent observations on isolated body regions. Cars with intuitively excellent safety design tend to have low scores on all parameters, while cars with crashworthiness problems tend to have high scores on one or more parameters, but it is not always predictable which one. Still, the reasons for the significant correlation between NCAP femur load and actual fatality risk are not completely understood at this time, since injuries to the lower extremities, by themselves, are generally not fatal.

Any two NCAP parameters, working together, can do an even more reliable job than any single parameter. In 125 actual head-on collisions between cars with driver HIC < 1100 and chest g's < 60 on the NCAP test and cars with either HIC > 1300 or chest g's > 60, the fatality risk was a statistically significant 19 percent lower in the cars with low HIC and chest g's. The accompanying table shows how chest g's and femur load, or HIC and femur load can be used to partition the cars, with statistically significant 19-22 percent fatality reductions for the "good" performers, in samples of 121-134 crashes.

NCAP scores for all three body regions, with an independent "pass-fail" criterion on each score, work about as well as scores for any two body regions. "Good" performance could be defined as HIC < 900 and chest g's < 56 and femur load < 1400 on each leg and < 2400, total, while HIC > 1300 or chest g's > 60 or femur load > 1700 on either leg or > 2700, total defines "poor" performance. The fatality risk in 118 actual head-on collisions between a good and a poor NCAP performer is a statistically significant 21 percent lower for the drivers of the cars with good NCAP scores, after controlling for vehicle weight, driver age and sex. These criteria can be varied by a moderate amount and the fatality reduction for the "good" performers will still be statistically significant, as long as the HIC cutoff is reasonably close to or slightly above the FMVSS 208 value of 1000, the chest g cutoff is not far from the FMVSS 208 value of 60 g's, and the femur load cutoff ranges from about 1400 pounds up to the FMVSS 208 value of 2250 pounds.
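
The three-region "pass-fail" partition described above can be written as a small classifier. This is a sketch under the thresholds quoted in the preceding paragraph; the function name, the "unclassified" label for cars that fall in the gap between the two groups, and the sample dummy readings are illustrative assumptions, not values from the report.

```python
def classify_three_region(hic, chest_g, left_femur, right_femur):
    """Classify a car's NCAP driver-dummy scores as good, poor or unclassified.

    Thresholds follow the example criteria in the text: femur loads are in
    pounds, and the total femur load is the sum of the two legs.
    """
    femur_total = left_femur + right_femur
    is_good = (hic < 900 and chest_g < 56
               and left_femur < 1400 and right_femur < 1400
               and femur_total < 2400)
    is_poor = (hic > 1300 or chest_g > 60
               or left_femur > 1700 or right_femur > 1700
               or femur_total > 2700)
    if is_good:
        return "good"
    if is_poor:
        return "poor"
    # Cars between the two sets of thresholds belong to neither group.
    return "unclassified"


# Hypothetical dummy readings, chosen only to exercise each branch.
print(classify_three_region(850, 50, 1000, 1100))   # good
print(classify_three_region(1400, 50, 900, 900))    # poor
print(classify_three_region(1000, 58, 1500, 1000))  # unclassified
```

Only collisions that pair a "good" car with a "poor" car enter the comparison, which is why the gap between the two sets of thresholds reduces the sample to 118 crashes.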

A highly efficient way to use the NCAP scores for the three body regions, however, is to combine them into a single composite score, wherein excellent performance on two body regions might compensate for moderately poor performance on the third. The composite score could be some type of weighted or unweighted average of the scores for the various body regions. For example, a weighted average measure of NCAP performance, NCAPINJ, was derived by a two-step process. First, the actual NCAP results for the driver dummy were transformed to logistic injury probabilities, HEADINJ, CHESTINJ, LFEMURINJ and RFEMURINJ, each ranging from 0 to 1. The weighted average

NCAPINJ = .21 HEADINJ + 2.7 CHESTINJ + 1.5 (LFEMURINJ + RFEMURINJ)

has the empirically strongest relationship with fatality risk for belted drivers in the specific data set of actual head-on collisions described above (396 collisions, 792 cars). The accident data include 117 head-on collisions of a car with NCAPINJ < 0.6 into a car with NCAPINJ > 0.6. Fatality risk is a statistically significant 26 percent lower in the cars with NCAPINJ < 0.6. Since NCAPINJ is a weighted sum of NCAP scores for all of the body regions, the cars with NCAPINJ < 0.6 have, on the average, substantially lower HIC, chest g's and femur loads than cars with NCAPINJ > 0.6.
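
The second step of the composite score, the weighted combination, can be sketched as follows. The logistic transformations that produce HEADINJ, CHESTINJ, LFEMURINJ and RFEMURINJ are not given in this summary, so the sketch takes those probabilities as inputs; the function names and the sample probabilities are hypothetical and serve only to show the arithmetic.

```python
def ncapinj(headinj, chestinj, lfemurinj, rfemurinj):
    """Composite score using the weights stated in the text.

    Each argument is a logistic injury probability between 0 and 1; the
    transformation from raw HIC, chest g's and femur loads is not shown here.
    """
    return 0.21 * headinj + 2.7 * chestinj + 1.5 * (lfemurinj + rfemurinj)


def is_poor_composite(score, cutoff=0.6):
    """Apply the NCAPINJ > 0.6 boundary used in the text for 'poor' cars."""
    return score > cutoff


# Hypothetical injury probabilities for one car (not from the report).
score = ncapinj(headinj=0.10, chestinj=0.08, lfemurinj=0.02, rfemurinj=0.03)
print(f"NCAPINJ = {score:.3f}, poor = {is_poor_composite(score)}")
```

With these weights the chest term contributes most per unit of injury probability, so a modest rise in CHESTINJ moves a car across the 0.6 boundary more readily than the same rise in HEADINJ.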

The purpose of defining NCAPINJ was to illustrate the strength of the overall relationship between NCAP performance and fatality risk. However, NCAPINJ is not a "magic bullet" or "ideal" way to combine the NCAP scores, and it does not produce far higher correlations than other methods. Many other weighted averages, or even an unweighted sum of the logistic injury probabilities, work almost as well for differentiating the safer from the less safe cars on the principal accident data set. On a more restricted alternative accident data set of 310 collisions and 620 cars, where the FARS vehicles are also required to have the same number of doors as their matching NCAP test vehicles, NCAPINJ is not the optimum weighted average (although it comes close to the optimum), and it is only slightly more correlated with fatality risk than an unweighted sum of the logistic injury probabilities. Moreover, on this alternative data set, HIC and femur load have about equally strong correlation with fatality risk.

Improvements in actual crashworthiness and NCAP performance during 1979-91

The performance of passenger cars on the NCAP test has greatly improved since the program was initiated in 1979. That was demonstrated in NHTSA's 1992-93 reports to the Congress and several other studies, which cite specific improvements in vehicle structures and occupant protection systems resulting in better NCAP performance. Has the historical trend of better performance on the NCAP test been matched by a reduction in the actual fatality risk of belted drivers in head-on collisions?

In general, it is not easy to compare the crashworthiness of cars of different model years. Fatality rates per 100 million vehicle miles have been declining for a long time. In any given year, the fatality rate per 100 million miles or per 100 crashes is lower for new cars than for old cars. Both trends create the impression that "cars are getting safer all the time," but, in fact, the declines in fatality rates to a large extent reflect changes in driving behavior, roadway environments, demographics or accident-reporting practices. A head-on collision between cars of two different model years, however, reveals their relative crashworthiness. Both cars are in essentially the same frontal collision, on the same road, in the same year, on the same accident report. The behavior of each driver, prior to the impact, has little effect on who dies during the impact. After adjustment for differences in car weight, driver age and sex, the model year with more survivors is more crashworthy.

There have been 241 actual head-on collisions between a model year 1979-82 car and a model year 1983-91 car, in which both drivers were belted. These collisions allow a comparison of cars built during the first four years of NCAP to subsequent cars, where manufacturers have had time to build in safety improvements. In the 241 older cars, 146 drivers died, whereas only 126.6 fatalities were expected, based on car weight, driver age and sex. In the newer cars, there were 132 actual and 147.1 expected driver fatalities. For the 1983-91 cars, that is a statistically significant fatality reduction of

1 - [(132/146) / (147.1/126.6)] = 22 percent

A more generalized analysis allows a larger sample of 1,189 crashes. It applies to head-on collisions in which the "case" vehicle of interest is a 1979-91 car, with a belted driver, that matches up with an NCAP test, while the "other" vehicle in the crash can be any 1976-91 passenger car with a belted driver. For any subset of crashes, a fatality risk index can be computed for the "case" vehicles, based on the ratio of actual to expected fatalities in the case and other vehicles. The lower the risk index, the more crashworthy the car (100 = average). The actual fatality risk indices can be compared in three model-year groups, 1979-82, 1983-86 and 1987-91. So can the NCAP test performance, as measured by a composite score such as NCAPINJ, or by the average values of the actual NCAP parameters for the three body regions:

                                                    Model Years
                                                    1979-82    1983-86    1987-91

Fatality risk index in actual head-on collisions    119        95         91
Average value of NCAPINJ                            .59        .40        .37
Percent of cars with NCAPINJ > 0.6                  49         14         9
Average HIC                                         1,052      915        827
Average chest g's                                   54.9       46.8       46.5
Average left femur load                             928        883        1,002
Average right femur load                            1,079      784        1,018

The trends in the actual fatality risk and the average value of NCAPINJ are almost identical. The risk index decreased by a statistically significant 20 percent from 1979-82 to 1983-86, and by another 4 percent from then until 1987-91 (nonsignificant). In all, the actual fatality risk for belted drivers in head-on collisions decreased by a statistically significant 24 percent from model years 1979-82 to 1987-91. A composite NCAP score, such as NCAPINJ, nicely portrays the improvement in NCAP performance over time. Parallel to the reduction in the fatality risk index, NCAPINJ greatly improved from an average of 0.59 in model years 1979-82 to 0.40 in 1983-86, with an additional, modest improvement to 0.37 in 1987-91. If NCAPINJ = 0.6 is defined as the limit of "acceptable" NCAP performance, the passenger car fleet has truly progressed since the inception of NCAP: initially, 49 percent of the cars had NCAPINJ > 0.6, but that decreased to 14 percent in 1983-86 and 9 percent in 1987-91. Average HIC and chest g's declined substantially during the NCAP era; average femur loads stayed about the same, but well below the 2250 pounds permitted in FMVSS 208.
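
The percentage decreases quoted above follow directly from the fatality risk indices in the table. The sketch below recomputes them; the helper name `percent_decrease` is illustrative, and the calculation covers only the point estimates, not their statistical significance.

```python
# Fatality risk indices by model-year group, as listed in the table.
risk_index = {"1979-82": 119, "1983-86": 95, "1987-91": 91}


def percent_decrease(before, after):
    """Percentage decrease from 'before' to 'after'."""
    return 100.0 * (before - after) / before


early_drop = percent_decrease(risk_index["1979-82"], risk_index["1983-86"])
later_drop = percent_decrease(risk_index["1983-86"], risk_index["1987-91"])
total_drop = percent_decrease(risk_index["1979-82"], risk_index["1987-91"])

print(f"1979-82 to 1983-86: {early_drop:.1f}%")
print(f"1983-86 to 1987-91: {later_drop:.1f}%")
print(f"1979-82 to 1987-91: {total_drop:.1f}%")
```

The three values come to roughly 20, 4 and 24 percent, matching the figures in the text.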

Principal findings, conclusions and caveats

  • There is a statistically significant correlation between the performance of passenger cars on the NCAP test and the fatality risk of belted drivers in actual head-on collisions. Since many head-on collisions differ substantially from NCAP test conditions, this suggests NCAP scores are correlated with actual crashworthiness in a wide range of crashes.

  • In a head-on collision between a car with "good" NCAP performance and a car of equal mass with "poor" performance, the driver of the "good" car has, on the average, about 15-25 percent lower fatality risk.

  • A highly effective way to differentiate "good" from "poor" NCAP performance is by a single, composite NCAP score, such as a weighted combination of the scores for the three body regions. However, even the NCAP score for any single body region can be used to partition the fleet so that the cars with "good" scores have significantly lower fatality risk than the cars with "poor" scores. The borderline between "good" and "poor" NCAP scores that optimizes the differences in actual fatality risk is close to the FMVSS 208 criteria for each of the three body regions.

  • NCAP scores have improved steadily since the inception of the program in 1979, with the greatest improvement in the early years. By now, most passenger cars meet the FMVSS 208 criteria in the 35 mph NCAP test. This achievement has been paralleled by a 20-25 percent reduction of fatality risk for belted drivers in actual head-on collisions in model years 1979-91, with the largest decreases during the early 1980's.

  • This is a statistical study and it is not appropriate for conclusions about cause and effect. It shows that passenger cars became significantly safer in head-on collisions during 1979-91, as NCAP scores improved. It does not prove that the NCAP program was the stimulus for each of the vehicle modifications that saved lives during 1979-91. (For example, the automatic protection requirement of FMVSS 208 was another important stimulus.)

  • The correlation between NCAP scores and actual fatality risk is statistically significant, but it is far from perfect. On the whole, cars with poor NCAP scores have higher-than-average fatality risk in head-on collisions, but there is no guarantee that every specific make-model with poor NCAP scores necessarily has higher fatality risk than the average car. Conversely, there is no guarantee that a specific model with average or even excellent scores necessarily has average or lower-than-average fatality risk in head-on collisions.

  • The data show that cars with poor NCAP scores (e.g., above the FMVSS 208 criteria) have significantly elevated fatality risk in head-on collisions, but they do not show a significant difference between the fatality risk of cars with exceptionally good NCAP performance and those with merely average performance.