
Appendix B: Roundoff Errors in Large Sums

Errors in the Sum of a List of Numbers

Accumulating Errors after Two Operations

To illustrate the errors after two arithmetic operations, the approximate error in (a+b)+c is computed here:

fl(fl(a+b) + c) = ((a+b)*(1+e1) + c)*(1+e2) ≈ (a+b+c) + (a+b)*e1 + (a+b+c)*e2

so the error is approximately (a+b)*e1 + (a+b+c)*e2, where the terms containing e1*e2, which are much smaller than the other terms, are thrown away.
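To see the consequence of this formula concretely, the following hypothetical C fragment (not part of the original appendix) adds the same three double precision constants in both groupings. The constants 0.1, 0.2, and 0.3 are not exactly representable in binary floating point, and on IEEE 754 hardware the two groupings typically differ in the last place:

#include <stdio.h>

int main(void)
{
    double a = 0.1, b = 0.2, c = 0.3;

    double left  = (a + b) + c;   /* error is roughly (a+b)*e1 + (a+b+c)*e2 */
    double right = a + (b + c);   /* error is roughly (b+c)*e1 + (a+b+c)*e2 */

    printf("(a+b)+c    = %.17g\n", left);    /* typically 0.60000000000000009 */
    printf("a+(b+c)    = %.17g\n", right);   /* typically 0.59999999999999998 */
    printf("difference = %g\n", left - right);
    return 0;
}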

There are two interesting aspects of this formula for the error of two additions:

1. The error depends on the grouping: computing a+(b+c) instead gives an error of approximately (b+c)*e1 + (a+b+c)*e2, so floating point addition is not associative.
2. Arguments that enter the computation early, such as a and b here, appear in more of the error terms than arguments that enter late, such as c.

An Example of Nonassociativity

To highlight the effect of order on a sum, it helps to use numbers that vary widely in size. This increases the chance that some error terms will arise in which a relatively large argument is multiplied by a relatively large roundoff error. Since the different orders of addition change the combinations of arguments and errors, these relatively large terms are unlikely to occur in both of the different-ordered additions.

To get a set of such numbers, take the floating point quotient of two random integers, with the numerator ranging over [0, RAND_MAX] and the denominator over [1, RAND_MAX]; RAND_MAX is a C-implementation-dependent constant whose exact value does not matter, as long as it permits a wide range of random values. In an actual application, double precision numbers would be used, but single precision (float) numbers are used here so that noticeable errors appear after fewer computations, for illustrative purposes.

Adding various numbers of these random quotients in opposite orders produced the following results.

Table 5. Order Errors for Addition
Size     Sum Up          Sum Down        Diff.      Avg. Up     Avg. Down   Diff./Size
10            33.325592       33.325592   0.000000    3.332559    3.332559    0.000000
100          495.900269      495.900146   0.000122    4.959002    4.959002    0.000001
1000        6318.028320     6318.027832   0.000488    6.318028    6.318028    0.000000
10000      67116.781250    67116.718750   0.062500    6.711678    6.711672    0.000006
100000    562199.375000   562194.625000   4.750000    5.621994    5.621946    0.000048

The C program producing these numbers appears in averrdemo.c (see below).

Computing the Maximum Error

Because all operations satisfy the same bounds for errors,

fl(x op y) = (x op y)*(1+e),  with |e| <= u,  for each arithmetic operation op (+, -, *, /),

repeated application of the previous section's techniques provides a maximum error for a numerical computation.

If the precision is constant throughout the computation, the maximum error |fl(x) - x|, for a real number x computed within the range of the floating point number system, is u*SumArgs, where SumArgs is the sum of the absolute values of the arguments of all arithmetic operations used to compute x.

This maximum error can be computed along with x itself, because the arguments of each operation are available as the computation proceeds: their absolute values can simply be accumulated into SumArgs in parallel with the computation.
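As a minimal sketch of this bookkeeping (an illustrative fragment, not code from the original appendix), the function below sums an array of floats in order and accumulates SumArgs at the same time. FLT_EPSILON/2 is used here as the single precision unit roundoff u, an assumption consistent with the float arithmetic of the demonstration above.

#include <float.h>
#include <math.h>

/* Sum x[0..n-1] in order, also computing the bound u*SumArgs on the
   accumulated roundoff error, where SumArgs is the sum of the absolute
   values of the arguments of every addition performed. */
float sum_with_error_bound(const float x[], int n, float *max_err)
{
    float  sum      = 0.0f;
    double sum_args = 0.0;   /* carried in double so the bound itself stays accurate */
    int    i;

    for (i = 0; i < n; i++) {
        sum_args += fabs(sum) + fabs(x[i]);   /* arguments of this addition */
        sum      += x[i];
    }
    *max_err = (float)((FLT_EPSILON / 2.0) * sum_args);   /* u * SumArgs */
    return sum;
}

The same bookkeeping extends to subtraction, multiplication, and division: whenever an operation is performed, the absolute values of its operands are added to SumArgs.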

Computing the Probable Error

While the small individual errors need not be random, they often behave as if they were, and the observed error is then much smaller than the maximum error. It is therefore useful to treat the actual error as a weighted sum of random error variables and to compute the standard deviation of the error.

From the analysis above, the actual error is a weighted sum of the random error variables (the e's). Under reasonable assumptions about the probability density functions of the individual e's, such as a uniform distribution over an interval of radius u around zero, the Central Limit Theorem applies.

For e distributions for which the Central Limit Theorem applies, such as a uniform distribution for each of the e's, the total error is approximately normally distributed, so its likely size is characterized by its standard deviation.
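To indicate how such an estimate can be formed (this is one standard model, given here as an illustration rather than as the exact estimate used in this appendix): write the total error as w1*e1 + w2*e2 + ... + wk*ek, where the weights wi are the coefficients produced by the error analysis (for the two additions above, w1 = a+b and w2 = a+b+c). If the e's are independent and each is uniform over an interval of radius u, elementary statistics gives

Var(error) = (w1^2 + w2^2 + ... + wk^2) * (u^2/3),

so the standard deviation of the error is sqrt(w1^2 + ... + wk^2) * u/sqrt(3).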

Application: Adding N Numbers

Consider adding, in order, N floating point numbers that are all about the same size.

In this case, the intermediate results are the partial sums of the first two numbers, the first three numbers, and so on, up to the sum of all N numbers.

Let m be the mean of the numbers. Using the assumption that the numbers are about equal, the partial sums are approximately 2*m, 3*m, and so on. The sum of these partial sums is approximately ((N-2)/2)*(N*m); adding in the numbers themselves, which also appear as arguments, gives a total sum of arguments of approximately ((N-1)/2)*(N*m), or roughly N^2/2*m.
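This bookkeeping can be checked numerically. The following hypothetical C fragment (not from the original appendix) accumulates the sum of arguments directly for N numbers all equal to m and compares it with the closed form N^2/2*m; the two agree up to the lower-order terms neglected above.

#include <stdio.h>

int main(void)
{
    double m = 3.0;            /* common size of the numbers (arbitrary choice) */
    long   N = 10000;          /* how many numbers are added */
    double sum = 0.0, sum_args = 0.0;
    long   k;

    for (k = 0; k < N; k++) {
        sum_args += sum + m;   /* arguments of this addition: the partial sum and m */
        sum      += m;
    }
    printf("SumArgs   = %.6g\n", sum_args);
    printf("N^2/2 * m = %.6g\n", 0.5 * (double)N * (double)N * m);
    return 0;
}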

For the usual double precision arithmetic, the variance of a single random error variable is

Var(e) = u^2/3, where u is the double precision unit roundoff.

Because the standard deviation is the square root of the variance, keeping the error standard deviation below 1 part in 10,000 requires the variance to be less than E-8, which in turn requires N^2/2*m < E15. This shows that the error in most sums on a computer is negligible.

However, some data miners compute statistics from vast samples. If N = E5 and m = E6, for example, there might be a standard deviation greater than E-4.

Large sums also appear in numerical integration. However, the individual summands are the areas of very small approximate trapezoids, so for numerical integration, roundoff error usually can be ignored.

Restrictions on the Probable Error Computation

Note that these statistical estimates break down if the computation contains only a few operations. In this case, direct computation of the errors, as was done for two additions above, is preferred. It is also important to analyze separately those arguments occurring during the computation that have different precisions. These results for different precisions can be combined using elementary statistics in the same way that the above estimates for expected error were obtained.

Averrdemo.c

Random Quotients
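A minimal sketch of a program of this kind (not the original listing) might look as follows, assuming single precision sums of random quotients taken in increasing and then decreasing index order, as described above; the function and variable names are illustrative and are not necessarily those of the original averrdemo.c.

#include <stdio.h>
#include <stdlib.h>

#define MAXN 100000

/* Quotient of two random integers: numerator in [0, RAND_MAX],
   denominator in [1, RAND_MAX]. */
static float random_quotient(void)
{
    return (float)rand() / (float)(rand() % RAND_MAX + 1);
}

int main(void)
{
    static float x[MAXN];
    int sizes[] = { 10, 100, 1000, 10000, 100000 };
    int s, i;

    srand(12345);   /* fixed seed so runs are repeatable */
    for (i = 0; i < MAXN; i++)
        x[i] = random_quotient();

    printf("Size      Sum Up       Sum Down     Diff.     Avg. Up   Avg. Down  Diff./Size\n");
    for (s = 0; s < 5; s++) {
        int   n  = sizes[s];
        float up = 0.0f, down = 0.0f;

        for (i = 0; i < n; i++)        /* add in increasing index order */
            up += x[i];
        for (i = n - 1; i >= 0; i--)   /* add the same numbers in reverse order */
            down += x[i];

        printf("%-8d %12.6f %12.6f %9.6f %9.6f %9.6f %10.6f\n",
               n, up, down, up - down, up / n, down / n, (up - down) / n);
    }
    return 0;
}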

