Misleading graph

From Wikipedia, the free encyclopedia

In statistics, a misleading graph, also known as a distorted graph, is a graph that misrepresents data. It constitutes a misuse of statistics and may lead the reader to an incorrect conclusion. Graphs may be misleading because they are excessively complex or poorly constructed. Even graphs that are well constructed and accurately display the characteristics of their data can be subject to different interpretations.[1]

Misleading graphs may be created intentionally to hinder the proper interpretation of data, but they can also be created accidentally, for reasons that include unfamiliarity with the graphing software, misinterpretation of the data, or data that cannot be accurately conveyed. Misleading graphs are often used in false advertising. One of the first authors to write about misleading graphs was Darrell Huff, who published the best-selling book How to Lie with Statistics in 1954. It is still in print.

The field of data visualization describes ways to present information that avoid creating misleading graphs.

Misleading graph methods

It [a misleading graph] is vastly more effective, however, because it contains no adjectives or adverbs to spoil the illusion of objectivity. There's nothing anyone can pin on you.

--How to Lie with Statistics (1954)[2]

There are numerous ways in which a misleading graph may be constructed.[3]

Excessive usage

The use of graphs where they are not needed can lead to unnecessary confusion or misinterpretation.[4] Generally, the more explanation a graph needs, the less the graph itself is needed.[4] Graphs do not always convey information better than tables.[citation needed]

Biased labeling

The use of biased or loaded words in the graph's title, axis labels, or caption may inappropriately prime the reader.[4][5]

Pie chart

  • Comparing pie charts of different sizes can be misleading, as people cannot accurately read the comparative area of circles.[6]
  • Thin slices are hard to discern and may be difficult to interpret.[6]
  • Percentages used as labels on a pie chart can be misleading when the sample size is small.[citation needed]
  • Making a pie chart 3D or adding a slant makes interpretation difficult because of the distorting effect of perspective.[7] Bar-charted pie graphs, in which the height of the slices is varied, may confuse the reader.[7]

3D Pie chart slice perspective

A perspective (3D) pie chart gives the chart a 3D look. Although often used for aesthetic reasons, the third dimension does not improve the reading of the data; on the contrary, these plots are difficult to interpret because of the distorting effect of perspective. The use of superfluous dimensions that do not display the data of interest is discouraged for charts in general, not only for pie charts.[8] In a 3D pie chart, the slices closer to the reader appear larger than those in the back because of the angle at which they are presented.[9]

Figure: Misleading pie chart comparison (panels: misleading pie chart; regular pie chart). In the misleading pie chart, Item C appears to be at least as large as Item A, whereas in actuality it is less than half as large.

Edward Tufte, a prominent American statistician, noted why tables may be preferred to pie charts in The Visual Display of Quantitative Information:

Tables are preferable to graphics for many small data sets. A table is nearly always better than a dumb pie chart; the only thing worse than a pie chart is several of them, for then the viewer is asked to compare quantities located in spatial disarray both within and between pies - Given their low data-density and failure to order numbers along a visual dimension, pie charts should never be used.[10]

Improper scaling

When pictograms are used in bar graphs, they should not be scaled uniformly, as this creates a perceptually misleading comparison.[11] The area of the pictogram is interpreted instead of only its height or width.[12] As a result, the scaling makes the difference appear to be squared.[12]

Figure: Improper scaling of a 2D pictogram in a bar graph (panels: improper scaling; regular; comparison). In the improperly scaled pictogram bar graph, the image for B is actually 9 times larger than A.
Figure: 2D shape scaling comparison (panels: square; circle; triangle). The perceived size increases when scaling.

The effect of improper scaling of pictograms is further exemplified when the pictogram has three dimensions, in which case the effect is cubed.[13]

Figure: Improper 3D pictogram scaling. This fictitious graph improperly scales a three-dimensional pictogram, so home sales appear to have gone up significantly in 2001 over the previous year. Because no frequency axis is supplied, readers cannot quantify the change and are left with only a misleading perception of it. The 2x scaling makes the change appear 2^3, or 8, times larger.
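The squared and cubed effects described in this section follow from how area and volume grow with linear scale. The following is a minimal Python sketch of that arithmetic; the function name `perceived_ratio` is illustrative and does not come from any of the cited sources.

```python
def perceived_ratio(linear_scale: float, dimensions: int) -> float:
    """Return how many times larger a pictogram appears when each of its
    `dimensions` (1, 2 or 3) is multiplied by `linear_scale`."""
    if dimensions not in (1, 2, 3):
        raise ValueError("dimensions must be 1, 2 or 3")
    return linear_scale ** dimensions


# A 2D pictogram drawn 3 times as tall and 3 times as wide covers 9 times the area.
print(perceived_ratio(3, 2))  # 9

# A 3D pictogram scaled 2x in every direction occupies 8 times the volume.
print(perceived_ratio(2, 3))  # 8
```

Under the same arithmetic, a value that is k times larger could be shown without exaggeration by scaling only the height by k, or by scaling each side of a 2D pictogram by the square root of k.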

Additionally, an improperly scaled pictogram may leave the reader with the sense that the item itself has actually changed in size.[14]

Figure: Pictograph alignment and size (panels: misleading, not aligned and different sizes; regular, aligned and similar sizes). Assuming the pictures represent equivalent quantities, there appear to be more bananas in the misleading graph because the bananas occupy the most area and extend furthest to the right.

Truncated graph

A truncated graph (also known as a torn graph or gee-whiz graph) has a y-axis that does not start at 0. These graphs can create the impression of important change where there is relatively little change.

Truncated graphs are useful in illustrating small differences.[15] Graphs may also be truncated to save space.[15] Commercial software such as MS Excel will tend to truncate graphs by default if the values are all within a narrow range, as in this example.

Figure: Truncated bar graph (panels: truncated bar graph; regular bar graph). Both graphs display identical data; however, in the truncated bar graph on the left, the data appear to show significant differences, whereas in the regular bar graph on the right, these differences are hardly visible.
Figure: Indicating a y-axis break. There are several ways to indicate a y-axis break.
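How strongly a truncated axis exaggerates a difference can be estimated by comparing the drawn bar heights with the underlying values. The sketch below is only an illustration: the function name `apparent_ratio` and the values 98, 100 and 96 are hypothetical, and it assumes positive values with a baseline below both of them.

```python
def apparent_ratio(a: float, b: float, baseline: float = 0.0) -> float:
    """Ratio of the drawn bar heights for values a and b when the
    y-axis starts at `baseline` instead of 0."""
    if baseline >= min(a, b):
        raise ValueError("baseline must be below both values")
    return (b - baseline) / (a - baseline)


# Hypothetical values: 100 is about 2% larger than 98.
print(round(apparent_ratio(98, 100), 3))       # 1.02 (axis starts at 0)

# With the axis starting at 96, the second bar is drawn twice as tall.
print(apparent_ratio(98, 100, baseline=96))    # 2.0
```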

Axis changes

Figure: Changing the y-axis maximum (panels: original graph; smaller maximum; larger maximum). Changing the y-axis maximum affects how the graph appears. A higher maximum makes the graph appear to have less volatility, less growth, and a less steep line than a lower maximum.
Figure: Changing the ratio of graph dimensions (panels: original graph; half width, twice height; twice width, half height). Changing the ratio of a graph's dimensions affects how the graph appears.
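Both effects can be expressed as a change in the angle at which a line is drawn. The following Python sketch assumes a simple linear mapping from data units to the plotting area; the function name `apparent_angle` and all numbers are hypothetical.

```python
import math


def apparent_angle(dx, dy, x_range, y_range, width, height):
    """Angle in degrees between a drawn line segment and the horizontal.

    dx, dy:            change in the data along the x- and y-axis
    x_range, y_range:  span of each axis in data units
    width, height:     size of the plotting area in a common length unit
    """
    run = dx / x_range * width     # horizontal length as drawn
    rise = dy / y_range * height   # vertical length as drawn
    return math.degrees(math.atan2(rise, run))


# Hypothetical series rising by 10 units over 10 steps on a square plot.
print(round(apparent_angle(10, 10, 10, 10, 4, 4)))   # 45

# A larger y-axis maximum (range 10 -> 40) flattens the same line.
print(round(apparent_angle(10, 10, 10, 40, 4, 4)))   # 14

# Half the width and twice the height steepen it.
print(round(apparent_angle(10, 10, 10, 10, 2, 8)))   # 76
```

The data are identical in all three calls; only the axis range or the plot dimensions differ.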

No scale

The scales of a graph are often used to exaggerate or minimize differences.[16][17]

Figure: Misleading bar graph with no scale (panels: less difference; more difference). Note the lack of a starting value for the y-axis, which makes it unclear whether the graph is truncated. The lack of tick marks also prevents the reader from determining whether the bars are properly scaled. Without a scale, the visual difference between the bars can be easily manipulated.
Figure: Misleading line graph with no scale (panels: volatility; steady, fast growth; slow growth). Though all three graphs share the same data, and hence the actual slope of the (x,y) data is the same, the way the data are plotted changes the apparent angle of the line. This is because each plot has a different scale on its vertical axis. Because the scale is not shown, these graphs can be misleading.

Improper intervals/units

The intervals and units used in a graph may be manipulated to create or mitigate the impression of change.[9]

Omitting data

Graphs created with omitted data remove information on which to base a conclusion.

Figure: Scatter plot with missing categories (panels: scatter plot with missing categories; regular scatter plot). In the scatter plot with missing categories on the left, the growth appears to be more linear with less variation.
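The impression of a more linear trend can be checked numerically by comparing the correlation coefficient before and after points are dropped. The sketch below uses data invented purely for illustration; the helper `pearson_r` is a plain implementation of the Pearson correlation coefficient and is not taken from the cited sources.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: an upward trend with alternating scatter.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 7]

# Keep only every other category, as a graph with omitted data might.
x_kept, y_kept = x[::2], y[::2]

r_full = pearson_r(x, y)
r_kept = pearson_r(x_kept, y_kept)
print(round(r_full, 3), round(r_kept, 3))   # 0.905 1.0
```

With the omitted points removed, the remaining values fall exactly on a straight line, although the full data set does not.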

In financial reports, negative returns or data that do not support a positive outlook may be excluded to create a more favorable visual impression.[18]

In engineering applications, the omission of data can be fatal. In the Space Shuttle Challenger disaster, engineers failed to properly display data.[19][20][21]

Improper extraction

Graphs based on other graphs should be representative in their presentation.

Extraction has valid uses when searching for anomalies.

Figure: Improper extraction (panels: original graph; extracted graph). The extracted graph does not accurately represent the original graph.

3D

The use of a superfluous third dimension which does not contain information is strongly discouraged as it may confuse the reader.[7]

Complexity

Graphs are designed to allow for easier interpretation of statistical data. However, graphs with excessive complexity can obfuscate the data and make interpretation difficult.

Poor construction

Poorly constructed graphs can make data difficult to discern and thus interpret.

Measuring distortion

Several methods have been developed to determine whether graphs are distorted and to quantify this distortion.[22][23]

Lie factor

\text{Lie factor}=\frac{\text{size of effect shown in graphic}}{\text{size of effect shown in data}}

where

\text{size of effect}=\left| \frac{\text{second value} - \text{first value}}{\text{first value}} \right|

A graph with a high lie factor (>1) would exaggerate change in the data it represents, while one with a small lie factor (>0, <1) would obscure change in the data.[24] A perfectly accurate graph would exhibit a lie factor of 1.0.
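The two formulas above translate directly into code. This is a minimal Python sketch; the function names and the example measurements (bar heights of 2 and 6, data values of 100 and 150) are hypothetical.

```python
def size_of_effect(first: float, second: float) -> float:
    """Relative change |(second - first) / first|."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return abs((second - first) / first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Size of effect shown in the graphic divided by size of effect in the data."""
    data_effect = size_of_effect(data_first, data_second)
    if data_effect == 0:
        raise ValueError("lie factor is undefined when the data do not change")
    return size_of_effect(graphic_first, graphic_second) / data_effect


# Hypothetical example: the data rise from 100 to 150 (a 50% change),
# but the bars are drawn 2 and 6 units tall (a 200% change).
print(lie_factor(2, 6, 100, 150))   # 4.0
```

A result of 4.0 means the graphic shows four times the change present in the data.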

Graph discrepancy index

\text{graph discrepancy index}=100 \left(\frac{a}{b} - 1\right)

where

a=\text{percentage change depicted in graph}
b=\text{percentage change in data}

The graph discrepancy index, also known as the graph distortion index (GDI), was originally proposed by Paul John Steinbart in 1989. GDI is expressed as a percentage ranging from -100% to positive infinity. Zero percent indicates that the graph has been properly constructed, and anything outside the ±5% margin is considered distorted.[22] Research into the use of GDI as a measure of graphic distortion has found it to be inconsistent and discontinuous, which makes GDI difficult to use as a measurement for comparisons.[22]
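As an illustration of the formula and the ±5% margin, the following Python sketch computes the GDI for a few hypothetical percentage changes. The function names are illustrative, and the explicit error for b = 0 simply reflects that the formula divides by b.

```python
def graph_discrepancy_index(pct_change_graph: float, pct_change_data: float) -> float:
    """GDI = 100 * (a / b - 1), expressed in percent.

    a: percentage change depicted in the graph
    b: percentage change in the data
    """
    if pct_change_data == 0:
        raise ValueError("GDI is undefined when the data show no change")
    return 100 * (pct_change_graph / pct_change_data - 1)


def is_distorted(gdi: float, tolerance: float = 5.0) -> bool:
    """Apply the ±5% margin described in the text."""
    return abs(gdi) > tolerance


# Hypothetical cases: the data change by 50% in each.
print(graph_discrepancy_index(200, 50))   # 300.0 (graph exaggerates)
print(graph_discrepancy_index(25, 50))    # -50.0 (graph understates)
print(is_distorted(graph_discrepancy_index(51, 50)))   # False
```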

Data-ink ratio

\text{data-ink ratio}=\frac{\text{`ink' used to display the data}}{\text{total `ink' used to display the graphic}}

The data-ink ratio should be relatively high, otherwise the chart may have unnecessary graphics.[24]

Data density

\text{data density}=\frac{\text{number of entries in data matrix}}{\text{area of data graphic}}

The data density should be relatively high, otherwise a table may be better suited for displaying the data.[24]
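A small Python sketch of the data density calculation follows. The function name, the choice of square centimetres as the unit of area, and the example figures are assumptions made for illustration.

```python
def data_density(n_entries: int, area: float) -> float:
    """Number of entries in the data matrix divided by the area of the graphic."""
    if area <= 0:
        raise ValueError("area must be positive")
    return n_entries / area


# Hypothetical pie chart: 4 numbers shown in a 10 cm x 10 cm graphic.
print(data_density(4, 10 * 10))     # 0.04 entries per square centimetre

# Hypothetical scatter plot: 200 entries in the same area.
print(data_density(200, 10 * 10))   # 2.0 entries per square centimetre
```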

Usage in finance and corporate reports

Graphs are useful in the summary and interpretation of financial data.[25] Graphs allow for trends in large data sets to be seen while also allowing the data to be interpreted by non-specialists.[25][26]

Graphs are often used in corporate annual reports as a form of impression management.[27] In the United States, graphs do not have to be audited as they fall under AU Section 550 Other Information in Documents Containing Audited Financial Statements.[27]

Several published studies have looked at the usage of graphs in corporate reports for different corporations in different countries and have found frequent usage of improper design, selectivity, and measurement distortion within these reports.[27][28][29][30][31][32][33] The presence of misleading graphs in annual reports has led to requests for standards to be set.[18][34][35][36]

Research has found that while readers with poor levels of financial understanding have a greater chance of being misinformed by misleading graphs,[37] even those with financial understanding, such as loan officers, may be misled.[34]

Academia

The perception of graphs is studied in psychophysics, cognitive psychology, and computational vision.[38]

References

  1. Kirk, p. 52
  2. Huff, p. 63
  3. Nolan, pp. 49–52
  4. "Methodology Manual: Data Analysis: Displaying Data - Deception with Graphs". Texas State Auditor's Office. Jan 4, 1996. Retrieved 19 July 2012.
  5. Keller, p. 84
  6. Whitbread, p. 150
  7. Whitbread, p. 151
  8. Few, Stephen (August 2007). "Save the Pies for Dessert". Visual Business Intelligence Newsletter. Perceptual Edge. Retrieved 28 June 2012.
  9. Rumsey, p. 156
  10. Tufte, Edward R. (2006). The Visual Display of Quantitative Information (2nd ed., 4th print. ed.). Cheshire, Conn.: Graphics Press. p. 178. ISBN 9780961392147.
  11. Weiss, p. 60
  12. Utts, pp. 146–147
  13. Hurley, pp. 565–566
  14. Huff, p. 72
  15. Rensberger, Boyce (May 10, 1995). "Slanting The Slope of Graphs". The Washington Post. Retrieved 9 July 2012.
  16. Smith, Karl J. (1 January 2012). Mathematics: Its Power and Utility. Cengage Learning. p. 472. ISBN 978-1-111-57742-1. Retrieved 24 July 2012.
  17. Moore, David S.; Notz, William (9 November 2005). Statistics: Concepts And Controversies. Macmillan. pp. 189–190. ISBN 978-0-7167-8636-8. Retrieved 24 July 2012.
  18. Burgess, Deanna Oxender; Dilla, William N.; Steinbart, Paul John; Shank, Todd M. (May 2008). "Does Graph Design Matter To CPAs And Financial Statement Readers?". Journal of Business & Economics Research 6 (5).
  19. Wainer, pp. 51–53
  20. Robison, Wade (2002). "Representation and misrepresentation: Tufte and the Morton Thiokol engineers on the Challenger". Science and Engineering Ethics 8 (1): 59–81. doi:10.1007/s11948-002-0033-2.
  21. Visual Explanations, pp. 38–53
  22. Mather, Dineli R.; Mather, Paul R.; Ramsay, Alan L. (July 2003). "Is the Graph Discrepancy Index (GDI) a Robust Measure?". SSRN Electronic Journal. doi:10.2139/ssrn.556833.
  23. Mather, Dineli; Mather, Paul; Ramsay, Alan (1 June 2005). "An investigation into the measurement of graph distortion in financial reports". Accounting and Business Research 35 (2): 147–160. doi:10.1080/00014788.2005.9729670.
  24. Craven, Tim (November 6, 2000). "LIS 504 - Graphic displays of data". Faculty of Information and Media Studies. London, Ontario: University of Western Ontario. Retrieved 9 July 2012.
  25. Fulkerson, Cheryl Linthicum; Pitman, Marshall K.; Frownfelter-Lohrke, Cynthia (June 1999). "Preparing Financial Graphics". The CPA Journal.
  26. McNelis, L. Kevin (June 1, 2000). "Graphs, An Underused Information Presentation Technique". The National Public Accountant.
  27. Beattie, Vivien; Jones, Mike (June 1, 1999). "Financial graphs: True and Fair?". Intheblack.
  28. Beattie, Vivien; Jones, Michael John (1 September 1992). "The Use and Abuse of Graphs in Annual Reports: Theoretical Framework and Empirical Study". Accounting and Business Research 22 (88): 291–303. doi:10.1080/00014788.1992.9729446.
  29. Penrose, J. M. (1 April 2008). "Annual Report Graphic Use: A Review of the Literature". Journal of Business Communication 45 (2): 158–180. doi:10.1177/0021943607313990.
  30. Frownfelter-Lohrke, Cynthia; Fulkerson, C. L. (1 July 2001). "The Incidence and Quality of Graphics in Annual Reports: An International Comparison". Journal of Business Communication 38 (3): 337–357. doi:10.1177/002194360103800308.
  31. Isa, Rosiatimah Mohd (2006). "The incidence and faithful representation of graphical information in corporate annual report: a study of Malaysian companies". Technical Report. Institute of Research, Development and Commercialization, Universiti Teknologi MARA. Retrieved 9 July 2012.
  32. Beattie, Vivien; Jones, Michael John (1 March 1997). "A Comparative Study of the Use of Financial Graphs in the Corporate Annual Reports of Major U.S. and U.K. Companies". Journal of International Financial Management and Accounting 8 (1): 33–68. doi:10.1111/1467-646X.00016.
  33. Beattie, V.; Jones, M. (2008). "Corporate reporting using graphs: a review and synthesis". Journal of Accounting Literature 27: 71–110. ISSN 0737-4607.
  34. Christensen, David S.; Larkin, Albert (Spring 1992). "Criteria For High Integrity Graphics". Journal of Managerial Issues (Pittsburg State University) 4 (1): 130–153.
  35. Eakin, Cynthia Firey; Louwers, Timothy; Wheeler, Stephen (2009). "The Role of the Auditor in Managing Public Disclosures: Potentially Misleading Information in Documents Containing Audited Financial Statements". Journal of Forensic & Investigative Accounting 1 (2).
  36. Steinbart, P. (September 1989). "The Auditor’s Responsibility for the Accuracy of Graphs in Annual Reports: Some Evidence for the Need for Additional Guidance". Accounting Horizons: 60–70.
  37. Beattie, Vivien; Jones, Michael John (1 January 2002). "Measurement distortion of graphs in corporate reports: an experimental study". Accounting, Auditing & Accountability Journal 15 (4): 546–564. doi:10.1108/09513570210440595.
  38. Frees, Edward W.; Miller, Robert B. (Jan 1998). "Designing Effective Graphs". North American Actuarial Journal 2 (2): 53–76.