Proakatemia essay bank

Deceiving the human mind with data

Written by: Teemu Istolainen, from team Avanteam.

Essay type: Individual essay / 2 essay points.

How to lie with statistics
Huff, Darrell
The estimated reading time for this essay is 4 minutes.

Released in 1954, How to Lie with Statistics by Darrell Huff is still relevant for navigating today’s flood of infographics on social media. The book is

One of the main sources of inaccuracy and invalidity is the sample. The book's example is Time magazine's claim that the average Yale graduate of the class of 1924 earns $25,111 a year. Taking inflation into consideration, that would be around $372,000 in 2020. But why is the number wrong, and far too precise? The answer lies in the sample. The average was biased from the start, since there were four categories of alumni: those who responded, those who did not, those whose addresses were unknown, and those who were dead. Of the people who replied, how would we know that they answered truthfully? Did they exaggerate or underestimate? What kind of person would not reply? We can deduce that underachievers would not answer the survey willingly, and neither would people who make significantly more than the average. Each of those missing groups distorts the average by itself: leaving out the low earners inflates it, while leaving out the top earners depresses it.
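The effect of missing respondents can be sketched with a few lines of Python. The incomes below are hypothetical values made up for illustration, not figures from the book, and the helper `mean_of_respondents` is likewise an assumed name. The sketch only shows the direction in which the average moves when one end of the sample stays silent.

```python
from statistics import mean

# Hypothetical yearly incomes in dollars (NOT figures from the book).
population = [8_000, 12_000, 15_000, 18_000, 22_000,
              25_000, 30_000, 40_000, 60_000, 150_000]


def mean_of_respondents(incomes, lowest_missing=0, highest_missing=0):
    """Mean income when the lowest and/or highest earners do not respond."""
    ordered = sorted(incomes)
    respondents = ordered[lowest_missing:len(ordered) - highest_missing]
    return mean(respondents)


true_mean = mean(population)
without_low = mean_of_respondents(population, lowest_missing=3)
without_high = mean_of_respondents(population, highest_missing=1)

print(f"Everyone answers:           ${true_mean:,.0f}")
print(f"Three lowest earners silent: ${without_low:,.0f}")
print(f"Top earner silent:           ${without_high:,.0f}")
```

With these made-up numbers, losing the three lowest earners pushes the average up, and losing the single top earner pulls it down, even though nobody lied.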

Let us continue with the average. Lying with averages is easy, since in statistics there are three kinds of average.

  • Mean

The arithmetic average, or mean, is the traditional average: for example, add ten people's salaries together and divide the total by ten.

  • Median

The median is found by sorting the sample and separating the higher half from the lower half, which leaves you with the “middle” value.

  • Mode

The mode is the most frequently occurring value in a sample. Below you can find a quick example of all three averages from a sample.

Anna and Brian both have a yearly salary of $20,000. Charlie makes $30,000 a year. Danny earns $35,000 yearly. Emma’s yearly salary is $50,000 and Francis’s is $65,000. George earns $200,000 a year. In this case, the mean is $60,000, the median is $35,000, and the mode is $20,000. Now you just choose the one that is most convenient and supports your claims.
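The three averages of the salary example can be checked with Python's standard `statistics` module. This is just a small sketch that uses the seven salaries from the paragraph above; the variable names are my own.

```python
from statistics import mean, median, mode

# Yearly salaries in dollars, taken from the example above.
salaries = {
    "Anna": 20_000,
    "Brian": 20_000,
    "Charlie": 30_000,
    "Danny": 35_000,
    "Emma": 50_000,
    "Francis": 65_000,
    "George": 200_000,
}
values = list(salaries.values())

mean_salary = mean(values)      # sum of all salaries divided by their count
median_salary = median(values)  # middle value of the sorted salaries
mode_salary = mode(values)      # most frequently occurring salary

print(f"Mean:   ${mean_salary:,.0f}")
print(f"Median: ${median_salary:,.0f}")
print(f"Mode:   ${mode_salary:,.0f}")
```

George's salary alone pulls the mean well above what six of the seven people earn, which is why the choice of average matters so much.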

The next chapter deals with using a small sample for advertising purposes. “Users report 23% fewer cavities with Doakes’ tooth paste” says the big type. However, the small type reveals a sample size of just a dozen persons. Let any group keep count of cavities for a few months, then switch to Doakes’. One of three things is bound to happen: distinctly more cavities, distinctly fewer, or about the same number. Sooner or later a sample will show distinctly fewer cavities, and that is the one worth advertising. Flip a coin ten times and record whether it lands heads or tails (or on its side). I tried it and got 7 tails and 3 heads. According to my sample, flipped coins come up tails 70% of the time. If I were to flip a coin a thousand times, the result would be closer to 50%.
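The coin experiment is easy to repeat on a computer. The following is a minimal sketch that assumes a fair coin simulated with Python's `random` module; the function name `share_of_tails` and the seed value are arbitrary choices of mine. The exact shares depend on the seed, so no particular output is promised here.

```python
import random


def share_of_tails(flips, seed):
    """Simulate `flips` fair coin tosses and return the share that were tails."""
    rng = random.Random(seed)  # seeded so the experiment can be repeated
    tails = sum(rng.random() < 0.5 for _ in range(flips))
    return tails / flips


for flips in (10, 1_000, 100_000):
    share = share_of_tails(flips, seed=1954)
    print(f"{flips:>7} flips: {share:.1%} tails")
```

Small samples can land far from 50%, while large ones settle close to it, which is exactly why a dozen test users can be made to "prove" almost anything.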

Statistics are inevitably bound to have some kind of margin of error. For example, the Stanford-Binet test is one of the most accurate intelligence tests there is. According to the test, Peter’s IQ is 98 and Linda’s 101. Having a quick look at these numbers, any sane person would say that Linda is more intelligent and above the average, which is 100. However, the probable error of the Stanford-Binet test has been found to be 3%. Taking the margin of error into account, Peter’s IQ is actually 98 ± 3 and Linda’s 101 ± 3. So there is a one-in-four chance that Peter’s true IQ is above 101, and a similar chance that Linda’s is below 98. Comparing figures with small differences is therefore meaningless when a probable error of that size is involved.
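The one-in-four figure can be reproduced with a short calculation. A probable error is the range around a measurement that contains the true value half of the time. The sketch below additionally assumes that the measurement error is normally distributed, which the essay's source does not state; under that assumption the probable error equals about 0.6745 standard deviations.

```python
from statistics import NormalDist

PROBABLE_ERROR = 3.0  # IQ points, as quoted in the text

# ASSUMPTION: normally distributed error. Half of a normal distribution lies
# within +/- z standard deviations, where z is the 75th percentile (~0.6745).
z = NormalDist().inv_cdf(0.75)
sigma = PROBABLE_ERROR / z

peter = NormalDist(mu=98, sigma=sigma)
linda = NormalDist(mu=101, sigma=sigma)

p_peter_above_101 = 1 - peter.cdf(101)  # Peter's true IQ above 98 + 3
p_linda_below_98 = linda.cdf(98)        # Linda's true IQ below 101 - 3

print(f"P(Peter above 101) = {p_peter_above_101:.2f}")
print(f"P(Linda below 98)  = {p_linda_below_98:.2f}")
```

Half of the probability lies outside the ± 3 band, split evenly between the two sides, so each tail holds one quarter.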

Next, you will see three different charts with exactly the same information to show you how easy it is to trick the human mind. This trick is known as “the Gee-Whiz Graph”. The graph shows how national income increased ten per cent in a year.

Figure 1: (Geis, 1954)

The first figure accurately depicts the 10% growth in national income in a year. Not so impressive.

Figure 2: (Geis, 1954)

The second graph is cropped but contains the same information as the one above. It is more striking to look at, because the growth now takes up half of the graph.

Figure 3: (Geis, 1954)

Figure 3 goes one step further: the marks on the left-hand axis now each stand for one-tenth as many dollars as before, which stretches the graph and makes the growth look even more impressive.
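How much the axis range alone changes the impression can be put into numbers. The sketch below uses hypothetical values: an income that grows by 10% from 20 to 22 units, and three axis ranges that I picked to mimic the three figures. None of these numbers or names are taken from the book's charts.

```python
def visual_rise_fraction(start, end, axis_min, axis_max):
    """Fraction of the chart's height taken up by the rise from start to end."""
    if not (axis_min <= start <= axis_max and axis_min <= end <= axis_max):
        raise ValueError("both values must lie within the axis range")
    return (end - start) / (axis_max - axis_min)


# Hypothetical data: income rises 10%, from 20 to 22 units.
START, END = 20.0, 22.0

# Hypothetical axis ranges, chosen only to mimic the three figures.
axis_ranges = {
    "Full axis from zero": (0.0, 22.0),
    "Cropped axis": (18.0, 22.0),
    "Stretched axis": (20.0, 22.0),
}

for label, (axis_min, axis_max) in axis_ranges.items():
    fraction = visual_rise_fraction(START, END, axis_min, axis_max)
    print(f"{label:<20} rise fills {fraction:.0%} of the chart height")
```

The data never change. Only the axis does, and the same 10% rise grows from a small step into a line that climbs the whole chart.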

This book demonstrates how the human mind can be deceived with statistics that support the provider's cause. Even though the book was released 66 years ago, its content still applies to today’s statistics. After reading this book, I will take each graph, diagram, and chart I see on social media with a pinch of salt. Nine out of ten dentists recommend Oral-B. But what does the tenth dentist recommend?


Huff, D. & Geis, I. (1954). How to Lie with Statistics. New York: Norton.