I don’t want to make this too emotive – I’ve already toned down the title from “Why do people lie?” But, why indeed? There are probably a thousand psychological reasons but I’m asking in just one specific context: opinion polls.

I’ve often said that it’s impossible to be interested in data visualisation without taking an interest in political opinion polls, particularly in the run-up to high-profile elections. They offer geographic data displayed in many different ways, changing trends over time and predictions based on existing data: many of the key elements in visualising data and showing simple insights.

There are so many fantastic visualisations from before election night, all showing the lie of the land at the time, all allowing the reader to decide whether the most likely winner would be Trump or Clinton.

First – here’s a wonderful interactive viz from Ryan Rowland: it’s captured the imagination so much that it’s had over a quarter of a million views (click through to interact).

[Image: tvc]

Next, below is a simple visualisation I made last month which uses exactly the same dataset (click through for the interactive version). I don’t claim it’s a great visualisation but it follows a certain formula – a schematic map showing the current predicted vote adjusted to take account of the electoral college system, a timeline of predictions based on previous polls up to the (then) present time, a prediction of the final outcome based on the most recent polls, and an additional look at swing states to get some idea of the potential for change.

[Image: race-to-white-house-2]

Looked great last month though I say so myself. Looks pretty stupid now.

I’m new to data visualisation but have worked with survey data for over twenty years. Polls have been criticised for as long as I can remember, even when their predictions have been generally accurate. The general truth is that in polls from most reputable institutions, sample sizes are large enough to give a decent reflection of national opinion (even if nobody you know has ever been asked). If samples are weighted, they are weighted scientifically and correctly, not with any bias toward red, blue or yellow.
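As a rough illustration of why a sample of around a thousand people can be "large enough", here is a minimal sketch of the standard margin-of-error calculation for a simple random sample. The function name and the sample sizes are my own choices for illustration, and real polls use weighting and more complex designs, so treat this as a back-of-the-envelope guide only.

```python
import math


def margin_of_error(sample_size, share=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    share=0.5 is the worst case; z=1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(share * (1 - share) / sample_size)


# Hypothetical sample sizes, chosen only for illustration.
for n in (500, 1000, 2000):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} points")
```

On this simple model, a sample of 1,000 gives roughly three points either way, regardless of how large the national population is.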

So why did all the polls get it wrong? They got it badly wrong for the US election, badly wrong for Brexit and badly wrong for the UK general election. Or did they? I actually don’t think they did. The inference from my visualisation, based on poll data, is that if people vote the way they said they would in this poll on the given date, Clinton would win easily. Science, sampling and accurate questioning can account for most things, but there’s one thing they can’t account for. And that is people not voting for who they say they’ll vote for.

I don’t really want to veer too far into politics or psychology, since I should stay on safer ground with polling and data visualisation. But for what it’s worth, it seems clear to me that polls have become less reliable at predicting major election results in the age of mass media (especially social media). In all three recent surprise results, it’s been the less moderate, more right-wing, more anti-immigration option that has got more votes on the day. People feel the liberal media are telling them what to do, and they don’t like it. They won’t admit it to an interviewer, a survey, or a pollster, but behind a curtain or in a sealed envelope, they will vote Tory, Brexit or Trump. And when I say “people” all I mean is a couple of percent of the population. Enough to make a difference. And enough to make well-designed polls look stupid. The more that everyone’s vote (and the inference about what kind of person it makes the voter) comes under the proverbial microscope, the more people withhold their true voting preference all the way through a long, protracted campaign.
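To show how little it takes, here is a toy sketch of the arithmetic. It assumes a two-option race and that a small slice of the electorate votes for option B but tells the pollster A. The function name and every number are hypothetical, not taken from any real poll.

```python
def polled_shares(true_a, true_b, shy_b):
    """Return the (A, B) shares a poll would report, in percentage points.

    Assumes `shy_b` points of the electorate vote B but say A when polled.
    """
    if shy_b > true_b:
        raise ValueError("shy_b cannot exceed B's true share")
    return true_a + shy_b, true_b - shy_b


# Hypothetical result on the day: B wins 51 to 49.
true_a, true_b = 49.0, 51.0
poll_a, poll_b = polled_shares(true_a, true_b, shy_b=2.0)
print(f"Poll says A {poll_a:.0f}, B {poll_b:.0f}")
print(f"Vote says A {true_a:.0f}, B {true_b:.0f}")
```

With just two points of the electorate misreporting, a perfectly sampled poll shows A ahead by two when B actually wins by two.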

In the UK at the last General Election they called it the “shy Tory” phenomenon. And in the US, I came across one good article that goes some way to explaining it:

http://www.latimes.com/politics/la-na-pol-donald-trump-american-voices-20161113-story.html

I’m not talking about propaganda and lies by politicians and media outlets. Depending on whether your politics are red, blue, yellow or any colour of the rainbow, you’ll probably think that’s going on everywhere at the hands of your political opponents. Here’s a debunking of a post-election pro-Trump article that I know to be untrue. Why? Because I know the people on that bus – they were data nerds like me en route from the Tableau Conference to the evening’s party!

http://www.snopes.com/anti-trump-protesters-bused-into-austin/

I’ll watch with interest to see if future polls try to take account of respondent behaviour and reluctance to admit to certain beliefs. Because if not, the opinion polls may have taken one too many blows to their credibility, and that in turn will affect one of the fastest-moving and most up-to-date areas of data visualisation.

I learned last week that data visualisation is like a language – if your readers can’t understand it, you’ve done it wrong. I’m not sure what the revised analogy is here: if your readers don’t believe your visualisation, then you haven’t done it wrong per se, but its existence becomes pretty worthless.

People are making idiots of the pollsters and consequently making visualisations look meaningless. I have grown weary, and I’ve seen enough red and blue lines and distorted coloured cartograms to last me a long time. Which is a real shame, as these are exactly the kind of things that got me excited about data visualisation in the first place.
