While polls are often criticized for failing to reach the right voters or ask the right question, there's one crucial aspect that's often overlooked — the undecideds.
This oversight is particularly important because it's potentially a major source of errors in polls and could well lead to the Conservatives being underestimated in this election.
Polls always have undecided voters in their samples, though the exact definition can vary from one pollster to the next.
Some firms give options such as "don't know" or "will not vote," while others don't. Some ask multiple questions: for instance, they first ask respondents who they intend to vote for and then, if someone doesn't express a preference, follow up with a question like, "Are you leaning towards one party?"
Because of these differences in methodology (questions asked, data collection, etc.), polls report very different shares of undecided voters. Abacus, for instance, had as much as 24 per cent of its sample in this category. On the other hand, Forum polls tend to have a ridiculously low number of undecideds (usually around five per cent). On average, about 10 per cent of respondents can't choose a side.
How do polling firms deal with these undecided? They usually distribute them proportionally.
Here's an example: imagine you have three parties (A, B and C). The poll finds 40 per cent support for Party A, 30 per cent for Party B and 10 per cent for Party C, as well as 20 per cent undecided (there is no small or fringe party in this scenario).
Among decided voters, Party A thus has half of the votes (40 out of 80 per cent). Most polling firms will therefore give Party A half of the undecideds, or 10 percentage points, bringing it to 50 per cent support in total. Doing the same with the other two parties gives us results that sum to 100 per cent.
The main problem is that distributing undecided support proportionally isn't really logical. It is equivalent to dropping or ignoring the undecideds altogether (in effect assuming they won't vote at all). Alternatively, you can rationalize it as assuming the undecideds will ultimately vote just like the decided voters. Neither interpretation makes much sense.
It also means that whichever party polls higher among decided voters automatically gets a bigger share of the undecideds. In our fictional example, the final results (the ones that would be published in the media) would be:
Party A: 50 per cent
Party B: 37.5 per cent
Party C: 12.5 per cent
As you can see, Party A's 10-point lead among decided voters has become a 12.5-point lead, thanks to its bigger share of the undecideds.
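The proportional allocation described above can be sketched in a few lines of Python. The party labels and numbers come from the fictional example; the function name is ours, not a pollster's:

```python
# A minimal sketch of proportional redistribution, using the fictional
# numbers from the example above (Party A: 40, B: 30, C: 10, undecided: 20).
def distribute_proportionally(support):
    """Split the undecided share among parties in proportion to their
    support -- mathematically the same as dropping the undecideds."""
    decided = sum(support.values())   # 80 per cent in this example
    undecided = 100.0 - decided       # 20 per cent in this example
    return {party: pct + undecided * pct / decided
            for party, pct in support.items()}

poll = {"A": 40.0, "B": 30.0, "C": 10.0}
print(distribute_proportionally(poll))
# -> {'A': 50.0, 'B': 37.5, 'C': 12.5}
```

Note that each adjusted number is just `pct / decided * 100`, which is why this is equivalent to simply ignoring the undecideds.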
Does it make sense that whoever is first among decided voters automatically gets more undecided support? We don't necessarily think so. And when as much as 24 per cent of a sample can be undecided, what we do with these voters is crucial.
We believe it would sometimes be better for pollsters to report their numbers before distributing the undecideds. (To be fair, they all do so in the detailed documents on their websites; we are talking here about how polls are reported and published in the media.)
On the other hand, don't see the percentage of undecideds as a reason to believe that anything is possible. In our example, Party B could indeed overtake Party A by winning every single undecided voter, but this is so unlikely that it's borderline ridiculous to even think this way. But this is a topic for another day.
If we go back to the undecideds and what to do with them, there isn't a perfect solution. After all, if some people genuinely haven't made up their minds (or refuse to tell you), there isn't much you can do about it. With that said, we can certainly do better than distributing them proportionally, not least because the proportional distribution often leads to errors: specifically, an underestimation of the incumbent and an overestimation of the smaller parties.
In Quebec, poll expert Claire Durand has successfully allocated more of the undecideds to the Liberals and fewer to the Parti Quebecois for a very long time. More generally, if we look at past elections in this country, we observe a strong tendency for the polls to underestimate the incumbent. We saw this in Alberta in 2012, in B.C. in 2013 and in Ontario last year.
Federally, the last two elections have seen the polls underestimate the Conservatives by about three points. Let's take a look back at 2011. I managed to collect most of the major polls from the end of the campaign, averaged them by province and compared that to the actual results. The table below shows the differences in percentage points. A positive number means the polls overestimated that party; a negative one means they underestimated it.
As you can see, the underestimation of the Conservatives was systematic. The error was especially large in Ontario. This, right there, is the main reason why seat projections in 2011 mostly failed to predict a Tory majority. When polls miss the mark by more than five points in the most important province, you can expect to make mistakes when using them to make predictions. Hopefully the polls will do better this year, although Ontario may not be the key.
Is there anything we could do? The method we use distributes undecided support similarly to what Claire Durand has been doing in Quebec for a while. First of all, we don't allocate any of it to the small parties (any party outside the Tories, Liberals and NDP). When polls are published only with the numbers after redistribution, we need to work backward to remove the undecideds allocated to the smaller parties.
Secondly, we allocate half of the undecideds to the incumbent and split the rest evenly between the NDP and Liberals (except in Quebec, where the NDP and Liberals receive the biggest share).
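As a rough sketch of this allocation rule (half of the undecideds to the incumbent, the rest split evenly between the two main opposition parties, nothing to the small parties), here is one way it could look in Python. The party labels and poll numbers below are illustrative only, not the actual poll average, and the function name is ours:

```python
# A hedged sketch of incumbent-weighted allocation: half of the undecided
# share goes to the incumbent, the remainder is split evenly between the
# main opposition parties, and small parties receive none.
def allocate_with_incumbent_bonus(support, incumbent, main_opposition):
    decided = sum(support.values())
    undecided = 100.0 - decided
    adjusted = dict(support)                  # small parties stay unchanged
    adjusted[incumbent] += undecided / 2      # half to the incumbent
    for party in main_opposition:             # rest split evenly
        adjusted[party] += undecided / 2 / len(main_opposition)
    return adjusted

# Hypothetical numbers with 8 per cent undecided (sum of support is 92).
poll = {"CPC": 30.0, "NDP": 32.0, "LPC": 26.0, "Other": 4.0}
print(allocate_with_incumbent_bonus(poll, "CPC", ["NDP", "LPC"]))
# -> {'CPC': 34.0, 'NDP': 34.0, 'LPC': 28.0, 'Other': 4.0}
```

With these made-up numbers, the incumbent gains four points instead of the roughly 2.6 a proportional split would give it, which illustrates how the rule narrows or closes a gap with the front-runner.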
How much of a difference does it make? The current unadjusted poll average would have the Tories at 28.9 per cent, the NDP at 33.6 and the Liberals at 27.4. With the adjustments, we get 30.7, 32.8 and 27.2 respectively. As you can see, it makes a small difference. Specifically, it makes the current race between Stephen Harper and Thomas Mulcair a lot closer than the polls would suggest if taken at face value.
In the recent Alberta election, this method allowed us to be almost spot on. You may think the polls for that election were very accurate, but that wasn't actually the case. While they did correctly predict an NDP victory, they overestimated the NDP by two to three points and underestimated the Progressive Conservatives by three to four. For the 2011 federal election, a similar method would have reduced the underestimation of the Tories from 3.4 points to two. Not perfect, but closer.
This isn't a magical solution. Every election is different. It could be that Harper and his party are actually overestimated and the vast number of people who just want a change will ultimately have an impact on Election Day.
On the other hand, Harper remains the "safe" option. He's the incumbent; he has been in power for nine years now. Voting for him means you mostly know what you'll get. Others have also raised the possibility of him being underestimated by the polls, along with the broader observation that incumbents are often underestimated.
For people who only make up their minds in the voting booth, going with the safe, known option could be quite significant.