On the eve of the U.K. parliamentary election, the final polls are showing everything from a 12-percentage-point margin favoring Prime Minister Theresa May’s Conservative Party to a slim one-point edge over Jeremy Corbyn’s Labour Party. That variation stems mostly from massive differences in vote preferences by age and uncertainty about which voters will turn out.
SurveyMonkey’s election-eve poll conducted for The Sun gives the Conservative Party a tenuous 4-percentage-point lead, 42 to 38 percent, over Labour when allocating currently undecided voters. It’s an identical margin without the reallocation (39 to 35 percent; these data are from a survey of 11,853 respondents conducted June 4-6).
Bringing up the rear are the Liberal Democrats (6 percent), the United Kingdom Independence Party (UKIP, 4 percent), the Scottish National Party (SNP, 3 percent) and the Green Party (1 percent). An assortment of other parties together garner 5 percent.
Thus, a race that had looked to be a blowout for the Tories when May called the snap election in April, at a moment when her party was “riding high in the opinion polls,” now appears much more competitive. The driver? May’s rating as Prime Minister is now underwater, with 45 percent of likely voters satisfied with her job performance and 53 percent dissatisfied.
That’s nearly a mirror image of the 52 percent satisfied, 47 percent dissatisfied with her just two weeks ago. Still, those numbers are far better than Corbyn’s; his numbers have improved somewhat but he’s still well underwater with 39 percent satisfied with him as Labour Leader, and 59 percent dissatisfied.
However, the most striking aspect of polling on the 2017 campaign is the massive difference in voter preferences by age. The Conservatives lead by better than four to one (61 to 15 percent) among voters over age 65, while Labour enjoys a lead nearly as large among 18- to 24-year-olds (59 to 17 percent) and 25- to 34-year-olds (55 to 20 percent). The Tories hold a nearly two-to-one lead among all voters 45 and older (49 to 26 percent), while Labour leads by better than two to one (52 to 21 percent) among those under 45 (all percentages are based on the vote without undecideds allocated).
This exceptionally large age gap creates headaches for pollsters because of the typical differences in turnout. In the U.K., as in the United States and elsewhere around the world, younger voters typically turn out to vote at rates far lower than their elders. Since younger voters often overstate their intent to vote in public opinion polls, pollsters can easily exaggerate their proportion of the electorate. When differences in voter preferences by age are large, as they are in this year’s U.K. elections, the potential for an error in the horse race is much greater.
This pattern helps explain the considerable variation in the final polls. To help elucidate the uncertainty inherent in U.K. polling and provide appropriate transparency, we explain here our methods for weighting our results and modeling the probable electorate.
How we select and weight respondents. About 3 million people take user-generated surveys on the SurveyMonkey platform each day, including 2 million per week in the United Kingdom. We select a random sample of these respondents to take part in our election polls. After completing their initial survey, respondents see a “thank-you” page inviting them to take an additional survey.
While we reach a broad cross-section of the British population, the resulting raw sample tends to overstate those with ready access to the internet and those more apt to participate in surveys, especially the younger, college educated and politically interested. Fortunately, we are able to reach very large samples that allow the use of statistical adjustment, known as weighting, to correct for the sampling bias.
Our U.K. samples have been weighted for age, sex, education, country of birth, marital status, and region, using the U.K. Census 2011 demographic tables and British Election Study (BES) 2015 face-to-face sample to reflect the demographic composition of the nation. We also weight based on self-reported past voting in part to reduce an evident overrepresentation of more politically interested voters.
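To make the idea of weighting concrete, here is a minimal sketch of raking (iterative proportional fitting), one common way to match a sample to several population margins at once. The article does not say which algorithm was used, so treat raking as an illustrative stand-in. The function names, the two variables and the target shares are all hypothetical, not Census or BES figures.

```python
from collections import defaultdict


def rake(respondents, targets, iterations=50):
    """Iterative proportional fitting over categorical margins.

    respondents: list of dicts with one key per weighting variable.
    targets: {variable: {category: population share}} (hypothetical values).
    Returns one weight per respondent; weights sum to len(respondents).
    """
    weights = [1.0] * len(respondents)
    for _ in range(iterations):
        for var, shares in targets.items():
            total = sum(weights)
            # Current weighted total for each category of this variable.
            current = defaultdict(float)
            for r, w in zip(respondents, weights):
                current[r[var]] += w
            # Rescale so each category matches its target share.
            weights = [
                w * shares[r[var]] * total / current[r[var]]
                for r, w in zip(respondents, weights)
            ]
    return weights


def weighted_share(respondents, weights, var, category):
    """Weighted share of respondents falling in one category."""
    hit = sum(w for r, w in zip(respondents, weights) if r[var] == category)
    return hit / sum(weights)


# Made-up raw sample that over-represents younger and degree-holding people.
sample = (
    [{"age": "under_45", "education": "degree"}] * 4
    + [{"age": "under_45", "education": "no_degree"}] * 2
    + [{"age": "45_plus", "education": "degree"}] * 2
    + [{"age": "45_plus", "education": "no_degree"}] * 2
)

# Hypothetical population margins, for illustration only.
targets = {
    "age": {"under_45": 0.4, "45_plus": 0.6},
    "education": {"degree": 0.3, "no_degree": 0.7},
}

weights = rake(sample, targets)
```

After raking, the weighted share of each category sits very close to its target even though the raw sample was 60 percent under 45 and 60 percent degree holders.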
How we model the likely electorate. Adjusting the raw sample to match the full population is just the first step, because not all adults vote. Ideally, our sample should represent those who will be casting ballots in this year’s election, taking into account both their propensity to vote and their demographics. Unfortunately, many respondents exaggerate their self-reported likelihood to vote. Some people who tell pollsters that they’re certain to vote won’t end up doing so, and some of those who say they’re unlikely to cast a ballot ultimately will. In particular, younger Britons are overrepresented among those who say they are likely to vote as compared with estimates from recent national elections.
To address this issue, we constructed a likely voter model using the post-election British Election Study from 2015. That survey, conducted in-person with face-to-face interviewers, featured an unusually high (56 percent) response rate and, most important, checked official records to determine whether each respondent had actually voted. We used these validated voter data to estimate each respondent’s propensity to vote based on their age, sex, education, religion, marital status, interest in politics, party identification, and previous general election vote history.
Running the BES-trained model on our current sample produces a turnout probability score for each respondent. The higher the score, the stronger the probability they will actually cast a ballot. To produce our final “likely voter” estimates we combine these probabilities with our survey weights. The effect is to include all respondents who report they have at least some chance of voting, but to significantly down-weight those who are less likely to vote. This ensures that we do not overweight voters and demographic groups with a low probability of voting.
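As a rough illustration of combining turnout probabilities with survey weights, the sketch below multiplies each respondent's weight by a modeled turnout probability before tallying vote shares. Multiplication is an assumption about how the two are combined, and the field names (`weight`, `p_vote`, `party`) and the four respondents are invented for the example.

```python
def likely_voter_shares(respondents):
    """Vote shares where each respondent counts as weight * turnout probability."""
    totals = {}
    for r in respondents:
        effective = r["weight"] * r["p_vote"]  # assumed way of combining the two
        totals[r["party"]] = totals.get(r["party"], 0.0) + effective
    grand_total = sum(totals.values())
    return {party: t / grand_total for party, t in totals.items()}


# Made-up respondents: equal demographic weights, different turnout scores.
toy_sample = [
    {"party": "Con", "weight": 1.0, "p_vote": 0.9},
    {"party": "Con", "weight": 1.0, "p_vote": 0.8},
    {"party": "Lab", "weight": 1.0, "p_vote": 0.9},
    {"party": "Lab", "weight": 1.0, "p_vote": 0.4},  # low modeled turnout
]

shares = likely_voter_shares(toy_sample)
```

On demographic weights alone the toy sample splits evenly. Once the low-probability respondent is down-weighted rather than dropped, the Conservative share rises above half, which is the kind of shift the turnout model produces.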
While we believe this model gets us closest to a sample that represents Britain’s probable electorate in 2017, the process is not magic. The biggest potential limitation is our reliance on the 2015 BES data to model the propensity to vote. Our horse race estimate may be in error if demographic turnout patterns in 2017 differ significantly from what the BES found in 2015.
Our data also illustrate how sensitive polls are in this election to the way pollsters design their likely voter selection methods, especially given the stark differences in voter preferences by age. Consider the following table, which shows how much the horse race result and age composition can vary based on slightly different alternative cuts of our data.
Again, our overall estimate of voter preferences gives the Conservatives a 4-point lead (39 to 35 percent) based on weighting to our modeled turnout probabilities. We would get essentially the same result if, instead of weighting on the probabilities, we had used them as a “cut-off,” treating respondents with a turnout probability of .55 or greater as “likely voters” and filtering out 627 respondents (unweighted) who were least likely to vote.
However, a slightly different demarcation between likely and unlikely voters could produce widely divergent results, ranging from a 9-point Conservative lead (42 to 33 percent) to a 1-point Conservative lead (38 to 37 percent), based on varying the “cut-off” from .45 to .70. This big swing in the margin would come from dropping just 849 respondents out of 11,439 (keep in mind these scores are modeled estimates of how likely each respondent is to vote, not a calibration to the overall level of voter turnout).
And if we were to drop our model altogether and define likely voters simply as those who say they are either “extremely” or “quite” likely to vote, or have already voted, the same survey would give Labour a 2-percentage-point edge (39 to 37 percent). Notice that in each scenario, as Labour’s share of the vote increases, so does the percentage of 18- to 44-year-olds among all voters.
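The cut-off comparison described above can be sketched in a few lines. This is only an illustration of the mechanics: the respondents, the field names and the helper `margin_at_cutoff` are hypothetical, and the toy numbers are not meant to reproduce the survey's results.

```python
def margin_at_cutoff(respondents, cutoff, a="Con", b="Lab"):
    """Lead of party a over party b, in points, among p_vote >= cutoff."""
    kept = [r for r in respondents if r["p_vote"] >= cutoff]
    total = sum(r["weight"] for r in kept)

    def share(party):
        return sum(r["weight"] for r in kept if r["party"] == party) / total

    return 100.0 * (share(a) - share(b))


# Made-up sample in which Labour supporters cluster at lower turnout scores.
cutoff_sample = [
    {"party": "Con", "weight": 1.0, "p_vote": 0.90},
    {"party": "Con", "weight": 1.0, "p_vote": 0.80},
    {"party": "Con", "weight": 1.0, "p_vote": 0.75},
    {"party": "Con", "weight": 1.0, "p_vote": 0.60},
    {"party": "Lab", "weight": 1.0, "p_vote": 0.90},
    {"party": "Lab", "weight": 1.0, "p_vote": 0.65},
    {"party": "Lab", "weight": 1.0, "p_vote": 0.50},
    {"party": "Lab", "weight": 1.0, "p_vote": 0.48},
]

margins = {c: margin_at_cutoff(cutoff_sample, c) for c in (0.45, 0.55, 0.70)}
```

Because one party's supporters sit closer to the threshold, each small increase in the cut-off removes more of them than of their opponents, so the lead widens as the screen tightens.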
So how could Labour pull off a surprise, running even or perhaps slightly ahead? Our data indicate such a result is possible if the turnout of younger voters greatly surpasses levels recorded in 2015 and other recent elections. New registrations among 18- to 24-year-olds in 2017 reportedly surpassed those from 2015, so we may see an increase in the share of the younger vote this year.
How might the Conservatives win a bigger margin over Labour than our polls suggest? While our weighting and likely voter modeling strive to accurately represent the demographics and political interest level of the electorate, our efforts to remove the statistical bias favoring Labour in our totally unweighted data may have fallen short.
Consider the matter of the 7 percent who are still undecided about their vote choice. We allocate undecideds by dropping them from the calculation, on the implicit assumption that undecideds will either not vote or divide along the same proportions as the decided. It is still possible, however, that the undecided voters may swing overwhelmingly to one of the parties.
These still-undecided voters tell us, for example, that they prefer Theresa May to Jeremy Corbyn by a larger margin (50 to 31 percent) than all voters (53 to 43 percent). As such, a “break” of the undecided vote based entirely on their preference for Prime Minister could net the Tories an additional 1 to 2 percentage points.
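The arithmetic behind these two treatments of undecideds can be worked through using the figures quoted in this article (39 to 35 percent with 7 percent undecided, and a 50 to 31 split on preferred Prime Minister). The function names are hypothetical, and the second function is just one possible reading of a “break” based on leader preference: it assumes undecideds who prefer a leader vote for that leader's party and leaves the rest unallocated.

```python
def allocate_proportionally(shares, undecided):
    """Drop undecideds and rescale decided shares onto a 100-point base."""
    decided = 100.0 - undecided
    return {party: 100.0 * s / decided for party, s in shares.items()}


def allocate_by_split(shares, undecided, split):
    """Give each party a stated fraction of the undecided share."""
    return {party: s + undecided * split.get(party, 0.0) for party, s in shares.items()}


# Figures quoted in the article, before undecideds are allocated.
raw = {"Con": 39.0, "Lab": 35.0}
undecided_pct = 7.0

proportional = allocate_proportionally(raw, undecided_pct)
by_preference = allocate_by_split(raw, undecided_pct, {"Con": 0.50, "Lab": 0.31})

margin_proportional = proportional["Con"] - proportional["Lab"]
margin_preference = by_preference["Con"] - by_preference["Lab"]
extra_points = margin_preference - margin_proportional
```

Under these assumptions the proportional allocation rounds to the 42 to 38 headline figures, and the preference-based break adds roughly one point to the Conservative margin.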
Finally, all of this discussion centers on the total vote share, not on how the national vote translates into the 650 seats in Parliament, which, like the Electoral College in the U.S., introduces yet another level of complexity.
The gist of this analysis is that the “real world” margin of error for this election is probably much larger than traditional error margins reported by pollsters (or the modeled error estimate of plus or minus 1.5 percentage points we reported for this survey). Conservatives may be “only a normal-sized polling error away from a hung parliament,” as FiveThirtyEight’s Nate Silver puts it, or a normal-sized error away from a double-digit Conservative landslide. If such an error occurs, the patterns by age will likely be among the most important explanations.
This article is cross-posted to the SurveyMonkey Election Tracking blog.