Now that all the votes are counted and certified, we can get an accurate look at how polls performed relative to actual outcomes in the 2016 election. While we’ve heard many arguments about why polls failed this year, the real problem isn’t that polls inaccurately predicted the winner. It’s that they all failed in the same direction.
In the polling aggregates that included all 50 states, poll errors favored Hillary Clinton over Donald Trump by a ratio of roughly 6 to 1 in Daily Kos estimates, about 5 to 1 in HuffPost Pollster’s aggregates and around 3 to 1 in FiveThirtyEight’s aggregates.
Polling aggregates are the simplest way to get a macro view of how polls performed. Each aggregator uses different methods, and some exclude certain types of polls. But this year, that didn’t matter ― Clinton was systematically overestimated in all of them.
HuffPost Pollster’s estimates underestimated Trump’s performance by at least 1 percentage point in 36 states, FiveThirtyEight underestimated him by at least 1 point in 33 states and Daily Kos did the same in 37 states.
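One way to see the directional skew behind those counts is to compute a signed error for each state: the actual vote margin minus the final polled margin, with both margins expressed as Clinton minus Trump. A negative error means Trump was underestimated. Here is a minimal sketch of that tally; the margins below are made-up illustrative numbers, not the real 2016 figures:

```python
# Hypothetical state margins (Clinton minus Trump, in percentage points).
# These values are illustrative only, not the actual 2016 results.
polled = {"WI": 5.3, "PA": 3.7, "VA": 5.2, "OH": -2.2, "CA": 22.0}
actual = {"WI": -0.8, "PA": -0.7, "VA": 5.4, "OH": -8.1, "CA": 30.1}

# Signed error: negative means the polls underestimated Trump.
errors = {state: actual[state] - polled[state] for state in polled}

# Count states missed by more than 1 point in each direction.
trump_under = sum(1 for e in errors.values() if e < -1)
clinton_under = sum(1 for e in errors.values() if e > 1)
print(trump_under, clinton_under)  # → 3 1
```

Applied to all 50 states with an aggregator's real numbers, this is exactly the count-by-direction comparison described above.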
The chart above illustrates just how lopsided the aggregate estimates were. The vertical black line at 0 represents the actual vote margin. Anything to the right of that line indicates that the aggregate polls missed in a way that underestimated Trump’s vote. Anything to the left of the line indicates that they underestimated Clinton.
In the few states where there seems to be no bar ― Virginia and Colorado, for example ― aggregates were very close to the actual vote.
The graph clearly shows there were far more Trump underestimates. But it also shows that the president-elect was underestimated by larger margins than Clinton was. HuffPost Pollster underestimated Clinton most in California, by about 6 points; FiveThirtyEight and Daily Kos underestimated her most in Hawaii, by 8 to 9 points. Contrast that with the other end of the chart: HuffPost Pollster and Daily Kos underestimated Trump by more than 10 points in seven states, and FiveThirtyEight did the same in nine states.
The misses were smaller in battleground states, although still nearly uniform in underestimating Trump. More aggregators had estimates for these states, including RealClearPolitics and The New York Times Upshot.
Out of the 15 battleground states in the chart above, polls underestimated Clinton in just two. In the other 13, polls underestimated Trump, though they came very close to the actual result in Virginia.
Adding to the issue, in the states Trump was expected to win, polls never had him ahead by more than 2 to 3 percentage points. In the states Clinton was expected to win, her largest polled leads were north of 8 points.
It’s noteworthy ― but not unusual ― that the largest poll misses were actually in states where polls accurately predicted the winner. So while polls had the right candidate, they didn’t predict what a landslide the results would be in California and Hawaii on the Democratic side and West Virginia and Tennessee on the Republican side. This is a pattern we’ve known about for many election cycles. Due to undecideds and “other” categories, polls almost always underestimate the winner’s vote share when it’s an overwhelming win.
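The mechanism is simple arithmetic. Take a hypothetical landslide-state poll showing a 55-30 race with 15 percent undecided: if the undecideds break roughly in proportion to the decided voters, the winner's final share and margin both end up well above what the topline poll numbers showed. A sketch, using made-up poll numbers:

```python
# Hypothetical poll in a landslide state: 55% winner, 30% loser, 15% undecided.
polled_winner, polled_loser, undecided = 55.0, 30.0, 15.0

# Assume undecideds split in the same proportion as decided voters.
winner_share_of_decided = polled_winner / (polled_winner + polled_loser)
final_winner = polled_winner + undecided * winner_share_of_decided
final_loser = polled_loser + undecided * (1 - winner_share_of_decided)

print(round(final_winner, 1), round(final_loser, 1))  # → 64.7 35.3
```

The polled margin was 25 points, but the implied final margin is about 29 points ― the poll "underestimates" the winner even though nothing went wrong methodologically.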
That tells us something interesting about polling imprecision: We ignore errors when they’re in the right direction, even when they’re really big. That has to stop if polls are to regain any of the credibility they lost this year. We can learn just as much by studying how polls underestimated Clinton by 8 points in California as we can from studying Trump’s 6-point underestimate in Wisconsin.
But all of these misses would be far less concerning if they weren’t so one-sided. In theory, polling errors should be roughly normally distributed around zero ― meaning Clinton should have been underestimated about as often, and by about as much, as Trump. The charts show very clearly that the distribution was nowhere near symmetric this year. That points to a systemic problem in 2016 polling.
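The lopsidedness is easy to quantify with a simple sign test. If errors were unbiased, each state's miss direction would behave roughly like a fair coin flip, so a split as extreme as 36 of 50 states leaning one way (loosely mirroring the HuffPost Pollster count above) should almost never happen by chance:

```python
from math import comb

# Under unbiased polling, each state's miss direction is ~ a fair coin flip.
# Probability of 36 or more heads in 50 flips:
n, k = 50, 36
one_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n

print(one_tail)      # roughly 0.0013 ― about 1 in 770
print(2 * one_tail)  # either direction: still well under 1 percent
```

This is only a back-of-the-envelope check ― state poll errors are correlated, not independent coin flips ― but it makes the point: a split this one-sided is not what random, unbiased error looks like.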
The claims that polling errors this year weren’t that bad offer little comfort in the face of evidence that the vast majority of the misses benefited one side. An average-sized miss isn’t automatically acceptable; it may just mean we’ve been tolerating errors for too long. And many of these state-level misses are well beyond average, regardless of which polls are included in the aggregate or how the average is calculated.
It would be a mistake to keep pretending the election polling industry doesn’t have work to do. We can do better.