I didn’t think there was any way Republican Donald Trump would win. But my presidential and Senate forecasts for The Huffington Post badly differed from what played out on Tuesday night and Wednesday morning. To say I’m disappointed would be a huge understatement.
Our model predicted that Democrat Hillary Clinton had a 98 percent chance of being elected. That was more pro-Clinton than most other forecast models (although all of them predicted a Clinton win). Our model said five Senate seats would shift from Republican to Democratic, giving Democrats a likely majority.
According to the poll averages that went into our forecast, Clinton should have won 323 electoral votes to Trump's 215. Instead, Trump will probably end up with 306 and Clinton with 232. Our model got Florida, North Carolina, Pennsylvania, Wisconsin and Michigan wrong. In the Senate, we predicted three races incorrectly: Indiana, Pennsylvania and Wisconsin.
The model relied completely on polls. No adjustments, no “fundamentals” (other factors, like presidential approval or economic indicators, that help predict elections). Just polls. The underlying statistics are sound ― the base model is a Bayesian Kalman filter time-series model that combines polls over time to estimate the underlying trend.
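For readers curious what that means in practice, here is a minimal sketch of the general technique: a one-dimensional Kalman filter that treats a candidate's true support as a slowly drifting trend and each poll as a noisy reading of it. The function name, parameters, and variance values are illustrative assumptions for this sketch, not HuffPost's actual code.

```python
def kalman_poll_trend(polls, obs_var, drift_var, init_mean=50.0, init_var=25.0):
    """Filter a sequence of poll readings (percent support) through a
    random-walk Kalman filter and return the estimated trend after each
    poll. Simplified illustration only: real poll-aggregation models also
    handle house effects, sample sizes, and irregular timing."""
    mean, var = init_mean, init_var
    trend = []
    for y in polls:
        # Predict: true support drifts as a random walk between polls,
        # so our uncertainty about it grows.
        var += drift_var
        # Update: blend in the new poll, weighted by the Kalman gain.
        # A noisy poll (high obs_var) moves the estimate less.
        gain = var / (var + obs_var)
        mean += gain * (y - mean)
        var *= (1.0 - gain)
        trend.append(mean)
    return trend

# Four hypothetical state polls, assuming roughly +/- 2-point poll noise
# (obs_var=4) and a slowly moving race (drift_var=0.25):
trend = kalman_poll_trend([47, 48, 46, 49], obs_var=4.0, drift_var=0.25)
```

The key property is that the filter smooths out poll-to-poll noise while still tracking real movement ― but, as the rest of this piece explains, it can only be as good as the polls fed into it. A shared bias in every poll passes straight through.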
The model structure wasn’t the problem. The problem was that the data going into the model turned out to be wrong in several key places.
So, instead of Clinton’s coronation, we got a black swan event and President-elect Trump.
I remained optimistic about Clinton’s chances on election night until about a third of the votes in Michigan, Wisconsin and Pennsylvania were in. Then I began telling people to prepare for a Trump win. That’s not to brag that I was right ― anyone with an understanding of election returns and maps could read the writing on the wall at that point ― but because so many people had been leaning on me to tell them what was going on.
It gutted me to realize I had been wrong. I truly did my best and read the data the way I saw it. No one wants to feel like this ― to be so utterly and publicly mistaken. People on Twitter have been calling for me to be fired. But what happened is done. What I can do now is learn from it, and be open and honest about the process.
We don’t have many answers yet about why the polls missed in the states where they did. It’s still too early to tell. Rest assured, there will be a lot of analysis from pollsters and from the HuffPost Pollster team to figure that out. The American Association for Public Opinion Research had a committee in place before Election Day to evaluate pre-election polling. That committee’s work will be hugely important in identifying where the polls went wrong.
But there are some key lessons regarding the forecast model that I can address now.
Polls alone probably don’t make a reliable forecast model.
After a deep dive into my own biases, I’ve concluded that my biggest is that I put too much trust in polls. In the final days of the campaign, I went on record saying that if the polls go down, we’re going down, too. For the manager of a project like HuffPost Pollster, that seemed a reasonable approach. The problem was that it meant assuming the polls would be right.
In retrospect, it’s easy to second-guess that assumption. There’s more error in polls than most people realize. I know that, and have done substantial research on poll error (with more in the works). But I kept looking at the consistency of the polls. The exact margins wavered, sure, but the polls always showed Clinton winning the key states she needed. I saw no reason to question that they would be accurate overall. So I defended and stood by the numbers ― as anyone who trusts their work does. That’s left me eating some crow. But that’s okay ― I’ll learn and get better at challenging my own biases.
Political science forecasts had a better night than poll-based ones. As John Sides pointed out, several very early forecast models constructed by political scientists pointed toward a Trump win, or at least a very close race. Some of those models focus on the cyclical nature of American politics ― it’s rare for the same party to hold the White House for three terms in a row ― and economic growth indicators. With only modest economic growth and a two-term Democratic president, those models showed Republicans with an advantage.
None of that was accounted for in the HuffPost models ― a deliberate choice I made to trust the polls. Some other forecast models included economic, campaign and other information, but the effects were minimal. Most of the emphasis in all of the media forecasts was on polling. That didn’t serve us well, especially if there really was a problem with “shy” Trump voters or a silent majority that just won’t answer pollster questions ― which now seems like a real possibility.
We may have excluded polls that mattered from our model.
HuffPost Pollster enforced stricter standards this cycle than in previous elections. We required pollsters to provide all of the basic information about how the survey was conducted. Most pollsters met those requirements ― although notably this hurt us in Indiana, where few pollsters were active. Some of the few Indiana polls weren’t released with full information, and pollsters declined to provide what we sought. That caused us to miss the switch in the Senate race from favoring Democratic candidate Evan Bayh to a win for Republican Todd Young.
Our decision to not include all-landline automated polls might have been more consequential for other races. As Real Clear Politics’ Sean Trende pointed out on Twitter, the electoral map using its polling averages wasn’t as far off the outcome as everyone else’s. Real Clear Politics uses those landline polls and doesn’t include some online polls. HuffPost Pollster uses more online polling and excludes landline polls. Steve Shepard also pointed out that many of the most accurate final polls were from all-landline automated pollsters.
The landline polls most likely just got lucky: their samples skew toward older, rural people who still have landlines and who lean conservative, and the electorate turned out to be more conservative than anticipated. We’re going to look into whether that affected our averages substantially.
The picture likely won’t be as simple as one way of conducting surveys doing better than another, though. FiveThirtyEight talked to several pollsters about how their polls did, and one stark difference in tone stood out to me: Patrick Murray of the Monmouth University poll was addressing how his polls were wrong. Barbara Carvalho of the Marist College poll was discussing what her group got right. There are some differences in sampling and procedures between those two pollsters, but both are high-quality telephone polling operations. One worked better than the other. The big question is why.
National polls aren’t necessarily helpful.
This isn’t directly related to the HuffPost model, which didn’t use national polls in a way that had much effect on its estimates. But it’s worth noting that 2016 is the second time in five elections that we’ve seen an Electoral College outcome that’s different from the popular vote.
The national polls weren’t that far off from the actual outcome of Clinton leading by a slim margin in the popular vote. A few showed the race even, or within a single percentage point. Others showed Clinton up by 3 to 4 points. It looks like the actual vote count will be within those polls’ margins of error. So, while the national polls are a decent indicator of the popular vote, they’re not really helpful for an election that’s decided by the Electoral College. We need state polls for that ― and the state polls didn’t fare so well.
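As a reminder of how much room sampling noise alone allows, the textbook 95 percent margin of error for a simple random sample can be computed directly. This is the idealized formula; real polls carry additional error from weighting, nonresponse and coverage, so treat it as a floor, not a full accounting.

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% sampling margin of error, in percentage points, for an
    estimated proportion p from a simple random sample of size n.
    Textbook formula only: actual polls have extra design effects."""
    return z * math.sqrt(p * (1 - p) / n) * 100

# A typical national poll of ~1,000 respondents with a candidate near
# 48% support has a sampling margin of roughly +/- 3 points ― wide
# enough to cover everything from a tie to a comfortable lead.
moe = margin_of_error(0.48, 1000)
```

That back-of-the-envelope range is why a national polling average a point or two off the final popular vote is within normal error, even while state-level misses flip the Electoral College outcome.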
We should answer challenges with facts, not thunder.
This last point has absolutely nothing to do with the forecast model. It’s a personal note. In the last week, I gave in to the negativity and began hitting back when Nate Silver of ESPN’s FiveThirtyEight would say that 90+ percent probability forecasts were unreasonable. I still don’t think that was a fair assessment. But my criticisms of his comments (on Twitter) were equally unfair. It doesn’t matter whether you win the argument. Twitter fights on methodology are counterproductive and don’t help improve polling.
Silver was right to question uncertainty levels, and was absolutely correct about the possibility of systematic polling errors ― all in the same direction. Clearly, we disagree on how to construct a model, but he was right to sound the alarm.
As we move forward from this dramatic miss, the polling and forecasting industry owes it to the public to be completely transparent about what went wrong, what we’re finding, and how we’re going to improve. That work will take time and will come in small pieces as we learn things over the coming months.
HuffPost Pollster is taking on that work ― all of our processes are on the table ― and we’ll keep our readers informed every step of the way. We owe it to you.