WASHINGTON -- If you care about politics and spend a fair amount of time online, odds are good that sometime between 5 p.m. and 7:30 p.m. Eastern time on Tuesday, you will encounter someone sharing leaked "exit poll" numbers that purport to tell you who will win various battleground states.
It may be tempting to assume the numbers will tell you who is going to win the presidential election.
They probably won't.
Hard as it may be, you should try to ignore them, at least until the polls close. And even then, take the underlying vote estimates with big grains of salt.
But let's first stipulate that once weighted to match the actual outcome, exit polls are an incredibly valuable resource. Conducted every two years by Edison Research for the National Election Pool partnership of ABC, CBS, CNN, Fox, NBC and the Associated Press, the exit polls are easily the best measure available of who voted for each candidate and why.
So it was for good reason that many political junkies expressed disappointment when news spread that the National Election Pool will only conduct complete exit polls in 31 states this year, down from the usual 50. In recent years, these surveys have become far more expensive, as the rise in early voting created the need for parallel telephone surveys to measure the preferences of early voters.
But as tools to predict the outcome of close races just before the polls close, they are blunt instruments at best. Here's why:
First, an exit poll is just a survey. Like other polls, it is subject to random sampling error, so differences of a few percentage points between the candidates in any given state sample are not terribly meaningful.
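To put a rough number on that sampling error, here is a minimal sketch in Python. It assumes a simple random sample, and the sample size of 1,500 and the 50-48 split are hypothetical figures chosen for illustration, not numbers from any actual exit poll. Real exit polls sample clusters of precincts, which generally widens the error; the hypothetical `design_effect` parameter is a crude stand-in for that.

```python
import math

def lead_margin_of_error(p_a, p_b, n, z=1.96, design_effect=1.0):
    """Approximate margin of error on the lead (p_a - p_b) within one sample.

    Assumes a simple random sample of size n; design_effect > 1 is a
    hypothetical inflation factor for clustered (precinct-based) sampling.
    """
    # Variance of the difference between two shares from the same sample.
    variance = (p_a + p_b - (p_a - p_b) ** 2) / n
    return z * math.sqrt(variance * design_effect)

# Hypothetical state sample: 50 percent to 48 percent among 1,500 voters.
lead = 0.50 - 0.48
moe = lead_margin_of_error(0.50, 0.48, n=1500)
moe_clustered = lead_margin_of_error(0.50, 0.48, n=1500, design_effect=2.0)

print(f"lead: {lead * 100:.1f} points")
print(f"margin of error on the lead (95%): +/- {moe * 100:.1f} points")
print(f"with a design effect of 2: +/- {moe_clustered * 100:.1f} points")
```

Under these assumptions the uncertainty on the lead is about five points, more than twice the two-point lead itself, which is why a gap of that size in a single state sample says little about who is ahead.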
Second, the networks almost never "call" truly competitive races on exit poll results alone. The decision desk analysts require very high statistical confidence (at least 99.5 percent) before they will consider calling a winner; by comparison, the ordinary "margin of error" on pre-election polls typically uses a 95 percent confidence level. In relatively close races, they usually achieve that confidence only after the exit pollsters obtain the actual vote results from the randomly selected precincts at which interviews were completed (and from other larger random samples of precincts) and combine all of the data into some very sophisticated statistical models.
Even then, if the models project that the leading candidates are separated by just a few percentage points, as pre-election polls suggest they will be in all of the key battleground states, the networks will usually wait until nearly all votes are counted to project a winner.
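How much stricter is 99.5 percent confidence than the usual 95 percent? The sketch below is a rough illustration only. It assumes a two-sided interval under a normal approximation, which is a simplification and not a description of the decision desks' actual models.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """Two-sided critical value for a given confidence level (normal approximation)."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)

z_95 = z_for_confidence(0.95)    # the usual pre-election poll standard
z_995 = z_for_confidence(0.995)  # the threshold described above

# With the same data, the error band is wider by this factor...
width_ratio = z_995 / z_95
# ...and holding the band fixed would take roughly this many times more interviews.
sample_ratio = width_ratio ** 2

print(f"z at 95 percent:   {z_95:.2f}")
print(f"z at 99.5 percent: {z_995:.2f}")
print(f"error band is {width_ratio:.2f} times wider")
print(f"equivalent sample would need to be about {sample_ratio:.1f} times larger")
```

On these assumptions, the same sample yields an error band about 43 percent wider at the stricter standard, so a lead that looks comfortable against an ordinary margin of error can still fall well short of what is needed for a call.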
Third, the initial results of the exit poll interviews have had frequent problems with non-response bias, a consistent discrepancy favoring the Democrats that has appeared to some degree in every presidential election since 1988. Usually the bias is small, but in 2004 it was just big enough to convince millions of Americans who saw the leaked results on the Internet that John Kerry would defeat George W. Bush. It didn't work out that way.
The resulting uproar led the networks, beginning in 2006, to hold the data in a sealed quarantine room on Election Day and to withhold it from their news media clients until 5 p.m. Eastern time. The quarantine means that any numbers purporting to be "exit polls" before 5 p.m. are almost certainly bogus.
To try to minimize the potential errors, the networks often weight the first official tabulations they post on their websites to an estimate of the outcome (called the "composite") that combines responses to the exit poll interviews with the averages of pre-election polls (like those reported by HuffPost Pollster).
But that process is imperfect and does not remove either the random error or the initial statistical bias that often favors Democrats. Four years ago, on Pollster.com, we gathered all of the official tabulations posted as polls closed and extrapolated the underlying estimates of the outcome for each state. When we later compared those estimates against the final vote counts in each state, we found that they had overstated Barack Obama's margins by an average of 4.7 percentage points.
Does that mean that we can just subtract four or five points from the Obama-minus-Romney margin and get a more precise estimate of the outcome? Nope. First, 4.7 was an average. The errors in individual states varied widely, from a 16-point overstatement of the margin favoring Obama to a 5.5-point error favoring John McCain, with misses spread across the spectrum in between.
Also, there is no guarantee that the 2008 errors will repeat at the same magnitude or that the exit pollsters have not made some adjustment this year to correct for past problems. Keep in mind that these issues do not lead to missed calls, both because the decision desk analysts are aware of them and because they have the ability to estimate and correct the errors in near-real time, as they systematically compare the incoming vote returns from the sampled precincts to the exit poll responses gathered from them.
Those of us seeing leaked data, however, see neither the running calculations of the precinct errors nor the levels of statistical confidence associated with the vote numbers. We see only precise-looking percentages and are oblivious to the potential for error.
As one pundit put it four years ago, the exit polls "have become crack cocaine for political junkies looking to score on Election Day." We would be better off, he said, if we relied on the exit polls for "their original purpose, explaining who did what and why, rather than trying to forecast what will be widely known anyway in just a few hours."