Things are going well right now for Sen. Ted Cruz (R-Texas). On Tuesday, Quinnipiac declared that he was polling at 23 percent among likely Iowa Republican caucus participants -- just 2 points behind the front-runner, businessman Donald Trump. A lot of people got excited about this, because the Quinnipiac poll has a 4-point margin of error, and hey, two is less than four. So maybe this means that Trump and Cruz are kinda, sorta... tied?
Not really. “Within the margin of error” simply means that we can’t be 95 percent certain Trump is genuinely leading Cruz in this poll.
Pollsters use the margin of error to describe how much error there might be in a poll estimate due to random sampling. It’s almost always calculated at the 95 percent confidence level. This means that for a poll with a margin of error of 4 percentage points, you could conduct the same poll 100 more times with 100 different random samples, and in about 95 of those polls the result would land within 4 points of the number you would get by asking the entire population. There are many other sources of potential polling error that are not included in a margin of error, but basic sampling error is the most easily calculated and the most often reported.
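To make the "95 of 100 polls" idea concrete, here is a minimal Python sketch that runs many simulated polls and counts how often the estimate lands within the reported margin of error. It is an illustration only: the sample size of 600, the true support level of 50 percent and the function name `coverage` are assumptions chosen for the example, not figures from the Quinnipiac poll.

```python
import math
import random


def coverage(true_share, sample_size, n_polls=2000, z=1.96, seed=0):
    """Fraction of simulated polls whose estimate falls within the
    reported margin of error of the true level of support."""
    rng = random.Random(seed)
    # Reported margin of error: computed at 50 percent, as most pollsters do.
    moe = z * math.sqrt(0.25 / sample_size)
    hits = 0
    for _ in range(n_polls):
        # Draw one random sample and count the supporters in it.
        supporters = sum(1 for _ in range(sample_size) if rng.random() < true_share)
        estimate = supporters / sample_size
        if abs(estimate - true_share) <= moe:
            hits += 1
    return hits / n_polls


# Hypothetical poll: 600 respondents, true support of 50 percent.
print(f"Share of polls within the margin of error: {coverage(0.5, 600):.3f}")
```

The printed share should come out close to 0.95, give or take some simulation noise.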
When two poll estimates fall within the margin of error -- such as Cruz’s 23 percent and Trump’s 25 percent -- all it means is that we’re somewhat less than 95 percent confident that Trump would still lead Cruz if the poll used a different random sample.
In fact, there’s about an 80 percent chance that Trump actually leads Cruz in the Quinnipiac poll. There is a smaller chance that they are tied or that Cruz is in the lead, but in about 80 of 100 identical polls using different random samples, Trump would end up with the lead.
The table above shows more examples of the likelihood of Trump being ahead in hypothetical poll estimates. The probability that Trump is leading is based on calculations simulating a whole bunch of poll estimates, based on Trump’s and Cruz’s support estimates and the sample size of the poll itself. Even when there’s only a 1-point margin between the two candidates, Trump still holds a slim advantage.
There are two patterns that emerge in the table’s examples. First, we see that as sample sizes get smaller, the probability of Trump’s lead decreases. That’s because the margin of error for the entire poll shrinks as the sample size grows, so a smaller sample leaves more room for the gap between the candidates to be a product of chance.
Second, if Trump's and Cruz's estimates are both close to 50 percent, Trump's lead probability is slightly lower than when both candidates' vote shares are in the 20s. The margin of error for an individual number in the poll decreases as the percentage estimate moves away from 50 percent -- either falling toward zero, or climbing toward 100. So if Trump is polling at, say, 25 percent in one poll and 45 percent in another, his numbers have a smaller margin of error in the first poll. And similarly, if Trump is polling at 55 percent in one poll and 75 percent in another, his margin of error is smaller in the second poll -- the one that puts him further away from 50 percent.
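The relationship between a candidate's share and the margin of error can be checked with the standard formula for a proportion at the 95 percent confidence level. The sketch below is in Python; the sample size of 600 is a hypothetical value picked for illustration, and `margin_of_error` is just a name used here.

```python
import math


def margin_of_error(share, sample_size, z=1.96):
    """95 percent margin of error for a single proportion (simple random sample)."""
    return z * math.sqrt(share * (1 - share) / sample_size)


SAMPLE_SIZE = 600  # hypothetical sample size, not taken from the article

for share in (0.25, 0.45, 0.50, 0.55, 0.75):
    moe_points = 100 * margin_of_error(share, SAMPLE_SIZE)
    print(f"{share:.0%} support -> margin of error of about {moe_points:.1f} points")
```

The margin is widest at 50 percent and symmetric around it, which is why 25 percent and 75 percent give the same figure.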
Most pollsters report a single margin of error that is based on the sample size and an estimate of 50 percent, because it would be very confusing to put a different margin of error on each number. However, when considering whether a candidate is leading, it’s important to remember that numbers further away from 50 percent have smaller margins of error. With 14 Republicans still in the race, none are likely to get to 50 percent any time soon -- and for all of Cruz's sunny numbers lately, it's still a bit premature to say that he's polling in the same league as Trump.
Technical note: The estimates for probability of the poll’s leader being ahead were generated using an R function that calculates the standard error for the difference between the proportions. This standard error is used as the standard deviation for 1,000,000 simulations on a normal distribution, with the difference between candidates as the mean of the distribution. The proportion of times the leading candidate is ahead in those simulations is the probability that the candidate actually leads after accounting for the margin of error. The R code can be seen on GitHub.
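For readers who want to try the approach, here is a rough Python sketch of the simulation described in the technical note. It is not the R code referenced above. The formula used for the standard error of the difference (two shares measured in the same sample) is an assumption about what that function computes, and the sample size of 600 is hypothetical, so the output will not necessarily match the article's figures exactly.

```python
import math
import random
from statistics import NormalDist


def difference_se(leader_share, trailer_share, sample_size):
    """Standard error of the gap between two shares from the same sample.
    Assumed formula; the original R function may compute this differently."""
    diff = leader_share - trailer_share
    return math.sqrt((leader_share + trailer_share - diff ** 2) / sample_size)


def lead_probability(leader_share, trailer_share, sample_size,
                     n_sims=1_000_000, seed=0):
    """Simulate the gap on a normal distribution and return the proportion
    of draws in which the leader stays ahead."""
    diff = leader_share - trailer_share
    se = difference_se(leader_share, trailer_share, sample_size)
    rng = random.Random(seed)
    ahead = sum(1 for _ in range(n_sims) if rng.gauss(diff, se) > 0)
    return ahead / n_sims


def lead_probability_exact(leader_share, trailer_share, sample_size):
    """Closed-form version of the same quantity, handy as a cross-check."""
    diff = leader_share - trailer_share
    se = difference_se(leader_share, trailer_share, sample_size)
    return 1 - NormalDist(mu=diff, sigma=se).cdf(0)


# Trump at 25 percent, Cruz at 23 percent, hypothetical sample of 600.
print(lead_probability(0.25, 0.23, 600, n_sims=200_000))
print(lead_probability_exact(0.25, 0.23, 600))
```

The simulated and closed-form values should agree closely. The simulation mirrors the method in the note, while the closed form gets the same answer without random draws.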