Absolutely no one should have been surprised by Tuesday night’s results in New York. HuffPost Pollster’s polling averages had Donald Trump far ahead, and Hillary Clinton with a clear advantage over Sen. Bernie Sanders (I-Vt.) -- which means the polls that go into the averages were right.
The average error is under 5 percentage points for the eight pollsters who polled both the Democratic and Republican races in the last week before the primaries. Two of those polls -- the NBC/Wall Street Journal/Marist poll and the Emerson College Polling Society poll -- averaged about 2.6 percentage points of error across both Republican and Democratic primaries.
In a primary season that began rather ominously for pollsters when Iowa polls missed Sen. Ted Cruz’s (R-Texas) win, and had another snafu in Michigan when Sen. Bernie Sanders (I-Vt.) pulled off a surprise victory, these close numbers in New York are a substantial win for pollsters.
Of course, in between Iowa and Michigan there were plenty of polling successes too -- New Hampshire, South Carolina, Florida and several other states’ polls were generally correct. But the historic size of the miss in Michigan caused some to question whether the New York polls would be right.
In some ways New York is a difficult place to poll. The imbalance between the New York City population center and the rest of the state puts tension on more than just the question of where “upstate” begins. About 40 percent of the population lives in the city, and about 65 percent of the state’s population is in the general metro area. Pollsters have to dial carefully to account for the city bias.
The closed primary, however, makes the state a bit easier for pollsters. Only registered partisans were eligible to vote in the Republican and Democratic primaries, leaving independents without a say. That’s good for pollsters -- they don’t have to try to figure out whether independents would vote, or in which party’s primary they would vote. Independents are notorious for throwing polling results off, since they can decide which party’s primary to vote in at the last minute.
Another benefit: Many of the pollsters in New York are very familiar with the state’s politics and population. Of the eight pollsters who polled in both primaries within the last week before the election, three are at colleges located in the state -- Marist College, Siena College and Baruch College. Two others are in neighboring states -- Emerson College in Massachusetts and Quinnipiac University in Connecticut. It’s safe to say the region has a strong tradition of college polling.
The college polls did very well. Of the eight polls, the NBC/Wall Street Journal/Marist poll had the lowest error, with an average deviation of only 2.58 percentage points from the actual result margins between candidates -- and that includes both its Republican poll and its Democratic poll. (Full disclosure: I used to work for the Marist Poll. But that didn’t influence my calculations, which you can see here.)
The Emerson College Polling Society’s last poll was close behind, with an average error of 2.63 points. Quinnipiac University was next, with an average error of 3.88 points, and the Baruch College/New York 1 poll averaged 5.23. Siena, Public Policy Polling (D) and CBS/YouGov each had 6.13 percentage points of error on average, and Gravis/One America News Network averaged 7.13 points of error. Details of those calculations are below.
Although this is a small subset of polls and pollsters, it’s interesting to look at the methods the polls used in comparison to their performance. What stands out most is that the second-most-accurate poll is an automated phone poll: Emerson’s polling relies completely on recorded voice technology polling, which due to its automated nature can’t be used to call cell phones. That’s tricky since nearly 50 percent of the national population is cell-phone only, meaning they don’t have landlines.
But in New York, the proportion of the population that’s only reachable by cell phone drops to 30 percent. And people who only use cell phones -- who tend to be less affluent, and more heavily concentrated in urban areas and among minority groups -- are less likely to vote in a primary election.
It’s impossible to say mode made a difference in how the polls performed, though. Gravis, the other automated phone pollster in the group, had the highest average error of the group. Whether the sample came from voter files or other sources doesn’t seem to matter, either -- some of the polls in this group relied on voter files, but there’s no difference between their error and other polls’ error. This is an example of why HuffPost Pollster doesn’t use mode or sampling source as criteria for determining what polls we include in our charts.
These pollsters should be doing victory laps. This article will attract much less traffic than it would if the polls had been wrong -- praising polls is much less interesting than denigrating them -- but Tuesday night was very good for pollsters.
And if you don’t like polls, don’t worry -- I’m sure we’ll see another round of “polling industry in crisis” articles soon enough. The field does face considerable challenges.
Average error calculations: This is a very simple error calculation. I calculated the error on the Democratic primary polls by subtracting the poll’s margin between Clinton and Sanders from the actual vote margin between Clinton and Sanders. For the Republicans, I calculated the error on the margin between Donald Trump and second-place finisher Ohio Gov. John Kasich, then the error on the margin between Kasich and Cruz, and averaged the absolute values of the two margin errors. To get the average error for both Republicans and Democrats, I averaged the Republican error with the absolute value of the Democratic error.
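For readers who want to reproduce the arithmetic, here is a minimal Python sketch of the calculation described above. The function names and the sample percentages are hypothetical and chosen only to illustrate the steps; they are not the actual poll numbers or vote totals, and this is not the spreadsheet behind the figures in this article.

```python
def democratic_error(poll, actual):
    """Signed error: actual Clinton-Sanders margin minus the poll's margin."""
    actual_margin = actual["Clinton"] - actual["Sanders"]
    poll_margin = poll["Clinton"] - poll["Sanders"]
    return actual_margin - poll_margin


def republican_error(poll, actual):
    """Average of the absolute errors on the Trump-Kasich and Kasich-Cruz margins."""
    first = (actual["Trump"] - actual["Kasich"]) - (poll["Trump"] - poll["Kasich"])
    second = (actual["Kasich"] - actual["Cruz"]) - (poll["Kasich"] - poll["Cruz"])
    return (abs(first) + abs(second)) / 2


def combined_error(dem_poll, dem_actual, rep_poll, rep_actual):
    """Average the Republican error with the absolute Democratic error."""
    dem = abs(democratic_error(dem_poll, dem_actual))
    rep = republican_error(rep_poll, rep_actual)
    return (rep + dem) / 2


# Hypothetical numbers for illustration only (not real poll or vote data).
dem_actual = {"Clinton": 58, "Sanders": 42}
dem_poll = {"Clinton": 55, "Sanders": 43}
rep_actual = {"Trump": 60, "Kasich": 25, "Cruz": 15}
rep_poll = {"Trump": 54, "Kasich": 23, "Cruz": 19}

print(democratic_error(dem_poll, dem_actual))   # 4
print(republican_error(rep_poll, rep_actual))   # 5.0
print(combined_error(dem_poll, dem_actual, rep_poll, rep_actual))  # 4.5
```

In this made-up example the Democratic poll understates the margin by 4 points and the Republican poll misses its two margins by 4 and 6 points, so the combined average error is 4.5 points. Because the measure is built on margins, a poll can score well even if it misstates every candidate’s share by the same amount.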