Huffington Post Pollster does not include the USC/LA Times poll in their general election polling trend, but RealClearPolitics does. And, today, August 22, 2016, the USC/LA Times poll has Republican Donald Trump up 44.6 to 43.5. I do not believe the level of the poll (i.e., the head-to-head value), but I believe there is a lot of information in the movement of the poll. The sharp movement toward Trump over the last few days does mean a movement toward Trump among the respondents in their panel, but their actual value of Trump up by 0.9 percentage points could be way off.
The poll uses a panel of 3,200 people who answered demographic questions at the start of July and then report their voting intention once per week through Election Day. Each respondent in a seven-day period is weighted so that the sample resembles the voting population from 2012. Then, they further weight each respondent by their stated likelihood to vote. Then, they report the fraction voting for each candidate.
I love this type of experimental polling, but there are a few serious concerns about their methods. This list is not in order of importance, but in order of engagement:
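To make the arithmetic concrete, here is a minimal sketch of that pipeline as I understand it from the description above, not the pollster's actual code. The field names (`demo_weight`, `p_vote`, `p_candidate`) and the toy respondents are hypothetical.

```python
def vote_share(panel, candidate):
    """Weighted share for one candidate.

    Each respondent counts as demo_weight * p_vote, i.e. the demographic
    weight scaled by the stated chance of voting. (Assumed structure.)
    """
    numerator = 0.0
    denominator = 0.0
    for r in panel:
        w = r["demo_weight"] * r["p_vote"]
        numerator += w * r["p_candidate"][candidate]
        denominator += w
    return numerator / denominator


# Hypothetical toy panel: three respondents, made-up numbers.
toy_panel = [
    {"demo_weight": 1.0, "p_vote": 1.0, "p_candidate": {"Trump": 1.0, "Clinton": 0.0}},
    {"demo_weight": 2.0, "p_vote": 0.5, "p_candidate": {"Trump": 0.0, "Clinton": 1.0}},
    {"demo_weight": 1.0, "p_vote": 0.0, "p_candidate": {"Trump": 1.0, "Clinton": 0.0}},
]

print(vote_share(toy_panel, "Trump"), vote_share(toy_panel, "Clinton"))
```

Note how the third respondent, who says there is no chance of voting, drops out entirely. A week-to-week change in stated likelihood to vote moves the topline even if nobody changes their candidate preference.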
1) Probability of Voter Intention: Natalie Jackson and Ariel Edwards-Levy specifically cite the voter intention question wording as a reason to not include the poll. This is not actually a big concern for me for two reasons. First, I assume (I have not seen the data yet) that a large portion of people answer 100% or 0%. And, I further assume that deviations from this are somewhat symmetrical for Trump and Clinton supporters. Second, I do not find it that much different from the standard bevy of questions that include leaning or not leaning towards the candidates.
2) Probability of Voting: I am more concerned with the companion probability question on whether or not the respondent will vote. While some good work has been done in the past on asking probability of voting, it is not clear how well it will hold up in an election like this. It is possible that the standard method of inferring likely voting (from past voting records and other implicit questions) would actually be a more stable and realistic measure of likelihood of voting. Asking the respondents probably exaggerates shifts. Further, why derive this each week anew, when they have a full panel of data on the respondents? Surely they can model the likelihood of voting more efficiently with all of that response data they have for each respondent?
3) Party Identification: Each respondent answered a battery of demographic questions before the daily polling began. Along with the standard questions, they asked about the 2012 presidential vote. The poll is weighting people by their 2012 vote as a proxy for latent party identification. This is a bad proxy, because a person’s four-year-old vote is actually more susceptible to change than their current stated party identification. You read that right. People have a serious problem remembering if and for whom they voted in past elections. Generally, people overstate their vote for the winning candidate. What that means is the “Romney voters” in their panel are probably a more hardcore sub-section of Romney voters than actual Romney voters.
4) Modeling: The poll is raking their weights, rather than modeling the data. Depending on how representative their sample is to begin with and the randomness of the dropouts over time, once the weights get larger they become quite an issue. Modeling the data with some form of hierarchical regression provides additional power. I am particularly concerned with African-American support for Clinton dropping from 90 to 80 percentage points (and Trump’s support rising from near 0 to 14.3 percentage points). Could smaller demographic groups, like African-Americans, have weights that are too large due to under-representation in the poll?
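To illustrate the worry in point 4, here is a minimal sketch of raking (iterative proportional fitting) on a made-up sample. The categories, sample sizes, and target shares are all hypothetical and are not taken from the poll. The second function is the standard Kish approximation for effective sample size.

```python
def rake(rows, targets, n_iter=50):
    """Rake unit weights so each variable's weighted shares match targets.

    rows:    list of dicts, e.g. {"race": "black", "sex": "f"}
    targets: {variable: {level: target share}}, shares summing to 1
    """
    weights = [1.0] * len(rows)
    for _ in range(n_iter):
        for var, target in targets.items():
            total = sum(weights)
            # Current weighted total for each level of this variable.
            current = {}
            for w, row in zip(weights, rows):
                current[row[var]] = current.get(row[var], 0.0) + w
            # Scale every respondent so this margin hits its target.
            for i, row in enumerate(rows):
                level = row[var]
                weights[i] *= target[level] * total / current[level]
    return weights


def kish_effective_n(weights):
    """Kish effective sample size: (sum w)^2 / sum w^2."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


# Hypothetical sample of 20 with one respondent from a small group.
rows = (
    [{"race": "other", "sex": "f"}] * 10
    + [{"race": "other", "sex": "m"}] * 9
    + [{"race": "black", "sex": "f"}]
)
targets = {
    "race": {"other": 0.88, "black": 0.12},
    "sex": {"f": 0.5, "m": 0.5},
}

weights = rake(rows, targets)
print(max(weights), kish_effective_n(weights))
```

In this toy case the lone respondent from the under-represented group ends up with the largest weight in the sample, so any change in that one person's answer swings the subgroup estimate completely. A hierarchical model would instead pull small-cell estimates toward the rest of the data.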
Most likely, the party ID issue is making the poll a few points too favorable for Trump. That is, by far, my biggest concern about the poll.
But that does not discount that the movement may still hold valuable information about the race tightening a little. They have a relatively steady group of people and show a 2.7 percentage point drop in support for Clinton from her peak and a 2.6 percentage point increase for Trump from his bottom. Someone is moving towards Trump and someone is moving away from Clinton, but it is not clear from where and by how much.
This poll is innovative and interesting. I am worried that this important science may be diminished in some way by its questionable topline values in real time. It should not be. I think the panel will ultimately yield some valuable insight into the 2016 election and polling methodology. It just should not be included in polling averages.