What is it like doing data science for a presidential campaign? originally appeared on Quora: The best answer to any question.
I think the biggest gap between what was written about the Obama Data team in 2012 and the actual experience is that most people don't know how deeply broken our tools were. Everything was built to last just a few months, and as a consequence it required a lot of "duct tape and hope," a phrase we often used to describe the hacky solutions needed to make everything work. That said, we were very much on the cutting edge in terms of data centralization and field experiments.
I was with the Virginia state Data team, outside of the Chicago headquarters operation. That meant my interaction with data science mostly took the form of running experiments to help build our models, and then using the models built by Chicago to guide our microtargeting for the Field calling and door knocking done by volunteers. There was also a fair amount of reporting (probably the biggest lift of my work), though that's outside the scope of your question.
My favorite experiment was when we were building a model to find the most persuadable people in our universe. That is to say, the people who take the fewest calls or door knocks to convince to vote for the President. Basically, a round of robodials went out to measure each person's support for the President on a scale of 1-5. Then volunteers made live calls to those people to try to persuade them. Then another round of robodials made the same 1-5 assessment as before. All of that data was then used to see what type of person was most affected by the live call from a volunteer. This may all sound simple, but remember that it was happening in the middle of a giant campaign. All of the volunteers had to be trained to deliver the script, and the control population had to be set aside and not contacted by any part of the campaign (which required heavy monitoring of Field staff). On top of that came the logistics of keeping the information clean.
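To make the logic of that experiment concrete, here is a minimal sketch in Python. It is not how the actual model was built; it only illustrates the comparison described above: the change in the 1-5 score for people who got a live call, minus the change for the control group, broken out by type of person. The segment names, the record layout, and every number in it are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (segment, group, pre_score, post_score).
# Scores use the 1-5 support scale; none of this is real campaign data.
records = [
    ("under_35", "treatment", 3, 4),
    ("under_35", "treatment", 2, 4),
    ("under_35", "control", 3, 3),
    ("under_35", "control", 2, 3),
    ("over_65", "treatment", 3, 3),
    ("over_65", "treatment", 4, 4),
    ("over_65", "control", 3, 3),
    ("over_65", "control", 4, 4),
]


def persuasion_lift(rows):
    """Return {segment: mean treatment shift minus mean control shift}."""
    shifts = defaultdict(lambda: {"treatment": [], "control": []})
    for segment, group, pre, post in rows:
        # Shift = how far the person moved between the two robodial rounds.
        shifts[segment][group].append(post - pre)
    return {
        segment: mean(groups["treatment"]) - mean(groups["control"])
        for segment, groups in shifts.items()
        # A segment needs both groups, or there is nothing to compare.
        if groups["treatment"] and groups["control"]
    }


lift = persuasion_lift(records)
for segment, value in sorted(lift.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{segment}: {value:+.2f} points on the 1-5 scale")
```

Subtracting the control group's shift is the reason the control population had to stay uncontacted: it is the only way to separate the effect of the live call from whatever else moved opinion between the two robodial rounds.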
But in the end, everyone in Virginia who was registered to vote was given a score from 0 to 100 for how persuadable they were, and we were able to talk to the most persuadable people first, so that we weren't wasting our volunteers' time. That's just one example of the kind of work you see on a campaign data team.
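Using the scores is the simple part. A rough sketch, assuming each voter ID maps to a 0-100 persuadability score and that volunteers can only reach a limited number of people in a shift (the IDs, scores, and the `capacity` parameter are all hypothetical):

```python
def build_call_list(scores, capacity):
    """Return voter IDs ordered most persuadable first, capped at capacity."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [voter_id for voter_id, _ in ranked[:capacity]]


# Made-up voter IDs and scores, for illustration only.
scores = {"V001": 12, "V002": 87, "V003": 55, "V004": 91}
call_list = build_call_list(scores, capacity=3)
print(call_list)
```

With these example scores, the lowest-scoring voter is the one left off the list when volunteer time runs out.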
The actual building of the model, however, I cannot speak to. That was done by a group in Chicago that we had limited contact with, called the Analytics department (affectionately known as The Cave because of the unventilated, windowless room they worked out of).
Many Data staff on campaigns have some Field experience knocking on doors and recruiting volunteers on the phone. I worked four campaign cycles in Field before starting in Data. Understanding data management in a campaign requires that you understand how campaign recruitment and electioneering work. In some ways, that's more important than technical skills like R and SQL, which can be taught. So if you work in data science and you're trying to see what campaign data science would look like, I recommend that you find a Field office for Hillary Clinton's campaign (there are other candidates, but none with an operation as sophisticated) and volunteer to knock on doors and do data entry. You will likely be very surprised at both the depth of information available and the limitations of data collection in that environment.