The History of Data Science

This post was published on the now-closed HuffPost Contributor platform.

"If you had to explain what Data Science is to someone with no prior knowledge, how would you do it?" originally appeared on Quora, the knowledge-sharing network where compelling questions are answered by people with unique insights.

Answer by Neel Sundaresan, Partner Director, Data Platform, Data Science and Applications, C&E, Microsoft, on Quora.

Definitions vary depending on whom you ask. To me, Data Science is the ability to gain knowledge or learn from data and to apply that knowledge to make decisions. This is fairly broad: it encompasses mathematics, statistics, computer science, and engineering, along with many fields within computer science such as pattern recognition, information theory, machine learning, and high-performance computing.

The phrase "data science" has been around for decades, starting with Peter Naur, who used the term in 1960 to mean data processing. In the late 1990s, Jeff Wu gave a statistics talk titled "Statistics = Data Science?" in honor of Prof. Mahalanobis, the founder of the Indian Statistical Institute (ISI), Calcutta, India. Then, in the early 2000s, Bill Cleveland, an environmental statistician, called for establishing Data Science as a field to expand the field of statistics in his article in the International Statistical Review.

The later part of that decade saw folks at Silicon Valley companies formalize the title. When I started the research labs at eBay over a decade ago, it was hard to imagine any work without data. I had a team of engineers and scientists who looked at all aspects of data, from infrastructure to algorithms to machine learning to vision, HCI, and economics, yet none of them had the title of "data scientist" (it wasn't a "sexy" title then). I saw each one as an expert in their respective field who worked (or had to learn to work) with large amounts of data. The overall team could have been thought of as a data science team, as in the story of the elephant and the blindfolded men: each one brought IQ along a certain dimension, and all of them together completed the story. This opened up opportunities for collaboration, for learning techniques from other fields, and for realizing that other fields have similar techniques. A machine learning person would describe precision and recall to an empirical economist, who would try to map them to Type I and Type II errors, and a statistician would chime in with sensitivity and specificity. Everyone was in big data candy land, so the distributed data engineers were the most loved, as they offered services over the data that the scientists had not seen before.
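To make that cross-field vocabulary concrete, here is a minimal Python sketch of how the three sets of terms could be read off one binary confusion matrix. It is an illustration and not something from the original answer: the function name `confusion_metrics` and the counts in the example are hypothetical, and it assumes every denominator is non-zero.

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Express one binary confusion matrix in three vocabularies.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives. Assumes no denominator is zero.
    """
    return {
        # Machine learning vocabulary
        "precision": tp / (tp + fp),            # share of predicted positives that are correct
        "recall": tp / (tp + fn),               # share of actual positives that are found
        # Statistics / diagnostics vocabulary
        "sensitivity": tp / (tp + fn),          # same quantity as recall
        "specificity": tn / (tn + fp),          # share of actual negatives correctly rejected
        # Hypothesis-testing vocabulary
        "type_1_error_rate": fp / (fp + tn),    # false positive rate = 1 - specificity
        "type_2_error_rate": fn / (fn + tp),    # false negative rate = 1 - recall
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only for illustration.
    for name, value in confusion_metrics(tp=40, fp=10, fn=20, tn=30).items():
        print(f"{name}: {value:.3f}")
```

Recall and sensitivity are the same number, and the Type I and Type II error rates are the complements of specificity and recall. Precision has no direct counterpart among the error rates, since it also depends on how common positives are in the data, which may be why the mapping takes some discussion.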
