Big Data

As one who has spent a goodly portion of his adult life raving about the folly of amassing much more information than we can analyze and use effectively, I am, to no one's surprise, skeptical about the so-called "big data" phenomenon.

Digital technology has endowed us with a massive capacity to accumulate data. Since the 1980s, our capacity to store information has roughly doubled every 40 months. As of 2012, some 2.5 exabytes of new data were being created every day. An exabyte is one quintillion bytes. My eyes tend to glaze over when I encounter numbers like that. It's like reading that some galaxy is 90 million light-years from the Earth or that the dinosaurs roamed 100 million years ago.

Suffice it to say that an exabyte is a great deal of data, and when you assemble lots and lots of them, you have big data. We are talking about data sets so large and complex that traditional data processing applications cannot handle them. There is a widespread and, in my opinion, pathetically optimistic assumption that the sheer volume of big data will lead to better decision making, and with it greater efficiency, in a broad range of applications: business, science, government, politics, service organizations, national defense, everything.

You can count me among the skeptics. Big data is by definition raw data: information gleaned by computers without rhyme or reason, verification or interpretation. Computer scientists, physicists, economists, mathematicians, political scientists, bioinformaticists (whatever they are), sociologists, and every other kind of "ist" are demanding access to the massive amounts of information being produced about people, things and their interactions. Everyone wants in on the action of the "next big thing."

But raw data is just what it sounds like: raw. It is just numbers reflecting actions and transactions, without context or reasoned analysis. Precious few people know how to glean useful knowledge from the mass of information we are accumulating. No matter how comprehensive or well analyzed, said the Harvard Business Review, big data must be complemented by "big judgment" or it will do more harm than good. But big judgment is hard to find.

Researchers danah boyd and Kate Crawford say big data is less about data than about a capacity to search, aggregate and cross-reference large data sets. "Like other socio-technical phenomena, Big Data triggers both utopian and dystopian rhetoric," they said. "On one hand, Big Data is seen as a powerful tool to address various social ills, offering the potential of new insights into areas as diverse as cancer research, terrorism and climate change. On the other, Big Data is seen as a troubling manifestation of Big Brother, enabling invasions of privacy, decreased civil freedoms and increased state and corporate control."

I would go further. Information taken out of context is invariably misleading. By its very nature, the raw content of big data is taken out of context.

