The world's largest big data and analytics project is tackling some of the most complex scientific questions, ones that have mystified humankind for centuries.
At the same time, this project is helping us usher in the era of cognitive computing, otherwise known as systems that will make sense of and learn from the world the way humans do.
The project in question is the Square Kilometre Array, or SKA, the largest, most sensitive radio telescope ever constructed. Its goal is to reach back to the time immediately after the Big Bang by collecting faint radio signals from deep space, helping us better understand the evolution of the universe, stars and galaxies, and address fundamental riddles about the nature of matter. Construction is expected to begin in 2018 and be completed by 2024.
Backed by 10 nations and thousands of scientists, the SKA is a massive feat, one that requires huge breakthroughs in the technology needed to build and run it.
Consider the following:
- This project will be made up of 3,000 dishes dotted over thousands of miles in southern Africa and western Australia. Every day, millions of antennas will collect a deluge of data, some 14 exabytes, or twice as much as the Internet produces daily. The data, in the form of radio waves dating back to the Big Bang, is 13 billion years old, yet it is still accessible using highly sensitive antennas.
Only by basing the overall design on architectures beyond the current state of the art will it be possible to handle the vast amounts of data produced by the millions of antenna systems of the SKA. These technologies are under development in a public-private collaboration that includes IBM; ASTRON, the Netherlands Institute for Radio Astronomy; and Square Kilometer Array (SKA) South Africa, a business unit of South Africa's National Research Foundation.
The technology runs the gamut from computer-system and chip design to information management and analytics. The scientists will jointly tackle the fundamental challenges such a massive project creates, including new approaches for efficiently and cost-effectively collecting the data, cleaning it, storing it and pulling out the needed insights.
In the process, the advances that we and our partners make in cognitive computing and other technologies can be applied to a wide range of computing applications in industries from health care to telecommunications. In our world today, we're collecting a wide variety of data, from videos to people's comments on the Web, to readings from sensors on the bottom of the sea, and yes, radio waves. We'll need these new approaches to make sense of this deluge of so-called unstructured Big Data. Cognitive computing will come into play because we'll need massive advances in systems that think and learn by themselves over time.
Just consider the issue of storing and accessing that data. The 14 exabytes collected daily by the SKA will be whittled down to about a petabyte of usable data per day. Even so, it will add up quickly, becoming an exabyte of data within three years and three exabytes within a decade.
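A rough back-of-the-envelope check of that growth, assuming for illustration a steady petabyte of retained data per day:

```python
PB_PER_DAY = 1     # usable data retained per day, in petabytes (illustrative)
PB_PER_EB = 1000   # petabytes per exabyte (decimal units)

def retained_exabytes(days):
    """Total retained data, in exabytes, after a given number of days."""
    return days * PB_PER_DAY / PB_PER_EB

three_years = retained_exabytes(3 * 365)   # roughly 1.1 exabytes
decade = retained_exabytes(10 * 365)       # roughly 3.7 exabytes
```

At this rate the archive crosses an exabyte shortly before the three-year mark, matching the figures above.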
Storing all the data isn't economically feasible in terms of both material and energy costs. But by using cognitive computing, we can design systems that are smart about how to manage that data, learn what should be stored and where, whether it's on instantly accessible hard disks, backup magnetic tapes, or next-generation storage class memory like flash.
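One simple way to picture such data-aware placement is a policy that assigns each dataset to a storage tier based on how often it is accessed. The thresholds and tier names below are illustrative assumptions, not the SKA's actual design; a learning system would tune such rules over time from observed access patterns:

```python
def choose_tier(accesses_per_month):
    """Pick a storage tier from access frequency (illustrative thresholds)."""
    if accesses_per_month >= 100:
        return "flash"      # hot data: storage-class memory
    if accesses_per_month >= 1:
        return "hard disk"  # warm data: instantly accessible disks
    return "tape"           # cold data: backup magnetic tape

# Example: a dataset touched daily lands on disk, one untouched
# for months migrates to tape.
```

The point is not the specific cutoffs but the idea that the system, rather than a human operator, decides where data should live.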
That's why SKA is the ultimate Big Data challenge. It will tackle questions that we've never been able to answer about the universe, with technology that was never before available.
If you want to learn more about cognitive computing, download a free chapter of Smart Machines, by IBM Research Director John E. Kelly III.
Follow IBM on Twitter @ibm.