An astronomer is sometimes pictured as a person standing on the top of a hill, gazing into a telescope aimed at the night sky. That rather romantic scene is now gradually being replaced by images of people sitting inside offices, working on computers.
The information revolution did not skip astronomy, and most astronomical data now come in digital format. In fact, the virtualization of astronomical observatories, such the Virtual Astronomical Observatory (VAO), made astronomical data available to everyone, allowing any person with a computer and an Internet connection to do the same research done at the most prestigious academic institutes. Access to special labs or costly research equipment is no longer a barrier.
But the application of information technology to astronomy also introduces new challenges. Robotic telescopes constantly collect astronomical data and generate enormous astronomical databases. For instance, Sloan Digital Sky Survey (SDSS) has imaged over 400 million galaxies since it saw first light in 2000. The Large Synoptic Survey Telescope (LSST) will collect that amount of data once every three days and will operate for 10 years. So the databases being generated by robotic telescopes are literally astronomical.
These huge databases are of almost no value unless there are tools that can analyze them and turn them into knowledge. Unfortunately, the astronomy community so far has not excelled in the field of big data and has not yet fully assimilated the idea that astronomy research depends on analysis of large astronomical databases.
But just a quick look into these databases can show the potential of what can be discovered in them. Obviously, the unarmed human eye is not an efficient tool to examine millions of celestial objects, a fact that echoes the need for computational tools that can "understand" certain tasks in astronomy research and automatically mine large astronomical datasets for scientific discoveries. For instance, while most galaxies in the universe have a well-defined shape characterized by the galaxy morphology scheme known as the Hubble sequence, some galaxies have a peculiar shape that does not fit in any of the known morphological classes. These galaxies are of paramount scientific importance as they carry precious information about the past, present, and future universe, but finding them among millions of regular galaxies is not an easy task to do manually. John Wallin and I developed a computer vision method that automatically analyzes galaxy images and searches for unusual galaxies. That computational method provided a new look at SDSS galaxy database, and can be conceptualized as a computational telescope that looks inside the information universe of digital sky surveys. The first peculiar galaxy found was a worm-like object that does not look like a known type of galaxy.
This object can be an edge-on ring galaxy. Ring galaxies are very rare, and edge-on rings are rare among ring galaxies. The algorithm also detected another similar object, about 413 million light-years away.
Here is another example of a strange object that the algorithm found. The triangular shape of the galaxy is probably the result of the gravitational interaction with the nearby round galaxy. This pair of galaxies is about 482 million lights-years away.
About 385 million light-years away, the algorithm found a strange celestial object that looks like a snail crawling into a hole.
More than 1.2 billion light-years away the algorithm found the following strange-looking ring galaxy.
The Pacman-like galaxy located 510 million light-years from Earth features some form of cosmic cannibalism.
The bright object above this distorted galaxy can be part of the system, or a much closer star that happened to be positioned in the same part of the sky.
Some galaxies can have more than one nucleus.
This galaxy is also checked.
These are just a few examples of what can be found by mining a modest dataset of just about four million galaxies. This example certainly does not mean that we can now fully mine databases thousands of times larger such as LSST, and in fact we are not even close. Research in the field moves rather slowly, and it will probably take years to get there, but the "scoreboard" galaxy shows the score so far.