Data's Big Bang: Applying Analytics to Astronomy

(Image: skeeze/Pixabay)

If you're an analytics geek with your head in the stars -- by which I mean you have a fascination for stars, planets, moons, galaxies, black holes, and other astronomical objects -- astronomical research may offer you a dream career path. It might require some background in the field of astronomy -- but it's an option for indulging the two interests of analytics and astronomy.

The astronomical research field is expanding by leaps and bounds. It should come as no surprise that astronomical science has been involved in the big-data/analytics revolution (actually, gathering and processing large volumes of data have long been hallmarks of astronomical research). But what's jaw-dropping is the payoff: how quickly and how enormously our knowledge of the universe is growing.

The Atlantic magazine's article, How Big Data Is Changing Astronomy (Again), noted five years ago that "This isn't your grandfather's stargazing," and explained that "the amount of data we have on our universe is doubling every year thanks to big telescopes and better light detectors."

Another whiz-bang system, the Laser Interferometer Gravitational-Wave Observatory (LIGO), detected gravitational waves for the first time just over a year ago. The waves emanated from a collision of black holes about 1.3 billion light years away. That discovery was the culmination of roughly 40 years of observation and data collection.

The analytics revolution is strongly influencing the direction and future of astronomy, according to a paper by Michael Garrett of ASTRON (the Netherlands Institute for Radio Astronomy) and the University of Leiden. Garrett presented the paper, titled Big Data Analytics and Cognitive Computing: Future Opportunities for Astronomical Research, at a 2014 radio-telescope astronomy conference. Here's an excerpt of what Garrett wrote:

"The days of the lone astronomer with his optical telescope and photographic plates are long gone: Astronomy in 2025 will not only be multi-wavelength, but multi-messenger, and dominated by huge data sets and matching data rates. Catalogues listing detailed properties of billions of objects will in themselves require a new industrial-scale approach to scientific discovery, requiring the latest techniques of advanced data analytics and an early engagement with the first generation of cognitive computing systems. Astronomers have the opportunity to be early adopters of these new technologies and methodologies: the impact can be profound and highly beneficial to effecting rapid progress in the field."

Kirk Borne, currently the principal data scientist at the Strategic Innovation Group at Booz Allen Hamilton, weighed in on the topic in a 2014 Data Science Weekly interview, back when he was a professor of astrophysics and computational science at George Mason University. Borne recounts how he realized the importance of data in astronomical research in 1998, while he was working for Raytheon in NASA's Astrophysics Data Facility.

"... I realized that the huge increase in data volumes [was] leading to huge potential for new discoveries. To achieve those discoveries, we needed the special machine learning algorithms that are used in data mining. I began devoting all of my research time to data mining research, initially on the very same colliding galaxies that I had previously studied 'one at a time' but now 'many at a time.'"

Borne also describes how his team "secured grants to discover unusual super-starbursting galaxies in large astronomy data sets." To provide the analytics, the team adapted a neural network model originally designed to identify wildfires in remote sensing satellite images of the Earth. This was converted into an analytical model that ran data mining operations on distributed data -- "one of the first successful examples of 'ship the code, not the data.'"
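To make the "ship the code, not the data" idea concrete, here is a toy Python sketch. It is not Borne's neural network or his team's system; the `Site` class, the `count_starbursts` job, the threshold, and all the numbers are hypothetical and chosen only for illustration. Each site runs the same small function against its own local data and returns just a compact summary, so the bulky catalogue never has to move.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Site:
    """A hypothetical data center holding its own slice of a galaxy catalogue."""
    name: str
    star_formation_rates: List[float]  # made-up values, arbitrary units

    def run(self, job: Callable[[List[float]], Dict[str, int]]) -> Dict[str, int]:
        # The job executes where the data lives; only the summary leaves the site.
        return job(self.star_formation_rates)


def count_starbursts(rates: List[float], threshold: float = 100.0) -> Dict[str, int]:
    """Count objects above an (assumed) starburst threshold."""
    hits = [r for r in rates if r > threshold]
    return {"n_objects": len(rates), "n_starbursts": len(hits)}


# Two pretend sites, each with local data that stays put.
sites = [
    Site("site_a", [3.0, 250.0, 12.5]),
    Site("site_b", [180.0, 0.7, 95.0, 410.0]),
]

# "Ship" the same function to every site and collect only the small results.
summaries = {site.name: site.run(count_starbursts) for site in sites}
total_starbursts = sum(s["n_starbursts"] for s in summaries.values())

print(summaries)
print("Total starburst candidates:", total_starbursts)
```

In a real deployment the function would travel over a network to remote archives, but the design choice is the same: move a few kilobytes of code rather than terabytes of observations.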

Examples of modern astronomical research projects illustrate the magnitude of typical datasets and some approaches to analytics. The Next Generation Virgo Cluster Survey (NGVS), described in a 2012 InfoWorld article, seeks to map the approximately 2,000 galaxies of the Virgo Cluster, the closest large galaxy cluster to our own Milky Way galaxy. InfoWorld reported that "the data collection process adds terabytes per week, yielding hundreds of terabytes to analyze." Facing this daunting task, project leaders turned to the Canadian Advanced Network for Astronomical Research (CANFAR) -- "the first dedicated cloud computing platform for astronomy, used to store, share, and analyze the data for astronomers worldwide." Researchers further "determined that machine learning, a type of advanced analytics with origins in artificial intelligence, would provide the most productive approach to accurately identify galaxies and generating the full Virgo Cluster map."

Another major project, the Square Kilometre Array (SKA), which aims to increase the speed of sky surveying by 10,000 times, has enlisted the Murchison Widefield Array (MWA) radio telescope, located in Murchison Shire (some 800 km north of Perth, Australia). As described in a 2015 news report titled Radio astronomy backed by big data projects, the MWA is "designed to look back in time, to study the formation of the first stars and galaxies in the universe, less than one billion years after the Big Bang" (which itself occurred about 13.8 billion years ago). So what sizes of datasets are involved?

In just 18 months, the MWA collected over 4 petabytes of data. The Murchison systems produce data streams at approximately 60 gigabits per second, processed on-site in real time using GPU-based (graphics processing unit) signal processing "as the first stage in a hierarchical data processing strategy." Output data streams are transmitted over a dedicated 10 Gbps optical fiber network to the Pawsey Supercomputing Centre in Perth.
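A quick back-of-the-envelope calculation shows why that on-site processing stage matters. The Python sketch below uses only the figures quoted above, plus some assumptions of my own: decimal units (1 petabyte = 10^15 bytes), 18 months counted as 1.5 years of 365 days, and a telescope that streams around the clock. Real instruments do not observe continuously, so treat the results as rough upper bounds rather than reported values.

```python
# Rough arithmetic for the MWA figures quoted in the article.
# Assumptions (not from the article): decimal units, 18 months = 1.5 * 365 days,
# and continuous, uninterrupted operation.

SECONDS_PER_DAY = 86_400
PERIOD_SECONDS = 1.5 * 365 * SECONDS_PER_DAY  # about 47.3 million seconds

RAW_RATE_BITS_PER_S = 60e9    # ~60 gigabits per second produced on site
LINK_RATE_BITS_PER_S = 10e9   # 10 Gbps fiber link to Perth
ARCHIVED_BYTES = 4e15         # "over 4 petabytes" collected in 18 months


def bits_to_petabytes(bits: float) -> float:
    """Convert a bit count to decimal petabytes."""
    return bits / 8 / 1e15


# Volume if every raw bit were kept for the whole period.
raw_pb = bits_to_petabytes(RAW_RATE_BITS_PER_S * PERIOD_SECONDS)

# Most the fiber link could carry in the same period.
link_pb = bits_to_petabytes(LINK_RATE_BITS_PER_S * PERIOD_SECONDS)

# Average rate implied by the archived 4 PB, in gigabits per second.
avg_archived_gbps = ARCHIVED_BYTES * 8 / PERIOD_SECONDS / 1e9

# How much smaller the archive is than the hypothetical raw stream.
reduction_factor = raw_pb / (ARCHIVED_BYTES / 1e15)

print(f"Raw stream over 18 months: ~{raw_pb:.0f} PB")
print(f"Link capacity over 18 months: ~{link_pb:.0f} PB")
print(f"Average archived rate: ~{avg_archived_gbps:.2f} Gbit/s")
print(f"Raw-to-archive reduction: ~{reduction_factor:.0f}x")
```

Under these assumptions the raw stream would amount to hundreds of petabytes, far more than either the fiber link or the archive handles, which is consistent with the article's point that data is reduced on site before anything is sent to Perth.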

These examples represent just a minute sampling of the multitude of truly exciting astronomical research projects across the globe that are using analytics to process ever-larger datasets and broaden humanity's knowledge of our own universe. The potential career opportunities related to data analytics could be significant. Candidates for technical positions would likely need to have some background in the given technical field -- but this might take many different forms and levels.

Some positions might require college-level degrees, minors, or significant hours in astronomy-related studies, perhaps with some additional hours in data analytics. Others might call for the reverse, such as a degree in data analytics with a minor in an astronomy-related field. There may also be opportunities for consultants -- for example, helping to build more effective systems for large-volume data processing -- as well as implications for college-level courses and contractual training services to acquaint personnel involved in astronomical research with advanced data analytics methods for processing big data on a mind-boggling scale.

Do you think analytics professionals would be attracted to careers in astronomical research? Can you envision other ways analytics expertise can be connected? Share your thoughts in the comments below.

Lyndon Henry, Writer/Editor & Transportation Consultant

Lyndon Henry, a writer, editor, journalist, and transportation consultant, holds a Master of Science in community and regional planning (transportation focus), 1981, and a Bachelor of Arts in History, 1964, both from the University of Texas at Austin. In 1973 he presented the original proposals and feasibility studies of light rail for Austin, which led to planning for rail transit in the region. From 1981 to 1985 he served as a transportation consultant to the Hajj Research Centre in the Kingdom of Saudi Arabia. He has also served as a consultant on various transit projects in the US. In 1984, as a member of the Austin-Travis County Transit Task Force, his recommendation to form a transit authority for the Austin area with a full funding mechanism was accepted, and the authority, now called Capital Metro, was created in 1985. From 1989 to 1993, Henry served as a board member and vice-chairman of the agency. From 1990 to 1992 he taught a course in public policy at St. Edwards University (Austin). From 2002 to 2011 he served as a data analyst for Capital Metro. Among other pursuits, his background also includes work as a programmer and systems analyst, investigative journalist, and creative fiction writer. Currently he is a blog columnist for Railway Age magazine.

Re: Data discovery
  • 3/20/2017 3:25:11 PM

I think astronomy and space exploration is so important for humanity for no other reason than it gives us hope and inspiration. 

Re: Data discovery
  • 3/20/2017 3:11:09 PM

What a great use of analytics. I am sure you are right about the next big discoveries.

Data discovery
  • 3/20/2017 10:49:25 AM

Thought provoking blog, Lyndon... I think you make it abundantly clear that the next big breakthrough or discovery in astro-geophysics will come from advanced calculations made within data, and not from a star gazer poised at the end of a gigantic telescope.
