Tuesday, 02 January 2024 12:17 GMT

How Big Data Is Transforming What We Know About The Universe


Author: Muiris MacCarthaigh
(MENAFN- The Conversation) Science in the modern era is increasingly reliant on enormous datasets and automated analysis. In astronomy, the Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST) – a ten-year survey covering the entire southern sky almost a thousand times over the next decade – will test the limits of this reliance.

The Rubin observatory, located on a mountaintop called Cerro Pachón in Chile, is expected to catalogue the night sky in exquisite detail. The observatory aims to answer a number of questions about the universe by studying different phenomena in the sky, including supernovae (exploding stars), asteroids, dark matter and the properties of our own galaxy.

What it will also answer is a question dominating all areas of science in the 21st century: how is discovery viewed in the age of big data?

Although primarily funded by the US Department of Energy and National Science Foundation (NSF), the Rubin telescope is the product of a collaborative effort by astronomers spanning six continents and over a dozen countries.

Assistance in setting up its data processing systems was provided by the UK, France, Spain, Italy, Japan, Brazil, Australia, South Africa and Canada, among others. These in-kind contributions provide researchers from these countries with data rights for the LSST.

Alerts providing scientific data are forwarded to seven “brokers” scattered around the world. The brokers are websites or software services that astronomers use to access the data from LSST.

The alerts provide information on a new astronomical object, such as its likelihood of being real, its type, the galaxy it belongs to and how its brightness has changed over time. With this data, astronomers are able to select the best candidates for follow-up research.
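The selection step described above can be sketched in a few lines. This is an illustrative example only: the field names (`real_bogus_score`, `object_type`) are hypothetical stand-ins, not the actual LSST alert schema, and the score threshold is an assumption.

```python
# Minimal sketch of broker-style candidate selection over a stream of
# alerts. Field names and the 0.9 threshold are illustrative, not the
# real LSST alert schema.

def select_candidates(alerts, min_score=0.9, wanted_types=("SN",)):
    """Keep alerts that are likely real and of a type of interest."""
    return [
        a for a in alerts
        if a["real_bogus_score"] >= min_score
        and a["object_type"] in wanted_types
    ]

alerts = [
    {"id": 1, "real_bogus_score": 0.97, "object_type": "SN"},
    {"id": 2, "real_bogus_score": 0.42, "object_type": "SN"},   # likely bogus
    {"id": 3, "real_bogus_score": 0.95, "object_type": "AST"},  # asteroid
]

candidates = select_candidates(alerts)
print([a["id"] for a in candidates])  # → [1]
```

In practice brokers apply many such filters in sequence, and researchers tune the thresholds to trade completeness against the volume of candidates they can realistically follow up.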

However, even with the efforts of the software teams and brokers, there is still too much transient data for any research team to sift through. The final stage of data processing from the Rubin telescope will involve scientists using machine learning and AI techniques to identify the best data.

These techniques may be used to identify real cosmic objects among the terabytes of false alerts received, or to classify the objects most interesting to scientists.
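The real-versus-bogus classification idea can be illustrated with a toy nearest-centroid classifier. Real pipelines use far richer image features and heavier models (random forests, deep networks); the two features here – source sharpness and flux ratio – are invented purely for illustration.

```python
# Toy real/bogus classifier: nearest-centroid on two made-up image
# features. Only illustrates the idea of learning from labelled
# examples; production systems are far more sophisticated.

def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def train(real, bogus):
    return {"real": centroid(real), "bogus": centroid(bogus)}

def classify(model, x):
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(model, key=lambda label: dist2(model[label], x))

# Hypothetical features: (source sharpness, flux ratio to reference image)
real_examples  = [(1.0, 1.2), (1.1, 1.0), (0.9, 1.1)]
bogus_examples = [(3.0, 0.1), (2.8, 0.2), (3.2, 0.0)]

model = train(real_examples, bogus_examples)
print(classify(model, (1.05, 1.1)))  # → real
print(classify(model, (2.9, 0.15)))  # → bogus
```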

Astronomy is increasingly code-heavy and focused on in-house development. Given the huge amounts of data generated with every night of telescope observations, it is, unsurprisingly, one of the first sciences to turn to machine learning as a solution.

LSST's Informatics and Statistics Science Collaboration (ISSC), for example, is a group of over 150 data scientists who work on developing tools for astronomy, focusing on the survey's data science goals.

Astronomy has led the charge on big data, with companies such as Amazon and Microsoft funding a number of major projects. Indeed, the namesake of the 8.4-metre Simonyi Survey Telescope at the Rubin observatory, Charles Simonyi, is known for software development in the early days of Microsoft, as well as his philanthropic work.

The volume of data produced by the observatory will create opportunities not only for scientists, software developers and tech workers, but also for volunteers with an interest in astronomy via citizen science projects.

LSST's partnership with the citizen science platform Zooniverse will ask volunteers to look through data and provide additional context to what they're shown – identifying interesting objects, discarding garbage data and classifying various types of phenomena.

Future lessons

What does the Rubin observatory tell us about modern astronomy? The 20th century saw a greater push for international collaboration in exploring the skies. The increased sophistication of the resulting observatories means that more and more astronomers are working in the service of enabling science, rather than making discoveries themselves.

The huge amounts of data generated by the survey, and the huge number of personnel required to analyse it, are not unique to Rubin. Other contemporary surveys, such as Euclid and the LIGO-Virgo-KAGRA collaboration, as well as the next decade's even larger Square Kilometre Array, each involve thousands of collaborators worldwide leveraging huge amounts of data.

What is clear is that AI will dominate the scientific discovery space of the Rubin observatory to meet these big data challenges. With more funding from industry to develop AI tools to analyse astronomy data, astronomy is becoming deeply embedded within the tech-sphere that dominates modern life.

Rubin will produce 10 terabytes of data every night, with the aim of a final database size of 15 petabytes at the end of its ten-year survey. With the majority of the 10 million alerts produced each night expected to be false, advanced machine learning and AI tools are required to filter out all but the most promising candidates for follow-up.

By reducing the time astronomers spend reviewing this data, these tools free them to carry out new and exciting astrophysics research.

Ownership of both the tools of discovery and the discovery itself is now disseminated among scientists, big tech and the citizens who label data. The unresolved question is whether the cosmos will remain a shared public frontier, or become a domain shaped by the priorities of Silicon Valley.


The Conversation



Institution: Queen's University Belfast

