ATLAS releases 65 TB of open data for research
1 July 2024
The ATLAS Experiment at CERN has made two years’ worth of scientific data available to the public for research purposes. The data include recordings of proton–proton collisions from the Large Hadron Collider (LHC) at a collision energy of 13 TeV. This is the first time that ATLAS has released data on this scale, and it marks a significant milestone in terms of public access and utilisation of LHC data.
“Open access is a core value of CERN and the ATLAS Collaboration,” says Andreas Hoecker, ATLAS Spokesperson. “Since its beginning, ATLAS has strived to make its results fully accessible and reusable through open access archives such as arXiv and HepData. ATLAS has routinely released open data for educational purposes. Now, we’re taking it one step further — inviting everyone to explore the data that led to our discoveries.”
ATLAS has made public, under the Creative Commons CC0 waiver, all the data collected by the experiment during the 2015 and 2016 proton–proton operation of the LHC. This is approximately 65 TB of data, representing over 7 billion LHC collision events. In addition, ATLAS has released 2 billion events of simulated “Monte Carlo” data, which are essential for carrying out a physics analysis.
Today's release underscores the ATLAS Collaboration's long-standing commitment to open access principles.
External researchers, in particular, are encouraged to explore the ATLAS open data. “Along with the data, we have provided comprehensive documentation on several of our analyses, guiding users through our process step-by-step,” says Zach Marshall, ATLAS Computing co-Coordinator. “These guides provide first-hand experience of working on a real ATLAS result, allowing anyone to test our tools, and evaluate the systematic uncertainties associated with the result for themselves.”
ATLAS traditionally collaborates with non-ATLAS scientists through short-term associations, granting them full access to ATLAS data, internal tools, and information. Through the open data, ATLAS researchers hope to further nurture this dialogue and collaboration. “In particular,” adds Zach, “we’d like to encourage phenomenologists and also computer scientists to explore our datasets, instead of relying on mock-ups.”
Today’s release builds upon previous open data releases for educational use (in 2016 and 2020). “All of our open data releases are now available through the ATLAS open data website,” says Dilia Portillo, ATLAS Outreach and Education co-Coordinator. “The website includes multi-level documentation, video tutorials and online tools aimed at the full spectrum of users, from high school students to senior particle physics researchers. In addition, the software used to create the education-use open data has been released. This provides a seamless transition from the research open data to all the tutorials for outreach and education, including newly updated Higgs-boson discovery documentation. With a bit of time and dedication, you can go from being a relative novice to carrying out your own analysis.”
The ATLAS open data website also serves as a hub for the community, which includes teachers, students, enthusiasts and, now, scientists. Anyone diving into the open data can also directly engage with ATLAS physicists, who are available to respond to user feedback and take suggestions.
This release marks the start of more to come, with ATLAS’ first release of lead-lead-nuclei collision data up next. The ATLAS Collaboration, along with the other main LHC experiment collaborations, has committed to making all of its data publicly accessible after a certain time. Openness is deeply ingrained in the culture of high-energy physics, enabling greater accessibility, reproducibility and better science.
Begin your journey with ATLAS open data by following one of the guides below.
Get started with ATLAS open data:
- Quick Start: beginner’s guide to the 13 TeV dataset released for education. No downloads required.
- Deep Dive: more in-depth tutorial designed for students or teachers who want to use ATLAS open data resources over multiple sessions and download data.
- Researcher Toolkit: documentation for expert users looking to perform detailed analyses.