In November, Fermilab became a research member of CERN openlab – a public-private partnership between CERN and major ICT companies established in 2001 to meet the demands of particle-physics research. Fermilab researchers will now collaborate with members of the LHC’s CMS experiment and the CERN IT department to improve technologies related to physics data reduction, which is vital for gaining insights from the vast amounts of data produced by high-energy physics experiments.
The work will take place within an existing CERN openlab project with Intel on big-data analytics. The goal is to use industry-standard big-data tools to create a new tool for filtering many petabytes of heterogeneous collision data to create manageable, but still rich, datasets of a few terabytes for analysis. Using current systems, this kind of targeted data reduction can often take weeks, but the Intel-CERN project aims to reduce it to a matter of hours.
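The targeted reduction described above can be pictured as a filter-and-project step: select only the events that pass a study's criteria, and keep only the columns that study needs. The toy sketch below illustrates the idea in plain Python over an in-memory dataset; the field names (`n_muons`, `missing_et`), the cuts, and the record layout are all illustrative assumptions, not the project's actual selection criteria or tool stack.

```python
# Toy illustration of physics data reduction: filter a collection of
# heterogeneous event records down to a small, analysis-ready subset
# that keeps only the fields a given study needs.

def reduce_events(events, min_muons=2, min_missing_et=30.0):
    """Select events passing illustrative cuts and project to a slim schema."""
    slim = []
    for ev in events:
        if ev.get("n_muons", 0) >= min_muons and ev.get("missing_et", 0.0) >= min_missing_et:
            # Keep only the columns needed downstream; the bulky raw
            # detector payload is dropped at this stage.
            slim.append({"event_id": ev["event_id"],
                         "n_muons": ev["n_muons"],
                         "missing_et": ev["missing_et"]})
    return slim

# Example: three raw events, each carrying a bulky payload; only one passes.
raw = [
    {"event_id": 1, "n_muons": 2, "missing_et": 45.0, "raw_hits": [0.1] * 50},
    {"event_id": 2, "n_muons": 0, "missing_et": 80.0, "raw_hits": [0.2] * 50},
    {"event_id": 3, "n_muons": 3, "missing_et": 10.0, "raw_hits": [0.3] * 50},
]
print(reduce_events(raw))  # only event 1 passes both cuts
```

At petabyte scale the same filter-and-project pattern would be expressed in a distributed big-data framework rather than a Python loop, but the shape of the computation is the same.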
The team plans to first create a prototype capable of processing 1 PB of data with about 1000 computer cores. Based on current projections, this is about one twentieth of the scale of the final system that would be needed to handle the data produced when the High-Luminosity LHC comes online in 2026. “This kind of work, investigating big-data analytics techniques, is vital for high-energy physics — both in terms of physics data and data from industrial control systems on the LHC,” says Maria Girone, CERN openlab CTO.
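As a back-of-envelope check of the figures above, assuming (purely for illustration) that data volume and core count both scale linearly from the prototype to the final system:

```python
# Back-of-envelope scaling from the prototype figures quoted in the text.
# Assumption (not stated in the article): data volume and core count
# grow by the same factor of ~20 from prototype to final system.
prototype_data_pb = 1     # prototype processes ~1 PB
prototype_cores = 1000    # with ~1000 cores
scale_factor = 20         # prototype is ~1/20 the scale of the final system

full_data_pb = prototype_data_pb * scale_factor
full_cores = prototype_cores * scale_factor
print(full_data_pb, full_cores)  # roughly 20 PB and 20,000 cores
```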