
Facing up to the exabyte era

13 October 2017

The high-luminosity Large Hadron Collider (HL-LHC) will dramatically increase the rate of particle collisions compared with today’s machine, boosting the potential for discoveries. In addition to extensive work on CERN’s accelerator complex and the LHC detectors, this second phase in the LHC’s life will generate unprecedented data challenges.

The increased rate of collisions makes the task of reconstructing events (piecing together the underlying collisions from millions of electrical signals read out by the LHC detectors) significantly more complex. At the same time, the LHC experiments are planning to employ more flexible trigger systems that can collect a greater number of events. These factors will drive a huge increase in computing needs for the start of the HL-LHC era in around 2026. Using current software, hardware and analysis techniques, the required computing capacity is roughly 50–100 times higher than today, with data storage alone expected to enter the exabyte (10¹⁸ bytes) regime.

It is reasonable to expect that technology improvements over the next seven to ten years will yield an improvement of around a factor of 10 in both processing and storage capabilities for no extra cost. While this will go some way towards addressing the HL-LHC’s requirements, it will still leave a significant deficit. With budgets unlikely to increase, it will not be possible to solve the problem by simply increasing the total computing resources available. It is therefore vital to explore new technologies and methodologies in conjunction with the world’s leading information and communication technology (ICT) companies.
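To make the size of that deficit concrete, the figures quoted above can be combined in a back-of-the-envelope calculation. This is only a rough sketch: it assumes the 50–100 times requirement and the factor-of-10 technology gain simply divide, and the function and variable names are illustrative, not taken from any CERN software.

```python
def residual_shortfall(required_factor: float, technology_gain: float) -> float:
    """Factor by which demand still exceeds capacity at flat budget.

    Assumes the two factors combine by simple division, which is a
    simplification made for illustration only.
    """
    return required_factor / technology_gain


# Figures quoted in the article: capacity needs 50-100 times today's,
# and technology improvements worth roughly a factor of 10 at no extra cost.
REQUIRED_RANGE = (50.0, 100.0)
TECHNOLOGY_GAIN = 10.0

low, high = (residual_shortfall(r, TECHNOLOGY_GAIN) for r in REQUIRED_RANGE)
print(f"Remaining shortfall: {low:.0f}x to {high:.0f}x")
```

Under these assumptions the remaining gap is a factor of about five to 10, which is the part that has to come from new technologies and methodologies rather than from hardware progress alone.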

CERN openlab, which was established by the CERN IT department in 2001, is a public–private partnership that enables CERN to collaborate with ICT companies to meet the demands of particle-physics research. Since the start of this year, CERN openlab has carried out an in-depth consultation to identify the main ICT challenges faced by the LHC research community over the coming years. Based on our findings, we published a white paper in September on future ICT challenges in scientific research.

The paper identifies 16 ICT challenge areas that need to be tackled in collaboration with industry, and these have been grouped into four overarching R&D topics. The first focuses on data-centre technologies to ensure that: data-centre architectures are flexible and cost effective; cloud-computing resources can be used in a scalable, hybrid manner; new technologies for solving storage-capacity issues are thoroughly investigated; and long-term data-storage systems are reliable and economically viable. The second major R&D topic relates to the modernisation of code, so that the maximum performance can be achieved on the new hardware platforms available. The third R&D topic focuses on machine learning, in particular its potentially large role in monitoring the accelerator chain and optimising the use of ICT resources.

The fourth R&D topic in the white paper identifies ICT challenges that are common across research disciplines. With ever more research fields such as astrophysics and biomedicine adopting big-data methodologies, it is vital that we share tools and learn from one another – in particular to ensure that leading ICT companies are producing solutions that meet our common needs.

In summary, CERN openlab has identified ICT challenges that must be tackled over the coming years to ensure that physicists worldwide can get the most from CERN’s infrastructure and experiments. In addition, the white paper highlights the emergence of new technology paradigms, from pervasive ultra-fast networks of smart sensors in the “internet of things” to machine learning and “smart everything” approaches. These technologies could revolutionise the way big science is done, particularly in terms of data analysis and the control of complex systems, and also have enormous potential for the benefit of wider society. CERN openlab, with its unique collaboration with several of the world’s leading IT companies, is ideally positioned to help make this a reality.

• openlab.cern.
