After a short, but intense, “intermezzo” as editor of the CERN Courier, I’m stepping down as I head off to new challenges. I would like to thank the CERN Courier Advisory Board and the many contributors who are the backbone of the magazine. I would also like to thank Lisa Gibson, the production editor at IOP Publishing. Most importantly, I would like to say that the magazine would not be what it is without its faithful readership, whose feedback I have appreciated greatly during these months. I’m sure that they will continue to provide their support to the new editor, Matthew Chalmers. Antonella Del Rosso, CERN.
With the increase in the centre-of-mass energy provided by the Run 2 LHC collisions, the production cross-sections of many new-physics processes are predicted to rise much more steeply than those of the background processes. This increase in cross-section is not, however, the only way to enhance search sensitivities in Run 2. The higher energy also leads to particle production that is more highly boosted, and the large boosts collimate the decay products of the boosted object, which therefore overlap in the detector. For example, a Z boson that decays to a quark and an antiquark will normally produce two jets if it has a low boost. The same decay of a highly boosted Z boson will – in contrast – produce a single massive jet, because the decay products of the quark and antiquark merge. Jet-substructure observables, such as the jet mass or the so-called N-subjettiness, enhance the search sensitivity for boosted objects such as top (t) quarks or W, Z and Higgs bosons.
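For reference, the standard definition of N-subjettiness from the jet-substructure literature (not spelled out in the article) is

\tau_N \;=\; \frac{1}{d_0}\sum_{k} p_{T,k}\,\min\!\big(\Delta R_{1,k},\,\Delta R_{2,k},\,\ldots,\,\Delta R_{N,k}\big),
\qquad d_0 \;=\; \sum_{k} p_{T,k}\,R_0,

where the sum runs over the jet constituents k, ΔR_{j,k} is the angular distance between constituent k and candidate subjet axis j, and R_0 is the jet radius. A small ratio τ_2/τ_1 (τ_3/τ_2) indicates a jet more compatible with two (three) subjets than with one (two), as expected for a boosted W/Z/Higgs boson (top quark) rather than an ordinary quark or gluon jet.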
CMS has retuned and optimised these techniques for Run 2 analyses, implementing the latest ideas and algorithms from the realm of QCD and jet-substructure phenomenology. It has been a collaboration-wide effort to commission these tools for analysis use, relying on experts in jet reconstruction and bottom-quark tagging, and on data-analysis techniques from many groups in CMS. These new algorithms significantly improve the identification efficiency of boosted objects compared with Run 1.
Several Run 2 CMS studies probing the boosted regime have already appeared, using the 2015 data set. While searches for boosted objects are pursued by many CMS analysis groups, the Beyond Two Generations (B2G) group focuses specifically on final states composed of one or more boosted objects. Signal processes of interest in the B2G group include W′ → tb and diboson (VV/VH/HH) resonances, where W′ represents a new heavy W boson, “V” a W or Z boson, and H a Higgs boson. Other B2G studies focus on searches for pair- or singly produced vector-like quarks T and B through the decays T → Wb and B → tW. The search range for these novel particles generally lies between 700 GeV and 4 TeV, so their decays yield many boosted objects.
Another study in the B2G group is the search for a more massive version (Z′) of the Z boson, decaying to a top-quark pair (Z′ → tt). This search is performed in the semileptonic decay channel, for which the final state consists of a boosted top-quark candidate, a lepton, missing transverse momentum, and a tagged bottom-quark jet. Here, the boosted topology affects the reconstruction not only of the top-quark candidate but also of the lepton, whose isolation can be spoiled by the nearby bottom-quark jet; again, special identification criteria are implemented to maintain a high signal acceptance. This analysis excludes Z′ masses up to 3.4 (4.0) TeV for signal widths equal to 10% (30%) of the Z′ mass, already eclipsing the Run 1 limits. A complementary analysis in the all-hadronic topology is now under way – an event display showing two boosted top-quark candidates is shown in the figure. The three-subjet topology seen for each boosted top-quark candidate is as expected for such decays.
With these new boosted-object reconstruction techniques now implemented and commissioned for Run 2, CMS anxiously awaits possible discoveries with the 2016 LHC data set.
Eighty physicists gathered in Manchester on 6 and 7 April to discuss the future of the LHCb experiment. The LHCb collaboration is currently constructing a significant upgrade to its detector, which will be installed in 2019/2020. The Manchester workshop, entitled Theatre of Dreams: Beyond the LHCb Phase-I Upgrade, explored the longer-term future of the experiment in the second half of the coming decade, and thereafter.
In the mid-2020s, the LHC and the ATLAS and CMS experiments will be upgraded for high-luminosity LHC operation, necessitating a long shutdown of at least 2.5 years. The Manchester meeting discussed enhancements to the LHCb experiment, dubbed a “Phase-Ib upgrade”, that could be installed at this time. Although relatively modest, these improvements could bring significant physics benefits to the experiment. Possibilities discussed included an addition to the particle-identification system using an innovative Cherenkov-light-based time-of-flight system; placing detector chambers along the sides of the LHCb dipole to extend the physics reach by reconstructing lower-momentum particles; and replacing the inner region of the electromagnetic calorimeter with new technology, thereby extending the experiment’s measurement programme with photons, neutral pions and electrons.
Around 2030, the upgraded LHCb experiment currently under construction will reach the end of its foreseen physics programme, and a Phase-II upgrade of the experiment may therefore be envisaged. During the meeting, the experimental-physics programme, the heavy-flavour-physics theory perspectives, and the anticipated reach of Belle II and the other LHC experiments were considered. The goal would be to collect an integrated luminosity of at least 300 fb⁻¹, with an instantaneous luminosity a factor of 10 above that of the first upgrade. Promising high-luminosity scenarios for LHCb from the LHC machine perspective were presented that would potentially allow this goal to be reached, and first thoughts were given on how the experiment might be modified to operate in this new environment.
Many interesting ideas were exchanged at the workshop and these will be followed up in the forthcoming months to identify the requirements and R&D programmes needed to bring these concepts to reality.
The meeting was sponsored by the Science and Technology Facilities Council, the Institute of Physics, the Institute for Particle Physics Phenomenology and the University of Manchester.
Astrophysics and cosmology have established that about 80% of the mass in the universe consists of dark matter. Dark matter and normal matter interact gravitationally, and they may also interact weakly, raising the possibility that collisions at the LHC may produce pairs of dark-matter particles.
With low interaction strength, dark-matter particles would escape the LHC detectors unseen, accompanied by visible Standard Model objects. These objects, such as a single jet, a photon, or a W, Z or Higgs boson, could either be produced in the interaction with the dark matter or radiated from the colliding partons. One result would be “mono-X” signals, so named because the Standard Model object, X, would appear alone, without visible particles balancing its momentum in the transverse plane of the detector.
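For orientation, the missing transverse momentum underpinning these signatures is defined in the standard way (a generic definition, not specific to the ATLAS analyses described here) as the negative vector sum of the transverse momenta of all visible reconstructed objects,

\vec{p}_{T}^{\,\mathrm{miss}} \;=\; -\sum_{i\in\mathrm{visible}} \vec{p}_{T,i},
\qquad E_{T}^{\mathrm{miss}} \;=\; \big|\vec{p}_{T}^{\,\mathrm{miss}}\big|,

so a single energetic jet or photon recoiling against invisible dark-matter particles appears together with large missing transverse momentum.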
During Run 1 of the LHC, ATLAS developed a broad programme of searches for mono-X signals. Now, new results from the ATLAS collaboration in the mono-jet and mono-photon channels are the first of these searches in the proton–proton collision data collected in 2015, after the increase of the LHC collision energy to 13 TeV. With only 3.2 fb⁻¹ of collisions, six times fewer than studied in Run 1, these first Run 2 results already achieve sensitivity to beyond-the-Standard-Model phenomena comparable to that of Run 1. In each search, the data with large missing transverse momentum are compared with data-driven estimates of Standard Model backgrounds. As an example, the background to the mono-jet search is known to 4–12%, an estimate nearly as precise as that obtained in the final Run 1 analysis. ATLAS has also released preliminary Run 2 results in the mono-Z, mono-W and mono-H channels.
If dark-matter production is observed, ATLAS has the potential to characterise the interaction itself. To produce dark matter in LHC collisions, the interaction must involve the constituent partons within the proton. If the interaction is mediated by s-channel exchange of a new boson, a decay back to the Standard Model partons could also occur.
The ATLAS collaboration has also released new results from the dijet search channel, where new phenomena could modify the smooth dijet invariant-mass distribution. With 3.6 fb⁻¹ of data, the search already surpasses the sensitivity of the Run 1 dijet searches for many kinds of signals. The dijet results are interpreted in a simplified model of dark-matter production, in which the dark boson has axial-vector couplings to quarks and to Dirac dark matter.
The results of the mono-photon, mono-jet and dijet searches are shown in figure 1, assuming a version of the axial-vector dark boson whose couplings to dark matter are four times stronger than those to Standard Model quarks. In this scenario, ATLAS dijet results exclude the existence of mediating particles with masses from about 600 GeV to 2 TeV. The mono-jet and mono-photon channels exclude the parameter space at lower mediator and dark-matter masses. For even larger ratios of the dark-matter-to-quark coupling values, dijet constraints quickly weaken, and mono-X searches play a more powerful role.
On the verge of new data-taking in 2016, with the LHC expected to deliver an order of magnitude more luminosity, mono-X and direct mediator searches at ATLAS are set to probe this and other models with unprecedented sensitivity.
The first p–Pb data-taking campaign at the LHC was undertaken as a test of the initial state of heavy-ion collisions (CERN Courier March 2014 p17 and CERN Courier October 2013 p17). It surprisingly revealed an enhancement of (identified) particle pairs with small relative azimuthal angle (CERN Courier January/February 2013 p9 and CERN Courier March 2013 p6), similar to that observed in Pb–Pb collisions, where such correlations are associated with collective effects such as elliptic flow. A deeper insight into the dynamics of p–Pb collisions is expected to come from measurements classifying events according to the collision centrality.
Hadrons carrying heavy flavour (charm or beauty quarks) are produced in initial hard scatterings, and their production rates in the absence of nuclear effects can be calculated using perturbative QCD. They are therefore well-calibrated probes that provide information on the nuclear effects at play in the initial and final state of the collision – such as the modification of the parton-distribution functions in nuclei, or the energy loss from rescattering between the produced particles – as well as on the dynamics of the heavy-ion collision.
The ALICE collaboration studied the centrality dependence of prompt D-meson production in p–Pb collisions by comparing the yields in various centrality classes with the binary-scaled yields in pp collisions at the same centre-of-mass energy, via the nuclear modification factor QpPb, defined as the ratio of these quantities. QpPb is equal to unity in the absence of nuclear effects. The D-meson QpPb, measured in event classes defined by percentiles of the energy of slow neutrons detected in the ZNA calorimeter at very large rapidity in the Pb-going direction (figure 1, above), is consistent with unity within uncertainties, i.e. with binary-collision scaling of the pp yield, independent of the geometry of the collision. There is no evidence for a modification of the spectral shape for pT ≥ 3 GeV/c in the initial or final state of p–Pb collisions.
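Written out explicitly (the standard definition, consistent with the description above), the nuclear modification factor for a given centrality class is

Q_{p\mathrm{Pb}} \;=\; \frac{\mathrm{d}N_{p\mathrm{Pb}}/\mathrm{d}p_{T}}{\langle N_{\mathrm{coll}}\rangle\,\mathrm{d}N_{pp}/\mathrm{d}p_{T}},

where ⟨N_coll⟩ is the average number of binary nucleon–nucleon collisions estimated for that class, so that QpPb = 1 corresponds to binary-collision scaling of the pp yield.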
The D-meson yields in p–Pb collisions were also studied as a function of the relative charged-particle multiplicity at mid-rapidity and at large rapidity (Pb-going direction), by evaluating the yields in multiplicity intervals relative to the multiplicity-integrated ones, Y/〈Y〉. While QpPb examines particle production in samples of 20% of the analysed events, this observable explores events from low to extremely high multiplicities, corresponding to as little as 5% (1%) of the analysed events in p–Pb (pp) collisions. These measurements are sensitive to the contribution of multiple-parton interactions in pp and p–Pb collisions. The D-meson yield (figure 1, right) increases faster than linearly as a function of the charged-particle multiplicity at mid-rapidity, a behaviour similar to that measured in pp collisions at 7 TeV. By contrast, the increase of the D-meson yields as a function of the charged-particle multiplicity in the Pb-going direction is consistent with linear growth. EPOS3 calculations describe the p–Pb results within uncertainties; the results at high multiplicity are better reproduced by the calculation that includes a viscous hydrodynamical evolution of the collision.
Charmed-meson measurements in p–Pb collisions have revealed intriguing features. The ALICE collaboration is looking forward to the higher-statistics p–Pb data sample to be collected by the end of 2016, which will allow for higher-precision measurements, bring information on the initial state of heavy-ion collisions and provide further constraints to small-system dynamics.
In April, the MoEDAL collaboration submitted its first physics-research publication, on a search for magnetic monopoles using a 160 kg prototype MoEDAL trapping detector that was exposed to 0.75 fb⁻¹ of 8 TeV pp collisions and subsequently removed and scanned with a SQUID magnetometer at ETH Zurich. This is the first time that a dedicated, scalable and reusable trapping array has been deployed at an accelerator facility.
The innovative MoEDAL detector (CERN Courier May 2010 p19) employs unconventional methodologies designed to search for highly ionising messengers of new physics, such as magnetic monopoles or massive (pseudo-)stable electrically charged particles from a number of beyond-the-Standard-Model scenarios. The largely passive MoEDAL detector is deployed at Point 8 on the LHC ring, sharing the intersection region with LHCb. It employs three separate detector systems: nuclear track detectors (NTDs), sensitive only to new physics; trapping volumes, which are uniquely able to capture particle messengers of physics beyond the Standard Model for further study in the laboratory; and a TimePix pixel-detector array that monitors MoEDAL’s radiation environment.
A unique property of the magnetic monopole is, of course, its magnetic charge. Imagine that a magnetic monopole traverses the wire coil of a superconducting quantum interference device (SQUID). As the monopole approaches the coil, its magnetic charge drives an electrical current in the coil. Because the wire is superconducting, with no electrical resistance, the current continues to flow after the monopole has passed. The induced current depends only on the magnetic charge and is independent of the monopole’s speed and mass.
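For orientation, the textbook numbers behind this statement (not quoted in the article) are as follows: a monopole of magnetic charge g passing completely through a superconducting loop changes the flux threading the loop by

\Delta\Phi \;=\; \mu_{0}\,g \;=\; \frac{h}{e} \;=\; 2\Phi_{0}
\quad \text{for one Dirac charge } g_{D},
\qquad \Phi_{0} = \frac{h}{2e} \approx 2.07\times10^{-15}\,\mathrm{Wb},

and the corresponding step in the persistent current of a loop of inductance L is ΔI = ΔΦ/L – independent of the monopole’s speed and mass, as stated above.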
In the early 1980s, Blas Cabrera was the first to deploy a SQUID device (CERN Courier April 2001 p12) in an experiment to directly detect magnetic monopoles from the cosmos. The MoEDAL detector can also directly detect magnetic charge using SQUID technology, but in a different way. Rather than the monopole being directly detected in the SQUID coil à la Cabrera, MoEDAL captures the monopoles – in this case produced in LHC collisions – in aluminium trapping volumes that are subsequently monitored by a single SQUID magnetometer.
No evidence for trapped monopoles was seen in the data analysed for MoEDAL’s first physics publication. The resulting mass limit for the production of monopoles with a single Dirac (magnetic) charge (1gD) is roughly half that of the recent ATLAS 8 TeV result. However, the mass limits for the production of monopoles with the higher charges 2gD and 3gD are the first at the LHC to date, and are superior to those from previous collider experiments. Figure 1 shows the cross-section upper limits for the production of spin-1/2 monopoles by the Drell–Yan (DY) mechanism with charges up to 4gD. Additionally, a model-independent 95% CL upper limit was obtained for monopole charges up to 6gD and masses up to 3.5 TeV, again demonstrating MoEDAL’s superior acceptance for high charges.
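For context, the generic relation used to turn an event count into such a limit (not a formula quoted by MoEDAL) is

\sigma_{95\%} \;=\; \frac{N_{95}}{\varepsilon\,\mathcal{L}_{\mathrm{int}}},

where N_95 is the 95% CL upper limit on the number of signal events, ε is the acceptance times the trapping-and-detection efficiency for a monopole of a given charge and mass, and L_int = 0.75 fb⁻¹ is the integrated luminosity of this exposure.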
Despite a relatively small solid-angle coverage and modest integrated luminosity, MoEDAL’s prototype monopole trapping detector probed ranges of charge, mass and energy inaccessible to the other LHC experiments. The full detector system containing 0.8 tonnes of aluminium trapping detector volumes and around 100 m2 of plastic NTDs was installed late in 2014 for the LHC start-up at 13 TeV in 2015. The MoEDAL collaboration is now working on the analysis of data obtained from pp and heavy-ion running in 2015, with the exciting possibility of revolutionary discoveries to come.
The luminosity record in the charm–tau energy region was recently broken again, by the Beijing Electron–Positron Collider II (BEPCII). The new record, 1 × 10³³ cm⁻² s⁻¹ at a beam energy of 1.89 GeV, is also the design luminosity of the collider at its design beam energy.
BEPCII, the upgrade of BEPC (CERN Courier September 2008 p7), is a double-ring collider operating at beam energies of 1–2.3 GeV, with a design luminosity of 1 × 10³³ cm⁻² s⁻¹ at an optimised beam energy of 1.89 GeV. Given this performance, BEPCII can be seen as a charm–tau factory. Like BEPC, BEPCII is characterised as “one machine, two purposes”: it not only provides beams for high-energy-physics experiments, but also delivers synchrotron-radiation (SR) light to users in parasitic and dedicated modes.
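For reference, the luminosity of a collider with Gaussian bunches scales as (a generic expression, not the BEPCII-specific formula)

\mathcal{L} \;=\; \frac{f_{\mathrm{rev}}\, n_{b}\, N_{1} N_{2}}{4\pi\,\sigma_{x}^{*}\sigma_{y}^{*}}\,F,

where f_rev is the revolution frequency, n_b the number of colliding bunches, N_1 and N_2 the numbers of particles per bunch, σ*_x and σ*_y the transverse beam sizes at the interaction point, and F ≤ 1 a geometric reduction factor for the crossing angle and hourglass effect. This is why the luminosity-enhancing measures described below act on the beam currents, bunch numbers, emittance and bunch length.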
BEPCII is installed in the tunnel that hosted its predecessor, BEPC. Its electron and positron rings, called BER and BPR, respectively, have a circumference of 237.5 m. BER and BPR run in parallel and cross with an angle of 22 mrad at the interaction point (IP). At the point diametrically opposite the IP, the two rings cross again, separated by a vertical bump created for each beam by local correctors, as in the original design. The third ring, BSR, formed by connecting two half-rings of BER and BPR, has a circumference of 241.1 m and can be run as a dedicated synchrotron light source at 2.5 GeV with a maximum beam current of 250 mA.
Installation of BEPCII was completed in 2006. Since then, the machine, together with its new detector, BESIII, has passed the national acceptance checks and other tests. In mid-July 2009 the luminosity reached 3.2 × 10³² cm⁻² s⁻¹, and data-taking for high-energy physics started in August 2009. Besides running at the design energy of 1.89 GeV in 2010–2011, BEPCII has operated at other beam energies, from 1 GeV to 2.3 GeV, for different high-energy-physics measurements.
Enhancing measures
In the past seven years, several measures have been taken to enhance the peak and integrated luminosity:
• In 2010, a longitudinal feedback system was installed to suppress the longitudinal multibunch instability. During high-energy-physics data-taking, the horizontal betatron tunes of the two rings were moved very close to the half integer – 0.504 or 0.505. The luminosity at the design energy reached 5.21 × 10³² cm⁻² s⁻¹ in 2010, and 6.49 × 10³² cm⁻² s⁻¹ in 2011 with 720 mA/88 bunches/beam.
• In 2011, the vacuum chambers and eight magnets near the north crossing point were moved by 15 cm to mitigate the parasitic beam–beam interaction. The movement changed the layout of the machine and the beam separation from vertical to horizontal.
• The betatron tunes were changed from the region of (6.5, 5.5) to (7.5, 5.5), reducing the momentum compaction and shortening the bunch length. A luminosity of 7.08 × 10³² cm⁻² s⁻¹ with 735 mA/130 bunches/beam was achieved in 2013.
• The emittance was increased from 100 nm·rad to 128 nm·rad to increase the single-bunch current. The luminosity reached 8.53 × 10³² cm⁻² s⁻¹ with 700 mA/92 bunches/beam in late 2014.
High beam current is the main feature of this type of collider, and it is also a big challenge for BEPCII. The direct feedback of the radiofrequency system was turned on, which helps keep the higher beam currents stable. The transverse feedback system was another big challenge, and collisions themselves help to suppress the multibunch instability in the positron ring. The bunch pattern was also optimised carefully to increase the luminosity. Finally, thanks to the efforts of the whole BEPCII accelerator team, the design luminosity of 1 × 10³³ cm⁻² s⁻¹ – reached with 850 mA/120 bunches/beam at the design energy, and 100 times higher than the luminosity of BEPC at the same beam energy – was achieved at 22:29 on 5 April 2016. The transition from BEPC to BEPCII is now complete.
The High Energy Stereoscopic System (HESS) – an array of Cherenkov telescopes in Namibia – has detected gamma-ray emission from the central region of the Milky Way at energies never reached before. The likely source of this diffuse emission is the supermassive black hole at the centre of our Galaxy, which would have accelerated protons to peta-electron-volt (PeV) energies.
The Earth is constantly bombarded by high-energy particles (protons, electrons and atomic nuclei). Being electrically charged, these cosmic rays are randomly deflected by the turbulent magnetic field pervading our Galaxy. This makes it impossible to directly identify their source, and led to a century-long mystery as to their origin. A way to overcome this limitation is to look at gamma rays produced by the interaction of cosmic rays with light and gas in the neighbourhood of their source. These gamma rays travel in straight lines, undeflected by magnetic fields, and can therefore be traced back to their origin.
When a very-high-energy gamma ray reaches the Earth, it interacts with a molecule in the upper atmosphere, producing a shower of secondary particles that emit a short pulse of Cherenkov light. By detecting these flashes of light using telescopes equipped with large mirrors, sensitive photodetectors, and fast electronics, more than 100 sources of very-high-energy gamma rays have been identified over the past three decades. HESS is the only state-of-the-art array of Cherenkov telescopes that is located in the southern hemisphere – a perfect viewpoint for the centre of the Milky Way.
Earlier observations have shown that cosmic rays with energies up to approximately 100 tera-electron-volts (TeV) are produced by supernova remnants and pulsar-wind nebulae. Although theoretical arguments and direct measurements of cosmic rays suggest a galactic origin of particles up to PeV energies, the search for such a “Pevatron” accelerator has been unsuccessful, so far.
The HESS collaboration has now found evidence that there is a “Pevatron” in the central 33 light-years of the Galaxy. This result, published in Nature, is based on deep observations – obtained between 2004 and 2013 – of the surrounding giant molecular cloud, which extends over approximately 500 light-years. The production of PeV protons is deduced from the measured gamma-ray spectrum, a power law extending to multi-TeV energies with no sign of a high-energy cut-off. The spatial localisation comes from the observation that the cosmic-ray density falls off as 1/r, where r is the distance from the galactic centre; such a profile indicates quasi-continuous injection of protons from the centre over at least about 1000 years.
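The reasoning behind that last statement is the textbook behaviour of diffusing particles (given here for orientation, not taken from the paper): a source injecting protons at a constant rate Q̇ into a region with diffusion coefficient D builds up a steady-state density

w(r) \;\simeq\; \frac{\dot{Q}}{4\pi D\,r},

which is exactly the observed 1/r profile, whereas a single burst of injection would instead give a roughly flat density within the diffusion radius.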
Given these properties, the most plausible source of PeV protons is Sagittarius A*, the supermassive black hole at the centre of our Galaxy. According to the authors, the acceleration could originate in the accretion flow in the immediate vicinity of the black hole or further away, where a fraction of the material falling towards the black hole is ejected back into the environment. However, to account for the bulk of PeV cosmic rays detected on Earth, the currently quiet supermassive black hole would have had to be much more active in the past million years. If true, this finding would dramatically influence the century-old debate concerning the origin of these enigmatic particles.
CERN computing ready for the challenges of LHC Run 2
For Run 2, the LHC will continue to open the path to new discoveries by delivering up to one billion collisions per second to the experiments. At higher energy and intensity, the collisions are more complex to reconstruct and analyse, and the computing requirements are correspondingly greater. Run 2 is expected to deliver twice as much data as Run 1, around 50 PB per year. It is therefore an opportune moment to review LHC computing: what was done during the first long shutdown (LS1) in preparation for the higher collision rate and luminosity of Run 2, what can be achieved today, and what is planned for the future.
2015 saw the start of Run 2 of the LHC, in which the machine reached a proton–proton collision energy of 13 TeV – the highest ever reached by a particle accelerator. The beam intensity also increased and, by the end of 2015, 2240 proton bunches per beam were being collided. This year, the LHC will continue to open the path to new discoveries by providing up to one billion collisions per second to ATLAS and CMS. At higher energy and intensity, collision events are more complex to reconstruct and analyse, so the computing requirements must increase accordingly. Run 2 is anticipated to yield twice the data produced in the first run, about 50 petabytes (PB) per year. It is therefore an opportune time to look at the LHC’s computing: what was achieved during Long Shutdown 1 (LS1) to keep up with the collision-rate and luminosity increases of Run 2, how it is performing now, and what is foreseen for the future.
LS1 upgrades and Run 2
The Worldwide LHC Computing Grid (WLCG) collaboration, the LHC experiment teams and the CERN IT department were kept busy as the accelerator complex entered LS1, not only with analysis of the large amount of data already collected at the LHC but also with preparations for the higher flow of data during Run 2. The latter entailed major upgrades of the computing infrastructure and services, lasting the entire duration of LS1.
Consolidation of the CERN data centre and inauguration of its extension in Budapest were two major milestones in the upgrade plan achieved in 2013. The main objective of the consolidation and upgrade of the Meyrin data centre was to secure critical information-technology systems. Such services can now keep running, even in the event of a major power cut affecting CERN. The consolidation also ensured important redundancy and increased the overall computing-power capacity of the IT centre from 2.9 MW to 3.5 MW. Additionally, on 13 June 2013, CERN and the Wigner Research Centre for Physics in Budapest inaugurated the Hungarian data centre, which hosts the extension of the CERN Tier-0 data centre, adding up to 2.7 MW capacity to the Meyrin-site facility. This substantially extended the capabilities of the Tier-0 activities of WLCG, which include running the first-pass event reconstruction and producing, among other things, the event-summary data for analysis.
Building a CERN private cloud (preview-courier.web.cern.ch/cws/article/cnl/38515) was required to remotely manage the capacity hosted at Wigner, enable efficient management of the increased computing capacity installed for Run 2, and to provide the computing infrastructure powering most of the LHC grid services. To deliver a scalable cloud operating system, CERN IT started using OpenStack. This open-source project now plays a vital role in enabling CERN to tailor its computing resources in a flexible way and has been running in production since July 2013. Multiple OpenStack clouds at CERN successfully run simulation and analysis for the CERN user community. To support the growth of capacity needed for Run 2, the compute capacity of the CERN private cloud has nearly doubled during 2015, now providing more than 150,000 computing cores. CMS, ATLAS and ALICE have also deployed OpenStack on their high-level trigger farms, providing a further 45,000 cores for use in certain conditions when the accelerator isn’t running. Through various collaborations, such as with BARC (Mumbai, India) and between CERN openlab (see the text box, overleaf) and Rackspace, CERN has contributed more than 90 improvements in the latest OpenStack release.
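As an illustration of the kind of self-service provisioning such a private cloud offers, the sketch below uses the openstacksdk Python client to boot a virtual machine; the cloud name, image, flavour and network are hypothetical placeholders, not CERN’s actual configuration.

import openstack

# Credentials are read from a clouds.yaml file or OS_* environment variables.
conn = openstack.connect(cloud="my-openstack-cloud")  # hypothetical cloud name

image = conn.compute.find_image("cc7-base")           # hypothetical image name
flavor = conn.compute.find_flavor("m2.medium")        # hypothetical flavour
network = conn.network.find_network("internal")       # hypothetical network

# Boot the instance and wait until it reaches the ACTIVE state.
server = conn.compute.create_server(
    name="batch-worker-001",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)
print(server.status)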
As surprising as it may seem, LS1 was also a very busy period with regard to storage. Both the CERN Advanced STORage manager (CASTOR) and EOS, an open-source distributed disk-storage system developed at CERN and in production since 2011, went through major migrations or deployments. CASTOR relies on a tape-based back end for permanent data archiving, and LS1 offered an ideal opportunity to migrate the archived data from legacy cartridges and formats to higher-density ones. This involved migrating around 85 PB of data and was carried out in two phases during 2014 and 2015; as a result, no fewer than 30,000 tape-cartridge slots were freed to store more data. The 2015 EOS deployment brought storage at CERN to a new scale, enabling the research community to make use of 100 PB of disk storage in a distributed environment using tens of thousands of heterogeneous hard drives, with minimal data movement and dynamic reconfiguration. EOS currently stores 45 PB of data, with an installed capacity of 135 PB. Data preservation is essential, and more can be read on this aspect in “Data preservation is a journey”.
Databases play a significant role in storage, accelerator operations and physics. A great number of upgrades were performed, in both software and hardware, to rejuvenate platforms and to accompany the transformation of the CERN IT computing infrastructure and the needs of the accelerators and experiments. The control applications of the LHC migrated from a file-based archiver to a centralised infrastructure based on Oracle databases. The evolution of the database technologies deployed for the WLCG database services improved the availability, performance and robustness of the replication service, and new services have been implemented. The databases archiving the controls data can now handle, at peak, one million changes per second, compared with the previous 150,000 changes per second. This also benefits the controls of the quench-protection system of the LHC magnets, which has been modernised to operate the machine safely at 13 TeV. These upgrades and changes, which in some cases build on work accomplished as part of CERN openlab projects, have a strong impact on the increasing size and scope of the databases, as can be seen in the CERN databases diagram (above right).
To optimise computing and storage resources in Run 2, the experiments have adopted new computing models. These models move away from the strict hierarchical roles of the tiered centres described in the original WLCG model towards a peer-site model, and make more effective use of the capabilities of all sites. This is coupled with significant changes in data-management strategies, away from the explicit global placement of data sets towards a much more dynamic system that replicates data only when necessary; remote access to data is now also allowed under certain conditions. These “data federations”, which optimise the use of expensive disk space, are possible because of the greatly improved networking capabilities made available to WLCG over the past few years. The experiment collaborations also invested significant effort during LS1 to improve the performance and efficiency of their core software, with extensive work to validate the new software and frameworks in readiness for the expected increase in data. Thanks to these gains, only a doubling of the CPU and storage capacity was needed to manage the increased data rate and complexity of Run 2 – without them, a much greater capacity would have been required.
Despite the upgrades and developments mentioned, additional computing resources are always needed, notably for simulations of physics events or of accelerator and detector upgrades. In recent years, volunteer computing has played an increasing role in this domain; the volunteer capacity now corresponds to about half the capacity of the CERN batch system. Since 2011, thanks to virtualisation, the use of LHC@home has been greatly extended, with about 2.7 trillion events simulated. Following this success, ATLAS became the first experiment to join LHC@home, with volunteer capacity ramping up steadily over the past 18 months and a production rate now equivalent to that of a WLCG Tier-2 site.
In terms of networking, LS1 provided the opportunity to increase bandwidth and improve redundancy at various levels. The data-transfer rates between some of the detectors (ATLAS, ALICE) and the Meyrin data centre were increased by factors of two and four, respectively. A third circuit has been ordered in addition to the two dedicated, redundant 100 Gbit/s circuits that have connected the CERN Meyrin site and the Wigner site since 2013. The LHC Optical Private Network (LHCOPN) and the LHC Open Network Environment (LHCONE) have evolved to serve the networking requirements of the new computing models for Run 2. LHCOPN, reserved for LHC data transfers and analysis and connecting the Tier-0 and Tier-1 sites, benefitted from bandwidth increases from 10 Gbps to 20 and 40 Gbps. LHCONE has been deployed to meet the requirements of the new computing models of the LHC experiments, which demand the transfer of data among any pair of Tier-1, Tier-2 and Tier-3 sites. As of the start of Run 2, LHCONE traffic represents no less than one third of European research-network traffic. Transatlantic connections have also improved steadily, with ESnet setting up three 100 Gbps links extending to CERN through Europe, replacing the five 10 Gbps links used during Run 1.
With the start of Run 2, supported by these upgrades and improvements of the computing infrastructure, new data-taking records were achieved: 40 PB of data were successfully written on tape at CERN in 2015; out of the 30 PB from the LHC experiments, a record-breaking 7.3 PB were collected in October; and up to 0.5 PB of data were written to tape each day during the heavy-ion run. By way of comparison, CERN’s tape-based archive system collected in the region of 70 PB of data in total during the first run of the LHC, as shown in the plot (right). In total, today, WLCG has access to some 600,000 cores and 500 PB of storage, provided by the 170 collaborating sites in 42 countries, which enabled the Grid to set a new record in October 2015 by running a total of 51.1 million jobs.
Looking into the future
With the LHC’s computing now well on track with Run 2 needs, the WLCG collaboration is looking further into the future, already focusing on the two phases of upgrades planned for the LHC. The first phase (2019–2020) will see major upgrades of ALICE and LHCb, as well as increased luminosity of the LHC. The second phase – the High Luminosity LHC project (HL-LHC), in 2024–2025 – will upgrade the LHC to a much higher luminosity and increase the precision of the substantially improved ATLAS and CMS detectors.
The requirements for data and computing will grow dramatically during this time, with rates of 500 PB/year expected for the HL-LHC. The processing needs are expected to increase more than 10 times over and above what technology evolution will provide. As a consequence, partnerships such as CERN openlab and other R&D programmes are essential to investigate how the computing models could evolve to address these needs. They will focus on applying more intelligence to filtering and selecting data as early as possible; on investigating the distributed infrastructure itself (the grid) and how best to make use of available technologies and opportunistic resources (grid, cloud, HPC, volunteer computing, etc.); and on improving software performance to optimise the overall system.
Building on many initiatives that have used large-scale commercial cloud resources for similar cases, the Helix Nebula – the Science Cloud (HNSciCloud) pre-commercial procurement (PCP) project may bring interesting solutions. The project, led by CERN, started in January 2016 and is co-funded by the European Commission. HNSciCloud pulls together commercial cloud-service providers, publicly funded e-infrastructures and the in-house resources of a group of 10 buyers to build a hybrid cloud platform, on top of which a competitive marketplace of European cloud players can develop their own services for a wider range of users. It aims to bring Europe’s technical-development, policy and procurement activities together to remove fragmentation and maximise exploitation; the alignment of commercial and public (regional, national and European) strategies will increase the rate of innovation.
To improve software performance, the High Energy Physics (HEP) Software Foundation, a major new long-term activity, has been initiated. This seeks to address the optimal use of modern CPU architectures and encourage more commonality in key software libraries. The initiative will provide underlying support for the significant re-engineering of experiment core software that will be necessary in the coming years.
In addition, there is a great deal of interest in investigating new ways of data analysis: global queries, machine learning and many more. These are all significant and exciting challenges, but it is clear that the LHC’s computing will continue to evolve, and that in 10 years it will look very different, while still retaining the features that enable global collaboration.
R&D collaboration with CERN openlab
CERN openlab is a unique public–private partnership that has accelerated the development of cutting-edge solutions for the worldwide LHC community and wider scientific research since 2001. Through CERN openlab, CERN collaborates with leading ICT companies and research institutes. Testing in CERN’s demanding environment provides the partners with valuable feedback on their products, while allowing CERN to assess the merits of new technologies in their early stages of development for possible future use. In January 2015, CERN openlab entered its fifth three-year phase.
The topics addressed in CERN openlab’s fifth phase were defined through discussion and collaborative analysis of requirements. This involved CERN openlab industrial collaborators, representatives of CERN, members of the LHC experiment collaborations, and delegates from other international research organisations. The topics include next-generation data-acquisition systems, optimised hardware- and software-based computing platforms for simulation and analysis, scalable and interoperable data storage and management, cloud-computing operations and procurement, and data-analytics platforms and applications.
As an organisation with more than 60 years of history, CERN has created large volumes of “data” of many different types. This includes not only scientific data – by far the largest in terms of volume – but also many other types (photographs, videos, minutes, memoranda, web pages and so forth). Sadly, some of this information from as recently as the 1990s, such as the first CERN web pages, has been lost, as has, more notably, much of the data from numerous pre-LEP experiments. Today, things look rather different, with concerted efforts across the laboratory to preserve its “digital memory”. This concerns not only “born-digital” material but also what is still available from the pre-digital era. Whereas the latter often existed (and luckily often still exists) in multiple physical copies, the fate of digital data can be more precarious. This led Vint Cerf, vice-president of Google and an early internet pioneer, to declare in February 2015: “We are nonchalantly throwing all of our data into what could become an information black hole without realising it.” This is a situation that we have to avoid for all CERN data – it’s our legacy.
Interestingly, many of the tools that are relevant for preserving data from the LHC and other experiments are also suitable for other types of data. Furthermore, there are models that are widely accepted across numerous disciplines for how data preservation should be approached and how success against agreed metrics can be demonstrated.
Success, however, is far from guaranteed: the tools involved have had a lifetime that is much shorter than the desired retention period of the current data, and so constant effort is required. Data preservation is a journey, not a destination.
The basic model that more or less all data-preservation efforts worldwide adhere to – or at least refer to – is the Open Archival Information System (OAIS) model, for which there is an ISO standard (ISO 14721:2012). Related to this are a number of procedures for auditing and certifying “trusted digital repositories”, including another ISO standard – ISO 16363.
This certification requires, first and foremost, a commitment by “the repository” (CERN in this case) to “the long-term retention of, management of, and access to digital information”.
In conjunction with numerous more technical criteria, certification is therefore a way of demonstrating that specific goals regarding data preservation are being, and will be, met. For example, will we still be able to access and use data from LEP in 2030? Will we be able to reproduce analyses on LHC data up until the “FCC era”?
In the context of the Worldwide LHC Computing Grid (WLCG), self-certification of, initially, the Tier-0 site is currently under way. This is a first step prior to possible formal certification, certification of other WLCG sites (e.g. the Tier-1s), and even certification of CERN as a whole. This could cover not only current and future experiments but also the “digital memory” of non-experiment data.
What would this involve and what consequences would it have? Fortunately, many of the metrics that make up ISO 16363 are part of CERN’s current practices. To pass an audit, quite a few of these would have to be formalised into official documents (stored in a certified digital repository with a digital object identifier): there are no technical difficulties here but it would require effort and commitment to complete. In addition, it is likely that the ongoing self-certification will uncover some weak areas. Addressing these can be expected to help ensure that all of our data remains accessible, interpretable and usable for long periods of time: several decades and perhaps even longer. Increasingly, funding agencies are requiring not only the preservation of data generated by projects that they fund, but also details of how reproducibility of results will be addressed and how data will be shared beyond the initial community that generated it. Therefore, these are issues that we need to address, in any event.
A reasonable target by which certification could be achieved would be prior to the next update of the European Strategy for Particle Physics (ESPP), and further updates of this strategy would offer a suitable frequency of checking that the policies and procedures were still effective.
The current status of scientific data preservation in high-energy physics owes much to the Study Group that was initiated at DESY in late 2008/early 2009. This group published a “Blueprint document” in May 2012, and a summary of this was input to the 2012 ESPP update process. Since that time, effort has continued worldwide, with a new status report published at the end of 2015.
In 2016, we will profit from the first ever international data-preservation conference to be held in Switzerland (iPRES, Bern, 3–6 October) to discuss our status and plans with the wider data-preservation community. Not only do we have services, tools and experiences to offer, but we also have much to gain, as witnessed by the work on OAIS, developed in the space community, and related standards and practices.
High-energy physics is recognised as a leader in the open-access movement, and the tools in use for this, based on Invenio Digital Library software, have been key to our success. They also underpin more recent offerings, such as the CERN Open Data and Analysis Portals. We are also recognised as world leaders in “bit preservation”, where the 100+ PB of LHC (and other) data are proactively curated with increasing reliability (or decreasing occurrences of rare but inevitable loss of data), despite ever-growing data volumes. Finally, CERN’s work on virtualisation and versioning file-systems through CernVM and CernVM-FS has already demonstrated great potential for the highly complex task of “software preservation”.