CMS gears up for the LHC data deluge

12 August 2016

Trigger upgrade helps CMS tame collision environment of LHC Run 2

ATLAS and CMS, the large general-purpose experiments at CERN’s Large Hadron Collider (LHC), produce enormous data sets. Bunches of protons circulating in opposite directions around the LHC pile into each other every 25 nanoseconds, flooding the detectors with particle debris. Recording every collision would produce data at an unmanageable rate of around 50 terabytes per second. To reduce this volume for offline storage and processing, the experiments use an online filtering system called a trigger. The trigger system must remove the data from 99.998% of all LHC bunch crossings but keep the tiny fraction of interesting data that drives the experiment’s scientific mission. The decisions made in the trigger, which ultimately dictate the physics reach of the experiment, must be made in real time and are irrevocable.
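The scale of this reduction can be checked with a little arithmetic. The sketch below is purely illustrative: it assumes that every 25 ns slot contains a bunch crossing (a nominal 40 MHz) and that the data volume is spread evenly over crossings. The variable names are ours and do not come from any CMS software.

```python
# Back-of-envelope check of the figures quoted above (illustrative only).
BUNCH_SPACING_NS = 25                 # time between bunch crossings
RAW_RATE_BYTES_PER_S = 50 * 10**12    # ~50 terabytes per second if everything were kept
REJECTED_PER_100K = 99_998            # 99.998% of crossings are discarded

# Assumption: every 25 ns slot holds a crossing, giving a nominal 40 MHz.
crossings_per_s = 10**9 // BUNCH_SPACING_NS

# Average data volume per crossing implied by the quoted raw rate.
bytes_per_crossing = RAW_RATE_BYTES_PER_S // crossings_per_s

# Crossings and data volume that survive the trigger each second.
kept_crossings_per_s = crossings_per_s * (100_000 - REJECTED_PER_100K) // 100_000
kept_bytes_per_s = kept_crossings_per_s * bytes_per_crossing

print(crossings_per_s, bytes_per_crossing, kept_crossings_per_s, kept_bytes_per_s)
```

Under these assumptions each crossing carries roughly 1.25 MB, and the trigger keeps on the order of a thousand crossings, or about a gigabyte of data, per second.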

The trigger system of the CMS experiment has two levels. The first, Level-1, is built from custom electronics in the CMS underground cavern, and reduces the rate of selected bunch crossings from 40 MHz to less than 100 kHz. There is a period of only four microseconds during which a decision must be reached, because data cannot be held within the on-detector memory buffers for longer than this. The second level, called the High Level Trigger (HLT), is software-based. Approximately 20,000 commercial CPU cores, housed in a building on the surface above the CMS cavern, run software that further reduces the crossing rate to an average of about 1 kHz. This is low enough to transfer the remaining data to the CERN Data Centre for permanent storage.
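The numbers in this paragraph imply a rejection factor for each level, a depth for the on-detector buffers and a rough time budget for the HLT software. The following sketch works these out. It treats 100 kHz as the Level-1 output rate (the text gives it as an upper bound) and assumes that each HLT core processes one crossing at a time; both are simplifications made for illustration.

```python
# Rates and latencies quoted in the text.
BUNCH_SPACING_NS = 25
CROSSING_RATE_HZ = 40_000_000
LEVEL1_LATENCY_NS = 4_000        # four microseconds
LEVEL1_OUTPUT_HZ = 100_000       # upper bound, used here as the working value
HLT_OUTPUT_HZ = 1_000
HLT_CORES = 20_000

# Rejection factor achieved by each trigger level.
level1_rejection = CROSSING_RATE_HZ // LEVEL1_OUTPUT_HZ
hlt_rejection = LEVEL1_OUTPUT_HZ // HLT_OUTPUT_HZ

# Number of crossings the detector buffers must hold while Level-1 decides.
pipeline_depth = LEVEL1_LATENCY_NS // BUNCH_SPACING_NS

# Average processing time available per crossing in the HLT,
# assuming one crossing per core at a time (our simplification).
hlt_budget_ms = 1_000 * HLT_CORES // LEVEL1_OUTPUT_HZ

print(level1_rejection, hlt_rejection, pipeline_depth, hlt_budget_ms)
```

In this picture Level-1 rejects at least 399 of every 400 crossings while 160 crossings wait in the buffers, and the HLT has about 200 ms on average to remove a further 99 in every 100.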

The original trigger system served CMS well during Run 1 of the LHC, which provided high-energy collisions at up to 8 TeV from 2010 to 2013. Designed in the late 1990s and operational by 2008, the system allowed the CMS collaboration to co-discover the Higgs boson in multiple final-state topologies. Among hundreds of other CMS measurements, it also allowed us to observe the rare decay Bs → μμ with a significance of 4.3σ.

In Run 2 of the LHC, which got under way last year, CMS faces a much more challenging collision environment. The LHC now delivers both an increased centre-of-mass energy of 13 TeV and increased luminosity beyond the original LHC design of 10³⁴ cm⁻² s⁻¹. While these improve the detector’s capability to observe rare physics events, they also result in severe event “pile-up” due to multiple overlapping proton collisions within a single bunch crossing. This effect not only makes it much harder to select useful crossings, it can also drive trigger rates beyond what can be tolerated. This could be partially mitigated by raising the energy thresholds for the selection of certain particles. However, it is essential that CMS maintains its sensitivity to physics at the electroweak scale, both to probe the couplings of the Higgs boson and to catch glimpses of any physics beyond the Standard Model. An improved trigger system is therefore required that makes use of the most up-to-date technology to maintain or improve on the selection criteria used in Run 1.

Thinking ahead

In anticipation of these challenges, CMS has successfully completed an ambitious “Phase-1” upgrade to its Level-1 trigger system that has been deployed for operation this year. Trigger rates are reduced via several criteria: tightening isolation requirements on leptons; improving the identification of hadronic tau-lepton decays; increasing muon momentum resolution; and using pile-up energy subtraction techniques for jets and energy sums. We also employ more sophisticated methods to make combinations of objects for event selection, which is accomplished by the global trigger system (see figure 1).
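Two of these criteria, pile-up subtraction and lepton isolation, can be illustrated with a minimal sketch. The functions below show a generic area-based subtraction and a simple relative-isolation cut; they are not the algorithms implemented in the CMS firmware, and the function names, parameters and numerical values are hypothetical.

```python
def pileup_subtracted_et(raw_et: float, area: float, pileup_density: float) -> float:
    """Remove the estimated pile-up contribution from a jet's transverse energy.

    pileup_density is an assumed average pile-up energy per unit area for the
    crossing; the result is clamped at zero so that it is never negative.
    """
    return max(0.0, raw_et - pileup_density * area)


def is_isolated(candidate_et: float, surrounding_et: float, max_fraction: float) -> bool:
    """Accept a lepton candidate only if the energy around it is small
    compared with the energy of the candidate itself."""
    return surrounding_et <= max_fraction * candidate_et


# A 50 GeV jet of area 4 units in a crossing with 2.5 GeV of pile-up per unit area.
print(pileup_subtracted_et(50.0, 4.0, 2.5))
# A 40 GeV lepton candidate with 4 GeV of nearby energy and a 20% isolation cut.
print(is_isolated(40.0, 4.0, 0.2))
```

Tightening the isolation requirement corresponds to lowering `max_fraction`, which rejects more background candidates while keeping the energy threshold itself unchanged.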

These new features have been enabled by the use of the most up-to-date Field Programmable Gate Array (FPGA) processors, which provide up to 20 times more processing capacity and 10 times more communication throughput than the technology used in the original trigger system. The use of reprogrammable FPGAs throughout the system offers huge flexibility, and the use of fully optical communications in a standardised telecommunication architecture (microTCA) makes the system more reliable and easier to maintain compared with the previous VME standard used in high-energy physics for decades (see Decisions down to the wire).

Decisions down to the wire

Overall, about 70 processors make up the CMS Level-1 trigger upgrade. All processors make use of the large-capacity Virtex-7 FPGA from the Xilinx Corporation, and three board variants were produced. The first calorimeter trigger layer uses the CTP7 board, which features an on-board Zynq system-on-chip from Xilinx for control and monitoring. The second calorimeter trigger layer, the barrel muon processors, and the global trigger and global muon trigger use the MP7, which is a generic symmetric processor with 72 optical links for both input and output. Finally, a third, modular variant called the MTF7 is used for the overlap and end-cap muon trigger regions, and features a 1 GB memory mezzanine used for the momentum calculation in the end-cap region. This memory can store the calculation of the momentum from multiple angular inputs in the challenging forward region of CMS where the magnetic bending is small.

The Level-1 trigger requires very rapid access to detector information. This is currently provided by the CMS calorimeters and muon system, which have dedicated optical data links for this purpose. The calorimeter trigger system – which is used to identify electrons, photons, tau leptons, and jets, and also to measure energy sums – consists of two processing layers. The first layer is responsible for collecting the data from calorimeter regions, summing the energies from the electromagnetic and hadronic calorimeter compartments, and organising the data to allow efficient processing. These data are then streamed to a second layer of processors in an approach called time-multiplexing. The second layer applies clustering algorithms to identify calorimeter-based “trigger objects” corresponding to single particle candidates, jets or features in the overall transverse-energy flow of the collision. Time-multiplexing allows data from the entire calorimeter for one beam crossing to be streamed to a single processor at full granularity, avoiding the need to share data between processors. Improved energy and position resolutions for the trigger objects, along with the increased logic space available, allow more sophisticated trigger decisions.
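The two-layer data flow can be sketched as follows. This is a toy model rather than the CMS firmware: the tower indexing, the integer energy units and the number of second-layer processors (`n_nodes`) are hypothetical, and round-robin assignment is simply one straightforward way to realise time-multiplexing.

```python
from typing import Dict, Tuple

Tower = Tuple[int, int]  # hypothetical (eta index, phi index) of a calorimeter tower


def layer1_sum(ecal: Dict[Tower, int], hcal: Dict[Tower, int]) -> Dict[Tower, int]:
    """First layer: add electromagnetic and hadronic energies tower by tower."""
    towers = sorted(set(ecal) | set(hcal))
    return {tower: ecal.get(tower, 0) + hcal.get(tower, 0) for tower in towers}


def layer2_node(crossing: int, n_nodes: int) -> int:
    """Time-multiplexing: all towers of one crossing go to a single
    second-layer processor, chosen here in round-robin order."""
    return crossing % n_nodes


ecal = {(0, 0): 3}
hcal = {(0, 0): 2, (0, 1): 5}
print(layer1_sum(ecal, hcal))
print([layer2_node(crossing, 9) for crossing in range(12)])
```

Because each second-layer processor sees the whole calorimeter for its crossing, clustering can run without exchanging data with neighbouring processors; the price is that each processor only becomes free again after `n_nodes` crossings.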

The muon trigger system also consists of two layers. For the original trigger system, a separate trigger was provided from each of the three muon-detector systems employed at CMS: drift tubes (DT) in the barrel region; cathode-strip chambers (CSC) in the endcap regions; and resistive plate chambers (RPC) throughout the barrel and endcaps. Each system provides unique information useful for making a trigger decision; for example, the superior timing of the RPCs can correct the time assignment of DTs and CSC track segments, as well as provide redundancy in case a specific DT or CSC is malfunctioning.

In Run 2, we combine trigger segments from all of these units at an earlier stage than in the original system, and send them to the muon track-finding system in a first processing layer. This approach creates an improved, highly robust muon trigger that can take advantage of the specific benefits of each technology earlier in the processing chain. The second processing layer of the muon trigger takes as input the tracks from 36 track-finding processors to identify the best eight candidate muons. It cancels duplicate tracks that occur along the boundaries of processing layers, and will in the future also receive information from the calorimeter trigger to identify isolated muons. These are a signature of interesting rare particle decays such as those of vector bosons.
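The job of the second muon layer, duplicate cancellation followed by selection of the best eight candidates, can be illustrated with a short sketch. The `Track` fields, the rule that two tracks sharing a detector segment count as duplicates, and the ranking by quality and then momentum are all assumptions made for the example, not a description of the actual firmware logic.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Track:
    pt: float                  # transverse momentum estimate in GeV
    quality: int               # hypothetical quality code, higher is better
    segments: FrozenSet[int]   # hypothetical IDs of the segments used by the track


def rank(track: Track):
    """Sort key: prefer higher quality, then higher momentum."""
    return (track.quality, track.pt)


def cancel_duplicates(tracks: List[Track]) -> List[Track]:
    """Keep the best-ranked track among any that share a segment."""
    kept: List[Track] = []
    for track in sorted(tracks, key=rank, reverse=True):
        if all(track.segments.isdisjoint(other.segments) for other in kept):
            kept.append(track)
    return kept


def best_muons(tracks: List[Track], n_best: int = 8) -> List[Track]:
    """Return up to n_best unique candidates, best first."""
    return cancel_duplicates(tracks)[:n_best]


a = Track(pt=30.0, quality=3, segments=frozenset({1, 2}))
b = Track(pt=28.0, quality=2, segments=frozenset({2, 3}))  # shares segment 2 with a
c = Track(pt=10.0, quality=3, segments=frozenset({7}))
print(best_muons([b, c, a]))
```

Here track `b`, found a second time by a neighbouring processor, is dropped in favour of `a`, and the remaining candidates are passed on in ranked order.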

A feast of physics

Finally, the global trigger processor collects information from both the calorimeter and muon trigger systems to arrive at the final decision on whether to keep the data from a given beam crossing – again, all in a period of four microseconds or less. The trigger changes made for Run 2 allow an event selection procedure that is much closer to that traditionally performed in software in the HLT or in offline analysis. The global trigger applies the trigger “menu” of the experiment – a large set of selection criteria designed to identify the broad classes of events used in CMS physics analyses. For example, events with a W or Z boson in the final state can be identified by the requirement for one or two isolated leptons above a certain energy threshold; top-quark decays by demanding high-energy leptons and jets in the same bunch crossing; and dark-matter candidates via missing transverse energy. The new system can contain several hundred such items – which is quite a feast of physics – and the complete trigger menu for CMS evolves continually as our understanding improves.
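A trigger menu of this kind is, in essence, a list of named conditions, and a crossing is kept if any one of them is satisfied. The sketch below illustrates the idea with four items modelled on the examples in the text. The event representation, item names and thresholds are hypothetical and are not taken from the real CMS menu.

```python
from typing import Callable, Dict, List

# Hypothetical event format: object type -> list of transverse energies in GeV.
Event = Dict[str, List[float]]


def count_above(event: Event, kind: str, threshold: float) -> int:
    """Number of objects of the given kind at or above the threshold."""
    return sum(1 for et in event.get(kind, []) if et >= threshold)


# Illustrative menu items; all thresholds are invented for the example.
MENU: Dict[str, Callable[[Event], bool]] = {
    "single_iso_lepton": lambda ev: count_above(ev, "iso_lepton", 25.0) >= 1,
    "double_iso_lepton": lambda ev: count_above(ev, "iso_lepton", 12.0) >= 2,
    "lepton_plus_jets": lambda ev: count_above(ev, "lepton", 20.0) >= 1
    and count_above(ev, "jet", 40.0) >= 2,
    "missing_et": lambda ev: count_above(ev, "met", 100.0) >= 1,
}


def fired_items(event: Event) -> List[str]:
    """Names of all menu items satisfied by this crossing."""
    return [name for name, condition in MENU.items() if condition(event)]


def level1_accept(event: Event) -> bool:
    """Keep the crossing if at least one menu item fires."""
    return bool(fired_items(event))


z_like = {"iso_lepton": [30.0, 14.0]}
print(fired_items(z_like), level1_accept(z_like))
```

Adding a new physics selection then amounts to adding one more entry to the menu, which is how a menu of several hundred items can keep evolving without changes to the decision logic itself.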

The trigger upgrade was commissioned in parallel with the original trigger system during LHC operations in 2015. This allowed the new system to be fully tested and optimised without affecting CMS physics data collection. Signals from the detector were physically split to feed both the original and upgraded trigger systems, a project that was accomplished during the LHC’s first long shutdown in 2013–2014. For the electromagnetic calorimeter, for instance, new optical transmitters were produced to replace the existing copper cables to send data to the old and new calorimeter triggers simultaneously. A complete split was not realistic for the barrel muon system, but a large detector slice was prepared nevertheless. The encouraging results during commissioning allowed the final decision to proceed with the upgrade to be taken in early January 2016.

As with the electronics, an entirely new software system had to be developed for system control and monitoring. For example, low-level board communication changed from a PCI-VME bus adapter to a combination of Ethernet and PCI Express. This took two years of effort from a team of experts, but also offered the opportunity to thoroughly redesign the software from the bottom up, with an emphasis on commonality and standardisation for long-term maintenance. The result is a powerful new trigger system with more flexibility to adapt to the increasingly extreme conditions of the LHC while maintaining efficiency for future discoveries (figure 2).

Although the “visible” work of data analysis at the LHC takes place on a timescale of months or years at institutes across the world, the first and most crucial decisions in the analysis chain happen underground and within microseconds of each proton–proton collision. The improvements made to the CMS trigger for Run 2 mean that a richer and more precisely defined data set can be delivered to physicists working on a huge variety of different searches and measurements in the years to come. Moreover, the new system allows flexibility and routes for expansion, so that event selections can continue to be refined as we make new discoveries and as physics priorities evolve.

The CMS groups that delivered the new trigger system are now turning their attention to the ultimate Phase-2 upgrade that will be possible by around 2025. This will make use of additional information from the CMS silicon tracker in the Level-1 decision, which is a technique never used before in particle physics and will approach the limits of technology, even in a decade’s time. As long as the CMS physics programme continues to push new boundaries, the trigger team will not be taking time off.
