The CMS Trigger Upgrade for the HL-LHC

The CMS experiment has been designed with a two-level trigger system: the Level-1 Trigger, implemented on custom-designed electronics, and the High Level Trigger, a streamlined version of the CMS offline reconstruction software running on a computer farm. During its "Phase-2", the LHC will reach a luminosity of $7.5\times10^{34}\,\textrm{cm}^{-2}\,\textrm{s}^{-1}$ with a pileup of 200 collisions, integrating over 3000 fb$^{-1}$ over the full experimental run. To fully exploit the higher luminosity, the CMS experiment will introduce a more advanced Level-1 Trigger and increase the full readout rate from 100 kHz to 750 kHz. CMS is designing an efficient data-processing hardware trigger (Level-1) that will include tracking information and high-granularity calorimeter information. The current conceptual system design is expected to take full advantage of advances in FPGA and link technologies over the coming years, providing a high-performance, low-latency computing platform for large throughput and sophisticated data correlation across diverse sources. The higher luminosity, event complexity and input rate present an unprecedented challenge to the High Level Trigger, which aims to achieve an efficiency and rejection factor similar to today's despite the higher pileup and purer preselection. In this presentation we discuss the ongoing studies and prospects for the online reconstruction and selection algorithms for the high-luminosity era.


The HL-LHC and the CMS Phase-2 Upgrade
Full exploitation of the LHC remains the highest priority of the European Strategy for Particle Physics. The High-Luminosity LHC (HL-LHC) [1] is the natural upgrade path for the accelerator and was approved by the CERN Council in 2015. The ultimate configuration of the accelerator will lead to pp collisions at the design energy of $\sqrt{s} = 14$ TeV and an instantaneous luminosity of $7.5\times10^{34}\,\textrm{cm}^{-2}\,\textrm{s}^{-1}$. At that luminosity, the pileup (occurrence of multiple pp interactions in the same or neighbouring bunch crossings) will reach an average of PU = 200. Those will be the harshest conditions to date in a hadron collider, with an experimental difficulty similar to that experienced in the Tevatron-LHC transition. With that configuration, the accelerator will be able to deliver an integrated luminosity of 3000 fb$^{-1}$ by 2035.
Coterminous with the HL-LHC era, the CMS Phase-2 Upgrade will bring the CMS detector capabilities up to the task [2,3]. Its goal is to maintain the experimental performance in efficiency, resolution and background rejection for all physics observables. One of the key CMS Phase-2 improvements will be the installation of an all-new tracker, which will have the capability of reading out stubs (matched pairs of hits on each side of a double layer, compatible with the passage of particles above a $p_T$ threshold) at the bunch crossing frequency of 40 MHz. Those stubs will be used for track finding at the back end, and will be part of the L1 Track Trigger capabilities described in Sec. 2. The barrel calorimeters (ECAL and HCAL) will have upgraded electronics, allowing full-granularity ECAL readout ($\Delta\eta \times \Delta\phi = 0.02 \times 0.02$) at 40 MHz and compatibility with an L1 triggering rate of 750 kHz. CMS will also have an all-new High Granularity Endcap Calorimeter (HGCAL), which can withstand the HL-LHC radiation dose and allows longitudinal, layer-by-layer readout. A MIP Timing Detector (MTD), allowing the measurement of MIPs' production time with 30-60 ps resolution, will be installed between the Tracker and the calorimeter systems. For in-depth information on the CMS Phase-2 upgrade, we refer the reader to the CMS subsystem upgrade TDRs.
In order to address the challenging HL-LHC conditions, the CMS Trigger and Data Acquisition System (TriDAS) will also have to undergo an upgrade [4]. An online selection of the collision data is needed in order to reduce the rate of events written to disk to a manageable level. The Trigger part of the system will still be divided into a Level-1 Trigger (L1T), implemented in a set of FPGA boards, and a High-Level Trigger (HLT), implemented as a suite of algorithms running in an online farm of commercial processors. Table 1 gives some of the operating parameters of the new TriDAS; notice that, even assuming a rate reduction of 1/100 at the HLT, the online farm would need 18 times more processing power and 27 times more storage capacity to maintain the current system performance. It follows that a simple upscaling of the current paradigm is not cost-effective and innovative solutions must be pursued.

Table 1. During the HL-LHC era, the CMS Trigger and Data Acquisition system will be operating under much more difficult conditions than the original design. Both the system's processing power and its storage capacity will have to increase by an order of magnitude to face that new challenge.
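The order of magnitude of those scale factors can be reproduced with a back-of-envelope calculation. In the sketch below, the L1 output rates (100 kHz today, 750 kHz at Phase-2) and the 1/100 HLT rate reduction come from the text, while the per-event growth factors at PU = 200 are illustrative assumptions of ours, not official CMS figures:

```python
# Back-of-envelope scaling of Phase-2 HLT computing needs.
# Rates and the 1/100 reduction are from the text; the per-event
# growth factors are illustrative assumptions only.

L1_RATE_NOW_KHZ = 100.0      # current L1 output rate into the HLT
L1_RATE_HL_KHZ = 750.0       # Phase-2 L1 output rate into the HLT
HLT_REDUCTION = 1.0 / 100.0  # assumed HLT rate reduction factor

# Assumed per-event growth at PU = 200 (illustrative):
CPU_PER_EVENT_GROWTH = 2.4   # reconstruction time per event
SIZE_PER_EVENT_GROWTH = 3.6  # raw event size

def processing_scale():
    """Processing need ~ HLT input rate x per-event time."""
    return (L1_RATE_HL_KHZ / L1_RATE_NOW_KHZ) * CPU_PER_EVENT_GROWTH

def storage_scale():
    """Storage need ~ HLT output rate x per-event size."""
    out_now = L1_RATE_NOW_KHZ * HLT_REDUCTION
    out_hl = L1_RATE_HL_KHZ * HLT_REDUCTION
    return (out_hl / out_now) * SIZE_PER_EVENT_GROWTH

print(f"processing: x{processing_scale():.0f}")  # ~ x18, as in Table 1
print(f"storage:    x{storage_scale():.0f}")     # ~ x27, as in Table 1
```

With these assumed per-event factors the known 7.5-fold rate increase reproduces the 18x and 27x figures quoted above.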

The Level-1 Trigger Upgrade
A simplified scheme of the upgraded L1T can be seen in Fig. 1, and a detailed description is available in [5]. The overall latency allotted to the system is 12.5 µs, and its maximum output rate is set at 750 kHz. The Outer Tracker, the three calorimeter systems and the muon systems will all provide input to the L1T. Tracker input at the L1T will be fundamental to achieving the target output rate, and the L1 Track Trigger has two processing layers to deliver it: the DAQ, Trigger and Control (DTC) layer and the Track Finding Processor (TFP) layer. The DTC is responsible for the control and readout of the front-end modules; it handles both the main data stream to the central DAQ and the stub stream to the TFPs. The TFP system is both time-multiplexed (1/18) and space-multiplexed (1/9 in the φ coordinate). The track-finding algorithm uses the stubs as input and executes a hybrid algorithm that combines a tracklet seed and road search with a Hough Transform and 3D Kalman filter tracking approach [6,7].

The presence of the track trigger and the improved calorimetry system, together with the advent of more powerful FPGAs, will make it possible to perform Particle Flow (PF) reconstruction at Level-1 during CMS Phase-2. That in turn will open up the possibility of identifying the majority of the particles produced by the collisions and deploying advanced pileup mitigation techniques like PileUp Per Particle Identification (PUPPI), which assigns each reconstructed particle a weight roughly proportional to its probability of coming from the hard-scatter vertex rather than from a PU interaction. Figure 2 shows the improvement from PF+PUPPI on event energy sum quantities ($p_T^\textrm{miss}$ and $H_T$, the scalar $p_T$ sum of all jets in the event) when compared to calorimeter-only reconstruction and to a simpler usage of track trigger information. The Level-1 Trigger Technical Design Report, an update on the results presented in the Interim Document, is expected early in 2020.
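As a purely illustrative toy (not the actual L1 PUPPI implementation, which is described in Ref. [5]), the weighting idea can be sketched as follows: charged particles are weighted 0/1 by vertex association, while each neutral particle receives a weight built from the $p_T$-weighted proximity of charged particles associated with the primary vertex. The cone size and the `alpha_scale` parameter are invented for the example:

```python
import math

def delta_r2(a, b):
    """Squared angular distance in the (eta, phi) plane."""
    dphi = abs(a["phi"] - b["phi"])
    if dphi > math.pi:
        dphi = 2 * math.pi - dphi
    deta = a["eta"] - b["eta"]
    return deta * deta + dphi * dphi

def alpha(particle, charged_pv, cone2=0.16):
    """Sum of pt^2/dR^2 over nearby PV-associated charged particles
    (cone2 = 0.16 corresponds to dR < 0.4; illustrative choice)."""
    s = 0.0
    for c in charged_pv:
        dr2 = delta_r2(particle, c)
        if 0.0 < dr2 < cone2:
            s += c["pt"] ** 2 / dr2
    return s

def puppi_weights(particles, alpha_scale=10.0):
    """Toy PUPPI-style weights: 1/0 for charged particles by vertex
    association, a smooth [0, 1) weight for neutrals from alpha."""
    charged_pv = [p for p in particles if p["charge"] != 0 and p["from_pv"]]
    weights = []
    for p in particles:
        if p["charge"] != 0:
            weights.append(1.0 if p["from_pv"] else 0.0)
        else:
            a = alpha(p, charged_pv)
            weights.append(a / (a + alpha_scale))  # toy mapping to [0, 1)
    return weights
```

A neutral particle embedded in hard-scatter charged activity thus receives a weight close to one, while an isolated neutral (PU-like) receives a weight close to zero, which is the qualitative behaviour exploited by the $p_T^\textrm{miss}$ and $H_T$ sums.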
For completeness, we reproduce in Table 2 the L1 menu that was publicly available at the time of this conference. The final report will showcase the enhanced physics performance of the upgraded L1T. It will include leptonic paths with extended pseudorapidity coverage, paths for the reconstruction of light mesons and of displaced objects, and paths that make use of machine learning and of timing information, amongst other improvements.

Table 2. Preliminary Level-1 Trigger menu, adapted from Ref. [5]. An improved menu, showcasing the enhanced physics performance of the upgraded system, should be available in early 2020.

The High-Level Trigger Upgrade
The HLT upgrade is slated to deliver a working system by the beginning of Run 4 in 2027.
That system will have to deal not only with the increased pileup conditions but also with an input rate that is 7.5 times larger. Additionally, the CMS Phase-2 subdetectors are much more complex than their current counterparts; the HGCAL alone comprises close to six million channels, to be compared with the approximately 236,000 channels of the full calorimetry system at the start of the experiment.

The timing of the algorithm suite remains one of the most difficult issues to be addressed. Figure 3 shows that the average processing time of the HLT grows faster than linearly with the average instantaneous luminosity. Studies are ongoing to understand the dependence on the average pileup and on the pileup density (vertices/cm). The extrapolation of the absolute rate of the HLT system to Phase-2 conditions is also being studied. The architecture of the system is such that the rate is distributed amongst many different data-taking paths. Algorithms responsible for acquiring events where a single lepton (electron or muon) was produced represent approximately 25% of the HLT rate; those scale almost linearly with the pileup. On the other hand, the rate of $p_T^\textrm{miss}$ algorithms grows faster than linearly with the pileup, as can be seen in Fig. 4.

From a computing point of view, there are a few approaches to tackle those challenges. The fact that CMSSW is fully multithreaded allows full usage of the multicore systems that have become the norm in computing. Additionally, code modernisation and streamlining campaigns help uncover inefficient sections of the algorithm suite. Finally, one can expect that the industry will keep developing hardware that improves the price/performance ratio. From 2008 to 2018, this assumption held approximately true for computing nodes equipped with Intel® processors: nodes with 14 nm, 2018 processors are ten times more powerful, in terms of the HEP-SPEC06 benchmark, than nodes with 45 nm, 2008 processors.
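The kind of timing extrapolation described above can be sketched as a simple polynomial fit; the data points below are synthetic placeholders chosen to grow faster than linearly, not CMS measurements, and the extrapolated value is purely illustrative:

```python
import numpy as np

# Synthetic (luminosity, average HLT time per event) points, in units
# of 1e34 cm^-2 s^-1 and ms respectively. Placeholder values only.
lumi = np.array([0.5, 1.0, 1.5, 2.0])
time_ms = np.array([120.0, 260.0, 430.0, 640.0])

# A second-order polynomial captures the faster-than-linear growth.
coeffs = np.polyfit(lumi, time_ms, deg=2)
model = np.poly1d(coeffs)

# Naive extrapolation to an HL-LHC-like luminosity; the real dependence
# on pileup and pileup density is still under study.
print(f"extrapolated time at L = 7.5e34: {model(7.5):.0f} ms")
```

Even this toy fit makes the point of the text: a naive extrapolation of a superlinear trend to HL-LHC luminosities yields per-event times far beyond a simple linear scaling, motivating the algorithmic and hardware work discussed next.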
Another option is to relax some of the design assumptions of the High-Level Trigger. The scouting approach gives up on recording the full raw data from the detector; instead, one saves only the output objects of the online reconstruction at the desired level. The limitation of scouting with calorimeter-based objects is essentially given by the L1 reconstruction thresholds, while scouting with full particle-flow objects is limited by the HLT timing budget. Both approaches have already been used successfully at CMS [9][10][11]; a new approach for scouting with L1-based objects was recently proposed [12].
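The data-reduction gain of scouting can be illustrated with a toy serialisation of compact reconstructed objects. The record layout and the assumed raw event size below are invented for the example and do not correspond to the real CMS scouting formats:

```python
import struct

# Assumed Phase-2-like raw event size (illustrative placeholder).
RAW_EVENT_SIZE_BYTES = 2_000_000

# Compact scouting-style record: pt, eta, phi as 32-bit floats plus a
# one-byte particle-type code (little-endian, no padding): 13 bytes.
OBJ_FORMAT = "<fffB"

def pack_event(objects):
    """Serialise a list of (pt, eta, phi, pid) tuples into bytes."""
    return b"".join(struct.pack(OBJ_FORMAT, *obj) for obj in objects)

# A toy event with three reconstructed objects.
event = [(45.2, 0.3, 1.1, 4), (38.9, -1.2, 2.7, 4), (25.0, 0.8, -2.1, 1)]
payload = pack_event(event)
print(len(payload), "bytes vs", RAW_EVENT_SIZE_BYTES, "bytes of raw data")
```

Persisting tens of bytes per event instead of megabytes is what allows scouting streams to run at rates far above the standard readout, at the price of keeping only the online reconstruction output.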
The deployment of heterogeneous architectures at the High-Level Trigger is yet another approach to tackle the HL-LHC data-taking challenge. The combination of specialised hardware (GPUs, FPGAs, ARM cores, etc.) with standard x86-64 CPUs makes it possible to offload some tasks to those accelerators. The Patatrack project is currently addressing that problem in CMS; the speed-up gained by offloading the pixel reconstruction to GPUs has been successfully demonstrated [13], and a preliminary implementation is scheduled to be deployed as early as Run 3.

Conclusions
The High-Luminosity LHC will bring the harshest conditions in a high-energy collider experiment to date. The jump in computing needs will be even bigger than at the start of the LHC era. CMS is undergoing a full upgrade programme in order to meet the challenge, with completely new subsystems (HGCAL, MTD) being built and enhancements being made to the existing ones. Trigger and Data Acquisition remains one of the hardest problems to tackle, due to the sheer number and the complexity of the events to be collected. Sophisticated reconstruction at Level-1, with Track Trigger, Particle Flow and pileup mitigation techniques, is being developed. At the High-Level Trigger, the deployment of heterogeneous hardware (FPGAs, GPUs), the optimisation of the reconstruction algorithms, and the adoption of alternative data-taking strategies like scouting are the keys to success. Overcoming this challenge is mandatory to unlock the physics potential of the 3000 fb$^{-1}$ of collision data that will be delivered by the HL-LHC.