The CMS experiment has a two-level trigger system to reduce the number of events stored from the LHC bunch-crossing rate of 40 MHz to around 1 kHz. The first level (the Level-1 trigger) uses custom electronics to reduce the event rate to 100 kHz, and the second level (the High Level Trigger), based on commodity computing hardware, provides a further reduction to 1 kHz.
The Level-1 trigger takes as input coarse information from the CMS calorimeter systems and muon chambers to reconstruct physics objects such as electrons, muons and jets, and then uses these objects, often in combination, to determine whether the event is worthy of further analysis by the High Level Trigger. Up to 512 conditions, making up a menu, are evaluated in this decision. If the event is accepted, the trigger signals this to the detector systems and the full-granularity event data are sent to the High Level Trigger. The Level-1 trigger must make its decision within 4 microseconds, as this is how long data may be stored in the detectors while waiting for the accept signal.
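The logic of such a trigger menu can be pictured as a set of conditions evaluated on the reconstructed objects, with the event accepted if any condition fires. The following is an illustrative sketch only (not CMS firmware or software; the object model and the second condition are invented for the example, though the 22 GeV single-muon threshold matches the value quoted later for Fig. 2):

```python
# Illustrative sketch (not CMS code): a Level-1 menu as a list of
# conditions, any one of which can accept the event.
from dataclasses import dataclass

@dataclass
class PhysicsObject:
    kind: str   # e.g. "muon", "egamma", "jet"
    pt: float   # transverse momentum in GeV

def single_muon_22(objects):
    """Accept if any muon has pT above 22 GeV."""
    return any(o.kind == "muon" and o.pt > 22.0 for o in objects)

def double_jet_100(objects):
    """Hypothetical combined condition: at least two jets above 100 GeV."""
    jets = [o for o in objects if o.kind == "jet" and o.pt > 100.0]
    return len(jets) >= 2

# A real menu holds up to 512 such conditions; the event is accepted
# (and sent on to the High Level Trigger) if any of them fires.
menu = [single_muon_22, double_jet_100]

def l1_accept(objects):
    return any(condition(objects) for condition in menu)

event = [PhysicsObject("muon", 25.0), PhysicsObject("jet", 40.0)]
print(l1_accept(event))  # True: the single-muon condition fires
```

In the real system each condition is evaluated in parallel in hardware within the 4-microsecond latency budget, rather than sequentially as in this sketch.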
The High Level Trigger (HLT) runs the same multithreaded software as the CMS offline reconstruction, optimised to make its decision within an average of a few hundred milliseconds. The number of conditions evaluated in the HLT is not fixed, and in Run 2 was typically around 500, similar to the maximum of the Level-1 trigger. The HLT receives the full event data, and performs on-demand reconstruction using algorithms and calibrations as close as possible to those used offline, ensuring that the quantities used to select events are very close in quality to those used in final analyses.
Preparations for Run 2
Preparations for Run 2 (2015-18) started while Run 1 (2010-12) was still in progress. The CMS trigger was built to accommodate the design specification of the LHC, of instantaneous luminosities of up to 1×10³⁴ cm⁻² s⁻¹ with around 20 simultaneous proton-proton interactions per bunch crossing (pile-up). After the start-up of the LHC the luminosity rose steeply and it became clear that to fully profit from the excellent performance of the LHC, the CMS trigger would need to be upgraded.
The Level-1 (L1) trigger was completely upgraded during the shutdown between Run 1 and Run 2. Everything was replaced, from the clock distribution system, all the electronics boards and fibres, right down to the databases used to store configuration data. In addition to providing improved performance, to fully benefit from the higher luminosity delivered by the LHC, the L1 trigger system was also made more robust. Legacy electronics based on the venerable VME standard were fully replaced with micro-TCA electronics, based on a modern telecoms standard, and state-of-the-art Field Programmable Gate Arrays were installed (see Fig. 1). Parallel galvanic links between processing components were replaced with faster (up to 10 Gb/s), more reliable and lower-maintenance serial optical links. To avoid placing CMS data-taking at risk from this ambitious upgrade, the decision was made to duplicate the inputs to the L1 trigger from the calorimeters and a subset of the inputs from the muon chambers, allowing commissioning of the new system to proceed in parallel with reliable data-taking using the legacy trigger system.
Figure 1: An example of one of the micro-TCA processing cards developed for the Level-1 trigger upgrade for Run 2 (left) and an example of the new micro-TCA electronics, which replaced the VME standard for Run 2 (right).
The High Level Trigger is upgraded approximately annually through the addition of new computing nodes with the latest generation of CPUs. In addition to upgrading the hardware, the CMS software is continually being improved to deliver higher performance. The increased luminosity from Run 1 to Run 2 meant the HLT was required to process events with increased pile-up, leading to longer processing times per event and therefore requiring enhanced performance.
The main change for the HLT for Run 2 was the switch to a multithreaded version of the CMS software. This allows the analysis of multiple events concurrently within a single process, sharing non-event data while running over multiple CPU cores. This gives an overall lower memory footprint, which in turn allows the HLT to run a larger number of jobs and take advantage of Intel Hyper-Threading technology, gaining almost 20% higher performance. The HLT ran with up to 30 000 cores in Run 2 after the final upgrade, with an approximately equal mix of Intel Haswell, Broadwell and Skylake CPUs.
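The memory saving comes from keeping a single copy of read-only non-event data (calibrations, geometry) visible to all in-flight events, rather than duplicating it in every process. A minimal sketch of the idea, in Python rather than the C++ of the actual CMS framework, with invented names and a toy "reconstruction":

```python
# Illustrative sketch (not the CMS framework): several events are
# processed concurrently in one process, all workers reading a single
# shared copy of the non-event data instead of one copy per process.
from concurrent.futures import ThreadPoolExecutor

# Shared, read-only "conditions" data; hypothetical calibration constant.
CALIBRATIONS = {"ecal_scale": 1.02}

def reconstruct(event):
    # Each worker thread reads the shared calibrations; nothing is copied
    # per event, which is what lowers the overall memory footprint.
    return [energy * CALIBRATIONS["ecal_scale"] for energy in event]

# Toy "events": lists of raw energy deposits.
events = [[10.0, 20.0], [5.0], [7.5, 2.5]]

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(reconstruct, events))
```

In the real framework the concurrency is far more fine-grained (modules within an event also run in parallel), but the shared-conditions principle is the same.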
Performance and highlights
The LHC started Run 2, colliding protons at a centre-of-mass energy of 13 TeV for the first time, in 2015. The legacy L1 trigger was used for the start-up, transitioning to an intermediate upgrade using some new electronics, in particular for the heavy ion run towards the end of 2015. For this run CMS had a much improved trigger compared to Run 1, including specific heavy ion quantities such as the collision centrality. Commissioning of the full L1 upgrade, with a fully parallel path for the calorimeter trigger and a slice of the muon detectors for the muon trigger, was completed late in 2015. The multithreaded software for the HLT was commissioned, and CMS was ready for the LHC to turn up the luminosity!
In 2016 the fully upgraded L1 trigger became the primary trigger for CMS. Parallel running in 2015 meant that the hardware systems were fully debugged, but with a multitude of new and more sophisticated algorithms, optimising and calibrating the new trigger was still a big challenge. After a somewhat rocky start to the run, successive iterations of optimisation led to smooth and efficient running, and perhaps surprisingly a very successful year in terms of uptime and data quality.
The challenges for the start of the 2017 run originated with CMS detector upgrades, in particular the installation of a brand new pixel detector at the heart of CMS, which necessitated a fresh look at the track-finding software to reap the benefits of an extra layer of silicon very close to the collision point. The HLT group also took the opportunity to reset the physics trigger menu, starting from scratch and building up the physics selection anew. Problems with the new pixel detector required urgent modifications to the tracking software at the HLT, to mitigate the effect of dead areas of the detector. After much hard work this was completed, and despite having a larger fraction of inactive channels than expected, the new pixel detector gave much improved performance.
The LHC accelerator also saw problems in 2017. Air accidentally allowed inside a vacuum chamber (a fault amusingly nicknamed the Gruffalo) interacted with the proton beams, leading to large beam losses and subsequent beam dumps. The LHC bunch structure was changed to reduce the number of filled bunches and include more empty bunches, to mitigate the problem. The knock-on effect for ATLAS and CMS was fewer filled bunches for a given luminosity and therefore higher pile-up values. CMS ran, with the LHC levelling the instantaneous luminosity by offsetting the proton beams, at a pile-up of around 55, significantly higher than had been planned for. The trigger configuration was quickly adapted to this new mode of running and smooth, if not entirely comfortable, operation was regained.
The CMS trigger started to see significant effects from the expected radiation damage to parts of the CMS detector in 2017, in particular in the most forward elements of the electromagnetic calorimeter, where radiation damage degraded the transparency of the lead-tungstate crystals, requiring large corrections to the response. This resulted in increased noise and large trigger rates, especially for missing-energy triggers. Noise-rejection thresholds were optimised for 2018, mitigating the effect on the trigger.
After the challenges of commissioning a new trigger system and adapting to LHC conditions in 2017, the final year of Run 2 in 2018 was smooth and highly successful. The long proton-proton part of the run yielded the highest luminosity yet from the LHC, peaking at just under 2.1×10³⁴ cm⁻² s⁻¹ and delivering almost 70 fb⁻¹ of data to CMS. The L1 trigger made extensive use of the new capabilities of the system, for example invariant mass calculations were used to improve efficiencies for vector-boson fusion Higgs channels and for b-physics resonances.
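For two objects that are light compared to their momenta (jets, electrons, muons at trigger level), the invariant mass reduces to a simple function of the transverse momenta and the angular separations, m² ≈ 2 pT1 pT2 (cosh Δη − cos Δφ). A sketch of that calculation (illustrative only; the example kinematics are invented, chosen to resemble the widely separated jet pair of a vector-boson-fusion topology):

```python
import math

def invariant_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass of two approximately massless objects:
    m^2 = 2 * pT1 * pT2 * (cosh(eta1 - eta2) - cos(phi1 - phi2))."""
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))  # guard against rounding below zero

# Hypothetical VBF-like jet pair: widely separated in eta, so the
# cosh term drives the invariant mass to large values (in GeV).
m = invariant_mass(80.0, 2.5, 0.0, 70.0, -2.0, 3.0)
```

The large FPGAs of the upgraded system made it possible to evaluate such quantities for many object pairs within the L1 latency budget, typically via look-up tables rather than the explicit cosh and cos calls used here.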
Figures 2 and 3 display some key performance metrics for the CMS trigger, measured in Run 2 data. Figure 2 shows the efficiency of the Level-1 single muon trigger and Fig. 3 the efficiency of the Level-1 tau trigger. Simple single object triggers, such as these, were used widely both in searches for new physics and Standard Model measurements, supplemented by more sophisticated, analysis dependent trigger conditions.
Figure 2: The efficiency of the Level-1 single muon trigger with a threshold of 22 GeV, which was a typical value in Run 2. The efficiency is presented as a function of offline muon pT (left) and muon η (right), and was measured using the unbiased tag and probe technique in events in which a Z boson was produced and decayed to two muons.
Figure 3: The efficiency of the Level-1 hadronic tau trigger with thresholds of around 30 GeV, which were typical for di-tau triggers in Run 2. The efficiency is presented as a function of offline visible hadronic tau pT (left) and number of vertices in the event, which is highly correlated with the pile-up (right). The measurements were made using the unbiased tag and probe technique in events in which a Z boson was produced and decayed to two tau leptons.
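The tag-and-probe technique used for both measurements can be summarised as follows: one well-identified lepton (the tag) is required to have fired the trigger, so the second lepton of the Z candidate (the probe) is an unbiased sample on which to measure the efficiency. A minimal sketch of the bookkeeping, with an invented event format (not CMS analysis code):

```python
# Illustrative tag-and-probe sketch (not CMS code). Each record holds an
# unbiased "probe" lepton from a Z candidate whose other leg (the "tag")
# fired the trigger; the efficiency in a pT bin is the fraction of
# probes matched to a Level-1 trigger object.
def tap_efficiency(records, pt_low, pt_high):
    probes = [r for r in records if pt_low <= r["probe_pt"] < pt_high]
    if not probes:
        return None  # no probes in this bin
    passed = sum(1 for r in probes if r["probe_matched_l1"])
    return passed / len(probes)

records = [
    {"probe_pt": 30.0, "probe_matched_l1": True},
    {"probe_pt": 35.0, "probe_matched_l1": True},
    {"probe_pt": 32.0, "probe_matched_l1": False},
    {"probe_pt": 10.0, "probe_matched_l1": False},  # outside the bin
]
eff = tap_efficiency(records, 25.0, 40.0)  # 2 of 3 probes pass
```

Repeating this in bins of pT, η or the number of vertices yields efficiency curves such as those in Figs. 2 and 3.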
As in the final year of Run 1, the CMS collaboration decided to “park” a large data sample, writing it to tape and only running reconstruction on it later, after the end of Run 2. Preparations were made to write a sample of unbiased b-quark decays for later analysis. Profiting from the lower pile-up and smaller event sizes at the end of LHC fills, muon triggers were adjusted to collect these events as the luminosity dropped. By the end of the 2018 proton-proton run, 12 billion such events had been saved, corresponding to almost 10 billion b-quark decays (around 20 times larger than the BaBar experiment at SLAC achieved in its lifetime). Rates of up to 50 kHz at L1 and 5.5 kHz at the HLT were dedicated to collecting this dataset.
The 2018 run concluded with a highly successful Pb-Pb run, once again with dedicated triggers for the run and an expanded range of heavy ion specific quantities used in the trigger.
Prospects for Run 3 and HL-LHC
The prospects for Run 3 look very bright. Upgrades to the LHC are expected to deliver data samples of unprecedented size, enabling a wealth of novel physics measurements and searches. The CMS trigger system will undergo a modest programme of improvements towards Run 3 over the current shutdown, while also preparing for the High-Luminosity LHC (HL-LHC) programme of Run 4 [3, 4]. Some developments originally intended for Run 4 look likely to be used already in Run 3: for example, Kalman-filter muon track finding in the L1 trigger, which provides a means to collect events with muons displaced far from the beamline, and GPU-based reconstruction at the HLT, which is likely to be required for Run 4 and beyond.
The CMS trigger ran very successfully in LHC Run 2, taking data stably and efficiently for the varied CMS physics programme. The flexibility of the trigger system was tested in heavy ion runs and by evolving machine conditions and innovative additions to the physics programme. The lessons learnt in Run 2 are being applied to prepare for Run 3 and looking further ahead to HL-LHC.
[1] CMS Collaboration, The CMS trigger system, JINST 12 (2017) P01020.
[2] CMS Collaboration, CMS Technical Design Report for the Level-1 Trigger Upgrade, CERN-LHCC-2013-011 (2013).
[3] CMS Collaboration, The Phase-2 Upgrade of the CMS L1 Trigger Interim Technical Design Report, CERN-LHCC-2017-013 (2017).
[4] CMS Collaboration, The Phase-2 Upgrade of the CMS DAQ Interim Technical Design Report, CERN-LHCC-2017-014 (2017).