
HEP Software Foundation Community White Paper looks forward to the HL-LHC

The High Luminosity LHC programme not only pushes the frontiers of accelerator and detector technology, but it also brings enormous challenges to the software and computing used to turn high-luminosity data into physics. The scale of the problem is huge: the total LHC dataset is already almost 1 exabyte, and ATLAS and CMS will collect some 30 times more data than the LHC has produced so far. Extrapolating today’s solutions a decade into the future leaves the experiments short by at least an order of magnitude in storage and computing, assuming Moore's Law and more or less constant operational budgets. At the same time, the nature of computing hardware (processors, storage, networks) is evolving, with radically new paradigms that will require significant re-engineering to exploit.

ATLAS estimate of the CPU resources (in kHS06) needed for the years 2018 to 2028, for both data and simulation processing. The blue points are based on current software performance and the ATLAS computing model parameters from 2017. The solid line shows the resources expected to be available under a flat funding scenario, which implies a capacity increase of about 20% per year based on current technology trends.

CMS estimated disk space required into the HL-LHC era, using the current computing model with parameters projected out for the next 12 years.
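To make the scale of the gap concrete, the numbers quoted above can be combined in a simple back-of-the-envelope calculation. The sketch below is only an illustration, not the Roadmap's actual resource model: it compounds the roughly 20% per year capacity growth that flat budgets are expected to buy, and asks what level of need an order-of-magnitude shortfall would imply.

```python
# Back-of-the-envelope sketch of the flat-budget extrapolation quoted above.
# Illustrative only: these are not the Roadmap's actual projections.

YEARS = 10              # roughly one decade, from today to the HL-LHC era
CAPACITY_GROWTH = 1.20  # flat budgets expected to buy ~20% more capacity per year
SHORTFALL = 10.0        # "short by at least an order of magnitude"

capacity_factor = CAPACITY_GROWTH ** YEARS   # ~6.2x today's capacity after a decade
implied_need = capacity_factor * SHORTFALL   # ~60x today's resources would then be needed

print(f"Flat-budget capacity after {YEARS} years: ~{capacity_factor:.1f}x today's")
print(f"An order-of-magnitude shortfall implies needs of ~{implied_need:.0f}x today's resources")
```

In other words, even with hardware improving every year at constant cost, the projected growth in demand far outpaces what a flat budget can deliver, which is why gains in software efficiency are central to the Roadmap.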

In anticipation of these challenges the HEP Software Foundation (HSF) was founded in 2014 to encourage common approaches to the problems we face. The HSF was then charged by WLCG to produce a Community White Paper (CWP) Roadmap for HEP, anticipating the “software upgrade” needed to run in parallel with the detector hardware upgrades planned for the HL-LHC. As well as improving the performance of our software on modern architectures, we wanted to explore new approaches that would extend our physics reach, and ways to improve the sustainability of our software in the coming decades. Although there was an HL-LHC focus, we looked at the problems from the perspective of the whole HEP programme, including the Linear Collider, the Intensity Frontier, Belle II, and the FCC.

The CWP initiative kicked off with a workshop in San Diego that brought together more than 100 software and computing experts for 2.5 days of plenary and topical discussions. From the ideas seeded there, many working groups were formed that over the following six months organised their own workshops and events to marshal ideas and engage with experts outside our field. A final workshop at LAPP in Annecy in June 2017 began to conclude the process, with working groups presenting their work and plans. While the groups finalised their work over the next few months, producing papers that will be uploaded to arXiv, an editorial board was assembled that encompassed a broad cross-section of software and computing experts. The Editorial Board took charge of summarising the work of each of the working groups and producing the final CWP Roadmap. A first draft was released in October, followed by a second draft in November, and the final version of the Roadmap has now been published on arXiv. It covers almost every aspect of HEP software and computing in 13 sections; each section discusses the challenges, describes current practice, and presents an R&D programme of the work required in the coming years.

The HSF final CWP workshop in Annecy gathered almost 100 experts from HEP software and computing

Simulation remains a critical part of our programme. Improvements to physics event generators are needed to make effective use of next-to-next-to-leading-order event generation for the processes studied at the HL-LHC, where the massive volume of data reduces experimental uncertainties well below those of theoretical predictions in many cases. Improved physics models for detector simulation need to be developed for high-precision work at the LHC and for the neutrino programme. Adapting Geant4 for effective use on modern CPUs and GPUs is another part of the R&D programme, as is developing common toolkits to help with Fast Simulation.

The shift to new computing architectures is equally important for our software triggers and event reconstruction code, where the pile-up at high luminosity makes charged-particle tracking within a reasonable computing budget a key challenge. Doing more and more in software triggers, as is being developed by ALICE and LHCb for Run 3, will help control data volumes and enable analysis to happen directly from the initial reconstruction. The development of Machine Learning techniques appropriate to our field should also lead to advances that improve both simulation and reconstruction performance and reduce costs. These techniques are also under investigation for analysis, where they already find many applications in Run 2. Taking techniques from outside our field offers great promise, as many data science tools look to have applications in HEP. The data science domain tends to tackle analysis problems on dedicated cluster resources, a version of which could replace the many expensive cycles of data skimming and thinning that are employed today.
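As a flavour of what such data science tools offer in practice, the sketch below shows the columnar, dataframe-style selection idiom common in that ecosystem, applied to a toy dataset with hypothetical column names and cuts. It is only an illustration of the style of in-memory filtering that could stand in for repeated skimming and thinning passes, not a prescription from the Roadmap.

```python
# Minimal sketch of columnar, dataframe-style event selection (toy data only).
# Column names, distributions and cuts are hypothetical illustrations.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 1_000_000

# Toy event-level columns: lepton pT (GeV), missing ET (GeV), jet multiplicity.
events = pd.DataFrame({
    "lep_pt": rng.exponential(scale=30.0, size=n),
    "met": rng.exponential(scale=40.0, size=n),
    "n_jets": rng.poisson(lam=3.0, size=n),
})

# Instead of writing a skimmed copy to disk for each working point,
# a selection is expressed as a boolean mask and applied on the fly.
selection = (events["lep_pt"] > 25.0) & (events["met"] > 50.0) & (events["n_jets"] >= 2)
selected = events[selection]

print(f"Selected {len(selected)} of {len(events)} events "
      f"({100.0 * len(selected) / len(events):.1f}%)")
print(selected[["lep_pt", "met"]].describe())
```

Because the selection is just a cheap boolean mask over columns held in memory (or spread across a cluster), changing a cut means re-running a fast filter rather than producing and storing yet another skimmed copy of the data.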

This restructuring of resources at facilities is a key area to develop, both to evolve our WLCG computing sites and to incorporate commercial and scientific clouds into the pool available for HEP computing. In some regions HPCs will also play a major role in the future, although they are not well suited to current HEP workflows. More effective use of the network, together with consolidation of storage into a ‘data lake’ configuration, will help deliver data to compute resources more effectively than is done today. Our workload management systems and software frameworks will need to evolve to match this new heterogeneous landscape.

The challenges we face are wide-ranging and hard, and they require new investment in the critical areas and a commitment to solving problems in common. We will have to train a new generation of physicists with updated computing skills and support the career paths of our specialists. ‘Business as usual’ will not solve these problems, nor will hardware come to our rescue.

The CWP Roadmap has already been very widely endorsed by the community, but we want as many people as possible in HEP software and computing to support it, which you can still do here or by sending an email to hsf-cwp-ghost-writers@googlegroups.com. The next step after defining the roadmap is to start to walk along the road; the joint WLCG/HSF workshop in Naples in March 2018 will start to put into practice the plans we have laid out.