CHEP 2015: Discussing future challenges in HEP computing

The HEP software and computing communities organise a conference every 18 months, with the location rotating between the US, Europe and Asia. This year's conference, CHEP 2015, took place from 13 to 17 April and was organised by our Japanese colleagues on the Pacific island of Okinawa, at the Okinawa Institute of Science and Technology (OIST). Over half of the faculty and students there are recruited from outside Japan, and all education and research is conducted entirely in English. The President of OIST is Jonathan Dorfan, a former director of SLAC, and the Dean of Faculty Affairs is Ken Peach, a former deputy head of the Physics Department at CERN. Both addressed the conference and gave delegates a warm welcome.

HEP experimental programmes are evolving rapidly, and this proved to be an opportune time to meet, exchange experiences, review recent progress and plan for the future. The experimental groups at the LHC have reviewed their Run 1 experience in detail, adopted the latest computing and software technologies, and constructed new computing models in preparation for Run 2. On the intensity frontier, SuperKEKB will start commissioning in 2015, and there are also ambitious fixed-target programmes at CERN, Fermilab and J-PARC. In nuclear physics, FAIR is under construction and RHIC is well into its Phase II research programme, facing larger datasets and new challenges in precision physics. Looking further ahead, work is progressing towards the construction of the ILC, and non-accelerator experiments are also seeking novel computing models as their apparatus and operations become larger and more distributed.

The conference attracted 450 delegates from 28 countries, and 535 contributions were accepted for presentation. The plenary sessions were reserved for invited talks that covered a wide range of topical subjects in some depth. In addition, there were eight parallel sessions, each organised around a specific theme. This way of organising the programme allowed a large number of oral presentations to be given (264), with the remaining contributions (248) being presented during the poster sessions.

Concerning software, the LHC experiments reported improvements in the performance of their reconstruction code in time for running at higher energy and luminosity in Run 2. For example, ATLAS reconstruction has sped up by a factor of 4 following optimisation of the tracking code and a switch to a new math library. An important theme of the conference was the need to evolve software models to better exploit the available hardware, even if this implies a major rewrite of existing code. On this front, CMS presented a new multi-threaded version of its data-processing framework (CMSSW), which is now production-ready. Further examples of ongoing research include:

  • a lively R&D programme that is looking into ways of using GPUs, mainly in online applications but also for event generation, simulation and in analysis;
  • renewed interest in exploiting FPGAs, with trigger and online systems being natural candidates;
  • the use of vectorisation, which is now widely adopted in production software systems (see the sketch after this list);
  • active research in the area of optimising software performance as a function of power consumption.
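
As a purely illustrative sketch of the vectorisation point above (not code from any experiment), the snippet below shows how storing track components in a struct-of-arrays layout keeps each component contiguous in memory, so that a compiler can auto-vectorise a simple kinematics loop; the TrackSoA type and transverseMomentum function are invented for the example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical struct-of-arrays layout: each momentum component lives in
// its own contiguous array, which is the layout SIMD units work best with.
struct TrackSoA {
    std::vector<float> px, py, pz;
};

// Compute the transverse momentum of every track. The loop body is
// branch-free and alias-free, so compilers auto-vectorise it readily
// (e.g. with -O3 -march=native).
void transverseMomentum(const TrackSoA& tracks, std::vector<float>& pt) {
    const std::size_t n = tracks.px.size();
    pt.resize(n);
    for (std::size_t i = 0; i < n; ++i) {
        pt[i] = std::sqrt(tracks.px[i] * tracks.px[i] +
                          tracks.py[i] * tracks.py[i]);
    }
}
```

The same calculation over an array of per-track structs would interleave px, py and pz in memory, which is one reason why data-structure redesign comes up alongside algorithmic changes.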

The bottom line is that taking advantage of new hardware features is possible, but it requires a major redesign and re-implementation of both algorithms and data structures. Measuring all aspects of software performance is a prerequisite for this work, and doing so is challenging given the complexity of modern hardware and the need to develop many benchmarks.
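
As a minimal sketch of the kind of timing micro-benchmark such measurements build on (standard-library only; the kernel function here is an invented stand-in for a real reconstruction routine):

```cpp
#include <chrono>
#include <cstdio>
#include <numeric>
#include <vector>

// Illustrative stand-in for a reconstruction kernel whose throughput
// we want to track across code and compiler changes.
static double kernel(const std::vector<double>& data) {
    return std::accumulate(data.begin(), data.end(), 0.0);
}

int main() {
    std::vector<double> data(1 << 20, 1.0);

    // Warm up caches once, then time repeated calls and report the
    // average wall-clock cost per call.
    kernel(data);
    const int repetitions = 100;
    const auto start = std::chrono::steady_clock::now();
    double checksum = 0.0;
    for (int i = 0; i < repetitions; ++i) {
        checksum += kernel(data);
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    std::printf("avg %.3f ms per call (checksum %.1f)\n",
                elapsed.count() / repetitions, checksum);
    return 0;
}
```

In practice this is only a starting point: hardware counters, profilers and repeated runs on controlled machines are needed before such numbers can be compared meaningfully across code versions.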

Concerning hardware, it is evident that certain characteristics, such as chip density and storage bandwidth, are unlikely to improve at the rates seen in the past. Improvements in performance are therefore likely to require more intelligent workflows and software that is gentler on resources. Another noticeable trend is the increasingly heterogeneous way in which resources are being used, with grid computing resources being augmented by 'clouds' and supercomputers (HPCs). Commercial clouds have been integrated into experiment job queues and now supply a noticeable fraction of the total resources used in production. Moreover, a substantial share of HEP's resource needs can be met opportunistically by 'backfilling' supercomputers: these huge machines currently have several hundred million CPU-hours of unused capacity each year, and they are ideal for CPU-bound tasks with little I/O, such as event generation and simulation. Another message was the strong coupling that exists between online and offline computing, evident in the use of common data-processing frameworks and in the exploitation of HLT farm resources for offline purposes (as exemplified by ALICE's O2 project).

There was general agreement that the conference had been extremely well organised by our Japanese hosts. We were treated to some fine Japanese cuisine at the conference banquet and joined in with some traditional Japanese dancing (see photo). Finally, it was announced that SLAC and LBL have agreed to share the organisation of the CHEP 2016 conference, which will take place in autumn 2016 in California.