A couple of months ago, AMVA4NewPhysics, a Horizon2020-funded Marie Skłodowska-Curie Innovative Training Network (ITN) in which CERN participated as one of the member institutions, completed its four-year life cycle. With the ultimate objective of finding ways to improve the measurement and search sensitivity of the ATLAS and CMS experiments at the LHC, AMVA4NewPhysics, under the scientific coordination of Dr. Tommaso Dorigo (INFN, Padova), focused on the study of advanced multivariate analysis methods for High Energy Physics (HEP). Through the individual and collaborative work of its members, aimed at broadening the base of research and enhancing its innovation, AMVA4NewPhysics achieved this goal: it led the development and optimisation of several promising Machine Learning (ML) tools for use by the HEP experiments, while meeting and advancing the key principles that distinguish a Marie Skłodowska-Curie programme.
The work performed within AMVA4NewPhysics, parts of which have been publicly presented by its members at conferences, workshops and seminars, may be viewed as resting on four main pillars: (1) customization and optimization of advanced Statistical Learning (SL) tools for the precise measurement of the properties of the Higgs boson, (2) development of new SL algorithms to achieve higher sensitivity in physics analyses for targeted and global New Physics searches, (3) improvement of the Matrix Element Method through the addition of new tools that extend its applications, as in the Higgs measurements, and (4) development of new SL algorithms for use in HEP analyses, from modeling methods to anomaly detection methods in model-independent searches.
One of the main studies conducted within the first work package is the application of Machine Learning techniques to the signal versus background classification problem for the Higgs boson decay to a pair of tau leptons [1-2]. This study was inspired by the 2014 ‘Higgs Boson ML Challenge’, a benchmark competition comparing the applicability of different ML approaches to HEP datasets. Many tests were made to assess the extent to which alternative and more recent ML techniques could improve on the performance of the then winning solution. Several modifications to the Neural Networks used for the tests were considered, for example in the choice of activation function and learning rate, and in the use of ensembling and data augmentation. The proposed solution was eventually found to improve substantially on the competition’s winning one, not only in the performance metric but also in training and inference time and in the hardware required.
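Ensembling, one of the modifications mentioned above, can be illustrated with a minimal sketch. It is not the code used in the study [1-2]: the function name `ensemble_predict`, the optional weights and the numbers are hypothetical, and the only assumption is that each trained network returns a signal probability per event.

```python
import numpy as np

def ensemble_predict(member_probs, weights=None):
    """Combine per-event signal probabilities from several models.

    member_probs: array of shape (n_models, n_events).
    weights: optional per-model weights (hypothetical, e.g. validation scores).
    """
    member_probs = np.asarray(member_probs, dtype=float)
    if weights is None:
        weights = np.ones(member_probs.shape[0])  # plain average
    weights = np.asarray(weights, dtype=float)
    # Weighted mean over the model axis.
    return weights @ member_probs / weights.sum()

# Hypothetical outputs of three networks for four events.
member_probs = [
    [0.9, 0.2, 0.6, 0.4],
    [0.8, 0.1, 0.4, 0.7],
    [0.7, 0.3, 0.5, 0.4],
]
print(ensemble_predict(member_probs))
print(ensemble_predict(member_probs, weights=[2.0, 1.0, 1.0]))
```

Averaging tends to smooth out the fluctuations of individual networks, which is one common motivation for using an ensemble rather than a single model.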
Another significant result, connected with the second work package, is the application of Deep Neural Networks to multiclass classification for heavy flavour tagging at the CMS experiment. The identification of jets originating from b and c quarks plays a crucial role in the sensitivity of physics analyses that perform New Physics searches or precision studies. The proposed taggers DeepCSV, DeepFlavour and DeepJet [3-8], three versions of a common generic approach, significantly outperform the standard CMS identifiers in all transverse momentum regions, offering a notable gain in b/c jet efficiency at a given misidentification probability for jets of other origins. The key steps in the evolution of these tagger versions, which are currently recommended in CMS, and the main differences among them, are more input variables describing the jet constituents, a deeper Neural Network to process the information, and a NN model that treats the jet structure as if it were an image.
Figure: b-jet efficiency versus misidentification probability for c-jets and for uds- and gluon-jets in simulated events, requiring a minimum transverse momentum of 30 GeV. Top: comparison of DeepCSV with CSVv2 and cMVAv2 [4]. Bottom: comparison of DeepFlavour with DeepCSV and noConv (a DeepFlavour approach with no convolutional layers) [5].
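Each point on an efficiency versus misidentification curve of this kind corresponds to one threshold on the tagger output. The sketch below shows how such a point could be computed from labelled simulated jets. The function `tagging_rates` and the six toy scores are hypothetical and do not reproduce any CMS tool or result.

```python
import numpy as np

def tagging_rates(scores, is_b, threshold):
    """Return (b-jet efficiency, misidentification probability) at one threshold."""
    scores = np.asarray(scores, dtype=float)
    is_b = np.asarray(is_b, dtype=bool)
    tagged = scores > threshold
    efficiency = tagged[is_b].mean()   # fraction of true b-jets that are tagged
    misid = tagged[~is_b].mean()       # fraction of non-b jets wrongly tagged
    return efficiency, misid

# Hypothetical discriminator outputs for six simulated jets.
scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1]
is_b = [True, True, True, False, False, False]

# Scanning the threshold traces out the curve point by point.
for threshold in (0.3, 0.5, 0.75):
    eff, misid = tagging_rates(scores, is_b, threshold)
    print(f"threshold={threshold:.2f}  efficiency={eff:.2f}  misid={misid:.2f}")
```

A better tagger reaches a higher efficiency at the same misidentification probability, which is the sense in which one curve outperforms another.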
A third indicative example of the rich scientific outcome of the AMVA4NewPhysics ITN lies in the domain of Matrix Element Method (MEM) applications. The MEM is an alternative to the ML techniques that are applied to LHC datasets to discover the structure of the data and compare it with theory: it starts instead from theory, evaluates the probability of each experimental event under a given hypothesis, and then measures the compatibility of that hypothesis with the experimental data. Although no NN training is required in this case, the numerical integration that the method needs is intrinsically complex and imposes its own constraints on computing time. The MoMEMta software package [9-12] addresses this through its parametrisation of the phase space and the consequent change of integration variables, together with additional technical features. It provides a fast, modular and user-friendly way to tackle MEM problems, and has demonstrated its functionality in several use-cases, as in the Higgs measurements.
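The benefit of a change of integration variables can be seen in a one-dimensional toy, sketched below under strong simplifying assumptions. It does not use the MoMEMta API: a Breit-Wigner (Cauchy) line shape stands in for the squared matrix element, a Gaussian stands in for the detector transfer function, and all names and parameter values are hypothetical. Substituting y = mass + (width/2) tan(t) absorbs the sharp resonance peak into the Jacobian, so that uniform sampling in t is enough.

```python
import numpy as np

def gaussian(x, sigma):
    """Toy transfer function: Gaussian detector resolution."""
    return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def toy_mem_weight(x_obs, mass, width, sigma, n_samples=100_000, seed=0):
    """Monte Carlo estimate of  integral dy BW(y; mass, width) * G(x_obs - y).

    With y = mass + (width / 2) * tan(t), t uniform in (-pi/2, pi/2),
    BW(y) dy becomes dt / pi, so the integral is the mean of G over t.
    Returns the estimate and its statistical uncertainty.
    """
    rng = np.random.default_rng(seed)
    gamma = 0.5 * width
    t = rng.uniform(-0.5 * np.pi, 0.5 * np.pi, size=n_samples)
    y = mass + gamma * np.tan(t)          # "true" quantity before smearing
    values = gaussian(x_obs - y, sigma)   # transfer function only
    return values.mean(), values.std(ddof=1) / np.sqrt(n_samples)

# An observation on the resonance is far more probable than one in the tail.
w_peak, err_peak = toy_mem_weight(125.0, mass=125.0, width=4.0, sigma=10.0)
w_tail, err_tail = toy_mem_weight(200.0, mass=125.0, width=4.0, sigma=10.0)
print(f"weight at peak: {w_peak:.5f} +/- {err_peak:.5f}")
print(f"weight in tail: {w_tail:.6f} +/- {err_tail:.6f}")
```

In a realistic case the integral runs over a multi-dimensional phase space with several resonances, which is where a systematic choice of integration variables becomes important.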
Photo: several of the ESRs attending the statistics lecture of Prof. Gilles Louppe, organised by AMVA4NewPhysics during the Network's Workshop in Athens in June 2018.
Moving to the fourth research pillar of this ITN, a number of multivariate algorithms were developed for analysis tasks in Higgs physics and new particle searches. In particular, Inverse Bagging, a novel model-independent method for New Physics searches, has been proposed [13-15]. As the related data may be divided into two categories, simulated (labelled) and experimental (unlabelled), a semi-supervised anomaly detection problem arises. Besides performing hypothesis testing, this method samples the data multiple times, which eventually allows observations to be classified as signal-like or background-like: the information from the individual anomalous properties of the observations, together with the scores that each observation accumulates over the sampling iterations, indicates how likely it is that an observation was generated by a signal. The method has been tested in several ways and compared with other methods. Inverse Bagging shows a generally satisfactory performance, and therefore has the potential to become a useful tool for anomaly detection problems.
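The idea of scoring observations through repeated sampling and hypothesis testing can be sketched as follows. This is a loose illustration of the description above, not the published algorithm or its software implementation [13-15]: the two-sample Kolmogorov-Smirnov statistic, the sample size, the number of iterations and the choice of score (the mean test statistic over the bootstrap samples that contain an observation) are all assumptions made for this example.

```python
import numpy as np

def ks_statistic(sample, reference_sorted):
    """Two-sample Kolmogorov-Smirnov distance between empirical CDFs."""
    sample = np.sort(sample)
    grid = np.concatenate([sample, reference_sorted])
    cdf_s = np.searchsorted(sample, grid, side="right") / sample.size
    cdf_r = np.searchsorted(reference_sorted, grid, side="right") / reference_sorted.size
    return np.max(np.abs(cdf_s - cdf_r))

def inverse_bagging_scores(data, reference, n_iterations=1000, sample_size=20, seed=0):
    """Score each unlabelled observation by the samples it takes part in.

    data: unlabelled observations (1-D); reference: simulated background (1-D).
    """
    data = np.asarray(data, dtype=float)
    reference_sorted = np.sort(np.asarray(reference, dtype=float))
    rng = np.random.default_rng(seed)
    score_sum = np.zeros(data.size)
    counts = np.zeros(data.size)
    for _ in range(n_iterations):
        # Bootstrap sample of the data, tested against the background model.
        idx = rng.choice(data.size, size=sample_size, replace=True)
        stat = ks_statistic(data[idx], reference_sorted)
        members = np.unique(idx)
        score_sum[members] += stat   # every member inherits the sample's statistic
        counts[members] += 1
    # Observations never drawn keep a score of zero.
    return score_sum / np.maximum(counts, 1)

# Toy data: 160 background-like and 40 signal-like observations (hypothetical).
rng = np.random.default_rng(1)
reference = rng.normal(0.0, 1.0, size=2000)
data = np.concatenate([rng.normal(0.0, 1.0, size=160), rng.normal(4.0, 0.5, size=40)])
scores = inverse_bagging_scores(data, reference)
print("mean score, background-like:", scores[:160].mean())
print("mean score, signal-like:    ", scores[160:].mean())
```

Observations that repeatedly appear in samples that look unlike the background accumulate higher scores, so a cut on the score can separate signal-like from background-like observations in this toy setting.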
The material corresponding to the entire work performed within this ITN, including the aforementioned studies and the results obtained, has been submitted in the form of several dedicated documents that are available on the AMVA4NewPhysics website. This outcome was made possible within, and thanks to, a strongly diverse Network: AMVA4NewPhysics comprised an almost fully gender-balanced core of ten Early-Stage Researchers (ESRs) of several nationalities, who pursued their Ph.D. in different universities and countries, and a number of eminent scientists from the academic and non-academic sectors, who shared their knowledge and expertise in the fields of HEP, Statistics and Data Science. Throughout the Network’s duration, and via the multiple workshops and extensive training sessions that took place, all ESRs interacted closely with those experts while being exposed to various research and working environments. These interdisciplinarity and mobility features fostered a fruitful combination of the insight and the skills that were acquired during our training experience, while the dissemination and outreach activities that ran in parallel communicated the programme’s outcome directly to fellow scientific groups and to the wider public, respectively.
In summary, AMVA4NewPhysics delivered important advancements in Multivariate Analysis and Machine Learning tools for High Energy Physics at the LHC, and in so doing produced an ideal training environment for PhD students. The ESRs were provided not only with extensive expertise to use during their PhD studies, but also with the appropriate skill set to continue conducting key ML research afterwards, both for HEP experiments and for other applications. Indeed, several of the fellows have already obtained their PhD and continue related research, holding attractive positions in academia and outside of it.
Further Reading
[1] Giles Strong* ‘Recent developments in deep-learning applied to open HEP data’, https://github.com/GilesStrong/QCHS-2018
[2] Public Deliverable D1.4 ‘Classification and Regression Tools in Higgs Measurements’
[3] Anna Stakia on behalf of the CMS Collaboration ‘Jet flavour tagging using Deep Learning in the CMS experiment’, 6th International Conference on New Frontiers in Physics ICNFP 2017, https://indico.cern.ch/event/559774/contributions/2661212/attachments/1513835/2361654/deepJetTaggerPoster_ICNFP17_AnnaStakia.pdf
[4] CMS Collaboration ‘Heavy flavor identification at CMS with deep neural networks’, CMS DP–2017/005
[5] CMS Collaboration ‘CMS Phase 1 heavy flavour identification: performance and developments’, CMS DP–2017/013
[6] CMS Collaboration ‘New Developments for Jet Substructure Reconstruction in CMS’, CMS DP–2017/027
[7] Markus Stoye, Jan Kieseler, Mauro Verzetti, Huilin Qu, Loukas Gouskos, Anna Stakia and CMS Collaboration ‘DeepJet: Generic physics object based jet multiclass classification for LHC experiments’, Deep Learning for Physical Sciences Workshop, 31st Conference on Neural Information Processing Systems NeurIPS (NIPS) 2017, https://dl4physicalsciences.github.io
[8] Public Deliverable D2.1 ‘Report on studied SL methods for targeted and global searches of new physics’
[9] Sébastien Brochet, Christophe Delaere, Brieuc François, Vincent Lemaître, Alexandre Mertens, Alessia Saggio, Miguel Vidal Marono, Sébastien Wertz ‘MoMEMta, a modular toolkit for the Matrix Element Method at the LHC’, https://arxiv.org/pdf/1805.08555.pdf
[10] Public Deliverable D3.1 ‘MoMEMta: a C++ Package to evaluate MEM weights for arbitrary processes in the SM and beyond’
[11] Public Deliverable D3.2 ‘Online Web Documentation of MoMEMta C++ Package’
[12] Public Deliverable D3.3 ‘Publication on MEM and its Implementations’
[13] Pietro Vischia, Tommaso Dorigo ‘The Inverse Bagging Algorithm: Anomaly Detection by Inverse Bootstrap Aggregating’, https://arxiv.org/pdf/1611.08256.pdf
[14] Public Deliverable D4.4 ‘Report on the Inverse Bagging algorithm’
[15] Public Deliverable D4.5 ‘Software Implementation of the Inverse Bagging algorithm’
*AMVA4NewPhysics members in bold