PHYSTAT - Systematics Workshop

Olaf Behnke (DESY), Richard Lockhart (Simon Fraser University), Louis Lyons (Imperial College London) and Nick Wardle (Imperial College London) 13th Dec 2021

The PHYSTAT-Systematics meeting, held remotely in early November, was the latest in a series of PHYSTAT events. Begun in 2000, these were the first Workshops focused on statistical issues in particle physics. They started at CERN, which has since hosted three further meetings. An important feature of PHYSTAT events is the participation of professional statisticians. Since April 2019, we have also had PHYSTAT Seminars, given by statisticians or by physicists with a particular interest in statistics. PHYSTAT Workshops initially dealt with a wide range of statistical issues, but later became more focused. Thus the latest meeting concentrated on the way systematic effects are incorporated in a range of particle physics analyses.

WHAT ARE SYSTEMATICS?

Whenever we perform an analysis of our data, whether measuring a physical quantity of interest or testing some hypothesis (for example, the existence of SUperSYmmetric particles), it is necessary to assess the accuracy of our result. There are two types of uncertainty: statistical and systematic. Statistical uncertainties arise from the limited accuracy with which we can measure anything, or from the natural Poisson fluctuations involved in counting independent events (for example, the number of asteroids hitting the moon per year). They have the property that repeated measurements result in a distribution of results, and the uncertainty on the average of n independent measurements generally decreases as n increases, typically in proportion to 1/√n.
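The improvement with n can be illustrated with a small simulation. The following Python sketch is not taken from the meeting; it assumes Gaussian measurement noise of unit width, and the function name spread_of_mean is purely illustrative. It compares the observed spread of the average of n measurements with the 1/√n behaviour expected for independent measurements.

```python
import math
import random
import statistics


def spread_of_mean(n, sigma=1.0, trials=2000, seed=1):
    """Empirical spread of the average of n Gaussian measurements.

    The experiment (n measurements, then average) is repeated `trials`
    times, and the standard deviation of those averages is returned.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.stdev(means)


# Columns: n, simulated spread of the average, 1/sqrt(n) expectation
for n in (1, 4, 16, 64):
    print(n, round(spread_of_mean(n), 3), round(1.0 / math.sqrt(n), 3))
```

Each time n is multiplied by four, the spread of the average should roughly halve.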

Systematic uncertainties arise from many sources, as can be illustrated by the example of a simple pendulum, a physics experiment that many of us have performed at school. The local acceleration due to gravity (g) is determined from the length of the pendulum L and its oscillation period T.

The statistical measurement uncertainties are in L and T, and there are corresponding systematics from the calibrations of the ruler and the clock. But there are more systematics. For example, the formula T = 2π√(L/g) assumes, among other things, that the swing amplitude θ is small. If this is not so, some correction should be applied. The uncertainty in this correction is a systematic. These systematics may not cause a spread in results when we repeat the experiment several times, but merely shift them away from the true value. Accumulating more data usually does not reduce the magnitude of a systematic effect.
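The pendulum example can be sketched numerically. The Python snippet below is an illustration only: the values of g, L, the swing amplitude and the timing noise are assumptions chosen for the demonstration, and the finite-amplitude correction is taken to leading order, T ≈ T0 (1 + θ²/16) with θ in radians and T0 = 2π√(L/g). It shows that averaging more period measurements shrinks the scatter of the small-angle estimate of g but leaves it shifted, while applying the correction removes most of the shift.

```python
import math
import random
import statistics

G_TRUE = 9.81    # m/s^2, assumed "true" value used to generate the data
LENGTH = 1.0     # m, assumed pendulum length
THETA0 = 0.3     # rad, assumed swing amplitude (not small)
SIGMA_T = 0.01   # s, assumed Gaussian timing noise per measurement


def true_period(length, g, theta0):
    """Period including the leading-order finite-amplitude correction."""
    t0 = 2.0 * math.pi * math.sqrt(length / g)
    return t0 * (1.0 + theta0**2 / 16.0)


def g_small_angle(length, period):
    """Invert T = 2*pi*sqrt(L/g), ignoring the amplitude."""
    return 4.0 * math.pi**2 * length / period**2


def g_corrected(length, period, theta0):
    """Remove the leading-order amplitude effect before inverting."""
    return g_small_angle(length, period / (1.0 + theta0**2 / 16.0))


def mean_g(n, estimator, seed=2):
    """Average the estimate of g over n noisy period measurements."""
    rng = random.Random(seed)
    period = true_period(LENGTH, G_TRUE, THETA0)
    return statistics.fmean(
        [estimator(rng.gauss(period, SIGMA_T)) for _ in range(n)]
    )


# Columns: n, small-angle estimate, amplitude-corrected estimate
for n in (10, 1000, 100000):
    naive = mean_g(n, lambda t: g_small_angle(LENGTH, t))
    fixed = mean_g(n, lambda t: g_corrected(LENGTH, t, THETA0))
    print(n, round(naive, 4), round(fixed, 4))
```

In this toy setting the small-angle estimate settles about 1% below the generated value however large n becomes. In a real measurement the amplitude itself is known only approximately, and the resulting uncertainty in the correction is what would be quoted as the systematic.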

In general, estimating systematic uncertainties requires much more effort than estimating statistical ones, and more personal judgement and skill are involved. Furthermore, statistical uncertainties in different analyses are usually independent; this is often not so for systematics.

In particle physics analyses, the main statistical uncertainties are from the Poisson fluctuations on the observed numbers of events in various categories. Many of the systematics are closely related to detector and analysis effects. Examples include trigger efficiency; jet energy scale and resolution; identification of different particle types; and the strength of backgrounds and their distributions. There are also theoretical uncertainties, which affect the predicted values that are compared with measured ones and can also influence the experimental variables extracted from the data. Another systematic comes from the intensity of the accelerator’s beam(s) (the ‘integrated luminosity’ at the Large Hadron Collider). This is likely to be correlated for the various measurements made using the same beam(s).
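The practical effect of such a correlation can be seen in a toy combination. The Python sketch below is illustrative only: the function average_uncertainty and the numbers are assumptions, and it treats the simple case of an unweighted average of measurements that share one systematic source, which is taken to be either fully correlated or fully uncorrelated between them.

```python
import math


def average_uncertainty(stat, syst, correlated):
    """Uncertainty on the unweighted mean of several measurements.

    stat:       independent statistical uncertainties, one per measurement
    syst:       systematic uncertainties from one shared source
    correlated: True if that source is fully correlated across measurements
    """
    n = len(stat)
    # Independent contributions add in quadrature.
    stat_part = math.sqrt(sum(s**2 for s in stat)) / n
    if correlated:
        # Fully correlated contributions add linearly.
        syst_part = sum(syst) / n
    else:
        syst_part = math.sqrt(sum(s**2 for s in syst)) / n
    return math.sqrt(stat_part**2 + syst_part**2)


# Columns: number of measurements, total if correlated, total if uncorrelated
for n in (1, 4, 16):
    stat = [0.3] * n   # assumed statistical uncertainty of each measurement
    syst = [0.2] * n   # assumed systematic uncertainty of each measurement
    print(n,
          round(average_uncertainty(stat, syst, True), 3),
          round(average_uncertainty(stat, syst, False), 3))
```

With a fully correlated source, the systematic part of the uncertainty on the average stays at the single-measurement value however many measurements are combined; only the statistical part shrinks.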

PHYSTAT-SYSTEMATICS

Two big issues for systematics are how the magnitudes of the different sources are estimated, and how they are then incorporated in the analysis. This meeting concentrated on the latter, as it was thought that this was more likely to benefit from the presence of statisticians.

Our meeting started with two introductory talks, one by a physicist and the other by a statistician. Some of the feedback suggested that more introductory material would have been useful. Indeed, some previous PHYSTAT meetings have had an ‘optional’ day of useful background material before the start of the meeting itself.

The 17 talks that followed fell into three categories. The first group was devoted to analyses in different areas of particle physics: the large experiments at the LHC; neutrino oscillation experiments; dark matter searches; and flavour physics. The second group was organised more by theme: theoretical systematics; unfolding; mismodelling; some of the many aspects that arise in using machine learning; and an appeal for experiments to publish their likelihood functions. Finally, there was a series of short talks by statisticians.

At the end of 5 of the 7 sessions, Response talks were given, mostly by statisticians. It was valuable to have insights from a different viewpoint on the largely experimental talks. Such responses were not a common feature of previous PHYSTAT Workshops.

A novel feature of this remote meeting was that the summary talks were a week later, to give the speakers (physicist Nick Wardle and statistician Sara Algeri) a longer time to prepare their talks. Wardle’s was an excellent survey of the different ways that systematics are included in our analyses, while Algeri called for improved interaction between physicists and statisticians in dealing with these interesting issues.

STATISTICIANS

We very much appreciated the involvement of many statisticians in PHYSTAT-Systematics (this was the largest number at any PHYSTAT meeting), and the efforts that they made to understand our intricate analyses and the statistical procedures that we used. But we really did miss the benefit of being able to chat with them informally, as happens at live meetings. This was partially compensated for by the various talks that they gave. As well as the introduction by David van Dyk and Algeri’s summary, the talk on unfolding was by Michael Kuusela, and Brad Efron spoke about the bootstrap. There were also shorter contributions related to systematics from Jim Berger, Richard Lockhart, Tudor Manole, Xiao-Li Meng and Larry Wasserman; they provided new ideas for us to consider.

FUTURE ACTIVITY

Systematics is an immense topic, and it was clear that one meeting spread over 4 afternoons was not going to solve all the issues. Ongoing activity may include PHYSTAT Seminars with more talks by statisticians; discussions from fields outside particle physics; description of particular interesting analyses; how the magnitudes of systematic sources are estimated; etc. The organisers welcome further suggestions.

The general conclusion was that a large amount of relevant information was discussed, with interesting differences in the separate sub-fields of particle physics; and that statisticians made valuable contributions to the topic. It was a good first step on the path towards having a systematic approach to systematics.

Further information about the meeting is available at the Workshop’s website [1]. It contains the slides and videos for the whole meeting, plus a lot of material on the subject of systematics. There is separate introductory material for statisticians and for particle physicists. The PHYSTAT homepage [2] has links to all the PHYSTAT meetings, and to PHYSTAT Seminars.

Further Reading

[1] https://indico.cern.ch/event/1051224/

[2] https://espace.cern.ch/phystat/_layouts/15/start.aspx#/SitePages/Home.aspx
