
PHYSTAT-nu 2019 held at CERN

Statistics has gained increasing importance and undergone significant development in particle physics over the past 25 years. More sophisticated methods have been developed to extract the maximum information from the recorded data and to make reliable assessments of the significance of measurements and observations. The community expects the experiments to apply proper statistical scrutiny to their data before advancing claims of astonishing new observations. Entertaining examples, not limited to statistics issues but also covering situations one would prefer to avoid, are recalled in a recent historical review paper [1]. Neutrino experiments have so far often dealt with small event samples, which may lead to specific statistical requirements. Future long-baseline and reactor experiments, however, are expected to collect considerably larger data samples, requiring new paradigms for treating statistical questions and a further reduction of the systematic uncertainties.

The PHYSTAT series of workshops [2] deals with the statistical issues that arise in analyses in High Energy Physics (and at times in Astroparticle Physics). The series started in 2000 and workshops have been organized at semi-regular intervals. In recent years these workshops have addressed the problems of specific communities, such as those of collider and neutrino physics. In 2016 two workshops dedicated to neutrino physics were held, one in Japan and one in the US [3,4]. With the advent of the new CERN neutrino group in 2016 and the preparation for upcoming new experiments and upgrades, it was considered timely to hold a new PHYSTAT workshop on neutrinos at CERN, to bring together proponents from the different communities, review the status of the field and discuss potentially interesting future directions.

PHYSTAT-nu 2019 [5] was held at CERN from 22 to 25 January and counted about 130 registered participants. Since CERN is the home of the LHC, the venue also made it convenient to attract collider experts to share their experience with statistics issues, which was an integral part of the program. Furthermore, as is traditional for this workshop series, it brought together physicists and statisticians who are well aware of the challenges of high energy experiments. The Local Organizing Committee, composed of Olaf Behnke, Louis Lyons, Albert de Roeck and Davide Sgalaberna, was in charge of organizing the workshop and set the scientific program, assisted by a scientific committee with experts from around the world.

The workshop started with training lectures on statistics, given by Louis Lyons and Glen Cowan and attended by a large audience, and Jim Berger gave a very interesting talk on “Bayesian techniques”. There were also excellent introductory talks on neutrino physics and statistics from Alain Blondel and Yoshi Uchida.

The workshop focused on the statistical tools used in data analyses, rather than on experimental details and results. Topics included, among others, using data to extract model parameters and to referee between models, setting limits, defining discovery criteria, determining and using systematic uncertainties, unfolding, and machine learning for event reconstruction and classification.

Speakers and attendees from both neutrino and collider physics, as well as statisticians (Jim Berger, Anthony Davison, Mikael Kuusela, Chad Shafer, David van Dyk and Victor Panaretos), discussed their experience with the different statistical issues. First, a taste of the tools used in neutrino experiments (reactor, accelerator, atmospheric, solar, cosmic, global fits, etc.) was presented. The core of the workshop consisted of three main topical sessions of general interest: Systematic Uncertainties, Unfolding and Machine Learning. Each session comprised an introductory talk given by an expert in the field, a talk reporting on the collider experience and, finally, a talk on the experience gained by neutrino experiments. The goal was to exploit the synergy between the different communities and encourage fruitful discussions.

The talks and discussions helped to clarify the issues faced by neutrino experiments. One fundamental challenge is the poor understanding of neutrino interactions with nuclei in the detectors. The challenges faced by the NOvA and T2K experiments were reported: although both rely on a near and a far detector to compare the un-oscillated and oscillated neutrino fluxes directly, a full cancellation of the systematic uncertainties is very hard to achieve, owing to the different detector acceptances, possibly different detector technologies, and the different compositions of neutrino interaction modes. Thus, building a total likelihood that does not rely on any model is not possible. Another major issue is posed by the “unknown unknowns”. As it is hard to predict the level of understanding of neutrino interactions 10 years from now, the future long-baseline experiments must design their near detectors very carefully to achieve the precision required for the measurement of the neutrino CP-violating phase and mass ordering. The NOvA and T2K collaborations are moving toward combining their data to improve the sensitivity to the CP-violating phase and mass ordering. An analogous approach has been used successfully for ATLAS and CMS data at the LHC, and a detailed description of the statistical methods and tools used in that analysis was presented at the meeting.

Unfolding is a common tool used to measure neutrino-nucleus cross sections and to provide data-driven inputs to the theoretical models. It corrects the measured kinematic observables of final-state particles, such as muons, for the effects of detector acceptance and smearing. A full session was dedicated exclusively to this topic. Regularization methods are usually applied to smooth the fluctuations produced by unfolding procedures. Physicists and statisticians agreed that unregularized results should also be published.
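As a rough illustration of why regularization is used, and of what an unregularized result looks like next to it, the following toy sketch unfolds a smeared spectrum in two ways. Everything in it is an assumption made for illustration and not something presented at the workshop: the tridiagonal response matrix, the `spread` and `efficiency` values, the falling toy spectrum, and the choice of a Tikhonov (second-difference) penalty with strength `tau`.

```python
import numpy as np

def smearing_matrix(n_bins, spread=0.25, efficiency=0.9):
    """Toy response matrix: R[i, j] = P(reconstructed in bin i | true in bin j).

    Hypothetical detector: each event migrates to a neighbouring bin with
    probability `spread`, and only a fraction `efficiency` is accepted.
    """
    R = np.zeros((n_bins, n_bins))
    for j in range(n_bins):
        R[j, j] = 1.0 - 2.0 * spread
        if j > 0:
            R[j - 1, j] = spread
        if j < n_bins - 1:
            R[j + 1, j] = spread
    return efficiency * R

def second_difference(n_bins):
    """Discrete curvature operator used as the smoothness penalty."""
    L = np.zeros((n_bins - 2, n_bins))
    for k in range(n_bins - 2):
        L[k, k:k + 3] = [1.0, -2.0, 1.0]
    return L

def unfold_unregularized(R, observed):
    """Plain matrix inversion: unbiased, but amplifies statistical noise."""
    return np.linalg.solve(R, observed)

def unfold_tikhonov(R, observed, tau):
    """Minimise |R x - observed|^2 + tau * |L x|^2 (Tikhonov regularization)."""
    L = second_difference(R.shape[1])
    return np.linalg.solve(R.T @ R + tau * (L.T @ L), R.T @ observed)

def roughness(x):
    """Sum of squared second differences of a spectrum."""
    return float(np.sum((second_difference(len(x)) @ x) ** 2))

# Toy experiment: smooth falling spectrum, folded and Poisson-fluctuated.
n_bins = 10
truth = 1000.0 * np.exp(-np.arange(n_bins) / 4.0)
R = smearing_matrix(n_bins)
rng = np.random.default_rng(0)
observed = rng.poisson(R @ truth).astype(float)

plain = unfold_unregularized(R, observed)
smooth = unfold_tikhonov(R, observed, tau=0.1)
print("roughness, unregularized:", roughness(plain))
print("roughness, regularized:  ", roughness(smooth))
```

The regularized spectrum is smoother by construction, at the price of a bias that depends on `tau`. The unregularized spectrum, together with its covariance, keeps the information needed for others to compare models without inheriting that choice.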

Another hot topic was discussed during the Machine Learning session. In recent years, neutrino detectors have been developed to provide a more detailed view of the neutrino interaction event. At the same time, new “Deep Learning” techniques have been developed and are now widely used in neutrino experiments. For instance, thanks to the recently available many-layer neural networks, it has become easier to exploit the full information provided by the detector and to improve the identification of the final-state particles. An analogous approach is used at the LHC, for example to discriminate beauty-quark jets from those produced by lighter quarks.

At the end of each day a session was dedicated to summarizing the statistical issues tackled in the talks. The session was directed by Tom Junk, who raised the important points for data analyses in neutrino experiments and triggered discussion between physicists and statisticians, ranging from general questions such as Bayesian versus frequentist approaches to more practical ones, such as the determination of the neutrino mass ordering.

David van Dyk and Kevin McFarland gave lively summary talks, pointing out in particular specific issues for the neutrino physics community to address in the coming years.

During the three days, clear connections were made between the communities. In general the workshop was highly appreciated by the participants, and plans were made for a follow-up PHYSTAT-nu meeting to resolve further outstanding statistical issues. An important step for current and future neutrino experiments could be to set up statistics committees, as was done at the Tevatron and, more recently, at the LHC experiments. This PHYSTAT-nu workshop could be the first real step towards such a necessary and very exciting scenario.

 

REFERENCES

[1] Maury C. Goodman, arXiv:1901.07068

[2] PHYSTAT workshop series, https://espace.cern.ch/phystat

[3] PHYSTAT-nu at IPMU, Japan, http://indico.ipmu.jp/indico/event/82/

[4] PHYSTAT-nu at Fermilab, https://indico.fnal.gov/conferenceDisplay.py?confId=11906

[5] PHYSTAT-nu at CERN, https://indico.cern.ch/event/735431/