Machine learning for new detector technologies

by Sandro Marchioro (CERN, ESE group)

I spent a considerable amount of my time as a Master's student in Physics in the late '70s performing a fairly prosaic function that nevertheless required the presence of humans (physics professors, trained technicians and of course students): scanning thousands of BEBC pictures with (occasional) neutrino events, a task that today would simply be called “pattern recognition”.

The task consisted of looking more or less simultaneously at several images showing the inside of the BEBC chamber, where charged particles had deposited a very volatile and ethereal image of their own passage in the form of tracks made of tiny gas bubbles that could be photographed with stereo cameras. A good event coming from a neutrino interaction had to have its collision point within a fiducial volume well inside the BEBC cylinder. Computers in the '70s were already sufficiently powerful that, once a potentially good track was pointed out to them, its precise geometry could be calculated without too much trouble. But the very act of recognising such tracks as seen from a set of stereo cameras was still only in the realm of human pattern recognition. Instructions for human scanners were reasonably simple and easy to perform: search for a vertex within the fiducial volume with so and so many outgoing tracks with more or less this bending; if found, use a sort of primitive mouse device to select certain points on the tracks in all available projections (this was the tricky part), ask the computer to work out whether such a hypothesis was consistent with having chosen the right tracks in all projections, and finally, if everything went well, let the computer calculate the kinematics of the event. Repeat.

The first week of such a task was very exciting: red LEDs flashing in the computer room, a smooth and regular noise coming from the stepping motors used to move the huge rolls of high-resolution film, even air conditioning. Surely my human nature made me dream that on those pictures I could possibly make some great discovery; nevertheless, after several thousand pictures a day, drudgery set in fast and my brain started wandering in other directions.

Without the shadow of a doubt, today this function could be performed by a computer appropriately programmed to execute one of the many “machine learning”, “neural network” or “artificial intelligence” (pick one) algorithms that promise to change the way machines interact with sensors and actuators in so many application fields. Today millions of images of far greater complexity than those simple BEBC pictures are analysed daily, in real time, for all sorts of purposes using CPUs, GPUs or other high-power vector or tensor units, all trying to imitate in a more or less “brain-inspired” way what we humans perform with such apparent ease.

A huge design community with expertise spanning computer science and electronics is currently working on moving some of the algorithms (or parts thereof) used in such applications to more energy-efficient hardware. After all, we humans continuously spend on the order of just 10 W to see, recognise, analyse, store, elaborate, interpret and correlate images and their implications, yet our best machines are still orders of magnitude away from such performance, so the opportunities for improvement are huge.

While people carefully avoid the use of the two words “artificial intelligence”, what they are trying to achieve is precisely to imitate, at least at some level, the synthesis capability of the human brain by recognising and predicting behaviours from information rich in detail but also in redundancy and noise. Surely the definition of true “intelligence” is sufficiently vague that philosophical discussions on the matter will continue for years to come, but even a fairly primitive self-driving car system is, in my humble opinion, far more intelligent than what I was doing during (part of) my Master's thesis.

Many modern machine learning algorithms run on computers with almost infinite digital precision, but researchers have quickly realised that many algorithms are very undemanding in terms of precision (exactly as we can recognise a face after years of ageing, or within a very noisy, discoloured or partial image). This suggests that certain features could be computed with more imprecise but potentially much lower-power “analog” circuits without loss of accuracy in the final result.
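As a toy illustration of this tolerance (a sketch of my own, with made-up numbers, not an example from any specific system), the snippet below compares a full-precision dot product, the basic operation inside most neural networks, with the same computation carried out on values crudely rounded to 8 bits:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy "feature detector": the dot product between an input patch and a
# weight vector, first in full float64 precision, then with 8-bit values.
x = rng.normal(size=256)
w = rng.normal(size=256)

def quantize(v, bits=8):
    """Uniformly quantize v to the given number of bits (symmetric range)."""
    scale = np.max(np.abs(v)) / (2 ** (bits - 1) - 1)
    return np.round(v / scale).astype(np.int32), scale

xq, sx = quantize(x)
wq, sw = quantize(w)

exact = float(x @ w)                # full-precision result
approx = float(xq @ wq) * sx * sw   # pure integer dot product, rescaled

# Error expressed relative to the natural scale of the computation.
rel_err = abs(exact - approx) / (np.linalg.norm(x) * np.linalg.norm(w))
print(f"exact = {exact:.4f}, 8-bit = {approx:.4f}, relative error = {rel_err:.2e}")
```

The integer product loses almost nothing here, which is why dedicated inference hardware routinely drops to 8 bits or below; the inner multiply-accumulate is exactly the operation one would hand over to an imprecise analog circuit.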

The development of algorithms cannot be decoupled from the development of suitable architectures. Essentially all modern computers are built around the separation of processing and (several levels of) memory storage, while in the human brain processing is so embedded in the operation of memory that nobody has yet been able to separate the two. This hints that, while suitable for initial explorations, standard computers or even fancy tensor units may in the long term not be the best tools for widely deployed machine learning. In-memory processing may be the way to go, and many academic and commercial researchers are looking into new materials, circuits and architectures to implement such ideas, with many impressive results such as those shown at recent conferences and workshops on these subjects.
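One popular incarnation of in-memory processing is the analog resistive crossbar: weights are stored as conductances, inputs are applied as voltages, and Ohm's and Kirchhoff's laws deliver the matrix-vector product "for free" as currents summed on each output line. A minimal numerical sketch follows (a deliberately simplified model with invented parameters, not any particular device):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model of an analog resistive crossbar: a weight matrix stored as
# conductances G, an input vector applied as voltages v, and the output
# currents given by Kirchhoff's current law: i = G @ v.
n_in, n_out = 64, 8
G = rng.uniform(0.0, 1.0, size=(n_out, n_in))  # conductances (arbitrary units)
v = rng.uniform(0.0, 1.0, size=n_in)           # input voltages

ideal = G @ v  # the multiply happens where the weights are stored

# Analog devices are imperfect: model (say) 2% random variation per cell.
noise = rng.normal(1.0, 0.02, size=G.shape)
measured = (G * noise) @ v

rel_err = np.max(np.abs(measured - ideal) / np.abs(ideal))
print(f"worst-case relative output error: {rel_err:.2%}")
```

Because each output current averages over many cells, the per-device imperfections largely wash out, which is precisely the kind of error tolerance discussed above that makes these architectures attractive.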

A recent seminar by Prof. Boris Murmann (Stanford University EE Department), organized by CERN's EP Department, illustrated some of these advances, with particular emphasis on work aimed at a drastic optimization of the power consumption of such machines. He showed how analog and digital circuit and system design have to proceed hand in hand to make such future systems efficient. In the present state of relatively shallow theoretical understanding of the fundamental mathematics underlying the behavior of these machines, mixing people with different backgrounds and expertise, and attacking problems from different angles, is also very important.

Boundaries between analog and digital may indeed become fuzzy, but does it really matter if at the end the algorithm works?

Scientists in High Energy Physics have historically been early and visionary adopters of many advanced technologies: they have been willing to bet their careers on new materials (for instance silicon sensors, long before commercial imagers came of age), advanced technologies (for instance deep sub-micron processes, well before the space and military communities) or even architectures (up to the mid-80s people in HEP were literally making their own computers). They have also been courageous enough to build machines and experiments of unprecedented size and complexity for scientific purposes, and have not hesitated to conceive experiments that collect unheard-of quantities of data. It is therefore very reasonable to predict that the process of looking at and interpreting data will soon have to be complemented by techniques coming from the machine-learning community, and that future experiments will be enthusiastic adopters of such techniques.

 

Image note: The graphic illustrating this article was kindly produced by the author who holds the copyright for this image.