Edge SpAIce: Leveraging CERN's AI for Real-Time Ocean Plastic Tracking from Space

Sioni Paris Summers (CERN) 19th Jun 2024

Earth Observation (EO) and particle physics research have more in common than you might think. In both environments, whether capturing fleeting particle collisions or detecting transient traces of ocean plastics, rapid and accurate data analysis is paramount. We are excited to present a new EU project, Edge SpAIce [1], which applies CERN’s cutting-edge AI technology to monitoring the Earth’s ecosystems from space, with the goal of detecting and tracking plastic pollution in our oceans.

At the LHC experiments, hardware trigger systems perform the first data processing, determining in real time which collision events should be kept for further analysis and which should be discarded. Only a small fraction of the 40 million collisions per second are kept. As we prepare for the High Luminosity LHC and upgraded detectors, with new and more sophisticated trigger systems, we anticipate the increasing use of Machine Learning techniques to select those rare events with high efficiency.

Using Machine Learning in the hardware triggers is a technological challenge because of the extreme throughput and latency constraints, and because the triggers run on custom computing platforms equipped with Field Programmable Gate Arrays (FPGAs). These devices provide vast computational performance and flexibility, but typically require significant engineering expertise to use effectively. Recently, the particle physics community has developed its own tools to bridge the gap between Machine Learning, particle physicists, and high-performance FPGA implementations for triggering. The major outcomes of this effort have been the hls4ml project [2] and its sister project targeting Decision Forests, conifer [3]. These projects are being used during LHC Run 3 in 2024 in the hardware triggers of the CMS and ATLAS experiments respectively, and elsewhere. Both ATLAS and CMS plan to use Machine Learning extensively in their hardware triggers for their Phase 2 Upgrades.

Major contributions to the development of both of these tools have been driven by the CERN EP department. Soon, the R&D in these areas will receive a boost from the NextGen Triggers project.

The technological aspect of hls4ml that makes it especially suitable for use in hardware triggers is its fully on-chip, dataflow compute architecture. Each layer of a Neural Network is mapped to its own hardware in the FPGA device, and each computation can be parallelised or serialised with fine-grained control. By keeping all the weights and other variables on-chip, hls4ml avoids access to off-chip memory, which is slow on the timescales relevant for a hardware trigger. One cost of the fully on-chip approach is that the size of the Neural Network model (its number of layers and parameters) determines how much of the device's resources it uses, so the model size is ultimately capped by the capacity of the device.
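The trade-off between parallelising and serialising a layer, and the cap set by the device, can be illustrated with a toy resource model. This is only a sketch under stated assumptions, not hls4ml's actual resource estimation: the `reuse` parameter, the choice to count only multipliers, and the `CAPACITY` budget are all hypothetical values chosen for illustration.

```python
import math

def dense_layer_cost(n_in, n_out, reuse=1):
    """Toy cost of one fully connected layer mapped to its own hardware.

    reuse=1 is fully parallel: one multiplier per weight.
    A larger reuse serialises the work: each multiplier handles
    `reuse` weights in turn, trading latency for resources.
    (`reuse` is a hypothetical knob for this sketch.)
    """
    n_mult = n_in * n_out                    # multiplications in the layer
    multipliers = math.ceil(n_mult / reuse)  # hardware multipliers needed
    passes = reuse                           # sequential passes through them
    return multipliers, passes

def fits_on_device(layers, capacity):
    """Each layer occupies separate hardware, so the costs add up."""
    total = sum(dense_layer_cost(*layer)[0] for layer in layers)
    return total, total <= capacity

CAPACITY = 2000  # hypothetical multiplier budget of the device

# (n_in, n_out, reuse) for a small three-layer network
layers_parallel = [(16, 64, 1), (64, 32, 1), (32, 5, 1)]
layers_serial = [(16, 64, 4), (64, 32, 4), (32, 5, 1)]

print(fits_on_device(layers_parallel, CAPACITY))  # (3232, False)
print(fits_on_device(layers_serial, CAPACITY))    # (928, True)
```

In this toy model the fully parallel network does not fit, while serialising the two largest layers by a factor of four brings it within budget at the price of more sequential passes per layer.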

To use those resources most efficiently, Neural Networks should be trained using Quantization Aware Training. A collaboration between researchers in the EP Department and Google produced an interface between Google’s Quantization Aware Training tool ‘QKeras’ and hls4ml [4]. The approach reduces the precision of the model parameters while the Neural Network is being trained. This way, extremely low bitwidths can be achieved with little or no loss in the quality of the model predictions. Reducing the bitwidth directly reduces the resource cost of a model, allowing models with more parameters to fit inside the same chip. Another consequence of reduced-bitwidth computation, together with the absence of off-chip memory access, is lower power consumption. While power may not be a constraint for hardware triggers, other edge computing environments have significant limitations on power availability. These include frontend ASICs for particle detectors, low-cost compute devices, and Earth Observation satellites.
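To make the link between bitwidth and resource cost concrete, the sketch below rounds a few weights onto a signed fixed-point grid and compares storage budgets. It is a minimal illustration, not the QKeras or hls4ml API: Quantization Aware Training applies this kind of rounding during training so the network can adapt to it, whereas here it is applied to fixed example values. The function names, the example weights, and the 10,000-parameter model are all assumptions made for the example.

```python
def quantize(value, total_bits, int_bits):
    """Round `value` to a signed fixed-point grid.

    total_bits: width of the number, including the sign bit.
    int_bits:   bits before the binary point, including the sign bit.
    Values outside the representable range are clipped (saturated).
    """
    frac_bits = total_bits - int_bits
    scale = 2 ** frac_bits
    lowest = -(2 ** (total_bits - 1))
    highest = 2 ** (total_bits - 1) - 1
    code = max(lowest, min(highest, round(value * scale)))
    return code / scale

def storage_bits(n_params, bits_per_param):
    """Total bits needed to hold all parameters on-chip."""
    return n_params * bits_per_param

weights = [0.8125, -0.33, 0.05, -1.2]          # made-up example weights
print([quantize(w, 6, 2) for w in weights])    # [0.8125, -0.3125, 0.0625, -1.1875]

budget = storage_bits(10_000, 32)   # a 10,000-parameter model at 32 bits
print(budget)                       # 320000
print(budget // 6)                  # 53333 six-bit parameters fit in the same budget
```

Each quantized weight is within one grid step (1/16 here) of the original, and the same storage budget holds roughly five times as many 6-bit parameters as 32-bit ones.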

Edge SpAIce is a collaborative endeavour involving CERN (EP and KT departments), EnduroSat (BG) and NTU Athens (GR), coordinated by AGENIUM Space (FR). Its aim is to develop and demonstrate real-time data filtering on board satellites for Earth Observation. In analogy with the LHC experiment trigger scenario, Earth Observation satellites have only limited bandwidth to transmit data to the Earth, and most of the images they capture are unlikely to contain the objects of interest for a given mission. Using Neural Networks running on System-on-Chip FPGA devices on the satellite itself, segmentation and classification of the images can be carried out onboard. A decision can then be made on whether to transmit each image, depending on the subject of the mission. In the Edge SpAIce project, we will search for marine plastic debris. Deploying the Neural Networks with hls4ml will benefit from the previously mentioned advantages of reduced power consumption and high performance. In a second phase of the project, the system will also be deployed on FPGA hardware developed in Europe, which will improve competitiveness. This could open the door to a whole new market for EO services and applications. The graphic above illustrates the different stages of preparing, deploying, and monitoring the Neural Network onboard the satellite, and the roles of the different partners.
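The onboard decision described here can be sketched as a simple filter on the output of a segmentation network. This is a hypothetical illustration only, not the project's actual selection logic: the per-pixel score format, the function names, and both thresholds are assumptions.

```python
def plastic_fraction(mask, threshold=0.5):
    """Fraction of pixels whose score is at or above `threshold`.

    mask: 2D list of per-pixel scores in [0, 1], standing in for the
    output of a segmentation network (hypothetical format).
    """
    pixels = [score for row in mask for score in row]
    if not pixels:
        return 0.0
    flagged = sum(1 for score in pixels if score >= threshold)
    return flagged / len(pixels)

def should_transmit(mask, min_fraction=0.01, threshold=0.5):
    """Keep an image only if enough pixels look like the target."""
    return plastic_fraction(mask, threshold) >= min_fraction

def filter_images(masks, **kwargs):
    """Return the indices of the images worth sending to the ground."""
    return [i for i, m in enumerate(masks) if should_transmit(m, **kwargs)]

# Tiny made-up masks: two of open sea, one with possible debris.
open_sea = [[0.02, 0.10], [0.05, 0.00]]
debris = [[0.90, 0.10], [0.70, 0.20]]

print(filter_images([open_sea, debris, open_sea], min_fraction=0.25))  # [1]
```

Only the second image is selected, so only a third of the captured data would use the downlink in this toy case.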
