
Problems and Solutions: The CMS ECAL lessons

The story starts in September 2022 when, during the ECAL Operations meeting, our veteran electronics expert, Evgueni Vlassov, reported that the voltage in one of the control cards was unstable. He prophetically proposed that the cause might be water in the circuits. A few days later, a water leak detection cable raised an alarm. These cables are sometimes unreliable and have produced many false alarms in the past. Moreover, the cable was located in a completely different position from the control card, so no one suspected that the two were related.

A few days later, we faced problems with the high voltage power supply of the positive-side endcap of the ECAL (EE+). When a channel draws too much current, the power supply “trips”, switching the power off to avoid a potentially dangerous overload. The problem is that a single power supply channel powers several of the 7324 crystals that make up each ECAL endcap, and we were not sure which one was the source of the problem. David Petyt and Thomas Reis spent a couple of afternoons in the service cavern, 80 m underground, connecting and disconnecting cables and connectors to try to isolate the problem. They narrowed down the pool of suspects but did not come to a definite conclusion. These channels are located lower than the control card Evgueni initially reported.

Later in September, a test of the cooling circuits was performed, which raised the pressure above its nominal value. With this test, the problems worsened. Humidity sensors started to give high readings: at this point it was clear that there was a leak in the cooling circuit. But where exactly?

Technical coordination started a deeper investigation. A device called the “Christmas tree” controls the distribution of water to the ECAL circuits. Norbert started opening and closing the valves one by one while the humidity reading was watched closely. It finally became evident that closing the valve that feeds line number 7 made the humidity go down, and reopening it made the humidity rise again. Unfortunately, this line feeds what we call a “cooling superblock”, which serves nearly 500 channels. The leaky line would have to be isolated and these 500 channels kept off. For reference, ECAL is composed of 61 200 crystals in the barrel and 7 324 crystals in each of the endcaps. Since the issue was confined to a single region, about 7% of the positive endcap had to be turned off.
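The "about 7%" figure can be checked with a few lines of arithmetic. The sketch below uses the crystal counts quoted above and assumes, for illustration only, that "nearly 500 channels" means exactly 500 (an upper bound); the variable and function names are hypothetical.

```python
# Crystal counts quoted in the article.
BARREL_CRYSTALS = 61_200
ENDCAP_CRYSTALS = 7_324  # per endcap

# Assumption: "nearly 500 channels" is taken as exactly 500 (an upper bound).
CHANNELS_OFF = 500


def fraction_off(channels_off: int, total: int) -> float:
    """Return the fraction of channels that are switched off."""
    return channels_off / total


# Share of the positive endcap (EE+) that had to be turned off.
endcap_fraction = fraction_off(CHANNELS_OFF, ENDCAP_CRYSTALS)

# Share of the whole ECAL: barrel plus two endcaps.
ecal_fraction = fraction_off(CHANNELS_OFF, BARREL_CRYSTALS + 2 * ENDCAP_CRYSTALS)

print(f"Share of EE+ off:        {endcap_fraction:.1%}")
print(f"Share of whole ECAL off: {ecal_fraction:.2%}")
```

Under that assumption the affected region is just under 7% of one endcap, but well below 1% of the full calorimeter.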

Notoriously, having channels off affects not only what you can detect, but also what you cannot detect. In the region affected by the leak, the ECAL crystals became insensitive to electrons and photons. But there are particles, the neutrinos, that escape detection: we know a high-energy neutrino has escaped our detector when an imbalance is found in the energy deposited in the calorimeters. We call that “missing energy”. The trick works only if the detector is “hermetic”: if part of the detector is off, we don’t know whether energy is missing because it went into neutrinos or because it was deposited in the insensitive region.
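The effect of a dead region on missing energy can be illustrated with a toy calculation. This is a simplified sketch, not the CMS reconstruction: it works only in the transverse plane, treats each deposit as a single (energy, azimuthal angle) pair, and models the dead region as a hypothetical range of angles. All names and numbers are illustrative assumptions.

```python
import math


def missing_transverse_energy(deposits, dead_phi_range=None):
    """Magnitude of the transverse energy imbalance.

    deposits: list of (et_gev, phi_rad) pairs, a toy stand-in for
              calorimeter energy deposits.
    dead_phi_range: optional (lo, hi) range of phi in which deposits
              are not recorded (a hypothetical dead region).
    """
    sum_x = 0.0
    sum_y = 0.0
    for et, phi in deposits:
        if dead_phi_range is not None:
            lo, hi = dead_phi_range
            if lo <= phi <= hi:
                continue  # deposit falls in the dead region and is lost
        sum_x += et * math.cos(phi)
        sum_y += et * math.sin(phi)
    # The missing energy vector is minus the vector sum; only its size matters here.
    return math.hypot(sum_x, sum_y)


# Toy event: two back-to-back 50 GeV deposits, no neutrino involved.
event = [(50.0, 0.0), (50.0, math.pi)]

met_full = missing_transverse_energy(event)
met_dead = missing_transverse_energy(event, dead_phi_range=(-0.2, 0.2))

print(f"Fully working detector: {met_full:.1f} GeV")
print(f"With a dead region:     {met_dead:.1f} GeV")
```

With the full detector the event balances, while the same event seen with a dead region shows an apparent imbalance equal to the lost deposit, which is indistinguishable from a neutrino in this toy picture.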

But there is a sentimental aspect to this too. The detector is like a baby for the people taking care of it. Some of them brought it to life years ago! It really hurt to see a “hole” in the instrument when looking at the quality plots!

Identifying the problem

The task force started by identifying all possible failure points where the leak could originate. The number one suspects in these cases are the connections: the water circuits contain a number of connections of different types. From the “Christmas tree”, water is distributed to the sub-circuits. Outside the detector, the last piece is a green flexible rubber pipe that enters a very busy “patch panel” where a huge number of optical fibers, high voltage cables and low voltage cables converge. This flexible pipe is connected via a quick-release fitting (similar to the ones used on garden hoses, only made of solid stainless steel and much more reliable and expensive…) to a steel pipe called the “funny pipe”. The funny pipe gets its name from its unusual shape: it goes through the wall of the patch panel, turns by 180 degrees and connects, through a different kind of fixed connection, to the “cooling superblock”. This connection is on the inside and very difficult to access.

Immediately after the leak was found, software simulations were run to precisely estimate its impact on the physics potential of CMS. And of course, people started investigating whether a repair was possible. To that end, an “EE Leak Repair Task Force” was set up with weekly meetings. Many of the people who assembled the detector in the early 2000s are retired, but they enthusiastically agreed to help by providing their memories, expertise… and pictures and drawings!

These steel pipes are called funny pipes because of their unusual bent shape and the difficulty of repairing them. Credits: J. Daguin.

The cooling superblock is made entirely of welded stainless-steel pipes, so it was thought unlikely to leak itself. At this point the team was confronted with two hypotheses:
    1.    The leak was in the last connection from the funny pipe to the cooling superblock. That would explain why water reached the inside of the detector. In this case the repair would be difficult because of limited accessibility.
    2.    The leak was in the connection of the flexible pipe to the funny pipe, or in another part on the outside of the patch panel. In that case the repair would be easier, but it would not easily explain how water could reach the inside of the detector without leaving any trace outside.

The exact location of the cooling lines on the CMS detector. Credits: J. Daguin.

The team prepared for all possible scenarios. Even for the difficult case of the “inside” leak, a repair strategy was prepared: wrapping the connection in a rubber sheath and injecting epoxy resin to make it watertight. The intervention would have required the use of endoscopes and remote-handling tools. For the “outside” leak case, all types of connectors were ordered, along with the tools to redo the connections.

The solution

On December 15th, CMS was in an open configuration, allowing the team to perform a first investigation, but only starting at 17:00 because of several other ongoing activities in the cavern. The first non-trivial task was to position the scissor lift so that we could work on the right section of the detector. The patch panel cover was opened by removing a few screws… and we stopped for the day. Going back home and reporting to team members, we realised that we had opened the wrong hatch. We had to go a little higher... more difficult to reach.

CMS scientist working to fix the ECAL leak problem. Credits: St. Argiro.

On December 16th we went up again and unscrewed the right panel cover. We identified the cooling lines feeding cooling superblock #7. We opened the valves feeding those lines… and we immediately saw several drops of water coming from the green flexible pipes! Lots of excitement and patting on the back: the leak was in a place relatively easy to access and fix. In the following days the circuit was purged, and the green rubber pipe was replaced entirely, from one end to the other. The circuit was left at nominal pressure throughout the winter closure of CERN, while humidity was closely monitored.

As soon as operations restarted in January, the EE+ endcap was powered on: everything worked, no permanent damage! In the following weeks, more extensive tests were carried out, thanks to periods of data taking with cosmic rays. The noise level was checked and found to be normal.

All's well that ends well… but there was still a concern. What caused the failure of the rubber pipe? Will more of those 72 pipes fail? The leaky flexible pipe was sent to the materials lab together with a new one for a detailed examination. The samples were examined with microtomography and microscopy, and underwent stress tests. The report said that the pipe had aged well and did not show evident signs of wear. The leak showed up as a cut, probably due to mechanical damage or mishandling (overbending) at the time of installation.

As a last note, the CMS collaboration presented the “ECAL leak repair team” with the 2023 CMS Award “for the exceptional response to the crisis that followed a water leak in the ECAL Endcap cooling circuit, which was promptly repaired.”

The CMS collaboration presented the “ECAL leak repair team” (Etiennette Auffray, David Bailleux, Ken Bell, David Cockerill, Jerome Daguin, Jean Fay, Norbert Frank, Wolfgang Funk, James Hill, David Petyt and Igor Tarasov) with the 2023 CMS Award for their prompt and efficient intervention.

The CMS ECAL is ready for the 2023 LHC run in its full splendour! Preparation, organisation, a scientific approach, and a bit of luck resulted in a completely successful repair.