(Resolved) LiDO3 cluster outage due to failure of the cooling water supply

To prevent overheating, the LiDO3 cluster had to be taken offline today in a controlled manner because of a malfunction in the central cold-water cooling system.
Update as of November 5, 2025, 1:20 pm: The cooling water supply was restored on Monday morning. The LiDO3 cluster was subsequently booted, and the compute service is now fully operational again.




