Cooling outage impacting HPC cluster

Monitoring

The HPC system is mostly back online. We are still working to power on a couple of remaining nodes.
Posted Jun 19, 2026 - 13:09 CDT

Identified

Many of our HPC worker nodes are down after a cooling outage in one of our server rooms last night.
We will work to bring these nodes back up as soon as we have confirmed that cooling has stabilized.
Posted Jun 18, 2026 - 08:41 CDT
This incident affects: High Performance Computing (HPC) System (Cluster Nodes and Jobs).