[HTC] ap2002.chtc.wisc.edu is offline
Postmortem

ap2002.chtc.wisc.edu shut down unexpectedly last night just before 10 PM and required manual intervention to restart.

Because execution points lost communication with ap2002 for more than 2 hours, running jobs submitted from ap2002 were abandoned. In practice, this means that when ap2002 restarted, those jobs returned to the “Idle” state in the queue.

We are still investigating the root cause of the shutdown. Because we have not yet identified it, the problem may recur. We thank you for your patience as we work to address the underlying issue.

Posted Mar 21, 2024 - 11:13 CDT

Resolved
This incident has been resolved and login has been restored.
Posted Mar 21, 2024 - 10:34 CDT
Update
We are continuing to investigate this issue.
Posted Mar 21, 2024 - 09:29 CDT
Investigating
We are currently investigating an issue with ap2002.chtc.wisc.edu and have prevented users from logging in to expedite the troubleshooting process.
Posted Mar 21, 2024 - 09:25 CDT
This incident affected: High Throughput Computing (HTC) System (Access Points).