Identified - CHTC services (including HTC and HPC) are offline while we patch a serious Linux vulnerability.

Last evening, a serious Linux exploit was broadly published online. This exploit allows any user to easily obtain root (admin) access, bypassing standard security controls.

We pre-emptively shut down our systems this morning to protect our systems and users. We are working to patch the vulnerability. CHTC servers will remain offline through tomorrow (and possibly longer).

We will share more updates as the situation changes. We will provide the most up-to-date status of the system on our status page: https://status.chtc.wisc.edu/incidents/8zycydmtkf74.

Apr 30, 2026 - 16:30 CDT
Update - We have expanded the shutdown to all user-facing CHTC servers. Users are unable to log in and no jobs are running.
This is required to ensure the safety and security of our systems.
We are actively working to address the core issue.

Apr 30, 2026 - 11:32 CDT
Investigating - All CHTC login nodes (spark-login, ap2001, ap2002, etc.) are being taken down.

Apr 30, 2026 - 09:21 CDT

Investigating - CHTC is currently experiencing limited availability in /home storage. As a result, we are unable to approve /home quota increase requests at this time.

Our team is actively working on a longer-term storage solution and will provide updates as more information becomes available.

In the meantime, users should reduce /home usage where possible:
- Move individual files larger than 1 GB to /staging.
- For directories containing many small files, bundle them into a compressed archive before moving them to /staging.

For example:
tar -czvf my_directory.tar.gz my_directory/
mv my_directory.tar.gz /staging//

After confirming the archive was created and moved successfully, you may remove the original directory from /home if it is no longer needed there.
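
The steps above can be sketched as a pair of shell helpers: one to find files over the 1 GB guideline, and one to build an archive and confirm it can be read back before anything is deleted. This is a minimal sketch, not a CHTC-provided tool. The function names list_large_files and archive_and_verify are hypothetical, and the entry-count check assumes no file names contain newline characters.

```shell
#!/usr/bin/env bash
# Sketch only: list_large_files and archive_and_verify are hypothetical helper
# names, not CHTC-provided tools.

# Print regular files under directory $1 that are larger than 1 GiB.
# (find's "G" unit is 2^30 bytes, slightly more than a decimal GB.)
list_large_files() {
    find "$1" -type f -size +1G
}

# Bundle directory $1 into $1.tar.gz, then check that the archive is readable
# and lists as many entries as the directory holds.
# Assumes no file names contain newline characters.
archive_and_verify() {
    local dir="${1%/}"              # drop a trailing slash, if any
    local archive="${dir}.tar.gz"
    local in_archive on_disk

    tar -czf "$archive" "$dir" || return 1
    tar -tzf "$archive" > /dev/null || return 1   # archive can be read back

    in_archive=$(tar -tzf "$archive" | wc -l)     # entries stored in the archive
    on_disk=$(find "$dir" | wc -l)                # files and directories on disk
    [ "$in_archive" -eq "$on_disk" ]
}
```

A possible use, where STAGING_DIR is a placeholder for your own directory under /staging: run archive_and_verify my_directory && mv my_directory.tar.gz "$STAGING_DIR"/ so that the move happens only if the check passes. A matching entry count is a sanity check rather than proof that every byte was copied, so keep the original until you have confirmed the archive in its new location.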

Thank you for your patience while we work to expand available storage capacity.

Apr 24, 2026 - 11:25 CDT

About This Site

This page provides information about unplanned downtimes and scheduled maintenance for services offered by the Center for High Throughput Computing (CHTC).

Component status (uptime over the past 90 days):
- High Throughput Computing (HTC) System: Major Outage, 99.37% uptime
- Access Points: Major Outage, 98.82% uptime
- CHTC Pool: Major Outage, 99.14% uptime
- External Pools (OSPool, Campus HTCondor Pools): Major Outage, 99.14% uptime
- Staging and Projects Space: Operational, 100.0% uptime
- File Transfers: Operational, 99.75% uptime
- High Performance Computing (HPC) System: Major Outage, 99.33% uptime
- Login Nodes: Major Outage, 98.57% uptime
- Cluster Nodes and Jobs: Major Outage, 98.93% uptime
- Central Software Installations: Operational, 100.0% uptime
- Home and Scratch File Systems: Operational, 99.81% uptime
- Data Transfer Tools: Major Outage, 99.04% uptime
- Globus Endpoint: Major Outage, 99.04% uptime
- CHTC Internal Infrastructure: Operational, 99.85% uptime
- Tiger Cluster: Operational, 100.0% uptime
- RT Email/Ticket Support System: Operational, 99.71% uptime

May 1, 2026

No incidents reported today.

Apr 30, 2026
Completed - The scheduled maintenance has been completed.
Apr 30, 14:00 CDT
In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
Apr 30, 10:00 CDT
Scheduled - We will be performing maintenance on AP2001 and AP2002. Jobs may get interrupted in the process. Interrupted jobs should be able to restart without manual intervention.
Apr 29, 13:21 CDT
Apr 29, 2026

No incidents reported.

Apr 28, 2026

No incidents reported.

Apr 27, 2026

No incidents reported.

Apr 26, 2026

No incidents reported.

Apr 25, 2026

No incidents reported.

Apr 24, 2026
Apr 23, 2026

No incidents reported.

Apr 22, 2026

No incidents reported.

Apr 21, 2026
Resolved - We're still not certain what caused the outage, but the server is currently operational.
Apr 21, 09:37 CDT
Monitoring - We restarted the server and it appears to be functional. However, we have not yet identified the cause, and the problem may recur.

Please remember: do not run intensive commands on the access point - this extends to any commands your AI agent may be running on your behalf!

Apr 20, 16:23 CDT
Investigating - We have confirmed user reports of being unable to log in to ap2002.
We're investigating the cause.

Apr 20, 15:08 CDT
Apr 20, 2026
Completed - The scheduled maintenance has been completed.
Apr 20, 14:41 CDT
In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
Apr 20, 00:00 CDT
Scheduled - We are re-installing a GPU into xhuanggpu4001. The machine will be unavailable during this time.
Apr 13, 08:32 CDT
Apr 19, 2026

No incidents reported.

Apr 18, 2026

No incidents reported.

Apr 17, 2026

No incidents reported.