Investigating - CHTC is currently experiencing limited available capacity in /home storage. As a result, we are unable to approve /home quota increase requests at this time.

Our team is actively working on a longer-term storage solution and will provide updates as more information becomes available.

In the meantime, users should reduce /home usage where possible:
- Move individual files larger than 1 GB to /staging.
- For directories containing many small files, bundle them into a compressed archive before moving them to /staging.

For example:
tar -czvf my_directory.tar.gz my_directory/   # bundle the directory into a single compressed archive
mv my_directory.tar.gz /staging/              # move the archive out of /home and into /staging

After confirming the archive was created and moved successfully, you may remove the original directory from /home if it is no longer needed there.
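
For example, one way to verify the archive is intact before deleting the original (paths follow the example above; adjust to your own files):

tar -tzf /staging/my_directory.tar.gz > /dev/null && echo "archive OK"   # listing the contents reads the whole archive; an error indicates corruption
ls -lh /staging/my_directory.tar.gz   # confirm the archive exists in /staging and has a plausible size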

Thank you for your patience while we work to expand available storage capacity.

Apr 24, 2026 - 11:25 CDT

About This Site

This page provides information about unplanned downtimes and scheduled maintenance for services offered by the Center for High Throughput Computing (CHTC).

Component status (uptime over the past 90 days):
- High Throughput Computing (HTC) System: Degraded Performance, 99.86% uptime
- Access Points: Degraded Performance, 99.59% uptime
- CHTC Pool: Operational, 100.0% uptime
- External Pools (OSPool, Campus HTCondor Pools): Operational, 100.0% uptime
- Staging and Projects Space: Operational, 100.0% uptime
- File Transfers: Operational, 99.75% uptime
- High Performance Computing (HPC) System: Operational, 99.78% uptime
- Login Nodes: Operational, 99.52% uptime
- Cluster Nodes and Jobs: Operational, 99.79% uptime
- Central Software Installations: Operational, 100.0% uptime
- Home and Scratch File Systems: Operational, 99.81% uptime
- Data Transfer Tools: Operational, 100.0% uptime
- Globus Endpoint: Operational, 100.0% uptime
- CHTC Internal Infrastructure: Operational, 99.85% uptime
- Tiger Cluster: Operational, 100.0% uptime
- RT Email/Ticket Support System: Operational, 99.71% uptime
Apr 25, 2026

No incidents reported today.

Apr 24, 2026

Unresolved incident: Limited /home Storage Availability - AP2001.

Apr 23, 2026

No incidents reported.

Apr 22, 2026

No incidents reported.

Apr 21, 2026
Resolved - We're still not certain what caused the outage, but the server is currently operational.
Apr 21, 09:37 CDT
Monitoring - We restarted the server and it appears to be functional; however, we have not yet identified the cause, and the issue may recur.

Please remember: do not run intensive commands on the access point - this extends to any commands your AI agent may be running on your behalf!
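
One way to keep heavy commands off the access point is to run them inside an HTCondor job on an execute node, for example via an interactive session (the submit file name here is a placeholder):

condor_submit -i my_job.sub   # -i requests an interactive session on an execute node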

Apr 20, 16:23 CDT
Investigating - We have confirmed user reports of being unable to log in to ap2002 and are investigating the cause.

Apr 20, 15:08 CDT
Apr 20, 2026
Completed - The scheduled maintenance has been completed.
Apr 20, 14:41 CDT
In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
Apr 20, 00:00 CDT
Scheduled - We are re-installing a GPU in xhuanggpu4001. The machine will be unavailable during this maintenance window.
Apr 13, 08:32 CDT
Apr 19, 2026

No incidents reported.

Apr 18, 2026

No incidents reported.

Apr 17, 2026

No incidents reported.

Apr 16, 2026

No incidents reported.

Apr 15, 2026
Resolved - A fix has been implemented and confirmed to work. Because affected jobs carry an incorrect expression in their attributes, users with idle GPU jobs should remove them (`condor_rm`) and resubmit. Newly submitted jobs should match normally.
Apr 15, 10:48 CDT
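
For reference, a minimal sketch of that remediation (the cluster ID 12345 and submit file name are placeholders):

condor_q -constraint 'JobStatus == 1'   # list your idle jobs (JobStatus 1 = Idle)
condor_rm 12345                         # remove the affected job cluster by its ID
condor_submit my_gpu_job.sub            # resubmit using your original submit file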
Monitoring - A fix has been implemented. Because affected jobs carry an incorrect expression in their attributes, users with idle GPU jobs should remove them (`condor_rm`) and resubmit. We are monitoring the situation.
Apr 14, 15:24 CDT
Investigating - We've received reports of GPU jobs failing to match and start, remaining stuck in the Idle state. We're investigating the cause and will update this status page as more information becomes available or a solution is implemented.
Apr 13, 16:50 CDT
Resolved - We identified the cause of the issue: when all licenses are checked out, any new jobs requesting licenses fail with the "Failed to connect to token server" message. We have contacted the users who are holding the majority of the licenses.

All Gurobi users must set `concurrency_limits = GUROBI:1` in their Gurobi jobs' submit files. This ensures that when all licenses are checked out, jobs remain idle instead of failing.
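
For example, a minimal submit file sketch showing where the line goes (everything except the `concurrency_limits` line is a generic placeholder):

# gurobi_job.sub (hypothetical file name)
executable = run_gurobi.sh
log = gurobi.log
request_cpus = 1
request_memory = 4GB
concurrency_limits = GUROBI:1
queue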

Apr 15, 10:47 CDT
Investigating - Some users are reporting that their Gurobi jobs are failing with the message "Failed to connect to token server". We are currently investigating. We encourage Gurobi users to double-check that they are using `concurrency_limits = GUROBI:1` in their submit files.
Apr 13, 14:29 CDT
Apr 14, 2026
Apr 13, 2026
Apr 12, 2026

No incidents reported.

Apr 11, 2026

No incidents reported.