Investigating - Users are experiencing slow transfer speeds associated with our SQUID server. We are investigating the issue and working on a solution.
Nov 13, 2024 - 14:23 CST
Update - We are continuing to investigate the cause of slow transfers to/from /staging and /projects.
Oct 22, 2024 - 12:18 CDT
Investigating - We have received multiple confirmed user reports of slow transfers to/from /staging or /projects.
We have been investigating this issue but have not yet been able to identify a clear cause.
We appreciate your patience while we continue to investigate.

Oct 18, 2024 - 09:42 CDT

About This Site

This page provides information about unplanned downtimes and scheduled maintenance for services offered by the Center for High Throughput Computing (CHTC).

High Throughput Computing (HTC) System - Degraded Performance - 99.66% uptime over the past 90 days
Access Points - Operational - 99.99% uptime over the past 90 days
CHTC Pool - Operational - 100.0% uptime over the past 90 days
External Pools (OSPool, Campus HTCondor Pools) - Operational - 100.0% uptime over the past 90 days
Staging and Projects Space - Degraded Performance - 100.0% uptime over the past 90 days
File Transfers - Operational - 98.32% uptime over the past 90 days
High Performance Computing (HPC) System - Operational - 99.4% uptime over the past 90 days
Login Nodes - Operational - 99.99% uptime over the past 90 days
Cluster Nodes and Jobs - Operational - 100.0% uptime over the past 90 days
Central Software Installations - Operational - 100.0% uptime over the past 90 days
Home and Scratch File Systems - Operational - 97.63% uptime over the past 90 days
Data Transfer Tools - Operational - 100.0% uptime over the past 90 days
Globus Endpoint - Operational - 100.0% uptime over the past 90 days
CHTC Internal Infrastructure - Operational - 100.0% uptime over the past 90 days
Tiger Cluster - Operational - 100.0% uptime over the past 90 days

Past Incidents
Nov 20, 2024

No incidents reported today.

Nov 19, 2024

No incidents reported.

Nov 18, 2024
Resolved - This incident has been resolved.
Nov 18, 15:51 CST
Monitoring - A fix has been implemented, and we are monitoring the situation. Users may still encounter this message intermittently as the pull rate cooldown period refreshes.
Nov 15, 12:43 CST
Identified - Users pulling Docker containers received an error message stating that the Docker pull rate limit was exceeded. This is due to our Docker cache host blocking IPv6 connections, causing clients to pull directly from Docker. We are currently implementing a fix for the issue.
Nov 14, 10:31 CST
Nov 17, 2024

No incidents reported.

Nov 16, 2024

No incidents reported.

Nov 15, 2024
Resolved - The issue appears to have been limited to accessing /scratch via the login node (spark-login.chtc.wisc.edu). Jobs should have been unaffected by the problem.

Reminder: do not run intensive processes on the login server. Intensive calculations or scripts should be run as part of an sbatch/srun command.

Nov 15, 13:23 CST
Investigating - We're aware of an issue with the /scratch directory on the HPC system that is preventing users from accessing the files there.

We're working to identify and resolve the issue.

Nov 15, 11:19 CST
Nov 14, 2024
Resolved - This incident has been resolved.
Nov 14, 09:26 CST
Monitoring - Users pulling Docker containers received an error message stating that the Docker pull rate limit was exceeded. This was due to our Docker cache host blocking IPv6 connections, causing clients to pull directly from Docker. This has since been resolved, but users may continue to receive the error during the cooldown period for Docker pull requests. We are monitoring the situation.
Nov 13, 14:27 CST
Nov 13, 2024

Unresolved incident: Slow transfer speeds with SQUID.

Nov 12, 2024

No incidents reported.

Nov 11, 2024

No incidents reported.

Nov 10, 2024

No incidents reported.

Nov 9, 2024

No incidents reported.

Nov 8, 2024

No incidents reported.

Nov 7, 2024

No incidents reported.

Nov 6, 2024

No incidents reported.