Identified - Some jobs running on gpulab2001 or gpulab2003 may fail with the error "CUDA error: failed call to cuInit: CUDA_ERROR_UNKNOWN". We are working to resolve the issue.
Jun 02, 2026 - 16:53 CDT
Resolved -
We restarted the service and the Globus interface appears to be operating again.
However, we do not yet know what is causing the issue, so it may recur. Please let us know at chtc@cs.wisc.edu if you encounter the issue again.
Jun 16, 15:00 CDT
Resolved -
OSDF transfers should be operational. If you encounter errors, please let us know at chtc@cs.wisc.edu.
Jun 16, 13:46 CDT
Investigating -
The OSDF system has been having trouble over the weekend. This is causing OSDF transfers to fail with a message like "error while querying the director at https://osdf-director.osg-htc.org: Transfer.DirectorTimeout Error".
We are investigating the problem.
Jun 8, 09:01 CDT
Resolved -
This incident has been resolved.
Jun 15, 12:17 CDT
Investigating -
We have confirmed user reports of being unable to launch a BadgerCompute instance. The loading screen hangs on "Your server is starting up" and eventually times out with "Spawn failed".
Resolved -
We identified the specific cause and are addressing it. Condor commands on ap2002 should be working again, though the issue may recur.
Jun 4, 10:08 CDT
Investigating -
We're seeing reports of condor commands, such as condor_submit and condor_q, hanging or failing. We are investigating the cause and will update this Status Page as more information becomes available.
Jun 3, 16:32 CDT
Jun 3, 2026
Jun 2, 2026
Unresolved incident: [HTC] GPU issues on gpulab2001, gpulab2003.