Degraded

v3 runs are paused due to network issues

Jun 13 at 12:19pm BST
Affected services
Trigger.dev cloud

Resolved
Jun 13 at 01:20pm BST

Runs are operating at full speed.

We think this issue was caused by the clean-up operations that clear completed pods. There are far more runs than a week ago, so the list of completed pods can get very large, which puts a strain on the system, including internal networking. We've increased the clean-up frequency and are monitoring the load, including networking. After 15 minutes everything seems normal.
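
As a rough illustration of why cleaning up more often should help: if pods complete at a steady rate and each clean-up pass clears all of them, the list that builds up between passes is roughly the completion rate multiplied by the interval. The sketch below is a simplified model, not our actual clean-up code. The function name and the numbers are hypothetical and chosen only to show the scaling.

```python
def peak_completed_pods(completions_per_minute: float,
                        cleanup_interval_minutes: float) -> float:
    """Approximate number of completed pods waiting just before a clean-up pass.

    Assumes a steady completion rate and that each pass clears every
    completed pod. Both are simplifying assumptions.
    """
    if completions_per_minute < 0 or cleanup_interval_minutes <= 0:
        raise ValueError("rate must be >= 0 and interval must be > 0")
    return completions_per_minute * cleanup_interval_minutes


# Hypothetical numbers: same completion rate, shorter clean-up interval.
before = peak_completed_pods(completions_per_minute=200, cleanup_interval_minutes=10)
after = peak_completed_pods(completions_per_minute=200, cleanup_interval_minutes=2)
print(before, after)
```

Under this model, the peak list size shrinks in proportion to the interval, so each pass has less to list and delete.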

Updated
Jun 13 at 01:15pm BST

v3 runs are processing at slightly reduced capacity in our cluster. We've isolated some nodes that have network issues. We're still trying to diagnose the root cause to prevent this from happening again.

Created
Jun 13 at 12:19pm BST

There's a networking issue in our cluster. The BPF networking change we made yesterday hasn't fully fixed the problems.

We're working to get runs executing as quickly as possible and then figure out the root cause of this issue so it doesn't happen again.