Previous incidents
V3 runs are processing slowly
Resolved Nov 23 at 12:55am GMT
Runs are processing normally again, and queues should come down quickly.
etcd, the key-value store that backs our Kubernetes cluster, stopped accepting new writes. Increasing its maximum database size, restarting it, and adjusting some other settings restored normal operation.
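For anyone hitting the same condition: when etcd reaches its backend size quota it raises a NOSPACE alarm and rejects writes until that alarm is disarmed, even after the quota (--quota-backend-bytes) has been raised and etcd restarted. Below is a minimal sketch of that check-and-disarm step using the official Go client (go.etcd.io/etcd/client/v3); the endpoint is a placeholder, and this is illustrative rather than our exact remediation.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// Hypothetical endpoint; substitute your cluster's etcd address.
	endpoint := "https://etcd-0.example.internal:2379"

	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{endpoint},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Check how close the database is to its size quota.
	status, err := cli.Status(ctx, endpoint)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("etcd db size: %d bytes\n", status.DbSize)

	// After raising --quota-backend-bytes and restarting etcd, any
	// active NOSPACE alarms must be disarmed before writes resume.
	alarms, err := cli.AlarmList(ctx)
	if err != nil {
		log.Fatal(err)
	}
	for _, a := range alarms.Alarms {
		if _, err := cli.AlarmDisarm(ctx, (*clientv3.AlarmMember)(a)); err != nil {
			log.Fatal(err)
		}
		fmt.Printf("disarmed alarm %v on member %x\n", a.Alarm, a.MemberID)
	}
}
```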
Realtime (beta) is offline
Resolved Nov 22 at 07:36pm GMT
Realtime is back online.
We've made some configuration changes and have some more reliability fixes in progress to make this rock solid.
V2 runs are processing slowly
Resolved Nov 08 at 04:30pm GMT
V2 queues are caught up. Any runs still queued are now waiting on concurrency limits.
V3 was not impacted at any point during this incident.
We restarted all V2 worker servers and V2 runs started processing again. We are still investigating the underlying cause so we can prevent this from happening again. There were no code changes or deploys during this period, and overall V2 load wasn't unusual.
Realtime service degraded
Resolved Nov 01 at 12:38am GMT
Realtime is recovering after we restarted the service and cleared the consumer cache, but the underlying issue has not been solved. We're still working on a fix and will update as we make progress.
Dashboard instability and slower run processing
Resolved Oct 25 at 06:47pm BST
The networking issues at our worker cluster's cloud provider have stopped. Networking has been back to full speed for the past 10 minutes, and runs are processing quickly again.