Slow queue times and some runs failing with system errors
Resolved
Feb 09 at 06:10pm GMT
Queue times and timeout errors are back to their previous levels.
Note that container start times are still slower than they should be, especially if you aren't doing many runs.
There's a GitHub issue here about slow start times and what we're doing to make them consistently fast: https://github.com/triggerdotdev/trigger.dev/issues/1685
That work will start with a new Docker image caching layer that will ship tomorrow.
What's causing these problems
We've had a more than 10x increase in load in the past 7 days. Some things that have worked well for the past few months now work less well at this new scale. Some of those issues can compound under high load and cause more significant problems.
Affected services
Trigger.dev cloud
Created
Feb 09 at 05:35pm GMT
We have made some adjustments to stop this issue and are looking into it.
Affected services
Trigger.dev cloud