Previous incidents

April 2025
Apr 07, 2025
1 incident

Runs are dequeuing slower than normal

Degraded

Resolved Apr 07 at 08:09pm BST

Runs have been dequeuing quickly for some time now, so we're marking this as resolved. We'll continue to monitor closely.

Runs dequeued for the entire period, but queue times were longer than normal across all customers.

The vast majority of queues have already returned to normal length, and the rest will shortly.

We suspect this was caused by an underlying Digital Ocean networking issue that meant our Kubernetes control plane nodes were slow to create and delete pods. We are trying to figure out i...


March 2025
Mar 21, 2025
1 incident

Tasks with large payloads or outputs are sometimes failing

Degraded

Resolved Mar 21 at 10:48pm GMT

Cloudflare R2 is back online, and uploads of large payloads and outputs have resumed. We'll continue to monitor the situation.


Mar 07, 2025
1 incident

Significant disruption to run starts (runs stuck in queueing)

Degraded

Resolved Mar 07 at 09:15pm GMT

We are confident that most queues have caught up again but are still monitoring the situation.

If you are experiencing unexpected queue times, this is most likely due to plan or custom queue limits. Should this persist, please get in touch.


Mar 04, 2025
1 incident

Uncached deploys are causing runs to be queued

Degraded

Resolved Mar 04 at 12:40pm GMT

We tracked this down to a broken deploy pipeline that reverted one of our internal components to a previous version. This caused a required environment variable to be ignored.

We have applied a hotfix and will be making more permanent changes to prevent this from happening again.


February 2025
Feb 09, 2025
1 incident

Slow queue times and some runs failing with system errors

Degraded

Resolved Feb 09 at 06:10pm GMT

Queue times and timeout errors are back to their previous levels.

Note that container start times are still slower than they should be, especially if you aren't doing a lot of runs.

There's a GitHub issue here about slow start times and what we're doing to make them consistently fast: https://github.com/triggerdotdev/trigger.dev/issues/1685

That work will start with a new Docker image caching layer that will ship tomorrow.

What's causing these problems

We've had a more than 10x increase...
