Previous incidents

March 2025
Mar 21, 2025
1 incident

Tasks with large payloads or outputs are sometimes failing

Degraded

Resolved Mar 21 at 10:48pm GMT

Cloudflare R2 is back online and uploads of large payloads and outputs have resumed. We'll continue to monitor the situation.


Mar 07, 2025
1 incident

Significant disruption to run starts (runs stuck in queueing)

Degraded

Resolved Mar 07 at 09:15pm GMT

We are confident that most queues have caught up again but are still monitoring the situation.

If you are experiencing unexpected queue times, this is most likely due to plan or custom queue limits. Should this persist, please get in touch.


Mar 04, 2025
1 incident

Uncached deploys are causing runs to be queued

Degraded

Resolved Mar 04 at 12:40pm GMT

We tracked this down to a broken deploy pipeline which reverted one of our internal components to a previous version. This caused a required environment variable to be ignored.

We have applied a hotfix and will be making more permanent changes to prevent this from happening again.


February 2025
Feb 09, 2025
1 incident

Slow queue times and some runs failing

Degraded

Resolved Feb 09 at 06:10pm GMT

Queue times and timeout errors are back to their previous levels.

Note that start times of containers are still slower than they should be, especially if you aren't doing a lot of runs.

There's a GitHub issue here about slow start times and what we're doing to make them consistently fast: https://github.com/triggerdotdev/trigger.dev/issues/1685

That work will start with a new Docker image caching layer that will ship tomorrow.

What's causing these problems

We've had a more than 10x increase...


January 2025
Jan 30, 2025
1 incident

Slow queue times

Resolved Jan 30 at 10:10am GMT

Queue processing performance is back to normal because there's been a reduction in demand.

We have identified the underlying bottleneck and are working on a permanent fix. It shouldn't be a major change and should be live soon.

The bottleneck is a high degree of contention on an update that happens when a single queue's concurrencyLimit is different on every call to trigger a task. This is an edge case we haven't seen anyone do before.
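For illustration, here is a minimal sketch of the pattern that hits this edge case. The task name, queue name, and the exact shape of the queue override are assumptions for the example (not code from our systems), and the option shape may differ depending on your SDK version:

// Illustrative sketch only: names and the queue override shape are assumptions.
import { task } from "@trigger.dev/sdk/v3";

export const processItem = task({
  id: "process-item",
  run: async (payload: { itemId: string }) => {
    // ...do the work for one item
  },
});

// Triggering the same queue with a different concurrencyLimit on every call
// is the contended update described above.
export async function enqueue(itemId: string, dynamicLimit: number) {
  await processItem.trigger(
    { itemId },
    { queue: { name: "items", concurrencyLimit: dynamicLimit } }
  );
}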

Jan 23, 2025
1 incident

Deploys are failing with a 520 status code

Downtime

Resolved Jan 24 at 07:28pm GMT

Important: Upgrade to 3.3.12+ in order to deploy again

If you use npx you can upgrade the CLI and all of the packages by running:
npx trigger.dev@latest update

This should download version 3.3.12 (or newer) of the CLI and then prompt you to update the other packages too.

If you have pinned a specific version (e.g. in GitHub Actions), you may need to manually update your package.json file or your workflow file.
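For example, a pinned setup in package.json might look like this after updating (the package names and sections shown are illustrative; your project's dependencies may differ):

{
  "dependencies": {
    "@trigger.dev/sdk": "3.3.12"
  },
  "devDependencies": {
    "trigger.dev": "3.3.12"
  }
}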

Read our full package upgrading guide here: https://trigger.dev/docs/upgrading-packages
