Previous incidents

January 2025
Jan 30, 2025
1 incident

Slow queue times

Resolved Jan 30 at 10:10am GMT

Queue processing performance is back to normal because demand has dropped.

We have identified the underlying bottleneck and are working on a permanent fix. This shouldn't be a major change and should be live. The bottleneck is heavy contention on an update, which occurs when a single queue's concurrencyLimit is different on every call that triggers a task. This is a usage pattern we haven't seen before.
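
As a rough illustration of the pattern described above, the sketch below keeps one fixed concurrencyLimit per queue name so that repeated trigger calls never ask for a different value. The QueueOptions shape, the queue names, the limits, and both helper functions are hypothetical and are not taken from the SDK. Treating each change in the limit as a proxy for a contended update is also an assumption.

```typescript
// Hypothetical shape of per-trigger queue options (an assumption, not the SDK's types).
type QueueOptions = { name: string; concurrencyLimit: number };

// One fixed limit per queue name, decided once instead of on every trigger call.
// The queue names and numbers here are made up for the example.
const QUEUE_LIMITS = new Map<string, number>([
  ["emails", 10],
  ["reports", 2],
]);

// Returns the same options for a given queue on every call,
// so repeated triggers never request a different concurrencyLimit.
function stableQueueOptions(name: string): QueueOptions {
  const concurrencyLimit = QUEUE_LIMITS.get(name);
  if (concurrencyLimit === undefined) {
    throw new Error(`No concurrency limit configured for queue "${name}"`);
  }
  return { name, concurrencyLimit };
}

// Counts how often the requested limit differs from the previous call.
// Assumption: each difference stands in for one update to the queue's limit.
function countLimitChanges(calls: QueueOptions[]): number {
  let changes = 0;
  let previous: number | undefined;
  for (const call of calls) {
    if (previous !== undefined && call.concurrencyLimit !== previous) {
      changes++;
    }
    previous = call.concurrencyLimit;
  }
  return changes;
}
```

Passing the result of stableQueueOptions("emails") to every trigger call keeps the change count at zero, whereas computing a fresh limit per call makes it grow with the number of calls.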

Jan 23, 2025
1 incident

Deploys are failing with a 520 status code

Downtime

Resolved Jan 24 at 07:28pm GMT

Important: Upgrade to 3.3.12+ in order to deploy again

If you use npx you can upgrade the CLI and all of the packages by running:
npx trigger.dev@latest update

This should download version 3.3.12 (or newer) of the CLI and then prompt you to update the other packages too.

If you have pinned a specific version (e.g. in GitHub Actions) you may need to manually update your package.json file or a workflow file.

Read our full package upgrading guide here: https://trigger.dev/docs/upgrading-packages
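
If you pin versions, a small check like the following sketch can flag entries that are still below 3.3.12 before you try to deploy. It is a minimal example, not part of the CLI. The helper names are hypothetical, and the package-name filter (trigger.dev and anything under @trigger.dev/) is an assumption about which packages need to match. It only understands plain x.y.z versions with an optional ^ or ~ prefix, compares a range by its base version, and flags anything else for manual review.

```typescript
// Minimum version named in the incident notice.
const MIN_VERSION = "3.3.12";

// Parses "3.3.12", "^3.3.12", or "~3.3.12" into [major, minor, patch].
// Returns null for anything else (for example "latest" or a complex range).
function parseVersion(spec: string): [number, number, number] | null {
  const match = /^[\^~]?(\d+)\.(\d+)\.(\d+)$/.exec(spec.trim());
  if (!match) return null;
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// True when spec's base version is at least the minimum.
// Unparseable specs return false so they get flagged for a manual look.
function isAtLeast(spec: string, minimum: string): boolean {
  const a = parseVersion(spec);
  const b = parseVersion(minimum);
  if (!a || !b) return false;
  const [aMajor, aMinor, aPatch] = a;
  const [bMajor, bMinor, bPatch] = b;
  if (aMajor !== bMajor) return aMajor > bMajor;
  if (aMinor !== bMinor) return aMinor > bMinor;
  return aPatch >= bPatch;
}

// Lists matching dependencies whose pinned version is below the minimum.
// Assumption: the relevant packages are "trigger.dev" and the "@trigger.dev/" scope.
function findOutdated(
  deps: Record<string, string>,
  minimum: string = MIN_VERSION,
): string[] {
  return Object.entries(deps)
    .filter(([name]) => name === "trigger.dev" || name.startsWith("@trigger.dev/"))
    .filter(([, spec]) => !isAtLeast(spec, minimum))
    .map(([name, spec]) => `${name}@${spec}`);
}
```

You could feed it the dependencies object parsed from your package.json and treat any returned entries as the ones to update by hand.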

December 2024
Dec 31, 2024
1 incident

Realtime is degraded

Degraded

Resolved Jan 02 at 12:07pm GMT

The ElectricSQL team created a fix for the Postgres transaction wraparound issue, and we've confirmed that it is now deployed and working in production.

Dec 02, 2024
1 incident

Deploys are failing due to a downstream provider

Degraded

Resolved Dec 02 at 10:09pm GMT

We have moved all deploys to Europe on Depot as a temporary fix while Depot fixes the underlying issue in the US. Deploys will be a bit slower than normal and the first one won't use the cache, but they should work.

November 2024
Nov 23, 2024
1 incident

V3 runs are processing slowly

Degraded

Resolved Nov 23 at 12:55am GMT

Runs are processing normally again, and queues should come down quickly.

etcd, the Kubernetes database, had stopped accepting new values. Increasing its maximum sizes, restarting it, and changing some other settings resolved the problem.

Nov 22, 2024
1 incident

Realtime (beta) is offline

Downtime

Resolved Nov 22 at 07:36pm GMT

Realtime is back online.

We've made some configuration changes and have some more reliability fixes in progress to make this rock solid.

Nov 08, 2024
1 incident

V2 runs are processing slowly

Degraded

Resolved Nov 08 at 04:30pm GMT

V2 queues are caught up. Any runs that are still queued are waiting because of concurrency limits.

V3 was not impacted at any point during this period.

We restarted all V2 worker servers and V2 runs started processing again. We are still investigating the underlying cause to prevent this from happening again. There were no code changes or deploys during this period, and the overall V2 load wasn't unusual.
