Previous incidents
Runs are missing from the dashboard and runs.list is degraded
Resolved Jul 23 at 03:45pm BST
The dashboard/runs.list is back to normal. We're working on and deploying multiple changes that will reduce these kinds of issues and prevent them from happening again.
Runs are missing from the dashboard and runs.list is degraded
Resolved Jul 23 at 12:14am BST
The runs list is now fully operational. Some data is still missing, and we will backfill it as soon as possible.
Some runs list calls impacted by ClickHouse server crashes
Resolved Jul 18 at 04:26pm BST
We've opened a case with ClickHouse Cloud to understand why this happened.
Batches with more than 20 runs are slow to process
Resolved Jul 15 at 03:15pm BST
Batches are processing as normal now. We have increased future capacity.
This was caused by a customer's runaway loop of batches. This part of the system didn't have enough capacity to process them all fast enough.
We are updating how we process and rate-limit batches to prevent this from happening again, and improving internal alerts in case similar issues happen in the future.
Realtime not processing updates
Resolved Jul 13 at 05:37pm BST
Realtime is sending updates again. The attached storage stopped working, and restarting the AWS task didn't fix it. A hard reset made it healthy again.
We're looking into how to prevent this from happening again.
Realtime not sending updates
Resolved Jul 04 at 10:34am BST
This is resolved – Realtime is sending updates again.
Restarting Electric released and reacquired the Postgres replication slot. We're investigating why this happened so that we can prevent it in the future.