Previous incidents
Some runs list calls impacted by ClickHouse server crashes
Resolved Jul 18 at 04:26pm BST
We've opened a case with ClickHouse Cloud to understand why this happened.
Batches with more than 20 runs are slow to process
Resolved Jul 15 at 03:15pm BST
Batches are processing as normal now. We have increased future capacity.
This was caused by a customer's runaway loop of batches; this part of the system didn't have enough capacity to process them all fast enough.
We are updating how we process and rate-limit batches to prevent this from happening again, and improving internal alerts in case similar issues occur in the future.
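As an illustration only, the kind of per-customer rate limiting mentioned above can be sketched as a token bucket. This is a minimal sketch, not the limiter that was actually deployed: the `TokenBucket` class, its capacity, and its refill rate are all hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Hypothetical per-customer limiter; names and numbers are illustrative."""

    capacity: float        # maximum burst of batches accepted at once
    refill_per_sec: float  # sustained batches accepted per second
    tokens: float = 0.0
    last_refill: float = 0.0

    def __post_init__(self) -> None:
        # Start with a full bucket so normal bursts are not penalised.
        self.tokens = self.capacity

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Return True if a batch may be accepted at time `now` (in seconds)."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def simulate_runaway_loop() -> list[bool]:
    """Five batches arrive in the same instant, then one more two seconds later."""
    bucket = TokenBucket(capacity=3, refill_per_sec=1.0)
    arrival_times = [0.0, 0.0, 0.0, 0.0, 0.0, 2.0]
    return [bucket.allow(t) for t in arrival_times]
```

In this sketch the first three batches of the burst are accepted, the next two are rejected, and the later batch is accepted once the bucket has refilled, so a runaway loop from one customer cannot consume all of the shared capacity.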
Realtime not processing updates
Resolved Jul 13 at 05:37pm BST
Realtime is sending updates again. The attached storage stopped working, and restarting the AWS task didn't fix it. A hard reset made it healthy again.
We're looking into how to prevent this from happening again.
Realtime not sending updates
Resolved Jul 04 at 10:34am BST
This is resolved – Realtime is sending updates again.
Restarting Electric released and reacquired the Postgres replication slot. We're discussing why this happened and how to prevent it in the future.
v4 dequeue performance degradation
Resolved May 26 at 02:44pm BST
v4 dequeue performance has now improved again, and we're working on two things:
A short-term fix to prevent this from happening again, to be deployed today.
A long-term fix, which we will work on and hope to ship this week, that will vastly improve dequeue performance and scaling.
v3 runs dequeuing slower than normal
Resolved May 06 at 03:28pm BST
Queues are back to nominal length and have been for some time.
This issue was caused by a huge influx of queues, which meant we weren't considering them all when selecting queues for dequeuing.
We have increased some settings to improve this, and we're looking at how to make this scale better for the next 10–100x increase in the number of queues.
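This update doesn't describe the actual selection algorithm, so the following is only a generic illustration of the failure mode: when each selection pass considers a bounded number of queues, a large influx can leave some queues never considered. All function names and the rotating-cursor alternative are hypothetical, not a description of the deployed fix.

```python
def select_fixed_window(queues: list[str], window: int) -> list[str]:
    """Always consider only the first `window` queues (queues past it are never seen)."""
    return queues[:window]


def select_rotating_window(
    queues: list[str], window: int, cursor: int
) -> tuple[list[str], int]:
    """Consider `window` queues starting at `cursor`, wrapping around.

    Returns the candidates and the cursor to use for the next pass.
    """
    if not queues:
        return [], 0
    n = len(queues)
    size = min(window, n)
    start = cursor % n
    picked = [queues[(start + i) % n] for i in range(size)]
    return picked, (start + size) % n


def coverage_after_rounds(queue_count: int, window: int, rounds: int) -> tuple[int, int]:
    """Count how many distinct queues each strategy has considered after `rounds` passes."""
    queues = [f"queue-{i}" for i in range(queue_count)]
    seen_fixed: set[str] = set()
    seen_rotating: set[str] = set()
    cursor = 0
    for _ in range(rounds):
        seen_fixed.update(select_fixed_window(queues, window))
        picked, cursor = select_rotating_window(queues, window, cursor)
        seen_rotating.update(picked)
    return len(seen_fixed), len(seen_rotating)
```

With 10 queues, a window of 3, and 4 passes, the fixed window only ever considers 3 queues, while the rotating window considers all 10. Raising the window size, as in "increased some settings", helps until the queue count grows past it again.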