Logs cluster failure

Incident Report for Fasterize

Resolved

Logs are now fully operational. We apologize for the logs lost during this incident.
Posted Nov 19, 2019 - 17:25 CET

Update

Due to issues with the logs cluster, we have lost some logs from Sunday, November 17 to Monday, November 18, 2019.
CloudFront logs have not been lost. We will relaunch the log extraction batches once the delayed log indexing has caught up.
Posted Nov 19, 2019 - 14:56 CET

Identified

Our logs cluster has been experiencing performance issues since yesterday at 10:00 PM UTC. Some log indices appear to have sharding issues that may have corrupted yesterday's index.
We have scaled the cluster to force it to re-shard and rebalance its shards. This will allow us to remove the unhealthy nodes once the rebalance is finished.
Posted Nov 18, 2019 - 14:50 CET
This incident affected: Logs delivery.