Postmortem -
Read details
Jun 19, 10:39 CEST
Resolved -
This incident is now resolved. A post-mortem will follow, but to sum up: the root cause was a change to a DNS record. During that change, the record pointing to our DC temporarily took a wrong value, which some edge servers picked up and cached for one hour. This affected only a subset of edge servers and only a subset of the health checkers responsible for triggering the failover mechanism, which explains why the failover mechanism wasn't fully triggered.
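To illustrate how a partial failure can leave failover untriggered, here is a minimal sketch in Python. It is not our actual failover logic: the majority quorum, the number of checkers, the IP addresses, and every name in it are hypothetical. Only the one-hour caching comes from the summary above.

```python
from dataclasses import dataclass

CORRECT_IP = "192.0.2.10"  # hypothetical address of the data center
WRONG_IP = "192.0.2.99"    # hypothetical transient wrong value
TTL_SECONDS = 3600         # the one-hour caching mentioned above


@dataclass
class CachedRecord:
    value: str
    fetched_at: float  # seconds since an arbitrary start

    def is_fresh(self, now: float) -> bool:
        return now - self.fetched_at < TTL_SECONDS


def checker_sees_failure(record: CachedRecord, now: float) -> bool:
    """A checker reports the DC as down while it still holds the wrong value."""
    return record.is_fresh(now) and record.value != CORRECT_IP


def should_fail_over(votes: list[bool], quorum: float = 0.5) -> bool:
    """Assumed rule: fail over only if more than `quorum` of checkers agree."""
    return sum(votes) / len(votes) > quorum


# Assume two of five checkers resolved the record during the bad window.
caches = [CachedRecord(WRONG_IP, 0.0)] * 2 + [CachedRecord(CORRECT_IP, 0.0)] * 3

votes_during = [checker_sees_failure(c, now=600.0) for c in caches]
votes_after = [checker_sees_failure(c, now=3700.0) for c in caches]

failover_during = should_fail_over(votes_during)  # 2 of 5 votes, below quorum
failover_after = should_fail_over(votes_after)    # cache expired, no votes
```

Under these assumptions, the checkers holding the wrong value report a failure for the full hour, but they never form a majority, so automatic failover stays off even though part of the traffic is affected.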
Jun 18, 10:55 CEST
Update -
The failover mechanism didn't trigger automatically. We are triggering it manually.
Jun 18, 09:49 CEST
Investigating -
We are currently experiencing issues in one of our European DCs. A fix is in progress. Traffic is interrupted for a large portion of our customers. We are really sorry for the inconvenience.
Jun 18, 09:41 CEST