During the migration of the website configurations database, traffic was redirected to customers’ origin web servers for 32 minutes.
For the majority of websites, the origin served the traffic correctly. For a few websites, however, the origin could not do so because of its configuration.
All times are UTC+2.
On October 3, 2023, we deployed a new database holding website configurations. During the deployment, the platform health checks switched to an unhealthy state.
Platform health checks consist of multiple monitors sending requests to the platform at regular intervals to validate that all layers in the platform are functional. When these requests fail, the traffic is automatically routed to the customers’ origin.
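The routing decision described above can be sketched as follows. This is a minimal illustration, not the platform's actual implementation: the names `MonitorResult`, `check_passed`, and `choose_route`, the pass criterion (any 2xx or 3xx response), and the `failure_ratio` threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class MonitorResult:
    monitor: str       # hypothetical monitor name
    status_code: int   # HTTP status returned by the platform


def check_passed(result: MonitorResult) -> bool:
    # Assumption: any 2xx or 3xx response counts as a passing check.
    return 200 <= result.status_code < 400


def choose_route(results: Iterable[MonitorResult], failure_ratio: float = 1.0) -> str:
    """Return "origin" when enough monitors fail, otherwise "platform"."""
    results = list(results)
    if not results:
        # Assumption: with no monitor data, keep serving from the platform.
        return "platform"
    failed = sum(1 for r in results if not check_passed(r))
    return "origin" if failed / len(results) >= failure_ratio else "platform"


if __name__ == "__main__":
    # Every monitor receives a 521, as happened during this incident.
    results = [MonitorResult("monitor-a", 521), MonitorResult("monitor-b", 521)]
    print(choose_route(results))
```

With the default threshold, traffic moves to the origin only when every monitor fails. A lower `failure_ratio` would fail over sooner, at the cost of reacting to isolated monitor errors.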
After the migration, the health checks received 521 errors, meaning that no configuration was found for the requested domain.
The issue occurred because the deployment changed the configuration loading logic. In the previous release, a request from the health checks was served even if no configuration matched. In the current version, this is no longer possible. As a quick fix, we created a configuration for the health checks.
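The behavior change can be illustrated with a small sketch. The function names, the `is_health_check` flag, and the `healthcheck.invalid` domain are hypothetical and chosen only for illustration. The 521 status is the one reported in this incident, and the real configuration loader is not shown here.

```python
from typing import Dict

# Hypothetical domain used by the health check monitors.
HEALTH_CHECK_DOMAIN = "healthcheck.invalid"


def serve_previous(domain: str, configs: Dict[str, dict], is_health_check: bool) -> int:
    # Previous release (as described): health check requests succeed
    # even when no configuration matches the requested domain.
    if domain in configs or is_health_check:
        return 200
    return 521


def serve_current(domain: str, configs: Dict[str, dict]) -> int:
    # Current release (as described): every request needs a matching
    # configuration, health checks included.
    return 200 if domain in configs else 521


if __name__ == "__main__":
    configs = {"www.example.com": {}}
    print(serve_previous(HEALTH_CHECK_DOMAIN, configs, is_health_check=True))
    print(serve_current(HEALTH_CHECK_DOMAIN, configs))
    # The quick fix: register a configuration for the health check domain.
    configs[HEALTH_CHECK_DOMAIN] = {}
    print(serve_current(HEALTH_CHECK_DOMAIN, configs))
```

In this sketch the fix is a data change rather than a code change, which mirrors the quick fix described above: the stricter lookup stays in place and the health checks get their own configuration entry.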
This issue was not detected in our testing phases for the following reasons:
By design, redirecting browser traffic to the origin when the platform is considered down is the correct behavior. However, we are seeing more and more cases where the origin cannot accept the traffic sent by browsers, for reasons such as firewalls or incorrect certificates. We will improve our API to manage these edge cases.