Summary
On May 14, 2025, Fasterize experienced a partial service disruption affecting a subset of customers. The issue was caused by a large-scale DDoS attack targeting a website accelerated by our platform. The incident lasted approximately 25 minutes, and service was fully restored at 21:47 (UTC+2).
Timeline (UTC+2)
- 21:22 – 21:30: Our systems registered an abnormally high volume of requests: over 37 million in total, peaking at 350,000 requests per second.
- 21:47: Traffic stabilized and all services were back to normal.
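To put these figures in perspective, the short calculation below derives the average request rate during the burst and the total incident duration. It is a rough sketch that assumes all 37 million requests arrived inside the 21:22 – 21:30 window; since the report says "over 37 million", the average is a lower bound. The variable names are illustrative only.

```python
from datetime import datetime

FMT = "%H:%M"
burst_start = datetime.strptime("21:22", FMT)
burst_end = datetime.strptime("21:30", FMT)
restored = datetime.strptime("21:47", FMT)

total_requests = 37_000_000   # "over 37 million", so treat as a lower bound
peak_rps = 350_000            # reported peak, requests per second

# Assumption: every request in the total arrived during the burst window.
burst_seconds = (burst_end - burst_start).total_seconds()
average_rps = total_requests / burst_seconds
peak_to_average = peak_rps / average_rps
incident_minutes = (restored - burst_start).total_seconds() / 60

print(f"burst window:     {burst_seconds:.0f} s")
print(f"average rate:     {average_rps:,.0f} req/s")
print(f"peak / average:   {peak_to_average:.1f}x")
print(f"incident length:  {incident_minutes:.0f} min")
```

Under that assumption the burst averaged roughly 77,000 requests per second, with the peak about 4.5 times higher.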
What Happened
The DDoS attack overwhelmed several load balancers, leading to repeated restarts. Under normal circumstances, our failover system automatically routes traffic directly to the origin servers if a platform zone becomes unhealthy.
However, the DNS health checks tied to certain zones were misconfigured. They continued to report those zones as healthy despite the outage, so failover did not trigger.
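The failover behaviour described above can be illustrated with a small simulation. This is a minimal sketch, not the platform's actual implementation: the `ZoneHealth` class, the three-failure threshold, and both probe functions are hypothetical. In particular, this report does not say how the health checks were misconfigured; a probe that always succeeds is used here only as a stand-in for a check that does not reflect the real state of the load balancers.

```python
from dataclasses import dataclass


@dataclass
class ZoneHealth:
    """Tracks consecutive failed probes for one zone (hypothetical model)."""
    failure_threshold: int = 3      # assumed value, for illustration only
    consecutive_failures: int = 0

    def record_probe(self, succeeded: bool) -> None:
        # A success resets the counter; a failure increments it.
        self.consecutive_failures = 0 if succeeded else self.consecutive_failures + 1

    @property
    def healthy(self) -> bool:
        return self.consecutive_failures < self.failure_threshold


def choose_target(zone: ZoneHealth) -> str:
    """Send traffic to the platform while the zone is healthy, else to the origin."""
    return "platform" if zone.healthy else "origin"


def correct_probe(load_balancer_up: bool) -> bool:
    # The probe result follows the real state of the load balancer.
    return load_balancer_up


def misconfigured_probe(load_balancer_up: bool) -> bool:
    # Stand-in for a misconfigured check: it ignores the load balancer state.
    return True


def simulate(probe, load_balancer_states):
    """Return the routing decision made after each probe."""
    zone = ZoneHealth()
    decisions = []
    for up in load_balancer_states:
        zone.record_probe(probe(up))
        decisions.append(choose_target(zone))
    return decisions


# One healthy probe interval, then the load balancer goes down.
states = [True, False, False, False, False]
with_correct_check = simulate(correct_probe, states)
with_broken_check = simulate(misconfigured_probe, states)
print(with_correct_check)  # ['platform', 'platform', 'platform', 'origin', 'origin']
print(with_broken_check)   # ['platform', 'platform', 'platform', 'platform', 'platform']
```

With the correct probe, traffic moves to the origin after the third consecutive failure. With the stand-in misconfigured probe, the zone never looks unhealthy, so traffic keeps going to the failing platform zone.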
Impact
- Severity Level: 1 (Unplanned downtime affecting multiple production websites)
- Detection time: 12 minutes
- Time to full recovery: 25 minutes
What We're Doing
Immediate fixes
- Corrected the failover configuration so that DNS health checks accurately report unhealthy zones.
Short-term improvements
- Tuned load balancer settings for better resilience under high traffic.
- Improved alerting on health check anomalies.
Medium-term improvements
- Increasing infrastructure redundancy to distribute traffic more effectively.
- Evaluating native rate-limiting solutions to mitigate volumetric attacks.
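For readers unfamiliar with rate limiting, the sketch below shows a token bucket, one common technique for capping request rates. It is a generic illustration only: it is not the solution under evaluation, and the class name, parameters, and explicit `now` timestamps are hypothetical choices made to keep the example deterministic.

```python
class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity      # start with a full bucket
        self.updated_at = now

    def allow(self, now: float) -> bool:
        # Refill according to the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = max(self.updated_at, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0      # spend one token for this request
            return True
        return False                # bucket empty: reject or queue the request


bucket = TokenBucket(rate=2.0, capacity=5.0)

# Ten requests arriving at t = 0 s: only the burst capacity gets through.
burst = [bucket.allow(now=0.0) for _ in range(10)]
print(burst.count(True))   # 5

# One second later, two tokens have been refilled.
later = [bucket.allow(now=1.0) for _ in range(3)]
print(later)               # [True, True, False]
```

A token bucket tolerates short bursts up to its capacity while holding the sustained rate to the refill rate.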