Platform has been unavailable

Incident Report for Fasterize

Postmortem

Summary

On May 13, 2025, Fasterize experienced a partial service disruption affecting a subset of customers. The issue was caused by a large-scale DDoS attack targeting a website accelerated by our platform. The incident lasted approximately 25 minutes, with service fully restored at 21:47 (UTC+2).

Timeline (UTC+2)

  • 21:22 – 21:30: Our systems registered an abnormally high volume of requests: more than 37 million in total, peaking at 350,000 requests per second.
  • 21:47: Traffic stabilized and all services were back to normal.

What Happened

The DDoS attack overwhelmed several load balancers, causing them to restart repeatedly. Under normal circumstances, our failover system automatically routes traffic directly to the origin servers if a platform zone becomes unhealthy.

However, the DNS health checks tied to certain zones were misconfigured. They continued to report those zones as healthy despite the outage, which prevented failover from triggering correctly.
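
To illustrate this failure mode, here is a minimal Python sketch of a failover decision driven by a health check. Every name in it (Zone, check_stale, check_load_balancers, the hostnames, the 0.5 threshold) is hypothetical and does not reflect Fasterize's actual configuration. It only shows how a check that does not observe the load balancers keeps traffic on an unhealthy zone, while a check tied to them triggers failover to the origin.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Zone:
    # All field names and values are hypothetical, for illustration only.
    name: str
    platform_target: str       # DNS target that serves traffic through the platform
    origin_target: str         # DNS target that points straight at the origin
    load_balancers_up: int     # load balancers currently answering
    load_balancers_total: int  # load balancers deployed in the zone


def check_load_balancers(zone: Zone, min_ratio: float = 0.5) -> bool:
    """Healthy only if a sufficient share of load balancers is answering."""
    return zone.load_balancers_up / zone.load_balancers_total >= min_ratio


def check_stale(zone: Zone) -> bool:
    """A check that never looks at the load balancers: it always says healthy."""
    return True


def resolve_target(zone: Zone, health_check: Callable[[Zone], bool]) -> str:
    """Return the platform target when healthy, otherwise fail over to the origin."""
    return zone.platform_target if health_check(zone) else zone.origin_target


# A zone whose load balancers are all down.
zone = Zone("zone-a", "platform.example.net", "origin.example.com", 0, 4)

print(resolve_target(zone, check_stale))           # stays on the platform
print(resolve_target(zone, check_load_balancers))  # fails over to the origin
```

With the stale check, the zone with no working load balancers still resolves to the platform target. With the check tied to load balancer state, the same zone resolves to the origin.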

Impact

  • Severity Level: 1 (Unplanned downtime affecting multiple production websites)
  • Detection time: 12 minutes
  • Time to full recovery: 25 minutes

What We're Doing

Immediate fixes

  • Corrected the failover configuration to ensure accurate health checks.

Short-term improvements

  • Tuned load balancer settings for better resilience under high traffic.
  • Improved alerting on health check anomalies.

Medium-term improvements

  • Increasing infrastructure redundancy to distribute traffic more effectively.
  • Evaluating native rate-limiting solutions to mitigate volumetric attacks.
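
For readers unfamiliar with rate limiting, the sketch below shows a token bucket, one common technique for capping request rates. It is an illustrative assumption only: the solutions under evaluation are not specified in this report, and the class name, parameters, and values are hypothetical. Timestamps are passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Allow up to `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum number of stored tokens
        self.tokens = capacity    # start with a full bucket
        self.last = now           # time of the last refill, in seconds

    def allow(self, now: float) -> bool:
        """Refill according to elapsed time, then spend one token if available."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical limits: 2 requests per second, burst of 2.
bucket = TokenBucket(rate=2.0, capacity=2.0)

# Four requests arriving at the same instant: the first two pass, the rest are rejected.
burst = [bucket.allow(now=0.0) for _ in range(4)]
print(burst)

# One second later the bucket has refilled, so the next request passes.
print(bucket.allow(now=1.0))
```

In this sketch, requests beyond the configured burst are rejected until enough time has passed for the bucket to refill.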
Posted May 15, 2025 - 17:04 CEST

Resolved

We experienced a service disruption caused by a Distributed Denial of Service (DDoS) attack.

The issue has now been resolved. A full post-mortem will follow.

Thank you for your patience and understanding.
Posted May 13, 2025 - 22:53 CEST

Investigating

One of our datacenters was unavailable between 21:21 and 21:47. We are investigating the incident.
Posted May 13, 2025 - 21:54 CEST
This incident affected: Acceleration.