On May 24th of 2023 all the 3scale SaaS APIs suffered a major outage for 26 minutes.
May 24, 2023 09:03 UTC: rollout of a new version of the control plane that the 3scale SaaS uses to distribute its ingress layer configuration is completed.
May 24, 2023 09:04 UTC: alerts start firing for all 3scale SaaS APIs. Statistics show all requests to all APIs failing. Given the widespread outage the rollback procedure is started some minutes later to bring the control plane back to its previous version.
May 24, 2023 09:30 UTC - the rollback procedure is completed and service is recovered for all APIs.
The root cause has been identified as being related to the rollout strategy used to deploy the new versions of the control plane.
Mitigations include modifying this strategy appropriately and adding some control mechanisms to make the process more robust and avoid this from happening in the future.
Posted May 24, 2023 - 16:37 CEST
This incident has been resolved.
Posted May 24, 2023 - 11:29 CEST
Our operations team is working to identify the root cause and implement a solution.