Service Management API Errors
Incident Report for Red Hat 3scale
Postmortem

On July 24, 2017, we experienced an elevated number of errors in our Service Management API. The outage was caused by an automated database failover that failed during a maintenance task, corrupting the data on one of the shards in our database cluster.

To reestablish the service, our team had to restore the database from a backup file. Due to the size of the backup file, the restore operation took several minutes to complete.

The restore completed successfully, and all the errors in our Service Management API were resolved.

Posted Jul 24, 2017 - 21:06 CEST

Resolved
The incident has been resolved.
Posted Jul 24, 2017 - 19:36 CEST
Monitoring
The team has restored the affected service, and we are monitoring the situation.
Posted Jul 24, 2017 - 19:14 CEST
Identified
The issue has been identified, and we are working on a fix.
Posted Jul 24, 2017 - 19:06 CEST
Investigating
We are investigating an elevated number of errors in our Service Management API.
Posted Jul 24, 2017 - 18:54 CEST