On November 7th, 2023, the Service Management API suffered a partial outage lasting 35 minutes that affected about 25% of API requests.
A fix to the internal event-fetching mechanism, which pulls data through an internal API on the component that serves the Service Management API, overloaded that component because of data that had accumulated in the events queue. This increased latency and caused a partial outage of the Service Management API.
Mitigations include developing a different fetching mechanism that does not affect the performance of the API, and improving testing to cover this scenario in the staging and development environments.
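
The report does not describe the replacement mechanism. As a purely illustrative sketch of one way a backlog could be drained without overloading the serving component, the Python below caps how many queued events are handled per cycle. The in-memory queue, the function name `drain_in_batches`, and the parameters `batch_size` and `max_batches_per_cycle` are all assumptions made for this example, not details of the actual system.

```python
from collections import deque
from typing import Callable, Deque, List


def drain_in_batches(
    queue: Deque[dict],
    handle_batch: Callable[[List[dict]], None],
    batch_size: int = 100,            # hypothetical limit per batch
    max_batches_per_cycle: int = 5,   # hypothetical limit per cycle
) -> int:
    """Handle at most batch_size * max_batches_per_cycle events, then return.

    Returns the number of events handled in this cycle. Anything still in
    the queue is left for a later cycle instead of being pulled all at once.
    """
    processed = 0
    for _ in range(max_batches_per_cycle):
        if not queue:
            break
        # Take up to batch_size events from the front of the queue.
        batch = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
        handle_batch(batch)
        processed += len(batch)
    return processed
```

With these defaults, a backlog of 1,200 accumulated events would be worked off 500 at a time across successive cycles, so the load from a large backlog is spread out rather than landing on the component in a single burst.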