Post Incident Report
Dublin 3 Cluster Application Incident
Date of Issue 23 April 2024
Incident Reference INC-20240423-1418
01
Summary
On 23 April 2024, the Sedna Platform experienced a partial loss of service. A small number of customers were affected by an application outage and were unable to send or receive emails for approximately 15 minutes. This is not the level of quality and reliability we strive to offer you. We have conducted an internal investigation and are taking immediate steps to improve the platform’s availability.
02
Detailed description
At approximately 14:08 UTC on 23 April 2024, one of Sedna’s primary applications (Node) restarted. The restart was an automated action triggered by the system in response to an unhealthy state, and resulted in approximately 15 minutes of downtime while the restart completed.
The restart was caused by a memory issue with the service, combined with an extraordinarily high workload on the system. The workload caused a backlog of requests, which eventually exhausted the system memory and triggered the restart. The restart itself is an automated action that allows the system to respond to such an event and recover quickly; however, systems can be unavailable while the restart is in progress.
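The failure mode described above, where requests arrive faster than they are served until the backlog no longer fits in memory, can be illustrated with a small model. This is only a sketch: the function name, the per-request memory cost, and all of the numbers are hypothetical and are not taken from Sedna's systems.

```python
def ticks_until_memory_exhausted(arrivals_per_tick, served_per_tick,
                                 bytes_per_request, memory_limit_bytes,
                                 max_ticks=10_000):
    """Return the first tick at which the queued requests exceed the memory
    limit, or None if that never happens within max_ticks.

    Hypothetical model: every queued request holds a fixed amount of memory
    and the service drains a fixed number of requests per tick.
    """
    backlog = 0
    for tick in range(1, max_ticks + 1):
        # Requests that arrive faster than they are served accumulate.
        backlog = max(0, backlog + arrivals_per_tick - served_per_tick)
        if backlog * bytes_per_request > memory_limit_bytes:
            return tick
    return None


# Illustrative numbers only: 20 extra requests queue up every tick.
overloaded = ticks_until_memory_exhausted(120, 100, 1_000, 100_000)  # -> 6
# Under normal load the backlog never builds, so the limit is never reached.
normal = ticks_until_memory_exhausted(80, 100, 1_000, 100_000)       # -> None
```

In this model, memory is exhausted only while arrivals exceed the service rate, which matches the report's description of a memory issue combined with an unusually high workload.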
The first customer case reporting symptoms of an application outage was raised with Customer Support at 14:13 UTC, at which point the Sedna Support team triggered a major incident with the engineering team to urgently investigate the issue. The Sedna team posted a Status Page notification at 14:21 UTC, informing all Status Page followers that the incident was under investigation.
The service was fully operational at 14:23 UTC and all customers who reported an incident were informed of the incident closure on the same day.
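The intervals between the events in this timeline can be checked with a short calculation. The times below are the ones stated in this report (the 14:08 restart time is approximate); the event labels and the function name are chosen here for illustration only.

```python
from datetime import datetime

TIME_FORMAT = "%H:%M"

# Times (UTC) as stated in the report; labels are illustrative.
EVENTS = [
    ("Automated restart begins (approximate)", "14:08"),
    ("First customer case raised", "14:13"),
    ("Status Page notification posted", "14:21"),
    ("Service fully operational", "14:23"),
]


def minutes_since_first_event(events):
    """Return (label, minutes elapsed since the first event) for each event."""
    start = datetime.strptime(events[0][1], TIME_FORMAT)
    result = []
    for label, clock_time in events:
        elapsed = datetime.strptime(clock_time, TIME_FORMAT) - start
        result.append((label, int(elapsed.total_seconds() // 60)))
    return result


for label, minutes in minutes_since_first_event(EVENTS):
    print(f"+{minutes:>2} min  {label}")
```

The last entry comes out at 15 minutes, consistent with the approximate outage duration given in the Summary.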
03
Remediation and Prevention
An incident of this nature receives Sedna’s highest level of scrutiny to ensure we can provide our customers with full confidence in the system. Following the incident, the team conducted a retrospective to review the remediation taken and to detail next steps to prevent similar issues from occurring in the future. The Remediation and Prevention details are set out below:
Remediation:
We have put the following changes into effect to reduce the likelihood of the issue recurring:
04
What you can expect from SEDNA
We understand the critical nature of the services SEDNA provides your business. We will continue to communicate with customers to answer any questions and ensure we do our best to provide a seamless customer experience. We apologize for any issues these events may have caused.
Please reach out directly to SEDNA Support (support@sedna.com) with any questions.