STL - POD02 Downtime Incident.

Incident Report for Hostek

Postmortem

Affected Services:
Customers hosted in the St. Louis (STL) location, limited to compute POD02.

Date and Time Issue Start:
27/March/2025 03:19 UTC

Date and Time Issue End:
27/March/2025 09:40 UTC

Root Cause:
The High Availability (HA) network did not engage as designed.

Contributing Factors:
Gaining access to the management interface took longer than the setup was designed to allow.

Future Corrective Actions:

  • Deploy a new secondary routing device with enhanced failover capabilities
  • Implement proactive testing of HA devices to validate performance under real-world conditions
  • Expand monitoring coverage to detect and mitigate potential failures earlier
  • Optimize network design to improve management access during unexpected events

Management Notes:
We sincerely apologize for the disruption caused by this outage. We understand the impact downtime has on our customers, and while such incidents are rare, we take them very seriously. Our team is committed to continuously improving our infrastructure to prevent similar occurrences in the future. The corrective actions outlined above will enhance our network's resilience and ensure a more stable and reliable service.

Posted Mar 28, 2025 - 16:38 UTC

Resolved

The issue has been resolved.
A postmortem will be provided in the near future, once investigations are complete.
Posted Mar 27, 2025 - 09:47 UTC

Identified

We believe the underlying issue has now been identified.
Our teams and vendor support are continuing to investigate and address the underlying fault.

Further updates will be provided.
Posted Mar 27, 2025 - 09:17 UTC

Update

We believe we have identified the underlying issue.
Alongside our vendor support, we are continuing to work to identify the cause.
Further updates will be provided.
Posted Mar 27, 2025 - 07:13 UTC

Update

Our team are continuing to investigate the underlying cause.
Further updates will be provided.
Posted Mar 27, 2025 - 05:16 UTC

Investigating

We are aware of, and currently investigating, an issue affecting location STL, limited to compute POD02.
Further updates will be provided.
Posted Mar 27, 2025 - 04:18 UTC
This incident affected: Network Infrastructure (STL Region).