CyberCat - Failure of one of the container nodes – Incident details

Failure of one of the container nodes

Resolved
Major outage
Started about 1 month ago
Lasted 13 minutes

Affected

Production web services

Partial outage from 11:47 PM to 12:00 AM

Container services

Partial outage from 11:47 PM to 12:00 AM

Updates
  • Resolved

    This incident has now been resolved.

  • Monitoring

    One of the nodes supporting our Docker infrastructure stopped responding. The orchestrator detected the unresponsive node automatically and restarted the services it hosted on another node within 1-2 minutes. Some public-facing services may have been affected during that window.

    The situation has since returned to normal.

    We will rebalance some services across nodes, without user-visible impact, to distribute the load more evenly.
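
For readers curious how this kind of automatic recovery works in general, the following is a minimal sketch and not our orchestrator's actual implementation. It assumes a heartbeat-based health check. The names `Node`, `HEARTBEAT_TIMEOUT_S`, and `reschedule_unresponsive`, the 60-second timeout, and the "least-loaded node" placement rule are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical threshold: a node is considered unresponsive when its last
# heartbeat is older than this many seconds.
HEARTBEAT_TIMEOUT_S = 60.0


@dataclass
class Node:
    name: str
    last_heartbeat: float  # timestamp (seconds) of the last heartbeat seen
    services: List[str] = field(default_factory=list)


def reschedule_unresponsive(
    nodes: List[Node], now: float, timeout: float = HEARTBEAT_TIMEOUT_S
) -> List[Tuple[str, str, str]]:
    """Move services off unresponsive nodes onto the least-loaded healthy node.

    Returns a list of (service, from_node, to_node) moves.
    """
    healthy = [n for n in nodes if now - n.last_heartbeat <= timeout]
    failed = [n for n in nodes if now - n.last_heartbeat > timeout]

    moves: List[Tuple[str, str, str]] = []
    if not healthy:
        # Nowhere to move services to; leave everything in place.
        return moves

    for node in failed:
        for service in list(node.services):
            # Assumed placement rule: pick the healthy node that currently
            # hosts the fewest services (ties go to the first one listed).
            target = min(healthy, key=lambda n: len(n.services))
            node.services.remove(service)
            target.services.append(service)
            moves.append((service, node.name, target.name))
    return moves


if __name__ == "__main__":
    cluster = [
        Node("node-a", last_heartbeat=0.0, services=["web", "api"]),
        Node("node-b", last_heartbeat=100.0, services=["db"]),
        Node("node-c", last_heartbeat=100.0),
    ]
    for service, src, dst in reschedule_unresponsive(cluster, now=100.0):
        print(f"{service}: {src} -> {dst}")
```

In this sketch the delay before services come back is the detection timeout plus the restart time, which is why a brief interruption can be visible even when recovery is fully automatic. Placing each service on the least-loaded node also hints at why a later rebalancing pass can be useful: after a failover, the surviving nodes carry more than their usual share.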