Consul/Nomad outage interfering with app instance scheduling
Incident Report for Fly.io
Resolved
This incident has been resolved.
Posted Oct 28, 2022 - 16:48 UTC
Monitoring
We have cleaned up most lingering issues from the Consul/Nomad outage and are monitoring to ensure that people's apps recover properly.
Posted Oct 28, 2022 - 14:30 UTC
Update
We are still working on this issue. Deploys, restarts, and rescheduling of apps are all currently affected by this.
Posted Oct 28, 2022 - 13:42 UTC
Identified
We are having continued issues with our Consul and Nomad clusters. We are working on restoring them.
Posted Oct 28, 2022 - 11:00 UTC
Monitoring
We've repaired our Nomad and Consul clusters and are working through a backlog of queued operations. Deployments and restarts will continue to be delayed by roughly 5 to 30 minutes until the queue is empty.
Posted Oct 28, 2022 - 07:29 UTC
Update
Deployment monitoring is returning errors in most cases. Deployments themselves are being processed through a queue, and it's currently taking upwards of 30 minutes for deploys to take effect.
Posted Oct 28, 2022 - 03:01 UTC
Investigating
We are currently investigating an issue with Consul and Nomad that is causing problems with deployment monitoring. Deployments are still happening.
Posted Oct 28, 2022 - 02:16 UTC