Consul/Nomad outage interfering with app instance scheduling
Incident Report for Fly.io
Resolved
This incident has been resolved.
Posted Oct 28, 2022 - 11:48 CDT
Monitoring
We have cleaned up most lingering issues from the Consul/Nomad outage and are monitoring to ensure that people's apps recover properly.
Posted Oct 28, 2022 - 09:30 CDT
Update
We are still working on this issue. Deploys, restarts, and rescheduling of apps are all currently affected.
Posted Oct 28, 2022 - 08:42 CDT
Identified
We are having continued issues with our Consul and Nomad clusters. We are working on restoring them.
Posted Oct 28, 2022 - 06:00 CDT
Monitoring
We've repaired our Nomad and Consul clusters and are working through a backlog of queued operations. Deployments and restarts will continue to be delayed by roughly 5 to 30 minutes until the queue is empty.
Posted Oct 28, 2022 - 02:29 CDT
Update
Deployment monitoring is erroring in most cases. Deployments themselves are working through a queue. It's currently taking upwards of 30 minutes for deploys to take effect.
Posted Oct 27, 2022 - 22:01 CDT
Investigating
We are currently investigating an issue with Consul and Nomad that is causing problems with deployment monitoring. Deployments are still happening.
Posted Oct 27, 2022 - 21:16 CDT