Fly machines API slow/unresponsive

Incident Report for Fly.io

Resolved

This incident is now resolved.
Posted Sep 13, 2024 - 18:34 UTC

Monitoring

After reverting the suspect changes, our metrics indicate service recovery. The Machines API is now operating normally. We are monitoring for any lingering issues.
Posted Sep 13, 2024 - 18:04 UTC

Identified

We've identified the problematic change and are performing a rollback to restore service.
Posted Sep 13, 2024 - 17:56 UTC

Investigating

Fly API operations are currently slow or failing. We're investigating the cause.

This affects most operations that rely on the Machines API, including starting Fly builders, deploying, creating or destroying machines, and querying machine or volume data via the API.

Existing, running apps and machines are not affected and continue to serve normally.
Posted Sep 13, 2024 - 17:47 UTC
This incident affected: Machines API and Remote Builds.