Elevated error rate with API

Incident Report for Fly.io

Resolved

This incident has been resolved.
Posted Aug 21, 2024 - 10:27 UTC

Monitoring

A fix is being rolled out gradually across the fleet. While the fix is being deployed to a host server, API operations on the Fly Machines on that server may fail. The Fly Machines themselves will continue to function normally.
Posted Aug 20, 2024 - 23:17 UTC
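
Because API failures during the rollout are expected to be temporary for any given host, client-side retries with backoff may help. The following is a minimal sketch, not an official Fly.io client: `call_with_retry`, its parameters, and the `operation` callable (which is assumed to perform one API request and return its HTTP status code) are hypothetical names chosen for illustration.

```python
import time
from typing import Callable

# Status codes named in this incident; treated here as retryable (an assumption).
RETRYABLE_STATUSES = {408, 504}


def call_with_retry(
    operation: Callable[[], int],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run `operation` up to `attempts` times, backing off exponentially.

    `operation` is a hypothetical callable that makes one API request and
    returns the HTTP status code. The last status seen is returned.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    status = operation()
    for attempt in range(1, attempts):
        if status not in RETRYABLE_STATUSES:
            return status
        # Wait 1x, 2x, 4x ... the base delay before the next try.
        sleep(base_delay * 2 ** (attempt - 1))
        status = operation()
    return status
```

The `sleep` parameter is injectable so the helper can be exercised without real delays. Retrying suits only the temporary failures described in this update; it is unlikely to help with a Machine that fails consistently.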

Identified

Users may see higher rates of 504 and 408 HTTP errors when accessing the Fly Platform API. These errors are most likely with the deploy, scale, and destroy commands, and they occur only for specific Fly Machines. Recommended workarounds include skipping affected Machines during a deploy with --exclude-machines, or destroying an affected Machine with fly m destroy. The problem only involves API operations; Fly Machines themselves function normally.
Posted Aug 20, 2024 - 23:14 UTC
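
As an illustration of the --exclude-machines workaround, the sketch below collects the Machines whose API calls returned 408 or 504 and builds the matching command-line arguments. It rests on assumptions: `exclude_machines_args` is a hypothetical helper, the Machine IDs are placeholders, and the flag is assumed to accept a comma-separated list of IDs.

```python
from typing import Mapping

# Status codes this incident associates with affected Machines.
AFFECTED_STATUSES = {408, 504}


def exclude_machines_args(statuses: Mapping[str, int]) -> list[str]:
    """Build deploy arguments that skip Machines with failing API calls.

    `statuses` maps a Machine ID to the HTTP status of its most recent API
    call. Returns an empty list when no Machine looks affected.
    """
    affected = sorted(
        machine_id
        for machine_id, status in statuses.items()
        if status in AFFECTED_STATUSES
    )
    if not affected:
        return []
    # Assumption: the flag takes a comma-separated list of Machine IDs.
    return ["--exclude-machines", ",".join(affected)]


# Placeholder IDs, not real Machines.
print(exclude_machines_args({"machine-b": 504, "machine-a": 408, "machine-c": 200}))
```

The returned list can be appended to a deploy invocation. Excluded Machines are left as they are, so they would still need a later deploy once API operations on them succeed again.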
This incident affected: Machines API.