All Systems Operational

About This Site

This page covers global incidents. It does not include routine hardware failures or isolated infrastructure events with limited impact. For a view of all events that might affect your apps, check the personalized status page in your Fly Organization's dashboard. For internal incidents and other activities, check the Infra Log.

Customer Applications Operational
Dashboard Operational
Machines API Operational
Regional Availability Operational
AMS - Amsterdam, Netherlands Operational
ARN - Stockholm, Sweden Operational
BOM - Mumbai, India Operational
CDG - Paris, France Operational
DFW - Dallas, Texas (US) Operational
EWR - Secaucus, NJ (US) Operational
FRA - Frankfurt, Germany Operational
GRU - Sao Paulo, Brazil Operational
IAD - Ashburn, Virginia (US) Operational
JNB - Johannesburg, South Africa Operational
LAX - Los Angeles, California (US) Operational
LHR - London, United Kingdom Operational
NRT - Tokyo, Japan Operational
ORD - Chicago, Illinois (US) Operational
SIN - Singapore Operational
SJC - San Jose, California (US) Operational
SYD - Sydney, Australia Operational
YYZ - Toronto, Canada Operational
Persistent Storage (Volumes) Operational
Deployments Operational
Remote Builds Operational
Logs Operational
Metrics Operational
SSL/TLS Certificate Provisioning Operational
UDP Anycast Operational
Fly Machine Image Registry 1 Operational
Fly Machine Image Registry 2 Operational
Extensions Operational
Upstash for Redis Operational
DNS Operational
Fly Machine .internal DNS Operational
Fly Machine External DNS Operational
*.flyio.net Nameservers Operational
flydns.net Operational
Billing Operational
Usage Metrics API Operational
Stripe API Connection Operational
Corrosion Operational
Managed Postgres Operational (99.95 % uptime over the past 90 days)
Management Plane - ORD Operational (99.96 % uptime over the past 90 days)
Management Plane - IAD Operational (99.81 % uptime over the past 90 days)
Management Plane - FRA Operational (100.0 % uptime over the past 90 days)
Management Plane - GRU Operational (100.0 % uptime over the past 90 days)
Management Plane - LAX Operational (100.0 % uptime over the past 90 days)
Management Plane - SYD Operational (99.98 % uptime over the past 90 days)
Management Plane - AMS Operational (99.77 % uptime over the past 90 days)
Management Plane - LHR Operational (100.0 % uptime over the past 90 days)
Management Plane - NRT Operational (100.0 % uptime over the past 90 days)
Management Plane - SIN Operational (99.87 % uptime over the past 90 days)
Management Plane - SJC Operational (100.0 % uptime over the past 90 days)
Management Plane - YYZ Operational (100.0 % uptime over the past 90 days)
Phoenix.new Operational
Support Portal Operational
Sprites Operational
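
The 90-day uptime percentages listed for Managed Postgres can be read as an approximate downtime budget. The sketch below is only an illustration: the function name `downtime_minutes` is hypothetical, and it assumes every non-operational minute counts fully against uptime. How this page weighs partial outages is not stated here, so treat the result as an equivalent full-outage duration rather than an exact record.

```python
def downtime_minutes(uptime_percent: float, window_days: int = 90) -> float:
    """Convert an uptime percentage over a window into equivalent minutes of downtime.

    Assumption: downtime is counted at full weight for the whole window.
    """
    total_minutes = window_days * 24 * 60  # 90 days = 129,600 minutes
    return total_minutes * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    # Values taken from the component list above.
    for name, pct in [("Managed Postgres", 99.95), ("Management Plane - AMS", 99.77)]:
        print(f"{name}: about {downtime_minutes(pct):.0f} minutes over 90 days")
```

Under that assumption, 99.95 % corresponds to roughly 65 minutes of downtime over 90 days, and 99.77 % to roughly 298 minutes.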
Mar 14, 2026
Resolved - This incident has been resolved.
Mar 14, 14:05 UTC
Monitoring - Organizations whose names begin with a digit may experience 401 errors. Affected operations include Sprite creation and listing, among others.

A fix has been in place since 2026-03-14 12:30 UTC, and we are monitoring the results.

Mar 14, 04:20 UTC
Mar 13, 2026

No incidents reported.

Mar 12, 2026

No incidents reported.

Mar 11, 2026
Resolved - This incident has been resolved.
Mar 11, 11:37 UTC
Update - While the secret storage service was in a read-only state, app creation requests queued up because of retry logic and insufficient request concurrency limits in our GraphQL API. The queued requests prevented the GraphQL API from serving any other requests. We have scaled up the GraphQL API and are continuing to monitor the situation.
Mar 11, 11:03 UTC
Monitoring - A fix has been implemented and we are monitoring the results.
Mar 11, 10:14 UTC
Identified - An ongoing data migration in our secret storage service is causing degraded Machines API functionality.
Mar 11, 09:19 UTC
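
The Mar 11 update describes one class of request (app creation) queuing up until the API could not serve anything else. One common way to contain that kind of failure is a per-operation concurrency cap that rejects excess work immediately instead of letting it wait. The following is a minimal sketch of that general pattern, not a description of how the GraphQL API is actually built; `Bulkhead`, `BulkheadFull`, and the limit value are hypothetical names chosen for illustration.

```python
import threading


class BulkheadFull(Exception):
    """Raised when an operation class already has its maximum number of calls in flight."""


class Bulkhead:
    """Caps in-flight calls for one operation class and rejects the rest immediately."""

    def __init__(self, limit: int) -> None:
        self._slots = threading.BoundedSemaphore(limit)

    def run(self, fn, *args):
        # Non-blocking acquire: a caller never queues behind slow requests.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFull("too many requests in flight for this operation")
        try:
            return fn(*args)
        finally:
            self._slots.release()


# Hypothetical usage: app creation gets its own cap, so a stalled dependency
# can occupy at most this many workers while other request types keep running.
app_create_bulkhead = Bulkhead(limit=8)
```

Rejecting quickly only helps if callers back off before retrying; immediate retries would recreate the queue on the client side.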
Mar 10, 2026

No incidents reported.

Mar 9, 2026

No incidents reported.

Mar 8, 2026

No incidents reported.

Mar 7, 2026
Resolved - This incident has been resolved.
Mar 7, 15:56 UTC
Monitoring - A fix has been implemented and we are monitoring the results.
Mar 7, 15:10 UTC
Investigating - We are investigating a private networking failure between SYD and other regions. Apps continue to run, and private networking within SYD is unaffected.
Mar 7, 14:42 UTC
Mar 6, 2026

No incidents reported.

Mar 5, 2026
Resolved - This incident has been resolved. Due to a BGP issue, we saw some North American traffic routed to edges in Singapore (sin). Users in North America would have seen additional request latency during this period.
Mar 5, 19:50 UTC
Monitoring - A fix has been implemented and we are monitoring the results.
Mar 5, 19:38 UTC
Investigating - We're aware of routing issues affecting some customers in North American regions, and we're actively investigating.
Mar 5, 19:24 UTC
Mar 4, 2026
Completed - The scheduled maintenance has been completed.
Mar 4, 09:00 UTC
In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
Mar 4, 03:00 UTC
Scheduled - An upstream provider is performing network maintenance in GRU on 2026-03-04, from 03:00 UTC (00:00 local time) to 09:00 UTC (06:00 local time). A loss of connectivity for up to 30 minutes is expected within the scheduled maintenance window.
Feb 20, 19:32 UTC
Mar 3, 2026
Resolved - This incident was caused by a failed Redis node that powers our GraphQL API. We were able to recreate the Redis node and restore service.

We are still investigating the root cause of the failure. In the meantime, all API endpoints now appear to be stable and errors have dropped to baseline levels.

Mar 3, 21:15 UTC
Monitoring - A fix has been implemented and we are monitoring the results.
Mar 3, 20:36 UTC
Investigating - We're investigating elevated GraphQL errors that affect some API endpoints.
Mar 3, 20:18 UTC
Resolved - This incident has been resolved.
Mar 3, 12:10 UTC
Investigating - We are currently investigating this issue.
The page currently displays: "We’re having trouble loading the cost breakdown."

Mar 3, 10:50 UTC
Resolved - Between 19:54 and 20:06 UTC, our Vault cluster serving app certificates was unavailable. This caused various API requests to fail, mainly operations on certificates but also app creates and IP assignments.

Because Vault requests hung rather than failing immediately, TLS requests through fly-proxy for domains whose certificate was not cached on the local node stayed open for a long time while the proxy tried to fetch the certificate. This caused some connections to fail, as too many connection slots were taken up by requests waiting on Vault.

The root cause of this incident was a partially completed update to the Vault cluster. We will be implementing safeguards in the proxy for this failure mode, as well as improving certificate storage longer-term.

Mar 3, 00:54 UTC
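
The failure mode in the Vault incident above, a backend that hangs instead of failing, is usually contained by putting a deadline on the lookup so the caller gives up and frees its connection slot. The sketch below illustrates that general idea only. It is not fly-proxy's code or its planned safeguard; `fetch_with_deadline`, the `fetch` callable, and the timeout value are hypothetical.

```python
import concurrent.futures

# Small worker pool standing in for whatever performs certificate lookups.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def fetch_with_deadline(fetch, domain: str, timeout_s: float):
    """Run fetch(domain), but stop waiting after timeout_s seconds.

    Returns the fetched value, or None if the deadline passed. Exceptions
    raised by fetch itself propagate to the caller.
    """
    future = _pool.submit(fetch, domain)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # cancel() only helps if the task has not started; a call that is
        # already running continues in the background and its result is dropped.
        future.cancel()
        return None
```

Note that the deadline frees the caller, not the worker: a hung lookup still occupies a pool thread until it returns, so a real safeguard would also need a timeout on the underlying network call.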
Mar 2, 2026
Resolved - This incident has been resolved.
Mar 2, 22:49 UTC
Monitoring - A fix has been implemented and we are monitoring the results.
Mar 2, 20:35 UTC
Identified - The issue has been identified and a fix is being implemented.
Mar 2, 18:21 UTC
Investigating - We are currently investigating this issue.
Mar 2, 17:42 UTC
Resolved - This incident has been resolved.
Mar 2, 21:50 UTC
Monitoring - A fix has been implemented and we are monitoring the results.
Mar 2, 21:47 UTC
Identified - The issue has been identified and a fix is being implemented.
Mar 2, 21:39 UTC
Investigating - We're currently investigating issues with the Machines API. Customer deployments and the Fly dashboard may be affected.
Mar 2, 21:19 UTC
Mar 1, 2026

No incidents reported.

Feb 28, 2026

No incidents reported.