Partially Degraded Service

About This Site

This page is for updates about global incidents. It does not include updates about routine hardware failures or isolated infrastructure events that have limited impact. For a personalized view of all events that might affect your apps, please check the personalized status page in your Fly Organization's dashboard. For all internal incidents and other activities, please check the Infra Log.

Customer Applications Operational
Dashboard Operational
Machines API Operational
Regional Availability Operational
AMS - Amsterdam, Netherlands Operational
ARN - Stockholm, Sweden Operational
ATL - Atlanta, Georgia (US) Operational
BOG - Bogotá, Colombia Operational
BOM - Mumbai, India Operational
CDG - Paris, France Operational
DEN - Denver, Colorado (US) Operational
DFW - Dallas, Texas (US) Operational
EWR - Secaucus, NJ (US) Operational
EZE - Ezeiza, Argentina Operational
FRA - Frankfurt, Germany Operational
GDL - Guadalajara, Mexico Operational
GIG - Rio de Janeiro, Brazil Operational
GRU - Sao Paulo, Brazil Operational
HKG - Hong Kong Operational
IAD - Ashburn, Virginia (US) Operational
JNB - Johannesburg, South Africa Operational
LAX - Los Angeles, California (US) Operational
LHR - London, United Kingdom Operational
MAD - Madrid, Spain Operational
MEL - Melbourne, Australia Operational
MIA - Miami, Florida (US) Operational
NRT - Tokyo, Japan Operational
ORD - Chicago, Illinois (US) Operational
OTP - Bucharest, Romania Operational
PHX - Phoenix, Arizona (US) Operational
QRO - Querétaro, Mexico Operational
SCL - Santiago, Chile Operational
SEA - Seattle, Washington (US) Operational
SIN - Singapore Operational
SJC - San Jose, California (US) Operational
SYD - Sydney, Australia Operational
WAW - Warsaw, Poland Operational
YUL - Montréal, Canada Operational
YYZ - Toronto, Canada Operational
Persistent Storage (Volumes) Operational
Deployments Operational
Remote Builds Operational
Logs Operational
Metrics Operational
SSL/TLS Certificate Provisioning Operational
UDP Anycast Operational
Fly Machine Image Registry 1 Operational
Fly Machine Image Registry 2 Operational
Extensions Operational
Upstash for Redis Operational
DNS Degraded Performance
Fly Machine .internal DNS Operational
Fly Machine External DNS Operational
*.fly.dev Nameservers Degraded Performance
*.flyio.net Nameservers Operational
Billing Operational
Usage Metrics API Operational
Stripe API Connection Operational
Corrosion Operational
Past Incidents
Jan 24, 2025

No incidents reported today.

Jan 23, 2025
Resolved - This incident has been resolved.
Jan 23, 22:06 UTC
Monitoring - We have brought additional IAD capacity online. Customers should see machine creation, deploy, and scaling operations succeed as normal in the region.
We're continuing to monitor to ensure full recovery.

Jan 23, 21:42 UTC
Identified - We are continuing to bring additional machine capacity online in the IAD region.
Jan 23, 20:44 UTC
Investigating - Machine capacity in the IAD region is currently low. We're working to bring additional capacity online.

In the meantime, you may see errors when deploying new machines in IAD or when increasing the size of existing machines in the region. Customers may want to deploy machines to nearby regions, such as EWR.

Jan 23, 20:23 UTC
Resolved - This issue has been resolved; deploys using Depot builders are succeeding as expected.
Jan 23, 18:35 UTC
Monitoring - The Depot builder service has partially recovered, and we are seeing deploys using Depot builders succeed again. Some customers may still experience degraded performance when using Depot builders. We're continuing to monitor for full recovery.

Customers can still deploy using Fly.io hosted builders with `fly deploy --depot=false`.

Jan 23, 18:28 UTC
Identified - The Depot service is currently degraded due to a database outage. We're continuing to monitor for recovery. Customers can also follow the Depot status page at https://status.depot.dev/ for updates.

Customers can still deploy using Fly.io hosted builders with `fly deploy --depot=false`.

Jan 23, 18:21 UTC
Investigating - We are investigating increased error rates when deploying apps using the default Depot Builders.

Customers who experience this issue can work around it by using `fly deploy --depot=false` to deploy their image with a Fly.io hosted builder.

Jan 23, 18:05 UTC
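
The workaround described in this incident, falling back to a Fly.io hosted builder when a deploy with the default Depot builder fails, could be scripted roughly as follows. This is a minimal sketch, not an official tool: the function name `deploy_with_fallback` and the injectable `run` parameter are hypothetical, and it assumes that a failed deploy is signaled by a nonzero exit code. Only `fly deploy` and the `--depot=false` flag come from the updates above.

```python
import subprocess
from typing import Callable, Sequence

# Commands taken from the incident updates; everything else is illustrative.
DEFAULT_DEPLOY = ["fly", "deploy"]
HOSTED_BUILDER_DEPLOY = ["fly", "deploy", "--depot=false"]


def run_command(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def deploy_with_fallback(
    run: Callable[[Sequence[str]], int] = run_command,
) -> list[str]:
    """Try the default deploy; on failure, retry with a Fly.io hosted builder.

    Returns the command that succeeded and raises RuntimeError if both fail.
    Assumption: a nonzero exit code means the deploy failed.
    """
    for cmd in (DEFAULT_DEPLOY, HOSTED_BUILDER_DEPLOY):
        if run(cmd) == 0:
            return list(cmd)
    raise RuntimeError("deploy failed with both Depot and Fly.io hosted builders")
```

Calling `deploy_with_fallback()` with no arguments would invoke the real `fly` CLI. Passing a stub for `run` exercises the fallback logic without deploying anything.
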
Jan 22, 2025

No incidents reported.

Jan 21, 2025

No incidents reported.

Jan 20, 2025

No incidents reported.

Jan 19, 2025

No incidents reported.

Jan 18, 2025

No incidents reported.

Jan 17, 2025

No incidents reported.

Jan 16, 2025
Resolved - This incident has been resolved.
Jan 16, 17:48 UTC
Investigating - We are investigating 503 errors returned when making requests to our GraphQL API or running flyctl commands.
Jan 16, 17:30 UTC
Jan 15, 2025
Resolved - This incident has been resolved.
Jan 15, 19:22 UTC
Monitoring - A fix has been implemented and bluegreen deploys are succeeding as expected. We're continuing to monitor deploys to ensure stability, but customers should see bluegreen deploys succeed in all regions.
Jan 15, 18:44 UTC
Identified - The issue has been identified and a fix is being implemented.
Jan 15, 18:19 UTC
Update - We are seeing signs of recovery, with bluegreen deployments succeeding for many customers. We are continuing to investigate the root cause of the issue.

Customers who still experience a bluegreen deployment failure can retry using the rolling strategy with `fly deploy --strategy rolling`.

Jan 15, 18:11 UTC
Update - A temporary workaround for new deployments is to use the rolling strategy: `fly deploy --strategy rolling`.
Jan 15, 14:41 UTC
Update - We are still investigating the issue.
Jan 15, 14:35 UTC
Investigating - When deploying with the bluegreen strategy, some green machines (the new app version) fail their health checks.
Temporary workaround: unless your app requires bluegreen, you can deploy with a different strategy using `fly deploy --strategy NAME`.

Jan 15, 13:28 UTC
Resolved - We observed several periods where Machine creations in LHR resulted in authentication errors from 11 Jan to 15 Jan 2025. Customers creating new Machines in the region may have seen failures with:

failed to launch VM: permission_denied: bolt token: failed to verify service token: no verified tokens; token : verify: context deadline exceeded

The disruptions were caused by degraded connectivity to our token creation service from three hosts.

We deployed a preventative fix for the network issues on 15 Jan 2025 at 12:58 UTC.

Timestamps of occurrences (UTC):

2025-01-11 03:32 to 2025-01-11 04:11
2025-01-11 17:07 to 2025-01-11 17:54
2025-01-14 11:36 to 2025-01-14 12:14
2025-01-15 07:46 to 2025-01-15 09:49

Jan 15, 18:43 UTC
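
The occurrence windows listed in the LHR report above can be totaled with a short script. This is a minimal sketch for illustration only: the names `WINDOWS` and `window_durations` are hypothetical, and the result covers only the windows as reported, not the number of customers or Machines affected.

```python
from datetime import datetime, timedelta

# Occurrence windows (UTC) copied from the incident report above.
WINDOWS = [
    ("2025-01-11 03:32", "2025-01-11 04:11"),
    ("2025-01-11 17:07", "2025-01-11 17:54"),
    ("2025-01-14 11:36", "2025-01-14 12:14"),
    ("2025-01-15 07:46", "2025-01-15 09:49"),
]
FMT = "%Y-%m-%d %H:%M"


def window_durations(windows):
    """Return the length of each (start, end) window as a timedelta."""
    return [
        datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
        for start, end in windows
    ]


durations = window_durations(WINDOWS)
total = sum(durations, timedelta())

for (start, end), length in zip(WINDOWS, durations):
    print(f"{start} to {end}: {int(length.total_seconds() // 60)} min")
print(f"total: {int(total.total_seconds() // 60)} min")
```

The four windows last 39, 47, 38, and 123 minutes, for a total of 247 minutes (4 hours 7 minutes) across the five-day period.
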
Jan 14, 2025

No incidents reported.

Jan 13, 2025

No incidents reported.

Jan 12, 2025

No incidents reported.

Jan 11, 2025
Resolved - This incident has been resolved.
Jan 11, 05:24 UTC
Monitoring - A fix has been implemented and we are monitoring the results.
Jan 11, 04:14 UTC
Investigating - We are currently investigating inbound network connectivity issues in the SJC region. Users routed to SJC may be unable to access apps or may see increased latency.
Jan 11, 03:20 UTC
Jan 10, 2025

No incidents reported.