Emergency hardware maintenance in LAX region

Incident Report for Fly.io

Resolved

This incident has been resolved.
Posted Feb 19, 2022 - 02:22 UTC

Monitoring

Applications have been restored. Logs and metrics are coming back now.
Posted Feb 19, 2022 - 01:51 UTC

Update

The disk array is coming up now. We will gradually restore apps over the next hour.
Posted Feb 19, 2022 - 01:21 UTC

Update

Hardware is reinstalled and we're now working through configuration changes necessary to bring it online.
Posted Feb 18, 2022 - 23:50 UTC

Identified

We are still working to get hardware reinstalled.
Posted Feb 18, 2022 - 23:43 UTC

Investigating

We are performing emergency hardware maintenance in LAX. We will be taking disk arrays offline to migrate them to a new, more reliable datacenter.

We expect this migration to take 2-4 hours. The impact on applications will vary:

1. If you are running redundant Postgres, your database will remain online with a single node running for the duration of the maintenance.
2. If you are running individual volumes with no redundancy, your app may be unavailable during the migration.
3. If you are not using volumes in LAX, your application will remain online.
Posted Feb 18, 2022 - 21:06 UTC
This incident affected: Regional Availability (LAX - Los Angeles, California (US)).