Site Failure Recovery
MinIO can make the loss of an entire site, while significant, a relatively minor incident. Site recovery depends on the replication option you use for the site.
Site Replication | Total restoration of IAM configurations, bucket configurations, and data from the healthy peer site(s)
Bucket Replication | Data restoration of objects and metadata from a healthy remote location for each bucket configured for replication
Mirroring | Data restoration of objects only from a healthy remote location with no versioning
Site Replication
Site replication keeps two or more MinIO deployments in sync with IAM policies, buckets, bucket configurations, objects, and object metadata. If a peer site fails, such as due to a major disaster or long power outage, you can use the remaining healthy site(s) to restore the replicable data.
The following procedure can restore data in scenarios where site replication was active prior to the site loss.
1. Remove the failed site from the MinIO site replication configuration using the mc admin replicate remove command. For example, the following command removes a failed site with alias siteB from a site replication configuration that includes a healthy site with alias siteA:
   mc admin replicate remove siteA siteB --force
2. Deploy a new MinIO site using the same root credentials as used on the other deployments in the site replication configuration.
   You can use the original hardware from the failed site, if still available and functional, but you must first wipe any remaining data before creating the new site. Ensure you have fully remediated any issues that resulted in the original failure state prior to reusing the hardware.
3. Configure the new site with the same Identity Provider (IdP) as the other site(s).
4. Expand the existing site replication by adding the newly deployed site.
Site replication healing automatically adds IAM settings, buckets, bucket configurations, and objects from the existing site(s) to the new site with no further action required.
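The procedure above can be sketched as a short shell script. This is a minimal sketch under stated assumptions, not a definitive runbook: the aliases siteA (healthy), siteB (failed), and siteC (replacement) are hypothetical and are assumed to be configured already with mc alias set, the new site is assumed to be deployed with matching root credentials and IdP, and the run helper is illustrative. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: print each command unless DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# recover_site HEALTHY FAILED NEW
# HEALTHY, FAILED, and NEW are hypothetical mc aliases.
recover_site() {
  healthy=$1
  failed=$2
  new=$3
  # Drop the failed peer from the site replication configuration.
  run mc admin replicate remove "$healthy" "$failed" --force
  # Expand the configuration with the newly deployed site.
  # If more than one healthy peer remains, list those aliases here as well.
  run mc admin replicate add "$healthy" "$new"
  # Inspect replication state while healing proceeds.
  run mc admin replicate status "$healthy"
}

recover_site siteA siteB siteC
```

Review the printed commands first, then rerun with DRY_RUN=0 to execute them against the real deployments.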
You cannot configure site replication if any bucket replication rules remain in place on other healthy sites. Bucket replication is mutually exclusive with site replication.
If you are switching from using bucket replication to using site replication, you must first remove all bucket replication rules from the healthy site prior to setting up site replication.
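As an illustration of clearing bucket replication rules before enabling site replication, the sketch below lists and then removes all rules for each named bucket. The alias siteA, the bucket names data and logs, and the run helper are hypothetical. The script prints commands only, unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: print each command unless DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# clear_bucket_rules ALIAS BUCKET [BUCKET...]
clear_bucket_rules() {
  site=$1
  shift
  for bucket in "$@"; do
    # Show the existing rules so they can be recorded before removal.
    run mc replicate ls "$site/$bucket"
    # Remove every replication rule on the bucket.
    run mc replicate rm --all --force "$site/$bucket"
  done
}

clear_bucket_rules siteA data logs
```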
Active Bucket Replication Resynchronization
For scenarios where bucket replication was in place prior to the failure, you can use mc replicate resync to restore data to a new site.
Create a new site to replace the failed deployment, then synchronize the data from an existing, healthy, bucket replication-enabled deployment to the new site.
1. Deploy a new MinIO site.
2. Set up IAM and users as needed.
3. On the site with data, create a new remote target using the mc admin bucket remote add command and record the ARN from the output.
4. From the site with the data, use the mc replicate resync start command with the ARN from the previous command to rebuild the bucket on the new site.
5. Wait for re-synchronization to complete (use mc replicate resync status to check).
6. Set up bucket replication rule(s) from the new MinIO site to the existing target bucket(s).
7. (Optional) Delete the bucket replication rules from the target deployment(s) to restore an active-passive replication scenario.
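The remote target and resynchronization commands from this procedure might look like the following for a single bucket. Treat it as a sketch under assumptions: the alias siteA, the bucket data, the target URL with its USER:PASSWORD placeholder, the example ARN, and the run helper are all hypothetical, and the real ARN must be copied by hand from the output of mc admin bucket remote add. The script prints commands only, unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: print each command unless DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# add_remote_target SOURCE_BUCKET TARGET_URL
# Registers the bucket on the new site as a replication target.
# Record the ARN that the command prints.
add_remote_target() {
  run mc admin bucket remote add "$1" "$2" --service replication
}

# resync_bucket SOURCE_BUCKET ARN
# Starts the resync toward the recorded ARN, then reports its progress.
resync_bucket() {
  source_bucket=$1
  arn=$2
  run mc replicate resync start --remote-bucket "$arn" "$source_bucket"
  run mc replicate resync status --remote-bucket "$arn" "$source_bucket"
}

# Hypothetical values; substitute the real URL and the ARN from the first command.
add_remote_target siteA/data 'https://USER:PASSWORD@siteC.example.net/data'
resync_bucket siteA/data 'arn:minio:replication::EXAMPLE-ID:data'
```

Repeat the status command until the resync reports completion before setting up replication rules from the new site.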
Passive Bucket Replication Resynchronization
Bucket replication can directly restore the site contents by performing a replication from the target bucket(s) to a new MinIO site.
As a passive process, bucket replication may not perform as quickly as desired for a site recovery scenario.
Using bucket replication relies on the standard replication scanner queue, which does not take priority over other processes.
For recovery procedures with stricter SLA/SLO, use the active bucket replication process with the mc replicate resync command as described above.
Bucket replication rules copy the object, its version ID, versions, and other metadata to the target bucket. MinIO can restore the object with all of these attributes to a new MinIO site if bucket replication had already been in use prior to the site loss.
1. Deploy a new MinIO site.
2. Set up IAM and users as needed.
3. On the remaining target bucket deployment(s), create bucket replication rule(s) for each bucket to the new MinIO site.
4. Wait for replication to complete.
5. Set up bucket replication rule(s) from the new MinIO site to the existing target bucket(s).
6. (Optional) Delete the bucket replication rules from the target deployment(s) to restore an active-passive replication scenario.
Do not delete the bucket replication rules from the deployments used to recover data if you prefer to keep an active-active replication between the buckets. In active-active replication, changes to the objects at either location affect the objects at the other location.
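Creating a replication rule from a surviving deployment to the new site could be sketched as follows. The assumptions are significant: the alias siteA, the bucket data, the target URL with its USER:PASSWORD placeholder, and the run helper are hypothetical. The target bucket is assumed to exist already on the new site with versioning enabled, and the mc release in use is assumed to accept a URL for --remote-bucket. The script prints commands only, unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: print each command unless DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# add_recovery_rule SOURCE_BUCKET TARGET_URL
add_recovery_rule() {
  source_bucket=$1
  target_url=$2
  # existing-objects asks the rule to cover objects written before the rule existed.
  run mc replicate add "$source_bucket" \
    --remote-bucket "$target_url" \
    --replicate "delete,delete-marker,existing-objects"
  # Check replication progress for the bucket.
  run mc replicate status "$source_bucket"
}

# Hypothetical values for illustration.
add_recovery_rule siteA/data 'https://USER:PASSWORD@siteC.example.net/data'
```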
Mirroring
MinIO mirroring copies objects from any S3-compatible storage system.
Mirroring only copies the latest version of each object and does not include versioning metadata, regardless of the source. You cannot restore those attributes with this method.
Use mc mirror in situations where you need to restore only the latest version of an object.
Use bucket replication or site replication where those methods were already in use if you are copying from another MinIO deployment and wish to restore the object’s version history and version metadata.
1. Deploy a new MinIO site.
2. Set up IAM and users as needed.
3. Create buckets on the new site.
4. Use the mc cp CLI command to copy the contents from the mirror location to the new MinIO site.
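The bucket creation and copy steps might be scripted as below for a single bucket. This is a sketch, not a definitive procedure: the aliases backup (the mirror location) and siteC (the new site), the bucket name data, and the run helper are hypothetical. The script prints commands only, unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: print each command unless DRY_RUN=0 is set in the environment.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# restore_from_mirror MIRROR_ALIAS NEW_ALIAS BUCKET
restore_from_mirror() {
  mirror=$1
  new=$2
  bucket=$3
  # Create the bucket on the new site.
  run mc mb "$new/$bucket"
  # Copy the latest version of every object; version history is not restored.
  run mc cp --recursive "$mirror/$bucket/" "$new/$bucket/"
}

restore_from_mirror backup siteC data
```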