Quick Restoration through Replacing the Root Volumes of Amazon EC2 instances
This blog post is written by Katja-Maja Krödel, IoT Specialist Solutions Architect, and Benjamin Meyer, Senior Solutions Architect, Game Tech.
Customers use
The feature of
In this post, we show you how to design your architecture for automated Root Volume Replacement using this Amazon EC2 feature. We start with the automated snapshot creation, continue with automatically replacing the root volume, and finish with how to keep your environment clean after your replacement job succeeds.
What is Root Volume Replacement?
Amazon EC2 enables customers to replace the root
- A new EBS volume is created from a previously taken snapshot or the launch state
- Reboot of the instance
- While rebooting, the current root volume is detached and the new root volume is attached
The previous EBS root volume isn’t deleted and can be attached to an instance for later investigation of the volume. If you restore to a state other than the launch state, then a snapshot of the current root volume is used.
An example use case is a continuous integration/continuous deployment (CI/CD) system that uses EC2 instances to build artifacts. Within this system, a build could alter the tools installed on the host, which may cause later builds on the same machine to fail. To prevent unclean builds, the architecture introduced here cleans up the machine by replacing the root volume with a previously known good state. This is especially interesting for EC2 Mac Instances, as their Dedicated Host won’t undergo
Overview
The feature of replacing Root Volumes was introduced in April 2021 and was extended on Feb. 3, 2023 to work for Bare Metal EC2 Mac Instances. This means that EC2 Mac Instances are included. If you want to reset an EC2 instance to a previously known good state, then you can
In the case that you use a snapshot to create a new root volume, you must take a new snapshot of that volume to be able to get back to that state later on. You can’t use a snapshot of a different volume to restore to, which is the reason that the architecture includes the automatic snapshot creation of a fresh root volume.
The architecture is built in three steps:
- Automation of Snapshot Creation for new EBS volumes
- Automation of replacing your Root Volume
- Preparation of the environment for the next Root Volume Replacement
The following diagram illustrates the architecture of this solution.
In the next sections, we go through these concepts to design the automatic Root Volume Replacement Task.
Automation of Snapshot Creation for new EBS volumes
The figure above illustrates the architecture for automatically creating a snapshot of an existing EBS volume. In this architecture, we focus on the automation of creating a snapshot whenever a new EBS root volume is created.
createVolume event. To react to the event automatically, you can add a rule to EventBridge that forwards the event to an
{
"source": ["aws.ec2"],
"detail-type": ["EBS Volume Notification"],
"detail": {
"event": ["createVolume"]
}
}
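The rule above could also be registered programmatically. The following is a minimal sketch, assuming the boto3 EventBridge client (`boto3.client("events")`) is passed in as `events_client`; the function name `register_snapshot_rule`, the rule name, the target ID `snapshot-lambda`, and the Lambda ARN are hypothetical and not part of the original post.

```python
import json

# Event pattern from the post: match createVolume notifications from Amazon EC2.
CREATE_VOLUME_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EBS Volume Notification"],
    "detail": {"event": ["createVolume"]},
}


def register_snapshot_rule(events_client, rule_name, lambda_arn):
    """Create or update the rule and point it at the Lambda function.

    events_client is assumed to be boto3.client("events"); rule_name and
    lambda_arn are placeholders supplied by the caller.
    """
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(CREATE_VOLUME_PATTERN),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        # "snapshot-lambda" is an arbitrary target ID chosen for this sketch.
        Targets=[{"Id": "snapshot-lambda", "Arn": lambda_arn}],
    )
```

The Lambda function additionally needs a resource-based permission that allows EventBridge to invoke it, which this sketch does not set up.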
When an EBS root volume is created, an event like the following is emitted, which then invokes the Lambda function:
{
"version": "0",
"id": "01234567-0123-0123-0123-012345678901",
"detail-type": "EBS Volume Notification",
"source": "aws.ec2",
"account": "012345678901",
"time": "yyyy-mm-ddThh:mm:ssZ",
"region": "us-east-1",
"resources": [
"arn:aws:ec2:us-east-1:012345678901:volume/vol-01234567"
],
"detail": {
"result": "available",
"cause": "",
"event": "createVolume",
"request-id": "01234567-0123-0123-0123-0123456789ab"
}
}
The code of the function uses the resource ARN within the received event and requests resource
The following is a summary of the tasks of the Lambda function:
- Extract the EBS ARN from the EventBridge Event
- Verify that it’s a root volume of an EC2 Instance
- Call the Amazon EC2 API create-snapshot to create a snapshot of the root volume and add a tag replace-snapshot=true
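The three tasks above can be sketched as a Lambda handler. This is an illustration under stated assumptions, not the authors' function: the helper names are hypothetical, the root-volume check compares the attachment's device name with the instance's `RootDeviceName`, and the optional `ec2` argument exists only so that the logic can be exercised without AWS access. A volume may not be attached yet when the event arrives; the sketch does not retry in that case.

```python
def volume_id_from_event(event):
    """Return the volume ID from the first resource ARN in the event."""
    # ARN shape: arn:aws:ec2:<region>:<account>:volume/<volume-id>
    return event["resources"][0].split("/")[-1]


def is_root_volume(ec2, volume_id):
    """True if the volume is attached as the root device of an instance."""
    volume = ec2.describe_volumes(VolumeIds=[volume_id])["Volumes"][0]
    for attachment in volume.get("Attachments", []):
        reservations = ec2.describe_instances(
            InstanceIds=[attachment["InstanceId"]]
        )["Reservations"]
        instance = reservations[0]["Instances"][0]
        if attachment["Device"] == instance.get("RootDeviceName"):
            return True
    return False


def handler(event, context, ec2=None):
    """Snapshot a newly created root volume and tag the snapshot."""
    if ec2 is None:
        import boto3  # imported lazily so the logic can be tested offline

        ec2 = boto3.client("ec2")
    volume_id = volume_id_from_event(event)
    if not is_root_volume(ec2, volume_id):
        return None  # not a root volume, nothing to do
    snapshot = ec2.create_snapshot(
        VolumeId=volume_id,
        TagSpecifications=[
            {
                "ResourceType": "snapshot",
                "Tags": [{"Key": "replace-snapshot", "Value": "true"}],
            }
        ],
    )
    return snapshot["SnapshotId"]
```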
The tag is used later to clean up the environment and remove snapshots that are no longer needed.
As an alternative, you can emit your own event to EventBridge. This can be used to automatically create snapshots to which you can restore your volume. Instead of reacting to the createVolume event, you can use a customized approach for this architecture.
Automation of replacing your Root Volume
The figure above illustrates the procedure of replacing the EBS root volume. It starts with the event, which is created through the
To invoke the create-replace-root-volume-task, you can call the Amazon EC2 API with the following AWS CLI command:
aws ec2 create-replace-root-volume-task --instance-id <value> --snapshot-id <value> --tag-specifications 'ResourceType=string,Tags=[{Key=replaced-volume,Value=true}]'
If you want to restore to launch state, then omit the --snapshot-id parameter:
aws ec2 create-replace-root-volume-task --instance-id <value> --tag-specifications 'ResourceType=string,Tags=[{Key=replaced-volume,Value=true}]'
After running this command, Amazon Web Services will create a new EBS volume, add the tag replaced-volume=true to the old EBS volume, restart your instance, and attach the new volume to the instance as the root volume. The tag is used later to detect old root volumes and clean up the environment.
If this is combined with the earlier explained automation, then the automation will immediately take a snapshot from the new EBS volume. A restore operation can only be done to a snapshot of the current EBS root volume. Therefore, if no snapshot is taken from the freshly restored EBS volume, then no restore operation is possible except the restore to launch state.
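The same call can be made from Python. The following is a minimal sketch, assuming a boto3 EC2 client is passed in as `ec2`; the function name `replace_root_volume` is hypothetical, and tagging is left out to keep the example short. It shows the one difference between the two variants: the snapshot ID is only sent when a snapshot is given, otherwise the instance is restored to its launch state.

```python
def replace_root_volume(ec2, instance_id, snapshot_id=None):
    """Start a root volume replacement task and return its task ID.

    ec2 is assumed to be boto3.client("ec2"). Without snapshot_id, the
    root volume is restored to the launch state.
    """
    params = {"InstanceId": instance_id}
    if snapshot_id is not None:
        # Must be a snapshot of the instance's current root volume.
        params["SnapshotId"] = snapshot_id
    response = ec2.create_replace_root_volume_task(**params)
    return response["ReplaceRootVolumeTask"]["ReplaceRootVolumeTaskId"]
```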
Preparation of the Environment for the next Root Volume Replacement
After the task is completed, the old root volume isn’t removed. Additionally, snapshots of previous root volumes can’t be used to restore current root volumes. To clean up your environment, you can schedule a Lambda function which does the following steps:
- Delete detached EBS volumes with the tag replaced-volume=true
- Delete snapshots with the tag replace-snapshot=true that aren’t associated with an existing EBS volume
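The two cleanup steps can be sketched as follows. This is an illustration rather than the authors' function: it assumes a boto3 EC2 client is passed in as `ec2`, the function name `cleanup` is hypothetical, and the tag keys are supplied by the caller so that they match the tags set earlier. Pagination of the describe calls is not handled.

```python
def cleanup(ec2, volume_tag_key, snapshot_tag_key):
    """Delete old root volumes and orphaned snapshots.

    Returns a tuple (deleted_volume_ids, deleted_snapshot_ids).
    """
    deleted_volumes, deleted_snapshots = [], []

    # Step 1: detached volumes (status "available") that carry the volume tag.
    old_volumes = ec2.describe_volumes(
        Filters=[
            {"Name": f"tag:{volume_tag_key}", "Values": ["true"]},
            {"Name": "status", "Values": ["available"]},
        ]
    )["Volumes"]
    for volume in old_volumes:
        ec2.delete_volume(VolumeId=volume["VolumeId"])
        deleted_volumes.append(volume["VolumeId"])

    # Step 2: tagged snapshots whose source volume no longer exists.
    existing = {v["VolumeId"] for v in ec2.describe_volumes()["Volumes"]}
    existing -= set(deleted_volumes)
    snapshots = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": f"tag:{snapshot_tag_key}", "Values": ["true"]}],
    )["Snapshots"]
    for snapshot in snapshots:
        if snapshot["VolumeId"] not in existing:
            ec2.delete_snapshot(SnapshotId=snapshot["SnapshotId"])
            deleted_snapshots.append(snapshot["SnapshotId"])

    return deleted_volumes, deleted_snapshots
```

Running a function like this on a schedule keeps only the snapshot of the current root volume, which is the only one a later replacement can restore to.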
Conclusion
In this post, we described an architecture to quickly restore EC2 instances through Root Volume Replacement. The feature of