Automate Packet Acceleration configuration using DPDK on Amazon EKS
Packet acceleration is a key component in achieving performance efficiency for your workloads. Packet acceleration lets you optimize your workloads for maximum throughput, high network bandwidth, minimal latency, and reduced cost. Industries such as Telco, Media & Entertainment, Robotics, and Internet of Things (IoT) have been key users of packet acceleration using Single Root I/O Virtualization (SRIOV) and the Data Plane Development Kit (DPDK).
SRIOV enables hardware-based acceleration in a virtualized environment, providing higher I/O performance, lower CPU utilization, higher packets per second (PPS) performance, and lower latency. DPDK provides a software development kit that bypasses the operating system (OS) kernel and reduces packet processing overhead, resulting in improved performance and lower latency.
In this blog, we explain in detail how to achieve packet acceleration for your Kubernetes workloads on Amazon Elastic Kubernetes Service (Amazon EKS).
Overview
SRIOV is supported on Amazon EC2 instance types that offer enhanced networking through the Elastic Network Adapter (ENA).
Now that you have an SRIOV-enabled AMI and instance, the following sections walk through the automation steps to enable DPDK on your Amazon EKS worker nodes.
Prerequisites
The sample CloudFormation template requires the following input parameters:
- VPC ID
- WorkerNode Primary Interface (eth0) Subnet ID: This is the primary subnet for the worker node (used for the Kubernetes primary interface, i.e., aws-vpc-cni).
- Multus Subnet IDs: The list of Multus subnets used to create the DPDK/Multus interfaces. Interfaces are created on the worker node in the same order as the subnet list.
- DPDK S3 Bucket: This is where the DPDK scripts are stored; they are called during userdata initialization. The scripts can be found in the s3-content/user-data-support-files folder included in the git repo.
- Multus S3 Bucket: This is where the zipped Multus AWS Lambda function code is stored.
- EKS Cluster: The value that is required is the Amazon EKS cluster name.
- EKS Security Group: The security group ID that was created alongside the Amazon EKS cluster.
DPDK setup on worker nodes
Your DPDK-enabled workloads include the DPDK library to achieve packet acceleration. The underlying worker nodes need the following steps configured, which you can automate through the instance userdata:
- Install Linux libraries, ENA-compatible drivers, and patches needed for DPDK and hugepages setup
- Apply Linux sysctl configuration as per the workload requirements
- Configure hugepages and set CPUAffinity for the system based on your NUMA configuration
- Enable the secondary interfaces used by your workloads
- Bind your secondary interfaces to DPDK drivers, such as the vfio-pci driver (see the sketch after this list)
- Prepare the config file for the SRIOV-DP plugin, which maps each interface's PCI address to a resource name
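As an illustration of the binding step, the following is a minimal sketch. The PCI address is a placeholder, and dpdk-devbind.py is the helper script that ships with DPDK; the actual automation in the repo uses its own support scripts.

# Load the vfio-pci driver
sudo modprobe vfio-pci
# On EC2 instances without an IOMMU exposed to the guest, vfio may need noiommu mode
echo 1 | sudo tee /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
# Bind a secondary interface (placeholder PCI address) to vfio-pci
sudo dpdk-devbind.py --bind=vfio-pci 0000:00:06.0
# Verify the binding
dpdk-devbind.py --status-dev net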
Based on your requirements, you can choose either of the two options mentioned in the following section to prepare your worker nodes.
Option 1: Pre-built AMI with DPDK setup
Figure 1: Create DPDK AMI from EKS Optimized AMI
In this option, as shown in the preceding figure, you prepare a custom-built AMI with all the required patches, configuration, and OS-level setup. You can use the Amazon EKS Optimized Amazon Linux AMI or any other ENA driver-enabled Linux AMI as the base AMI, and then create an EC2 instance from it. Once the node is up, you can perform Steps 1-3 mentioned in the DPDK Configuration section either manually or with the userdata of your EC2 instance. After the installation and configuration steps are completed, you can prepare a new AMI from the instance's root disk (a CLI sketch follows).
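For example, creating the AMI from the configured instance can be done from the EC2 console or the AWS CLI; a minimal sketch, assuming placeholder instance ID and AMI name:

# Create an AMI from the configured instance's root volume
aws ec2 create-image \
  --instance-id i-0123456789abcdef0 \
  --name "eks-dpdk-custom-ami" \
  --description "EKS optimized AMI with DPDK libraries, drivers, and patches pre-installed"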
Figure 2: Create EKS Nodegroup with the pre-built DPDK AMI
For your Amazon EKS worker node preparation, you can use this pre-built AMI and perform Steps 4-6 mentioned in the DPDK Configuration section. As these are just configuration steps, userdata execution is fast. This still gives you the flexibility to decide on the number of interfaces to be DPDK bound, and you can build it with the planned resource names used by your workloads through the SRIOV-DP plugin.
The advantage of this approach is fast boot time, as the installation steps (1-3) are already baked into the pre-built custom AMI. Furthermore, the packages are pre-downloaded, so there is no need to manage or copy the patches in an S3 bucket.
The steps to build a sample custom AMI are provided in the git repo.
Option 2: On-demand DPDK installation and configuration
In this option, you take the latest Amazon EKS Optimized Amazon Linux AMI, or any other ENA driver-enabled Linux AMI, and deploy your worker nodes. You automate all the necessary steps as mentioned in the preceding ‘DPDK Configuration’ section with the userdata of your Launch template.
You can store your packages, patches, helper, and configuration scripts in a private S3 bucket. During instantiation, the EKS worker node downloads these files from the S3 bucket and uses them for patch installation, DPDK setup, and system configuration, as sketched below. This option also gives you the flexibility to decide on the number of interfaces to be DPDK bound, and you can build it with the planned resource names used by your workloads through the SRIOV-DP plugin.
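A minimal userdata sketch of the download step is shown below; the bucket name and script names are placeholders, while the real support files live in the s3-content/user-data-support-files folder of the repo:

#!/bin/bash
# Pull the DPDK support files from the private S3 bucket (placeholder bucket name)
aws s3 cp s3://my-dpdk-support-bucket/user-data-support-files/ /opt/dpdk/ --recursive
chmod +x /opt/dpdk/*.sh
# Run the installation and configuration helpers (placeholder script names)
/opt/dpdk/install-dpdk.sh
/opt/dpdk/configure-hugepages.sh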
Figure 3: Create EKS Nodegroup with EKS-Optimized AMI and install DPDK patches with userdata
The advantage of this approach is that you don't need custom-built AMIs, and your system is prepared dynamically with the latest base AMI for each of the different use cases. Furthermore, you don't have to manage multiple custom-built AMIs, share them across multiple accounts, keep up with security patches on the base AMI, or maintain multiple versions of the AMI. Different requirements, such as the hugepage configuration and the DPDK version, are handled dynamically via the automation. The disadvantage of this approach is slightly increased boot time compared to the custom-built AMI, as the patches are installed at bring-up.
Refer to the sample CloudFormation template in the git repo for the automation of this option.
Amazon EKS cluster adaptation for DPDK enabled workloads
aws-auth ConfigMap for NodeGroup
After the CloudFormation stack for the worker node has completed, you must add the instance role of the worker node to the Amazon EKS cluster aws-auth configmap. This is needed for the worker node to join the Amazon EKS cluster. Use the following command syntax to add the instance role (you can get the instance role ARN from the output section of the CloudFormation stack).
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::xxxxxxxx:role/NG-workers-NodeInstanceRole-XXXXX
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
EOF
SRIOV device plugin setup
The SRIOV device plugin daemonset manifest can be found in the git repo. The daemonset mounts the device plugin configuration from the host, as shown in the following volume definition:
- hostPath:
    path: /etc/pcidp/config.json
    type: ""
  name: config-volume
/etc/pcidp/config.json is created during the nodegroup creation via the /opt/dpdk/dpdk-resource-builder.py script in the userdata; a sample of the generated file is sketched below. You can refer to the CloudFormation template in the git repo for details.
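The config file for the SRIOV device plugin generally takes the following shape; the resource name and PCI address here are illustrative, and the actual contents are generated per node by the resource-builder script:

cat /etc/pcidp/config.json
{
  "resourceList": [
    {
      "resourcePrefix": "intel.com",
      "resourceName": "intel_sriov_netdevice_1",
      "selectors": {
        "pciAddresses": ["0000:00:06.0"],
        "drivers": ["vfio-pci"]
      }
    }
  ]
}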
You can control the deployment of the SRIOV device plugin daemonset by using a node selector in its definition. Make sure that you apply that node selector as a label on your DPDK-enabled worker nodes (a labeling and verification sketch follows the snippet below). This is helpful in cases where only some of the nodegroups have DPDK enabled: a non-DPDK nodegroup does not have /etc/pcidp/config.json on its worker nodes, which would cause the SRIOV device plugin daemonset pod to fail on those workers.
In the CloudFormation sample, an example node selector sriov=enabled is used as an additional Kubernetes label on the DPDK-based worker nodegroup. The following snippet reflects the node selector in the SRIOV device plugin daemonset:
nodeSelector:
  beta.kubernetes.io/arch: amd64
  sriov: enabled
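If your nodegroup automation does not already apply the label, you can add it manually and then confirm that the device plugin has advertised the DPDK resources on the node. The node name and resource name below are placeholders:

kubectl label node ip-10-0-1-100.ec2.internal sriov=enabled
# Once the daemonset pod is running, the DPDK resources show up under allocatable,
# for example intel.com/intel_sriov_netdevice_1: "1"
kubectl get node ip-10-0-1-100.ec2.internal -o jsonpath='{.status.allocatable}'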
DPDK workload deployment on Amazon EKS
Deploy sample application to validate the setup
In the following example, a VPP (Vector Packet Processor) CNF is used. VPP is a popular packet processing software stack that supports DPDK for packet processing. The architecture of how the VPP CNF runs on the Amazon EKS cluster is shown in the following figure:
Figure 4: Sample VPP POD consuming DPDK enabled interfaces with SRIOV-DP Plugin
A Helm chart has been created with the necessary parameters to deploy a sample VPP POD (a sketch of the install command follows). The status of the VPP pod deployment after installation is given in the following section.
- The following command shows the environment variables that confirm the SRIOV device plugin has injected the PCI addresses of the respective DPDK interfaces.
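As a minimal sketch of the deployment: the chart path is a placeholder and the release name is illustrative, while the dpdk namespace matches the commands shown next; refer to the git repo for the actual chart and values.

helm install core-vpp-dpdk ./charts/vpp-dpdk --namespace dpdk --create-namespace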
kubectl -n dpdk exec -ti deploy/core-vpp-dpdk printenv | grep PCI
PCIDEVICE_INTEL_COM_INTEL_SRIOV_NETDEVICE_3=0000:00:08.0
PCIDEVICE_INTEL_COM_INTEL_SRIOV_NETDEVICE_1=0000:00:06.0
PCIDEVICE_INTEL_COM_INTEL_SRIOV_NETDEVICE_2=0000:00:07.0
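These variables are injected because the pod requests the SRIOV-DP resources in its container spec. A minimal sketch of such a resource section, assuming the resource names match those generated in config.json:

resources:
  requests:
    intel.com/intel_sriov_netdevice_1: "1"
    hugepages-2Mi: 1Gi
    cpu: "2"
    memory: 1Gi
  limits:
    intel.com/intel_sriov_netdevice_1: "1"
    hugepages-2Mi: 1Gi
    cpu: "2"
    memory: 1Gi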
The procedure to deploy the VPP CNF can be found in the git repo.
Cleanup
To avoid incurring future charges, delete the resources deployed in this blog:
- Delete the private DPDK AMI from the EC2 console.
- Go to the CloudFormation console and delete the EKS nodegroup CloudFormation stack.
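Alternatively, both cleanup steps can be done from the AWS CLI; a minimal sketch with placeholder AMI ID and stack name:

# Deregister the custom DPDK AMI (and delete its snapshots if no longer needed)
aws ec2 deregister-image --image-id ami-0123456789abcdef0
# Delete the worker nodegroup CloudFormation stack
aws cloudformation delete-stack --stack-name NG-workers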
Conclusion
In this blog, we presented two options to automate the DPDK configuration for your Amazon EKS workloads to achieve packet acceleration on Amazon EKS. With the automated approach you get the flexibility to configure the interfaces with DPDK, or use them as Multus IPVLAN interfaces.
Try the sample application from the git repo to validate the setup in your own environment.