Karpenter now supports Windows containers

by Marcio Morales and Bill Pfeiffer

Introduction

In November 2021, Amazon Web Services introduced Karpenter, an open-source, high-performance Kubernetes cluster autoscaler licensed under the Apache License 2.0. Karpenter helps improve your application availability and cluster efficiency by rapidly launching right-sized compute resources in response to changing application load. Since its release, we’ve seen an increasing number of customers migrate from Kubernetes Cluster Autoscaler to Karpenter. However, for customers running a heterogeneous Amazon Elastic Kubernetes Service (Amazon EKS) cluster with Windows workloads, the lack of Windows node support in Karpenter was a showstopper, until now.

The OSS community did a great job starting development on Windows workloads in Karpenter. The Amazon Web Services team took it a step further to review the proposed design, add enhancements to improve the customer experience, and integrate it with our internal continuous integration (CI) process.

When Karpenter is installed in your cluster, it observes the aggregate resource requests of unscheduled pods, launches new nodes when additional capacity is needed, and deprovisions nodes when that capacity is no longer needed. By doing this, Karpenter reduces the scheduling latency and infrastructure costs of your cluster.

Figure 1: Karpenter high-level scheduling

In this post, we focus on scaling Windows Server 2019 and Windows Server 2022 nodes out and in using Karpenter for Amazon EKS. To learn more about Karpenter architecture and components, see the Karpenter website.

Prerequisites

  • Ensure you are running eksctl commands with an Amazon Web Services Identity and Access Management (Amazon Web Services IAM) profile that has permissions to create and manage Amazon EKS clusters. This Amazon Web Services IAM security principal is used in the Getting Started section below for the Amazon Web Services Command Line Interface (Amazon Web Services CLI) configuration.
  • Ensure you are using eksctl v0.124.0 or higher to operate Karpenter.
  • Follow the Getting Started section in the Amazon EKS documentation to install aws cli, kubectl, and eksctl on your development machine.
  • Alternatively, you could leverage Cloud9 or CloudShell to handle deployment and maintenance tasks.
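
To check the eksctl version requirement above before you start, a small comparison helper can be handy. The following is a minimal sketch: version_ge is a hypothetical helper name of our own (not part of eksctl), and it assumes your sort command supports the -V (version sort) option, as GNU coreutils does.

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to version $2.
# Assumes a sort implementation that supports -V (version sort), such as GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

For example, version_ge "$(eksctl version)" "0.124.0" || echo "Please upgrade eksctl" prints a reminder when the installed version is too old. This assumes eksctl version prints a bare version string.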

Solution overview

  1. Create OS variables to be used throughout the post.
  2. Deploy Karpenter service requirements.
  3. Create an Amazon EKS cluster with the necessary iamIdentityMappings for Karpenter.
  4. Enable Amazon EKS Windows support.
  5. Install Karpenter with Helm.
  6. Create Karpenter provisioner and NodeTemplate.
  7. Test Karpenter for Windows – scale out.
  8. Test Karpenter for Windows – scale in.
  9. Clean up test resources.

Walkthrough

1. Create OS variables to be used throughout the post

export KARPENTER_VERSION=v0-c990a2d9fb10c1bfeffd5c6af64bf8575536d67e
export AWS_PARTITION="aws"
export CLUSTER_NAME="windows-karpenter-demo"
export AWS_DEFAULT_REGION="us-west-2"
export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
export TEMPOUT=$(mktemp)
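
Every later step depends on these variables, and AWS_ACCOUNT_ID ends up empty if the aws sts call fails (for example, because credentials are not configured). The sketch below flags unset or empty variables early. require_vars is a hypothetical helper name we introduce for illustration only.

```shell
# Hypothetical helper: report any of the named variables that are unset or empty.
# Returns 0 when every variable has a value, 1 otherwise.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

You could call it as require_vars KARPENTER_VERSION AWS_PARTITION CLUSTER_NAME AWS_DEFAULT_REGION AWS_ACCOUNT_ID TEMPOUT before moving on to the next step.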

2. Create Karpenter service requirements

Karpenter directly integrates with the Amazon Elastic Compute Cloud (Amazon EC2) API endpoint to take specific actions based on events such as Spot interruptions or instance state changes. The following command uses Amazon Web Services CloudFormation to deploy the necessary components, such as the Amazon EventBridge rules that forward these events to an Amazon SQS queue, along with the IAM node role, instance profile, and controller policy that are referenced in later steps.

curl -fsSL https://karpenter.sh/v0.29/getting-started/getting-started-with-karpenter/cloudformation.yaml > "${TEMPOUT}" \
&& aws cloudformation deploy \
  --stack-name "Karpenter-${CLUSTER_NAME}" \
  --template-file "${TEMPOUT}" \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=${CLUSTER_NAME}"

Upon successful execution of the Amazon Web Services CloudFormation template, you’ll see the following output:

Waiting for changeset to be created..
Waiting for stack create/update to complete
Successfully created/updated stack - Karpenter-windows-karpenter-demo
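
If you want to confirm the stack state later (for example, from a different terminal), you can query it with the Amazon Web Services CLI. The following is a minimal sketch: stack_is_ready is a hypothetical helper name of our own, and it treats only CREATE_COMPLETE and UPDATE_COMPLETE as success.

```shell
# Hypothetical helper: succeeds only when the named stack reports a completed
# create or update. Prints the status it found for reference.
stack_is_ready() {
  status="$(aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query 'Stacks[0].StackStatus' \
    --output text)"
  echo "stack status: ${status}"
  case "$status" in
    CREATE_COMPLETE|UPDATE_COMPLETE) return 0 ;;
    *) return 1 ;;
  esac
}
```

For this walkthrough, the call would be stack_is_ready "Karpenter-${CLUSTER_NAME}".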

3. Create an Amazon EKS cluster with the necessary iamIdentityMappings for Karpenter

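Next, we deploy a temporary Amazon EKS cluster using eksctl in order to test Karpenter integration with Windows. The configuration below creates the Amazon Web Services IAM role that the Karpenter service account assumes, and adds an identity mapping for the Karpenter node role to the aws-auth ConfigMap so that nodes launched by Karpenter can join the cluster.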

eksctl create cluster -f - <<EOF
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: ${CLUSTER_NAME}
  region: ${AWS_DEFAULT_REGION}
  version: "1.27"
  tags:
    karpenter.sh/discovery: ${CLUSTER_NAME}

iam:
  withOIDC: true
  serviceAccounts:
  - metadata:
      name: karpenter
      namespace: karpenter
    roleName: ${CLUSTER_NAME}-karpenter
    attachPolicyARNs:
    - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}
    roleOnly: true

iamIdentityMappings:
- arn: "arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}"
  username: system:node:{{EC2PrivateDNSName}}
  groups:
  - system:bootstrappers
  - system:nodes

managedNodeGroups:
- instanceType: m5.large
  amiFamily: AmazonLinux2
  name: ${CLUSTER_NAME}-linux-ng
  desiredCapacity: 2
  minSize: 1
  maxSize: 10
EOF

export CLUSTER_ENDPOINT="$(aws eks describe-cluster --name ${CLUSTER_NAME} --query "cluster.endpoint" --output text)"
export KARPENTER_IAM_ROLE_ARN="arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-karpenter"
echo $CLUSTER_ENDPOINT $KARPENTER_IAM_ROLE_ARN

eksctl uses Amazon Web Services CloudFormation to create all the resources needed to build an Amazon EKS cluster. Upon successful creation of your cluster, you’ll see output similar to the following (this sample was captured in us-east-1, so the Region and node names in your output will differ). If the cluster creation fails, then the failure reason is provided in the command output (or in the Amazon Web Services CloudFormation console).

2023-06-14 06:20:19 [✔]  all EKS cluster resources for "windows-karpenter-demo" have been created
2023-06-14 06:20:19 [ℹ]  nodegroup "windows-karpenter-demo-linux-ng" has 2 node(s)
2023-06-14 06:20:19 [ℹ]  node "ip-192-168-12-160.ec2.internal" is ready
2023-06-14 06:20:19 [ℹ]  node "ip-192-168-53-156.ec2.internal" is ready
2023-06-14 06:20:19 [ℹ]  waiting for at least 1 node(s) to become ready in "windows-karpenter-demo-linux-ng"
2023-06-14 06:20:19 [ℹ]  nodegroup "windows-karpenter-demo-linux-ng" has 2 node(s)
2023-06-14 06:20:19 [ℹ]  node "ip-192-168-12-160.ec2.internal" is ready
2023-06-14 06:20:19 [ℹ]  node "ip-192-168-53-156.ec2.internal" is ready
2023-06-14 06:20:20 [ℹ]  kubectl command should work with "/Users/bpfeiff/.kube/config", try 'kubectl get nodes'
2023-06-14 06:20:20 [✔]  EKS cluster "windows-karpenter-demo" in "us-east-1" region is ready
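
Nodes launched by Karpenter can only join the cluster if the node role is present in the identity mappings, so it can be worth confirming the mapping before continuing. The following is a minimal sketch that relies on the eksctl get iamidentitymapping command and the variables from step 1. node_role_is_mapped is a hypothetical helper name of our own.

```shell
# Hypothetical helper: succeeds when the Karpenter node role appears in the
# cluster's IAM identity mappings. Relies on the variables exported in step 1.
node_role_is_mapped() {
  role_arn="arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}"
  eksctl get iamidentitymapping \
    --cluster "${CLUSTER_NAME}" \
    --region "${AWS_DEFAULT_REGION}" \
    | grep -F -q "${role_arn}"
}
```

Running node_role_is_mapped && echo "node role mapped" gives a quick yes/no answer.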

4. Enable Amazon EKS Windows support

To deploy Windows nodes to our cluster, we need to enable Amazon EKS Windows support.

kubectl apply -f - <<EOF
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: amazon-vpc-cni
  namespace: kube-system
data:
  enable-windows-ipam: "true"
EOF
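
To confirm that the setting was applied, you can read the value back from the ConfigMap. This is a minimal sketch, and windows_ipam_enabled is a hypothetical helper name of our own.

```shell
# Hypothetical helper: succeeds when the amazon-vpc-cni ConfigMap has
# enable-windows-ipam set to "true".
windows_ipam_enabled() {
  value="$(kubectl get configmap amazon-vpc-cni -n kube-system \
    -o jsonpath='{.data.enable-windows-ipam}')"
  [ "$value" = "true" ]
}
```

For example, windows_ipam_enabled || echo "Windows IPAM is not enabled" reports a missing or incorrect setting.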

5. Install Karpenter with Helm

Next, we will use Helm to install Karpenter.

# Logout of helm registry to perform an unauthenticated pull against the public ECR
helm registry logout public.ecr.aws

helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version ${KARPENTER_VERSION} --namespace karpenter --create-namespace \
  --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=${KARPENTER_IAM_ROLE_ARN} \
  --set settings.aws.clusterName=${CLUSTER_NAME} \
  --set settings.aws.defaultInstanceProfile=KarpenterNodeInstanceProfile-${CLUSTER_NAME} \
  --set settings.aws.interruptionQueueName=${CLUSTER_NAME} \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait

Upon successful installation, you’ll see the following output.

Release "karpenter" does not exist. Installing it now.
Pulled: public.ecr.aws/karpenter/karpenter:v0-c990a2d9fb10c1bfeffd5c6af64bf8575536d67e
Digest: sha256:33e2597488e3359653515bb7bd43a4ed6c1e811cb95c261175f8808a9ea4fc97
NAME: karpenter
LAST DEPLOYED: Wed Jun 14 08:16:36 2023
NAMESPACE: karpenter
STATUS: deployed
REVISION: 1
TEST SUITE: None

6. Create Karpenter provisioners and node templates

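Now we create two Karpenter provisioners, each paired with an AWSNodeTemplate, to support Windows Server 2019 and Windows Server 2022 in the same Amazon EKS cluster. The Karpenter provisioner sets constraints on the nodes that can be created by Karpenter and the pods that can run on those nodes. The amiFamily field of each AWSNodeTemplate selects the Windows Server version for the nodes.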

cat <<EOF | kubectl apply -f -
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: windows2019
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
    - key: kubernetes.io/os
      operator: In
      values: ["windows"]      
  limits:
    resources:
      cpu: 1000
  providerRef:
    name: windows2019
  ttlSecondsAfterEmpty: 30
---
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: windows2019
spec:
  subnetSelector:
    karpenter.sh/discovery: ${CLUSTER_NAME}
  securityGroupSelector:
    karpenter.sh/discovery: ${CLUSTER_NAME}
  amiFamily: Windows2019
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 2
    httpTokens: required
---
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: windows2022
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
    - key: kubernetes.io/os
      operator: In
      values: ["windows"]      
  limits:
    resources:
      cpu: 1000
  providerRef:
    name: windows2022
  ttlSecondsAfterEmpty: 30
---
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: windows2022
spec:
  subnetSelector:
    karpenter.sh/discovery: ${CLUSTER_NAME}
  securityGroupSelector:
    karpenter.sh/discovery: ${CLUSTER_NAME}
  amiFamily: Windows2022
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 2
    httpTokens: required    
EOF
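
Before scaling any workloads, you may want to confirm that all four objects exist in the cluster. The following is a minimal sketch that checks each provisioner and node template by name. karpenter_resources_exist is a hypothetical helper name of our own.

```shell
# Hypothetical helper: succeeds when both provisioners and both node
# templates created in this step can be retrieved from the cluster.
karpenter_resources_exist() {
  for name in windows2019 windows2022; do
    kubectl get provisioner "$name" >/dev/null || return 1
    kubectl get awsnodetemplate "$name" >/dev/null || return 1
  done
}
```

Running karpenter_resources_exist && echo "provisioners ready" confirms that the manifests were applied.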

7. Scale out the deployment

We now have our Amazon EKS cluster prepped for running Windows nodes and all the necessary components of Karpenter. We scale a sample application to see Karpenter automatically add nodes to the Amazon EKS cluster based on demand.

7.1 Run the following code to create your Windows Server 2022 sample application.

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: windows-server-iis-simple-2022
spec:
  selector:
    matchLabels:
      app: windows-server-iis-simple-2022
      tier: backend
      track: stable
  replicas: 0
  template:
    metadata:
      labels:
        app: windows-server-iis-simple-2022
        tier: backend
        track: stable
    spec:
      containers:
        - name: windows-server-iis-simple-2022
          image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2022
          imagePullPolicy: IfNotPresent
          command:
            - powershell.exe
            - -command
            - while(1){sleep 2; ping -t localhost;}
      nodeSelector:
        kubernetes.io/os: windows
        node.kubernetes.io/windows-build: 10.0.20348
EOF

The Windows Server version used by each pod must match that of the node. If you want to use multiple Windows Server versions in the same cluster, then you should set additional node labels and nodeSelector fields. To simplify this, Kubernetes automatically adds a label named node.kubernetes.io/windows-build to each Windows node.

This label reflects the Windows major, minor, and build number that need to match for compatibility. Here are the values used for each Windows Server version:

 

Product Name          Version
Windows Server 2019   10.0.17763
Windows Server 2022   10.0.20348

Based on the build version specified in the pod nodeSelector, Karpenter launches new Windows nodes with the matching operating system version. For example, if the build version is specified as 10.0.17763, then Karpenter uses the Windows 2019 provisioner to launch Windows nodes. For more information, please refer to the Guide for Running Windows Containers in Kubernetes.
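
If you generate manifests from scripts, the mapping between Windows Server release and build label can be captured in a small lookup. This is a minimal sketch covering only the two releases used in this post. windows_build_for is a hypothetical helper name of our own.

```shell
# Hypothetical helper: print the node.kubernetes.io/windows-build value
# for a Windows Server release (2019 or 2022).
windows_build_for() {
  case "$1" in
    2019) echo "10.0.17763" ;;
    2022) echo "10.0.20348" ;;
    *) echo "unsupported Windows Server release: $1" >&2; return 1 ;;
  esac
}
```

Once Windows nodes are running, kubectl get nodes -L node.kubernetes.io/windows-build shows the label value that each node reports.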

7.2 Run the following command to scale your Windows Server 2022 sample application.

kubectl scale deployment windows-server-iis-simple-2022 --replicas 10

7.3 You can use the Karpenter logs to track the scaling progress.

kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller

The following output shows the Windows Server 2022 Karpenter provisioner scaling from zero nodes to one node to run the 10 replicas we requested.

2023-06-14T12:19:01.581Z        INFO    controller.machine_lifecycle    launched machine        {"commit": "c990a2d", "machine": "windows2022-4hq46", "provisioner": "windows2022", "provider-id": "aws:///us-east-1f/i-039507775a01898e6", "instance-type": "c6a.xlarge", "zone": "us-east-1f", "capacity-type": "on-demand", "allocatable": {"cpu":"3920m","ephemeral-storage":"44Gi","memory":"6012Mi","pods":"110","vpc.amazonaws.com/PrivateIPv4Address":"14"}}
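
Besides reading the controller logs, you can list the nodes that a given provisioner launched. The following is a minimal sketch that filters on the karpenter.sh/provisioner-name label, which Karpenter applies to the nodes it creates with the v1alpha5 Provisioner API. list_provisioner_nodes is a hypothetical helper name of our own.

```shell
# Hypothetical helper: list nodes launched by the named provisioner, with
# extra columns for the Windows build and the instance type.
list_provisioner_nodes() {
  kubectl get nodes \
    -l "karpenter.sh/provisioner-name=$1" \
    -L node.kubernetes.io/windows-build,node.kubernetes.io/instance-type
}
```

For example, list_provisioner_nodes windows2022 shows the node from the log entry above once it has registered with the cluster.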

7.4 Run the following command to track the deployment progress of your pods.

kubectl rollout status deploy/windows-server-iis-simple-2022

You’ll see the 10 replicas being created on the new Karpenter-provisioned Windows worker nodes.

Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 0 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 1 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 2 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 3 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 4 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 5 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 6 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 7 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 8 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 9 of 10 updated replicas are available...
deployment "windows-server-iis-simple-2022" successfully rolled out

7.5 Run the following code to create your Windows Server 2019 sample application.

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: windows-server-iis-simple-2019
spec:
  selector:
    matchLabels:
      app: windows-server-iis-simple-2019
      tier: backend
      track: stable
  replicas: 0
  template:
    metadata:
      labels:
        app: windows-server-iis-simple-2019
        tier: backend
        track: stable
    spec:
      containers:
        - name: windows-server-iis-simple-2019
          image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019
          imagePullPolicy: IfNotPresent
          command:
            - powershell.exe
            - -command
            - while(1){sleep 2; ping -t localhost;}
      nodeSelector:
        kubernetes.io/os: windows
        node.kubernetes.io/windows-build: 10.0.17763
EOF

7.6 Run the following command to scale your Windows Server 2019 sample application.

kubectl scale deployment windows-server-iis-simple-2019 --replicas 10

Karpenter launches a new Windows Server 2019 worker node as more pods are requested. This process is identical to the one for Windows Server 2022, and you can reuse the steps above to track the launch of the Windows Server 2019 worker node.

8. Scale in the deployment

Karpenter handles scale out and scale in of Windows nodes based on demand. We’ll now tear down our sample applications and watch Karpenter terminate our Windows nodes.

8.1 Run the following commands to delete your sample application deployments.

kubectl delete deployment windows-server-iis-simple-2022
kubectl delete deployment windows-server-iis-simple-2019

The Windows instances launched earlier by Karpenter will now be terminated. You can use the Karpenter logs to track the scale-in progress.

kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller

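Once all pods have been terminated, Karpenter marks each empty node with a TTL and deletes it after the ttlSecondsAfterEmpty period set in the provisioner (30 seconds in this walkthrough) has elapsed.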

2023-06-20T16:27:12.878Z        DEBUG   controller.node added TTL to empty node {"commit": "c990a2d", "node": "ip-192-168-99-4.ec2.internal", "provisioner": "windows2022"}
2023-06-20T16:27:15.140Z        DEBUG   controller.node added TTL to empty node {"commit": "c990a2d", "node": "ip-192-168-88-252.ec2.internal", "provisioner": "windows2019"}
2023-06-20T16:27:42.051Z        INFO    controller.deprovisioning       deprovisioning via emptiness delete, terminating 1 machines ip-192-168-99-4.ec2.internal/c6a.xlarge/on-demand       {"commit": "c990a2d"}
2023-06-20T16:27:42.138Z        INFO    controller.termination  cordoned node   {"commit": "c990a2d", "node": "ip-192-168-99-4.ec2.internal"}
2023-06-20T16:27:42.478Z        INFO    controller.termination  deleted node    {"commit": "c990a2d", "node": "ip-192-168-99-4.ec2.internal"}
2023-06-20T16:27:42.751Z        INFO    controller.machine_termination  deleted machine       {"commit": "c990a2d", "machine": "windows2022-4hq46", "node": "ip-192-168-99-4.ec2.internal", "provisioner": "windows2022", "provider-id": "aws:///us-east-1f/i-039507775a01898e6"}
2023-06-20T16:27:54.105Z        INFO    controller.deprovisioning       deprovisioning via emptiness delete, terminating 1 machines ip-192-168-88-252.ec2.internal/c6a.xlarge/on-demand     {"commit": "c990a2d"}
2023-06-20T16:27:54.177Z        INFO    controller.termination  cordoned node   {"commit": "c990a2d", "node": "ip-192-168-88-252.ec2.internal"}
2023-06-20T16:27:54.480Z        INFO    controller.termination  deleted node    {"commit": "c990a2d", "node": "ip-192-168-88-252.ec2.internal"}
2023-06-20T16:27:54.754Z        INFO    controller.machine_termination  deleted machine       {"commit": "c990a2d", "machine": "windows2019-khmc5", "node": "ip-192-168-88-252.ec2.internal", "provisioner": "windows2019", "provider-id": "aws:///us-east-1a/i-0978aeb1680f37d7c"}
2023-06-20T16:31:21.596Z        DEBUG   controller.awsnodetemplate      discovered subnets    {"commit": "c990a2d", "awsnodetemplate": "windows2019", "subnets": ["subnet-05d7fed709f082b75 (us-east-1a)", "subnet-0109ebad1a6808805 (us-east-1f)", "subnet-0ff0ebe5e1a8630f1 (us-east-1a)", "subnet-0d01b14a3e9c91d1f (us-east-1f)"]}
2023-06-20T16:33:19.192Z        DEBUG   controller.deprovisioning       discovered subnets    {"commit": "c990a2d", "subnets": ["subnet-05d7fed709f082b75 (us-east-1a)", "subnet-0109ebad1a6808805 (us-east-1f)", "subnet-0ff0ebe5e1a8630f1 (us-east-1a)", "subnet-0d01b14a3e9c91d1f (us-east-1f)"]}      discovered instance types     {"commit": "c990a2d", "count": 649}
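
In the sample log, the TTL for the windows2022 node was added at 16:27:12 and deprovisioning started at 16:27:42, which matches the 30-second ttlSecondsAfterEmpty value. The following is a minimal sketch for checking this from two log timestamps. The helper names are hypothetical, and the sketch assumes both timestamps fall on the same UTC day.

```shell
# Hypothetical helper: convert the time portion of a log timestamp such as
# 2023-06-20T16:27:12.878Z into whole seconds since midnight.
seconds_of_day() {
  echo "$1" | awk -F'[T:.]' '{ print $2 * 3600 + $3 * 60 + $4 }'
}

# Hypothetical helper: print whole seconds elapsed between two timestamps
# (assumes both fall on the same UTC day).
ttl_elapsed() {
  start="$(seconds_of_day "$1")"
  end="$(seconds_of_day "$2")"
  echo $((end - start))
}
```

Calling ttl_elapsed 2023-06-20T16:27:12.878Z 2023-06-20T16:27:42.051Z prints 30.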

Cleaning up

When you’ve finished, clean up the resources associated with the example cluster deployment to avoid incurring unwanted charges.

eksctl delete cluster --name ${CLUSTER_NAME} --region ${AWS_DEFAULT_REGION}

If this command times out, then you can run it again to confirm that the cluster has been successfully removed.
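
The Amazon Web Services CloudFormation stack created in step 2 is separate from the stacks that eksctl manages, so deleting the cluster does not remove it. The following is a minimal sketch for deleting that stack as well. delete_karpenter_stack is a hypothetical helper name of our own, and the sketch assumes that nothing else still depends on the stack's resources.

```shell
# Hypothetical helper: delete the Karpenter CloudFormation stack from step 2
# and wait until the deletion has finished.
delete_karpenter_stack() {
  aws cloudformation delete-stack --stack-name "Karpenter-${CLUSTER_NAME}" \
    && aws cloudformation wait stack-delete-complete --stack-name "Karpenter-${CLUSTER_NAME}"
}
```

Run delete_karpenter_stack in the same shell session so that CLUSTER_NAME is still set.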

Conclusion

In this post, we showed how you can use Karpenter to scale your Windows worker nodes out and in on Amazon EKS. Customers no longer need to maintain two autoscaling solutions on a heterogeneous Amazon EKS cluster with Windows and Linux nodes.

A big shout-out to topikachu, who proactively started the development of the add-on.