An EKS cluster consists of two VPCs:
- The first VPC, managed by AWS, hosts the Kubernetes control plane.
- The second VPC, managed by the customer, hosts the Kubernetes worker nodes (EC2 instances) where containers run, as well as other AWS infrastructure (such as load balancers) used by the cluster.
All worker nodes need the ability to connect to the managed API server endpoint. This connection allows the worker node to register itself with the Kubernetes control plane and to receive requests to run application pods.
The worker nodes connect through the EKS-managed elastic network interfaces (ENIs) that are placed in the subnets that you provide when you create the cluster.
Amazon EKS nodegroups are immutable by design, i.e. once a nodegroup is created, it is not possible to change its type (managed/unmanaged), AMI, or instance type.
Nodegroups can be scaled up/down at any time. The same EKS cluster can have multiple nodegroups to accommodate different types of workloads. A nodegroup can have mixed instance types when configured to use Spot.
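As a point of reference, a mixed-instance Spot nodegroup of this kind can also be expressed with the open-source eksctl CLI. The sketch below is illustrative only; the cluster name, region, nodegroup name, and instance types are placeholders:

```shell
# Sketch: create a Spot nodegroup with mixed instance types via eksctl.
# All names, the region, and the instance types are placeholders.
eksctl create nodegroup \
  --cluster demo-cluster \
  --region us-west-2 \
  --name spot-workers \
  --spot \
  --instance-types m5.large,m5a.large,m4.large \
  --nodes 3 --nodes-min 1 --nodes-max 6
```

Listing several similarly sized instance types improves the chance that Spot capacity is available in at least one pool.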
Users can use the Controller to provision Amazon EKS Clusters with either "Self Managed" or "AWS Managed" nodegroups.
Comparing Node Group Types

|Feature|Self Managed|AWS Managed|
|---|---|---|
|Custom Security Group Rules|Yes|Limited|
|Custom SSH Auth|Yes|Limited|
Users can select from multiple Node AMI family types for the nodegroup. In addition, users can also bring their own "Custom AMI".
|Node AMI Family|
|---|
|Amazon Linux 2, Ubuntu 18.04, Ubuntu 20.04, Bottlerocket|
Self Managed Node Groups
Self Managed node groups are essentially user-provisioned EC2 instances or Auto Scaling Groups that are registered as worker nodes with the EKS control plane.
To provision EC2 instances as EKS workers, you need to ensure that the following criteria are satisfied:
- The AMI has all the components installed to act as a Kubernetes node (at a minimum, the kubelet and a container runtime).
- The associated Security Group needs to allow communication with the Control Plane and other Workers in the cluster.
- User data or boot scripts of the instances need to include a step to register with the EKS control plane.
- The IAM role used by the worker nodes is registered with the cluster (e.g. via the aws-auth ConfigMap) so the nodes can join.
On EKS optimized AMIs, the user data is handled by the bootstrap.sh script installed on the AMI.
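For example, on an EKS optimized Amazon Linux 2 AMI, a minimal user data script only needs to invoke bootstrap.sh; the cluster name and node label below are placeholders:

```shell
#!/bin/bash
# Minimal user data for an EKS optimized Amazon Linux 2 AMI.
# bootstrap.sh ships on the AMI and registers the node with the named
# cluster's control plane. "demo-cluster" and the label are placeholders.
set -o errexit
/etc/eks/bootstrap.sh demo-cluster \
  --kubelet-extra-args '--node-labels=nodegroup=self-managed-workers'
```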
The Controller streamlines and automates all these steps as part of the provisioning process, essentially providing a custom, managed experience for users.
Self managed node groups do not benefit from any managed services provided by AWS. The user needs to configure everything including the AMI to use, Kubernetes API access on the node, registering nodes to EKS, graceful termination, etc. The Controller helps streamline and automate the entire workflow.
On the flip side, self managed node groups give users the most flexibility in configuring their worker nodes. Users have complete control over the underlying infrastructure and can customize all the nodes to suit their preference.
Managed Node Groups
Managed Node Groups automate the provisioning and lifecycle management of the EKS cluster's worker nodes. With this configuration, AWS takes on the operational burden of:
- Running the latest EKS optimized AMI.
- Gracefully draining nodes before termination during a scale-down event.
- Gracefully rotating nodes to update the underlying AMI.
- Applying labels to the resulting Kubernetes Node resources.
While Managed Node Groups provide a managed experience for the provisioning and lifecycle of EC2 instances, they do not configure horizontal or vertical auto-scaling.
Managed Node Groups also do not automatically update the underlying AMI to handle OS patches or Kubernetes version updates. The user still needs to manually trigger a Managed Node Group update.
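Outside the Controller, this manual trigger corresponds to the EKS UpdateNodegroupVersion API. A hedged AWS CLI sketch, with placeholder cluster and node group names:

```shell
# Roll a managed node group onto the latest AMI release for its current
# Kubernetes version. The cluster and nodegroup names are placeholders.
aws eks update-nodegroup-version \
  --cluster-name demo-cluster \
  --nodegroup-name managed-workers
```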
With Managed Node Groups:
- Users do not have control over the underlying AMI.
- Only the EKS optimized Amazon Linux 2 AMIs are supported.
- SSH access is possible only with an EC2 Key Pair, i.e. you have to use a single, shared key pair for all SSH access.
- Users do not have the ability to set a user data script, or to update the packages installed in the AMI as the instances are booting.
- Users have limited control over the security group rules for remote access. For example, when you specify an EC2 key pair on a Managed Node Group, by default the security group automatically opens port 22 to the whole world (0.0.0.0/0). You can further restrict access by specifying source security group IDs, but you do not have the option to restrict CIDR blocks. This makes it hard to expose access over a peered VPC connection or Direct Connect, where the security group may not live in the same account.
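For illustration, tightening the default rule comes down to replacing the world-open ingress with a source-security-group rule; the security group IDs below are placeholders:

```shell
# Replace the default world-open SSH rule on the remote-access security
# group with one scoped to a source security group.
# Both security group IDs are illustrative placeholders.
aws ec2 revoke-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 22 --cidr 0.0.0.0/0

aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 22 \
  --source-group sg-0fedcba9876543210
```

Note that a source-group rule only matches traffic from instances in that group within the same (or a peered, same-account) VPC, which is exactly the limitation described above.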
Node Group Lifecycle
Amazon EKS clusters provisioned by the Controller start life with one node group. Additional node groups can be added after initial provisioning. Users can also use the Controller to perform actions on node groups.
View Node Group Details
Click on a nodegroup to view its details. In the example below, the EKS cluster has one nodegroup.
Scale Node Group
Click on the gear icon on the far right of a node group to view the available actions for it.
Scaling presents the user with a prompt for the "desired" number of worker nodes. Depending on the value entered, the node group will be either "Scaled Up" or "Scaled Down".
Scaling a node group can take ~5 minutes to ensure that the EC2 instances are provisioned, fully operational, and attached to the cluster. The user is provided with feedback and status. Illustrative screenshot below.
Scaling down a node group does not explicitly drain the node before removing the nodes from the Auto Scaling Group (ASG). Pods running on the node are terminated and will be restarted by Kubernetes on available nodes.
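If a workload needs a graceful hand-off, a node can be drained manually with kubectl before scaling the group down; the node name below is a placeholder:

```shell
# Cordon the node and evict its pods before scaling the group down.
# The node name is an illustrative placeholder.
kubectl drain ip-10-0-1-23.us-west-2.compute.internal \
  --ignore-daemonsets \
  --delete-emptydir-data
```

--ignore-daemonsets is needed because DaemonSet pods cannot be evicted, and --delete-emptydir-data acknowledges that emptyDir volumes are lost with the node.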
Add Node Group
Click on "Add Node Group" on the far right. The user is presented with a configuration screen for the nodegroup. Enter the required details and click "Add".
Adding a new nodegroup can take ~5 minutes to ensure that the EC2 instances are provisioned, fully operational, and attached to the cluster. The user is provided with feedback and status. Illustrative screenshot below.
Drain Node Group
When the user drains a node group, the nodes are cordoned and drained. This ensures that existing pods are relocated off these nodes and that new pods cannot be scheduled on them.
The user is provided a warning before the node group is drained.
Draining a node group can take a few minutes. The user is provided with feedback and status once this is completed. Illustrative screenshot below
Users can leave a node group in a "drained" state for extended periods of time.
Delete Node Group
When the user deletes a node group, the Controller ensures that the node group is drained first before it is deleted.
Deleting a node group can take ~5 minutes to ensure that the EC2 instances are deprovisioned and the CloudFormation templates are appropriately reconciled. The user is provided with feedback and status during this process. Illustrative screenshot below.