Troubleshooting

This section describes errors that frequently occur during cluster provisioning.


Resource Provisioning Failures

Scenario 1: Instance Type Not Supported

The following error is an example of what might occur when provisioning a cluster or adding a new node group to an existing cluster.

Error 1

Validation

To resolve this issue, validate the instance types for the region as follows:

  • Check that your Cloud Credentials (role-based, or access key ID and secret access key) have the permissions required to call the AWS EC2 APIs. If the Cloud Credentials are role-based, ensure that all the required IAM policies are in place
  • Check whether the configuration specifies an instance type that is not available in the selected region
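The second check can be scripted. The following is a minimal sketch, assuming boto3 is installed and an EC2 client has been created for the selected region. The function name `unavailable_instance_types` and the example region and instance types are illustrative and not part of the product. The call itself requires the `ec2:DescribeInstanceTypeOfferings` permission, so an authorization error at this step points back to the first check.

```python
def unavailable_instance_types(ec2_client, wanted_types):
    """Return the requested instance types that the client's region does not offer."""
    offered = set()
    kwargs = {
        "LocationType": "region",
        "Filters": [{"Name": "instance-type", "Values": list(wanted_types)}],
    }
    # Follow NextToken until every page of offerings has been read.
    while True:
        page = ec2_client.describe_instance_type_offerings(**kwargs)
        offered.update(o["InstanceType"] for o in page.get("InstanceTypeOfferings", []))
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token
    return sorted(set(wanted_types) - offered)


# Example usage (assumes valid AWS credentials; region and types are placeholders):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-west-2")
#   print(unavailable_instance_types(ec2, ["t3.large", "p4d.24xlarge"]))
```

An empty result means every requested type is offered somewhere in the region. It does not guarantee availability in each individual Availability Zone.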

Scenario 2: Availability Zones

The following error is an example of what might occur during EKS cluster provisioning when the cloud credentials do not have permission to create resources in the selected region.

Error 2

Validation

Validate that the cloud credentials used for cluster provisioning have permission to create resources in the configured region.
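As a quick first probe, the Availability Zone lookup can be run with the same credentials. This is a minimal sketch, assuming boto3 and an EC2 client created for the configured region. The helper name `available_zones` is illustrative. It exercises only the `ec2:DescribeAvailabilityZones` permission, so a successful call does not prove that the credentials can create resources.

```python
def available_zones(ec2_client):
    """Return the names of zones in the client's region whose state is 'available'."""
    response = ec2_client.describe_availability_zones()
    return sorted(
        zone["ZoneName"]
        for zone in response.get("AvailabilityZones", [])
        if zone.get("State") == "available"
    )


# Example usage (assumes valid AWS credentials; the region is a placeholder):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-west-2")
#   print(available_zones(ec2))
```

If the call raises an authorization error, or the list is empty, review the IAM permissions and the region setting before retrying the provisioning.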


Scenario 3: Instance Type Permission

The following error is an example of what might occur when the cloud credentials do not have permission to use a particular instance type specified in the EKS cluster configuration.

Error 2

Validation

  • Check the permissions of the cloud credentials and use an instance type that they are allowed to launch
  • Alternatively, correct the permissions in AWS so that the configured instance type can be used
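One way to check the permission without launching anything is the IAM policy simulator. This is a rough sketch, assuming boto3, an IAM client, and a caller allowed to use `iam:SimulatePrincipalPolicy`. The helper name `instance_type_decision` and the ARN in the usage comment are hypothetical. It assumes the restriction is expressed through the `ec2:InstanceType` condition key, which may not match how your policies are written.

```python
def instance_type_decision(iam_client, principal_arn, instance_type):
    """Simulate ec2:RunInstances for one instance type and return the decision per action."""
    response = iam_client.simulate_principal_policy(
        PolicySourceArn=principal_arn,
        ActionNames=["ec2:RunInstances"],
        # Supply the instance type as the value of the ec2:InstanceType condition key.
        ContextEntries=[
            {
                "ContextKeyName": "ec2:InstanceType",
                "ContextKeyValues": [instance_type],
                "ContextKeyType": "string",
            }
        ],
    )
    return {r["EvalActionName"]: r["EvalDecision"] for r in response["EvaluationResults"]}


# Example usage (the role ARN is a placeholder):
#   import boto3
#   iam = boto3.client("iam")
#   print(instance_type_decision(iam, "arn:aws:iam::123456789012:role/example", "t3.large"))
```

A decision of `allowed` suggests the policies permit the instance type. `implicitDeny` or `explicitDeny` suggests the permissions need to be corrected or a different instance type chosen. Treat the result as a hint and not as a definitive answer.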

Scenario 4: K8s version upgrade

During a k8s version upgrade to 1.25, the following error occurs if the aws-load-balancer-controller version is 2.4.6. The preflight check fails and the upgrade is halted.

Error 2

Validation

Update the aws-load-balancer-controller to version v2.4.7, then upgrade the k8s version to 1.25.
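To confirm the controller version before starting the upgrade, the image tag of the running deployment can be inspected. This is a minimal sketch that parses an image reference and compares it with v2.4.7. The function names are illustrative, and treating any version above v2.4.7 as acceptable is an assumption that this page does not confirm. The kubectl command in the comment assumes the controller runs as a deployment named `aws-load-balancer-controller` in the `kube-system` namespace, which may differ in your cluster.

```python
import re

MIN_VERSION = (2, 4, 7)  # version named in the validation step above


def controller_version(image):
    """Extract (major, minor, patch) from an image reference tag such as ':v2.4.6'."""
    match = re.search(r":v?(\d+)\.(\d+)\.(\d+)", image)
    if match is None:
        raise ValueError(f"no semantic version tag found in {image!r}")
    return tuple(int(part) for part in match.groups())


def needs_controller_update(image, minimum=MIN_VERSION):
    """Return True when the image tag is older than the minimum version."""
    return controller_version(image) < minimum


# The image string can be obtained with, for example (names are assumptions):
#   kubectl -n kube-system get deployment aws-load-balancer-controller \
#     -o jsonpath='{.spec.template.spec.containers[0].image}'
```

Versions are compared as integer tuples and not as strings, so a tag such as v2.10.0 is correctly treated as newer than v2.4.7.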


Scenario 5: Removal of PSPs

The following error is an example of what might occur when Pod Security Policies (PSPs) are found during the k8s version upgrade to 1.25.

Error 2

Validation

PSPs are no longer supported as of k8s v1.25. Remove the PSPs, then retry the upgrade.
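To see which PSPs still exist before retrying, the output of `kubectl get podsecuritypolicies -o json` can be listed by name. This is a minimal sketch. The helper name `psp_names` is illustrative, and it assumes the JSON has the usual Kubernetes list shape with an `items` array.

```python
import json


def psp_names(kubectl_json):
    """Return PodSecurityPolicy names from `kubectl get podsecuritypolicies -o json` output."""
    doc = json.loads(kubectl_json)
    return sorted(item["metadata"]["name"] for item in doc.get("items", []))


# Example usage, run against a cluster that has not yet been upgraded:
#   import subprocess
#   out = subprocess.run(
#       ["kubectl", "get", "podsecuritypolicies", "-o", "json"],
#       check=True, capture_output=True, text=True,
#   ).stdout
#   print(psp_names(out))
```

Review each listed policy and what depends on it before deleting it. An empty list means no PSPs remain.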


AWS Cloud Errors

EKS cluster provisioning can fail because of various AWS Cloud errors. Common causes include resource limits, network connectivity issues, misconfiguration in the provisioning process, insufficient permissions, outages in required AWS services, software bugs, and region-specific constraints.

To see the failure and its underlying cause, click Provision Status on the failed cluster.

Error 2

Expand the Cloud Error(s) section to view detailed information about the AWS CloudFormation errors encountered during cluster provisioning. Use these details to identify the root cause and take corrective action.

Error 2
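The same CloudFormation errors can also be read directly from AWS. This is a minimal sketch, assuming boto3, a CloudFormation client for the cluster's region, and that you know the name of the stack created for the cluster. The helper name `failed_stack_events` and the stack name in the usage comment are hypothetical. It reads only the first page of events, which CloudFormation returns most recent first.

```python
def failed_stack_events(cfn_client, stack_name):
    """Return (logical_id, status, reason) for stack events whose status ends in _FAILED."""
    response = cfn_client.describe_stack_events(StackName=stack_name)
    failures = []
    for event in response.get("StackEvents", []):
        status = event.get("ResourceStatus", "")
        if status.endswith("_FAILED"):
            failures.append(
                (
                    event.get("LogicalResourceId", ""),
                    status,
                    event.get("ResourceStatusReason", ""),
                )
            )
    return failures


# Example usage (the stack name is a placeholder):
#   import boto3
#   cfn = boto3.client("cloudformation", region_name="us-west-2")
#   for logical_id, status, reason in failed_stack_events(cfn, "example-cluster-stack"):
#       print(logical_id, status, reason)
```

The reason on the oldest failed event is usually the most useful, because later failures are often a consequence of the first one.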