Troubleshooting
This section describes errors that frequently occur during GKE cluster provisioning.
Scenario 1: Invalid credentials/project details
The following error might occur when provisioning a GKE cluster without enabling the Compute Engine API in a newly created GCP project.
Validation
To resolve this issue, perform the following validations:
- Verify that the credentials are valid on the Cloud Credentials page of the controller
- Correct the project name so that it matches the project created in the GCP console
Scenario 2: Cluster's control plane IP range
The following error might occur when the cluster's control plane IP range is not a /28 (28-bit) range.
Validation
- When the cluster privacy is set to Private, specify a control plane IP range with a 28-bit prefix
- Cross-check the Control plane IP range field and make sure it contains a /28 CIDR. This field defines the internal IP address range for the control plane
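As a quick local pre-check, a candidate range can be tested with Python's standard `ipaddress` module before it is entered in the Control plane IP range field. This is a minimal sketch: the function name `is_valid_control_plane_range` is illustrative and not part of any controller or GCP API, and it only checks that the value is a well-formed IPv4 /28 network, not whether the range is free in your VPC.

```python
import ipaddress


def is_valid_control_plane_range(cidr: str) -> bool:
    """Return True if cidr is a well-formed IPv4 network with a /28 prefix."""
    try:
        # strict=True rejects values with host bits set, e.g. 172.16.0.5/28
        network = ipaddress.ip_network(cidr, strict=True)
    except ValueError:
        return False
    return network.version == 4 and network.prefixlen == 28


if __name__ == "__main__":
    # Example values only; substitute the range you plan to use.
    for candidate in ("172.16.0.0/28", "172.16.0.0/24", "172.16.0.5/28"):
        print(candidate, is_valid_control_plane_range(candidate))
```

Only the first example value passes: the second has the wrong prefix length and the third is not aligned to a /28 boundary.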
Scenario 3: Invalid Region or Zone
The following error might occur when an invalid region or zone is provided.
Validation
Edit and correct the region and zone details. Make sure that every zone you specify belongs to the chosen region.
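Compute Engine zone names are formed from the region name followed by a hyphen and a single letter (for example, `us-central1-a` in `us-central1`), so a simple name check can catch an obvious region/zone mismatch before provisioning. The sketch below is illustrative only: `zone_in_region` is a hypothetical helper, and it checks the naming pattern without confirming that the zone actually exists or is available to your project.

```python
def zone_in_region(zone: str, region: str) -> bool:
    """Check the naming convention only: a zone is '<region>-<letter>'."""
    prefix, sep, suffix = zone.rpartition("-")
    return (
        sep == "-"
        and prefix == region
        and len(suffix) == 1
        and suffix.isalpha()
    )


if __name__ == "__main__":
    # Example values only.
    print(zone_in_region("us-central1-a", "us-central1"))
    print(zone_in_region("us-east1-b", "us-central1"))
```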
Scenario 4: Mismatch Between GCP Reservation and Requested Cluster
The following failure occurs, with a warning, when the GCP Reservation is in one zone and the requested cluster, with its location type set to zonal, is in a different zone.
Validation
- Review the zones specified for the GCP Reservation and for the requested cluster
- Change either the GCP Reservation or the cluster's location so that both are in the same zone
- Keep the specified zones consistent to prevent the mismatch error
Scenario 5: Insufficient Capacity in GCP Reservation
The following failure occurs, with a warning, when the GCP Reservation does not have enough VM capacity for the number of nodes requested in a node pool.
Validation
- Increase the VM capacity of the GCP Reservation to accommodate the requested number of nodes
- Review the node pool configuration and adjust it so that it fits within the capacity available in the GCP Reservation
- Consider optimizing resource usage or upgrading the GCP Reservation for more capacity
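The zone check from Scenario 4 and the capacity check from Scenario 5 can be sketched together as a small pre-flight comparison. Everything here is an assumption made for illustration: the `Reservation` and `NodePoolRequest` classes, their field names, and the `reservation_problems` function are hypothetical and do not correspond to a controller or GCP API. Fill in the values by hand from the GCP console and your cluster specification.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    zone: str
    vm_count: int       # total VMs in the reservation
    in_use_count: int   # VMs already consumed


@dataclass
class NodePoolRequest:
    zone: str
    node_count: int     # nodes requested for the node pool


def reservation_problems(res: Reservation, req: NodePoolRequest) -> list[str]:
    """Return a list of human-readable problems; empty means both checks pass."""
    problems = []
    if res.zone != req.zone:
        problems.append(
            f"zone mismatch: reservation is in {res.zone}, cluster is in {req.zone}"
        )
    available = res.vm_count - res.in_use_count
    if req.node_count > available:
        problems.append(
            f"insufficient capacity: {req.node_count} nodes requested, "
            f"{available} VMs available"
        )
    return problems


if __name__ == "__main__":
    # Example values only.
    res = Reservation(zone="us-central1-a", vm_count=5, in_use_count=4)
    req = NodePoolRequest(zone="us-central1-b", node_count=3)
    for problem in reservation_problems(res, req):
        print(problem)
```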
Scenario 6: Provisioning Halting at 'Cluster Control Plane Ready' Phase
This error occurs when the GKE cluster provisioning state becomes unresponsive and remains stuck at the 'Cluster Control Plane Ready' phase. At this point the target cluster has been created on GKE, but the controller is still waiting for confirmation that the control plane on the target cluster is ready.
Validation
When creating a private GKE cluster:
- Ensure 'Access Control Plane ExternalIP' is disabled and 'Control Plane Authorized Networks' is enabled
- Provide a CIDR that covers all IPs requiring access to your private cluster
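To confirm that the CIDR entered for 'Control Plane Authorized Networks' really covers every address that needs access, the ranges can be checked locally with Python's standard `ipaddress` module. This is a minimal sketch under stated assumptions: `uncovered_clients` is a hypothetical helper, and the addresses shown are documentation-only example ranges, not values from any real setup.

```python
import ipaddress


def uncovered_clients(authorized_cidrs, client_ips):
    """Return the client IPs that fall outside every authorized CIDR."""
    networks = [ipaddress.ip_network(c, strict=False) for c in authorized_cidrs]
    missing = []
    for ip in client_ips:
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in networks):
            missing.append(ip)
    return missing


if __name__ == "__main__":
    # Example values only; replace with your authorized networks and client IPs.
    cidrs = ["203.0.113.0/24"]
    clients = ["203.0.113.10", "198.51.100.7"]
    print(uncovered_clients(cidrs, clients))
```

Any address the function returns needs either a wider CIDR or an additional authorized network entry.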
Scenario 7: Initialization of Cluster Provider Infrastructure Stage
The following error occurs when the Infra-Agent installed on the bootstrap VM is unable to establish a connection back to the controller.
Validation
- Confirm that the Infra-Agent was installed successfully on the GKE bootstrap VM by checking the logs at /var/log/infra_agent.log
- If the agent is not installed, review the Google startup script logs and, if necessary, create a Google Cloud NAT to provide internet connectivity
- If the agent is installed, resolve the specific errors reported in the logs, such as expired certificates
- Monitor the connection between the Infra-Agent and the controller regularly and troubleshoot issues as they appear
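When reviewing the agent log on the bootstrap VM, a short filter can surface the lines most likely to explain a failed connection. The sketch below is an assumption-heavy illustration: the log path comes from the step above, but the keyword list and the `suspicious_lines` helper are hypothetical, because the agent's actual log format is not described here. Adjust the keywords to match what you see in the file.

```python
from pathlib import Path

LOG_PATH = Path("/var/log/infra_agent.log")  # path from the validation step above
# Assumed keywords; the real log format may use different wording.
KEYWORDS = ("error", "certificate", "expired", "timeout")


def suspicious_lines(lines, keywords=KEYWORDS):
    """Return lines that contain any keyword, compared case-insensitively."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(keyword in line.lower() for keyword in keywords)
    ]


if __name__ == "__main__":
    if LOG_PATH.exists():
        with LOG_PATH.open(errors="replace") as fh:
            for line in suspicious_lines(fh):
                print(line)
    else:
        # A missing log file suggests the agent was never installed.
        print(f"{LOG_PATH} not found; the agent may not be installed")
```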