|
56 | 56 | ### 3. Failed to apply a cluster template with release not found error
|
57 | 57 |
|
58 | 58 | When applying a cluster template from an unreleased version, such as the main branch, we run into an error like `release not found for version vX.XX.XX`. In that case, we need to use `--from=<path_to_cluster_template>` instead of `--flavor`.
|
| 59 | + |
| 60 | + |
| 61 | +### 4. Debugging Machine stuck in PROVISIONED phase |
| 62 | + |
| 63 | +* A Machine's Running phase indicates that it has been successfully created and initialised, and has become a Kubernetes Node in a Ready state. |
| 64 | + |
| 65 | +* Sometimes a Machine stays in the Provisioned phase indefinitely, indicating that the infrastructure has been created and configured but the Machine has yet to become a Kubernetes Node. |
| 66 | + |
| 67 | +* The cloud controller manager (CCM) takes care of turning a machine into a node by fetching the appropriate data from the cloud and initialising the node with it. |
| 68 | + |
| 69 | +* As part of the cluster template, we use [ClusterResourceSet](https://cluster-api.sigs.k8s.io/tasks/cluster-resource-set) to apply the CCM [resources](https://github.com/kubernetes-sigs/cluster-api-provider-ibmcloud/blob/cbdb2550ab3e326c95d075a6dc852c81c15b1189/templates/cluster-template-powervs.yaml#L300-L315) to the workload cluster. |
| 70 | + |
| 71 | +* Check the machine's current status |
| 72 | + |
| 73 | + ```shell |
| 74 | + $ kubectl get machines |
| 75 | + NAME CLUSTER NODENAME PROVIDERID PHASE AGE VERSION |
| 76 | + powervs-control-plane-pqnt4 powervs ibmpowervs://osa/osa21/10b1000b-da8d-4e18-ad1f-6b2a56a8c130/bc0c9621-12d2-47f1-932e-a18ff041aba2 Provisioned 5m36s v1.31.0 |
| 77 | + ``` |
| 78 | + |
| 79 | +* Verify that the ClusterResourceSet is applied to the workload cluster |
| 80 | + |
| 81 | + ```shell |
| 82 | + $ kubectl get clusterresourceset |
| 83 | + NAME AGE |
| 84 | + crs-cloud-conf 10m |
| 85 | + |
| 86 | + $ kubectl describe clusterresourceset crs-cloud-conf |
| 87 | + . |
| 88 | + . |
| 89 | + Status: |
| 90 | + Conditions: |
| 91 | + Last Transition Time: 2025-05-06T08:36:40Z |
| 92 | + Message: |
| 93 | + Observed Generation: 1 |
| 94 | + Reason: Applied |
| 95 | + Status: True |
| 96 | + Type: ResourcesApplied |
| 97 | + Last Transition Time: 2025-05-06T08:31:27Z |
| 98 | + Message: |
| 99 | + Observed Generation: 1 |
| 100 | + Reason: NotPaused |
| 101 | + Status: False |
| 102 | + Type: Paused |
| 103 | + ``` |
| 104 | + |
| 105 | +* Verify that the CCM resources are created in the workload cluster |
| 106 | + |
| 107 | + * Get the workload cluster kubeconfig |
| 108 | + |
| 109 | + ```shell |
| 110 | + $ clusterctl get kubeconfig powervs > workload.conf |
| 111 | + ``` |
| 112 | + |
| 113 | + * Check the CCM daemonset's status |
| 114 | +
|
| 115 | + ```shell |
| 116 | + $ kubectl get daemonset -n kube-system --kubeconfig=workload.conf |
| 117 | + NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE |
| 118 | + ibmpowervs-cloud-controller-manager 2 2 2 2 2 node-role.kubernetes.io/control-plane= 45m |
| 119 | + ``` |
| 120 | + |
| 121 | + * Check the logs of CCM |
| 122 | +
|
| 123 | + ```shell |
| 124 | + $ kubectl -n kube-system get pods --kubeconfig=workload.conf |
| 125 | + ibmpowervs-cloud-controller-manager-472lq 1/1 Running 1 (45m ago) 46m |
| 126 | + ibmpowervs-cloud-controller-manager-fw47h 1/1 Running 1 (38m ago) 38m |
| 127 | + |
| 128 | + $ kubectl -n kube-system logs ibmpowervs-cloud-controller-manager-472lq --kubeconfig=workload.conf |
| 129 | + I0506 09:23:51.420992 1 ibm_metadata_service.go:206] Retrieving information for node=powervs-control-plane-ftd8j from Power VS |
| 130 | + I0506 09:23:51.421003 1 ibm_powervs_client.go:270] Node powervs-control-plane-ftd8j found metadata &{InternalIP:192.168.236.114 ExternalIP:163.68.98.114 WorkerID:001275c5-f454-4944-8419-61c16f16f8b7 InstanceType:s922 FailureDomain:osa21 Region:osa ProviderID:ibmpowervs://osa/osa21/10b1000b-da8d-4e18-ad1f-6b2a56a8c130/001275c5-f454-4944-8419-61c16f16f8b7} from DHCP cache |
| 131 | + I0506 09:23:51.421038 1 node_controller.go:271] Update 3 nodes status took 7.03624ms. |
| 132 | + ``` |
| 133 | + |
| 134 | + * Check the cloud-conf config map |
| 135 | +
|
| 136 | + ```shell |
| 137 | + $ kubectl -n kube-system get cm ibmpowervs-cloud-config -o yaml --kubeconfig=workload.conf |
| 138 | + apiVersion: v1 |
| 139 | + kind: ConfigMap |
| 140 | + metadata: |
| 141 | + creationTimestamp: "2025-05-06T08:36:39Z" |
| 142 | + name: ibmpowervs-cloud-config |
| 143 | + namespace: kube-system |
| 144 | + resourceVersion: "329" |
| 145 | + uid: ae2bd436-0b1e-4534-9c6c-48f717f6f47e |
| 146 | + data: |
| 147 | + ibmpowervs.conf: | |
| 148 | + [global] |
| 149 | + version = 1.1.0 |
| 150 | + [kubernetes] |
| 151 | + config-file = "" |
| 152 | + [provider] |
| 153 | + cluster-default-provider = g2 |
| 154 | + . |
| 155 | + . |
| 156 | + ``` |
| 157 | + |
| 158 | + * Check whether the secret is configured with the correct IBM Cloud API key. |
| 159 | + |
| 160 | + ```shell |
| 161 | + $ kubectl -n kube-system get secret ibmpowervs-cloud-credential -o yaml --kubeconfig=workload.conf |
| 162 | + ``` |
| 163 | +* Check whether the node is initialised correctly and does not have the `node.cloudprovider.kubernetes.io/uninitialized` taint |
| 164 | +
|
| 165 | + ```shell |
| 166 | + $ kubectl get nodes --kubeconfig=workload.conf |
| 167 | + NAME STATUS ROLES AGE VERSION |
| 168 | + powervs-control-plane-ftd8j NotReady control-plane 53m v1.31.0 |
| 169 | + powervs-control-plane-pqnt4 NotReady control-plane 61m v1.31.0 |
| 170 | + powervs-md-0-2dnrm-8658c NotReady <none> 56m v1.31.0 |
| 171 | + |
| 172 | + |
| 173 | + $ kubectl get node powervs-control-plane-ftd8j -o yaml --kubeconfig=workload.conf |
| 174 | + apiVersion: v1 |
| 175 | + kind: Node |
| 176 | + metadata: |
| 177 | + annotations: |
| 178 | + cluster.x-k8s.io/annotations-from-machine: "" |
| 179 | + cluster.x-k8s.io/cluster-name: powervs |
| 180 | + cluster.x-k8s.io/cluster-namespace: default |
| 181 | + cluster.x-k8s.io/labels-from-machine: "" |
| 182 | + cluster.x-k8s.io/machine: powervs-control-plane-ftd8j |
| 183 | + cluster.x-k8s.io/owner-kind: KubeadmControlPlane |
| 184 | + cluster.x-k8s.io/owner-name: powervs-control-plane |
| 185 | + kubeadm.alpha.kubernetes.io/cri-socket: unix:///var/run/containerd/containerd.sock |
| 186 | + node.alpha.kubernetes.io/ttl: "0" |
| 187 | + volumes.kubernetes.io/controller-managed-attach-detach: "true" |
| 188 | + ``` |
| 189 | +
|
| 190 | +* On successful CCM initialisation, the machine moves to the Running phase and the corresponding NODENAME field is populated. |
| 191 | + ```shell |
| 192 | + NAME CLUSTER NODENAME PROVIDERID PHASE AGE VERSION |
| 193 | + powervs-control-plane-pqnt4 powervs powervs-control-plane-pqnt4 ibmpowervs://osa/osa21/10b1000b-da8d-4e18-ad1f-6b2a56a8c130/bc0c9621-12d2-47f1-932e-a18ff041aba2 Running 8m52s v1.31.0 |
| 194 | + ``` |
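
The Machine phase and node taint checks above can also be scripted. The following is a minimal sketch, not part of the provider tooling: the helper names `stuck_machines` and `has_uninitialized_taint` are hypothetical, and the `kubectl` wiring shown in the trailing comments assumes the Machine phase is exposed at `.status.phase` and node taints at `.spec.taints`.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; the names are not part of any tool.

# stuck_machines: reads "<machine-name><TAB><phase>" lines on stdin and
# prints the names of Machines whose phase is exactly "Provisioned".
stuck_machines() {
  awk -F '\t' '$2 == "Provisioned" { print $1 }'
}

# has_uninitialized_taint: reads whitespace-separated taint keys on stdin
# and exits 0 if the cloud provider "uninitialized" taint is among them.
has_uninitialized_taint() {
  tr ' \t' '\n\n' | grep -qxF 'node.cloudprovider.kubernetes.io/uninitialized'
}

# Assumed wiring against a live cluster (not executed by this sketch):
#   kubectl get machines \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' \
#     | stuck_machines
#   kubectl get node <node-name> --kubeconfig=workload.conf \
#     -o jsonpath='{.spec.taints[*].key}' \
#     | has_uninitialized_taint && echo "node is still waiting for the CCM"
```

Keeping the filters separate from the `kubectl` calls means they can be tried on captured output first, before pointing them at a real management or workload cluster.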