
Failover Test

This example demonstrates a Deployment with a single pod that uses a ReadWriteOnce (RWO) PVC. If the node running the pod fails, the pod can be rescheduled onto another node; the PVC remains available and its data stays intact.

Step 1: Create a PVC

Define a PersistentVolumeClaim requesting 1Gi of storage:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc-opennebula
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: opennebula-fs

Apply the PVC:

kubectl apply -f pvc.yaml
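
You can optionally confirm that the claim exists. Depending on the StorageClass's volumeBindingMode, the PVC may bind immediately or stay Pending until the first pod consumes it (WaitForFirstConsumer); the output below is illustrative:

$ kubectl get pvc test-pvc-opennebula
NAME                  STATUS   VOLUME     CAPACITY   ACCESS MODES   STORAGECLASS    AGE
test-pvc-opennebula   Bound    pvc-<XX>   1Gi        RWO            opennebula-fs   15s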

Step 2: Create a Deployment

Create a Deployment with a single replica that mounts the PVC:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pvc-failover-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pvc-failover
  template:
    metadata:
      labels:
        app: pvc-failover
    spec:
      containers:
        - name: app
          image: busybox
          command: ["sh", "-c", "echo $HOSTNAME >> /data/example && sleep infinity"]
          volumeMounts:
            - mountPath: /data
              name: pvc-storage
      volumes:
        - name: pvc-storage
          persistentVolumeClaim:
            claimName: test-pvc-opennebula

Apply the Deployment:

kubectl apply -f deployment.yaml
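
Optionally, wait for the rollout to finish before moving on:

$ kubectl rollout status deployment/pvc-failover-deployment
deployment "pvc-failover-deployment" successfully rolled out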

Step 3: Verify Pod and Node

Check that the pod is running and see which node it is on:

$ kubectl get pods -l app=pvc-failover -o wide
NAME                           READY   STATUS   NODE
pvc-failover-deployment-<XX>   1/1     Running  capone-workload-md-0-dmlwv-mvdpm

Check all nodes in the cluster:

$ kubectl get nodes
NAME                               STATUS   ROLES           AGE     VERSION
capone-workload-g4rrh              Ready    control-plane   10m     v1.31.4
capone-workload-md-0-dmlwv-mvdpm   Ready    <none>          8m5s    v1.31.4
capone-workload-md-0-dmlwv-t65nc   Ready    <none>          8m22s   v1.31.4

Step 4: Simulate Node Failure

Using the OpenNebula CLI, hard-terminate the VM that backs the node where the pod is running:

onevm terminate capone-workload-md-0-dmlwv-mvdpm --hard

Verify that the node has been removed from the cluster:

$ kubectl get nodes
NAME                               STATUS   ROLES           AGE     VERSION
capone-workload-g4rrh              Ready    control-plane   10m     v1.31.4
capone-workload-md-0-dmlwv-t65nc   Ready    <none>          8m54s   v1.31.4
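
Because the failed node has been removed, the Deployment creates a replacement pod on the surviving node. It stays in ContainerCreating until the stale VolumeAttachment is cleaned up (next step); kubectl describe on the new pod typically reports a Multi-Attach warning in the meantime. Illustrative output:

$ kubectl get pods -l app=pvc-failover -o wide
NAME                           READY   STATUS              NODE
pvc-failover-deployment-<YY>   0/1     ContainerCreating   capone-workload-md-0-dmlwv-t65nc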

Step 5: Handle VolumeAttachment

When a node fails, Kubernetes keeps the VolumeAttachment object, since it cannot confirm that the volume has been detached from that node. The attach/detach controller handles the cleanup automatically:

  • After ~6 minutes (the default force-detach timeout), it concludes the node is not coming back.
  • It issues a detach call to the CSI driver (the ControllerUnpublishVolume RPC).
  • The VolumeAttachment is deleted, and the PVC becomes available to attach to a new node.

This delay is intentional: it prevents volumes from being falsely detached from nodes that are only temporarily unreachable, for example during a partial network outage.
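
You can observe this from the cluster. The commands below are a sketch: the first lists the VolumeAttachment objects (attachment, volume, and attacher names are illustrative), and the second force-deletes the stale attachment if you would rather not wait for the ~6-minute timeout in a test environment:

$ kubectl get volumeattachments
NAME       ATTACHER            PV         NODE                               ATTACHED   AGE
csi-<XX>   <csi-driver-name>   pvc-<XX>   capone-workload-md-0-dmlwv-mvdpm   true       9m

$ kubectl delete volumeattachment csi-<XX>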

Step 6: Verify Pod Failover

Check that the pod has been rescheduled on a new node and is running:

$ kubectl get pods -l app=pvc-failover -o wide
NAME                           READY   STATUS   NODE
pvc-failover-deployment-<XX>   1/1     Running  capone-workload-md-0-dmlwv-t65nc
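
Finally, verify that the data survived the failover. Each pod appends its hostname to /data/example at startup, so after failover the file should contain one line per pod instance (illustrative output):

$ kubectl exec deploy/pvc-failover-deployment -- cat /data/example
pvc-failover-deployment-<XX>
pvc-failover-deployment-<YY>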