# csi_failover
This example demonstrates a Deployment with a single pod that uses a ReadWriteOnce (RWO) PVC. If the node running the pod fails, the pod can be moved to another node; the PVC remains available and its data stays intact.
Define a PersistentVolumeClaim requesting 1Gi of storage:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc-opennebula
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: opennebula-fs
```
Apply the PVC:
```shell
kubectl apply -f pvc.yaml
```
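Optionally, check the claim's status; depending on the storage class's volume binding mode it may remain Pending until a pod consumes it (a quick sketch, using the PVC name from the manifest above):

```shell
# Shows the PVC phase (Pending or Bound) and its capacity once provisioned
kubectl get pvc test-pvc-opennebula
```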
Create a Deployment with a single replica that mounts the PVC:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pvc-failover-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pvc-failover
  template:
    metadata:
      labels:
        app: pvc-failover
    spec:
      containers:
        - name: app
          image: busybox
          command: ["sh", "-c", "echo $HOSTNAME >> /data/example && sleep infinity"]
          volumeMounts:
            - mountPath: /data
              name: pvc-storage
      volumes:
        - name: pvc-storage
          persistentVolumeClaim:
            claimName: test-pvc-opennebula
```
Apply the Deployment:
```shell
kubectl apply -f deployment.yaml
```
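To wait until the pod is up before continuing, one option is to block on the rollout (a minimal sketch, using the Deployment name from the manifest):

```shell
# Blocks until the Deployment reports its replica as available
kubectl rollout status deployment/pvc-failover-deployment
```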
Check that the pod is running and see which node it is on:
```console
$ kubectl get pods -l app=pvc-failover -o wide
NAME                           READY   STATUS    NODE
pvc-failover-deployment-<XX>   1/1     Running   capone-workload-md-0-dmlwv-mvdpm
```
Check all nodes in the cluster:
```console
$ kubectl get nodes
NAME                               STATUS   ROLES           AGE     VERSION
capone-workload-g4rrh              Ready    control-plane   10m     v1.31.4
capone-workload-md-0-dmlwv-mvdpm   Ready    <none>          8m5s    v1.31.4
capone-workload-md-0-dmlwv-t65nc   Ready    <none>          8m22s   v1.31.4
```
Terminate the VM backing the node where the pod is running, using the OpenNebula CLI:
```shell
onevm terminate capone-workload-md-0-dmlwv-mvdpm --hard
```
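While failover is in progress you can watch the pod being rescheduled; the replacement pod typically stays in ContainerCreating until the volume is detached from the failed node (a sketch; press Ctrl+C to stop watching):

```shell
# Streams pod status changes; the new pod cannot start until the old
# VolumeAttachment is cleaned up by the Attach/Detach controller
kubectl get pods -l app=pvc-failover -o wide -w
```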
Verify the node is removed:
```console
$ kubectl get nodes
NAME                               STATUS   ROLES           AGE     VERSION
capone-workload-g4rrh              Ready    control-plane   10m     v1.31.4
capone-workload-md-0-dmlwv-t65nc   Ready    <none>          8m54s   v1.31.4
```
When a node fails, Kubernetes keeps the VolumeAttachment object, since it must assume the volume is still attached to that node. The Attach/Detach controller handles recovery automatically:

- After ~6 minutes (the default timeout), it detects that the node is not recovering.
- It triggers the detach operation in the CSI driver.
- The VolumeAttachment is deleted, and the PVC becomes available to attach to a new node.

This behavior is intentional: it avoids false positives during partial network outages.
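You can observe this cleanup directly: the VolumeAttachment pointing at the terminated node should disappear once the timeout elapses (a sketch, assuming the CSI driver created one attachment per attached node):

```shell
# Lists CSI volume attachments and streams updates; the entry for the
# terminated node is deleted when the controller forces the detach
kubectl get volumeattachments -w
```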
Check that the pod has been rescheduled on a new node and is running:
```console
$ kubectl get pods -l app=pvc-failover -o wide
NAME                           READY   STATUS    NODE
pvc-failover-deployment-<XX>   1/1     Running   capone-workload-md-0-dmlwv-t65nc
```
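Finally, to confirm the data survived the failover, read the file written by the container command; it should contain one hostname line per pod that has run (a sketch, using kubectl's Deployment shorthand for exec):

```shell
# Prints /data/example from the currently running pod; expect two lines,
# one from the original pod and one from the rescheduled pod
kubectl exec deploy/pvc-failover-deployment -- cat /data/example
```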