Description
/kind bug
Title: Static IPs from addressesFromPools are not persisted on Talos VMs (CAPV passes IP via cloud-init/guestinfo, which Talos does not consume)
What steps did you take and what happened:
I deploy a Talos-based management/workload cluster on vSphere via Cluster API + CAPV and allocate static IPs via CAPI IPAM (an InClusterIPPool) referenced from VSphereMachineTemplate.spec.template.spec.network.devices[].addressesFromPools.
# IPAM pool
apiVersion: ipam.cluster.x-k8s.io/v1alpha2
kind: InClusterIPPool
metadata:
  name: pool
spec:
  addresses:
    - 192.168.163.150-192.168.163.160
  prefix: 24
  gateway: 192.168.163.2
---
# Infra: vSphere cluster + endpoint
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereCluster
metadata:
  name: cluster
spec:
  server: "192.168.163.158"
  controlPlaneEndpoint:
    host: 192.168.163.162
    port: 6443
  identityRef:
    kind: Secret
    name: vsphere-cluster
---
# CAPV machine template with addressesFromPools
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereMachineTemplate
metadata:
  name: master
spec:
  template:
    spec:
      cloneMode: fullClone
      datacenter: "Datacenter"
      datastore: "datastore1"
      folder: "testvm"
      resourcePool: "default"
      server: "192.168.163.158"
      template: "talos-cloud"
      network:
        devices:
          - networkName: "VM Network"
            # The key part: static addresses from IPAM
            addressesFromPools:
              - apiGroup: ipam.cluster.x-k8s.io
                kind: InClusterIPPool
                name: pool
---
# Talos control plane (Talos bootstrap provider)
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: TalosControlPlane
metadata:
  name: master
spec:
  version: v1.34.0
  replicas: 3
  infrastructureTemplate:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: VSphereMachineTemplate
    name: master
  controlPlaneConfig:
    controlplane:
      generateType: controlplane
      talosVersion: v1.11.1
      # Example network patch (two scenarios described below)
      strategicPatches:
        - |
          - op: replace
            path: /machine/network/interfaces
            value:
              - interface: eth0
                dhcp: false # Scenario A (static in OS)
                vip:
                  ip: 192.168.163.162
Scenario A (Talos dhcp: false, relying on IPAM static):
On first boot, the VM may come up and the control plane can start, but after the first reboot the interface on Talos has no IP assigned anymore; the node becomes unreachable and the VIP cannot bind. Controller logs include messages like “connect: no route to host” and “no addresses were found for node”.
The IP remains allocated in CAPI/IPAM/CAPV objects, but it is not present inside the guest OS after reboot.
Scenario B (Talos dhcp: true + IPAM static):
On first boot, sometimes both a DHCP lease and the IPAM-provided address appear; sometimes only the IPAM address appears.
After a reboot, the Talos node keeps only the DHCP address; the IPAM-provided static IP disappears from the interface.
In summary:
- Talos with dhcp: false → after a reboot, the configured IP is gone from the interface.
- Talos with dhcp: true → after a reboot, only the DHCP address remains; the IPAM static IP disappears.
What did you expect to happen:
Either of the following:
- When addressesFromPools is set, the IP address allocated by CAPI IPAM is consistently configured inside the guest OS and survives reboots for CAPV + Talos (i.e., the static address is reliably applied).
- If CAPV relies on cloud-init via guestinfo to apply network configuration (and the guest OS does not consume cloud-init), CAPV clearly surfaces a warning (event/condition) and documents that addressesFromPools will not configure networking in the guest OS for Talos, and that DHCP reservations or explicit static configuration in the Talos machine config must be used instead.
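For reference, the "explicit static configuration in the Talos machine config" alternative could look roughly like the fragment below. This is a minimal sketch, not a tested fix: the address 192.168.163.150/24 is simply the first address of the pool above and is used as a placeholder, and the default route via the pool gateway is an assumption about the network layout. Because a static address must be unique per node, a fragment like this would have to be applied per machine rather than through a single patch shared by all three control plane replicas.

```yaml
# Hypothetical per-node Talos machine config fragment (v1alpha1 schema).
# All values are placeholders taken from the pool defined in this report.
machine:
  network:
    interfaces:
      - interface: eth0
        dhcp: false
        addresses:
          - 192.168.163.150/24      # must be unique for each node
        routes:
          - network: 0.0.0.0/0      # assumed default route
            gateway: 192.168.163.2  # gateway from the InClusterIPPool
        vip:
          ip: 192.168.163.162       # control plane endpoint (control plane nodes only)
```

With this approach the address lives in the Talos configuration itself, so it does not depend on anything delivered through guestinfo, but it also bypasses the IPAM allocation, which is the limitation this issue is about.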
Anything else you would like to add:
From the CAPV code and docs, CAPV writes cloud-init user-data/metadata via vSphere guestinfo (guestinfo.userdata / guestinfo.metadata). That requires the guest OS to consume cloud-init to actually configure the network.
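To illustrate what this means in practice, the metadata CAPV renders into guestinfo.metadata follows cloud-init's network config version 2 layout. The fragment below is an approximate sketch, not output captured from this cluster: the hostname and MAC address are placeholders, the address is an example from the pool above, and the exact keys may differ between CAPV versions.

```yaml
# Approximate shape of the decoded guestinfo.metadata payload (illustrative only).
instance-id: "master-xxxxx"          # placeholder hostname
local-hostname: "master-xxxxx"
network:
  version: 2
  ethernets:
    id0:
      match:
        macaddress: "00:50:56:00:00:00"   # placeholder MAC
      set-name: "eth0"
      dhcp4: false
      addresses:
        - "192.168.163.150/24"      # address allocated from the InClusterIPPool
      gateway4: "192.168.163.2"
```

The allocated address exists only in this payload and in the CAPI/IPAM objects; nothing in the Talos machine config shown earlier carries it.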
Talos Linux configures networking via its own machine configuration (e.g., machine.network.interfaces), and by default runs DHCP on interfaces with link; it does not consume cloud-init metadata for networking. So the IP set by CAPV/IPAM in guestinfo is not applied/persisted by Talos across reboots.
Related upstream discussion notes that CAPV currently places the IP in guestinfo.metadata, which Talos does not read, so no static IP is applied inside the VM.
Environment:
- Cluster-api-provider-vsphere version: v1.14.0
- Kubernetes version (use kubectl version): local managed cluster 1.31.2
- OS (e.g. from /etc/os-release): Debian 11.6 for local managed cluster 1.31.2