Description
What happened?
I have loadbalancer_apiserver_localhost enabled (which I need for other reasons) but calico_bpf_enabled disabled. With that combination, the kubespray deployment run fails when deploying MetalLB. Working back through the error stack, the root cause is an empty kubernetes-services-endpoint ConfigMap, which causes Calico CNI to use an unreachable service IP. As a result, pod deployments get stuck in ContainerCreating.
What did you expect to happen?
- The kubespray deployment run succeeds
- Pods create their containers successfully, with functional networking
How can we reproduce it (as minimally and precisely as possible)?
I've narrowed it down to the root cause and will submit a PR with a fix shortly. The common factors needed to reproduce are:
- Calico CNI in vxlan mode
- loadbalancer_apiserver_localhost enabled
- calico_bpf_enabled set to false
The problem was seen with v2.28.1 (a20891a) and is still present in master HEAD.
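For reference, the three conditions listed above might be expressed as inventory variables roughly as follows. This is only a sketch: loadbalancer_apiserver_localhost and calico_bpf_enabled are the variables named in this report, while kube_network_plugin, calico_network_backend, calico_vxlan_mode and calico_ipip_mode are my assumption of how vxlan mode is typically selected in Kubespray group_vars and are not copied from the actual (omitted) inventory.

```yaml
# Sketch of the variable combination that triggers the failure.
# Only the last two variables are taken from this report; the
# vxlan-related ones are assumed, not copied from the real inventory.
kube_network_plugin: calico

# Calico in vxlan mode (assumed variable names and values)
calico_network_backend: vxlan
calico_vxlan_mode: Always
calico_ipip_mode: Never

# Named in this report
loadbalancer_apiserver_localhost: true
calico_bpf_enabled: false
```

With any other value of these settings I have not verified whether the problem still occurs, so treat this as the known-bad combination rather than a minimal one.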
OS
Ubuntu 24
Version of Ansible
ansible [core 2.16.14]
config file = /home/chricker/kubespray/ansible.cfg
configured module search path = ['/home/chricker/kubespray/library']
ansible python module location = /home/chricker/kubespray/venv_ansible/lib/python3.12/site-packages/ansible
ansible collection location = /home/chricker/.ansible/collections:/usr/share/ansible/collections
executable location = /home/chricker/kubespray/venv_ansible/bin/ansible
python version = 3.12.3 (main, Aug 14 2025, 17:47:21) [GCC 13.3.0] (/home/chricker/kubespray/venv_ansible/bin/python3)
jinja version = 3.1.6
libyaml = True
Version of Python
Python 3.12.3
Version of Kubespray (commit)
a20891a (v2.28.1)
Network plugin used
calico
Full inventory with variables
omitted for policy reasons
Command used to invoke ansible
cd /home/chricker/kubespray && source venv_ansible/bin/activate && ansible-playbook -i /path/to/hosts.ini cluster.yml --become
Output of ansible run
TASK [kubernetes-apps/metallb : Kubernetes Apps | Wait for MetalLB controller to be running] ***
task path: /home/chricker/kubespray/roles/kubernetes-apps/metallb/tasks/main.yml:35
fatal: [test-k8s-dev1-01.test.domain]: FAILED! => {"changed": true, "cmd": ["/usr/local/bin/kubectl", "rollout", "status", "-n", "metallb-system", "deployment", "-l", "app=metallb,component=controller", "--timeout=2m"], "delta": "0:00:00.089180", "end": "2025-09-28 04:51:33.042994", "msg": "non-zero return code", "rc": 1, "start": "2025-09-28 04:51:32.953814", "stderr": "error: deployment \"controller\" exceeded its progress deadline", "stderr_lines": ["error: deployment \"controller\" exceeded its progress deadline"], "stdout": "", "stdout_lines": []}
Anything else we need to know
No response