Istio istio-init fails on gVisor: maxOptLen 8KB limit + missing raw table block iptables-restore #12685
Description
Istio's istio-init container fails to set up iptables traffic interception in gVisor pods. There are two independent blockers:
- maxOptLen hard-coded to 8KB: setsockopt(IPT_SO_SET_REPLACE) silently returns EINVAL for nat table payloads exceeding 8192 bytes. Istio 1.28+ generates ~13KB payloads due to TCP+UDP exclusion rules. Fix: PR #12686 (fix(setsockopt): increase maxOptLen from 8KB to 32KB).
- raw table not implemented: Istio with ISTIO_META_DNS_CAPTURE=true (the default) generates rules for both the nat and raw tables. gVisor only implements the nat, mangle, and filter tables. iptables-restore fails with unable to initialize table 'raw'. The raw table rules use the CT target with --zone for conntrack zone isolation. No fix yet.
Even after fixing blocker 1, blocker 2 prevents Istio from working when DNS capture is enabled.
Environment
- gVisor version: release-20260302.0
- Architecture: arm64 (aarch64)
- Runtime: containerd with containerd-shim-runsc-v1
- runsc config: net-raw = "true" (default netstack, no --network=host)
- Kubernetes: v1.34.2 with gVisor RuntimeClass
- Istio version: 1.28.0 / 1.28.4
- iptables: v1.8.10 (legacy)
Blocker 1: maxOptLen (8KB limit on setsockopt)
Root cause: In pkg/sentry/syscalls/linux/sys_socket.go:
const maxOptLen = 1024 * 8 // 8192 bytes
When iptables-restore calls setsockopt(SOL_IP, IPT_SO_SET_REPLACE, ...) with a payload larger than 8192 bytes, gVisor returns EINVAL before the buffer reaches the netfilter layer, with no log message.
Data points:
- Istio 1.24 full nat ruleset (~30 rules, TCP-only exclusions): ~5-6KB, passes
- Existing istio_blob test fixture in gVisor: 5,688 bytes, passes
- Istio 1.28 full nat ruleset (~65 rules, TCP+UDP exclusions for 22 ports): ~13KB, fails
What Linux does: Linux limits setsockopt optval to INT_MAX (net/socket.c: do_sock_setsockopt). Other runtimes (runc, Kata) inherit this.
Fix: PR #12686 raises maxOptLen to 32KB.
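To make the effect of the limit concrete, here is a minimal sketch of a fixed upper bound applied to the option length. It is not the gVisor implementation: checkOptLen and errEINVAL are hypothetical names, and the 13KB figure is the approximate size reported above.

```go
package main

import (
	"errors"
	"fmt"
)

// errEINVAL stands in for the EINVAL errno in this sketch.
var errEINVAL = errors.New("EINVAL")

// checkOptLen is a hypothetical helper that mimics a fixed upper bound on the
// setsockopt option length: oversized payloads are rejected up front, before
// any netfilter code sees them.
func checkOptLen(optLen, limit int) error {
	if optLen < 0 || optLen > limit {
		return errEINVAL
	}
	return nil
}

func main() {
	const (
		oldLimit = 8 * 1024  // current maxOptLen
		newLimit = 32 * 1024 // value proposed in PR #12686
	)
	payloads := []struct {
		name string
		size int
	}{
		{"istio_blob test fixture", 5688},
		{"Istio 1.28 nat ruleset (approx.)", 13 * 1024},
	}
	for _, p := range payloads {
		fmt.Printf("%s (%d bytes): 8KB limit -> %v, 32KB limit -> %v\n",
			p.name, p.size,
			checkOptLen(p.size, oldLimit), checkOptLen(p.size, newLimit))
	}
}
```

Under these assumptions only the ~13KB payload is rejected by the 8KB bound, and both payloads fit within 32KB.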
Blocker 2: raw table not implemented
Root cause: gVisor's netfilter only implements 3 tables (pkg/sentry/socket/netfilter/netfilter.go):
var nameToID = map[string]stack.TableID{
	"nat":    stack.NATID,
	"mangle": stack.MangleID,
	"filter": stack.FilterID,
}
When Istio has ISTIO_META_DNS_CAPTURE=true (the default in most deployments), it generates rules for both nat and raw tables. The raw table rules use the CT target with --zone for conntrack zone isolation:
* raw
-N ISTIO_OUTPUT_DNS
-N ISTIO_INBOUND
-A OUTPUT -j ISTIO_OUTPUT_DNS
-A ISTIO_OUTPUT_DNS -p udp --dport 53 -m owner --uid-owner 1337 -j CT --zone 1
-A ISTIO_OUTPUT_DNS -p udp --sport 15053 -m owner --uid-owner 1337 -j CT --zone 2
-A ISTIO_OUTPUT_DNS -p udp --dport 53 -m owner --gid-owner 1337 -j CT --zone 1
-A ISTIO_OUTPUT_DNS -p udp --sport 15053 -m owner --gid-owner 1337 -j CT --zone 2
-A PREROUTING -j ISTIO_INBOUND
-A ISTIO_OUTPUT_DNS -p udp --dport 53 -d 169.254.1.1/32 -j CT --zone 2
-A ISTIO_INBOUND -p udp --sport 53 -s 169.254.1.1/32 -j CT --zone 1
COMMIT
iptables-restore passes both tables in a single call. When it reaches * raw, getsockopt(IPT_SO_GET_INFO, "raw") fails because gVisor doesn't recognize the table name, and iptables-restore reports unable to initialize table 'raw'.
Error:
iptables-restore v1.8.10 (legacy): iptables-restore: unable to initialize table 'raw'
Error occurred at line: 75
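The failure mode can be illustrated with a small stand-alone sketch of a name-to-ID lookup over the three known tables. TableID, lookupTable, and the error text are local stand-ins chosen for illustration, not the gVisor code paths.

```go
package main

import "fmt"

// TableID is a local stand-in for stack.TableID.
type TableID int

const (
	NATID TableID = iota
	MangleID
	FilterID
)

// nameToID mirrors the three-entry map quoted above.
var nameToID = map[string]TableID{
	"nat":    NATID,
	"mangle": MangleID,
	"filter": FilterID,
}

// lookupTable is a hypothetical helper: it fails for any table name that is
// not in the map, which is what happens for "raw".
func lookupTable(name string) (TableID, error) {
	id, ok := nameToID[name]
	if !ok {
		return 0, fmt.Errorf("table %q is not implemented", name)
	}
	return id, nil
}

func main() {
	for _, name := range []string{"nat", "raw"} {
		if _, err := lookupTable(name); err != nil {
			fmt.Println(name, "->", err)
			continue
		}
		fmt.Println(name, "-> ok")
	}
}
```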
What would be needed:
- Add RawID to the TableID enum in pkg/tcpip/stack/iptables.go
- Add "raw": stack.RawID to the nameToID map
- Create a default raw table with PREROUTING and OUTPUT hooks
- Wire raw table processing into CheckPrerouting() and CheckOutput() (before mangle, matching Linux's table traversal order)
- Implement the CT target (at minimum as a no-op that accepts the rules)
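The list above could be sketched roughly as follows. This is a self-contained illustration under stated assumptions, not a patch: every type here is a local stand-in, RawID and ctTarget do not exist in gVisor today, and the real netstack interfaces for tables and targets differ from this simplified shape. The per-hook order follows Linux, where raw runs before mangle and nat.

```go
package main

import "fmt"

// TableID is a local stand-in for stack.TableID.
type TableID int

const (
	NATID TableID = iota
	MangleID
	FilterID
	RawID // hypothetical addition, appended so existing values stay stable
)

// nameToID with the hypothetical "raw" entry added.
var nameToID = map[string]TableID{
	"nat":    NATID,
	"mangle": MangleID,
	"filter": FilterID,
	"raw":    RawID,
}

// Hook identifies the two netfilter hooks the raw table attaches to.
type Hook int

const (
	Prerouting Hook = iota
	Output
)

// traversalOrder lists the tables consulted at each hook, raw first.
var traversalOrder = map[Hook][]TableID{
	Prerouting: {RawID, MangleID, NATID},
	Output:     {RawID, MangleID, NATID, FilterID},
}

// Verdict is a simplified rule outcome for this sketch.
type Verdict int

const (
	Continue Verdict = iota // keep evaluating the following rules
	Accept
)

// ctTarget is a hypothetical no-op CT target: it stores the conntrack zone
// so that rules using -j CT --zone N are accepted, but never alters packets.
type ctTarget struct {
	Zone uint16
}

// Action always lets rule evaluation continue.
func (t ctTarget) Action() Verdict { return Continue }

func main() {
	fmt.Println("raw table ID:", nameToID["raw"])
	fmt.Println("PREROUTING order:", traversalOrder[Prerouting])
	fmt.Println("OUTPUT order:", traversalOrder[Output])
	fmt.Println("CT --zone 1 verdict:", ctTarget{Zone: 1}.Action())
}
```

A no-op CT target only lets iptables-restore succeed; it does not provide the zone isolation that Istio's DNS capture rules rely on, so real zone support would still need separate work.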
Steps to Reproduce
Prerequisite: net-raw = "true" in runsc.toml. Without it, the nat table is inaccessible (Table does not exist).
Reproducing Blocker 1 (maxOptLen)
Apply the full ~65-rule Istio nat ruleset via iptables-restore. With ISTIO_META_DNS_CAPTURE=false (so only nat table rules are generated), the setsockopt payload exceeds 8KB and fails:
iptables-restore: line 70 failed
Reproducing Blocker 2 (raw table)
Run pilot-agent istio-iptables with ISTIO_META_DNS_CAPTURE=true:
kubectl run istio-raw-test \
--image=docker.io/istio/proxyv2:1.28.0 \
--restart=Never \
--overrides='{
"spec": {
"runtimeClassName": "gvisor",
"nodeName": "<node-with-net-raw>",
"containers": [{
"name": "istio-raw-test",
"image": "docker.io/istio/proxyv2:1.28.0",
"command": ["pilot-agent"],
"args": ["istio-iptables", "-p", "15001", "-z", "15006", "-u", "1337", "-m", "REDIRECT", "-i", "10.0.0.0/8,172.16.0.0/12", "-x", "", "-b", "*", "-d", "15090,15021,15020", "-o", "11211,2181,25,9092", "--log_output_level=all:info"],
"env": [{"name": "ISTIO_META_DNS_CAPTURE", "value": "true"}],
"securityContext": {
"runAsUser": 0,
"capabilities": {"add": ["NET_ADMIN", "NET_RAW"]}
}
}]
}
}'
Fails with:
iptables-restore: unable to initialize table 'raw'
Summary
| Blocker | Root cause | Status | Impact |
|---|---|---|---|
| maxOptLen 8KB limit | sys_socket.go hard-coded constant | PR #12686 | Blocks nat table payloads > 8KB |
| raw table missing | Not in nameToID or TableID enum | Open | Blocks Istio DNS capture (CT --zone) |
Both blockers must be resolved for full Istio 1.28+ compatibility with gVisor.