Istio istio-init fails on gVisor: maxOptLen 8KB limit + missing raw table block iptables-restore #12685

@a7i

Description

Istio's istio-init container fails to set up iptables traffic interception in gVisor pods. There are two independent blockers:

  1. maxOptLen hard-coded to 8KB: setsockopt(IPT_SO_SET_REPLACE) silently returns EINVAL for nat table payloads exceeding 8192 bytes. Istio 1.28+ generates ~13KB payloads due to TCP+UDP exclusion rules. Fix: PR #12686, "fix(setsockopt): increase maxOptLen from 8KB to 32KB".

  2. raw table not implemented: Istio with ISTIO_META_DNS_CAPTURE=true (the default) generates rules for both the nat and raw tables. gVisor only implements the nat, mangle, and filter tables, so iptables-restore fails with unable to initialize table 'raw'. The raw table rules use the CT target with --zone for conntrack zone isolation. No fix yet.

Even after fixing blocker 1, blocker 2 prevents Istio from working when DNS capture is enabled.

Environment

  • gVisor version: release-20260302.0
  • Architecture: arm64 (aarch64)
  • Runtime: containerd with containerd-shim-runsc-v1
  • runsc config: net-raw = "true" (default netstack, no --network=host)
  • Kubernetes: v1.34.2 with gVisor RuntimeClass
  • Istio version: 1.28.0 / 1.28.4
  • iptables: v1.8.10 (legacy)

Blocker 1: maxOptLen (8KB limit on setsockopt)

Root cause: In pkg/sentry/syscalls/linux/sys_socket.go:

const maxOptLen = 1024 * 8 // 8192 bytes

When iptables-restore calls setsockopt(SOL_IP, IPT_SO_SET_REPLACE, ...) with a payload larger than 8192 bytes, gVisor returns EINVAL before the buffer reaches the netfilter layer, with no log message.

Data points:

  • Istio 1.24 full nat ruleset (~30 rules, TCP-only exclusions): ~5-6KB — passes
  • Existing istio_blob test fixture in gVisor: 5,688 bytes — passes
  • Istio 1.28 full nat ruleset (~65 rules, TCP+UDP exclusions for 22 ports): ~13KB — fails

What Linux does: Linux limits setsockopt optval to INT_MAX (net/socket.c: do_sock_setsockopt). Other runtimes (runc, Kata) run containers on a Linux kernel and so inherit this limit.

Fix: PR #12686 raises maxOptLen to 32KB.
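
To make the effect of the limit concrete, here is a minimal sketch of a length gate of this kind, applied to the data points above. It is a model only, not gVisor's code: checkOptLen and errInvalid are hypothetical names, and the 13KB figure is the approximate size quoted in this issue.

```go
package main

import (
	"errors"
	"fmt"
)

// errInvalid stands in for EINVAL in this sketch (hypothetical name).
var errInvalid = errors.New("EINVAL")

// checkOptLen models a length gate: any payload longer than limit is
// rejected before it would reach the netfilter layer.
func checkOptLen(optLen, limit int) error {
	if optLen < 0 || optLen > limit {
		return errInvalid
	}
	return nil
}

func main() {
	const oldLimit = 8 * 1024  // 8192 bytes, the current constant
	const newLimit = 32 * 1024 // 32768 bytes, the value proposed in PR #12686

	payloads := []struct {
		name string
		size int
	}{
		{"istio_blob test fixture", 5688},
		{"Istio 1.28 nat (approx.)", 13 * 1024},
	}
	for _, p := range payloads {
		fmt.Printf("%-26s %6d bytes  old limit: %v  new limit: %v\n",
			p.name, p.size,
			checkOptLen(p.size, oldLimit),
			checkOptLen(p.size, newLimit))
	}
}
```

Under these assumptions the fixture passes both limits, while the ~13KB payload is rejected at 8KB and accepted at 32KB.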

Blocker 2: raw table not implemented

Root cause: gVisor's netfilter only implements 3 tables (pkg/sentry/socket/netfilter/netfilter.go):

var nameToID = map[string]stack.TableID{
    "nat":    stack.NATID,
    "mangle": stack.MangleID,
    "filter": stack.FilterID,
}

When Istio has ISTIO_META_DNS_CAPTURE=true (the default in most deployments), it generates rules for both nat and raw tables. The raw table rules use the CT target with --zone for conntrack zone isolation:

* raw
-N ISTIO_OUTPUT_DNS
-N ISTIO_INBOUND
-A OUTPUT -j ISTIO_OUTPUT_DNS
-A ISTIO_OUTPUT_DNS -p udp --dport 53 -m owner --uid-owner 1337 -j CT --zone 1
-A ISTIO_OUTPUT_DNS -p udp --sport 15053 -m owner --uid-owner 1337 -j CT --zone 2
-A ISTIO_OUTPUT_DNS -p udp --dport 53 -m owner --gid-owner 1337 -j CT --zone 1
-A ISTIO_OUTPUT_DNS -p udp --sport 15053 -m owner --gid-owner 1337 -j CT --zone 2
-A PREROUTING -j ISTIO_INBOUND
-A ISTIO_OUTPUT_DNS -p udp --dport 53 -d 169.254.1.1/32 -j CT --zone 2
-A ISTIO_INBOUND -p udp --sport 53 -s 169.254.1.1/32 -j CT --zone 1
COMMIT

iptables-restore passes both tables in a single call. When it reaches * raw, getsockopt(IPT_SO_GET_INFO, "raw") fails because gVisor doesn't recognize the table name, and iptables-restore reports unable to initialize table 'raw'.

Error:

iptables-restore v1.8.10 (legacy): iptables-restore: unable to initialize table 'raw'
Error occurred at line: 75
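
As a possible pre-flight check, the sketch below scans iptables-restore input for table headers and reports any table outside the three gVisor currently implements. This is an illustration only: unsupportedTables and supportedTables are hypothetical helpers, not part of gVisor or Istio. The parser accepts both "*raw" and the "* raw" spelling shown above.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// supportedTables mirrors the three names in nameToID as quoted in this
// issue (hypothetical helper, not gVisor code).
var supportedTables = map[string]bool{"nat": true, "mangle": true, "filter": true}

// unsupportedTables returns the table names found in "*<table>" header
// lines that are not in supportedTables, in order of first appearance.
func unsupportedTables(ruleset string) []string {
	var missing []string
	seen := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(ruleset))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "*") {
			continue // not a table header
		}
		name := strings.TrimSpace(strings.TrimPrefix(line, "*"))
		if name == "" || supportedTables[name] || seen[name] {
			continue
		}
		seen[name] = true
		missing = append(missing, name)
	}
	return missing
}

func main() {
	ruleset := "* nat\n-N ISTIO_OUTPUT\nCOMMIT\n* raw\n-N ISTIO_OUTPUT_DNS\nCOMMIT\n"
	fmt.Println("tables not implemented:", unsupportedTables(ruleset))
}
```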

What would be needed:

  • Add RawID to the TableID enum in pkg/tcpip/stack/iptables.go
  • Add "raw": stack.RawID to the nameToID map
  • Create a default raw table with PREROUTING and OUTPUT hooks
  • Wire raw table processing into CheckPrerouting() and CheckOutput() (before mangle, matching Linux's table traversal order)
  • Implement the CT target (at minimum as a no-op that accepts the rules)
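
The steps above can be sketched in a self-contained model. Everything here is a local stand-in rather than gVisor's actual types: TableID, Hook, Verdict, and ctTarget are defined for illustration, and the real stack.TableID and target interfaces differ. The traversal order follows Linux's hook priorities (raw before mangle before nat), and the CT target is modeled as a no-op that continues to the next rule.

```go
package main

import "fmt"

// TableID is a local stand-in for gVisor's stack.TableID.
type TableID int

const (
	RawID TableID = iota // proposed addition
	MangleID
	NATID
	FilterID
)

func (t TableID) String() string {
	return [...]string{"raw", "mangle", "nat", "filter"}[t]
}

// nameToID extended with the proposed "raw" entry.
var nameToID = map[string]TableID{
	"raw":    RawID,
	"mangle": MangleID,
	"nat":    NATID,
	"filter": FilterID,
}

// Hook identifies a netfilter hook; only the two that raw uses are modeled.
type Hook int

const (
	Prerouting Hook = iota
	Output
)

// traversal lists the tables consulted at each hook, in Linux priority order.
var traversal = map[Hook][]TableID{
	Prerouting: {RawID, MangleID, NATID},
	Output:     {RawID, MangleID, NATID, FilterID},
}

// Verdict is a simplified rule outcome.
type Verdict int

const (
	Continue Verdict = iota // fall through to the next rule
	Accept
)

// ctTarget models "-j CT --zone N" as a no-op: the zone is recorded but
// ignored, and evaluation continues with the next rule.
type ctTarget struct{ Zone uint16 }

func (ctTarget) Action() Verdict { return Continue }

func main() {
	id, ok := nameToID["raw"]
	fmt.Println("raw lookup:", id, ok)
	fmt.Println("PREROUTING order:", traversal[Prerouting])
	fmt.Println("OUTPUT order:", traversal[Output])
	fmt.Println("CT zone 1 continues:", ctTarget{Zone: 1}.Action() == Continue)
}
```

A no-op CT target would let the rules load, but it would not provide the conntrack zone isolation that Istio's DNS capture relies on. Whether that is acceptable depends on how gVisor's connection tracking treats the redirected DNS flows.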

Steps to Reproduce

Prerequisite: net-raw = "true" in runsc.toml. Without it, the nat table is inaccessible (Table does not exist).

Reproducing Blocker 1 (maxOptLen)

Apply the full ~65-rule Istio nat ruleset via iptables-restore. With ISTIO_META_DNS_CAPTURE=false (so only nat table rules are generated), the setsockopt payload exceeds 8KB and fails:

iptables-restore: line 70 failed

Reproducing Blocker 2 (raw table)

Run pilot-agent istio-iptables with ISTIO_META_DNS_CAPTURE=true:

kubectl run istio-raw-test \
  --image=docker.io/istio/proxyv2:1.28.0 \
  --restart=Never \
  --overrides='{
    "spec": {
      "runtimeClassName": "gvisor",
      "nodeName": "<node-with-net-raw>",
      "containers": [{
        "name": "istio-raw-test",
        "image": "docker.io/istio/proxyv2:1.28.0",
        "command": ["pilot-agent"],
        "args": ["istio-iptables", "-p", "15001", "-z", "15006", "-u", "1337", "-m", "REDIRECT", "-i", "10.0.0.0/8,172.16.0.0/12", "-x", "", "-b", "*", "-d", "15090,15021,15020", "-o", "11211,2181,25,9092", "--log_output_level=all:info"],
        "env": [{"name": "ISTIO_META_DNS_CAPTURE", "value": "true"}],
        "securityContext": {
          "runAsUser": 0,
          "capabilities": {"add": ["NET_ADMIN", "NET_RAW"]}
        }
      }]
    }
  }'

Fails with:

iptables-restore: unable to initialize table 'raw'

Summary

| Blocker | Root cause | Status | Impact |
|---|---|---|---|
| maxOptLen 8KB limit | sys_socket.go hard-coded constant | PR #12686 | Blocks nat table payloads > 8KB |
| raw table missing | Not in nameToID or TableID enum | Open | Blocks Istio DNS capture (CT --zone) |

Both blockers must be resolved for full Istio 1.28+ compatibility with gVisor.
