PREQUEL-2025-0026

Istio XDS GRPC Failure
Severity: Low
Impact: 3/10
Mitigation: 6/10

Description

Envoy sidecars or Ambient **ztunnel** keep retrying the control-plane stream and log

```
XDS client connection error: gRPC connection error:status: Unknown,
message: "...", source: tcp connect error: Connection refused (os error 111)
```

or

```
... source: tcp connect error: deadline has elapsed
```

The proxies never reach “ADS stream established”, so no configuration, certificates, or policy updates are delivered until this is mitigated.
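The two error forms usually point at different causes, so it can help to count them separately before starting mitigation. The sketch below is a minimal illustration, not part of Istio: the function names `classify_xds_error` and `summarize_xds_errors` are hypothetical, and the interpretation in the comments is a general TCP heuristic (a refused connection means something answered with a reset, while an elapsed deadline means no answer arrived at all).

```bash
#!/usr/bin/env bash
# Classify one XDS error line by its likely failure mode.
# These labels are general TCP heuristics, not Istio-documented diagnostics.
classify_xds_error() {
  local line="$1"
  case "$line" in
    *"Connection refused"*)   echo "refused" ;;  # reset received: nothing listening at the target
    *"deadline has elapsed"*) echo "timeout" ;;  # no reply: packets probably dropped in transit
    *)                        echo "other" ;;
  esac
}

# Read log lines on stdin and print a count for each failure class.
summarize_xds_errors() {
  local refused=0 timeout=0 line
  while IFS= read -r line; do
    case "$(classify_xds_error "$line")" in
      refused) refused=$((refused + 1)) ;;
      timeout) timeout=$((timeout + 1)) ;;
    esac
  done
  echo "refused=${refused} timeout=${timeout}"
}
```

One possible use is to source the file and pipe proxy logs into it, for example `kubectl logs -n istio-system daemonset/ztunnel | summarize_xds_errors`. Mostly `refused` suggests checking whether Istiod has ready endpoints; mostly `timeout` suggests looking at NetworkPolicies, firewalls, or reverse-path filtering first.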

Mitigation

1. **Verify Istiod availability**

   ```bash
   # Istiod should list at least one ready endpoint
   kubectl -n istio-system get endpoints istiod -o wide
   # ztunnel runs as a DaemonSet; probe the XDS port from one of its pods
   kubectl exec -n istio-system daemonset/ztunnel \
     -- nc -vz istiod.istio-system.svc.cluster.local 15012
   ```

2. **Allow TCP 15012** everywhere: NetworkPolicies, security groups, on-node firewalls.

3. **Ambient only:** disable strict RP-filtering (or Cilium BPF masquerade) on all nodes:

   ```bash
   # Runtime change only; it does not persist across reboots
   sysctl -w net.ipv4.conf.all.rp_filter=0
   sysctl -w net.ipv4.conf.default.rp_filter=0
   ```

4. **Restart the proxies** after restoring connectivity:

   ```bash
   kubectl rollout restart daemonset/ztunnel -n istio-system
   istioctl proxy-status   # confirm ADS connected
   ```

5. **Upgrade to Istio 1.24 or later**, which includes fixes for exponential-backoff bugs that masked the root cause of #53696.
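For step 3, note that the kernel applies the higher of `conf/all/rp_filter` and the per-interface value, so an interface that already exists can stay in strict mode after the `all` and `default` keys are set to 0. The following sketch lists interfaces that are still strict. The function names are hypothetical, and the optional directory argument is an assumption added only so the function can be pointed at a test tree instead of `/proc`.

```bash
#!/usr/bin/env bash
# Map an rp_filter value to its kernel meaning (0 = off, 1 = strict, 2 = loose).
rp_filter_mode() {
  case "$1" in
    0) echo "off" ;;
    1) echo "strict" ;;
    2) echo "loose" ;;
    *) echo "unknown" ;;
  esac
}

# Print the name of every interface whose rp_filter is strict.
# The root defaults to /proc/sys/net/ipv4/conf; the argument is a
# hypothetical override for testing.
list_strict_rp_filter() {
  local root="${1:-/proc/sys/net/ipv4/conf}" f value
  for f in "$root"/*/rp_filter; do
    [ -r "$f" ] || continue          # skip unmatched globs and unreadable entries
    value="$(cat "$f")"
    if [ "$(rp_filter_mode "$value")" = "strict" ]; then
      basename "$(dirname "$f")"     # directory name is the interface (or all/default)
    fi
  done
}
```

Running `list_strict_rp_filter` with no arguments on a node prints nothing when no interface is in strict mode; any name it does print is a candidate for a per-interface `sysctl -w net.ipv4.conf.<name>.rp_filter=0`.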

References