
CRE-2025-0112

AWS VPC CNI Node IP Pool Depletion Crisis

Severity: Critical
Impact: 10/10
Mitigation: 4/10


Description

Critical AWS VPC CNI node IP pool depletion detected, causing cascading pod scheduling failures. This pattern indicates severe subnet IP address exhaustion combined with ENI allocation failures, leading to a complete breakdown of cluster networking. The failure sequence shows ipamd errors, kubelet scheduling failures, and controller-level pod creation blocks that leave clusters unable to deploy new workloads, scale existing services, or recover from node failures.

This is one of the most severe Kubernetes infrastructure failures, often requiring immediate manual intervention such as subnet expansion, secondary CIDR provisioning, or emergency workload termination to restore cluster functionality.
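The failure sequence above can be confirmed by grepping CNI and kubelet logs for its signature errors. A minimal sketch follows; the sample log lines are illustrative stand-ins (exact messages vary by VPC CNI version), with `InsufficientFreeAddressesInSubnet` being the AWS API error typically surfaced by ipamd:

```shell
#!/usr/bin/env sh
# Sketch: scan CNI/kubelet logs for the depletion signature described above.
# The sample lines below are illustrative; exact wording varies by CNI version.
cat > /tmp/cni-sample.log <<'EOF'
ipamd: failed to allocate a private IP address: InsufficientFreeAddressesInSubnet
kubelet: FailedCreatePodSandBox: failed to assign an IP address to container
controller: FailedCreate: error creating pod: no IPs available
EOF

# Count lines matching either signature. Against a live cluster, pipe in
#   kubectl logs -n kube-system -l k8s-app=aws-node
# instead of the sample file.
grep -cE 'InsufficientFreeAddressesInSubnet|failed to assign an IP address' /tmp/cni-sample.log
```

A nonzero count on live aws-node logs is a strong indicator that the pattern described here is in progress.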


Cause

Primary subnet IP address pool exhaustion in the AWS VPC combined with ENI warm pool depletion during traffic spikes or cluster scaling events. Root causes include undersized subnet CIDR blocks, inefficient VPC CNI warm pool configuration, custom networking misconfigurations, and hitting AWS service limits (AddressLimitExceeded, NetworkInterfaceLimitExceeded).

Secondary factors include cluster autoscaler thrashing, batch-job IP consumption spikes, failed pod cleanup leaving IPs allocated, and insufficient capacity planning for workload growth. The problem is exacerbated by the VPC CNI's default warm pool behavior, which reserves significant IP overhead per node.
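The per-node overhead is easy to underestimate. A back-of-envelope sketch, assuming m5.large ENI limits (3 ENIs, 10 IPv4 addresses per ENI; check the AWS instance-type documentation for your fleet) and the default `WARM_ENI_TARGET=1`, which keeps one fully allocated spare ENI per node:

```shell
#!/usr/bin/env sh
# Back-of-envelope warm pool overhead. Instance limits are an assumption
# (m5.large: 3 ENIs, 10 IPv4 per ENI); substitute your own instance type.
ENIS_PER_NODE=3
IPS_PER_ENI=10
NODES=40

# Standard EKS max-pods formula: ENIs * (IPs per ENI - 1) + 2
echo "max pods per node: $((ENIS_PER_NODE * (IPS_PER_ENI - 1) + 2))"

# Default WARM_ENI_TARGET=1 keeps one fully allocated spare ENI per node.
echo "warm pool overhead: $((NODES * IPS_PER_ENI)) IPs across $NODES nodes"

# AWS reserves 5 addresses in every subnet.
echo "usable IPs in a /24: $((256 - 5))"
```

On these assumptions, a 40-node group of m5.large instances holds roughly 400 subnet IPs in reserve before scheduling a single pod, which alone exceeds the 251 usable addresses of a /24.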


Mitigation

IMMEDIATE EMERGENCY RESPONSE:

  • Identify affected subnets: `aws ec2 describe-subnets --filters "Name=vpc-id,Values=$(aws eks describe-cluster --name CLUSTER --query cluster.resourcesVpcConfig.vpcId --output text)" --query 'Subnets[*].[SubnetId,AvailableIpAddressCount,CidrBlock]' --output table`
  • Check ENI allocation status: `aws ec2 describe-network-interfaces --filters "Name=status,Values=in-use" --query 'length(NetworkInterfaces)'`
  • Scale down non-critical workloads immediately: `kubectl scale deployment NON_CRITICAL_APP --replicas=0`
  • Monitor VPC CNI daemon logs: `kubectl logs -n kube-system -l k8s-app=aws-node --follow`
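During triage, the `describe-subnets` output from the first bullet can be filtered for subnets nearly out of addresses. A sketch follows; the here-doc stands in for the command's `--output text` form (columns: SubnetId, AvailableIpAddressCount, CidrBlock), and the subnet IDs and counts are made up:

```shell
#!/usr/bin/env sh
# Sketch: flag subnets that are nearly out of addresses. The sample file
# stands in for `aws ec2 describe-subnets ... --output text`.
cat <<'EOF' > /tmp/subnets.txt
subnet-0aaa   250   10.0.1.0/24
subnet-0bbb   7     10.0.2.0/24
subnet-0ccc   13    10.0.3.0/24
EOF

# Column 2 is AvailableIpAddressCount; flag anything under 20 free IPs.
awk '$2 < 20 { print "LOW:", $1, "(" $2 " free)" }' /tmp/subnets.txt
```

Any subnet flagged here is a candidate for the secondary-CIDR expansion in the recovery steps below.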

RECOVERY ACTIONS (Execute in order):

  1. Associate secondary VPC CIDR: `aws ec2 associate-vpc-cidr-block --vpc-id VPC_ID --cidr-block 100.64.0.0/16`
  2. Create additional subnets with enhanced discovery tags:
     i=0
     for az in a b c; do
       i=$((i + 1))
       aws ec2 create-subnet --vpc-id VPC_ID --cidr-block 100.64.${i}.0/24 \
         --availability-zone us-west-2${az} \
         --tag-specifications 'ResourceType=subnet,Tags=[{Key=kubernetes.io/role/cni,Value=1},{Key=kubernetes.io/cluster/CLUSTER_NAME,Value=shared}]'
     done
  3. Enable prefix delegation for maximum IP efficiency: `kubectl set env daemonset aws-node -n kube-system ENABLE_PREFIX_DELEGATION=true WARM_PREFIX_TARGET=1`
  4. Optimize warm pool configuration: `kubectl set env daemonset aws-node -n kube-system WARM_IP_TARGET=3 MINIMUM_IP_TARGET=1 WARM_ENI_TARGET=1`
  5. Force VPC CNI restart to discover new subnets: `kubectl rollout restart daemonset/aws-node -n kube-system && kubectl rollout status daemonset/aws-node -n kube-system --timeout=300s`
  6. Verify recovery: `kubectl get pods --all-namespaces | grep Pending && kubectl get nodes -o wide`
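The verification step can be made quantitative by counting pods stuck in Pending. A sketch, where the sample data mimics `kubectl get pods --all-namespaces --no-headers` output (STATUS is column 4; pod names are hypothetical):

```shell
#!/usr/bin/env sh
# Sketch: count pods still Pending after recovery. The sample file mimics
# `kubectl get pods --all-namespaces --no-headers` output.
cat <<'EOF' > /tmp/pods.txt
default   web-1   0/1   Pending   0   5m
default   web-2   1/1   Running   0   5m
batch     job-9   0/1   Pending   0   2m
EOF

# Column 4 is STATUS; count pods that have not been scheduled.
awk '$4 == "Pending"' /tmp/pods.txt | wc -l
```

A count that keeps falling toward zero after the aws-node rollout indicates the new subnets are being used; a flat count suggests the CNI has not yet discovered them.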

PREVENTION AND MONITORING:

  • Implement subnet IP monitoring: CloudWatch alarm on `AvailableIpAddressCount < 50`
  • Enable Enhanced Subnet Discovery (VPC CNI v1.18.0+): `kubectl set env daemonset aws-node -n kube-system ENABLE_SUBNET_DISCOVERY=true`
  • Set up automated capacity planning with 6-month growth projections
  • Configure cluster autoscaler with IP-aware node provisioning
  • Implement emergency runbooks for IP exhaustion scenarios
  • Consider IPv6 adoption for long-term scalability: `kubectl set env daemonset aws-node -n kube-system ENABLE_IPv6=true`
  • Monitor warm pool efficiency: `kubectl get daemonset aws-node -n kube-system -o jsonpath='{.spec.template.spec.containers[0].env[?(@.name=="WARM_IP_TARGET")].value}'`
  • Set up automated secondary CIDR provisioning triggers
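The alarm logic from the first monitoring bullet can be sketched as a small shell check. The subnet IDs below are hypothetical; in production you would feed the function from `aws ec2 describe-subnets --query 'Subnets[*].[SubnetId,AvailableIpAddressCount]' --output text` on a schedule:

```shell
#!/usr/bin/env sh
# Sketch of the alarm logic: alert when a subnet's free IP count drops
# below the threshold. Subnet IDs here are hypothetical examples.
THRESHOLD=50

check_subnet() {  # args: subnet-id free-ip-count
  if [ "$2" -lt "$THRESHOLD" ]; then
    echo "ALERT: $1 has only $2 free IPs"
  else
    echo "OK: $1 ($2 free)"
  fi
}

check_subnet subnet-0abc 120
check_subnet subnet-0def 12
```

Wiring the ALERT branch to a pager, or publishing the count as a custom CloudWatch metric, turns this into the early-warning signal that prevents the emergency response above from ever being needed.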
