Skip to main content

CRE-2025-0125

Kubelet EventedPLEG Panic Causes NodeFailureHigh
Impact: 9/10
Mitigation: 6/10

CRE-2025-0125View on GitHub

Description

Detects a critical kubelet panic in the EventedPLEG subsystem under rapid pod launch pressure. When triggered, the node's kubelet crashes, the node becomes NotReady and all resident pods are evicted resulting in a full node-level outage until manual intervention.


Cause

  • High pod creation rate on a constrained node
  • EventedPLEG feature gate enabled (default in v1.33+)
  • Concurrent CRI events exceeding kubelet's goroutine handling

Mitigation

IMMEDIATE ACTIONS:

  • Restart kubelet on the affected node
  • Distribute load across nodes

RECOVERY:

  • Disable `EventedPLEG` via feature gate if persistent
  • Monitor kubelet logs for panics related to evented.go

PREVENTION:

  • Test EventedPLEG in staging before enabling in production
  • Use pod admission control to slow pod floods on small nodes

References