Skip to main content

Tag: Kubernetes

Problems related to Kubernetes, such as pod failures, API errors, or scheduling issues

IDTitleDescriptionCategoryTechnologyTags
CRE-2025-0032
Low
Impact: 2/10
Mitigation: 4/10
Loki generates excessive logs when memcached service port name is incorrectLoki instances using memcached for caching may emit excessive warning or error logs when the configured`memcached_client` service port name does not match the actual Kubernetes service port. This does not cause a crash or failure, but it results in noisy logs and ineffective caching behavior.Observability ProblemslokiLokiMemcachedConfigurationServiceCacheKnown IssueKubernetesPublic
CRE-2025-0048
Low
Impact: 5/10
Mitigation: 3/10
Kubelet node not ready due to a DNS hostname resolution failureA Kubernetes worker node has entered the **NotReady** state.Kubernetes ProblemskubeletKubeletKubernetesDNSPublic
CRE-2025-0069
Medium
Impact: 6/10
Mitigation: 4/10
Kubernetes fsGroup ignored on NFS volumesPods that mount NFS volumes and set `securityContext.fsGroup` still have the directory owned by `root:root`. The kubelet does not chown the share, so non-root containers fail with \"Permission denied\".Kubernetes Storage ProblemsmanifestKubernetesNFSsecurityContext
CRE-2025-0071
High
Impact: 9/10
Mitigation: 8/10
CoreDNS unavailableCoreDNS deployment is unavailable or has no ready endpoints, indicating an imminent cluster-wide DNS outage.Kubernetes ProblemskubernetesKubernetesNetworkingDNSHigh Availability
CRE-2025-0112
Critical
Impact: 10/10
Mitigation: 4/10
AWS VPC CNI Node IP Pool Depletion CrisisCritical AWS VPC CNI node IP pool depletion detected causing cascading pod scheduling failures. This pattern indicates severe subnet IP address exhaustion combined with ENI allocation failures, leading to complete cluster networking breakdown. The failure sequence shows ipamd errors, kubelet scheduling failures, and controller-level pod creation blocks that render clusters unable to deploy new workloads, scale existing services, or recover from node failures. This represents one of the most severe Kubernetes infrastructure failures, often requiring immediate manual intervention including subnet expansion, secondary CIDR provisioning, or emergency workload termination to restore cluster functionality.VPC CNI Problemsaws-vpc-cniAWSEKSKubernetesNetworkingVPC CNIAWS CNIIP ExhaustionENI AllocationSubnet ExhaustionPod Scheduling FailureCluster ParalysisAWS API LimitsKnown ProblemCritical InfrastructureService OutageCascading FailureCapacity ExceededScalability IssueRevenue ImpactCompliance ViolationThreshold ExceededInfrastructurePublic
CRE-2025-0114
High
Impact: 0/10
Mitigation: 0/10
Nginx Ingress Controller rewritten URI has a zero lengthDetects rewrite error which leads to service unavailability. Wrong rewrite causes responses with HTTP code 500 or 400. This CRE detects empty rewrite.Load Balancer ProblemsnginxNginxReverse ProxyService OutageIngress ControllerNGINX IngressLoad BalancerKubernetes
CRE-2025-0121
Critical
Impact: 10/10
Mitigation: 7/10
NGINX Ingress Controller SSL Certificate FailureCritical NGINX Ingress Controller SSL certificate validation failure detected. This pattern indicates cascading SSL failures where certificate verification errors lead to upstream connection failures and service unavailability. The failure sequence shows SSL handshake failures, certificate verification errors, and resulting HTTP error responses that affect client connectivity.Load Balancer ProblemsnginxNginxIngress ControllerSSL CertificateTLS HandshakeCertificate VerificationLoad BalancerKubernetesSecurityHigh AvailabilityService Unavailability
CRE-2025-0122
Critical
Impact: 10/10
Mitigation: 6/10
AWS VPC CNI IP Address Exhaustion CrisisCritical AWS VPC CNI IP address exhaustion detected. This pattern indicates cascading failures where subnet IP exhaustion leads to ENI allocation failures, pod scheduling failures, and complete service unavailability. The failure sequence shows IP allocation errors, ENI attachment failures, and resulting pod startup failures that affect cluster scalability and workload deployment.Networking Problemsaws-vpc-cniAWSVPC CNIKubernetesNetworkingIP ExhaustionENI AllocationPod SchedulingCluster ScalingHigh AvailabilityService Unavailability
CRE-2025-0125
High
Impact: 9/10
Mitigation: 6/10
Kubelet EventedPLEG Panic Causes NodeFailureDetects a critical kubelet panic in the EventedPLEG subsystem under rapid pod launch pressure. When triggered, the node's kubelet crashes, the node becomes NotReady and all resident pods are evicted resulting in a full node-level outage until manual intervention.Kubernetes ProblemskubernetesKubernetesKubeletPanic