Tag: Infrastructure
Problems at the infrastructure level, such as resource outages or provisioning failures
ID | Title | Description | Category | Technology | Tags |
---|---|---|---|---|---|
CRE-2025-0038 (Low; Impact: 5/10; Mitigation: 3/10) | Loki fails to cache entries due to Memcached out-of-memory error | Grafana Loki may emit errors when attempting to write to a Memcached backend that has run out of available memory. This results in dropped index or query cache entries, which can degrade query performance but does not interrupt ingestion. | Observability Problems | loki | Loki, Memcached, Cache, Memory, Infrastructure, Known Issue, Public |
CRE-2025-0112 (Critical; Impact: 10/10; Mitigation: 4/10) | AWS VPC CNI Node IP Pool Depletion Crisis | Critical AWS VPC CNI node IP pool depletion detected causing cascading pod scheduling failures. This pattern indicates severe subnet IP address exhaustion combined with ENI allocation failures, leading to complete cluster networking breakdown. The failure sequence shows ipamd errors, kubelet scheduling failures, and controller-level pod creation blocks that render clusters unable to deploy new workloads, scale existing services, or recover from node failures. This represents one of the most severe Kubernetes infrastructure failures, often requiring immediate manual intervention including subnet expansion, secondary CIDR provisioning, or emergency workload termination to restore cluster functionality. | VPC CNI Problems | aws-vpc-cni | AWS, EKS, Kubernetes, Networking, VPC CNI, AWS CNI, IP Exhaustion, ENI Allocation, Subnet Exhaustion, Pod Scheduling Failure, Cluster Paralysis, AWS API Limits, Known Problem, Critical Infrastructure, Service Outage, Cascading Failure, Capacity Exceeded, Scalability Issue, Revenue Impact, Compliance Violation, Threshold Exceeded, Infrastructure, Public |