Tag: Kubernetes
Problems related to Kubernetes, such as pod failures, API errors, or scheduling issues
ID | Severity | Impact | Mitigation | Title | Description | Category | Technology | Tags |
---|---|---|---|---|---|---|---|---|
prequel-2025-0001 | Critical | 7/10 | 3/10 | Telepresence.io Traffic Manager Excessive Client-side Kubernetes API Throttling | One or more cluster components (kubectl sessions, operators, controllers, CI/CD jobs, etc.) hit the **default client-side rate limiter in client-go** (QPS = 5, Burst = 10). The client logs messages such as `Waited for <N>s due to client-side throttling, not priority and fairness` and delays each request until a token is available. Although the API server itself may still have spare capacity and Priority & Fairness queueing is not the bottleneck, end-user actions and controllers feel sluggish or appear to "stall". | Kubernetes Problems | traffic-manager | Kubernetes, Telepresence, Traffic Manager, API Throttling |
prequel-2025-0002 | Medium | 7/10 | 3/10 | Envoy metrics scraping failure with unexpected EOF | Prometheus fails to scrape Envoy metrics from Istio sidecars with an unexpected EOF error. This occurs when collecting metrics from services whose Kubernetes Service definition lacks proper protocol selection. | Service Mesh Monitoring | prometheus | Prometheus, Istio, Envoy, Metrics, Service Mesh, Kubernetes |
prequel-2025-0010 | High | 8/10 | 4/10 | Telepresence agent-injector certificate reload failure | Telepresence 2.5.x versions suffer from a critical TLS handshake error between the mutating webhook and the agent injector. When the certificate is rotated or regenerated, the agent-injector pod fails to reload the new certificate, causing all admission requests to fail with "remote error: tls: bad certificate". This effectively breaks the traffic manager's ability to inject the agent into workloads, preventing Telepresence from functioning properly. | Kubernetes Problems | traffic-manager | Known Problem, Telepresence, Kubernetes, Certificate |
prequel-2025-0020 | High | 8/10 | 2/10 | Too many replicas scheduled on the same node | 80% or more of a deployment's replica pods are scheduled on the same Kubernetes node. If that node shuts down or experiences a problem, the service will suffer an outage. | Fault Tolerance Problems | dru | Replica, Kubernetes |
prequel-2025-0027 | Low | 5/10 | 2/10 | Ingress Nginx Prefix Wildcard Error | The NGINX Ingress Controller rejects an Ingress manifest whose `pathType: Prefix` path contains a wildcard (`*`), logging `ingress: default/api prefix path shouldn't contain wildcards`. When the controller refuses the rule, it omits it from the generated `nginx.conf`; clients receive **404/502** responses even though the manifest was accepted by the Kubernetes API server. The problem appears most often after upgrading to ingress-nginx ≥ 1.8, which added stricter validation. | Ingress Problems | ingress-nginx | Nginx, Ingress, Kubernetes |
prequel-2025-0081 | Medium | 6/10 | 4/10 | ArgoCD RawExtension API Field Error with Datadog Operator | The ArgoCD application controller fails to process certain custom resources because it cannot find API fields in struct RawExtension. This commonly affects users deploying Datadog Operator CRDs, resulting in application sync errors for these resources. | Continuous Delivery Problems | argocd | ArgoCD, Kubernetes, Custom Resource, Datadog |
prequel-2025-0087 | Medium | 7/10 | 5/10 | Kyverno JMESPath query failure due to unknown key | Kyverno policies with JMESPath expressions fail because they reference keys that don't exist in the target resources. When a policy accesses an object property that isn't present in the resource being validated, evaluation aborts with an "Unknown key" error. | Policy Enforcement Issues | kyverno | Kyverno, Kubernetes, Policy Management |
prequel-2025-0090 | High | 8/10 | 5/10 | Karpenter version incompatible with Kubernetes version; Pods cannot be scheduled | Karpenter is unable to provision new nodes because the installed Karpenter version is not compatible with the cluster's Kubernetes version. This incompatibility causes validation errors in the nodeclass controller and prevents pods from being scheduled properly. | Kubernetes Provisioning Problems | karpenter | AWS, Karpenter, Kubernetes |
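For prequel-2025-0002, Istio selects the protocol for a port either from the port's name prefix (e.g. `http-`) or from the `appProtocol` field; without either, the sidecar may treat the port as raw TCP and Prometheus scrapes can terminate with "unexpected EOF". A hypothetical Service showing both conventions (the name `my-app` and port `15090`, Envoy's Prometheus endpoint, are assumptions for illustration):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app            # assumed workload name
spec:
  selector:
    app: my-app
  ports:
    - name: http-metrics  # Istio infers HTTP from the "http-" name prefix
      appProtocol: http   # or declare it explicitly via appProtocol
      port: 15090
      targetPort: 15090
```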
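The replica-pileup in prequel-2025-0020 is typically prevented with `topologySpreadConstraints` (or pod anti-affinity), which tells the scheduler to spread a deployment's pods across nodes instead of packing them together. A sketch, with the workload name and image assumed:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                 # assumed workload name
spec:
  replicas: 5
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      topologySpreadConstraints:
        - maxSkew: 1                          # at most one extra pod per node
          topologyKey: kubernetes.io/hostname # spread across individual nodes
          whenUnsatisfiable: DoNotSchedule    # refuse to schedule rather than pile up
          labelSelector:
            matchLabels:
              app: my-app
      containers:
        - name: my-app
          image: my-app:latest               # assumed image
```

With `maxSkew: 1` the scheduler keeps the pod count per node within one of the least-loaded node, so no single node can hold 80% of the replicas while others are available.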
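For prequel-2025-0027, the fix is usually to drop the wildcard, since `pathType: Prefix` already matches all sub-paths; rules that genuinely need NGINX wildcard or regex semantics should use `pathType: ImplementationSpecific` instead. A hypothetical manifest (resource and service names assumed) that passes the stricter ingress-nginx validation:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api              # assumed name
spec:
  ingressClassName: nginx
  rules:
    - http:
        paths:
          - path: /api         # "/api/*" here with pathType: Prefix is rejected
            pathType: Prefix   # Prefix already matches /api and everything below it
            backend:
              service:
                name: api      # assumed backend service
                port:
                  number: 80
```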