Tag: Kubernetes
Problems related to Kubernetes, such as pod failures, API errors, or scheduling issues
ID | Severity | Impact | Mitigation | Title | Description | Category | Technology | Tags |
---|---|---|---|---|---|---|---|---|
prequel-2025-0001 | Critical | 7/10 | 3/10 | Telepresence.io Traffic Manager Excessive Client-side Kubernetes API Throttling | One or more cluster components (kubectl sessions, operators, controllers, CI/CD jobs, etc.) hit the **default client-side rate limiter in client-go** (QPS = 5, Burst = 10). The client logs messages such as `Waited for <N>s due to client-side throttling, not priority and fairness` and delays each request until a token is available. Although the API server itself may still have spare capacity and Priority & Fairness queueing is not the bottleneck, end-user actions and controllers feel sluggish or appear to "stall". | Kubernetes Problems | traffic-manager | Kubernetes, Telepresence, Traffic Manager, API Throttling |
prequel-2025-0002 | Medium | 7/10 | 3/10 | Envoy metrics scraping failure with unexpected EOF | Prometheus fails to scrape Envoy metrics from Istio sidecars with an unexpected EOF error. This occurs when collecting metrics from services whose Kubernetes Service definition lacks proper protocol selection. | Service Mesh Monitoring | prometheus | Prometheus, Istio, Envoy, Metrics, Service Mesh, Kubernetes |
prequel-2025-0010 | High | 8/10 | 4/10 | Telepresence agent-injector certificate reload failure | Telepresence 2.5.x versions suffer from a critical TLS handshake error between the mutating webhook and the agent injector. When the certificate is rotated or regenerated, the agent-injector pod fails to reload the new certificate, causing all admission requests to fail with "remote error: tls: bad certificate". This effectively breaks the traffic manager's ability to inject the agent into workloads, preventing Telepresence from functioning properly. | Kubernetes Problems | traffic-manager | Known Problem, Telepresence, Kubernetes, Certificate |
prequel-2025-0020 | High | 8/10 | 2/10 | Too many replicas scheduled on the same node | 80% or more of a deployment's replica pods are scheduled on the same Kubernetes node. If that node shuts down or experiences a problem, the service will suffer an outage. | Fault Tolerance Problems | dru | Replica, Kubernetes |
prequel-2025-0027 | Low | 5/10 | 2/10 | Ingress Nginx Prefix Wildcard Error | The NGINX Ingress Controller rejects an Ingress manifest whose `pathType: Prefix` path contains a wildcard (`*`), logging `ingress: default/api prefix path shouldn't contain wildcards`. When the controller refuses the rule, it omits it from the generated `nginx.conf`; clients receive **404/502** responses even though the manifest was accepted by the Kubernetes API server. The problem appears most often after upgrading to ingress-nginx ≥ 1.8, which added stricter validation. | Ingress Problems | ingress-nginx | Nginx, Ingress, Kubernetes |
prequel-2025-0081 | Medium | 6/10 | 4/10 | ArgoCD RawExtension API Field Error with Datadog Operator | The ArgoCD application controller fails to process certain custom resources because it cannot find API fields in struct RawExtension. This commonly affects users deploying Datadog Operator CRDs, resulting in application sync errors for these resources. | Continuous Delivery Problems | argocd | ArgoCD, Kubernetes, Custom Resource, Datadog |
prequel-2025-0087 | Medium | 7/10 | 5/10 | Kyverno JMESPath query failure due to unknown key | Kyverno policies with JMESPath expressions fail because they reference keys that don't exist in the target resources. When a policy accesses an object property that isn't present in the resource being validated, evaluation aborts with an "Unknown key" error. | Policy Enforcement Issues | kyverno | Kyverno, Kubernetes, Policy Management |
prequel-2025-0090 | High | 8/10 | 5/10 | Karpenter version incompatible with Kubernetes version; Pods cannot be scheduled | Karpenter is unable to provision new nodes because the installed Karpenter version is not compatible with the cluster's Kubernetes version. This incompatibility causes validation errors in the nodeclass controller and prevents pods from being scheduled properly. | Kubernetes Provisioning Problems | karpenter | AWS, Karpenter, Kubernetes |
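For prequel-2025-0002, Istio selects the protocol for a port either from the port's name prefix (e.g. `http-`) or from the `appProtocol` field; without either, the sidecar may treat the port as raw TCP and Prometheus scrapes can terminate with "unexpected EOF". A hypothetical Service showing both conventions (the name `my-app` and port `15090`, Envoy's Prometheus endpoint, are assumptions for illustration):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app            # assumed workload name
spec:
  selector:
    app: my-app
  ports:
    - name: http-metrics  # Istio infers HTTP from the "http-" name prefix
      appProtocol: http   # or declare it explicitly via appProtocol
      port: 15090
      targetPort: 15090
```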
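The replica-pileup in prequel-2025-0020 is typically prevented with `topologySpreadConstraints` (or pod anti-affinity), which tells the scheduler to spread a deployment's pods across nodes instead of packing them together. A sketch, with the workload name and image assumed:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                 # assumed workload name
spec:
  replicas: 5
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      topologySpreadConstraints:
        - maxSkew: 1                          # at most one extra pod per node
          topologyKey: kubernetes.io/hostname # spread across individual nodes
          whenUnsatisfiable: DoNotSchedule    # refuse to schedule rather than pile up
          labelSelector:
            matchLabels:
              app: my-app
      containers:
        - name: my-app
          image: my-app:latest               # assumed image
```

With `maxSkew: 1` the scheduler keeps the pod count per node within one of the least-loaded node, so no single node can hold 80% of the replicas while others are available.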
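For prequel-2025-0027, the fix is usually to drop the wildcard, since `pathType: Prefix` already matches all sub-paths; rules that genuinely need NGINX wildcard or regex semantics should use `pathType: ImplementationSpecific` instead. A hypothetical manifest (resource and service names assumed) that passes the stricter ingress-nginx validation:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api              # assumed name
spec:
  ingressClassName: nginx
  rules:
    - http:
        paths:
          - path: /api         # "/api/*" here with pathType: Prefix is rejected
            pathType: Prefix   # Prefix already matches /api and everything below it
            backend:
              service:
                name: api      # assumed backend service
                port:
                  number: 80
```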