Tag: Kubernetes

Problems related to Kubernetes, such as pod failures, API errors, or scheduling issues

ID | Title | Description | Category | Technology | Tags
prequel-2025-0001
Critical
Impact: 7/10
Mitigation: 3/10
Title: Telepresence.io Traffic Manager Excessive Client-side Kubernetes API Throttling
Description: One or more cluster components (kubectl sessions, operators, controllers, CI/CD jobs, etc.) hit the **default client-side rate-limiter in client-go** (QPS = 5, Burst = 10). The client logs messages such as `Waited for <N>s due to client-side throttling, not priority and fairness` and delays each request until a token is available. Although the API server itself may still have spare capacity, and Priority & Fairness queueing is not the bottleneck, end-user actions and controllers feel sluggish or appear to “stall”.
Category: Kubernetes Problems
Technology: traffic-manager
Tags: Kubernetes, Telepresence, Traffic Manager, API Throttling
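The throttling described in this entry can be illustrated with a small token-bucket simulation. This is a minimal sketch, not client-go's actual implementation: the function name `simulate_burst` is hypothetical, and it assumes a single caller issuing requests back to back with no idle time in between. Only the QPS = 5 and Burst = 10 defaults come from the description above.

```python
def simulate_burst(requests, qps=5.0, burst=10):
    """Return the wait (in seconds) each back-to-back request incurs.

    Simplified model (assumption): the bucket starts full with `burst`
    tokens and refills at `qps` tokens per second. The caller never
    idles, so the only refill happens while it is blocked waiting.
    """
    tokens = float(burst)
    waits = []
    for _ in range(requests):
        if tokens >= 1.0:
            # A token is available: the request proceeds immediately.
            tokens -= 1.0
            waits.append(0.0)
        else:
            # Block until the missing fraction of a token has refilled.
            wait = (1.0 - tokens) / qps
            tokens = 0.0
            waits.append(wait)
    return waits


if __name__ == "__main__":
    waits = simulate_burst(60)
    print(f"requests delayed: {sum(1 for w in waits if w > 0)}")
    print(f"total time spent waiting: {sum(waits):.1f}s")
```

Under this model the first 10 requests pass without delay and every later request waits for the bucket to refill, which is why a controller that lists many resources at once can appear stalled even when the API server is idle.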
prequel-2025-0002
Medium
Impact: 7/10
Mitigation: 3/10
Title: Envoy metrics scraping failure with unexpected EOF
Description: Prometheus fails to scrape and write Envoy metrics from Istio sidecars because of an unexpected EOF error. This occurs when collecting metrics from services whose Kubernetes Service definition lacks proper protocol selection.
Category: Service Mesh Monitoring
Technology: prometheus
Tags: Prometheus, Istio, Envoy, Metrics, Service Mesh, Kubernetes
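One way to look for the missing protocol selection mentioned in this entry is to scan a Service manifest for ports that declare neither an `appProtocol` nor a protocol-prefixed port name (for example `http-web`). The sketch below is illustrative only: the function name `ports_without_protocol` is hypothetical, the manifest is assumed to be already parsed into a Python dict, and the `KNOWN_PROTOCOLS` set is an assumed subset rather than an authoritative list of what Istio recognizes.

```python
# Assumed subset of protocol names usable as a port-name prefix.
KNOWN_PROTOCOLS = {"http", "http2", "https", "grpc", "tcp", "tls",
                   "mongo", "mysql", "redis"}


def ports_without_protocol(service):
    """Return the ports of a Service dict that lack explicit protocol selection."""
    missing = []
    for port in service.get("spec", {}).get("ports", []):
        if port.get("appProtocol"):
            continue  # protocol declared explicitly via appProtocol
        name = port.get("name", "")
        prefix = name.split("-", 1)[0].lower()
        if prefix not in KNOWN_PROTOCOLS:
            # Identify the port by name, or by number when it is unnamed.
            missing.append(name or str(port.get("port")))
    return missing


if __name__ == "__main__":
    svc = {"spec": {"ports": [
        {"name": "http-web", "port": 80},
        {"name": "metrics", "port": 9090},
        {"port": 8080, "appProtocol": "http"},
    ]}}
    print(ports_without_protocol(svc))
```

A port flagged by this check is a candidate for renaming or for adding `appProtocol`; it does not by itself prove that the EOF error originates there.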
prequel-2025-0010
High
Impact: 8/10
Mitigation: 4/10
Title: Telepresence agent-injector certificate reload failure
Description: Telepresence 2.5.x versions suffer from a critical TLS handshake error between the mutating webhook and the agent injector. When the certificate is rotated or regenerated, the agent-injector pod fails to reload the new certificate, causing all admission requests to fail with "remote error: tls: bad certificate". This effectively breaks the traffic manager's ability to inject the agent into workloads, preventing Telepresence from functioning properly.
Category: Kubernetes Problems
Technology: traffic-manager
Tags: Known Problem, Telepresence, Kubernetes, Certificate
prequel-2025-0020
High
Impact: 8/10
Mitigation: 2/10
Title: Too many replicas scheduled on the same node
Description: 80% or more of a deployment's replica pods are scheduled on the same Kubernetes node. If this node shuts down or experiences a problem, the service will experience an outage.
Category: Fault Tolerance Problems
Technology: dru
Tags: Replica, Kubernetes
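The 80% condition in this entry can be sketched as a simple share calculation over the nodes hosting a deployment's pods. The function names and the input shape (one node name per replica pod) are assumptions for illustration; how the pod-to-node mapping is collected from the cluster is out of scope here.

```python
from collections import Counter


def max_node_share(pod_nodes):
    """Return (node, share) for the node hosting the most replicas."""
    counts = Counter(pod_nodes)
    node, count = counts.most_common(1)[0]
    return node, count / len(pod_nodes)


def is_concentrated(pod_nodes, threshold=0.8):
    """True when a single node hosts at least `threshold` of the replicas."""
    if not pod_nodes:
        return False  # no replicas scheduled, nothing to evaluate
    _, share = max_node_share(pod_nodes)
    return share >= threshold


if __name__ == "__main__":
    nodes = ["node-a", "node-a", "node-a", "node-a", "node-b"]
    print(max_node_share(nodes), is_concentrated(nodes))
```

Note that a single-replica deployment always reports a share of 100% under this sketch, so a real check would probably also require a minimum replica count.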
prequel-2025-0027
Low
Impact: 5/10
Mitigation: 2/10
Title: Ingress Nginx Prefix Wildcard Error
Description: The NGINX Ingress Controller rejects an Ingress rule whose path, declared with `pathType: Prefix`, contains a wildcard (`*`). Log excerpt: `ingress: default/api prefix path shouldn't contain wildcards`. When the controller refuses the rule, it omits it from the generated `nginx.conf`; clients receive **404 / 502** responses even though the manifest was accepted by the Kubernetes API server. The problem appears most often after upgrading to ingress-nginx ≥ 1.8, where stricter validation was added.
Category: Ingress Problems
Technology: ingress-nginx
Tags: Nginx, Ingress, Kubernetes
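Manifests can be screened for this pattern before they reach the controller. The following is a minimal sketch, assuming the Ingress manifest has already been parsed into a Python dict; the function name `wildcard_prefix_paths` is hypothetical and the check mirrors the condition described above rather than the controller's own validation code.

```python
def wildcard_prefix_paths(ingress):
    """Return paths with pathType 'Prefix' that contain a '*' wildcard."""
    offending = []
    for rule in ingress.get("spec", {}).get("rules", []):
        # A rule may omit the http block entirely.
        http = rule.get("http") or {}
        for entry in http.get("paths", []):
            path = entry.get("path", "")
            if entry.get("pathType") == "Prefix" and "*" in path:
                offending.append(path)
    return offending


if __name__ == "__main__":
    manifest = {"spec": {"rules": [{"http": {"paths": [
        {"path": "/api/*", "pathType": "Prefix"},
        {"path": "/health", "pathType": "Prefix"},
    ]}}]}}
    print(wildcard_prefix_paths(manifest))
```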
prequel-2025-0081
Medium
Impact: 6/10
Mitigation: 4/10
Title: ArgoCD RawExtension API Field Error with Datadog Operator
Description: The ArgoCD application controller fails to process certain custom resources because it cannot find API fields in struct RawExtension. This commonly affects users deploying Datadog Operator CRDs, resulting in application sync errors for these resources.
Category: Continuous Delivery Problems
Technology: argocd
Tags: ArgoCD, Kubernetes, Custom Resource, Datadog
prequel-2025-0087
Medium
Impact: 7/10
Mitigation: 5/10
Title: Kyverno JMESPath query failure due to unknown key
Description: Kyverno policies with JMESPath expressions fail when they reference keys that don't exist in the target resources. This happens when a policy attempts to access object properties that aren't present in the resources being validated, resulting in "Unknown key" errors during policy validation.
Category: Policy Enforcement Issues
Technology: kyverno
Tags: Kyverno, Kubernetes, Policy Management
prequel-2025-0090
High
Impact: 8/10
Mitigation: 5/10
Title: Karpenter version incompatible with Kubernetes version; Pods cannot be scheduled
Description: Karpenter is unable to provision new nodes because the current Karpenter version is not compatible with the cluster's Kubernetes version. This incompatibility causes validation errors in the nodeclass controller and prevents pods from being scheduled properly in the cluster.
Category: Kubernetes Provisioning Problems
Technology: karpenter
Tags: AWS, Karpenter, Kubernetes