Commercial CREs
Welcome to the Commercial CRE feed, where you can explore and discover commercial CREs by category, tag, or technology. The listing is organized into four sections: Categories, Tags, Technologies, and the full table of CREs.
Categories
Observability Problems
6 CREs
Problems related to observability, like monitoring, logging, and tracing
Container Security
6 CREs
Problems related to container security, such as image vulnerabilities, deprecated repositories, insecure registries, or container image policy violations
GraphQL Problems
5 CREs
Problems related to GraphQL
Jetty Problems
4 CREs
Problems related to Java Jetty
Ingress Problems
4 CREs
Problems related to Ingress
Resource Management
4 CREs
Problems related to resource management in Kubernetes, such as missing CPU/memory requests, limits, or resource allocation issues
Istio Problems
3 CREs
Problems related to Istio
OTEL Problems
3 CREs
Problems related to OTEL
Message Queue Problems
2 CREs
Problems related to message queues, like Kafka, RabbitMQ, NATS, and others
Memory Problems
2 CREs
Problems related to memory
ArgoCD Problems
2 CREs
Problems related to ArgoCD
Kubernetes Problems
2 CREs
Problems related to Kubernetes
AWS Problems
2 CREs
Problems related to AWS
Kubernetes Provisioning Problems
2 CREs
Problems related to Kubernetes node provisioning and scaling, such as autoscaler failures, capacity issues, or provisioner configuration problems
Kubernetes Best Practices
2 CREs
Problems related to violations of Kubernetes best practices, such as missing health checks, resource specifications, or security configurations
API Service Problems
1 CREs
Problems related to API services, such as GraphQL validation errors, REST API issues, or service communication failures
Proxy Problems
1 CREs
Problems related to proxies, like NGINX, HAProxy, and others
Networking Problems
1 CREs
Connectivity, DNS, or routing issues affecting system communication.
Service Mesh Monitoring
1 CREs
Problems related to service mesh monitoring
Storage Problems
1 CREs
Problems related to storage
Service Mesh Problems
1 CREs
Problems related to service mesh
MongoDB Problems
1 CREs
Problems related to MongoDB
SQL Problems
1 CREs
Problems related to SQL
Fault Tolerance Problems
1 CREs
Problems related to fault tolerance
Kafka Problems
1 CREs
Problems related to Kafka
Secrets Problems
1 CREs
Problems related to secrets
Clickhouse Problems
1 CREs
Problems related to Clickhouse
Postgres Problems
1 CREs
Problems related to Postgres
Kubernetes Networking Problems
1 CREs
Problems related to Kubernetes networking, including Ingress, Services, and traffic routing
Traefik Problems
1 CREs
Problems related to Traefik
Prometheus Problems
1 CREs
Problems related to Prometheus
NATS Problems
1 CREs
Problems related to NATS.io
Application Error
1 CREs
Problems related to application errors
Continuous Delivery Problems
1 CREs
Problems related to continuous delivery and deployment pipelines
High Availability Problems
1 CREs
Problems related to high availability, such as cluster communication failures, quorum loss, or split-brain scenarios
Database Integrity Problems
1 CREs
Problems related to database integrity constraints, such as not-null violations, unique constraint violations, or foreign key violations
Message Broker Errors
1 CREs
Problems related to message brokers, such as message size limits, connection issues, or configuration problems
Database Problems
1 CREs
Problems related to databases, like MySQL, PostgreSQL, MongoDB, and others
Workflow Service Problems
1 CREs
Problems related to workflow orchestration services, such as task execution failures, archival issues, or service coordination problems
Policy Enforcement Issues
1 CREs
Problems related to policy enforcement systems, such as admission controllers, policy engines, or security policy validation failures
Autoscaling Problems
1 CREs
Problems related to autoscaling behavior and policies, including HPA/VPA/Karpenter budget limits or scale failures
Data Storage Problems
1 CREs
Problems related to data storage systems, such as Elasticsearch indexing failures, field limit exceeded, or data persistence issues
Tags
Kubernetes
24 CREs
Problems related to Kubernetes, such as pod failures, API errors, or scheduling issues
Configuration
7 CREs
Problems caused by incorrect or missing configuration settings
Deployment
7 CREs
Problems related to application deployment, such as pod scheduling failures, resource constraints, or configuration deployment issues
Errors
6 CREs
Problems with application errors
Observability
6 CREs
Problems in observability tooling, such as unintended performance impact or missing telemetry
GraphQL
6 CREs
Problems related to GraphQL, such as Apollo GraphQL errors.
Bitnami
6 CREs
Problems related to Bitnami container images, repositories, and deployment issues
Container Images
6 CREs
Problems related to container image pulls, registry issues, or image deployment failures
AWS
5 CREs
Amazon Web Services
Loki
5 CREs
Problems with Grafana Loki
Nginx
5 CREs
Problems related to Nginx, such as weak ciphers, configuration errors, or performance issues
Istio
5 CREs
Problems related to Istio, such as Istio Ingress Gateway or Istio sidecar issues.
Exceptions
5 CREs
Problems related to exceptions, such as unhandled exceptions, or uncaught exceptions.
Image Pulls
5 CREs
Problems related to container image pull operations, registry connectivity, or image availability
Docker Hub
5 CREs
Problems related to Docker Hub registry, such as image pull issues, repository access, or registry policy changes
OOM
4 CREs
Problems related to Out of Memory (OOM), such as process OOM, or container OOM.
Jetty
4 CREs
Problems related to Java Jetty, such as Jetty HTTP 500 errors, or Jetty LDAP timeout.
Ingress
4 CREs
Problems related to Ingress, such as Ingress timeout, or Ingress connection timeout.
Threshold Exceeded
3 CREs
An external API limit has been exceeded and API requests are currently failing or being throttled
Known Problem
3 CREs
This is a documented known problem with known mitigations
Kafka
3 CREs
Problems with Apache Kafka
PostgreSQL
3 CREs
Problems with PostgreSQL
Prometheus
3 CREs
Problems with scraping, rule evaluation, or querying Prometheus data.
Storage
3 CREs
Failures in block, object, or ephemeral storage backends.
Timeout
3 CREs
Operations that exceeded their allotted execution window.
Datadog
3 CREs
Problems related to Datadog integration, such as missing metrics, reporting failures, or misconfigurations
Apollo
3 CREs
Problems related to Apollo, such as Apollo GraphQL errors.
Error
3 CREs
Problems related to errors, such as errors in the application, or errors in the infrastructure.
ArgoCD
3 CREs
Problems related to ArgoCD, such as ArgoCD applications in a sync loop.
OTEL
3 CREs
Problems related to OpenTelemetry, such as OpenTelemetry Collector export failures or timeouts.
Karpenter
2 CREs
Problems with Karpenter
Cache
2 CREs
Problems related to caching mechanisms, including stale data, cache misses, or eviction faults
Data Loss
2 CREs
Problems where data is lost or dropped due to system failures or processing errors
Memcached
2 CREs
Problems related to Memcached, such as cache evictions, connection errors, or stale entries
Memory
2 CREs
Problems related to memory usage, such as leaks, pressure, or out-of-memory crashes
Metrics
2 CREs
Problems related to metrics collection or reporting, such as missing, delayed, or incorrect data
Networking
2 CREs
Problems within networking components, such as interface misconfigurations or routing errors
Telepresence
2 CREs
Problems related to Telepresence, such as Telepresence.io Traffic Manager or Traffic Agent issues.
Certificate
2 CREs
Problems related to certificates, such as TLS handshake errors, or expired certificates.
Database
2 CREs
Problems related to databases, such as PostgreSQL, or MySQL.
LDAP
2 CREs
Problems related to LDAP, such as LDAP timeout, or LDAP connection timeout.
Autoscaling
2 CREs
Problems related to autoscaling, such as Karpenter or Cluster Autoscaler failures.
Configuration Issue
2 CREs
Problems related to configuration issues, such as misconfigured settings, invalid parameters, or missing configuration
Ingester
2 CREs
Problems related to data ingestion components, such as Loki ingesters, log processors, or data pipeline components
Deprecated Repository
2 CREs
Problems related to using deprecated or unsupported container image repositories
Scheduling
2 CREs
Problems related to pod or container scheduling decisions and resource allocation
Performance
2 CREs
Problems related to application or system performance degradation
Health Checks
2 CREs
Problems related to application health check configuration or monitoring
Reliability
2 CREs
Problems related to service reliability, availability, or fault tolerance
Continuous Delivery
1 CREs
Problems related to continuous delivery processes, pipelines, and deployment automation
GitOps
1 CREs
Problems related to GitOps practices, tools, and workflows for infrastructure and application deployment
API Error
1 CREs
Problems related to API errors, such as validation failures, malformed requests, or service communication issues
EKS
1 CREs
Amazon Elastic Kubernetes Service
Crash
1 CREs
Problems with applications crashing
Misconfiguration
1 CREs
Problems with misconfigurations
Panic
1 CREs
Crashes due to unrecoverable errors, especially in Go or Rust applications.
Security
1 CREs
Misconfigurations or vulnerabilities in authentication, authorization, or encryption.
Service
1 CREs
Failures at the service or API layer of an application.
Telemetry
1 CREs
Issues with emitting, collecting, or transforming observability data.
Validation
1 CREs
Input or schema validation failures in form submissions or APIs.
Ingress Resource
1 CREs
Problems related to Kubernetes Ingress resources and routing rules.
Path Validation
1 CREs
Problems related to validation of URL paths and routing patterns.
ALB
1 CREs
Problems related to AWS Application Load Balancer (ALB).
Routing
1 CREs
Problems related to traffic routing and path matching.
Backpressure
1 CREs
Problems where producers overwhelm consumers, causing resource exhaustion or unhandled pressure
Django
1 CREs
Problems related to the Django framework, such as view errors, middleware faults, or misconfigurations
Grafana
1 CREs
Problems related to Grafana services, that may impact performance, or telemetry collection and storage
Known Issue
1 CREs
Problems already identified and documented as known issues
Network
1 CREs
Problems related to network communication, such as packet loss, latency spikes, or unreachable hosts
NATS
1 CREs
Problems related to NATS, such as authorization failures, message loss, or configuration issues
Public
1 CREs
Open source CREs contributed by the problem detection community
DNS
1 CREs
Problems related to DNS, such as hostname resolution failures, or DNS server misconfigurations
Strimzi
1 CREs
Problems related to Strimzi, such as Kafka Topic Operator thread blocking, or Kafka Topic Operator not being able to create or update topics.
API Throttling
1 CREs
Problems related to API throttling, such as excessive client-side throttling, or API server throttling.
Traffic Manager
1 CREs
Problems related to Telepresence.io Traffic Manager, such as excessive client-side throttling, or API server throttling.
Envoy
1 CREs
Problems related to Envoy, such as proxy configuration or metrics scraping failures.
Service Mesh
1 CREs
Problems related to service mesh, such as Istio, or Envoy.
WAL
1 CREs
Problems related to the Write-Ahead Log (WAL)
Disk Space
1 CREs
Problems related to disk space, such as volumes filling up or running out of space.
Out of Disk Space
1 CREs
Problems where a disk has run out of space, such as a write-ahead log filling its volume.
Disk Full
1 CREs
Problems caused by a full disk, such as writes failing after disk space is exhausted.
Tracing
1 CREs
Problems related to tracing, such as Jaeger, or Zipkin.
Kiali
1 CREs
Problems related to Kiali, such as Kiali not being able to fetch Istio traces.
Sync
1 CREs
Problems related to syncing, such as ArgoCD applications in a sync loop.
nestjs
1 CREs
Problems related to NestJS Node.js framework, such as unhandled exceptions in resolvers, dependency injection failures, misconfigured modules, or errors surfaced through internal helpers like external-context-creator.js.
Java
1 CREs
Problems related to Java, such as Java exceptions, or Java errors.
SQL
1 CREs
Problems related to SQL, such as SQL errors, or SQL timeout.
MongoDB
1 CREs
Problems related to MongoDB, such as MongoDB timeout, or MongoDB connection timeout.
Replica
1 CREs
Problems related to replicas, such as replicas not being scheduled, or replicas not being ready.
Clickhouse
1 CREs
Problems related to ClickHouse, such as network errors or replica failures during large queries.
Secrets
1 CREs
Problems related to secrets management, such as failed secret retrieval or access errors.
Access Denied
1 CREs
Problems where access is denied, such as IAM policy or permission misconfigurations.
Network Errors
1 CREs
Problems related to network errors, such as connection failures or unreachable peers.
XDS
1 CREs
Problems related to xDS, such as control-plane connection errors or stream failures.
Fargate
1 CREs
Problems related to AWS Fargate, such as workloads failing to schedule on Fargate nodes.
Traefik
1 CREs
Problems related to Traefik, such as license validation or configuration failures.
Loadbalancer
1 CREs
Problems related to load balancers, such as target registration or health check failures.
Security Group
1 CREs
Problems related to security groups, such as missing or misconfigured cluster-ownership tags.
AWS Loadbalancer Controller
1 CREs
Problems related to the AWS Load Balancer Controller, such as reconciliation failures for Ingress or TargetGroupBinding resources.
Capacity
1 CREs
Problems related to capacity constraints, quotas, and limits
Budgets
1 CREs
Problems related to cost or resource budgets and budget enforcement
Runtime Error
1 CREs
Problems related to runtime errors, such as unhandled exceptions, or application crashes
Application Exception
1 CREs
Problems related to application exceptions, such as unhandled exceptions, or application crashes
Custom Resource
1 CREs
Problems related to Kubernetes custom resources, such as CRD validation errors or controller failures
Ruby
1 CREs
Problems related to Ruby applications, such as runtime errors, exceptions, or framework-specific issues
Vault
1 CREs
Problems related to HashiCorp Vault, such as unsealing failures, authentication issues, or secret management problems
Raft
1 CREs
Problems related to Raft consensus protocol, such as leader election failures, quorum loss, or cluster communication issues
Consensus
1 CREs
Problems related to distributed consensus mechanisms, such as quorum loss, split-brain scenarios, or leader election failures
Data Error
1 CREs
Problems related to data errors, such as malformed data, encoding issues, or data validation failures
Producer Error
1 CREs
Problems related to message producers, such as message size limits, connection issues, or configuration problems
Data Integrity
1 CREs
Problems related to data integrity, such as constraint violations, data validation failures, or data consistency issues
Unicode
1 CREs
Problems related to Unicode encoding, decoding, or escape sequences in data or application logic
Temporal
1 CREs
Problems related to Temporal workflow orchestration service, including worker, server, and visibility issues
Archival
1 CREs
Problems related to data archival processes, storage, or retrieval operations
Data Retention
1 CREs
Issues involving data lifecycle management, retention policies, or cleanup processes
Policy Management
1 CREs
Issues related to policy definition, enforcement, validation, or compliance in systems like Kyverno, OPA, or other policy engines
Kyverno
1 CREs
Issues specific to Kyverno policy engine, including policy validation, admission control, and JMESPath query failures
Data Transforms
1 CREs
Problems related to data transforms, such as Redpanda or Kafka data transform failures.
Pod Termination
1 CREs
Problems related to pod termination, such as pods stuck terminating or sandbox teardown failures.
WebAssembly
1 CREs
Problems related to WebAssembly, such as WebAssembly-based features being disabled or failing to run.
Cloudflare
1 CREs
Problems related to Cloudflare services, such as DNS API changes, authentication issues, or configuration problems
Cert-Manager
1 CREs
Problems related to cert-manager, such as certificate generation failures, ACME challenge issues, or DNS provider integration problems
API Deprecation
1 CREs
Problems caused by deprecated API endpoints, removed features, or breaking changes in external service APIs
Elasticsearch
1 CREs
Problems related to Elasticsearch, such as indexing failures, field limit exceeded, or cluster communication issues
Logstash
1 CREs
Problems related to Logstash, such as pipeline failures, output errors, or configuration issues
Indexing Failure
1 CREs
Problems related to data indexing failures, such as Elasticsearch indexing errors, field limit exceeded, or mapping issues
Object Size Limit
1 CREs
Problems related to object size limits being exceeded, such as cache objects, message payloads, or data entries exceeding configured size thresholds
Compactor
1 CREs
Problems related to data compaction processes, such as Loki compactor, database compaction, or log file consolidation operations
Schema
1 CREs
Problems related to database or storage schema issues, such as schema mismatches, validation failures, or migration problems
Index
1 CREs
Problems related to database or storage indexes, such as index corruption, missing indexes, or index configuration issues
Replication
1 CREs
Problems related to data replication, such as replication factor mismatches, replica failures, or synchronization issues
Dev Only
1 CREs
Problems related to development-only images or resources being used inappropriately in production
Image Pull Errors
1 CREs
Problems related to failures when pulling container images from registries
Repository Deprecation
1 CREs
Problems related to container image repositories being deprecated or discontinued
Migration Required
1 CREs
Problems requiring immediate migration to supported alternatives or new systems
Migration Planning
1 CREs
Problems related to planning and executing migrations between systems or services
Catalog Changes
1 CREs
Problems related to changes in service or image catalogs affecting existing deployments
CPU Requests
1 CREs
Problems related to CPU resource requests in container or pod specifications
CPU Limits
1 CREs
Problems related to CPU resource limits in container or pod specifications
Memory Requests
1 CREs
Problems related to memory resource requests in container or pod specifications
Memory Limits
1 CREs
Problems related to memory resource limits in container or pod specifications
Memory Exhaustion
1 CREs
Problems related to memory exhaustion causing node instability or service disruption
Resource Exhaustion
1 CREs
Problems related to CPU or other resource exhaustion causing performance degradation
Liveness Probe
1 CREs
Problems related to liveness probe configuration or health check failures
Readiness Probe
1 CREs
Problems related to readiness probe configuration or traffic routing issues
Availability
1 CREs
Problems related to service availability, uptime, or accessibility
Traffic Routing
1 CREs
Problems related to traffic routing decisions and load balancing
Technologies
v1
7 CREs
kubernetes
6 CREs
graphql
6 CREs
jetty
5 CREs
ingress-nginx
4 CREs
oom
3 CREs
argocd
3 CREs
istio
3 CREs
traffic-manager
2 CREs
prometheus
2 CREs
datadog
2 CREs
otel-collector
2 CREs
aws-load-balancer-controller
2 CREs
karpenter
2 CREs
ingester
2 CREs
distributor
2 CREs
loki
1 CREs
kiali
1 CREs
sql
1 CREs
pymongo
1 CREs
dru
1 CREs
external-secrets
1 CREs
clickhouse
1 CREs
traefik
1 CREs
nats
1 CREs
otel-operator
1 CREs
aws-cluster-autoscaler
1 CREs
ruby
1 CREs
vault
1 CREs
python
1 CREs
celery
1 CREs
psycopg2
1 CREs
kyverno
1 CREs
temporal
1 CREs
redpanda
1 CREs
aws-cni
1 CREs
cert-manager
1 CREs
logstash
1 CREs
compactor
1 CREs
CREs
ID | Title | Description | Category | Tags |
---|---|---|---|---|
prequel-2024-0006 Medium Impact: 8/10 Mitigation: 2/10 | Kafka Topic Operator Thread Blocked | There is a known issue in the Strimzi Kafka Topic Operator where the operator thread can become blocked. This can cause the operator to stop processing events and can lead to a backlog of events. This can cause the operator to become unresponsive and can lead to liveness probe failures and restarts of the Strimzi Kafka Topic Operator. | Message Queue Problems | Known ProblemKafkaStrimzi |
prequel-2025-0001 Critical Impact: 7/10 Mitigation: 3/10 | Telepresence.io Traffic Manager Excessive Client-side Kubernetes API Throttling | One or more cluster components (kubectl sessions, operators, controllers, CI/CD jobs, etc.) hit the **default client-side rate-limiter in client-go** (QPS = 5, Burst = 10). The client logs messages such as `Waited for ‹N›s due to client-side throttling, not priority and fairness` and delays each request until a token is available. Although the API server itself may still have spare capacity, and Priority & Fairness queueing is not the bottleneck, end-user actions and controllers feel sluggish or appear to “stall”. | Kubernetes Problems | KubernetesTelepresenceTraffic ManagerAPI Throttling |
prequel-2025-0002 Medium Impact: 7/10 Mitigation: 3/10 | Envoy metrics scraping failure with unexpected EOF | Prometheus is failing to scrape and write Envoy metrics from Istio sidecars due to an unexpected EOF error. This occurs when trying to collect metrics from services that don't have proper protocol selection configured in their Kubernetes Service definition | Service Mesh Monitoring | PrometheusIstioEnvoyMetricsService MeshKubernetes |
prequel-2025-0003 Low Impact: 4/10 Mitigation: 5/10 | Loki WAL Out of Disk Space | Loki is experiencing an out of disk space error due to the WAL (Write-Ahead Logging) filling up the disk. This can happen when the WAL is not properly configured or when the disk is full. | Storage Problems | LokiWALDisk SpaceOut of Disk SpaceDisk Full |
prequel-2025-0004 Low Impact: 7/10 Mitigation: 8/10 | Process Out of Memory | A pod OOM (Out Of Memory) crash occurs when a container inside a pod tries to use more memory than has been allocated to it, causing the container to be terminated by the operating system. | Memory Problems | OOM, Crash |
prequel-2025-0005 High Impact: 3/10 Mitigation: 3/10 | Kiali Unable to Fetch Istio Traces | Kiali is unable to fetch Istio traces due to a configuration error. | Service Mesh Problems | IstioTracingKiali |
prequel-2025-0006 Low Impact: 3/10 Mitigation: 7/10 | Apollo GraphQL Error | An application using Apollo GraphQL is experiencing an error. | GraphQL Problems | ApolloGraphQLError |
prequel-2025-0007 High Impact: 3/10 Mitigation: 7/10 | GraphQL "Cannot read properties of undefined" error | Indicates an error in a subgraph service query during query execution in a federated service. | GraphQL Problems | ApolloGraphQLError |
prequel-2025-0008 High Impact: 3/10 Mitigation: 7/10 | Apollo GraphQL DOWNSTREAM_SERVICE_ERROR | Indicates an error in a subgraph service query during query execution in a federated service. | GraphQL Problems | ApolloGraphQLError |
prequel-2025-0009 Low Impact: 4/10 Mitigation: 3/10 | ArgoCD Excessive Syncs | ArgoCD applications are repeatedly reconciling and syncing (a reconciliation storm). | ArgoCD Problems | ArgoCD, Sync |
prequel-2025-0010 High Impact: 8/10 Mitigation: 4/10 | Telepresence agent-injector certificate reload failure | Telepresence 2.5.x versions suffer from a critical TLS handshake error between the mutating webhook and the agent injector. When the certificate is rotated or regenerated, the agent-injector pod fails to reload the new certificate, causing all admission requests to fail with "remote error: tls: bad certificate". This effectively breaks the traffic manager's ability to inject the agent into workloads, preventing Telepresence from functioning properly. | Kubernetes Problems | Known ProblemTelepresenceKubernetesCertificate |
prequel-2025-0011 Medium Impact: 7/10 Mitigation: 5/10 | GraphQL internal server error due to record not found | The application is experiencing internal server errors when GraphQL operations attempt to access records that do not exist in the database. This occurs when GraphQL queries reference entities that have been deleted, were never created, or are inaccessible due to permission issues. Instead of handling these cases gracefully with proper error responses, the API is escalating them to internal server errors that may impact client applications and user experience. | GraphQL Problems | GraphQLDatabaseErrors |
prequel-2025-0012 High Impact: 6/10 Mitigation: 5/10 | GraphQL internal server error due to unhandled exception in NestJS resolver | The application is generating internal server errors during GraphQL operations due to uncaught exceptions in resolver logic. These errors are not properly handled or transformed into structured GraphQL responses, resulting in unexpected 500-level failures for client applications. Stack traces often reference NestJS internal files like `external-context-creator.js`, indicating the framework attempted to execute resolver logic but encountered an exception that was not intercepted by the application code. | GraphQL Problems | GraphQLErrorsnestjs |
prequel-2025-0013 Critical Impact: 9/10 Mitigation: 6/10 | Deployment Replica OOM Caused HTTP 5xx Error | A deployment replica was OOM-killed, causing HTTP 5xx errors. | Memory Problems | OOM, Errors |
prequel-2025-0014 Medium Impact: 2/10 Mitigation: 3/10 | Jetty IllegalStateException | A session object in an application thread is possibly being accessed outside the scope of a request. | Jetty Problems | JettyExceptionsErrors |
prequel-2025-0015 Medium Impact: 4/10 Mitigation: 5/10 | Java SQL Batch Exception | A SQL batch exception occurred. | SQL Problems | JavaSQLExceptions |
prequel-2025-0016 Medium Impact: 3/10 Mitigation: 4/10 | MongoDB Server Timeouts | A MongoDB server timeout occurred. | MongoDB Problems | MongoDBTimeoutExceptions |
prequel-2025-0017 Medium Impact: 3/10 Mitigation: 4/10 | Jetty HTTP 500 Errors | A Jetty HTTP 500 error occurred. | Jetty Problems | JettyErrors |
prequel-2025-0018 Low Impact: 5/10 Mitigation: 6/10 | Jetty LDAP Timeout | A Jetty LDAP timeout occurred. | Jetty Problems | JettyLDAPTimeout |
prequel-2025-0019 Medium Impact: 6/10 Mitigation: 7/10 | Jetty LDAP Closed Exception | A Jetty LDAP closed exception occurred. | Jetty Problems | JettyLDAPExceptions |
prequel-2025-0020 High Impact: 8/10 Mitigation: 2/10 | Too many replicas scheduled on the same node | 80% or more of a deployment's replica pods are scheduled on the same Kubernetes node. If this node shuts down or experiences a problem, the service will experience an outage. A topology spread constraint sketch appears after this table. | Fault Tolerance Problems | Replica, Kubernetes |
prequel-2025-0021 High Impact: 8/10 Mitigation: 3/10 | Kafka Streams Exception | A Kafka Streams exception occurred. One or more source topics were missing during a Kafka rebalance. | Kafka Problems | KafkaExceptions |
prequel-2025-0022 High Impact: 5/10 Mitigation: 4/10 | External Secrets Access Denied due to IAM Policy | External Secrets access denied due to IAM policy misconfiguration. | Secrets Problems | SecretsAccess Denied |
prequel-2025-0023 High Impact: 8/10 Mitigation: 2/10 | Clickhouse Keeper Network Errors | Large ClickHouse queries can consume a significant amount of resources, triggering several NETWORK_ERROR or NO_REPLICA_HAS_PART errors. | Clickhouse Problems | ClickhouseNetwork Errors |
prequel-2025-0024 High Impact: 6/10 Mitigation: 7/10 | Istio Traffic Timeout | Connections routed through **ztunnel** stop after the default 10s deadline. Ztunnel logs show `error access connection complete ... error="io error: deadline has elapsed"` or `error="connection timed out, maybe a NetworkPolicy is blocking HBONE port 15008"` while clients see 504 Gateway Timeout or connection-reset errors. The issue is limited to workloads enrolled in Ambient mode; sidecar-injected or “no-mesh” pods continue to work. | Istio Problems | IstioTimeout |
prequel-2025-0025 Low Impact: 3/10 Mitigation: 6/10 | Istio CNI Ztunnel Connection Failure | The CNI plugin is not connected to Ztunnel. For pods in the mesh, Istio will run a CNI plugin during the pod 'sandbox' creation. This configures the networking rules. This may intermittently fail, in which case Kubernetes will automatically retry. | Istio Problems | Istio |
prequel-2025-0026 Low Impact: 3/10 Mitigation: 6/10 | Istio XDS GRPC Failure | Envoy sidecars or Ambient **ztunnel** keep retrying the control-plane stream and log ``` XDS client connection error: gRPC connection error:status: Unknown, message: "...", source: tcp connect error: Connection refused (os error 111) ``` or ``` ... source: tcp connect error: deadline has elapsed ``` The proxies never reach “ADS stream established”, so no configuration, certificates, or policy updates are delivered until this is mitigated. | Istio Problems | IstioXDS |
prequel-2025-0027 Low Impact: 5/10 Mitigation: 2/10 | Ingress Nginx Prefix Wildcard Error | The NGINX Ingress Controller rejects an Ingress manifest whose `pathType: Prefix` value contains a wildcard (`*`). Log excerpt: ``` ingress: default/api prefix path shouldn't contain wildcards ``` When the controller refuses the rule, it omits it from the generated `nginx.conf`; clients receive **404 / 502** responses even though the manifest was accepted by the Kubernetes API server. The problem appears most often after upgrading to ingress-nginx ≥ 1.8, where stricter validation was added. See the Ingress path sketch after this table. | Ingress Problems | Nginx, Ingress, Kubernetes |
prequel-2025-0028 Low Impact: 2/10 Mitigation: 2/10 | Datadog Postgres Check Exception | The Datadog Agent’s *Postgres* integration throws an uncaught Python traceback while trying to run an `EXPLAIN (FORMAT JSON)` against a sampled query. After the first failure the underlying **psycopg2** cursor is closed, and every subsequent collection cycle logs ``` Traceback … File ".../datadog_checks/postgres/explain_parameterized_queries.py", … psycopg2.InterfaceError: cursor already closed ``` The check status flips to **ERROR**, and query metrics / samples stop flowing. | Postgres Problems | PostgreSQLDatadog |
prequel-2025-0071 Critical Impact: 8/10 Mitigation: 4/10 | CPU Cores Cause Silent ingress-nginx Worker Crashes | The ingress-nginx controller's NGINX worker processes are crashing silently because the controller spawns more workers than the CPU limits specified for this deployment can support. | Proxy Problems | Nginx, Known Problem |
prequel-2025-0072 Low Impact: 3/10 Mitigation: 2/10 | OTel Collector Dropped Data Due to High Memory Usage | The OpenTelemetry Collector’s **memory_limiter** processor (added by default in most distro Helm charts) protects the process RSS by monitoring the Go heap and rejecting exports once the *soft limit* (default 85 % of container/VM memory) is exceeded. After a queue/exporter exhausts its retry budget you’ll see log records such as: ``` no more retries left: rpc error: code = Unavailable desc = data refused due to high memory usage ``` The batches being dropped can be traces, metrics, or logs, depending on which pipeline hit the limit. | OTEL Problems | OTEL, Memory, Backpressure |
prequel-2025-0073 Low Impact: 5/10 Mitigation: 1/10 | OTel Collector Resource Detection Failure | The **resource_detection** processor fails while trying to determine basic host attributes and repeatedly logs: ``` failed getting OS type: failed to fetch Docker OS type: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running? ``` The Collector keeps running but exports traces, metrics, or logs without mandatory resource labels, leading to data loss or mis-grouping in the backend. | OTEL Problems | OTELKnown Issue |
prequel-2025-0074 Low Impact: 8/10 Mitigation: 1/10 | Traefik License Expired | Traefik Enterprise (or Traefik Hub-enabled Proxy) periodically “pings” Traefik’s SaaS platform to validate the node-level licence token. When the licence or trial period lapses the process logs ``` Unable to ping platform error="your trial or license expired, contact sales if you want to enable your account" ``` and disables all commercial-only features (dashboards, enterprise plugins, distributed rate-limits, Hub service directory). Plain reverse-proxy routes may continue for a short grace period, but new configuration reloads are rejected. | Traefik Problems | Traefik |
prequel-2025-0075 Low Impact: 2/10 Mitigation: 5/10 | Prometheus Config Reload Failed | The **prometheus-config-reloader** sidecar (used by the Prometheus Operator / kube-prometheus-stack) detected a change in the ConfigMap/Secret but cannot POST to the Prometheus `/-/reload` endpoint. It logs repeatedly: ``` Failed to trigger reload. Retrying. ``` While the main Prometheus container keeps serving traffic, **new scrape configs, alerting rules, and recording rules are NOT applied**, leaving the instance frozen on an outdated configuration set. | Prometheus Problems | Prometheus |
prequel-2025-0076 Medium Impact: 6/10 Mitigation: 4/10 | NATS Route Error caused by DNS Resolution Failure | A NATS server establishes a TCP route, logs **“Route connection created”**, but within milliseconds DNS resolution for its peer fails; the server reports ``` Error trying to connect to route [nats://......]: lookup for host ....... no such host ``` and immediately closes the socket. When this sequence happens repeatedly the cluster oscillates between **full mesh** and **partitioned** states, leading to intermittent publish / subscribe errors and duplicate message deliveries. | NATS Problems | NATSDNS |
prequel-2025-0077 Low Impact: 2/10 Mitigation: 2/10 | OTEL Target Allocator Could Not Find Collector on Fargate Node | The OTEL Collector is not scheduled on the Fargate node. | OTEL Problems | OTELAWSFargate |
prequel-2025-0078 Low Impact: 6/10 Mitigation: 5/10 | AWS LoadBalancer Security Group Failure | While reconciling a TargetGroupBinding the AWS Load Balancer Controller inspects the ENI attached to each pod (IP mode) or worker node (instance mode). If it finds **zero or more than one** security group carrying the cluster-ownership tag `kubernetes.io/cluster/‹cluster-name›: owned`, it aborts and logs: ``` Reconciler error … targetGroupBinding … expected exactly one securityGroup tagged … ``` When this happens the controller never attaches nodes/pods to target groups, so the load balancer comes up with **0 healthy targets**. | AWS Problems | AWSLoadbalancerSecurity Group |
prequel-2025-0079 Medium Impact: 3/10 Mitigation: 3/10 | AWS Cluster Autoscaler Access Denied | **Cluster Autoscaler** tries to fetch node-group metadata to decide whether it can scale a workload-affinityed pod. The call to the EKS control plane fails with ``` Failed to get labels from EKS DescribeNodegroup API for nodegroup ‹name› … AccessDeniedException: User ‹ARN› is not authorized to perform: eks:DescribeNodegroup on resource: arn:aws:eks:‹region›:‹acct›:nodegroup/… ``` Once the error is hit the Autoscaler marks the node-group **Not-Ready for scaling actions**, so pending pods remain unscheduled and scale-down decisions are skipped. | AWS Problems | AWSAutoscaling |
prequel-2025-0080 Medium Impact: 8/10 Mitigation: 4/10 | Ruby NoMethodError - undefined method | A Ruby application has encountered a NoMethodError exception, indicating that code is attempting to call a method that does not exist for a given object. This typically happens when referencing an undefined method, when method names are misspelled, or when interacting with nil/null objects. NoMethodError is one of the most common runtime errors in Ruby applications and can cause immediate crashes or unexpected behavior. | Application Error | RubyRuntime ErrorApplication Exception |
prequel-2025-0081 Medium Impact: 6/10 Mitigation: 4/10 | ArgoCD RawExtension API Field Error with Datadog Operator | ArgoCD application controller fails to process certain custom resources due to being unable to find API fields in struct RawExtension. This commonly affects users deploying Datadog Operator CRDs, resulting in application sync errors for these resources. | Continuous Delivery Problems | ArgoCDKubernetesCustom ResourceDatadog |
prequel-2025-0082 High Impact: 9/10 Mitigation: 7/10 | HashiCorp Vault Raft Cluster Communication Failure | HashiCorp Vault nodes in a Raft cluster are unable to communicate with each other for an extended period. This disrupts the Raft consensus mechanism which is critical for Vault's high availability and data consistency. When nodes can't communicate, the cluster may lose quorum, preventing operations like unsealing, authentication, or secret retrieval. | High Availability Problems | VaultRaftConsensusNetworking |
prequel-2025-0083 Medium Impact: 7/10 Mitigation: 5/10 | GraphQL schema validation failures | GraphQL validation errors occur when client requests fail to comply with the GraphQL schema. These errors typically happen during query parsing and validation phases, before execution begins. Common validation failures include unknown types, missing required arguments, incorrect field usage, or invalid input values. These errors prevent the operation from executing and return error messages that describe the validation problems to the client. | API Service Problems | GraphQLValidationAPI Error |
prequel-2025-0084 Medium Impact: 7/10 Mitigation: 4/10 | PostgreSQL unsupported Unicode escape sequence error | The application encounters errors when PostgreSQL attempts to process strings containing invalid or unsupported Unicode escape sequences. This commonly occurs in applications using psycopg2 to interact with PostgreSQL databases, resulting in queries failing with "unsupported Unicode escape sequence" errors. The underlying issue is that PostgreSQL's string parser attempts to interpret escape sequences like `\uXXXX` according to Unicode standards, but rejects malformed or incomplete sequences. A psycopg2 sketch appears after this table. | Database Problems | PostgreSQL, Unicode, Data Error |
prequel-2025-0085 Medium Impact: 7/10 Mitigation: 5/10 | Kafka message size limit exceeded | The Kafka producer encountered a "Message size too large" error when attempting to send a message to a Kafka broker. This occurs when a message exceeds the configured maximum message size limit on the broker. Kafka has configurable message size limits at both broker and producer levels to protect system stability and prevent resource exhaustion. When this limit is hit, the message is rejected and not stored in the topic. A producer configuration sketch appears after this table. | Message Broker Errors | Kafka, Producer Error, Configuration Issue |
prequel-2025-0086 Medium Impact: 7/10 Mitigation: 3/10 | Database Not-Null Constraint Violation | An application is attempting to insert or update records in a database table with NULL values in columns that have NOT NULL constraints. This causes database operations to fail with integrity errors, typically surfacing as NotNullViolation exceptions in application logs. In Django applications, this commonly appears as django.db.utils.IntegrityError or psycopg2.errors.NotNullViolation when using PostgreSQL. A Django sketch appears after this table. | Database Integrity Problems | Database, PostgreSQL, Django, Data Integrity |
prequel-2025-0087 Medium Impact: 7/10 Mitigation: 5/10 | Kyverno JMESPath query failure due to unknown key | Kyverno policies with JMESPath expressions are failing due to references to keys that don't exist in the target resources. This happens when policies attempt to access object properties that aren't present in the resources being validated, resulting in "Unknown key" errors during policy validation. | Policy Enforcement Issues | KyvernoKubernetesPolicy Management |
prequel-2025-0088 Medium Impact: 7/10 Mitigation: 5/10 | Temporal visibility archival failures | Temporal Server is experiencing failures when attempting to archive workflow visibility records. These failures occur when the system encounters invalid search attribute types, specifically those marked as "Unspecified". Visibility archival is a critical component of Temporal's data retention strategy, allowing historical workflow execution records to be preserved while keeping the primary storage optimized for active workflows. | Workflow Service Problems | TemporalArchivalData Retention |
prequel-2025-0089 Medium Impact: 7/10 Mitigation: 5/10 | Argo CD Manifest Generation Errors | Argo CD is experiencing recurring manifest generation errors. These errors indicate that the GitOps system is unable to properly generate or resolve Kubernetes manifests from the source repositories. When manifest generation fails consistently, applications cannot be properly synchronized, leading to configuration drift and potential deployment failures. | ArgoCD Problems | ArgoCDGitOpsContinuous Delivery |
prequel-2025-0090 High Impact: 8/10 Mitigation: 5/10 | Karpenter version incompatible with Kubernetes version; Pods cannot be scheduled | Karpenter is unable to provision new nodes because the current Karpenter version is not compatible with the cluster's Kubernetes version. This incompatibility causes validation errors in the nodeclass controller and prevents pods from being scheduled properly in the cluster. | Kubernetes Provisioning Problems | AWS, Karpenter, Kubernetes |
prequel-2025-0091 High Impact: 2/10 Mitigation: 2/10 | Redpanda data transforms cannot be used because they are disabled | This rule triggers when Redpanda logs the error `invalid_argument: data transforms disabled - use \\`rpk cluster config set data_transforms_enabled true\\` to enable`. The message indicates that WebAssembly-powered **Data Transforms** are turned off at the cluster level, so any attempt to deploy or run transform functions fails. | Message Queue Problems | Data TransformsWebAssemblyMisconfiguration |
prequel-2025-0092 High Impact: 6/10 Mitigation: 4/10 | AWS CNI intermittent runtime panics and failure to destroy pod network | This rule fires when the kubelet reports a series of `FailedKillPod / KillPodSandboxError` events that contain `rpc error: code = Unknown desc = failed to destroy network for sandbox…` together with a **SIGSEGV / nil-pointer panic** from `routed-eni-cni-plugin/cni.go` or `PluginMainFuncsWithError`. These messages indicate that the Amazon VPC CNI plugin crashed while tearing down a Pod’s network namespace, leaving the sandbox in an indeterminate state. | Kubernetes Provisioning Problems | EKSPod TerminationNetworkPanic |
prequel-2025-0093 Medium Impact: 8/10 Mitigation: 5/10 | aws-load-balancer-controller rejects Ingress resource with wildcard path and Prefix pathType | The aws-load-balancer-controller is unable to translate an Ingress resource into an AWS ALB Listener Rule when the path contains a wildcard (*) and the pathType is set to Prefix. See the Ingress path sketch after this table. | Kubernetes Networking Problems | Kubernetes, AWS Loadbalancer Controller, Ingress Resource, AWS, Networking, Configuration, Path Validation, ALB, Routing |
prequel-2025-0094 High Impact: 8/10 Mitigation: 4/10 | cert-manager Cloudflare DNS cleanup failure | cert-manager is unable to clean up Cloudflare DNS-01 challenges due to a change in the Cloudflare API, which no longer returns zone information in individual DNS records. This breaks the interaction when cert-manager attempts to delete the TXT record, resulting in a failed certificate generation. | Networking Problems | CloudflareCert-ManagerPublicAPI Deprecation |
prequel-2025-0095 High Impact: 7/10 Mitigation: 5/10 | Elasticsearch field limit exceeded causing Logstash indexing failures | Logstash is failing to index events to Elasticsearch due to the total fields limit of 1000 being exceeded. This occurs when the Elasticsearch index has reached its maximum field limit, preventing new fields from being added during document indexing. | Data Storage Problems | ElasticsearchLogstashIndexing Failure |
prequel-2025-0096 Medium Impact: 7/10 Mitigation: 6/10 | Loki Ingester Memcache Object Size Limit Exceeded | Loki ingester encounters "object too large for cache" errors when attempting to store log entries exceeding memcache's configured size limit (typically 1MB). Large log lines remain in the ingester buffer causing continuous failed ingest attempts, pod health degradation, and eventual recycling. The accumulation of oversized entries can lead to buffer exhaustion and ingester instability. | Observability Problems | LokiIngesterMemcachedObject Size LimitCacheStorageObservabilityTelemetryThreshold ExceededData LossConfiguration |
prequel-2025-0097 Medium Impact: 6/10 Mitigation: 5/10 | Loki Compactor Schema Table Mismatch | Loki compactor encounters schema configuration mismatches when it finds index tables in object storage that don't correspond to any configured schema period in the Loki configuration. This causes the compactor to skip compaction for those tables, leading to storage inefficiency and potential query performance degradation. The issue typically occurs after schema migrations, configuration changes, or when legacy data exists with different table naming conventions. | Observability Problems | LokiCompactorSchemaConfigurationStorageObservabilityIndex |
prequel-2025-0098 Medium Impact: 6/10 Mitigation: 4/10 | Loki Pattern Ingester Empty Ring | Loki distributor encounters "empty ring" errors when attempting to send streams to pattern ingesters. This occurs when pattern ingestion is enabled in the configuration but no pattern-ingester pods are running or properly registered in the ring. The distributor's pattern-tee component cannot find any available pattern ingesters to process pattern extraction, leading to high error spam in logs while normal log ingestion continues to function. | Observability Problems | LokiConfigurationObservabilityDeploymentReplication |
prequel-2025-0099 Medium Impact: 6/10 Mitigation: 3/10 | DataDog Agent Remote Configuration Error | DataDog Agent encounters "empty targets meta in director local store" errors when attempting to retrieve remote configuration. This issue affects APM (Application Performance Monitoring) remote configuration functionality in DataDog Agent versions between 7.61.0 and 7.68.0. The error prevents proper retrieval and parsing of remote configuration from DataDog's backend, causing APM tracer libraries to fail when attempting to fetch dynamic configuration updates. | Observability Problems | DatadogObservabilityConfiguration |
prequel-2025-0100 Medium Impact: 6/10 Mitigation: 4/10 | Prometheus ingestion failure due to too many labels | Grafana Mimir's distributor rejects incoming Prometheus series when the number of label names on a single series exceeds the configured per-tenant limit. When this occurs, logs contain the message "received a series whose number of labels exceeds the limit" and the affected samples are dropped. This typically arises from excessive or dynamic labeling in scrape targets or relabeling rules that generate many unique label names per series. To adjust the per-tenant limit, configure the distributor with `-validation.max-label-names-per-series`. When deploying via the `mimir-distributed` Helm chart, set `mimir.structuredConfig.limits.max_label_names_per_series` to a higher value (default is 30). Increase limits cautiously to avoid cardinality explosions and memory pressure. Prefer reducing label names at the source where possible. | Observability Problems | PrometheusGrafanaMetricsConfigurationConfiguration IssueThreshold ExceededObservability |
prequel-2025-0101 Medium Impact: 6/10 Mitigation: 5/10 | Loki Ingester Memcache Out of Memory | Loki ingester reports memcached errors indicating out-of-memory conditions while caching objects, logging messages such as "SERVER_ERROR out of memory storing object". When this occurs, cache writes fail and can lead to degraded ingestion performance, retries, and increased memory pressure on the ingester. | Observability Problems | LokiIngesterMemcachedStorageCacheMemoryData LossThreshold ExceededObservabilityConfiguration |
prequel-2025-0102 High Impact: 7/10 Mitigation: 6/10 | Ingress Nginx HTTP 5XX Error | The ingress-nginx controller is returning HTTP 5XX errors | Ingress Problems | NginxIngressErrors |
prequel-2025-0103 High Impact: 4/10 Mitigation: 5/10 | Ingress Nginx Backend Service Has No Active Endpoints | The ingress-nginx controller has detected that a service does not have any active endpoints. This typically happens when the service selector does not match any pods or the pods are not in a ready state. The controller logs a warning message indicating that the service does not have any active endpoints. | Ingress Problems | NginxIngressKubernetesService |
prequel-2025-0104 Medium Impact: 5/10 Mitigation: 4/10 | Ingress Nginx can't obtain X.509 certificate | The Nginx ingress encountered an error while trying to obtain an X.509 certificate from the Kubernetes secret. | Ingress Problems | KubernetesCertificateNginxIngress |
prequel-2025-0105 Medium Impact: 7/10 Mitigation: 5/10 | Karpenter NodePool budget exceeded; Pods cannot be scheduled | Karpenter is used to automatically provision Kubernetes nodes. NodePools can define a maximum budget for total resource usage to prevent unexpectedly expensive cloud bills. When the budget is reached, Karpenter will stop provisioning new nodes and new pods will fail to schedule. | Autoscaling Problems | KarpenterKubernetesAutoscalingCapacityBudgets |
prequel-2025-0106 Medium Impact: 0/10 Mitigation: 0/10 | Kubernetes Bitnami Image Pull Events | - Detects Kubernetes events where Bitnami container images are being pulled from Docker Hub. - Monitors image pull operations for Bitnami images across all namespaces. - Identifies usage of Bitnami images that may be affected by upcoming catalog changes. - Tracks container deployments using Bitnami images for migration planning. | Container Security | KubernetesBitnamiContainer ImagesImage PullsDocker HubMigration PlanningCatalog Changes |
prequel-2025-0107 Medium Impact: 0/10 Mitigation: 0/10 | Kubernetes Bitnami Image Pull Error | - Detects Kubernetes events where Bitnami container image pulls are failing due to repository deprecation. - Monitors image pull failures for Bitnami images as they approach the August 28, 2025 deprecation deadline. - Identifies specific error conditions when Bitnami images become unavailable from deprecated repositories. - Tracks container deployment failures due to Bitnami image repository deprecation. | Container Security | KubernetesBitnamiContainer ImagesImage Pull ErrorsDocker HubRepository DeprecationMigration Required |
prequel-2025-0108 Medium Impact: 0/10 Mitigation: 0/10 | Kubernetes Deprecated Bitnami Repository Image Pulls | - Detects Kubernetes events where container images are being pulled from the deprecated /bitnami repository on Docker Hub. - Monitors image pull operations specifically from docker.io/bitnami/* which will be discontinued. - Identifies usage of the deprecated Bitnami repository that requires immediate migration. - Tracks container deployments using the legacy /bitnami path for urgent migration planning. | Container Security | KubernetesBitnamiDeprecated RepositoryContainer ImagesImage PullsDocker Hub |
prequel-2025-0109 Medium Impact: 0/10 Mitigation: 0/10 | Kubernetes Legacy Bitnami Repository Image Pulls | - Detects Kubernetes events where container images are being pulled from the unmaintained /bitnamilegacy repository on Docker Hub. - Monitors image pull operations specifically from docker.io/bitnamilegacy/* which is no longer maintained. - Identifies usage of the deprecated Bitnami repository that requires immediate migration. - Tracks container deployments using the legacy /bitnamilegacy path for urgent migration planning. | Container Security | Kubernetes, Bitnami, Container Images, Image Pulls, Docker Hub, Security |
prequel-2025-0110 Medium Impact: 0/10 Mitigation: 0/10 | Kubernetes Bitnami Secure Image Pull Events - Designed for Non-Prod Usage Only | - Detects Kubernetes events where Bitnami Secure container images are being pulled. - Monitors image pull operations for Bitnami Secure images which cannot be pinned to specific versions. - Identifies usage of Bitnami Secure images that lack version pinning capabilities for production stability. - Tracks container deployments using unpinnable Bitnami Secure images for compliance monitoring. | Container Security | KubernetesBitnamiContainer ImagesImage PullsDev Only |
prequel-2025-0111 Medium Impact: 0/10 Mitigation: 0/10 | Kubernetes Deprecated Bitnami Repository Image Pulls | - Detects Kubernetes events where container images are being pulled from the deprecated /bitnami repository on Docker Hub. - Monitors image pull operations specifically from docker.io/bitnami/* which will be discontinued. - Identifies usage of the deprecated Bitnami repository that requires immediate migration. - Tracks container deployments using the legacy /bitnami path for urgent migration planning. | Container Security | KubernetesBitnamiDeprecated RepositoryContainer ImagesImage PullsDocker Hub |
prequel-2025-0112 Medium Impact: 0/10 Mitigation: 0/10 | Kubernetes Deployment CPU Requests Missing | - Detects Kubernetes Deployment resources without CPU requests configured on containers. - Monitors deployment specifications where containers lack proper CPU request definitions. - Identifies resource management violations that can lead to poor cluster scheduling. - Tracks deployments that may cause resource contention and performance issues. - A read-only detection sketch for this and the related resource/probe checks appears after this table. | Resource Management | Kubernetes, Deployment, CPU Requests, resource-management, Scheduling, Performance |
prequel-2025-0113 Medium Impact: 0/10 Mitigation: 0/10 | Kubernetes Deployment CPU Limits Missing | - Detects Kubernetes Deployment resources without CPU limits configured on containers. - Monitors deployment specifications where containers lack proper CPU limit definitions. - Identifies resource management violations that can lead to resource exhaustion. - Tracks deployments that may consume excessive CPU resources without bounds. | Resource Management | KubernetesDeploymentCPU Limitsresource-managementResource ExhaustionPerformance |
prequel-2025-0114 Medium Impact: 0/10 Mitigation: 0/10 | Kubernetes Deployment Memory Requests Missing | - Detects Kubernetes Deployment resources without memory requests configured on containers. - Monitors deployment specifications where containers lack proper memory request definitions. - Identifies resource management violations that can lead to poor scheduling decisions. - Tracks deployments that may cause memory pressure and OOM conditions. | Resource Management | KubernetesDeploymentMemory Requestsresource-managementSchedulingOOM |
prequel-2025-0115 Medium Impact: 0/10 Mitigation: 0/10 | Kubernetes Deployment Memory Limits Missing | - Detects Kubernetes Deployment resources without memory limits configured on containers. - Monitors deployment specifications where containers lack proper memory limit definitions. - Identifies resource management violations that can lead to memory exhaustion. - Tracks deployments that may consume excessive memory resources without bounds. | Resource Management | KubernetesDeploymentMemory Limitsresource-managementMemory ExhaustionOOM |
prequel-2025-0116 Medium Impact: 0/10 Mitigation: 0/10 | Kubernetes Deployment Liveness Probe Missing | - Detects Kubernetes Deployment resources without liveness probes configured on containers. - Monitors deployment specifications where containers lack proper health check definitions. - Identifies reliability violations that can lead to undetected application failures. - Tracks deployments that may run unhealthy containers without automatic recovery. | Kubernetes Best Practices | KubernetesDeploymentLiveness ProbeHealth ChecksReliabilityAvailability |
prequel-2025-0117 Medium Impact: 0/10 Mitigation: 0/10 | Kubernetes Deployment Readiness Probe Missing | - Detects Kubernetes Deployment resources without readiness probes configured on containers. - Monitors deployment specifications where containers lack proper readiness check definitions. - Identifies reliability violations that can lead to premature traffic routing. - Tracks deployments that may receive traffic before being fully ready to handle requests. | Kubernetes Best Practices | KubernetesDeploymentReadiness ProbeHealth ChecksReliabilityTraffic Routing |
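The sketches below illustrate, in Python, one plausible remediation or check for a few of the CREs above; names, labels, and connection details in them are assumptions, not part of the CREs themselves. For prequel-2025-0020 (too many replicas scheduled on the same node), a common remedy is a pod topology spread constraint; this is a minimal sketch using the kubernetes Python client models and a hypothetical `app: my-service` label.

```python
# Minimal sketch: spread a Deployment's replicas across nodes to reduce the
# chance that a single node failure takes out most of the service.
# Assumes the pods carry a hypothetical label app=my-service.
from kubernetes import client

spread = client.V1TopologySpreadConstraint(
    max_skew=1,                               # allow at most 1 pod of imbalance per node
    topology_key="kubernetes.io/hostname",    # spread across individual nodes
    when_unsatisfiable="ScheduleAnyway",      # use "DoNotSchedule" to enforce strictly
    label_selector=client.V1LabelSelector(match_labels={"app": "my-service"}),
)

# Attach it to the Deployment's pod template before applying:
#   deployment.spec.template.spec.topology_spread_constraints = [spread]
```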
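For prequel-2025-0027 and prequel-2025-0093 (wildcard paths rejected when `pathType: Prefix`), a minimal sketch of the usual fix, again with the kubernetes Python client and hypothetical service names: a Prefix path already matches everything beneath it, so the trailing wildcard can simply be dropped.

```python
# Minimal sketch: an Ingress path that both ingress-nginx and the
# aws-load-balancer-controller accept. "/api" with pathType Prefix matches
# "/api", "/api/v1", etc., so "/api/*" is unnecessary (and rejected).
from kubernetes import client

path = client.V1HTTPIngressPath(
    path="/api",          # not "/api/*"
    path_type="Prefix",
    backend=client.V1IngressBackend(
        service=client.V1IngressServiceBackend(
            name="api",                                   # hypothetical Service name
            port=client.V1ServiceBackendPort(number=80),  # hypothetical port
        )
    ),
)
```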
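For prequel-2025-0084 (unsupported Unicode escape sequence), a minimal psycopg2 sketch of a common trigger and workaround; the DSN, `events` table, and `data` column are hypothetical. PostgreSQL's `jsonb` type rejects the `\u0000` escape with this error, so one widely used mitigation is to strip NUL characters before insertion.

```python
# Minimal sketch: strip NUL characters from JSON payloads before writing
# them to a jsonb column, avoiding "unsupported Unicode escape sequence".
# The DSN, table, and column names are hypothetical.
import json
import psycopg2

def strip_nul(value: str) -> str:
    # PostgreSQL cannot store NUL (\u0000) in text or jsonb values.
    return value.replace("\u0000", "")

payload = {"note": "user input with an embedded NUL byte \u0000"}

conn = psycopg2.connect("dbname=app user=app")
with conn, conn.cursor() as cur:
    safe = json.dumps({k: strip_nul(v) for k, v in payload.items()})
    cur.execute("INSERT INTO events (data) VALUES (%s::jsonb)", (safe,))
```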
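For prequel-2025-0085 (Kafka message size limit exceeded), a minimal producer-side sketch assuming the confluent-kafka client and a hypothetical broker address; the point is that producer, topic, and broker limits have to be raised together, or the payload split or compressed.

```python
# Minimal sketch: cap the producer's request size to match the broker/topic
# limits and surface oversized messages explicitly instead of retrying.
from confluent_kafka import Producer, KafkaException

producer = Producer({
    "bootstrap.servers": "kafka:9092",   # hypothetical broker address
    "message.max.bytes": 1_048_576,      # keep aligned with the broker/topic limits
})

def send(topic: str, payload: bytes) -> None:
    try:
        producer.produce(topic, payload)
        producer.flush()
    except (BufferError, KafkaException) as exc:
        # Oversized messages fail locally or at the broker; split or compress
        # the payload, or raise the limits consistently at every level.
        raise RuntimeError(f"failed to produce to {topic}: {exc}") from exc
```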
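For prequel-2025-0086 (not-null constraint violation), a minimal Django sketch with a hypothetical `Order` model showing how the psycopg2 `NotNullViolation` surfaces as `IntegrityError` and one way to guard against it.

```python
# Minimal sketch: assumes a configured Django project with a hypothetical
# shop.models.Order model whose "total" column is NOT NULL.
from django.db import IntegrityError, transaction

def create_order(customer_id, total=None):
    from shop.models import Order  # hypothetical app and model

    try:
        with transaction.atomic():
            # Passing None into a NOT NULL column raises
            # psycopg2.errors.NotNullViolation, which Django re-raises
            # as django.db.utils.IntegrityError.
            return Order.objects.create(customer_id=customer_id, total=total)
    except IntegrityError:
        # Supply an explicit value (or add a model/database default)
        # instead of letting NULL reach the database.
        return Order.objects.create(customer_id=customer_id, total=0)
```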
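For prequel-2025-0112 through prequel-2025-0117 (missing CPU/memory requests and limits, and missing liveness/readiness probes), a minimal read-only sketch of the kind of check these CREs describe, using the kubernetes Python client; it only lists findings and changes nothing.

```python
# Minimal sketch: flag Deployments whose containers are missing CPU/memory
# requests or limits, or liveness/readiness probes. Read-only.
from kubernetes import client, config

def audit_deployments():
    config.load_kube_config()  # or config.load_incluster_config() inside a pod
    apps = client.AppsV1Api()
    findings = []
    for dep in apps.list_deployment_for_all_namespaces().items:
        where = f"{dep.metadata.namespace}/{dep.metadata.name}"
        for ctr in dep.spec.template.spec.containers:
            requests = (ctr.resources.requests or {}) if ctr.resources else {}
            limits = (ctr.resources.limits or {}) if ctr.resources else {}
            for key in ("cpu", "memory"):
                if key not in requests:
                    findings.append(f"{where} container {ctr.name}: missing {key} request")
                if key not in limits:
                    findings.append(f"{where} container {ctr.name}: missing {key} limit")
            if ctr.liveness_probe is None:
                findings.append(f"{where} container {ctr.name}: missing liveness probe")
            if ctr.readiness_probe is None:
                findings.append(f"{where} container {ctr.name}: missing readiness probe")
    return findings

if __name__ == "__main__":
    for finding in audit_deployments():
        print(finding)
```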