Category: Message Queue Problems
Problems related to message queues, such as Kafka, RabbitMQ, NATS, and others.
ID | Title | Description | Category | Technology | Tags |
---|---|---|---|---|---|
CRE-2024-0007 (Critical; Impact: 9/10; Mitigation: 8/10) | RabbitMQ Mnesia overloaded | Mnesia, the Erlang database that RabbitMQ uses for metadata, is overloaded (`** WARNING ** Mnesia is overloaded`). | Message Queue Problems | rabbitmq | Known Problem, RabbitMQ, Public |
CRE-2024-0008 (High; Impact: 9/10; Mitigation: 6/10) | RabbitMQ memory alarm | A RabbitMQ node has entered the “memory alarm” state because the total memory used by the Erlang VM (plus allocated binaries, ETS tables, and processes) has exceeded the configured `vm_memory_high_watermark`. While the alarm is active, the broker applies flow control, blocking publishers and pausing most ingress activity to protect itself from running out of RAM. | Message Queue Problems | rabbitmq | Known Problem, RabbitMQ, Public |
CRE-2025-0025 (Medium; Impact: 6/10; Mitigation: 5/10) | Kafka broker replication mismatch | When the configured replication factor for a Kafka topic is greater than the actual number of brokers in the cluster, Kafka repeatedly fails to assign partitions and logs replication-related errors. This results in persistent warnings or an `InvalidReplicationFactorException` when the broker tries to create internal or user-defined topics. | Message Queue Problems | topic-operator | Kafka, Known Problem, Public |
CRE-2025-0049 (Low; Impact: 2/10; Mitigation: 8/10) | NATS Payload Size Too Big | The NATS server is configured to allow message payloads that may exceed the recommended maximum of 8 MB (the server’s default hard limit is 1 MB, but it can be raised to 64 MB). Large messages put disproportionate pressure on broker memory, network buffers, and client back-pressure mechanisms. This warning signals that NATS is at risk of degraded throughput, slow consumers, and forced connection closures intended to protect cluster stability. | Message Queue Problems | nats | NATS, Public |
CRE-2025-0063 (Medium; Impact: 6/10; Mitigation: 3/10) | RabbitMQ disk monitor fails to initialize | RabbitMQ's disk monitor process cannot start or retrieve free-space metrics, preventing it from detecting low-disk conditions. | Message Queue Problems | rabbitmq | RabbitMQ, Disk Monitor, Monitoring, Plugin |
CRE-2025-0070 (Critical; Impact: 10/10; Mitigation: 6/10) | Kafka Under-Replicated Partitions Crisis | Critical Kafka cluster degradation detected: Multiple partitions have lost replicas due to broker failure, resulting in an under-replicated state. This pattern indicates a broker has become unavailable, causing partition leadership changes and In-Sync Replica (ISR) shrinkage across multiple topics. | Message Queue Problems | kafka | Kafka, Replication, Data Loss, High Availability, Broker Failure, Cluster Degradation |
CRE-2025-0082 (High; Impact: 0/10; Mitigation: 8/10) | NATS JetStream HA failures: monitor goroutine, consumer stalls and unsynced replicas | Detects high-availability failures in NATS JetStream clusters due to: (1) **Monitor goroutine failure**: after node restarts, the Raft group fails to elect a leader; (2) **Consumer deadlock**: using DeliverPolicy=LastPerSubject + AckPolicy=Explicit with a low MaxAckPending; (3) **Unsynced replicas**: object store replication appears healthy but data is lost or inconsistent between nodes. These issues lead to invisible data loss, stalled consumers, or stream unavailability. | Message Queue Problems | nats | NATS, JetStream, Raft, Ack Deadlock, Unsynced Replica |
CRE-2025-0088 (High; Impact: 9/10; Mitigation: 8/10) | NATS JetStream Storage Exhaustion Detection | Detects NATS JetStream storage exhaustion conditions in which streams reach their configured storage limits (maximum bytes, maximum messages), causing message storage failures. These patterns indicate insufficient stream storage capacity relative to the message production rate, leading to message rejection and potential data loss. | Message Queue Problems | jetstream | NATS, JetStream, Storage Exhaustion, Message Storage Failure, Capacity Exceeded, Data Loss Risk |
CRE-2025-0095 (High; Impact: 9/10; Mitigation: 7/10) | NATS Connection Exhaustion: Maximum Connections Exceeded | Detects NATS server connection exhaustion where the configured maximum connection limit is exceeded, preventing new clients from establishing connections. This represents a critical messaging infrastructure failure that can cause cascading outages across distributed systems. | Message Queue Problems | nats | NATS, Connection Exhaustion, Critical Infrastructure |
CRE-2025-0103 (Medium; Impact: 0/10; Mitigation: 0/10) | NATS Connection Failures and Network Partitions | Detects NATS connection failures and network partitions that can impact message delivery and system reliability. | Message Queue Problems | nats | NATS, Connectivity |
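
Two of the conditions in the table (CRE-2025-0025 and CRE-2025-0049) reduce to simple comparisons, which the minimal Python sketch below illustrates. It is not part of any CRE tooling or of the Kafka or NATS APIs: the helper names `classify_nats_max_payload` and `replication_factor_is_satisfiable` are hypothetical, the thresholds are taken from the descriptions above, and 1 MB is assumed to mean 1024 × 1024 bytes.

```python
# Thresholds from the CRE-2025-0049 description.
# Assumption: "MB" is interpreted as 1024 * 1024 bytes.
MIB = 1024 * 1024
NATS_DEFAULT_MAX_PAYLOAD = 1 * MIB       # server default hard limit
NATS_RECOMMENDED_MAX_PAYLOAD = 8 * MIB   # recommended ceiling
NATS_ABSOLUTE_MAX_PAYLOAD = 64 * MIB     # highest value the limit can be raised to


def classify_nats_max_payload(max_payload_bytes: int) -> str:
    """Classify a configured maximum payload size (hypothetical helper).

    Returns "ok" up to the recommended ceiling, "warning" above it,
    and "invalid" above the largest value the limit can be raised to.
    """
    if max_payload_bytes <= 0:
        raise ValueError("max_payload_bytes must be positive")
    if max_payload_bytes > NATS_ABSOLUTE_MAX_PAYLOAD:
        return "invalid"
    if max_payload_bytes > NATS_RECOMMENDED_MAX_PAYLOAD:
        return "warning"
    return "ok"


def replication_factor_is_satisfiable(replication_factor: int, broker_count: int) -> bool:
    """CRE-2025-0025: the replication factor must not exceed the broker count
    (hypothetical helper)."""
    return 1 <= replication_factor <= broker_count


if __name__ == "__main__":
    # A 16 MiB limit is above the recommended ceiling but still configurable.
    print(classify_nats_max_payload(16 * MIB))
    # A replication factor of 3 cannot be satisfied by a single broker.
    print(replication_factor_is_satisfiable(3, 1))
```

A check of this kind could run before deployment to flag a configuration that would later trigger the corresponding rule; the actual CRE rules match on runtime log output instead.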