Category: Message Queue Problems

Problems related to message queues, such as Kafka, RabbitMQ, NATS, and others.

ID | Title | Description | Category | Technology | Tags
CRE-2024-0007
Critical
Impact: 9/10
Mitigation: 8/10
Title: RabbitMQ Mnesia overloaded
Description: Mnesia, the Erlang database that RabbitMQ uses for its metadata, is overloaded (`** WARNING ** Mnesia is overloaded`).
Category: Message Queue Problems
Technology: rabbitmq
Tags: Known Problem, RabbitMQ, Public
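
As a rough illustration of how this condition might be spotted, the sketch below counts log lines that contain the warning string quoted in the description. The function name and the sample log lines are hypothetical; only the warning text itself comes from the entry above.

```python
# Warning text as quoted in the description above.
MNESIA_WARNING = "** WARNING ** Mnesia is overloaded"


def count_mnesia_overload_warnings(log_lines):
    """Return how many log lines contain the Mnesia overload warning."""
    return sum(1 for line in log_lines if MNESIA_WARNING in line)


# Hypothetical log excerpt, for illustration only.
sample_log = [
    "node started",
    "** WARNING ** Mnesia is overloaded: {dump_log, write_threshold}",
    "connection accepted",
]
print(count_mnesia_overload_warnings(sample_log))
```

A plain substring match is used here because the warning contains `*` characters that would need escaping in a regular expression.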
CRE-2024-0008
High
Impact: 9/10
Mitigation: 6/10
Title: RabbitMQ memory alarm
Description: A RabbitMQ node has entered the “memory alarm” state because the total memory used by the Erlang VM (plus allocated binaries, ETS tables, and processes) has exceeded the configured `vm_memory_high_watermark`. While the alarm is active the broker applies flow-control, blocking publishers and pausing most ingress activity to protect itself from running out of RAM.
Category: Message Queue Problems
Technology: rabbitmq
Tags: Known Problem, RabbitMQ, Public
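
The threshold logic described above can be sketched as a simple comparison. This assumes the watermark is configured as a relative fraction of total RAM (it can also be set as an absolute value); the function name and the example value of 0.4 are illustrative assumptions, not values taken from this entry.

```python
GIB = 1024 ** 3


def memory_alarm_active(used_bytes: int, total_bytes: int, watermark: float) -> bool:
    """True when memory use exceeds the relative watermark fraction of total RAM.

    `watermark` stands in for a relative vm_memory_high_watermark setting.
    """
    return used_bytes > watermark * total_bytes


# Example: a 16 GiB node with an assumed watermark of 0.4 (threshold 6.4 GiB).
print(memory_alarm_active(7 * GIB, 16 * GIB, 0.4))
print(memory_alarm_active(6 * GIB, 16 * GIB, 0.4))
```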
CRE-2025-0025
Medium
Impact: 6/10
Mitigation: 5/10
Title: Kafka broker replication mismatch
Description: When the configured replication factor for a Kafka topic is greater than the actual number of brokers in the cluster, Kafka repeatedly fails to assign partitions and logs replication-related errors. This results in persistent warnings or an `InvalidReplicationFactorException` when the broker tries to create internal or user-defined topics.
Category: Message Queue Problems
Technology: topic-operator
Tags: Kafka, Known Problem, Public
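
The mismatch described above comes down to comparing each topic's replication factor with the broker count. A minimal sketch, assuming the replication factors have already been collected into a dictionary (the function name, topic names, and values are hypothetical):

```python
def topics_exceeding_brokers(replication_factors: dict[str, int], broker_count: int) -> list[str]:
    """Return topics whose replication factor is larger than the broker count."""
    return sorted(
        name for name, factor in replication_factors.items() if factor > broker_count
    )


# Hypothetical single-broker cluster with one topic asking for three replicas.
wanted = {"__consumer_offsets": 3, "orders": 1}
print(topics_exceeding_brokers(wanted, broker_count=1))
```

Running a check like this before creating topics, or before shrinking a cluster, would surface the problem before the broker starts logging errors.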
CRE-2025-0049
Low
Impact: 2/10
Mitigation: 8/10
Title: NATS Payload Size Too Big
Description: The NATS server is configured to accept message payloads that may exceed the recommended maximum of 8 MB (the server’s default hard limit is 1 MB but it can be raised to 64 MB). Large messages put disproportionate pressure on broker memory, network buffers, and client back-pressure mechanisms. This warning signals NATS is at risk of degraded throughput, slow consumers, and forced connection closures intended to protect cluster stability.
Category: Message Queue Problems
Technology: nats
Tags: NATS, Public
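
The three thresholds named above (1 MB default, 8 MB recommended maximum, 64 MB ceiling) can be expressed as a small classifier. This is only a sketch: it assumes "MB" means 1024 × 1024 bytes, and the function name and category labels are made up for illustration.

```python
MIB = 1024 * 1024          # assumption: "MB" in the description means MiB
DEFAULT_LIMIT = 1 * MIB    # default limit per the description
RECOMMENDED_MAX = 8 * MIB  # recommended maximum per the description
HARD_MAX = 64 * MIB        # highest configurable value per the description


def classify_max_payload(max_payload_bytes: int) -> str:
    """Classify a configured maximum payload size against the three thresholds."""
    if max_payload_bytes > HARD_MAX:
        return "invalid"
    if max_payload_bytes > RECOMMENDED_MAX:
        return "warning"
    if max_payload_bytes > DEFAULT_LIMIT:
        return "raised"
    return "default-or-lower"


print(classify_max_payload(16 * MIB))
```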
CRE-2025-0063
Medium
Impact: 6/10
Mitigation: 3/10
Title: RabbitMQ disk monitor fails to initialize
Description: RabbitMQ's disk monitor process cannot start or retrieve free-space metrics, preventing it from detecting low-disk conditions.
Category: Message Queue Problems
Technology: rabbitmq
Tags: RabbitMQ, Disk Monitor, Monitoring, Plugin
CRE-2025-0070
Critical
Impact: 10/10
Mitigation: 6/10
Title: Kafka Under-Replicated Partitions Crisis
Description: Critical Kafka cluster degradation detected: Multiple partitions have lost replicas due to broker failure, resulting in an under-replicated state. This pattern indicates a broker has become unavailable, causing partition leadership changes and In-Sync Replica (ISR) shrinkage across multiple topics.
Category: Message Queue Problems
Technology: kafka
Tags: Kafka, Replication, Data Loss, High Availability, Broker Failure, Cluster Degradation
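
To make the under-replicated state concrete: a partition is under-replicated when its in-sync replica set is smaller than its assigned replica set. The sketch below applies that rule to a hand-written snapshot; the `PartitionState` type, the topic names, and the broker ids are hypothetical and do not come from any Kafka client library.

```python
from dataclasses import dataclass


@dataclass
class PartitionState:
    topic: str
    partition: int
    replicas: list[int]  # broker ids assigned to the partition
    isr: list[int]       # broker ids currently in sync


def under_replicated(partitions: list[PartitionState]) -> list[tuple[str, int]]:
    """Return (topic, partition) pairs whose ISR is smaller than the replica set."""
    return [(p.topic, p.partition) for p in partitions if len(p.isr) < len(p.replicas)]


# Hypothetical snapshot after broker 3 becomes unavailable.
snapshot = [
    PartitionState("orders", 0, replicas=[1, 2, 3], isr=[1, 2]),
    PartitionState("orders", 1, replicas=[1, 2], isr=[1, 2]),
    PartitionState("payments", 0, replicas=[2, 3], isr=[2]),
]
print(under_replicated(snapshot))
```

Seeing several topics appear in the result at once, all missing the same broker id, matches the broker-failure pattern this entry describes.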
CRE-2025-0082
High
Impact: 0/10
Mitigation: 8/10
Title: NATS JetStream HA failures: monitor goroutine, consumer stalls and unsynced replicas
Description: Detects high-availability failures in NATS JetStream clusters due to:
1. **Monitor goroutine failure**: after node restarts, the Raft group fails to elect a leader
2. **Consumer deadlock**: using DeliverPolicy=LastPerSubject + AckPolicy=Explicit with low MaxAckPending
3. **Unsynced replicas**: object store replication appears healthy but data is lost or inconsistent between nodes
These issues lead to invisible data loss, stalled consumers, or stream unavailability.
Category: Message Queue Problems
Technology: nats
Tags: NATS, JetStream, Raft, Ack Deadlock, Unsynced Replica
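
The consumer deadlock combination in this entry could be screened for with a configuration check like the one below. This is a sketch under several assumptions: the consumer configuration is represented as a plain dictionary using snake_case keys and values (`deliver_policy`, `ack_policy`, `max_ack_pending`), and the cut-off for a "low" MaxAckPending (100 here) is an arbitrary placeholder, since the entry does not give a number.

```python
def is_deadlock_prone(config: dict, low_ack_pending: int = 100) -> bool:
    """Flag the LastPerSubject + Explicit ack + low MaxAckPending combination.

    `low_ack_pending` is a hypothetical threshold chosen for illustration.
    A missing or non-positive max_ack_pending is not flagged.
    """
    pending = config.get("max_ack_pending", 0)
    return (
        config.get("deliver_policy") == "last_per_subject"
        and config.get("ack_policy") == "explicit"
        and 0 < pending < low_ack_pending
    )


# Hypothetical consumer configuration.
consumer = {
    "deliver_policy": "last_per_subject",
    "ack_policy": "explicit",
    "max_ack_pending": 1,
}
print(is_deadlock_prone(consumer))
```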
CRE-2025-0088
High
Impact: 9/10
Mitigation: 8/10
Title: NATS JetStream Storage Exhaustion Detection
Description: Detects NATS JetStream storage exhaustion conditions when streams reach configured storage limits (maximum bytes, maximum messages) causing message storage failures. These patterns indicate insufficient stream storage capacity relative to message production rate, leading to message rejection and potential data loss.
Category: Message Queue Problems
Technology: jetstream
Tags: NATS, JetStream, Storage Exhaustion, Message Storage Failure, Capacity Exceeded, Data Loss Risk
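
The two limits mentioned above (maximum bytes and maximum messages) can be compared against a stream's current state as follows. This is a minimal sketch: the function and parameter names are hypothetical, and it assumes that a non-positive limit means "unlimited".

```python
def limits_reached(state_bytes: int, state_msgs: int, max_bytes: int, max_msgs: int) -> list[str]:
    """Return the names of the configured limits the stream has reached.

    Assumption: a limit of zero or below means the limit is not enforced.
    """
    reached = []
    if max_bytes > 0 and state_bytes >= max_bytes:
        reached.append("max_bytes")
    if max_msgs > 0 and state_msgs >= max_msgs:
        reached.append("max_msgs")
    return reached


# Hypothetical stream: byte limit hit, message count unlimited.
print(limits_reached(state_bytes=1_000_000, state_msgs=5_000, max_bytes=1_000_000, max_msgs=-1))
```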
CRE-2025-0095
High
Impact: 9/10
Mitigation: 7/10
Title: NATS Connection Exhaustion: Maximum Connections Exceeded
Description: Detects NATS server connection exhaustion where the configured maximum connection limit is exceeded, preventing new clients from establishing connections. This represents a critical messaging infrastructure failure that can cause cascading outages across distributed systems.
Category: Message Queue Problems
Technology: nats
Tags: NATS, Connection Exhaustion, Critical Infrastructure
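
One way to watch for this condition is to track how many connection slots remain. The sketch below assumes server statistics are available as a dictionary with `connections` and `max_connections` keys (modeled on the server's monitoring output, but treat the field names as an assumption); the sample numbers are invented.

```python
def connection_headroom(stats: dict) -> int:
    """Return how many more client connections the server can accept."""
    return stats["max_connections"] - stats["connections"]


def is_exhausted(stats: dict) -> bool:
    """True when no connection slots remain."""
    return connection_headroom(stats) <= 0


# Hypothetical statistics snapshot.
stats = {"connections": 65530, "max_connections": 65536}
print(connection_headroom(stats), is_exhausted(stats))
```

Alerting while some headroom remains, rather than only at zero, gives time to react before new clients are refused.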
CRE-2025-0103
Medium
Impact: 0/10
Mitigation: 0/10
Title: NATS Connection Failures and Network Partitions
Description: Detects NATS connection failures and network partitions that can impact message delivery and system reliability.
Category: Message Queue Problems
Technology: nats
Tags: NATS, Connectivity