Category: Message Queue Problems

Problems related to message queues, like Kafka, RabbitMQ, NATS, and others

ID | Title | Description | Category | Technology | Tags
CRE-2024-0007
Critical
Impact: 9/10
Mitigation: 8/10
Title: RabbitMQ Mnesia overloaded recovering persistent queues
Description: The RabbitMQ cluster is recovering a large number of persistent mirrored queues at boot. Mnesia, the Erlang database RabbitMQ uses for its metadata, is overloaded (`** WARNING ** Mnesia is overloaded`).
Category: Message Queue Problems
Technology: rabbitmq
Tags: Known Problem, RabbitMQ, Public
CRE-2024-0008
High
Impact: 9/10
Mitigation: 6/10
Title: RabbitMQ memory alarm
Description: A RabbitMQ node has entered the “memory alarm” state because the total memory used by the Erlang VM (plus allocated binaries, ETS tables, and processes) has exceeded the configured `vm_memory_high_watermark`. While the alarm is active, the broker applies flow control, blocking publishers and pausing most ingress activity to protect itself from running out of RAM.
Category: Message Queue Problems
Technology: rabbitmq
Tags: Known Problem, RabbitMQ, Public
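As a rough illustration of the threshold check described in this entry, the sketch below computes the byte limit implied by a relative `vm_memory_high_watermark` and compares it with current usage. The function names and sample figures are hypothetical, and the sketch assumes the watermark is expressed as a fraction of total RAM; it does not reproduce how the broker itself measures memory.

```python
def memory_alarm_threshold(total_ram_bytes: int, watermark: float) -> int:
    """Return the memory use (in bytes) above which the alarm would fire.

    Assumes `watermark` is a relative value, i.e. a fraction of total RAM.
    """
    if watermark <= 0:
        raise ValueError("watermark must be a positive fraction of total RAM")
    return int(total_ram_bytes * watermark)


def is_memory_alarm(used_bytes: int, total_ram_bytes: int, watermark: float) -> bool:
    """True when usage has exceeded the watermark-derived threshold."""
    return used_bytes > memory_alarm_threshold(total_ram_bytes, watermark)


GIB = 1024 ** 3  # bytes per GiB, used only for the sample figures below

# Hypothetical node: 16 GiB of RAM, watermark of 0.5, 9 GiB in use.
alarm_active = is_memory_alarm(9 * GIB, 16 * GIB, 0.5)
```

With these sample figures the threshold is half of the node's RAM, so the 9 GiB of usage counts as exceeding it.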
CRE-2025-0025
Medium
Impact: 6/10
Mitigation: 5/10
Title: Kafka broker replication mismatch
Description: When the configured replication factor for a Kafka topic is greater than the actual number of brokers in the cluster, Kafka repeatedly fails to assign partitions and logs replication-related errors. This results in persistent warnings or an `InvalidReplicationFactorException` when the broker tries to create internal or user-defined topics.
Category: Message Queue Problems
Technology: topic-operator
Tags: Kafka, Known Problem, Public
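The mismatch in this entry comes down to one comparison: a topic's replication factor against the number of live brokers. The sketch below shows that comparison as a pre-flight check. The function name, topic names, and the plain dictionary input are hypothetical; a real check would read these values from the cluster's own metadata.

```python
from __future__ import annotations


def topics_exceeding_brokers(replication_factors: dict[str, int],
                             broker_count: int) -> list[str]:
    """Return topic names whose replication factor cannot be satisfied.

    A topic is flagged when its replication factor is greater than the
    number of brokers available to host its replicas.
    """
    return sorted(
        name
        for name, factor in replication_factors.items()
        if factor > broker_count
    )


# Hypothetical two-broker cluster with three requested topics.
requested = {"orders": 3, "logs": 1, "__consumer_offsets": 3}
unsatisfiable = topics_exceeding_brokers(requested, broker_count=2)
```

Here both topics that ask for three replicas would be flagged, since only two brokers exist to hold them.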
CRE-2025-0049
Low
Impact: 2/10
Mitigation: 8/10
Title: NATS Payload Size Too Big
Description: The NATS server is configured to allow messages with payloads that may exceed the recommended maximum of 8 MB (the server’s default hard limit is 1 MB, but it can be raised to 64 MB). Large messages put disproportionate pressure on broker memory, network buffers, and client back-pressure mechanisms. This warning signals that NATS is at risk of degraded throughput, slow consumers, and forced connection closures intended to protect cluster stability.
Category: Message Queue Problems
Technology: nats
Tags: NATS, Public
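The three limits named in this entry (1 MB default, 8 MB recommended maximum, 64 MB ceiling) can be expressed as a small classification, sketched below. The function name and the returned labels are hypothetical, and the sketch assumes 1 MB means 1024 * 1024 bytes.

```python
MB = 1024 * 1024  # assumption: the entry's "MB" is treated as 2**20 bytes

DEFAULT_LIMIT = 1 * MB       # server default described in the entry
RECOMMENDED_MAX = 8 * MB     # recommended maximum described in the entry
HARD_CEILING = 64 * MB       # highest value the limit can be raised to


def classify_payload_limit(limit_bytes: int) -> str:
    """Classify a configured payload limit against the entry's thresholds."""
    if limit_bytes > HARD_CEILING:
        return "rejected"   # above the ceiling the entry describes
    if limit_bytes > RECOMMENDED_MAX:
        return "warning"    # allowed, but beyond the recommended maximum
    return "ok"


# Hypothetical configurations: the default limit and a raised 16 MB limit.
default_status = classify_payload_limit(DEFAULT_LIMIT)
raised_status = classify_payload_limit(16 * MB)
```

Only limits strictly above 8 MB are treated as a warning here; whether the boundary itself should warn is a design choice this sketch leaves on the permissive side.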
CRE-2025-0063
Medium
Impact: 6/10
Mitigation: 3/10
Title: RabbitMQ disk monitor fails to initialize
Description: RabbitMQ's disk monitor process cannot start or retrieve free-space metrics, preventing it from detecting low-disk conditions.
Category: Message Queue Problems
Technology: rabbitmq
Tags: RabbitMQ, Disk Monitor, Monitoring, Plugin
CRE-2025-0070
Critical
Impact: 10/10
Mitigation: 6/10
Title: Kafka Under-Replicated Partitions Crisis
Description: Critical Kafka cluster degradation detected: Multiple partitions have lost replicas due to broker failure, resulting in an under-replicated state. This pattern indicates a broker has become unavailable, causing partition leadership changes and In-Sync Replica (ISR) shrinkage across multiple topics.
Category: Message Queue Problems
Technology: kafka
Tags: Kafka, Replication, Data Loss, High Availability, Broker Failure, Cluster Degradation
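To make the under-replicated condition in this entry concrete, the sketch below flags partitions whose in-sync replica set is smaller than their assigned replica set. The `Partition` type, its field names, and the sample broker IDs are hypothetical stand-ins for whatever metadata a real cluster client would return.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """Hypothetical snapshot of one partition's replica assignment."""
    topic: str
    index: int
    replicas: tuple[int, ...]  # broker IDs assigned to hold a copy
    isr: tuple[int, ...]       # broker IDs currently in sync


def under_replicated(partitions: list[Partition]) -> list[Partition]:
    """Return partitions with fewer in-sync replicas than assigned replicas."""
    return [p for p in partitions if len(p.isr) < len(p.replicas)]


# Hypothetical state after broker 3 becomes unavailable.
snapshot = [
    Partition("orders", 0, replicas=(1, 2, 3), isr=(1, 2)),
    Partition("orders", 1, replicas=(1, 2), isr=(1, 2)),
    Partition("logs", 0, replicas=(2, 3), isr=(2,)),
]
degraded = under_replicated(snapshot)
```

In this snapshot the two partitions that had a replica on broker 3 are reported, which matches the pattern of ISR shrinkage across multiple topics described above.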