
Tag: Raft

Issues related to the Raft consensus protocol, elections, and leader changes

CRE-2025-0080 (High severity; Impact: 0/10; Mitigation: 9/10)
Title: Redpanda High Severity Issues
Description: Detects when Redpanda hits any of the following on startup or early runtime:
  1. Failure to create its crash_reports directory (POSIX error 13, permission denied; see the sketch after this entry).
  2. Heartbeat or node-status RPC failures indicating a broker is down.
  3. Raft group failure.
  4. Data center failure.
Category: Data Streaming Platforms
Technology: redpanda
Tags: Redpanda, Startup Failure, Permission Failure, RPC, Raft, Node Down, Cluster Degradation, Data Availability, Database Corruption
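
Item 1 above corresponds to POSIX errno 13 (EACCES). Below is a minimal Go sketch of a pre-flight permission check for that directory; the path is an assumption, since Redpanda derives it from its configured data directory.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

func main() {
	// Hypothetical path: Redpanda places crash_reports under its configured
	// data directory. Adjust for your deployment.
	dir := "/var/lib/redpanda/data/crash_reports"

	if err := os.MkdirAll(dir, 0o755); err != nil {
		// POSIX error 13 (EACCES) is the permission failure this rule detects.
		if errors.Is(err, syscall.EACCES) {
			fmt.Fprintf(os.Stderr, "errno 13 (permission denied) creating %s: %v\n", dir, err)
			os.Exit(13)
		}
		fmt.Fprintf(os.Stderr, "failed to create %s: %v\n", dir, err)
		os.Exit(1)
	}
	fmt.Println("crash_reports directory exists or was created:", dir)
}
```

Running a check like this under the same user account as the broker surfaces the permission failure before Redpanda itself hits it.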
CRE-2025-0082 (High severity; Impact: 0/10; Mitigation: 8/10)
Title: NATS JetStream HA failures: monitor goroutine, consumer stalls, and unsynced replicas
Description: Detects high-availability failures in NATS JetStream clusters due to:
  • Monitor goroutine failure: after node restarts, the Raft group fails to elect a leader
  • Consumer deadlock: using DeliverPolicy=LastPerSubject + AckPolicy=Explicit with a low MaxAckPending (see the sketch after this entry)
  • Unsynced replicas: object store replication appears healthy, but data is lost or inconsistent between nodes
These issues lead to invisible data loss, stalled consumers, or stream unavailability.
Category: Message Queue Problems
Technology: nats
Tags: NATS, JetStream, Raft, Ack Deadlock, Unsynced Replica
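
Below is a minimal sketch, using the nats.go client, of the consumer configuration the deadlock bullet describes. The stream name "ORDERS" and durable name "orders-last" are hypothetical; only the DeliverPolicy/AckPolicy/MaxAckPending combination comes from the rule.

```go
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// Risky combination: once MaxAckPending messages are outstanding without
	// an ack, the server stops delivering, which can present as a consumer
	// deadlock if subscribers never ack.
	_, err = js.AddConsumer("ORDERS", &nats.ConsumerConfig{
		Durable:       "orders-last", // hypothetical name
		DeliverPolicy: nats.DeliverLastPerSubjectPolicy,
		AckPolicy:     nats.AckExplicitPolicy,
		MaxAckPending: 1, // a ceiling this low is what triggers the stall
	})
	if err != nil {
		log.Fatal(err)
	}
}
```

The mitigation implied by the rule is to ack every delivered message promptly or raise MaxAckPending well above the expected number of in-flight subjects.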
CRE-2025-0092 (High severity; Impact: 0/10; Mitigation: 9/10)
Title: Redpanda Quorum Loss
Description: Detects when a Redpanda node becomes isolated (its heartbeats fail) and triggers a Raft re-election, indicating quorum loss.
Category: Redpanda High Availability
Technology: redpanda
Tags: Redpanda, Raft, Quorum, Leader Election
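
The quorum arithmetic behind this rule is standard Raft: a group of n voters needs floor(n/2) + 1 reachable members to elect a leader and accept writes. A minimal sketch:

```go
package main

import "fmt"

// quorum returns the minimum number of voters that must be reachable
// for a Raft group of the given size to elect a leader.
func quorum(voters int) int {
	return voters/2 + 1
}

// hasQuorum reports whether the group still has a majority after some
// members stop answering heartbeats.
func hasQuorum(voters, unreachable int) bool {
	return voters-unreachable >= quorum(voters)
}

func main() {
	// A 3-node cluster tolerates one isolated node...
	fmt.Println(hasQuorum(3, 1)) // true: 2 of 3 remain, quorum is 2
	// ...but with two nodes unreachable, no leader can be elected.
	fmt.Println(hasQuorum(3, 2)) // false: quorum lost
}
```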