
Tag: Raft

Issues related to the Raft consensus protocol, elections, and leader changes

CRE-2025-0080: Redpanda High Severity Issues
Severity: High | Impact: 0/10 | Mitigation: 9/10
Category: Data Streaming Platforms
Technology: Redpanda
Tags: Redpanda, Startup Failure, Permission Failure, RPC, Raft, Node Down, Cluster Degradation, Data Availability, Database Corruption
Description: Detects when Redpanda hits any of the following on startup or early in runtime (see the permission-check sketch below):
1. Failure to create its crash_reports directory (POSIX error 13, permission denied).
2. Heartbeat or node-status RPC failures indicating a broker is down.
3. Raft group failure.
4. Data center failure.
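
A minimal sketch of a pre-flight probe for item 1, in Go. The path `/var/lib/redpanda/data/crash_reports` is an assumption based on Redpanda's default data directory layout, and this is not the broker's own code; it only checks whether the service user would hit POSIX error 13 (EACCES) when creating the directory.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

func main() {
	// Assumed location under the default data directory; adjust for your deployment.
	dir := "/var/lib/redpanda/data/crash_reports"

	// Attempt the directory creation that, per the rule description, fails at startup.
	if err := os.MkdirAll(dir, 0o755); err != nil {
		if errors.Is(err, os.ErrPermission) {
			// POSIX error 13 (EACCES): the current user cannot write here.
			fmt.Fprintf(os.Stderr, "permission denied creating %s; check ownership of the data directory\n", dir)
			os.Exit(1)
		}
		fmt.Fprintf(os.Stderr, "unexpected error creating %s: %v\n", dir, err)
		os.Exit(1)
	}
	fmt.Printf("%s is writable; crash reports can be persisted\n", dir)
}
```

Running this as the same user that runs the broker surfaces the permission failure before Redpanda itself does.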
CRE-2025-0082: NATS JetStream HA failures: monitor goroutine, consumer stalls, and unsynced replicas
Severity: High | Impact: 0/10 | Mitigation: 8/10
Category: Message Queue Problems
Technology: NATS
Tags: NATS, JetStream, Raft, Ack Deadlock, Unsynced Replica
Description: Detects high-availability failures in NATS JetStream clusters due to:
1. **Monitor goroutine failure**: after node restarts, the Raft group fails to elect a leader.
2. **Consumer deadlock**: using DeliverPolicy=LastPerSubject + AckPolicy=Explicit with a low MaxAckPending (see the consumer sketch below).
3. **Unsynced replicas**: object store replication appears healthy, but data is lost or inconsistent between nodes.
These issues lead to invisible data loss, stalled consumers, or stream unavailability.
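
A minimal sketch of the consumer configuration item 2 warns about, using the github.com/nats-io/nats.go client. The subject `orders.*` and the MaxAckPending value are hypothetical; the point is that with DeliverPolicy=LastPerSubject and AckPolicy=Explicit, a MaxAckPending smaller than the number of in-flight unacked messages stalls delivery until acks arrive or the ack wait expires.

```go
package main

import (
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Drain()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// DeliverPolicy=LastPerSubject + AckPolicy=Explicit: if MaxAckPending is
	// exhausted by unacked messages, the server stops delivering until the
	// pending window drains.
	_, err = js.Subscribe("orders.*", func(m *nats.Msg) {
		// Ack promptly (or in a bounded worker pool) so the pending window drains.
		m.Ack()
	},
		nats.DeliverLastPerSubject(),
		nats.AckExplicit(),
		nats.AckWait(30*time.Second),
		nats.MaxAckPending(1024), // keep this comfortably above the expected in-flight count
	)
	if err != nil {
		log.Fatal(err)
	}

	// Keep the subscription alive briefly in this sketch.
	time.Sleep(10 * time.Second)
}
```

Sizing MaxAckPending against the expected number of distinct subjects in flight, and acking inside the handler, is what keeps this combination from stalling.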
CRE-2025-0092: Redpanda Quorum Loss
Severity: High | Impact: 0/10 | Mitigation: 9/10
Category: Redpanda High Availability
Technology: Redpanda
Tags: Redpanda, Raft, Quorum, Leader Election
Description: Detects when a Redpanda node becomes isolated (heartbeats fail) and triggers a Raft re-election, indicating quorum loss (see the quorum arithmetic sketch below).
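
A small self-contained sketch of the Raft majority arithmetic behind this rule; the three-node cluster size is illustrative, not read from a live Redpanda cluster.

```go
package main

import "fmt"

// hasQuorum reports whether the reachable brokers still form a Raft majority.
func hasQuorum(totalNodes, reachableNodes int) bool {
	majority := totalNodes/2 + 1
	return reachableNodes >= majority
}

func main() {
	total := 3
	for reachable := total; reachable >= 0; reachable-- {
		fmt.Printf("reachable=%d/%d quorum=%v\n", reachable, total, hasQuorum(total, reachable))
	}
	// In a 3-node cluster the majority is 2, so losing heartbeats from two
	// brokers leaves elections unable to complete and the group loses quorum.
}
```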