Tag: Raft
Issues related to the Raft consensus protocol, elections, and leader changes.
| ID | Title | Description | Category | Technology | Tags |
|---|---|---|---|---|---|
| CRE-2025-0080 (High; Impact 0/10, Mitigation 9/10) | Redpanda High Severity Issues | Detects when Redpanda hits any of the following at startup or early in runtime: (1) failure to create its crash_reports directory (POSIX error 13, EACCES); (2) heartbeat or node-status RPC failures indicating a broker is down; (3) Raft group failure; (4) data center failure. (See the permission-check sketch after the table.) | Data Streaming Platforms | redpanda | Redpanda, Startup Failure, Permission Failure, RPC, Raft, Node Down, Cluster Degradation, Data Availability, Database Corruption |
| CRE-2025-0082 (High; Impact 0/10, Mitigation 8/10) | NATS JetStream HA failures: monitor goroutine, consumer stalls, and unsynced replicas | Detects high-availability failures in NATS JetStream clusters caused by: (1) **monitor goroutine failure**, where the Raft group fails to elect a leader after node restarts; (2) **consumer deadlock**, from combining DeliverPolicy=LastPerSubject and AckPolicy=Explicit with a low MaxAckPending (see the consumer-config sketch after the table); (3) **unsynced replicas**, where object store replication appears healthy but data is lost or inconsistent between nodes. These issues lead to invisible data loss, stalled consumers, or stream unavailability. | Message Queue Problems | nats | NATS, JetStream, Raft, Ack Deadlock, Unsynced Replica |
| CRE-2025-0092 (High; Impact 0/10, Mitigation 9/10) | Redpanda Quorum Loss | Detects when a Redpanda node becomes isolated (heartbeats fail) and triggers a Raft re-election, indicating quorum loss (see the health-overview sketch after the table). | Redpanda High Availability | redpanda | Redpanda, Raft, Quorum, Leader Election |
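For CRE-2025-0080, POSIX error 13 is EACCES: the broker user cannot create or write the crash_reports directory. A minimal preflight sketch in Go, assuming a default-style data directory path (the real location comes from the data_directory setting in redpanda.yaml), that reproduces the same permission failure before the broker starts:

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

func main() {
	// Assumed path; adjust to the data_directory configured in redpanda.yaml.
	crashDir := "/var/lib/redpanda/data/crash_reports"

	// Attempt to create the directory, as the broker does at startup.
	if err := os.MkdirAll(crashDir, 0o755); err != nil {
		if errors.Is(err, fs.ErrPermission) {
			// POSIX error 13 (EACCES): the redpanda user lacks permission here.
			fmt.Println("permission denied creating", crashDir)
			return
		}
		fmt.Println("probe failed:", err)
		return
	}

	// Also confirm the directory is writable by creating a throwaway file.
	probe := filepath.Join(crashDir, ".write-probe")
	f, err := os.Create(probe)
	if err != nil {
		fmt.Println("directory exists but is not writable:", err)
		return
	}
	f.Close()
	os.Remove(probe)
	fmt.Println("crash_reports directory is present and writable")
}
```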
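For CRE-2025-0082, the consumer deadlock comes from combining DeliverPolicy=LastPerSubject and AckPolicy=Explicit with a MaxAckPending that is too low for the number of subjects in flight. A minimal sketch using the Go client (github.com/nats-io/nats.go); the stream name, durable name, and MaxAckPending value are assumptions for illustration:

```go
package main

import (
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	// Connect to a local NATS server (assumed URL).
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Drain()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// The risky combination is LastPerSubject + Explicit acks with a very low
	// MaxAckPending: once the pending limit fills with unacked messages, the
	// consumer stalls. Keeping MaxAckPending well above the number of distinct
	// subjects the consumer tracks avoids that stall.
	_, err = js.AddConsumer("ORDERS", &nats.ConsumerConfig{
		Durable:       "orders-worker",
		DeliverPolicy: nats.DeliverLastPerSubjectPolicy,
		AckPolicy:     nats.AckExplicitPolicy,
		AckWait:       30 * time.Second,
		MaxAckPending: 10_000,
	})
	if err != nil {
		log.Fatal(err)
	}
}
```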
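For CRE-2025-0092, quorum loss is visible from the outside as downed nodes and leaderless partitions while the surviving brokers hold Raft re-elections. A rough monitoring sketch against the Redpanda Admin API's health overview endpoint; the address is assumed (the default Admin API listener is port 9644), and the response field names should be checked against your Redpanda version:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Assumed Admin API address for a local broker.
	resp, err := http.Get("http://localhost:9644/v1/cluster/health_overview")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// Decode generically so this sketch does not commit to exact field types.
	var overview map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&overview); err != nil {
		log.Fatal(err)
	}

	// An unhealthy overview alongside downed nodes or leaderless partitions is
	// the condition this rule associates with quorum loss and re-election.
	healthy, _ := overview["is_healthy"].(bool)
	if !healthy {
		fmt.Printf("cluster degraded: nodes_down=%v leaderless_partitions=%v\n",
			overview["nodes_down"], overview["leaderless_partitions"])
		return
	}
	fmt.Println("cluster healthy")
}
```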