CRE-2025-0070
Critical Kafka cluster degradation detected\: Multiple partitions have lost replicas due to broker failure,
Critical Kafka cluster degradation detected\: Multiple partitions have lost replicas due to broker failure,
CoreDNS deployment is unavailable or has no ready endpoints, indicating an imminent cluster\-wide DNS outage.
Detects critical Nginx upstream failure cascades that lead to complete service unavailability.
Detects when Slurm's accounting daemon (slurmdbd) or controller (slurmctld) loses connection to its MySQL database, causing job scheduling and recording to halt.