Tag: Data Loss
Problems where data is lost or dropped due to system failures or processing errors
ID | Severity | Title | Description | Category | Technology | Tags |
---|---|---|---|---|---|---|
CRE-2025-0020 | High (Impact 10/10, Mitigation 6/10) | Self-hosted PostgreSQL HA: WAL Streaming & HA Controller Crisis (Replication Slot Loss, Disk Full, Etcd Quorum Failure) | Detects high-severity failures in self-hosted PostgreSQL high-availability clusters managed by Patroni, Zalando, or similar HA controllers. This rule targets catastrophic conditions that break replication or cluster consensus: WAL streaming failures due to missing replication slots (usually after disk-full or crash events); persistent errors resolving HA controller endpoints (etcd/consul) and loss of HA controller quorum; and disk saturation leading to WAL write errors and replication breakage. | PostgreSQL High Availability | postgresql | High Availability, Patroni, Zalando, Etcd, Replication, WAL, Storage, Quorum, Crash, Data Loss, Timeout |
CRE-2025-0033 | Low (Impact 7/10, Mitigation 4/10) | OpenTelemetry Collector refuses to scrape due to memory pressure | The OpenTelemetry Collector may refuse to ingest metrics during a Prometheus scrape if it exceeds its configured memory limits. When the `memory_limiter` processor is enabled, the Collector actively drops data to prevent out-of-memory errors, resulting in log messages indicating that data was refused due to high memory usage. | Observability Problems | opentelemetry-collector | Otel Collector, Prometheus, Memory, Metrics, Backpressure, Data Loss, Known Issue, Public |
CRE-2025-0070 | Critical (Impact 10/10, Mitigation 6/10) | Kafka Under-Replicated Partitions Crisis | Critical Kafka cluster degradation detected: multiple partitions have lost replicas due to broker failure, resulting in an under-replicated state. This pattern indicates a broker has become unavailable, causing partition leadership changes and In-Sync Replica (ISR) shrinkage across multiple topics. | Message Queue Problems | kafka | Kafka, Replication, Data Loss, High Availability, Broker Failure, Cluster Degradation |
CRE-2025-0073 | High (Impact 9/10, Mitigation 6/10) | Redis Rejects Writes Due to Reaching 'maxmemory' Limit | The Redis instance has reached its configured 'maxmemory' limit. Because its memory-management policy does not permit evicting existing keys to free space (as with the 'noeviction' policy, which is often the default), Redis rejects new write commands with an "OOM command not allowed" error to the client. | Database Problems | redis-cli | Redis, Redis CLI, Memory Pressure, Memory, Data Loss, Public |
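
Hedged detection sketches for the four entries above follow; each is a minimal illustration under stated assumptions, not the detection logic these rules ship with. For CRE-2025-0020, one quick check for a broken standby is to look for inactive replication slots and missing WAL senders on the primary. The sketch assumes the psycopg2 client and a placeholder DSN.

```python
# Minimal sketch for the CRE-2025-0020 failure mode: a replication slot that is
# no longer active (e.g. after a disk-full crash) and WAL streaming that has
# stopped. The DSN below is a placeholder.
import psycopg2


def check_replication_health(dsn: str) -> list[str]:
    """Return warnings about replication slots and WAL senders on the primary."""
    warnings = []
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        # Inactive slots retain WAL on the primary and usually point at a broken standby.
        cur.execute("SELECT slot_name, active FROM pg_replication_slots;")
        for slot_name, active in cur.fetchall():
            if not active:
                warnings.append(f"replication slot {slot_name!r} is inactive")

        # No rows in pg_stat_replication means no standby is currently streaming WAL.
        cur.execute("SELECT count(*) FROM pg_stat_replication;")
        if cur.fetchone()[0] == 0:
            warnings.append("no WAL senders connected: streaming replication is down")
    return warnings


if __name__ == "__main__":
    for w in check_replication_health("host=primary.example dbname=postgres user=postgres"):
        print("WARNING:", w)
```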
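
For CRE-2025-0033, the refusals also surface in the Collector's own internal telemetry. The sketch below polls that endpoint and sums refused metric points; the port (8888) and the metric name `otelcol_processor_refused_metric_points` are assumptions based on the Collector's default internal telemetry and may differ across versions and configurations.

```python
# Rough check for CRE-2025-0033: ask the Collector's internal metrics endpoint
# whether the memory_limiter has been refusing data. Endpoint, port, and metric
# name are assumptions about the default internal telemetry.
import urllib.request

TELEMETRY_URL = "http://localhost:8888/metrics"  # assumed default internal metrics port


def refused_metric_points(url: str = TELEMETRY_URL) -> float:
    total = 0.0
    with urllib.request.urlopen(url, timeout=5) as resp:
        for line in resp.read().decode().splitlines():
            # Prometheus text format: `name{labels} value`
            if line.startswith("otelcol_processor_refused_metric_points"):
                total += float(line.rsplit(" ", 1)[-1])
    return total


if __name__ == "__main__":
    refused = refused_metric_points()
    if refused > 0:
        print(f"memory_limiter is refusing data: {refused:.0f} metric points refused")
```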
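
For CRE-2025-0070, an under-replicated partition is one whose in-sync replica set is smaller than its assigned replica set. A rough check with the confluent-kafka Python client (the bootstrap address is a placeholder):

```python
# Sketch for CRE-2025-0070: list partitions whose ISR has shrunk below the full
# replica set, i.e. under-replicated partitions.
from confluent_kafka.admin import AdminClient


def under_replicated_partitions(bootstrap: str) -> list[tuple[str, int]]:
    admin = AdminClient({"bootstrap.servers": bootstrap})
    metadata = admin.list_topics(timeout=10)
    bad = []
    for topic_name, topic in metadata.topics.items():
        for pid, partition in topic.partitions.items():
            # Under-replicated: fewer in-sync replicas than assigned replicas.
            if len(partition.isrs) < len(partition.replicas):
                bad.append((topic_name, pid))
    return bad


if __name__ == "__main__":
    for topic, pid in under_replicated_partitions("broker-1.example:9092"):
        print(f"UNDER-REPLICATED: {topic}[{pid}]")
```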
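
For CRE-2025-0073, the failure is visible to clients as a command error carrying the "OOM command not allowed" text. A minimal redis-py sketch, with placeholder host and key names:

```python
# Sketch for CRE-2025-0073: a Redis instance at its maxmemory limit with the
# 'noeviction' policy rejects writes with an "OOM command not allowed" error.
import redis

r = redis.Redis(host="cache.example", port=6379)


def safe_set(key: str, value: str) -> bool:
    try:
        r.set(key, value)
        return True
    except redis.exceptions.ResponseError as exc:
        if "OOM command not allowed" in str(exc):
            # Writes are being rejected: alert and shed load, or raise maxmemory /
            # change maxmemory-policy, rather than silently losing data.
            print("Redis is out of memory and refusing writes:", exc)
            return False
        raise


if __name__ == "__main__":
    safe_set("greeting", "hello")
```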