
Tag: Data Loss

Problems where data is lost or dropped due to system failures or processing errors

ID · Title · Description · Category · Technology · Tags
CRE-2025-0020
High
Impact: 10/10
Mitigation: 6/10
Self-hosted PostgreSQL HA: WAL Streaming & HA Controller Crisis (Replication Slot Loss, Disk Full, Etcd Quorum Failure)
Detects high-severity failures in self-hosted PostgreSQL high-availability clusters managed by Patroni, Zalando, or similar HA controllers. This rule targets catastrophic conditions that break replication or cluster consensus (see the detection sketch after this entry):
  • WAL streaming failures due to missing replication slots (usually after disk full or crash events)
  • Persistent errors resolving HA controller endpoints (etcd/consul) and loss of HA controller quorum
  • Disk saturation leading to WAL write errors and replication breakage
Category: PostgreSQL High Availability · Technology: postgresql · Tags: High Availability, Patroni, Zalando, Etcd, Replication, WAL, Storage, Quorum, Crash, Data Loss, Timeout
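The rule itself matches controller and WAL log patterns, but the slot-related conditions can also be confirmed directly on the primary by querying pg_replication_slots and pg_stat_replication. A minimal sketch using psycopg2; the connection string and expected slot names are illustrative assumptions, not part of the CRE:

```python
# Check for missing or inactive replication slots and for standbys that have
# stopped streaming WAL. Connection details and slot names are placeholders.
import psycopg2

EXPECTED_SLOTS = {"replica_a", "replica_b"}  # assumed slot names for illustration

conn = psycopg2.connect("host=primary.example.internal dbname=postgres user=postgres")
with conn, conn.cursor() as cur:
    # pg_replication_slots lists the slots the primary knows about and whether
    # a standby is currently consuming from each one.
    cur.execute("SELECT slot_name, active FROM pg_replication_slots")
    slots = {name: active for name, active in cur.fetchall()}

    missing = EXPECTED_SLOTS - slots.keys()
    inactive = [name for name, active in slots.items() if not active]
    if missing or inactive:
        print(f"replication slot problem: missing={missing}, inactive={inactive}")

    # pg_stat_replication shows connected standbys; an empty result on a primary
    # that should have standbys means WAL streaming is broken.
    cur.execute("SELECT application_name, state FROM pg_stat_replication")
    if not cur.fetchall():
        print("no standbys are streaming WAL from this primary")
conn.close()
```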
CRE-2025-0033
Low
Impact: 7/10
Mitigation: 4/10
OpenTelemetry Collector refuses to scrape due to memory pressure
The OpenTelemetry Collector may refuse to ingest metrics during a Prometheus scrape if it exceeds its configured memory limits. When the `memory_limiter` processor is enabled, the Collector actively drops data to prevent out-of-memory errors, resulting in log messages indicating that data was refused due to high memory usage.
Category: Observability Problems · Technology: opentelemetry-collector · Tags: Otel Collector, Prometheus, Memory, Metrics, Backpressure, Data Loss, Known Issue, Public
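When the memory_limiter processor starts refusing data, the Collector's internal telemetry counts the refusals, which gives a signal that does not depend on scraping logs. A sketch that reads that endpoint; the port (8888) and the otelcol_processor_refused_metric_points metric name are the defaults in recent Collector releases and are assumptions about any given deployment:

```python
# Scrape the Collector's own Prometheus endpoint and sum the "refused" counter
# emitted when a processor (such as memory_limiter) rejects incoming data.
# Endpoint and metric name are assumptions; adjust for your Collector version.
import re
import urllib.request

METRICS_URL = "http://localhost:8888/metrics"  # default internal telemetry endpoint

body = urllib.request.urlopen(METRICS_URL, timeout=5).read().decode()

refused = 0.0
pattern = re.compile(r'otelcol_processor_refused_metric_points(?:\{[^}]*\})?\s+([0-9.eE+-]+)')
for line in body.splitlines():
    m = pattern.match(line)
    if m:
        refused += float(m.group(1))

print(f"metric points refused by processors so far: {refused:.0f}")
```

A value that keeps growing between checks indicates the backpressure described above is actively dropping data.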
CRE-2025-0070
Critical
Impact: 10/10
Mitigation: 6/10
Kafka Under-Replicated Partitions Crisis
Critical Kafka cluster degradation detected: multiple partitions have lost replicas due to broker failure, resulting in an under-replicated state. This pattern indicates a broker has become unavailable, causing partition leadership changes and In-Sync Replica (ISR) shrinkage across multiple topics.
Category: Message Queue Problems · Technology: kafka · Tags: Kafka, Replication, Data Loss, High Availability, Broker Failure, Cluster Degradation
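One way to confirm the under-replicated state from outside the broker logs is to compare each partition's in-sync replicas with its assigned replicas. A sketch using kafka-python, assuming its KafkaAdminClient.describe_topics() returns per-partition 'replicas' and 'isr' lists (as it does in recent versions); the bootstrap address is a placeholder:

```python
# Count partitions whose ISR is smaller than their assigned replica set.
from kafka.admin import KafkaAdminClient

admin = KafkaAdminClient(bootstrap_servers="broker-1.example.internal:9092")

under_replicated = []
for topic in admin.describe_topics():
    for p in topic["partitions"]:
        # A healthy partition has every assigned replica present in the ISR.
        if len(p["isr"]) < len(p["replicas"]):
            under_replicated.append((topic["topic"], p["partition"]))

if under_replicated:
    print(f"{len(under_replicated)} under-replicated partitions, e.g. {under_replicated[:5]}")
else:
    print("all partitions are fully replicated")

admin.close()
```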
CRE-2025-0073
High
Impact: 9/10
Mitigation: 6/10
Redis Rejects Writes Due to Reaching 'maxmemory' Limit
The Redis instance has reached its configured 'maxmemory' limit. Because its active memory management policy does not permit the eviction of existing keys to free up space (as is the case when the 'noeviction' policy is in effect, which is often the default), Redis rejects new write commands by sending an "OOM command not allowed" error to the client.
Category: Database Problems · Technology: redis-cli · Tags: Redis, Redis CLI, Memory Pressure, Memory, Data Loss, Public
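The condition is straightforward to confirm from a client by comparing current memory usage against maxmemory and checking the eviction policy. A sketch using redis-py; the host, port, and 90% warning threshold are illustrative assumptions:

```python
# Warn when a Redis instance running the 'noeviction' policy approaches its
# maxmemory limit, i.e. the state that produces "OOM command not allowed".
import redis

r = redis.Redis(host="redis.example.internal", port=6379)

info = r.info("memory")                                   # INFO memory section
policy = r.config_get("maxmemory-policy")["maxmemory-policy"]
maxmemory = int(info["maxmemory"])                        # 0 means "no limit"
used = int(info["used_memory"])

if maxmemory and policy == "noeviction":
    ratio = used / maxmemory
    if ratio > 0.9:  # arbitrary threshold for illustration
        print(f"Redis at {ratio:.0%} of maxmemory with noeviction; writes will start failing with OOM")
    else:
        print(f"Redis at {ratio:.0%} of maxmemory")
```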
CRE-2025-0126
High
Impact: 10/10
Mitigation: 7/10
MongoDB Replica Set Primary Election Failure
Detects high-severity MongoDB replica set primary election failures that result in no primary node being available, causing complete service unavailability. This rule targets catastrophic conditions that break replica set consensus (see the sketch after this entry):
  • Primary node failures followed by election timeouts where no secondary can become primary
  • Network partitions isolating replica set members and preventing quorum formation
  • Heartbeat failures and connectivity issues leading to election failures
  • Replica set state transitions indicating election problems
Category: Database Problems · Technology: mongodb · Tags: High Availability, Quorum, Leader Election, Network, Timeout, Crash, Data Loss
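A quick way to confirm the no-primary condition is to ask any reachable member for replSetGetStatus and inspect the member states. A sketch using PyMongo; the connection string and timeout are placeholders, and directConnection is used so the check does not itself block on primary discovery:

```python
# Report replica set member states and flag the absence of a PRIMARY.
from pymongo import MongoClient

client = MongoClient(
    "mongodb://mongo-0.example.internal:27017",
    directConnection=True,           # talk to this member even if no primary exists
    serverSelectionTimeoutMS=5000,
)

status = client.admin.command("replSetGetStatus")
states = {m["name"]: m["stateStr"] for m in status["members"]}

if "PRIMARY" not in states.values():
    print(f"no primary elected; member states: {states}")
else:
    print(f"primary present; member states: {states}")
```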
CRE-2025-0179
Critical
Impact: 9/10
Mitigation: 7/10
N8N Workflow Silent Data Loss During Execution
The N8N workflow automation platform can silently lose data: items disappear between workflow nodes without generating any error message. This high-severity issue affects long-running workflows (60-115+ minutes) and can cause workflows to cancel mid-execution at random, leaving processing incomplete. Because the only symptom is that item counts differ from node to node along the pipeline, the issue is particularly dangerous for production systems that rely on complete data processing.
Category: Workflow Automation Problems · Technology: n8n · Tags: N8N, Workflow Automation, Data Loss, Silent Failure, Production Critical, Data Integrity, Public
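Because the loss produces no errors inside N8N itself, detection generally has to come from reconciling item counts outside the workflow. A purely illustrative sketch of that idea; the two CSV files are hypothetical stand-ins for whatever source and destination a real workflow reads from and writes to:

```python
# Compare the number of items handed to the workflow with the number the final
# node actually persisted; a shortfall suggests items vanished mid-pipeline.
import csv

def count_rows(path: str) -> int:
    """Count data rows in a CSV file, excluding the header."""
    with open(path, newline="") as f:
        return sum(1 for _ in csv.reader(f)) - 1

source = count_rows("workflow_input.csv")    # hypothetical: items the trigger received
dest = count_rows("workflow_output.csv")     # hypothetical: items the final node wrote

if dest < source:
    print(f"possible silent loss: {source - dest} of {source} items never reached the destination")
else:
    print("item counts reconcile")
```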