CRE-2025-0072

Redis Out-Of-Memory → Persistence Crash → Replica/ACL Write Failures
Severity: Critical
Impact: 10/10
Mitigation: 7/10

Description

Detects a cascade of critical Redis failure modes in a single session:

- Redis refuses writes when `maxmemory` is exceeded (OOM).
- The RDB snapshot (BGSAVE) fails (MISCONF) due to a simulated full disk.
- A replica refuses writes (READONLY).
- An ACL denies a write (NOPERM).
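The four error replies listed above can be recognized by their leading error code. The following is a minimal sketch of that matching, not the rule's actual implementation: the names `FAILURE_MODES`, `classify_error`, and `cascade_detected` are hypothetical, and the sample messages are illustrative because the exact wording after the prefix varies between Redis versions.

```python
from typing import Iterable, Optional

# Error-reply prefixes named in the description, mapped to what each signals.
FAILURE_MODES = {
    "OOM": "write refused because maxmemory is exceeded",
    "MISCONF": "write refused because the last RDB snapshot (BGSAVE) failed",
    "READONLY": "write refused because the instance is a read-only replica",
    "NOPERM": "write refused because the ACL denies it",
}


def classify_error(reply: str) -> Optional[str]:
    """Return the error prefix if it is one of the four cascade errors."""
    # A raw RESP error starts with '-'; client libraries usually strip it.
    parts = reply.lstrip("-").split(maxsplit=1)
    prefix = parts[0] if parts else ""
    return prefix if prefix in FAILURE_MODES else None


def cascade_detected(replies: Iterable[str]) -> bool:
    """True only when all four failure modes appear among the replies."""
    seen = {classify_error(reply) for reply in replies}
    return set(FAILURE_MODES) <= seen


# Illustrative replies (assumed wording, abbreviated after the prefix).
sample = [
    "OOM command not allowed when used memory > 'maxmemory'.",
    "MISCONF Redis is configured to save RDB snapshots, but ...",
    "READONLY You can't write against a read only replica.",
    "NOPERM this user has no permissions ...",
]
print(cascade_detected(sample))
```

Matching on the prefix alone keeps the check independent of the message text, which is the part most likely to differ across versions.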

Mitigation

IMMEDIATE:

- Check Redis memory usage: `INFO memory`
- Inspect `maxmemory` and `maxmemory-policy`: `CONFIG GET maxmemory maxmemory-policy` (passing several parameters in one call requires Redis 7.0 or later; on older versions, query each parameter separately).
- Free up memory or increase `maxmemory`.
- Clear disk space so BGSAVE can succeed (remove the dummy “/data/filler”).
- Restart Redis if it was killed.

RECOVERY ACTIONS (15-60 minutes):

- Restore from the last valid RDB snapshot or AOF file.
- Change the eviction policy (e.g. `volatile-lru`).
- Monitor memory, disk, and persistence errors (e.g. via RedisExporter → Prometheus alerts).
- Scale out or shard large keys to keep any single Redis instance from hitting 100 MB.

PREVENTION STRATEGIES:

- Avoid `noeviction` unless absolutely needed; set TTLs on keys and choose an eviction policy.
- Ensure the persistence disk has enough headroom.
- Configure `stop-writes-on-bgsave-error` carefully: with `yes` (the default), Redis rejects writes with MISCONF after a failed BGSAVE; with `no`, it keeps accepting writes even though snapshots are failing.
- Use ACLs and replica roles deliberately, and monitor for “READONLY” and “NOPERM” errors.
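As a rough illustration of the first two immediate checks, the sketch below parses the text returned by `INFO memory` and reports how close the instance is to its limit. It assumes the reply has already been fetched as a string; the helper names `parse_info` and `memory_pressure` and the 0.9 warning ratio are hypothetical choices, and the sample values are made up.

```python
def parse_info(text: str) -> dict:
    """Parse the 'key:value' lines of an INFO reply into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and section headers
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def memory_pressure(info: dict, warn_ratio: float = 0.9) -> dict:
    """Summarize memory usage relative to maxmemory (warn_ratio is an assumption)."""
    used = int(info["used_memory"])
    limit = int(info.get("maxmemory", 0))
    policy = info.get("maxmemory_policy", "unknown")
    # maxmemory of 0 means no limit is configured, so no ratio can be computed.
    ratio = used / limit if limit > 0 else None
    return {
        "ratio": ratio,
        "near_limit": ratio is not None and ratio >= warn_ratio,
        # noeviction never evicts keys, so writes fail with OOM at the limit.
        "noeviction": policy == "noeviction",
    }


# Made-up sample: 95 MiB used against a 100 MiB limit with noeviction.
sample_reply = (
    "# Memory\r\n"
    "used_memory:99614720\r\n"
    "maxmemory:104857600\r\n"
    "maxmemory_policy:noeviction\r\n"
)
print(memory_pressure(parse_info(sample_reply)))
```

A result with both `near_limit` and `noeviction` set is the combination that leads to the OOM write refusals described here.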

References