CRE-2025-0072
Redis Out-Of-Memory → Persistence Crash → Replica/ACL Write Failures
Critical
Impact: 10/10
Mitigation: 7/10
Description
Detects a cascade of critical Redis failure modes in a single session:
- Redis refuses writes when maxmemory is exceeded (OOM).
- RDB snapshot (BGSAVE) fails because of a simulated full disk, after which Redis rejects writes (MISCONF).
- Replica refuses writes (READONLY).
- ACL denies a write (NOPERM).
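The four failure modes above can be told apart by the first word of the Redis error reply. The following is a minimal sketch of that idea, not the rule's actual detection logic; the names `FAILURE_MODES`, `classify_redis_error`, and `cascade_seen` are hypothetical, and the descriptions in the mapping paraphrase this rule rather than quote Redis.

```python
from typing import Iterable, Optional

# Error-reply prefixes named in this rule, mapped to the failure they signal.
FAILURE_MODES = {
    "OOM": "write refused: maxmemory exceeded",
    "MISCONF": "write refused: last RDB snapshot failed",
    "READONLY": "write refused: instance is a replica",
    "NOPERM": "command denied by ACL",
}


def classify_redis_error(message: str) -> Optional[str]:
    """Return the failure-mode prefix of a Redis error reply, or None."""
    parts = message.strip().split(maxsplit=1)
    if not parts:
        return None
    # Some clients and logs keep the RESP '-' error marker; drop it if present.
    prefix = parts[0].lstrip("-")
    return prefix if prefix in FAILURE_MODES else None


def cascade_seen(messages: Iterable[str]) -> bool:
    """True when all four failure modes appear in one batch of replies."""
    seen = {classify_redis_error(m) for m in messages}
    return set(FAILURE_MODES) <= seen
```

Matching on the prefix alone keeps the check independent of the rest of the message, whose wording can differ between Redis versions.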
Cause
ROOT CAUSES:
- Redis is configured with `maxmemory 100mb` + `noeviction`.
- A Lua EVAL pushes memory usage over that cap → OOM refusal.
- A manual BGSAVE is then forced while disk is full → MISCONF.
- The instance is switched to replica mode → READONLY on write.
- A read-only ACL user attempts a SET → NOPERM.
Mitigation
IMMEDIATE:
- Check Redis memory usage: `INFO memory`
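- Inspect `maxmemory` / `maxmemory-policy`: `CONFIG GET maxmemory maxmemory-policy` (passing several parameters in one call needs Redis 7.0 or later; on older versions run one `CONFIG GET` per parameter).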
- Free up memory or increase `maxmemory`.
- Clear disk space so BGSAVE can succeed (remove the dummy file `/data/filler`).
- Restart Redis if it was killed.
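As a rough illustration of the memory check, the sketch below parses the text of an `INFO memory` reply and computes how close the instance is to its cap. It assumes the reply has already been captured as a string (for example from `redis-cli INFO memory`); the helper names and the sample values are hypothetical.

```python
from typing import Optional


def parse_info(text: str) -> dict:
    """Parse the 'field:value' lines of a Redis INFO reply into a dict of strings."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as '# Memory'.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key] = value
    return info


def memory_usage_ratio(info: dict) -> Optional[float]:
    """Return used_memory / maxmemory, or None when no limit is set (maxmemory 0)."""
    limit = int(info.get("maxmemory", 0))
    if limit == 0:
        return None
    return int(info["used_memory"]) / limit


# Hypothetical sample: 90 MiB used against the 100 MiB cap from this rule.
sample = (
    "# Memory\r\n"
    "used_memory:94371840\r\n"
    "maxmemory:104857600\r\n"
    "maxmemory_policy:noeviction\r\n"
)
info = parse_info(sample)
ratio = memory_usage_ratio(info)
```

A ratio close to 1 combined with `maxmemory_policy:noeviction` is the condition under which the OOM refusal described above becomes likely.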
RECOVERY ACTIONS (15-60 minutes):
- Restore from last valid RDB/AOF snapshot.
- Change eviction policy (e.g. `volatile-lru`).
- Monitor memory, disk, and persistence errors (e.g. via `redis_exporter` → Prometheus alerts).
- Scale out / shard large keys to avoid a single Redis hitting 100 MB.
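For the monitoring step, one possible check is to look at the persistence status fields that `INFO persistence` reports. The sketch below assumes the reply has already been parsed into a dict of strings; `persistence_alerts` is a hypothetical helper, not part of any exporter.

```python
from typing import Dict, List


def persistence_alerts(info: Dict[str, str]) -> List[str]:
    """Return alert messages derived from INFO persistence status fields."""
    alerts = []
    # rdb_last_bgsave_status is 'ok' or 'err' for the most recent RDB save.
    if info.get("rdb_last_bgsave_status") == "err":
        alerts.append("last RDB background save failed")
    # aof_last_write_status is 'ok' or 'err' for the most recent AOF write.
    if info.get("aof_last_write_status") == "err":
        alerts.append("last AOF write failed")
    return alerts
```

An empty list means neither field reports an error; missing fields are treated as "no alert" so the check stays usable when AOF is disabled.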
PREVENTION STRATEGIES:
- Avoid `noeviction` unless absolutely needed; use a TTL/eviction policy.
- Ensure persistence disk has enough headroom.
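- Configure `stop-writes-on-bgsave-error` carefully: with the default `yes`, Redis rejects writes with MISCONF after a failed background save; `no` keeps accepting writes at the risk of losing data that was never persisted.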
- Use ACLs and replica roles deliberately, and monitor for `READONLY` and `NOPERM` errors.
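The disk-headroom point can be sketched as a simple pre-check. Redis writes a new RDB snapshot to a temporary file before replacing the old one, so both can exist on disk at the same time. The factor of 2 below is an assumed rule of thumb, not a documented Redis requirement, and the function names are hypothetical.

```python
import shutil


def has_snapshot_headroom(free_bytes: int, dataset_bytes: int, factor: float = 2.0) -> bool:
    """True when free disk space is at least `factor` times the dataset size."""
    return free_bytes >= dataset_bytes * factor


def check_persistence_dir(path: str, dataset_bytes: int, factor: float = 2.0) -> bool:
    """Apply the headroom rule to the filesystem that holds `path`."""
    free = shutil.disk_usage(path).free
    return has_snapshot_headroom(free, dataset_bytes, factor)
```

For the setup in this rule, `check_persistence_dir("/data", 100 * 1024 * 1024)` would apply the check to the directory holding the filler file, assuming `/data` is also where Redis writes its snapshots.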