CRE-2025-0200
Redis Comprehensive Troubleshooting - Multiple Common Issues Detection
Critical
Impact: 10/10
Mitigation: 7/10
Description
Comprehensive detection rule for multiple common Redis troubleshooting scenarios including:
1. Out-of-Memory (OOM) errors when maxmemory limit exceeded
2. Connection timeouts and connectivity issues
3. Authentication failures and permission denials
4. Invalid commands and argument errors
5. Background save (BGSAVE) conflicts and persistence issues
6. Slow query performance problems
7. Read-only replica write attempts
8. Disk persistence failures (MISCONF errors)
9. Client connection limits exceeded
10. Memory pressure and eviction warnings
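To make the scenarios above concrete, the following is a minimal sketch of how log or error lines could be sorted into some of these categories. The pattern table, the category names, and the `classify` function are assumptions made for illustration; they are not the rule's actual detection patterns, which this entry does not show. The matched strings are based on error prefixes that Redis commonly returns (for example `OOM`, `NOAUTH`, `READONLY`, `MISCONF`).

```python
import re
from typing import Optional

# Hypothetical pattern table: (category, compiled regex).
# These are illustrative assumptions, not the rule's real patterns.
# Order matters: more specific patterns come before the generic ERR ones.
PATTERNS = [
    ("oom", re.compile(r"\bOOM command not allowed")),
    ("auth", re.compile(r"\b(NOAUTH|WRONGPASS|NOPERM)\b")),
    ("readonly", re.compile(r"\bREADONLY\b")),
    ("misconf", re.compile(r"\bMISCONF\b")),
    ("bgsave_conflict", re.compile(r"Background save already in progress")),
    ("max_clients", re.compile(r"max number of clients reached")),
    ("invalid_command",
     re.compile(r"ERR (unknown command|wrong number of arguments)")),
    ("connection",
     re.compile(r"Connection (timed out|refused|reset)", re.IGNORECASE)),
]


def classify(line: str) -> Optional[str]:
    """Return the first matching category name, or None if nothing matches."""
    for name, pattern in PATTERNS:
        if pattern.search(line):
            return name
    return None


if __name__ == "__main__":
    samples = [
        "OOM command not allowed when used memory > 'maxmemory'.",
        "NOAUTH Authentication required.",
        "READONLY You can't write against a read only replica.",
        "ERR max number of clients reached",
    ]
    for sample in samples:
        print(classify(sample), "<-", sample)
```

Slow queries and memory pressure are left out of this sketch because they are usually detected from metrics or the slowlog rather than from a single error string.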
Mitigation
IMMEDIATE ACTIONS:
- Check Redis server status: `redis-cli ping`
- Monitor memory usage: `redis-cli info memory`
- Review error logs for specific failure patterns
- Verify authentication and ACL configuration
- Check disk space and persistence settings

RECOVERY STRATEGIES BY ISSUE TYPE:

1. OOM ERRORS:
 - Increase maxmemory limit: `CONFIG SET maxmemory 500mb`
 - Change eviction policy: `CONFIG SET maxmemory-policy volatile-lru`
 - Clear unnecessary keys or restart Redis

2. CONNECTION ISSUES:
 - Restart Redis service: `systemctl restart redis`
 - Check firewall and network configuration
 - Adjust client timeout settings

3. AUTHENTICATION FAILURES:
 - Verify credentials: `redis-cli -a password ping`
 - Update ACL permissions: `ACL SETUSER username +@all`
 - Rotate and update client credentials

4. COMMAND ERRORS:
 - Fix application code with correct Redis syntax
 - Update Redis client libraries
 - Check for renamed/disabled commands

5. PERSISTENCE ISSUES:
 - Wait for current BGSAVE to complete
 - Free disk space for RDB/AOF files
 - Optimize backup scheduling

6. SLOW QUERIES:
 - Optimize data structures and access patterns
 - Use SCAN instead of KEYS for iteration
 - Monitor and tune slowlog settings

7. READONLY ERRORS:
 - Redirect writes to master Redis instance
 - Check replication configuration
 - Verify client connection routing

PREVENTION:
- Implement comprehensive Redis monitoring
- Set up memory, performance, and error alerting
- Use Redis clustering for high availability
- Regular capacity planning and performance reviews
- Automate backup and persistence monitoring
- Implement proper error handling in applications
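The memory check and the alerting advice above could be combined along the following lines. This is a sketch only: it parses text in the `key:value` format that `redis-cli info memory` prints and computes how close `used_memory` is to `maxmemory`. The function names and the sample values are hypothetical, and the output would have to be captured from a real server in practice.

```python
from typing import Dict, Optional


def parse_info(text: str) -> Dict[str, str]:
    """Parse INFO-style output into a dict, skipping '# Section' headers."""
    fields: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:  # ignore lines without a key:value separator
            fields[key] = value
    return fields


def memory_usage_ratio(fields: Dict[str, str]) -> Optional[float]:
    """Return used_memory / maxmemory, or None when no limit is set.

    A maxmemory of 0 means Redis has no configured memory limit,
    so a ratio is not meaningful in that case.
    """
    maxmemory = int(fields.get("maxmemory", "0"))
    if maxmemory == 0:
        return None
    return int(fields["used_memory"]) / maxmemory


if __name__ == "__main__":
    # Made-up sample in the shape of `redis-cli info memory` output.
    sample = (
        "# Memory\r\n"
        "used_memory:471859200\r\n"
        "maxmemory:524288000\r\n"
        "maxmemory_policy:noeviction\r\n"
    )
    info = parse_info(sample)
    ratio = memory_usage_ratio(info)
    if ratio is not None and ratio >= 0.8:  # 0.8 is an arbitrary example threshold
        print(f"memory pressure: {ratio:.0%} of maxmemory in use")
```

Treating an unset limit as `None` rather than as zero usage keeps an alert from silently passing on servers where `maxmemory` was never configured.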
References
- https://redis.io/docs/latest/operate/oss_and_stack/management/troubleshooting/
- https://redis.io/docs/latest/operate/oss_and_stack/management/persistence/
- https://redis.io/docs/latest/operate/oss_and_stack/management/security/acl/
- https://redis.io/docs/latest/operate/oss_and_stack/management/optimization/latency/
- https://www.site24x7.com/learn/redis-troubleshooting-guide.html