Skip to main content

CRE-2025-0039

OpenTelemetry Collector exporter experiences retryable errors due to backend unavailabilityMedium
Impact: 5/10
Mitigation: 3/10

CRE-2025-0039View on GitHub

Description

The OpenTelemetry Collector may intermittently fail to export telemetry data when the backend API is unavailable or overloaded. These failures manifest as timeouts (`context deadline exceeded`) or transient HTTP 502 responses. While retry logic is typically enabled, repeated failures can introduce delay or backpressure.

Mitigation

- Ensure `retry_on_failure` is enabled for all exporters. - Tune `timeout` and `sending_queue` settings to absorb temporary backend disruptions. - Monitor retry counts and dropped items via Collector metrics (e.g., `exporter/send_failed_spans`). - Consider buffering via Kafka or persistent queues for high-reliability workloads.

References