
feat(transport): add HTTP retry with exponential backoff #1520

Draft

jpnurmi wants to merge 50 commits into master from jpnurmi/feat/http-retry

Conversation

@jpnurmi (Collaborator) commented Feb 13, 2026

Add `sentry_options_set_http_retries()` to configure retry attempts for transient network errors. Failed envelopes are stored as `<db>/cache/<ts>-<n>-<uuid>.envelope` and retried with exponential backoff (15min, 30min, 1h, 2h, 8h), modeled after Crashpad's upload retry behavior. When retries are exhausted and offline caching is enabled, envelopes are stored as `<db>/cache/<uuid>.envelope` instead of being discarded.

```mermaid
flowchart TD
    startup --> R{retry?}
    R -->|yes| throttle
    R -->|no| C{cache?}
    throttle -. 100ms .-> resend
    resend -->|success| C
    resend -->|fail| C2["<db>/cache/<ts>-<n>-<uuid>.envelope"]
    C2 --> backoff
    backoff -. 2ⁿ×15min .-> resend
    C -->|yes| CACHE["<db>/cache/<uuid>.envelope"]
    C -->|no| discard
```

See also: https://develop.sentry.dev/sdk/expected-features/#buffer-to-disk


@github-actions bot commented Feb 13, 2026

Messages
📖 Do not forget to update Sentry-docs with your feature once the pull request gets approved.

Generated by 🚫 dangerJS against 0902da2

jpnurmi force-pushed the jpnurmi/feat/http-retry branch from df2be97 to b083a57 on February 13, 2026 16:59
jpnurmi and others added 28 commits February 13, 2026 18:47
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The deferred startup retry scan (100ms delay) could pick up files
written by the current session. Filter by startup_time so only
previous-session files are processed. Also ensure the cache directory
exists when cache_keep is enabled, since sentry__process_old_runs
only creates it conditionally.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Monotonic time is process-relative and doesn't work across restarts.
Retry envelope timestamps need to persist across sessions, so use
time() (seconds since epoch) for file timestamps, startup_time, and
backoff comparison.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rename SENTRY_RETRY_BACKOFF_BASE_MS to SENTRY_RETRY_BACKOFF_BASE_S
and sentry__retry_backoff_ms to sentry__retry_backoff, since file
timestamps are now in seconds. The bgworker delay sites multiply
by 1000 to convert to the milliseconds it expects.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move startup_time initialization into sentry__retry_new and remove the
unnecessary sentry__retry_set_startup_time indirection. Tests now use
write_retry_file with timestamps well in the past to match production
behavior where retry files are from previous sessions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When files exist but aren't eligible yet (backoff not elapsed),
foreach was returning 0, causing the retry polling task to stop.
Return the total number of valid retry files found, instead of just
the eligible count, so the caller keeps rescheduling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Make handle_result return bool (true = file rescheduled for retry,
false = file consumed) and use it in foreach to decrement the total
count. This avoids one extra no-op poll cycle after the last retry.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…Y_THROTTLE

Replace SENTRY_RETRY_BACKOFF_BASE_S and SENTRY_RETRY_STARTUP_DELAY_MS
with ms-based constants so the transport uses them directly without
leaking unit conversion details.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Give the retry module a bgworker ref and send callback so it owns all
scheduling. Transport just calls _start and _enqueue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
sentry__retry_new only returns NULL on failure, not based on options.
sentry__retry_start and _enqueue require non-NULL retry.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Deduplicate prepare/send/free sequence shared by retry_send_cb and
http_send_task.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Already covered by retry_throttle and retry_result.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ashpad

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pass startup_time directly to _foreach as a `before` filter instead of
a bool. Clear it after the first run so subsequent polls use backoff.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Deduplicate filename construction across write_envelope, handle_result,
and tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When the transport supports retry and http_retries > 0,
sentry__process_old_runs now skips caching .envelope files from old
runs. The retry system handles persistence, so duplicating into
cache/ is unnecessary.

Also simplifies sentry__retry_handle_result: only cache on max
retries exhausted, not on successful send.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move the retry-aware check before cache_dir creation so we avoid
mkdir when the retry system handles persistence.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The retry callback now receives a sentry_envelope_t and returns a
status code. The retry system handles deserialization and file
lifecycle internally, keeping path concerns out of the transport.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add test case for successful send at max retry count with cache_keep
enabled, confirming envelopes are cached regardless of send outcome.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
jpnurmi and others added 2 commits February 13, 2026 18:47
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…lopes

The startup poll used `ts >= startup_time` to skip envelopes written
after startup. With second-precision timestamps, this also skipped
cross-session envelopes written in the same second as a fast restart.

Reset `startup_time` in `sentry__retry_enqueue` so the startup poll
falls through to the backoff path for same-session envelopes. The
bgworker processes the send task (immediate) before the startup poll
(delayed), so by the time the poll fires, `startup_time` is already 0.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
jpnurmi force-pushed the jpnurmi/feat/http-retry branch from b083a57 to a264f66 on February 13, 2026 17:47
jpnurmi and others added 20 commits February 14, 2026 10:40
Submit a one-shot retry send task before bgworker shutdown to ensure
pre-existing retry files are sent even if the startup poll hasn't
fired yet. The flush checks startup_time on the worker thread to
avoid re-sending files already handled by enqueue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace `time(NULL)` (1-second granularity) with `sentry__usec_time() / 1000`
(millisecond granularity) to avoid timestamp collisions that caused flaky
`>=` vs `>` comparison behavior in CI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Make sentry__retry_flush block until the flush task completes by adding
a bgworker_flush call, and subtract the elapsed time from the shutdown
timeout. This ensures retries are actually sent before the worker stops.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Break out of the send loop on the first network error to avoid wasting
time on a dead connection. Remaining envelopes stay untouched for the
next retry poll.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When bgworker shutdown times out, persist any remaining queued envelopes
to the retry directory so they are not lost. The retry module provides
sentry__retry_dump_queue to keep retry internals out of the transport.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
After shutdown timeout, the bgworker thread is detached but may still
be executing an http_send_task. Since dump_queue already saves that
task's envelope to the retry dir, the worker's subsequent call to
retry_enqueue would create a duplicate file. Seal the retry module
after dumping so that any late enqueue calls are silently skipped.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… logic

Remove count_eligible_files helper that duplicated filtering logic
from sentry__retry_send. The retry_backoff test now exercises the
actual send path for both backoff and startup modes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Store parsed fields (ts, count, uuid) alongside the path during the
filter phase so handle_result and future debug logging can use them
without re-parsing. Also improves sort performance by comparing
numeric fields before falling back to string comparison.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Log retry attempts at DEBUG level and max-retries-reached at WARN
level to make retry behavior observable.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…writes

Three places independently constructed <database>/cache and wrote
envelopes there. Add cache_path to sentry_run_t and introduce
sentry__run_write_cache() and sentry__run_move_cache() to centralize
the cache directory creation and file operations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CURLOPT_TIMEOUT_MS is a total transfer timeout that could cut off large
envelopes. Use CURLOPT_CONNECTTIMEOUT_MS instead so only connection
establishment is bounded. For winhttp, limit resolve and connect to 15s
but leave send/receive at their defaults.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Without this, sentry__retry_send overcounts remaining files, causing an
unnecessary extra poll cycle.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Restructure handle_result so "max retries reached" warnings only fire
on actual network failures, not on successful delivery at the last
attempt. Separate the warning logic from the cache/discard actions and
put the re-enqueue branch first for clarity.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the `can_retry` bool on the transport with a `retry_func`
callback, and expose `sentry_transport_retry()` as an experimental
public API for explicitly retrying all pending envelopes, e.g. when
coming back online.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move retry envelopes from a separate retry/ directory into cache/ so
that sentry__cleanup_cache() enforces disk limits for both file formats
out of the box. The two formats are distinguishable by length: retry
files use <ts>-<count>-<uuid>.envelope (49+ chars) while cache files
use <uuid>.envelope (45 chars). Default http_retries to 0 (opt-in).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>