fix: ignore negative Retry-After header instead of disabling 429 back-off #1995
Merged
…r_header`
The `delay-seconds` branch accepted a negative value (e.g. `Retry-After: -5`)
and returned a negative `timedelta`, while the HTTP-date branch already rejects
non-positive delays. RFC 7231 defines `delay-seconds` as a non-negative integer.
A negative delta flows into `ThrottlingRequestManager.record_domain_delay`,
setting `throttled_until` in the past, so `_is_domain_throttled` returns False
and the server's HTTP 429 backoff is silently skipped. Guard the integer branch
to ignore negative values (falling through to `None`), consistent with the
HTTP-date branch. `0` ("retry immediately") and positive values are unchanged.
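A minimal sketch of the guarded parsing described in the commit message. The function name `parse_retry_after_sketch` and its exact structure are assumptions for illustration, not the crawlee source; only the `seconds >= 0` guard and the `delay.total_seconds() > 0` check are taken from this PR.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after_sketch(value: str | None) -> timedelta | None:
    """Hypothetical stand-in for parse_retry_after_header (not the real code)."""
    if not value:
        return None

    # delay-seconds branch: accept only non-negative integers.
    try:
        seconds = int(value)
    except ValueError:
        pass  # Not an integer, so try the HTTP-date form below.
    else:
        if seconds >= 0:
            return timedelta(seconds=seconds)
        # A negative value is malformed; fall through instead of returning it.

    # HTTP-date branch: accept only dates that lie in the future.
    try:
        retry_at = parsedate_to_datetime(value)
    except (ValueError, TypeError):
        # Older Python versions raise TypeError for unparsable input.
        return None
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    delay = retry_at - datetime.now(timezone.utc)
    return delay if delay.total_seconds() > 0 else None


print(parse_retry_after_sketch('-5'))   # None
print(parse_retry_after_sketch('0'))    # 0:00:00
print(parse_retry_after_sketch('120'))  # 0:02:00
```

The `try`/`except`/`else` shape keeps the `int()` conversion as the only statement that can trigger the fallback, so the guard itself cannot be mistaken for a parse failure.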
Pijukatel
approved these changes
Jun 30, 2026
Pijukatel
left a comment
Collaborator
Looks good, thank you for the fix.
vdusek
approved these changes
Jun 30, 2026
vdusek
left a comment
Collaborator
LGTM - I just used try-except-else structure
Retry-After delay-seconds in parse_retry_after_header
Description
`parse_retry_after_header` (`src/crawlee/_utils/http.py`) accepted a negative `Retry-After: -N` value and returned a negative `timedelta`, while the HTTP-date branch in the same function already rejects non-positive delays (`if delay.total_seconds() > 0`). RFC 7231 §7.1.3 defines `delay-seconds` as a non-negative integer, so a value like `-5` is malformed.

This is not cosmetic. The result feeds `ThrottlingRequestManager.record_domain_delay`, where `state.throttled_until = datetime.now(timezone.utc) + delay` (`src/crawlee/request_loaders/_throttling_request_manager.py`). A negative `delay` sets `throttled_until` in the past, so `_is_domain_throttled` returns `False` and the server's HTTP 429 back-off is silently skipped, i.e. the crawler does not rate-limit itself after a 429 that carries a malformed negative `Retry-After`.

The fix guards the integer branch with `if seconds >= 0:`. A negative value now falls through to the HTTP-date branch (where `parsedate_to_datetime('-5')` raises and is caught) and the function returns `None`, consistent with the HTTP-date branch. `0` ("retry immediately") and all positive values are unchanged.

Issues
per-domain 429 throttling added by feat: Add opt-in per-domain request throttling for HTTP 429 backoff #1762.
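To make the failure mode in the description concrete, here is a small sketch of how a negative delay defeats the throttle. The helpers `throttled_until_after` and `is_throttled` are hypothetical; they only mirror the quoted assignment `throttled_until = now + delay` and assume the check compares the current time against that deadline.

```python
from datetime import datetime, timedelta, timezone


def throttled_until_after(delay: timedelta, now: datetime) -> datetime:
    # Mirrors the assignment quoted in the description: now + delay.
    return now + delay


def is_throttled(throttled_until: datetime, now: datetime) -> bool:
    # Assumed check: the domain is throttled while the deadline is in the future.
    return now < throttled_until


# Fixed timestamp so the example is deterministic.
now = datetime(2026, 6, 30, 12, 0, tzinfo=timezone.utc)

negative = is_throttled(throttled_until_after(timedelta(seconds=-5), now), now)
positive = is_throttled(throttled_until_after(timedelta(seconds=5), now), now)

print(negative)  # False: the deadline is already in the past, back-off skipped
print(positive)  # True: the domain stays throttled for 5 seconds
```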
Testing
Added two focused unit tests in the existing `# ── Utility Tests ──` block of `tests/unit/test_throttling_request_manager.py`, next to `test_parse_retry_after_integer_seconds`:

- `test_parse_retry_after_negative_seconds`: `parse_retry_after_header('-5') is None` (the bug; fails on `master`, passes with the fix).
- `test_parse_retry_after_zero_seconds`: `parse_retry_after_header('0') == timedelta(0)` (boundary guard so the `>= 0` fix does not regress the valid "retry immediately" case).

Commands run locally (offline):

Fail-first confirmed: with only the source reverted to `master`, the negative-seconds test fails (`assert datetime.timedelta(days=-1, seconds=86395) is None`) while the zero-seconds test still passes.
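The failing assertion above shows `timedelta(days=-1, seconds=86395)` rather than `-5` because `timedelta` normalizes negative durations so that only the `days` field is negative. A short standard-library check, independent of crawlee, illustrates this:

```python
from datetime import timedelta

negative = timedelta(seconds=-5)

# -5 seconds is stored as -1 day plus 86395 seconds (86400 - 5).
print(repr(negative))            # datetime.timedelta(days=-1, seconds=86395)
print(negative.total_seconds())  # -5.0
```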
Checklist