Version Packages by github-actions[bot] · Pull Request #1822 · cloudflare/agents

github-actions · 2026-06-27T12:18:06Z

This PR was opened by the Changesets release GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine: whenever you add more changesets to main, this PR will be updated.

Releases

agents@0.17.1

Patch Changes

#1826 1bbd9bc Thanks @threepointone! - Add a tight, OOM-specific retry budget to chat recovery so a memory-limit crash loop seals fast and attributably (#1825).

When a recovery turn hits a Durable Object memory-limit reset (the isolate exceeded its 128 MB limit), recovery now classifies it as a distinct, deterministic failure rather than a deploy-style transient. A memory reset re-OOMs on re-run (the turn's working set, not the platform, is the cause), so it must NOT be deferred and retried forever like a code-update/connection-lost transient. Each such crash bumps a durable per-incident oomAttempts counter; recovery retries a small number of times (new chatRecovery.maxOomRetries, default 3) — in case the OOM was a transient spike — then seals with reason="out_of_memory". This is far tighter than the generic maxRecoveryWork backstop because an OOM is attributable and each re-run re-runs the model.

This complements the finite maxRecoveryWork default: the OOM budget is the fast path for memory resets that surface as catchable errors thrown from recovery bookkeeping (e.g. storage/SQL rejections after the reset), while maxRecoveryWork remains a backstop for the hard-kill case where no in-isolate code runs to record the OOM.
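The budget described above can be read as a small counter-driven state machine. The following is a minimal sketch under stated assumptions, not the library's implementation: `RecoveryIncident`, `Decision`, and `onOomCrash` are hypothetical names, and only `oomAttempts`, `maxOomRetries` (default 3), and the `out_of_memory` reason come from the changelog. It also assumes the first `maxOomRetries` crashes are retried and the next one seals, which the changelog does not spell out.

```typescript
// Hypothetical model of the OOM retry budget; not the library's code.
interface RecoveryIncident {
  oomAttempts: number; // durable per-incident counter (name from the changelog)
  sealedReason?: "out_of_memory";
}

type Decision =
  | { action: "retry" }
  | { action: "seal"; reason: "out_of_memory" };

const DEFAULT_MAX_OOM_RETRIES = 3; // default stated in the changelog

function onOomCrash(
  incident: RecoveryIncident,
  maxOomRetries: number = DEFAULT_MAX_OOM_RETRIES
): Decision {
  incident.oomAttempts += 1;
  // Assumption: the first `maxOomRetries` crashes are retried; the next seals.
  if (incident.oomAttempts <= maxOomRetries) {
    return { action: "retry" };
  }
  incident.sealedReason = "out_of_memory";
  return { action: "seal", reason: "out_of_memory" };
}

// Four consecutive memory-limit crashes within one incident.
const incident: RecoveryIncident = { oomAttempts: 0 };
const decisions = [1, 2, 3, 4].map(() => onOomCrash(incident).action);
console.log(decisions.join(","));
```

The point of the sketch is that the counter is keyed to the incident rather than to streamed progress, so a crash that streamed some content before the reset still counts against the budget.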

Adds an alarm-boundary circuit breaker (agents) as the universal backstop for the case the in-DO budgets can't catch (#1825): a memory-limit reset that bypasses them entirely — thrown before the budget code runs (e.g. boot-time state hydration OOMs), or whose own small writes also OOM under memory pressure. Left unhandled, such an error propagates out of alarm() and the platform auto-retries the alarm forever, re-running the doomed, billable turn each cycle. Agent.alarm() now intercepts ONLY Durable Object memory-limit resets at the outermost frame — where the heavy turn has unwound and GC has reclaimed its footprint, so the seal/purge writes can land where mid-turn ones OOMed. A durable strike counter tolerates a few resets (new static options.maxAlarmMemoryLimitStrikes, default 3) — backing off the looping rows so the retry is not a hot loop — then seals the recovery (out_of_memory) and surgically purges only the looping schedule rows, leaving unrelated scheduled tasks intact. A new alarm:memory_limit_reset observability event is emitted. Everything except memory-limit resets re-throws exactly as before.
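The circuit breaker can be pictured as a guard around the outermost alarm frame. This is a simplified, synchronous sketch under assumptions: `BreakerState`, `ScheduleRow`, `guardedAlarm`, and the inline message check are hypothetical stand-ins, and only `maxAlarmMemoryLimitStrikes` (default 3), the `out_of_memory` seal, and the "purge only the looping rows" behavior are taken from the changelog. Backoff is represented by a return value rather than a real delay.

```typescript
// Hypothetical model of an alarm-boundary circuit breaker; not the library's code.
interface ScheduleRow {
  id: string;
  looping: boolean; // true for rows that keep re-triggering the doomed turn
}

interface BreakerState {
  strikes: number; // stand-in for the durable strike counter
  rows: ScheduleRow[]; // stand-in for the schedule table
  sealed?: "out_of_memory";
}

// Assumption: a memory-limit reset is recognizable from its message.
function looksLikeMemoryLimitReset(error: unknown): boolean {
  return (
    error instanceof Error &&
    error.message.includes("exceeded its memory limit")
  );
}

function guardedAlarm(
  state: BreakerState,
  runAlarm: () => void,
  maxAlarmMemoryLimitStrikes = 3
): "ok" | "backed_off" | "sealed" {
  try {
    runAlarm();
    return "ok";
  } catch (error) {
    // Everything except memory-limit resets re-throws unchanged.
    if (!looksLikeMemoryLimitReset(error)) throw error;
    state.strikes += 1;
    if (state.strikes <= maxAlarmMemoryLimitStrikes) {
      // A real implementation would push the looping rows into the future here.
      return "backed_off";
    }
    state.sealed = "out_of_memory";
    // Purge only the looping rows; unrelated scheduled tasks stay intact.
    state.rows = state.rows.filter((row) => !row.looping);
    return "sealed";
  }
}

const breaker: BreakerState = {
  strikes: 0,
  rows: [
    { id: "chat-recovery", looping: true },
    { id: "daily-report", looping: false },
  ],
};
const alwaysOom = (): void => {
  throw new Error("isolate exceeded its memory limit and was reset");
};
const outcomes = [1, 2, 3, 4].map(() => guardedAlarm(breaker, alwaysOom));
console.log(outcomes.join(","), breaker.rows.map((r) => r.id).join(","));
```

Catching at the outermost frame matters because, as the entry explains, the heavy turn has already unwound by then, so the small seal and purge writes have room to succeed.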

Also broadens and exports the isDurableObjectMemoryLimitReset(error) predicate from agents (a sibling to isDurableObjectCodeUpdateReset / isPlatformTransientError): it now matches the shared "exceeded its memory limit" fragment so truncated/reworded surfacings (observed in real #1825 logs) still classify.
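To illustrate why matching the shared fragment is more robust than matching a full message, here is a small standalone comparison. It is a hypothetical re-creation for illustration only: the real predicate is the `isDurableObjectMemoryLimitReset` export from `agents`, whose internals are not shown in this changelog, and `FULL_MESSAGE` is an assumed example wording.

```typescript
// Illustrative only; the helpers below are hypothetical, not the `agents` export.
const FULL_MESSAGE =
  "Durable Object's isolate exceeded its memory limit and was reset."; // assumed wording
const SHARED_FRAGMENT = "exceeded its memory limit"; // fragment quoted in the changelog

function messageOf(error: unknown): string {
  if (error instanceof Error) return error.message;
  return typeof error === "string" ? error : "";
}

// Brittle: only the exact wording classifies.
function matchesExactly(error: unknown): boolean {
  return messageOf(error) === FULL_MESSAGE;
}

// Tolerant: truncated or reworded surfacings still classify.
function matchesFragment(error: unknown): boolean {
  return messageOf(error).includes(SHARED_FRAGMENT);
}

const truncated = new Error("isolate exceeded its memory limit");
const unrelated = new Error("Network connection lost.");

console.log(matchesExactly(truncated), matchesFragment(truncated));
console.log(matchesFragment(unrelated));
```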
#1826 1bbd9bc Thanks @threepointone! - Fix never-ending chat-recovery retries when a Durable Object isolate runs out of memory mid-turn (#1825).

chatRecovery.maxRecoveryWork now defaults to a generous finite backstop (1000) instead of Infinity. An isolate that exceeds its memory limit and is reset mid-stream has usually already streamed a little content, which bumps the durable progress counter. On the next wake recovery reads that as forward progress and resets both progress-keyed bounds — the attempt cap (maxAttempts) and the no-progress window (noProgressTimeoutMs) — and because each crash lands inside the alarm-debounce window the attempt counter is pinned too. With the work budget disabled (Infinity), no instrument could ever seal the turn, so recovery re-ran the turn (and its LLM calls) forever. The work meter is the one signal that keeps climbing across such a loop, so a finite default seals a runaway with reason="work_budget_exceeded" instead of looping.

Work only accrues from the first interruption until the turn completes, so a normal interrupted turn never approaches the cap. A very long agentic turn that legitimately produces a large amount of content under heavy interruption can raise maxRecoveryWork (or set it to Infinity to restore the previous fully-unbounded behavior, ideally paired with a shouldKeepRecovering predicate that bounds the runaway via real token/cost accounting).
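The loop described above can be reproduced in a toy simulation. This is a sketch under assumptions rather than the recovery code itself: `RecoveryState`, `onWake`, `wakesUntilSealed`, and the `max_attempts` reason are hypothetical, work is counted in abstract units (the changelog does not define the unit), and each simulated crash is assumed to stream one unit of content before the reset.

```typescript
// Toy model of why a progress-keyed attempt cap alone cannot stop an OOM loop.
interface RecoveryState {
  attempts: number;
  work: number; // climbs across the whole incident, never reset by progress
}

type Outcome =
  | { sealed: false }
  | { sealed: true; reason: "max_attempts" | "work_budget_exceeded" };

function onWake(
  state: RecoveryState,
  madeProgress: boolean,
  workDone: number,
  opts: { maxAttempts: number; maxRecoveryWork: number }
): Outcome {
  // Progress resets the attempt counter, which is what kept the loop alive.
  state.attempts = madeProgress ? 1 : state.attempts + 1;
  state.work += workDone;
  if (state.attempts > opts.maxAttempts) {
    return { sealed: true, reason: "max_attempts" }; // hypothetical reason name
  }
  if (state.work > opts.maxRecoveryWork) {
    return { sealed: true, reason: "work_budget_exceeded" };
  }
  return { sealed: false };
}

// Returns the wake on which the turn seals, or null if it never does
// within the simulated horizon.
function wakesUntilSealed(maxRecoveryWork: number, horizon = 10000): number | null {
  const state: RecoveryState = { attempts: 0, work: 0 };
  for (let wake = 1; wake <= horizon; wake++) {
    const outcome = onWake(state, true, 1, { maxAttempts: 3, maxRecoveryWork });
    if (outcome.sealed) return wake;
  }
  return null;
}

console.log(wakesUntilSealed(Infinity)); // previous default: never seals
console.log(wakesUntilSealed(1000)); // new default: seals once work exceeds the cap
```

In this model the attempt counter never rises above 1 because every crash registers progress, so the work meter is the only bound that can fire.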

@cloudflare/ai-chat@0.9.1

Patch Changes

#1826 1bbd9bc Thanks @threepointone! - Add a tight, OOM-specific retry budget to chat recovery so a memory-limit crash loop seals fast and attributably (#1825).

When a recovery turn hits a Durable Object memory-limit reset (the isolate exceeded its 128 MB limit), recovery now classifies it as a distinct, deterministic failure rather than a deploy-style transient. A memory reset re-OOMs on re-run (the turn's working set, not the platform, is the cause), so it must NOT be deferred and retried forever like a code-update/connection-lost transient. Each such crash bumps a durable per-incident oomAttempts counter; recovery retries a small number of times (new chatRecovery.maxOomRetries, default 3) — in case the OOM was a transient spike — then seals with reason="out_of_memory". This is far tighter than the generic maxRecoveryWork backstop because an OOM is attributable and each re-run re-runs the model.

This complements the finite maxRecoveryWork default: the OOM budget is the fast path for memory resets that surface as catchable errors thrown from recovery bookkeeping (e.g. storage/SQL rejections after the reset), while maxRecoveryWork remains a backstop for the hard-kill case where no in-isolate code runs to record the OOM.

Adds an alarm-boundary circuit breaker (agents) as the universal backstop for the case the in-DO budgets can't catch (#1825): a memory-limit reset that bypasses them entirely — thrown before the budget code runs (e.g. boot-time state hydration OOMs), or whose own small writes also OOM under memory pressure. Left unhandled, such an error propagates out of alarm() and the platform auto-retries the alarm forever, re-running the doomed, billable turn each cycle. Agent.alarm() now intercepts ONLY Durable Object memory-limit resets at the outermost frame — where the heavy turn has unwound and GC has reclaimed its footprint, so the seal/purge writes can land where mid-turn ones OOMed. A durable strike counter tolerates a few resets (new static options.maxAlarmMemoryLimitStrikes, default 3) — backing off the looping rows so the retry is not a hot loop — then seals the recovery (out_of_memory) and surgically purges only the looping schedule rows, leaving unrelated scheduled tasks intact. A new alarm:memory_limit_reset observability event is emitted. Everything except memory-limit resets re-throws exactly as before.

Also broadens and exports the isDurableObjectMemoryLimitReset(error) predicate from agents (a sibling to isDurableObjectCodeUpdateReset / isPlatformTransientError): it now matches the shared "exceeded its memory limit" fragment so truncated/reworded surfacings (observed in real #1825 logs) still classify.
#1826 1bbd9bc Thanks @threepointone! - Fix never-ending chat-recovery retries when a Durable Object isolate runs out of memory mid-turn (#1825).

chatRecovery.maxRecoveryWork now defaults to a generous finite backstop (1000) instead of Infinity. An isolate that exceeds its memory limit and is reset mid-stream has usually already streamed a little content, which bumps the durable progress counter. On the next wake recovery reads that as forward progress and resets both progress-keyed bounds — the attempt cap (maxAttempts) and the no-progress window (noProgressTimeoutMs) — and because each crash lands inside the alarm-debounce window the attempt counter is pinned too. With the work budget disabled (Infinity), no instrument could ever seal the turn, so recovery re-ran the turn (and its LLM calls) forever. The work meter is the one signal that keeps climbing across such a loop, so a finite default seals a runaway with reason="work_budget_exceeded" instead of looping.

Work only accrues from the first interruption until the turn completes, so a normal interrupted turn never approaches the cap. A very long agentic turn that legitimately produces a large amount of content under heavy interruption can raise maxRecoveryWork (or set it to Infinity to restore the previous fully-unbounded behavior, ideally paired with a shouldKeepRecovering predicate that bounds the runaway via real token/cost accounting).

@cloudflare/think@0.11.1

Patch Changes

#1826 1bbd9bc Thanks @threepointone! - Add a tight, OOM-specific retry budget to chat recovery so a memory-limit crash loop seals fast and attributably (#1825).

When a recovery turn hits a Durable Object memory-limit reset (the isolate exceeded its 128 MB limit), recovery now classifies it as a distinct, deterministic failure rather than a deploy-style transient. A memory reset re-OOMs on re-run (the turn's working set, not the platform, is the cause), so it must NOT be deferred and retried forever like a code-update/connection-lost transient. Each such crash bumps a durable per-incident oomAttempts counter; recovery retries a small number of times (new chatRecovery.maxOomRetries, default 3) — in case the OOM was a transient spike — then seals with reason="out_of_memory". This is far tighter than the generic maxRecoveryWork backstop because an OOM is attributable and each re-run re-runs the model.

This complements the finite maxRecoveryWork default: the OOM budget is the fast path for memory resets that surface as catchable errors thrown from recovery bookkeeping (e.g. storage/SQL rejections after the reset), while maxRecoveryWork remains a backstop for the hard-kill case where no in-isolate code runs to record the OOM.

Adds an alarm-boundary circuit breaker (agents) as the universal backstop for the case the in-DO budgets can't catch (#1825): a memory-limit reset that bypasses them entirely — thrown before the budget code runs (e.g. boot-time state hydration OOMs), or whose own small writes also OOM under memory pressure. Left unhandled, such an error propagates out of alarm() and the platform auto-retries the alarm forever, re-running the doomed, billable turn each cycle. Agent.alarm() now intercepts ONLY Durable Object memory-limit resets at the outermost frame — where the heavy turn has unwound and GC has reclaimed its footprint, so the seal/purge writes can land where mid-turn ones OOMed. A durable strike counter tolerates a few resets (new static options.maxAlarmMemoryLimitStrikes, default 3) — backing off the looping rows so the retry is not a hot loop — then seals the recovery (out_of_memory) and surgically purges only the looping schedule rows, leaving unrelated scheduled tasks intact. A new alarm:memory_limit_reset observability event is emitted. Everything except memory-limit resets re-throws exactly as before.

Also broadens and exports the isDurableObjectMemoryLimitReset(error) predicate from agents (a sibling to isDurableObjectCodeUpdateReset / isPlatformTransientError): it now matches the shared "exceeded its memory limit" fragment so truncated/reworded surfacings (observed in real #1825 logs) still classify.
#1826 1bbd9bc Thanks @threepointone! - Fix never-ending chat-recovery retries when a Durable Object isolate runs out of memory mid-turn (#1825).

chatRecovery.maxRecoveryWork now defaults to a generous finite backstop (1000) instead of Infinity. An isolate that exceeds its memory limit and is reset mid-stream has usually already streamed a little content, which bumps the durable progress counter. On the next wake recovery reads that as forward progress and resets both progress-keyed bounds — the attempt cap (maxAttempts) and the no-progress window (noProgressTimeoutMs) — and because each crash lands inside the alarm-debounce window the attempt counter is pinned too. With the work budget disabled (Infinity), no instrument could ever seal the turn, so recovery re-ran the turn (and its LLM calls) forever. The work meter is the one signal that keeps climbing across such a loop, so a finite default seals a runaway with reason="work_budget_exceeded" instead of looping.

Work only accrues from the first interruption until the turn completes, so a normal interrupted turn never approaches the cap. A very long agentic turn that legitimately produces a large amount of content under heavy interruption can raise maxRecoveryWork (or set it to Infinity to restore the previous fully-unbounded behavior, ideally paired with a shouldKeepRecovering predicate that bounds the runaway via real token/cost accounting).
#1821 de6a695 Thanks @threepointone! - Add an opt-in, read-only HTTP fetch capability for Think agents via the new @cloudflare/think/tools/fetch export and a fetchTools property on Think.

createFetchTools() generates a generic, allowlisted fetch_url tool plus one fetch_<name> tool per named service-binding/Fetcher target. It is GET-only with Workers-grounded SSRF defenses (private/loopback/link-local/*.internal blocking, URL normalization, credential rejection), separate download/model/workspace size limits (maxBytes, maxModelChars, response: "workspace" spill), an allowlist-aware redirect policy with cross-origin header stripping, a model header allowlist, and a tool:fetch observability event. Disabled by default.
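For readers unfamiliar with the SSRF defenses listed above, the following sketch shows the general shape of such a pre-flight URL check. It is not the code behind `createFetchTools()`: `checkFetchTarget`, `isPrivateIPv4`, the `Verdict` type, and the reason strings are all hypothetical, and the sketch conservatively rejects every IPv6 literal instead of classifying IPv6 ranges. It relies only on the standard `URL` parser, which lowercases hostnames and canonicalizes IPv4 forms before the checks run.

```typescript
// Hypothetical pre-flight check in the spirit of the defenses listed above.
type Verdict = { ok: true; url: string } | { ok: false; reason: string };

function isPrivateIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 || // "this network"
    a === 10 || // private
    a === 127 || // loopback
    (a === 172 && b >= 16 && b <= 31) || // private
    (a === 192 && b === 168) || // private
    (a === 169 && b === 254) // link-local
  );
}

function checkFetchTarget(raw: string, allowedHosts: string[]): Verdict {
  let url: URL;
  try {
    url = new URL(raw); // normalizes scheme, host case, and IPv4 notation
  } catch {
    return { ok: false, reason: "invalid_url" };
  }
  if (url.protocol !== "https:" && url.protocol !== "http:") {
    return { ok: false, reason: "unsupported_scheme" };
  }
  if (url.username !== "" || url.password !== "") {
    return { ok: false, reason: "credentials_in_url" };
  }
  const host = url.hostname;
  if (
    host === "localhost" ||
    host.endsWith(".internal") ||
    host.startsWith("[") || // assumption: reject all IPv6 literals in this sketch
    isPrivateIPv4(host)
  ) {
    return { ok: false, reason: "blocked_host" };
  }
  if (!allowedHosts.includes(host)) {
    return { ok: false, reason: "not_allowlisted" };
  }
  return { ok: true, url: url.href };
}

console.log(checkFetchTarget("https://example.com/docs", ["example.com"]));
console.log(checkFetchTarget("http://127.0.0.1/admin", ["example.com"]));
```

A check like this only covers the initial URL; the redirect policy and size limits mentioned in the entry would need to be enforced separately at fetch time.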
#1823 b58b5a3 Thanks @threepointone! - Improve Think's tool-call lifecycle hooks (follow-ups from #1343):
- Preserve preliminary streaming through beforeToolCall. Tools whose execute is an async generator (async function* execute(...)) now stream their preliminary tool-results to the model even though Think wraps execute to consult beforeToolCall first. Non-streaming tools keep a scalar wrapper, so they never emit a synthetic preliminary chunk. The non-canonical async () => makeIterator() form (a Promise<AsyncIterable>) still collapses to its last yielded value, matching the raw AI SDK.
- Per-tool typing on the lifecycle contexts. When an explicit TOOLS generic is passed, narrowing on ctx.toolName now narrows ctx.input on beforeToolCall and — new — ctx.output on afterToolCall's success branch to that tool's inferred output type. Dynamic tools stay unknown. Behavior with the default ToolSet is unchanged.
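The per-tool narrowing in the second bullet follows a common TypeScript pattern: a mapped type that produces a discriminated union keyed on the tool name. The sketch below shows that pattern in isolation. `ToolShapes`, `AfterToolCallSuccess`, and `describeResult` are hypothetical; Think's actual context types and its `TOOLS` generic are not reproduced here.

```typescript
// Hypothetical shapes showing how narrowing on a tool name can narrow the output.
type ToolShapes = {
  get_weather: { input: { city: string }; output: { tempC: number } };
  search: { input: { query: string }; output: { hits: string[] } };
};

// One union member per tool, discriminated by `toolName`.
type AfterToolCallSuccess<T> = {
  [K in keyof T]: T[K] extends { input: infer I; output: infer O }
    ? { toolName: K; success: true; input: I; output: O }
    : never;
}[keyof T];

function describeResult(ctx: AfterToolCallSuccess<ToolShapes>): string {
  if (ctx.toolName === "get_weather") {
    // Here ctx.output is narrowed to { tempC: number }.
    return `temp ${ctx.output.tempC}C`;
  }
  // Only the `search` member remains, so ctx.output is { hits: string[] }.
  return `${ctx.output.hits.length} hits`;
}

console.log(
  describeResult({
    toolName: "get_weather",
    success: true,
    input: { city: "Oslo" },
    output: { tempC: 12 },
  })
);
```

With a tool set whose members are not statically known, there is no union to discriminate on, which is consistent with the entry's note that dynamic tools stay `unknown`.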

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.

devin-ai-integration Bot reviewed Jun 27, 2026


github-actions Bot force-pushed the changeset-release/main branch 4 times, most recently from 1f7a6e1 to 3dedab6, on June 28, 2026 11:12

Version Packages

bcc9f1a

github-actions Bot force-pushed the changeset-release/main branch from 3dedab6 to bcc9f1a on June 28, 2026 11:15

threepointone added 2 commits June 28, 2026 04:17

raise agents floor

1634eac

Update pnpm-lock.yaml

69876d5

threepointone merged commit 688d722 into main Jun 28, 2026
6 checks passed

threepointone deleted the changeset-release/main branch June 28, 2026 11:18
