fix(context): prevent context overflow by clamping max_tokens before API call #116
Merged
yishuiliunian merged 1 commit into main on Apr 16, 2026
Conversation
…API call

The budget system capped output_reserve at 16K for compaction thresholds but sent the full max_output_tokens (64K) to the API, creating a ~37K token gap where input could grow beyond the API's hard constraint (input + max_tokens <= context_window).

Four-layer defense:

- Add max_output_tokens to ContextBudget with clamp_output_tokens() for pre-flight validation against the real API constraint
- Estimate input tokens before each LLM call and dynamically reduce max_tokens when headroom is insufficient
- Fix ContextOverflow detection to match the "exceed context limit" pattern from the Anthropic API (previously unrecognized, classified as a generic 400)
- Add runtime recovery: catch ContextOverflow, emergency compact, retry once instead of permanently blocking the agent
Summary
Prevents the `input_tokens + max_tokens > context_window` API rejection that permanently blocks the agent.

Changes
Layer 1 — Budget awareness (`crates/loopal-context/src/budget.rs`)

- Add `max_output_tokens` field to `ContextBudget` (uncapped, for API constraint validation)
- Add `clamp_output_tokens(estimated_input)` method for pre-flight max_tokens clamping
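The Layer 1 clamp can be sketched as follows. The struct is reduced to the two fields the constraint needs; the `ContextBudget` and `clamp_output_tokens` names come from this PR's description, while the field set and exact signature are illustrative assumptions, not the crate's real definitions.

```rust
/// Minimal sketch of the budget-side clamp (illustrative fields only).
pub struct ContextBudget {
    /// Total model context window shared by input and output tokens.
    pub context_window: u32,
    /// Requested output budget (uncapped, e.g. 64K); used only to
    /// validate against the real API constraint.
    pub max_output_tokens: u32,
}

impl ContextBudget {
    /// Largest max_tokens value the API will accept for this input size,
    /// enforcing `estimated_input + max_tokens <= context_window`.
    pub fn clamp_output_tokens(&self, estimated_input: u32) -> u32 {
        // saturating_sub avoids underflow when input already exceeds the window.
        let headroom = self.context_window.saturating_sub(estimated_input);
        self.max_output_tokens.min(headroom)
    }
}
```

With a 200K window and a 64K output request, a 190K-token input clamps the output budget down to 10K instead of letting the API reject the call.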
Layer 2 — Pre-flight validation (`crates/loopal-runtime/src/agent_loop/llm_params.rs`)

- Estimate input tokens and clamp `max_tokens` via `clamp_output_tokens()` when headroom is tight
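A self-contained sketch of the pre-flight step. The ~4-characters-per-token heuristic and both function names (`estimate_tokens`, `preflight_max_tokens`) are assumptions for illustration; the real crate may use a proper tokenizer and different names.

```rust
/// Rough input-size estimate (assumed heuristic: ~4 chars per token,
/// rounded up). Real tokenizers vary by model.
fn estimate_tokens(text: &str) -> u32 {
    (text.len() as u32 + 3) / 4
}

/// Before each LLM call, shrink the requested max_tokens so that
/// input + output still fits inside the context window.
fn preflight_max_tokens(prompt: &str, context_window: u32, requested_max: u32) -> u32 {
    let headroom = context_window.saturating_sub(estimate_tokens(prompt));
    requested_max.min(headroom)
}
```

The point of doing this per call is that the input grows as the conversation does, so a `max_tokens` that was valid on turn one can violate the constraint on turn fifty.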
Layer 3 — Error detection (`crates/loopal-provider/src/anthropic/mod.rs`, `crates/loopal-error/src/helpers.rs`)

- Add the "exceed context limit" pattern to ContextOverflow detection (previously classified as generic 400)
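The detection fix amounts to a substring match on the provider's error body. A hedged sketch: the "exceed context limit" pattern is the one this PR adds, while the function name and the case-insensitive comparison are defensive assumptions, since provider error strings are not a stable API.

```rust
/// Classify an API error message as a context overflow (illustrative
/// helper; the real classifier lives in loopal-error's helpers).
fn is_context_overflow(api_error_message: &str) -> bool {
    // Case-insensitive match so minor phrasing changes don't silently
    // demote the error back to a generic 400.
    api_error_message
        .to_ascii_lowercase()
        .contains("exceed context limit")
}
```

Without this pattern the error fell through to the generic 400 path, which is why the agent had no way to recognize, let alone recover from, an overflow.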
Layer 4 — Runtime recovery (`crates/loopal-runtime/src/agent_loop/run.rs`)

- Catch ContextOverflow → emergency compact → retry once (prevents permanent agent block)
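The recovery control flow can be sketched with closures standing in for the real agent-loop pieces. `LlmError`, `call_llm`, and `emergency_compact` are hypothetical stand-ins; only the flow itself (catch overflow, compact, retry exactly once) comes from the PR.

```rust
/// Illustrative error type; the real one lives in loopal-error.
#[derive(Debug, PartialEq)]
enum LlmError {
    ContextOverflow,
    Other(String),
}

/// Run one LLM call with single-retry overflow recovery. `call_llm` and
/// `emergency_compact` are placeholders for the runtime's real operations.
fn call_with_recovery<F, C>(mut call_llm: F, mut emergency_compact: C) -> Result<String, LlmError>
where
    F: FnMut() -> Result<String, LlmError>,
    C: FnMut(),
{
    match call_llm() {
        // Previously this error permanently blocked the agent; now we
        // compact the conversation and retry exactly once. A second
        // overflow propagates to the caller rather than looping.
        Err(LlmError::ContextOverflow) => {
            emergency_compact();
            call_llm()
        }
        other => other,
    }
}
```

Retrying exactly once is the key design choice: if compaction cannot free enough room, the error surfaces instead of spinning in a compact-and-retry loop.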
Test plan

- `bazel build //... --config=clippy` — zero warnings
- `bazel build //... --config=rustfmt` — passes