
fix: stop output_pydantic from leaking into tool-calling loop (#5472) #5821

Open

mahasarabesh wants to merge 2 commits into crewAIInc:main from mahasarabesh:fix/issue-5472-output-pydantic-tool-loop

Conversation


@mahasarabesh mahasarabesh commented May 15, 2026

Fixes #5472

What's wrong

Since v1.9.0, Task(output_pydantic=...) and Task(output_json=...) get mapped onto response_model in the agent executor. This means every LLM call in the tool loop sends both tools and response_format at the same time.

OpenAI tolerates the combination, but vLLM, Gemini, Anthropic, and anything going through LiteLLM's InternalInstructor all treat response_format as the higher priority: the model returns structured JSON immediately and never calls any tools.

The post-processing path (Task._export_output() -> convert_to_model()) already handles converting raw output to Pydantic models after the loop. That's how it worked before v1.9.0 and it works fine.
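For illustration, the conflicting request shape looks roughly like this at the provider level (an OpenAI-style chat-completions call, not the actual crewAI call site; the tool and schema below are made up):

```python
from openai import OpenAI

client = OpenAI()

# Both a tool list AND a response_format go out on the same request.
# Providers that give response_format priority emit schema-shaped JSON
# immediately instead of a tool call, so the tool loop never runs.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Look up the weather, then summarize."}],
    tools=[{
        "type": "function",
        "function": {  # hypothetical tool
            "name": "get_weather",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    response_format={  # derived from output_pydantic before this fix
        "type": "json_schema",
        "json_schema": {
            "name": "Summary",
            "schema": {
                "type": "object",
                "properties": {"text": {"type": "string"}},
                "required": ["text"],
            },
        },
    },
)
```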

What this PR does

  1. agent/core.py -- Only task.response_model (explicit opt-in) gets passed as response_model. output_pydantic and output_json stay in post-processing where they belong (see the sketch after this list).

  2. crew_agent_executor.py -- Pass response_model=None in _invoke_loop_native_tools (sync + async). When tools are present, structured output shouldn't compete with them.

  3. experimental/agent_executor.py -- Same fix for call_llm_native_tools().

  4. Tests -- 4 regression tests covering output_pydantic, output_json, explicit response_model, and _update_executor_parameters.
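In spirit, the core.py change boils down to the following (an illustrative helper, not the actual crewAI code; only the derivation logic is shown):

```python
from pydantic import BaseModel


def derive_response_model(task) -> type[BaseModel] | None:
    """Decide which response_model the executor forwards to LLM calls.

    Before (buggy): fell back to the post-processing formats, so they
    leaked into every tool-loop call:
        return task.response_model or task.output_pydantic or task.output_json

    After: only the explicit opt-in is forwarded; output_pydantic and
    output_json are handled by Task._export_output() after the loop.
    """
    if task is None:
        return None
    return task.response_model
```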

How I tested

  • All 4 new tests pass
  • Full test_agent.py suite passes (84 passed, pre-existing failures from missing optional deps unchanged)
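For reference, the first of the regression checks looks roughly like this (a sketch that assumes the executor exposes a response_model attribute and the usual Agent/Task constructor arguments; not the exact test code):

```python
from pydantic import BaseModel

from crewai import Agent, Task


class Summary(BaseModel):
    text: str


def test_output_pydantic_does_not_set_executor_response_model():
    agent = Agent(role="Writer", goal="Summarize inputs", backstory="A careful writer.")
    task = Task(
        description="Summarize the input",
        expected_output="A short summary",
        agent=agent,
        output_pydantic=Summary,  # post-processing format, not an LLM response_model
    )
    agent.create_agent_executor(task=task)
    # output_pydantic must stay out of the tool-calling loop
    assert agent.agent_executor.response_model is None
```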

Note

There are two existing PRs for this -- #5680 (stops the mapping at source) and #5767 (suppresses during loop iterations). This PR combines both approaches: fix the mapping AND defensively suppress during tool loops as a safety net.

Summary by CodeRabbit

  • Bug Fixes

    • Executor response-model is now derived only from an explicit task response model and no longer falls back to legacy output formats.
    • Disabled response-model parsing during native tool invocation to prevent unnecessary validation of tool-call responses.
  • Tests

    • Added regression tests to ensure correct response-model behavior across task variants and executor updates.


coderabbitai Bot commented May 15, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 0e9636f4-06e4-4b04-9c24-49a6f3ac3adf

📥 Commits

Reviewing files that changed from the base of the PR and between fc9f4c2 and 3ecf73d.

📒 Files selected for processing (4)
  • lib/crewai/src/crewai/agent/core.py
  • lib/crewai/src/crewai/agents/crew_agent_executor.py
  • lib/crewai/src/crewai/experimental/agent_executor.py
  • lib/crewai/tests/agents/test_agent.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • lib/crewai/src/crewai/experimental/agent_executor.py
  • lib/crewai/src/crewai/agents/crew_agent_executor.py
  • lib/crewai/tests/agents/test_agent.py
  • lib/crewai/src/crewai/agent/core.py

📝 Walkthrough

This PR prevents Task.output_pydantic/output_json from setting the agent executor's response_model and ensures native tool-calling paths call get_llm_response with response_model=None so structured-output validation does not interfere with tool invocation.

Changes

Response Model Separation from Tool-Calling Loop

| Layer / File(s) | Summary |
|---|---|
| Agent executor response_model initialization and updates<br>lib/crewai/src/crewai/agent/core.py | create_agent_executor and _update_executor_parameters now derive the executor response_model only from the explicit task.response_model (or None), removing the fallback to task.output_pydantic and task.output_json. |
| Native tool execution response_model fixes<br>lib/crewai/src/crewai/agents/crew_agent_executor.py, lib/crewai/src/crewai/experimental/agent_executor.py | Native tool-calling paths in _invoke_loop_native_tools, _ainvoke_loop_native_tools, and call_llm_native_tools now pass response_model=None to get_llm_response to prevent structured-output validation from interfering with tool invocation. |
| Regression test coverage<br>lib/crewai/tests/agents/test_agent.py | New TestOutputPydanticDoesNotLeakIntoResponseModel class with four test methods verifying that output_pydantic/output_json do not set the executor response_model, that an explicit response_model is still respected, and that executor parameter updates across task types produce correct final values. |
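Put as condensed pseudocode, the separation described above looks like this (names mirror the files in the table, but execute_tools is an illustrative helper and the loop body is simplified):

```python
def run_task(task, llm, messages, tools):
    while True:
        # Tool loop: response_model is always None here, so structured
        # output never competes with tool selection.
        response = get_llm_response(llm, messages, tools, response_model=None)
        if response.tool_calls:
            messages += execute_tools(response.tool_calls)  # illustrative
            continue
        break
    # Post-processing only: output_pydantic / output_json are applied
    # after the loop, via Task._export_output() -> convert_to_model().
    return task._export_output(response.text)
```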

Sequence Diagram(s)

```mermaid
sequenceDiagram
  participant AgentExecutor
  participant LLM
  participant ToolRunner
  AgentExecutor->>LLM: get_llm_response(..., response_model=None)
  LLM->>AgentExecutor: return tool-call list OR final answer
  AgentExecutor->>ToolRunner: run indicated tools (if any)
  ToolRunner->>AgentExecutor: tool outputs
  AgentExecutor->>LLM: optional final formatting or post-processing
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

size/M

Suggested reviewers

  • greysonlalonde

Poem

🐰 I nibbled code beneath the moon,
Where JSON schemas hummed a tune.
I freed the tools to run and play,
Then shaped the answer on its way.
Hooray — the crew can work today!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main fix: preventing output_pydantic from leaking into the tool-calling loop, the primary objective of this PR. |
| Linked Issues check | ✅ Passed | The PR fully addresses the objectives of issue #5472: it removes the output_pydantic/output_json mapping to response_model in agent/core.py, passes response_model=None during tool loops in crew_agent_executor.py and experimental/agent_executor.py, and adds regression tests covering all scenarios. |
| Out of Scope Changes check | ✅ Passed | All changes relate directly to the output_pydantic leakage fix: agent/core.py fixes response_model assignment, the executor files fix tool-loop response_model passing, and the tests verify the fix. No unrelated changes detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which exceeds the required threshold of 80.00%. |


fix: stop output_pydantic from leaking into tool-calling loop (crewAIInc#5472)

Since v1.9.0, Task(output_pydantic=...) and Task(output_json=...) get
mapped onto response_model in the agent executor. This means every LLM
call in the tool loop sends both tools and response_format at the same
time. Non-OpenAI providers (vLLM, Gemini, Anthropic, LiteLLM) treat
response_format as higher priority - the LLM returns structured JSON
immediately and never calls any tools.

Fix:
- agent/core.py: Only task.response_model (explicit opt-in) gets passed
  as response_model. output_pydantic/output_json stay in post-processing.
- crew_agent_executor.py: Pass response_model=None in native tool loops
  (sync + async) so structured output never competes with tools.
- experimental/agent_executor.py: Same fix for call_llm_native_tools().
- 4 regression tests added.
mahasarabesh force-pushed the fix/issue-5472-output-pydantic-tool-loop branch from fc9f4c2 to 3ecf73d on May 17, 2026 at 08:53


Development

Successfully merging this pull request may close these issues.

[BUG] output_pydantic / response_model leaks into agent tool-calling loop, causing tools to be skipped on non-OpenAI LLMs
