fix: suppress response_model during native tool loop for non-OpenAI providers #5767
HrushiYadav wants to merge 3 commits into
Conversation
📝 Walkthrough
The PR defers application of the response model until after the native tool loop completes: tool-iteration LLM calls are made with `response_model=None`, and the schema is applied in a single post-loop extraction call.

Changes: Response Model Post-Tool Application
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant AgentExecutor
    participant LLM
    participant Tools
    AgentExecutor->>LLM: get_llm_response(messages, tools=Tools, response_model=None)
    alt LLM returns tool call
        LLM->>AgentExecutor: tool_call
        AgentExecutor->>Tools: run tool_call
        Tools-->>AgentExecutor: tool_result
        AgentExecutor->>LLM: get_llm_response(updated_messages, tools=Tools, response_model=None)
    else LLM returns final string
        LLM-->>AgentExecutor: final_text
    end
    alt response_model is configured
        AgentExecutor->>LLM: get_llm_response(messages_with_final_text, tools=None, response_model=Model)
        LLM-->>AgentExecutor: BaseModel or string
        AgentExecutor-->>AgentExecutor: wrap into AgentFinish (JSON text if BaseModel else raw string)
    end
```
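Read as code, the diagram boils down to the two-phase sketch below. Everything here (`FakeLLM`, `run_two_phase`, the message handling) is a simplified stand-in invented for illustration; only the ordering of `tools` vs. `response_model` mirrors the PR:

```python
from pydantic import BaseModel

class Weather(BaseModel):
    city: str
    temperature_c: float

class FakeLLM:
    """Stands in for a provider that prioritizes structured output over tools."""
    def call(self, messages, tools=None, response_model=None):
        if response_model is not None:
            return Weather(city="Paris", temperature_c=21.0)  # schema wins
        if tools and messages[-1]["role"] != "tool":
            return [{"name": "get_weather", "arguments": {"city": "Paris"}}]
        return "It is 21.0 C in Paris."

def run_two_phase(llm, messages, tools, response_model):
    # Phase 1: tool loop with the schema suppressed, so tool calls happen.
    while True:
        answer = llm.call(messages, tools=tools, response_model=None)
        if isinstance(answer, list):  # tool calls returned: execute and loop
            messages.append({"role": "tool", "content": "Paris: 21.0 C"})
            continue
        break  # final text answer
    # Phase 2: one extraction call, tools off, schema on.
    if response_model is not None:
        answer = llm.call(messages, tools=None, response_model=response_model)
    return answer

out = run_two_phase(FakeLLM(),
                    [{"role": "user", "content": "Weather in Paris?"}],
                    tools=["get_weather"], response_model=Weather)
assert isinstance(out, Weather)
```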
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 5 passed
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lib/crewai/src/crewai/agents/crew_agent_executor.py (1)
475-520: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Async native-tools flow still has the original regression

The sync path now suppresses `response_model` during tool iterations, but `CrewAgentExecutor._ainvoke_loop_native_tools` still sends `response_model=self.response_model` (Line 1327). Async agents can still skip tools on non-OpenAI providers, so this fix is only partial.

Suggested parity fix for the async native-tools loop:
```diff
@@ async def _ainvoke_loop_native_tools(self) -> AgentFinish:
-        answer = await aget_llm_response(
+        answer = await aget_llm_response(
             llm=cast("BaseLLM", self.llm),
             messages=self.messages,
             callbacks=self.callbacks,
             printer=PRINTER,
             tools=openai_tools,
             available_functions=None,
             from_task=self.task,
             from_agent=self.agent,
-            response_model=self.response_model,
+            response_model=None,
             executor_context=self,
             verbose=self.agent.verbose,
         )
@@
         if isinstance(answer, str):
+            if self.response_model is not None:
+                enforce_rpm_limit(self.request_within_rpm_limit)
+                answer = await aget_llm_response(
+                    llm=cast("BaseLLM", self.llm),
+                    messages=self.messages,
+                    callbacks=self.callbacks,
+                    printer=PRINTER,
+                    from_task=self.task,
+                    from_agent=self.agent,
+                    response_model=self.response_model,
+                    executor_context=self,
+                    verbose=self.agent.verbose,
+                )
+
+            if isinstance(answer, BaseModel):
+                output_json = answer.model_dump_json()
+                formatted_answer = AgentFinish(
+                    thought="",
+                    output=answer,
+                    text=output_json,
+                )
+                await self._ainvoke_step_callback(formatted_answer)
+                self._append_message(output_json)
+                self._show_logs(formatted_answer)
+                return formatted_answer
+
+            answer_str = answer if isinstance(answer, str) else str(answer)
             formatted_answer = AgentFinish(
                 thought="",
-                output=answer,
-                text=answer,
+                output=answer_str,
+                text=answer_str,
             )
             await self._ainvoke_step_callback(formatted_answer)
-            self._append_message(answer)
+            self._append_message(answer_str)
             self._show_logs(formatted_answer)
             return formatted_answer
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/src/crewai/agents/crew_agent_executor.py` around lines 475 - 520, The async native-tools loop still passes self.response_model into get_llm_response, so change _ainvoke_loop_native_tools to mirror the sync path: when invoking get_llm_response inside the tool-iteration loop, set response_model=None (use get_llm_response(..., response_model=None, ...)), detect tool-call lists with _is_tool_call_list and handle them via _handle_native_tool_calls as before, and only after the tool loop completes make one final async get_llm_response call with response_model=self.response_model to apply the schema; ensure you update the calls to get_llm_response and preserve existing arguments (callbacks, printer, executor_context, verbose) and flow control.
🧹 Nitpick comments (1)
lib/crewai/src/crewai/agents/crew_agent_executor.py (1)
509-520: ⚡ Quick win
Rate-limit the post-loop extraction call too

Line 510 adds a second LLM call in the same iteration, but there is no second `enforce_rpm_limit(...)` before it. That can bypass request-throttling accounting.

Suggested guard before the extraction call:
```diff
         if self.response_model is not None:
+            enforce_rpm_limit(self.request_within_rpm_limit)
             answer = get_llm_response(
                 llm=cast("BaseLLM", self.llm),
                 messages=self.messages,
                 callbacks=self.callbacks,
                 printer=PRINTER,
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/src/crewai/agents/crew_agent_executor.py` around lines 509 - 520, The post-loop extraction call using get_llm_response (when self.response_model is not None) is missing the same rate-limit guard used earlier; before invoking get_llm_response in crew_agent_executor.py add a call to enforce_rpm_limit(...) with the same parameters used for the main LLM call so this second request is counted and throttled too (keep the call in the same conditional branch that checks self.response_model and use the same executor/context values such as self.llm, self.task, and self.agent to mirror the existing rate-limit invocation).
ℹ️ Review info
Configuration used: Organization UI | Review profile: CHILL | Plan: Pro Plus
Run ID: 567bc033-6475-4457-a308-ed19ced60766
📒 Files selected for processing (1): lib/crewai/src/crewai/agents/crew_agent_executor.py
Good catch - applied the same fix to `_ainvoke_loop_native_tools`.
fix: suppress response_model during native tool loop for non-OpenAI providers

Since v1.9.0, response_model is passed alongside tools on every iteration of _invoke_loop_native_tools. Providers such as Gemini and Anthropic treat response_format (structured output) as higher priority than tools, so the LLM returns a structured response immediately without making any tool calls.

Fix: pass response_model=None while tools are active. Once the tool loop completes and the LLM returns a text answer, make one final call without tools to apply the structured schema if response_model is set. This restores pre-v1.9.0 behavior where structured output was applied only in post-processing after tool calls were exhausted.

Fixes crewAIInc#5472
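For context, a hypothetical user-level reproduction of the regression, assuming the standard crewAI public API. The model id, tool body, and outcome comments are illustrative; `output_pydantic` is, as I read the code paths, what ultimately populates the executor's `response_model`:

```python
from pydantic import BaseModel
from crewai import Agent, Crew, Task
from crewai.tools import tool

class CityReport(BaseModel):
    city: str
    temperature_c: float

@tool("get_weather")
def get_weather(city: str) -> str:
    """Return the current temperature for a city."""
    return f"{city}: 21.0 C"  # stub data for illustration

agent = Agent(
    role="Weather reporter",
    goal="Report live weather using tools",
    backstory="Always fetches data via get_weather before answering.",
    llm="gemini/gemini-2.0-flash",  # non-OpenAI provider (illustrative id)
    tools=[get_weather],
)
task = Task(
    description="What is the current temperature in Paris? Use the tool.",
    expected_output="A structured city report",
    output_pydantic=CityReport,  # requests structured output
    agent=agent,
)

# Pre-fix: Gemini honors the response schema over the tool definitions,
# so get_weather is never invoked and the temperature is fabricated.
# Post-fix: the tool runs first; the schema is applied in a final call.
result = Crew(agents=[agent], tasks=[task]).kickoff()
print(result.pydantic)
```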
`_ainvoke_loop_native_tools` had the same regression as the sync path: `response_model` was passed alongside tools on every async iteration, causing non-OpenAI providers to skip tool calls. Apply the same two-phase fix: pass `response_model=None` during tool iterations, then make one final async call without tools once the loop produces a text answer.
Force-pushed from 73a6a23 to a0cd32b
…utor

Same regression as CrewAgentExecutor: `call_llm_native_tools` passed `response_model` alongside tools on every iteration. Providers like Gemini and Anthropic skip tool calls when `response_format` is set, so tools were silently never called.

Applied the same two-phase fix: pass `response_model=None` during tool iterations, then apply the schema in a final extraction call once the loop produces a text answer. Consistent with how the Plan-and-Execute path already handles it in the synthesis step.
Also extended the fix to `call_llm_native_tools` in the experimental `AgentExecutor`.
🧹 Nitpick comments (1)
lib/crewai/src/crewai/experimental/agent_executor.py (1)
1362-1370: 💤 Low value
Consider extracting BaseModel finalization to a helper method.

The block at lines 1362-1370 duplicates the logic from lines 1337-1345. Both handle the same pattern: wrapping a `BaseModel` in `AgentFinish`, invoking callbacks, appending to state, and returning the route.

♻️ Proposed refactor to reduce duplication:
```diff
+    def _finalize_native_basemodel_answer(
+        self, answer: BaseModel
+    ) -> Literal["native_finished", "todo_satisfied"]:
+        """Wrap a BaseModel answer in AgentFinish and route appropriately."""
+        self.state.current_answer = AgentFinish(
+            thought="",
+            output=answer,
+            text=answer.model_dump_json(),
+        )
+        self._invoke_step_callback(self.state.current_answer)
+        self._append_message_to_state(answer.model_dump_json())
+        return self._route_finish_with_todos("native_finished")
```

Then replace both occurrences (lines 1337-1345 and 1362-1370) with:
```python
if isinstance(answer, BaseModel):
    return self._finalize_native_basemodel_answer(answer)
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/src/crewai/experimental/agent_executor.py` around lines 1362 - 1370, Extract the duplicated BaseModel finalization logic into a helper method (e.g., _finalize_native_basemodel_answer) that takes the BaseModel instance and performs: create AgentFinish with thought="", output=answer, text=answer.model_dump_json(), set self.state.current_answer, call self._invoke_step_callback(self.state.current_answer), call self._append_message_to_state(answer.model_dump_json()), and return self._route_finish_with_todos("native_finished"); then replace both duplicated blocks that check isinstance(answer, BaseModel) with a single call return self._finalize_native_basemodel_answer(answer).
ℹ️ Review info
Configuration used: Organization UI | Review profile: CHILL | Plan: Pro Plus
Run ID: dd8c0381-999a-4ded-9dc5-f56c6de5fc96
📒 Files selected for processing (1): lib/crewai/src/crewai/experimental/agent_executor.py
Fixes #5472.
Since v1.9.0, `response_model` is passed alongside `tools` on every iteration of the native tool loop in both `CrewAgentExecutor` and the new `AgentExecutor`. Providers like Gemini and Anthropic treat `response_format` as higher priority than tools, so the LLM skips tool calls entirely and returns structured output on the first call.

Changed both `_invoke_loop_native_tools` (CrewAgentExecutor) and `call_llm_native_tools` (AgentExecutor) to pass `response_model=None` during tool iterations, then make one final extraction call without tools once the loop produces a text answer. The new AgentExecutor already uses this pattern correctly in its Plan-and-Execute synthesis step - this brings the native tools path in line with it. Also applied the same fix to the async path in `CrewAgentExecutor`.
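At the provider layer, the root cause the description points at can be sketched with a direct litellm call (crewAI routes LLM calls through litellm). This is an illustrative probe, not code from the PR: the model id and tool schema are made up, and the commented outcome restates the PR's claim about Gemini/Anthropic rather than something this snippet proves.

```python
import litellm
from pydantic import BaseModel

class CityReport(BaseModel):
    city: str
    temperature_c: float

weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

resp = litellm.completion(
    model="gemini/gemini-2.0-flash",
    messages=[{"role": "user", "content": "Weather in Paris? Use the tool."}],
    tools=[weather_tool],
    response_format=CityReport,  # pre-fix: sent on every loop iteration
)
msg = resp.choices[0].message
# Per the PR: on Gemini/Anthropic, msg.tool_calls comes back empty and
# msg.content is schema-shaped JSON, so the tool is silently skipped.
# Dropping response_format (phase 1) makes the provider emit tool_calls.
print(msg.tool_calls, msg.content)
```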