refactor(a2a): use tool calling for delegation instead of structured output#5751
greysonlalonde wants to merge 4 commits into
…output Each remote A2A agent is now exposed to the local LLM as a BaseTool (delegate_to_<card_name>); the local agent's tool-call loop drives multi-turn delegation. The Literal-constrained AgentResponse model and the explicit per-turn re-prompting loop are gone. Closes #3897. The original failure mode — Pydantic literal_error when skill.id != endpoint URL, and Gemini flash-lite hallucinating out-of-enum values — is structurally impossible: provider-side tool-call validation enforces the tool name, and there's no competing identifier.
📝 Walkthrough
This PR refactors Agent-to-Agent (A2A) delegation from a protocol-based response-parsing architecture to a tool-driven mechanism. Removes …
Changes: A2A Tool-Based Delegation Architecture
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 5 passed
try:
    from a2a.types import Message, Role
    from a2a.types import TaskState  # noqa: F401
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Reviewed by Cursor Bugbot for commit f32fe81. Configure here.
    a2a_agents=a2a_agents,
    original_fn=task_to_kickoff_adapter,
    task=fake_task,
    agent_response_model=agent_response_model,
LiteAgent adapter ignores A2A delegation tools
High Severity
The task_to_kickoff_adapter in _kickoff_with_a2a_support ignores the tools parameter entirely, calling original_kickoff(messages, response_format, input_files) with the original arguments. Since _execute_task_with_a2a now passes A2A delegation tools via combined_tools, those tools are silently discarded for the LiteAgent path. The LLM sees an augmented prompt referencing delegate_to_* tools that aren't available, making A2A delegation completely non-functional for LiteAgent.
            agent_card=agent_card.model_dump() if agent_card else None,
        ),
    )
    return _apply_response_extensions(state, result_text, extension_states)
trust_remote_completion_status config field silently ignored
Medium Severity
The trust_remote_completion_status field still exists on both A2AConfig and A2AClientConfig, but _finalize_turn never reads it. Previously, when set to True, the remote agent's completed result bypassed further LLM processing. Now the flag is silently ignored — a behavioral regression for any user who relied on it.
    message: str,
    *,
    sync: bool,
) -> str:
Unused sync parameter in _run_delegation
Low Severity
_run_delegation accepts a keyword-only sync: bool parameter that is never referenced in the function body. The only caller passes sync=True. This is dead code that adds confusion about whether the function has sync/async branching logic.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/crewai/src/crewai/a2a/wrapper.py`:
- Around line 412-427: The context manager _temporarily_extend_tools currently
mutates agent.tools and always restores the original reference, which can
clobber concurrent changes; change it to set agent.tools to a new list object
(never mutate the original in-place) and on exit only restore original_tools if
agent.tools is still the same object you assigned (use identity comparison,
e.g., if agent.tools is temp_tools: agent.tools = original_tools) so concurrent
kickoff calls that replaced tools won't be overwritten; ensure you still handle
original_tools being None and the early-return when not extra.
In `@lib/crewai/src/crewai/lite_agent.py`:
- Around line 124-128: task_to_kickoff_adapter is calling
original_kickoff(messages, ...) and ignoring the wrapped task returned by
_execute_task_with_a2a, so the augmented A2A prompt and delegate_to_* tools are
dropped; fix by capturing the result of _execute_task_with_a2a (the wrapped
task/description and tools) and pass its description and tools into
original_kickoff (use the wrapped task.description and the returned tools
instead of the original messages/empty tools), ensuring the delegation context
from _execute_task_with_a2a is forwarded to original_kickoff.
📒 Files selected for processing (8)
- lib/crewai/src/crewai/a2a/templates.py
- lib/crewai/src/crewai/a2a/tools.py
- lib/crewai/src/crewai/a2a/types.py
- lib/crewai/src/crewai/a2a/utils/response_model.py
- lib/crewai/src/crewai/a2a/wrapper.py
- lib/crewai/src/crewai/agent/core.py
- lib/crewai/src/crewai/lite_agent.py
- lib/crewai/tests/agents/test_a2a_trust_completion_status.py
💤 Files with no reviewable changes (1)
- lib/crewai/src/crewai/a2a/types.py
 @contextlib.contextmanager
 def _temporarily_extend_tools(agent: Agent, extra: list[BaseTool]) -> Iterator[None]:
     """Append ``extra`` to ``agent.tools`` for the lifetime of the context."""
     if not extra:
         yield
         return
     original_tools = agent.tools
     if original_tools is None:
         agent.tools = list(extra)
     else:
         agent.tools = [*original_tools, *extra]
     try:
         yield
     finally:
-        task.description = original_description
-        task.output_pydantic = original_output_pydantic
-        task.response_model = original_response_model
         agent.tools = original_tools
Guard temporary agent.tools mutation against concurrent kickoff calls
Line 418 mutates shared agent.tools in-place for the whole call window; overlapping kickoff executions on the same Agent can interleave and restore the wrong tool list. That can leak/miss delegate_to_* tools across requests.
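The identity-guarded restore described in this comment could look roughly like the sketch below. It is not the PR's code: the `Agent` dataclass is a hypothetical stand-in with only a `tools` attribute, and the function name drops the leading underscore to make clear it is illustrative.

```python
import contextlib
from dataclasses import dataclass
from typing import Any, Iterator, Optional


@dataclass
class Agent:
    """Hypothetical stand-in for the real Agent; only ``tools`` matters here."""

    tools: Optional[list[Any]] = None


@contextlib.contextmanager
def temporarily_extend_tools(agent: Agent, extra: list[Any]) -> Iterator[None]:
    """Expose ``extra`` on ``agent.tools`` for the lifetime of the context."""
    if not extra:
        yield
        return
    original_tools = agent.tools
    # Assign a fresh list so the caller's original list is never mutated in place.
    temp_tools = [*(original_tools or []), *extra]
    agent.tools = temp_tools
    try:
        yield
    finally:
        # Restore only if agent.tools is still the exact object assigned above;
        # if someone else replaced it in the meantime, leave their value alone.
        if agent.tools is temp_tools:
            agent.tools = original_tools
```

The guard only avoids overwriting a replacement made by someone else. It does not make overlapping calls on one `Agent` fully independent, since a second call would still capture the first call's temporary list as its "original". Passing tools per call rather than through shared state would sidestep that.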
 from crewai.a2a.utils.response_model import extract_a2a_client_configs
 from crewai.a2a.wrapper import _execute_task_with_a2a
 from crewai.task import Task

-a2a_agents, agent_response_model = get_a2a_agents_and_response_model(agent.a2a)
+a2a_agents = extract_a2a_client_configs(agent.a2a)
LiteAgent A2A path discards the delegation context/tools
In task_to_kickoff_adapter, Line 155 calls original_kickoff(messages, ...) and ignores the wrapped task.description plus tools provided by _execute_task_with_a2a. In the new tool-driven flow, that means the LLM never gets the augmented A2A prompt and delegate_to_* tools.
💡 Suggested fix
 def task_to_kickoff_adapter(
     self: Any, task: Task, context: str | None, tools: list[Any] | None
 ) -> str:
-    result = original_kickoff(messages, response_format, input_files)
-    return result.raw
+    wrapped_messages: str | list[LLMMessage]
+    if isinstance(messages, str):
+        wrapped_messages = task.description
+    else:
+        wrapped_messages = [*messages, {"role": "user", "content": task.description}]
+
+    original_parsed_tools = self._parsed_tools
+    if tools:
+        self._parsed_tools = [*self._parsed_tools, *parse_tools(cast(list[BaseTool], tools))]
+    try:
+        result = original_kickoff(wrapped_messages, response_format, input_files)
+        return result.raw
+    finally:
+        self._parsed_tools = original_parsed_tools

Also applies to: 152-166


Why
Closes #3897.
A2A delegation currently relies on a `Literal[endpoint_url, ...]`-constrained `AgentResponse` model to pick a remote agent. The prompt shows the LLM each agent's card (skill IDs, names, URLs), but the only valid value for `a2a_ids` is the well-known endpoint URL, which is never explicitly labeled as the identifier. Predictable failures:
- The LLM returns `skills[0].id` (e.g. `"Research"`) instead of the endpoint URL → `pydantic ValidationError: literal_error`.
- Models that do not strictly enforce `Literal`/enum constraints in JSON Schema emit out-of-set values → same error.

A fuzzy-match fallback would paper over the symptom; the structural fix is to make the identifier set itself unambiguous and provider-enforced.

What
Each remote A2A agent is now exposed to the local LLM as a `BaseTool` (`delegate_to_<sanitized_card_name>`); the local agent's tool-call loop drives multi-turn delegation. `AgentResponse(a2a_ids, message, is_a2a)` and the explicit per-turn re-prompting loop are gone.
- `crewai/a2a/tools.py`: `A2ADelegationTool` + `A2ADelegationState` (per-task shared state with per-endpoint history, IDs, turn counts).
- `crewai/a2a/wrapper.py` collapsed from 1772 → ~530 LOC. Deleted `_delegate_to_a2a` / `_adelegate_to_a2a`, `_prepare_delegation_context`, `_parse_agent_response`, `_handle_agent_response_and_continue`, `_handle_max_turns_exceeded`, `_emit_delegation_failed`, `_process_response_result`, `_init_delegation_state`, `_get_turn_context`, `_handle_task_completion`, `DelegationContext`, `DelegationState`. Each of the four entry points (sync/async × execute_task/kickoff) now augments the prompt with agent cards, builds A2A tools, merges them into the call's `tools` list (or temporarily extends `self.tools` for kickoff), and calls `original_fn`.
- Templates: removed `PREVIOUS_A2A_CONVERSATION_TEMPLATE`, `CONVERSATION_TURN_INFO_TEMPLATE`, `REMOTE_AGENT_*_NOTICE`. `AVAILABLE_AGENTS_TEMPLATE` now describes the tool-call protocol.
- `response_model.py`: `create_agent_response_model` / `get_a2a_agents_and_response_model` replaced with a single `extract_a2a_client_configs()`.
- `types.py`: `AgentResponseProtocol` removed.
- `agent/core.py` + `lite_agent.py` updated to drop the `AgentResponseProtocol` branch and the `agent_response_model` arg.

The original failure is now structurally impossible: provider-side tool-call validation (OpenAI / Anthropic / Gemini) enforces the tool name; there's no competing identifier set for the model to confuse.
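To illustrate why a tool name leaves no competing identifier, here is a minimal sketch of the naming idea. The sanitization rule (lowercase, runs of non-alphanumerics collapsed to `_`) and the `DelegationRegistry` class are assumptions made for illustration; the PR's actual rule and the `A2ADelegationTool` internals are not shown in this description.

```python
import re
from dataclasses import dataclass, field


def delegation_tool_name(card_name: str) -> str:
    """Derive a tool name like ``delegate_to_research_agent`` from a card name.

    Assumed rule, not the PR's: lowercase, then collapse every run of
    characters outside [a-z0-9] into a single underscore.
    """
    slug = re.sub(r"[^a-z0-9]+", "_", card_name.lower()).strip("_")
    return f"delegate_to_{slug or 'agent'}"


@dataclass
class DelegationRegistry:
    """Hypothetical map from tool name to the endpoint URL it delegates to."""

    endpoints: dict[str, str] = field(default_factory=dict)

    def register(self, card_name: str, endpoint: str) -> str:
        name = delegation_tool_name(card_name)
        if name in self.endpoints:
            # Two cards that sanitize to the same name would be ambiguous.
            raise ValueError(f"duplicate tool name: {name}")
        self.endpoints[name] = endpoint
        return name

    def resolve(self, tool_name: str) -> str:
        # The model only ever names a tool; the URL is looked up locally.
        return self.endpoints[tool_name]
```

With this shape the model never has to reproduce an endpoint URL or a skill ID. It picks one name from the tool list the provider validates, and the URL stays a local lookup.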
`A2AConfig.max_turns` wires through to `BaseTool.max_usage_count`, so the existing per-agent turn limit is preserved without an explicit Python-side loop.

Notes
- … `tools.py`. Net ≈ −600 LOC.
- … `pip-audit`); will retarget to `main` once that lands.
- `mypy` clean across 473 files. `ruff` clean.

Note
Medium Risk
Refactors core A2A delegation flow in `wrapper.py` to rely on tool-calling and shared per-endpoint state; behavior changes around multi-turn delegation, turn limits, and event emission could impact remote-agent interactions.

Overview
A2A delegation is reworked to use tool calling instead of a structured `AgentResponse` model: each remote agent is now exposed as a `delegate_to_*` `BaseTool`, and each tool call advances one remote turn using shared `A2ADelegationState`.

This removes the dynamic response model/protocol and the bespoke multi-turn re-prompting loop, simplifying `wrapper.py` to: fetch agent cards, augment the prompt with agent cards + new tool-call instructions, inject delegation tools into `execute_task`/`kickoff` (sync+async), and let the normal tool loop drive delegation.

Supporting updates trim templates to delegation-only messaging, replace response-model helpers with `extract_a2a_client_configs()`, simplify agent completion event output formatting, update LiteAgent integration, and rewrite A2A tests to validate the new tool behavior (completion, reference task tracking, failure surfaces, and `max_turns` via `max_usage_count`).
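The "one tool call advances one remote turn" idea, with a per-endpoint cap, can be sketched as below. This is a simplified assumption of what shared delegation state might track; the class and field names are illustrative and are not the PR's `A2ADelegationState`, and the real limit is described as being enforced through `BaseTool.max_usage_count` rather than a check like this one.

```python
from dataclasses import dataclass, field


@dataclass
class EndpointState:
    """Per-endpoint bookkeeping (field names are illustrative)."""

    history: list[str] = field(default_factory=list)
    turns: int = 0


@dataclass
class DelegationState:
    """Hypothetical shared per-task state, keyed by endpoint URL."""

    max_turns: int
    endpoints: dict[str, EndpointState] = field(default_factory=dict)

    def record_turn(self, endpoint: str, message: str, reply: str) -> str:
        """Record one delegation turn and return the text handed back to the LLM."""
        state = self.endpoints.setdefault(endpoint, EndpointState())
        if state.turns >= self.max_turns:
            # Report the limit as a tool result so the tool loop can wrap up.
            return f"Turn limit ({self.max_turns}) reached for {endpoint}."
        state.turns += 1
        state.history.extend([message, reply])
        return reply
```

Because the counter lives in the state object rather than in a re-prompting loop, each endpoint's limit is tracked independently across however many tool calls the local agent makes.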