refactor(a2a): use tool calling for delegation instead of structured output #5751

Open
greysonlalonde wants to merge 4 commits into main from gl/refactor/a2a-tool-based-delegation

greysonlalonde (Contributor) commented May 8, 2026

Why

Closes #3897.

A2A delegation currently relies on a Literal[endpoint_url, ...]-constrained AgentResponse model to pick a remote agent. The prompt shows the LLM each agent's card (skill IDs, names, URLs), but the only valid value for a2a_ids is the well-known endpoint URL — which is never explicitly labeled as the identifier. Predictable failures:

  1. Original report (gpt-4.1): the LLM picks skills[0].id (e.g. "Research") instead of the endpoint URL → pydantic ValidationError: literal_error.
  2. Reopened thread (Gemini flash-lite): small models that don't reliably honor Literal/enum constraints in JSON Schema emit out-of-set values → same error.

A fuzzy-match fallback would paper over the symptom; the structural fix is to make the identifier set itself unambiguous and provider-enforced.
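The failure mode described above can be reproduced in miniature without any framework code. The sketch below is only an illustration: the card contents, the example URL, and the `check_a2a_id` helper are assumptions made for this example, and the check imitates a `Literal` constraint with the standard library rather than calling pydantic.

```python
from typing import Literal, get_args

# Hypothetical endpoint URL: the only value the old response model accepted.
AgentId = Literal["https://research.example.com/.well-known/agent-card.json"]

# Hypothetical agent card, roughly as the prompt would have shown it.
card = {
    "name": "Research Agent",
    "url": "https://research.example.com/.well-known/agent-card.json",
    "skills": [{"id": "Research", "name": "Research"}],
}


def check_a2a_id(value: str) -> str:
    """Imitate a Literal constraint: accept only members of AgentId."""
    allowed = get_args(AgentId)
    if value not in allowed:
        raise ValueError(f"literal_error: {value!r} is not one of {allowed}")
    return value


# The model picks the most identifier-looking field on the card.
picked = card["skills"][0]["id"]
try:
    check_a2a_id(picked)
    outcome = "accepted"
except ValueError as exc:
    outcome = str(exc)

print(outcome)
```

Nothing on the card marks `url` as the identifier, so a model that returns `skills[0].id` is making a plausible guess that the schema then rejects.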

What

Each remote A2A agent is now exposed to the local LLM as a BaseTool (delegate_to_<sanitized_card_name>); the local agent's tool-call loop drives multi-turn delegation. AgentResponse(a2a_ids, message, is_a2a) and the explicit per-turn re-prompting loop are gone.

  • New crewai/a2a/tools.py: A2ADelegationTool + A2ADelegationState (per-task shared state with per-endpoint history, IDs, turn counts).
  • crewai/a2a/wrapper.py collapsed from 1772 → ~530 LOC. Deleted _delegate_to_a2a / _adelegate_to_a2a, _prepare_delegation_context, _parse_agent_response, _handle_agent_response_and_continue, _handle_max_turns_exceeded, _emit_delegation_failed, _process_response_result, _init_delegation_state, _get_turn_context, _handle_task_completion, DelegationContext, DelegationState. Each of the four entry points (sync/async × execute_task/kickoff) now augments the prompt with agent cards, builds A2A tools, merges them into the call's tools list (or temporarily extends self.tools for kickoff), and calls original_fn.
  • Templates trimmed: dropped PREVIOUS_A2A_CONVERSATION_TEMPLATE, CONVERSATION_TURN_INFO_TEMPLATE, REMOTE_AGENT_*_NOTICE. AVAILABLE_AGENTS_TEMPLATE now describes the tool-call protocol.
  • response_model.py: create_agent_response_model / get_a2a_agents_and_response_model replaced with a single extract_a2a_client_configs().
  • types.py: AgentResponseProtocol removed.
  • agent/core.py + lite_agent.py updated to drop the AgentResponseProtocol branch and the agent_response_model arg.

The original failure is now structurally impossible: provider-side tool-call validation (OpenAI / Anthropic / Gemini) enforces the tool name; there's no competing identifier set for the model to confuse.
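As a rough illustration of how `delegate_to_<sanitized_card_name>` names could be derived and kept unique, here is a minimal sketch. The helper names `sanitize_tool_name` and `build_tool_names`, the slug rule, and the numeric suffix scheme are assumptions for this example; the actual logic in `tools.py` may differ.

```python
import re


def sanitize_tool_name(card_name: str) -> str:
    """Turn a free-form card name into a tool-name-safe slug (assumed rule)."""
    slug = re.sub(r"[^a-zA-Z0-9]+", "_", card_name).strip("_").lower()
    return f"delegate_to_{slug or 'agent'}"


def build_tool_names(card_names: list[str]) -> list[str]:
    """Assign one tool name per card, suffixing repeats to avoid collisions."""
    seen: dict[str, int] = {}
    names: list[str] = []
    for card_name in card_names:
        base = sanitize_tool_name(card_name)
        count = seen.get(base, 0) + 1
        seen[base] = count
        # First occurrence keeps the plain name; later ones get _2, _3, ...
        names.append(base if count == 1 else f"{base}_{count}")
    return names


names = build_tool_names(["Research Agent", "research-agent", "Billing (EU)"])
print(names)
```

Because the provider only accepts tool names it was given, the model chooses from exactly this list. The suffix scheme here is deliberately simple and does not handle a card whose own name already ends in `_2`.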

A2AConfig.max_turns wires through to BaseTool.max_usage_count, so the existing per-agent turn limit is preserved without an explicit Python-side loop.
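The idea of enforcing a turn limit through a per-tool usage counter, rather than a hand-written loop, can be sketched as follows. `LimitedTool` is a hypothetical stand-in written for this example; it does not reproduce how crewAI's `BaseTool` enforces `max_usage_count`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LimitedTool:
    """Hypothetical tool that refuses calls once its usage budget is spent."""

    name: str
    max_usage_count: Optional[int] = None
    current_usage_count: int = 0

    def run(self, message: str) -> str:
        limit = self.max_usage_count
        if limit is not None and self.current_usage_count >= limit:
            # The caller's tool loop sees this text instead of a remote reply.
            return f"Tool '{self.name}' has reached its usage limit of {limit}."
        self.current_usage_count += 1
        return f"[turn {self.current_usage_count}] remote reply to: {message}"


# max_turns=2 from a config would map onto max_usage_count=2 here.
tool = LimitedTool("delegate_to_research_agent", max_usage_count=2)
replies = [tool.run("first"), tool.run("second"), tool.run("third")]
print(replies[-1])
```

Each tool call counts as one remote turn, so the limit applies per remote agent without the wrapper tracking turns itself.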

Notes

  • Diff stat: +730 / −1709 across existing files; +400 in the new tools.py. Net ≈ −600 LOC.
  • Stacked on chore(deps): bump mem0ai to >=2.0.0 #5750 (mem0ai bump for pip-audit); will retarget to main once that lands.
  • All a2a tests pass (20 passed, 7 skipped). mypy clean across 473 files. ruff clean.

Note

Medium Risk
Refactors core A2A delegation flow in wrapper.py to rely on tool-calling and shared per-endpoint state; behavior changes around multi-turn delegation, turn limits, and event emission could impact remote-agent interactions.

Overview
A2A delegation is reworked to use tool calling instead of a structured AgentResponse model: each remote agent is now exposed as a delegate_to_* BaseTool, and each tool call advances one remote turn using shared A2ADelegationState.

This removes the dynamic response model/protocol and the bespoke multi-turn re-prompting loop, simplifying wrapper.py to: fetch agent cards, augment the prompt with agent cards + new tool-call instructions, inject delegation tools into execute_task/kickoff (sync+async), and let the normal tool loop drive delegation.
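A minimal sketch of per-endpoint shared state, assuming a shape along the lines the summary describes (history, IDs, turn counts). The class and field names below are invented for illustration and are not the real `A2ADelegationState` API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EndpointState:
    """State kept for one remote endpoint (assumed fields)."""

    history: list[tuple[str, str]] = field(default_factory=list)
    reference_task_ids: list[str] = field(default_factory=list)
    turns: int = 0


@dataclass
class DelegationStateSketch:
    """Hypothetical per-task state shared by all delegation tools."""

    endpoints: dict[str, EndpointState] = field(default_factory=dict)

    def record_turn(
        self,
        endpoint: str,
        message: str,
        reply: str,
        completed_task_id: Optional[str] = None,
    ) -> EndpointState:
        entry = self.endpoints.setdefault(endpoint, EndpointState())
        entry.history.append(("user", message))
        entry.history.append(("agent", reply))
        entry.turns += 1
        if completed_task_id is not None:
            # Completed remote tasks are remembered for later reference.
            entry.reference_task_ids.append(completed_task_id)
        return entry


state = DelegationStateSketch()
research = "https://research.example.com"  # hypothetical endpoint
billing = "https://billing.example.com"  # hypothetical endpoint
state.record_turn(research, "Find sources", "Which topic?")
state.record_turn(research, "LLM tool calling", "Done", completed_task_id="task-1")
state.record_turn(billing, "Invoice total?", "42 EUR", completed_task_id="task-9")
print(state.endpoints[research].turns)
```

Keying the state by endpoint keeps each remote conversation separate while letting successive tool calls to the same agent continue where the previous one stopped.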

Supporting updates trim templates to delegation-only messaging, replace response-model helpers with extract_a2a_client_configs(), simplify agent completion event output formatting, update LiteAgent integration, and rewrite A2A tests to validate the new tool behavior (completion, reference task tracking, failure surfaces, and max_turns via max_usage_count).

Reviewed by Cursor Bugbot for commit f32fe81. Bugbot is set up for automated code reviews on this repo. Configure here.

Summary by CodeRabbit

  • Refactor
    • Redesigned agent-to-agent delegation using a tool-based approach for simpler, more efficient delegation execution.
    • Enhanced delegation instructions with clearer guidance for remote agent communication.
    • Streamlined internal delegation state management and execution flow for improved reliability.

…output

Each remote A2A agent is now exposed to the local LLM as a BaseTool
(delegate_to_<card_name>); the local agent's tool-call loop drives
multi-turn delegation. The Literal-constrained AgentResponse model and
the explicit per-turn re-prompting loop are gone.

Closes #3897. The original failure mode — Pydantic literal_error when
skill.id != endpoint URL, and Gemini flash-lite hallucinating
out-of-enum values — is structurally impossible: provider-side tool-call
validation enforces the tool name, and there's no competing identifier.
Base automatically changed from gl/chore/bump-mem0ai to main May 8, 2026 16:17
coderabbitai Bot commented May 8, 2026

Walkthrough

This PR refactors Agent-to-Agent (A2A) delegation from a protocol-based response-parsing architecture to a tool-driven mechanism. Removes AgentResponseProtocol, dynamic Pydantic response models, and multi-turn control flow; introduces A2ADelegationTool and A2ADelegationState for managing delegation as LLM tools; updates wrapper, agent core, and lite-agent consumers to use simplified config extraction and tool building.

Changes

A2A Tool-Based Delegation Architecture

| Layer / File(s) | Summary |
|---|---|
| **Protocol and Type Cleanup**<br>lib/crewai/src/crewai/a2a/types.py, lib/crewai/src/crewai/a2a/utils/response_model.py, lib/crewai/src/crewai/agent/core.py | AgentResponseProtocol protocol type is removed from types module. Response-model utilities are reduced to extract_a2a_client_configs(...), which normalizes mixed A2A config inputs and filters to client-capable endpoints. Agent core's output finalization stops special-casing protocol-aware BaseModel extraction and always stringifies non-string results. |
| **A2A Tool-Based Delegation Implementation**<br>lib/crewai/src/crewai/a2a/tools.py | New module implementing tool-based delegation. A2ADelegationState maintains per-endpoint conversation history, task metadata, and turn count. A2ADelegationTool wraps sync/async delegation invocations. build_a2a_tools(...) creates one delegatable tool per configured A2A endpoint, deconflicts names, and generates descriptions. Delegation execution via _run_delegation and _run_delegation_async prepares extension state and metadata, calls execute_a2a_delegation, and finalizes turns by updating history, emitting completion events, and applying response extensions. |
| **Wrapper: Task Execution with Card Fetching and Tool Extension**<br>lib/crewai/src/crewai/a2a/wrapper.py | Execution wrappers fetch remote agent cards (sync via thread pool, async via asyncio.gather), augment task descriptions with available agent details, and build A2A tools. New _temporarily_extend_tools(...) context manager safely appends delegation tools to agent's tool list for the wrapped invocation. Removes prior structured LLM response parsing, extension response-processing state, and multi-turn delegation loop logic. |
| **Wrapper: Kickoff Integration with Tool Extension**<br>lib/crewai/src/crewai/a2a/wrapper.py | Kickoff wrappers (sync and async) extract latest user message, augment it with remote agent-card context, build delegation state and tools, then temporarily extend tools before calling original kickoff. Removes direct protocol parsing and delegation control flow, delegating remote agent interaction entirely to the tool mechanism. |
| **A2A Prompt Templates**<br>lib/crewai/src/crewai/a2a/templates.py | AVAILABLE_AGENTS_TEMPLATE format updated with clearer delegation instructions and placeholder for rendered agent cards. Removes PREVIOUS_A2A_CONVERSATION_TEMPLATE, CONVERSATION_TURN_INFO_TEMPLATE, REMOTE_AGENT_COMPLETED_NOTICE, and REMOTE_AGENT_RESPONSE_NOTICE that were tied to the prior multi-turn control-flow approach. |
| **Lite Agent Consumer Update**<br>lib/crewai/src/crewai/lite_agent.py | _kickoff_with_a2a_support switches to extract_a2a_client_configs(...) instead of get_a2a_agents_and_response_model(...) and removes agent_response_model argument from delegation execution call. |
| **Tool-Based Delegation Test Coverage**<br>lib/crewai/tests/agents/test_a2a_trust_completion_status.py | Mock agent card helper is simplified. New tests validate tool-based delegation: successful remote completion returns result, completed remote tasks are recorded in delegation state reference IDs, remote failures surface error messages, and max_turns configuration wires to tool usage limits. Removes prior wrapper-focused multi-turn conversation flow tests. |
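The card-fetching step mentioned in the walkthrough (thread pool for the sync path, `asyncio.gather` for the async path) might look roughly like this. `fetch_card` and `afetch_card` are placeholders that return fake cards; the real wrappers perform network requests and their signatures are not shown in this PR page.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def fetch_card(endpoint: str) -> dict:
    """Placeholder for a blocking card fetch; returns a fake card."""
    return {"url": endpoint, "name": endpoint.rsplit("/", 1)[-1]}


async def afetch_card(endpoint: str) -> dict:
    """Placeholder for an async card fetch."""
    await asyncio.sleep(0)  # yield control, as real I/O would
    return fetch_card(endpoint)


def fetch_cards_sync(endpoints: list[str]) -> dict[str, dict]:
    # pool.map preserves input order, so zip pairs each endpoint with its card.
    with ThreadPoolExecutor(max_workers=max(1, len(endpoints))) as pool:
        return dict(zip(endpoints, pool.map(fetch_card, endpoints)))


async def fetch_cards_async(endpoints: list[str]) -> dict[str, dict]:
    # gather also returns results in the order the awaitables were passed.
    cards = await asyncio.gather(*(afetch_card(e) for e in endpoints))
    return dict(zip(endpoints, cards))


endpoints = ["https://a.example.com/research", "https://b.example.com/billing"]
sync_cards = fetch_cards_sync(endpoints)
async_cards = asyncio.run(fetch_cards_async(endpoints))
print(sync_cards == async_cards)
```

Both paths fetch all cards concurrently before the prompt is augmented, so a slow endpoint delays the call only once rather than once per agent.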

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 From protocol dreams to tools so bright,
A2A delegates with pure delight,
No more response models, parsing dance,
Tools take the stage—let agents prance!
One turn per call, the state stays true,
Extensible, clean, the refactor's new. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main change: refactoring A2A delegation to use tool calling instead of structured output, which is the primary objective of this PR. |
| Linked Issues check | ✅ Passed | The PR addresses the core issue #3897: replacing the Literal-constrained AgentResponse model with tool-based delegation, eliminating skill.id validation errors by using tool names as unambiguous identifiers enforced at the provider level. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to implementing tool-based A2A delegation: new tools.py module, updated templates, removed response-model generation, refactored wrapper, and updated related modules. No unrelated changes detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |


try:
    from a2a.types import Message, Role
    from a2a.types import TaskState  # noqa: F401
@greysonlalonde greysonlalonde marked this pull request as ready for review May 13, 2026 00:22

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 3 potential issues.


     a2a_agents=a2a_agents,
     original_fn=task_to_kickoff_adapter,
     task=fake_task,
-    agent_response_model=agent_response_model,

LiteAgent adapter ignores A2A delegation tools

High Severity

The task_to_kickoff_adapter in _kickoff_with_a2a_support ignores the tools parameter entirely, calling original_kickoff(messages, response_format, input_files) with the original arguments. Since _execute_task_with_a2a now passes A2A delegation tools via combined_tools, those tools are silently discarded for the LiteAgent path. The LLM sees an augmented prompt referencing delegate_to_* tools that aren't available, making A2A delegation completely non-functional for LiteAgent.


agent_card=agent_card.model_dump() if agent_card else None,
),
)
return _apply_response_extensions(state, result_text, extension_states)

trust_remote_completion_status config field silently ignored

Medium Severity

The trust_remote_completion_status field still exists on both A2AConfig and A2AClientConfig, but _finalize_turn never reads it. Previously, when set to True, the remote agent's completed result bypassed further LLM processing. Now the flag is silently ignored — a behavioral regression for any user who relied on it.


message: str,
*,
sync: bool,
) -> str:

Unused sync parameter in _run_delegation

Low Severity

_run_delegation accepts a keyword-only sync: bool parameter that is never referenced in the function body. The only caller passes sync=True. This is dead code that adds confusion about whether the function has sync/async branching logic.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai/src/crewai/a2a/wrapper.py`:
- Around line 412-427: The context manager _temporarily_extend_tools currently
mutates agent.tools and always restores the original reference, which can
clobber concurrent changes; change it to set agent.tools to a new list object
(never mutate the original in-place) and on exit only restore original_tools if
agent.tools is still the same object you assigned (use identity comparison,
e.g., if agent.tools is temp_tools: agent.tools = original_tools) so concurrent
kickoff calls that replaced tools won't be overwritten; ensure you still handle
original_tools being None and the early-return when not extra.

In `@lib/crewai/src/crewai/lite_agent.py`:
- Around line 124-128: task_to_kickoff_adapter is calling
original_kickoff(messages, ...) and ignoring the wrapped task returned by
_execute_task_with_a2a, so the augmented A2A prompt and delegate_to_* tools are
dropped; fix by capturing the result of _execute_task_with_a2a (the wrapped
task/description and tools) and pass its description and tools into
original_kickoff (use the wrapped task.description and the returned tools
instead of the original messages/empty tools), ensuring the delegation context
from _execute_task_with_a2a is forwarded to original_kickoff.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: c860202e-d141-4c3b-b1f3-56013be240ad

📥 Commits

Reviewing files that changed from the base of the PR and between 264da82 and f32fe81.

📒 Files selected for processing (8)
  • lib/crewai/src/crewai/a2a/templates.py
  • lib/crewai/src/crewai/a2a/tools.py
  • lib/crewai/src/crewai/a2a/types.py
  • lib/crewai/src/crewai/a2a/utils/response_model.py
  • lib/crewai/src/crewai/a2a/wrapper.py
  • lib/crewai/src/crewai/agent/core.py
  • lib/crewai/src/crewai/lite_agent.py
  • lib/crewai/tests/agents/test_a2a_trust_completion_status.py
💤 Files with no reviewable changes (1)
  • lib/crewai/src/crewai/a2a/types.py

Comment on lines +412 to 427
 @contextlib.contextmanager
 def _temporarily_extend_tools(agent: Agent, extra: list[BaseTool]) -> Iterator[None]:
     """Append ``extra`` to ``agent.tools`` for the lifetime of the context."""
     if not extra:
         yield
         return
     original_tools = agent.tools
     if original_tools is None:
         agent.tools = list(extra)
     else:
         agent.tools = [*original_tools, *extra]
     try:
         yield
     finally:
-        task.description = original_description
-        task.output_pydantic = original_output_pydantic
-        task.response_model = original_response_model
         agent.tools = original_tools


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard temporary agent.tools mutation against concurrent kickoff calls

Line 418 captures the shared agent.tools and the following lines reassign it for the whole call window; overlapping kickoff executions on the same Agent can interleave and restore the wrong tool list. That can leak or drop delegate_to_* tools across requests.
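One way to read the reviewer's suggestion is an identity check on exit, sketched below. `FakeAgent` and the un-prefixed function name are stand-ins for this example; this is not the PR's code, and it only avoids overwriting a list that someone else installed in the meantime. It does not make overlapping calls on one agent fully safe.

```python
import contextlib
from collections.abc import Iterator
from typing import Any, Optional


class FakeAgent:
    """Stand-in for an agent object with a mutable ``tools`` attribute."""

    def __init__(self, tools: Optional[list[Any]] = None) -> None:
        self.tools = tools


@contextlib.contextmanager
def temporarily_extend_tools(agent: FakeAgent, extra: list[Any]) -> Iterator[None]:
    """Install a new list with ``extra`` appended; restore only if untouched."""
    if not extra:
        yield
        return
    original_tools = agent.tools
    temp_tools = [*(original_tools or []), *extra]  # never mutate the original
    agent.tools = temp_tools
    try:
        yield
    finally:
        # Restore only when agent.tools is still the exact list assigned above.
        if agent.tools is temp_tools:
            agent.tools = original_tools


agent = FakeAgent(["search"])
with temporarily_extend_tools(agent, ["delegate_to_research_agent"]):
    inside = list(agent.tools)
after = agent.tools

other = FakeAgent(["search"])
with temporarily_extend_tools(other, ["delegate_to_research_agent"]):
    other.tools = ["replaced_elsewhere"]  # simulates a concurrent reassignment
kept = other.tools
```

A stricter fix would avoid shared mutation altogether, for example by passing the combined tool list as an argument to the wrapped call.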


Comment on lines +124 to +128
 from crewai.a2a.utils.response_model import extract_a2a_client_configs
 from crewai.a2a.wrapper import _execute_task_with_a2a
 from crewai.task import Task

-a2a_agents, agent_response_model = get_a2a_agents_and_response_model(agent.a2a)
+a2a_agents = extract_a2a_client_configs(agent.a2a)

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

LiteAgent A2A path discards the delegation context/tools

In task_to_kickoff_adapter, Line 155 calls original_kickoff(messages, ...) and ignores the wrapped task.description plus tools provided by _execute_task_with_a2a. In the new tool-driven flow, that means the LLM never gets the augmented A2A prompt and delegate_to_* tools.

💡 Suggested fix
 def task_to_kickoff_adapter(
     self: Any, task: Task, context: str | None, tools: list[Any] | None
 ) -> str:
-    result = original_kickoff(messages, response_format, input_files)
-    return result.raw
+    wrapped_messages: str | list[LLMMessage]
+    if isinstance(messages, str):
+        wrapped_messages = task.description
+    else:
+        wrapped_messages = [*messages, {"role": "user", "content": task.description}]
+
+    original_parsed_tools = self._parsed_tools
+    if tools:
+        self._parsed_tools = [*self._parsed_tools, *parse_tools(cast(list[BaseTool], tools))]
+    try:
+        result = original_kickoff(wrapped_messages, response_format, input_files)
+        return result.raw
+    finally:
+        self._parsed_tools = original_parsed_tools

Also applies to: 152-166


Development

Successfully merging this pull request may close these issues.

[BUG] A2A - pydantic error, when AgentCard-skill-id <> endpoint url. (1.3.0)
