Skip to content

Feat+Fix: Let Tools returning multimodal data#5804

Open
konbraphat51 wants to merge 16 commits into
crewAIInc:mainfrom
konbraphat51:feat/multimodal_tool
Open

Feat+Fix: Let Tools returning multimodal data#5804
konbraphat51 wants to merge 16 commits into
crewAIInc:mainfrom
konbraphat51:feat/multimodal_tool

Conversation

@konbraphat51
Copy link
Copy Markdown

@konbraphat51 konbraphat51 commented May 14, 2026

Summary

This PR introduces a first-class public API for tool authors to return multimodal content (images, audio, video, PDFs) from any BaseTool, and wires up the full pipeline so those files reach the LLM in the correct provider format.

Previously, AddImageTool was the only path for injecting images into the conversation, implemented via a fragile name-based special case that called str() on the result dict — meaning it silently produced a Python repr string instead of actual image content. Any other tool returning a FileInput would have its multimodal structure destroyed by _format_result()'s unconditional str() conversion.

Before — tool authors had no supported way to return multimodal data:

# No public API existed; only AddImageTool had a (broken) special case
class MyTool(BaseTool):
    def _run(self, query: str) -> str:
        ...  # forced to return text only

After — any tool can return multimodal content:

from crewai.tools import BaseTool, MultimodalToolResult
from crewai_files import ImageFile

class ScreenshotTool(BaseTool):
    name: str = "take_screenshot"
    description: str = "Capture a screenshot of a URL"

    def _run(self, url: str) -> MultimodalToolResult:
        data = capture(url)
        return MultimodalToolResult(
            text="Screenshot captured",
            files={"screenshot": ImageFile(source=data)},
        )

Changes

New public API (tools/tool_types.py, tools/__init__.py)

  • Added MultimodalToolResult dataclass: text: str, files: dict[str, FileInput], result_as_answer: bool
  • Added files: dict[str, FileInput] field to the internal ToolResult dataclass
  • Exported MultimodalToolResult from crewai.tools

Core pipeline (tools/tool_usage.py)

  • Added self._result_files: dict[str, Any] = {} side-channel to ToolUsage
  • Added _extract_multimodal_files() helper: intercepts MultimodalToolResult and bare BaseFile objects before str() conversion, extracts files into _result_files
  • Updated _format_result() to use the new helper; non-multimodal results unchanged
  • Removed the broken add_image-named special-case branches from use() and ause()

Propagation (utilities/tool_utils.py)

  • execute_tool_and_check_finality and aexecute_tool_and_check_finality now read tool_usage._result_files after execution and pass them as files= to the returned ToolResult

Agent executors

  • agents/crew_agent_executor.py: removed add_image branch from _handle_agent_action(); added _extract_native_tool_result() static method for the NativeTools path; added _inject_files_into_last_user_message() helper; both ReAct loops now inject tool_result.files into the last user message after each tool call
  • experimental/agent_executor.py: same removals and additions mirrored for the experimental executor
  • agents/step_executor.py: _execute_text_tool_with_events() return type changed from str to ToolResult; caller attaches files to the observation message when present

Bug fix (tools/agent_tools/add_image_tool.py)

  • Fixed ImageFile(image_url)ImageFile(source=image_url): ImageFile is a Pydantic model and does not accept positional arguments; the previous call raised TypeError at runtime

Tests (tests/tools/test_multimodal_tool_result.py)

  • 19 new unit tests covering MultimodalToolResult and ToolResult.files dataclasses, _extract_multimodal_files() with all input types, _format_result() integration, and AddImageTool._run()

Documentation (docs/*/learn/multimodal-agents.mdx)

  • Added "Building Tools That Return Multimodal Content" section to all four language variants (en, ar, ko, pt-BR) explaining MultimodalToolResult, supported file types, file source formats, and the single-file shorthand

How it works

The existing files pipeline in BaseLLM._process_message_files() already converts dict[str, FileInput] on any message into provider-specific content blocks (OpenAI image_url, Anthropic image source, etc.). This PR closes the gap between tool execution and that pipeline:

Tool._run() → MultimodalToolResult
  → ToolUsage._format_result() extracts files into _result_files
  → execute_tool_and_check_finality() wraps into ToolResult(files=...)
  → Agent executor injects files onto the LLM message
  → BaseLLM._process_message_files() converts to provider blocks  ✓

Notes

  • Backward compatible: tools returning plain str or any other type are unaffected
  • Cache hits return files={} (files are not cached); this is acceptable
  • result_as_answer=True with files: the final text answer is returned as-is; including files in AgentFinish.output is out of scope for this PR
  • crewai-files is an optional dependency; all multimodal code paths are guarded with try/except ImportError fallbacks

Summary by CodeRabbit

Release Notes

  • New Features

    • Tools can now return multimodal content (images, audio, video, PDFs) alongside text to the LLM.
    • Custom tools support returning file attachments via structured results, enabling richer agent interactions.
  • Documentation

    • Added comprehensive guides in multiple languages with examples for building custom tools that handle multimodal outputs.
  • Tests

    • Added test coverage for multimodal tool result handling and file propagation.

Review Change Stack

Additional Information

This tackles issue #5758, and I bet this PR is better than auto-fix by #5789

konbraphat51 and others added 13 commits May 13, 2026 19:56
Introduces MultimodalToolResult as a public API for tool authors who need
to return file attachments (images, audio, video, PDFs) alongside a text
observation. Adds a files field to the internal ToolResult so the multimodal
payload can travel through the execution pipeline unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… remove add_image special case

_format_result now intercepts MultimodalToolResult and BaseFile values before
the str() fallback, stashing the extracted FileInput dict in self._result_files
so callers can propagate it into ToolResult.files.

The add_image name-based branch in use()/ause() is removed; all tools now go
through the same path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…iles

After use()/ause() returns, read _result_files from the ToolUsage instance
and forward it as files= on the returned ToolResult so the multimodal payload
reaches the agent executor unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Returns MultimodalToolResult(text, files={image: ImageFile(url)}) so the
image travels through the standard files pipeline instead of a raw dict.
Falls back to text-only when crewai-files is not installed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…pecial handling

- Remove the add_image name-based branch from _handle_agent_action; all
  tools now use the same path.
- Add _extract_native_tool_result() to detect MultimodalToolResult/BaseFile
  in the NativeTools path and separate text from files.
- Carry files through execution_result dict into _append_tool_result_and
  _check_finality, where they are stored on the tool message's files field
  so _process_message_files() converts them to provider content blocks.
- Add _inject_files_into_last_user_message() helper and call it from both
  sync and async ReAct loops after _append_message().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mirror the crew_agent_executor changes: drop the add_image name-based
branch from _handle_agent_action and add _inject_files_into_last_user
_message so MultimodalToolResult files reach the LLM via the files pipeline.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
_execute_text_tool_with_events now returns ToolResult instead of str so
the caller can attach tool_result.files to the observation LLMMessage.
_process_message_files() then converts them to provider content blocks.
VISION_IMAGE sentinel handling is preserved for backward compatibility.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ImageFile is a Pydantic model and does not accept positional arguments;
positional call raised TypeError at runtime.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers:
- MultimodalToolResult and ToolResult.files dataclass fields
- ToolUsage._extract_multimodal_files with MultimodalToolResult, BaseFile, and plain values
- ToolUsage._format_result integration
- AddImageTool._run returning MultimodalToolResult with correct ImageFile

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents the MultimodalToolResult API so library users can build custom
tools that pass images, audio, video, and PDFs to the LLM. Covers file
sources (bytes, path, URL), supported file types, and the single-file
shorthand.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…comments

ruff format reformatted tool_types.py, tool_usage.py, and
crew_agent_executor.py. Removed four # type: ignore[...] comments that
mypy reported as unused after the TypedDict was updated to include the
files key.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@konbraphat51
Copy link
Copy Markdown
Author

Hmm, how to add llm-generated label?

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 14, 2026

📝 Walkthrough

Walkthrough

Adds a MultimodalToolResult type and threads file attachments returned by tools through ToolUsage, tool execution helpers, and executors so file payloads are attached to message history and available to the LLM.

Changes

Multimodal File Support

Layer / File(s) Summary
Multimodal result types and public API
lib/crewai/src/crewai/tools/tool_types.py, lib/crewai/src/crewai/tools/__init__.py
New MultimodalToolResult dataclass with text, files (mapping of FileInput attachments), and result_as_answer; ToolResult extended with optional files field; type exported from crewai.tools.
Tool result extraction and AddImageTool
lib/crewai/src/crewai/tools/tool_usage.py, lib/crewai/src/crewai/tools/agent_tools/add_image_tool.py
ToolUsage extracts text and files from MultimodalToolResult and BaseFile via _extract_multimodal_files() and stores them in _result_files; AddImageTool._run() returns MultimodalToolResult with an image when crewai-files is available (text fallback otherwise).
Tool execution pipeline file capture
lib/crewai/src/crewai/utilities/tool_utils.py
execute_tool_and_check_finality and async variant capture result_files from ToolUsage and include them in returned ToolResult.files.
CrewAgentExecutor file injection
lib/crewai/src/crewai/agents/crew_agent_executor.py
ReAct loops (sync/async) and native tool call paths parse multimodal/file returns, attach files to tool messages, and inject tool_result.files into the most-recent role="user" message.
StepExecutor & experimental executor integration
lib/crewai/src/crewai/agents/step_executor.py, lib/crewai/src/crewai/experimental/agent_executor.py
StepExecutor consumes structured ToolResult (result + files); experimental AgentExecutor provides _inject_files_into_last_user_message() and removes add_image special-casing in favor of shared handling.
Multimodal result tests
lib/crewai/tests/tools/test_multimodal_tool_result.py
Tests for MultimodalToolResult/ToolResult defaults, ToolUsage._extract_multimodal_files() behavior, _format_result() file handling/clearing, and AddImageTool._run() multimodal contract.
User-facing documentation
docs/en/learn/multimodal-agents.mdx, docs/ar/learn/multimodal-agents.mdx, docs/ko/learn/multimodal-agents.mdx, docs/pt-BR/learn/multimodal-agents.mdx
Multilingual guides showing how to return images/audio/video/PDFs via MultimodalToolResult, ScreenshotTool example, supported file sources (bytes, local paths, URLs), and alternative of returning a single FileInput.

Sequence Diagram(s)

sequenceDiagram
  participant Tool as Custom Tool
  participant Usage as ToolUsage
  participant Extract as _extract_multimodal_files
  participant Utils as tool_utils
  participant Executor as CrewAgentExecutor
  participant Inject as _inject_files_into_last_user_message
  Tool->>Usage: _run returns MultimodalToolResult(text, files={...})
  Usage->>Extract: _format_result(multimodal_result)
  Extract->>Usage: return text, populate _result_files
  Usage->>Utils: tool execution completes, _result_files observed
  Utils->>Executor: return ToolResult(result=text, files=_result_files)
  Executor->>Executor: append tool message including files field
  Executor->>Inject: tool_result.files not empty
  Inject->>Executor: merge files into last role=user message
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I hopped through types and test reports bright,
Found screenshots tucked in code at night,
Now tools can bring images, sound, and more—
I stash them in messages, snug to the core.
A little rabbit clap for files in flight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 42.22% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Feat+Fix: Let Tools returning multimodal data' directly relates to the main change—adding multimodal file support to tools via MultimodalToolResult and updating the entire pipeline.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai/src/crewai/agents/step_executor.py`:
- Around line 315-321: The loop in StepExecutor (around variables
last_tool_result, messages, answer_str, and obs_msg in step_executor.py) ignores
tool_result.result_as_answer and therefore allows a further LLM turn; update the
logic so that when tool_result.result_as_answer is True you append the assistant
message with answer_str (as already done), skip building/appending the
observation message and immediately return or break to end execution (i.e., do
not call _build_observation_message or proceed to another LLM turn); preserve
adding tool_result.files only for non-final steps as needed and ensure
last_tool_result is still set when returning.

In `@lib/crewai/src/crewai/tools/tool_usage.py`:
- Around line 655-663: Reset the ToolUsage instance's attachment state at the
start of result formatting: in _format_result, set self._result_files = [] (or
appropriate empty container) immediately when the method begins so stale
attachments from prior calls are cleared; then proceed to call
_extract_multimodal_files(result) and the existing branches. Do the same
clearing at the start of any other formatting branches referenced around the
same area (the alternative formatting paths around lines 669–693) so every code
path begins with a fresh self._result_files before calling
_extract_multimodal_files or _remember_format.

In `@lib/crewai/tests/tools/test_multimodal_tool_result.py`:
- Around line 181-196: The test lacks a final assertion to validate that a plain
second call to _format_result does not clear _result_files; update
test_result_files_cleared_on_plain_result to capture the files after the first
_format_result call (e.g., previous_files = tu._result_files.copy()) then after
calling tu._format_result("plain follow-up") assert that tu._result_files ==
previous_files (or at least that tu._result_files is not empty), referencing the
test name test_result_files_cleared_on_plain_result and the symbols
_format_result and _result_files so the intent is explicitly checked.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: bab10ed3-01f4-4233-99f8-669115e938f1

📥 Commits

Reviewing files that changed from the base of the PR and between c36827b and dd25a32.

📒 Files selected for processing (13)
  • docs/ar/learn/multimodal-agents.mdx
  • docs/en/learn/multimodal-agents.mdx
  • docs/ko/learn/multimodal-agents.mdx
  • docs/pt-BR/learn/multimodal-agents.mdx
  • lib/crewai/src/crewai/agents/crew_agent_executor.py
  • lib/crewai/src/crewai/agents/step_executor.py
  • lib/crewai/src/crewai/experimental/agent_executor.py
  • lib/crewai/src/crewai/tools/__init__.py
  • lib/crewai/src/crewai/tools/agent_tools/add_image_tool.py
  • lib/crewai/src/crewai/tools/tool_types.py
  • lib/crewai/src/crewai/tools/tool_usage.py
  • lib/crewai/src/crewai/utilities/tool_utils.py
  • lib/crewai/tests/tools/test_multimodal_tool_result.py

Comment thread lib/crewai/src/crewai/agents/step_executor.py
Comment thread lib/crewai/src/crewai/tools/tool_usage.py
Comment thread lib/crewai/tests/tools/test_multimodal_tool_result.py
konbraphat51 and others added 3 commits May 14, 2026 17:06
…xecutor

When a tool signals result_as_answer=True the loop was still appending
an observation message and proceeding to another LLM turn. Now we
append the assistant message and immediately return the tool result,
consistent with how crew_agent_executor handles the same flag.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
If _format_result is called multiple times on the same ToolUsage instance
(e.g. during error-handling retries), stale files from a prior multimodal
result would persist and be attached to the wrong observation. Clearing
_result_files before each call ensures every code path begins clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…sult call

The previous test had no assertion after the second call, leaving the
behaviour undocumented. After the _format_result reset fix, a plain call
must produce an empty _result_files; assert that explicitly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/crewai/src/crewai/tools/tool_usage.py (1)

127-137: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reset _result_files at the start of each tool execution path.

On Line 127 and Line 162, use()/ause() can return early on errors before _format_result() runs, so previous call attachments may leak into the next ToolResult.files.

Suggested fix
 def use(
     self, calling: ToolCalling | InstructorToolCalling, tool_string: str
 ) -> str:
+    self._result_files = {}
     if isinstance(calling, ToolUsageError):
         error = calling.message
         if self.agent and self.agent.verbose:
             PRINTER.print(content=f"\n\n{error}\n", color="red")
         if self.task:
@@
 async def ause(
     self, calling: ToolCalling | InstructorToolCalling, tool_string: str
 ) -> str:
+    self._result_files = {}
     """Execute a tool asynchronously.

Also applies to: 162-169

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/src/crewai/tools/tool_usage.py` around lines 127 - 137, Reset the
tool result attachments at the start of each execution path to avoid leaking
files between calls: in the Tool class methods use and ause, set
self._result_files = [] (or call the existing reset helper if present)
immediately at the top of each method before any early-return branches
(including the isinstance(calling, ToolUsageError) branch and any other
error/early-return checks) so that when those branches return (e.g., returning
error strings) the next ToolResult will not include files from the previous
invocation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai/src/crewai/tools/tool_usage.py`:
- Around line 674-679: The try/except currently bundles both imports so a
missing optional package (`BaseFile` from crewai_files.core.types) causes early
return and prevents handling of MultimodalToolResult; update the import block so
that `from crewai.tools.tool_types import MultimodalToolResult` is imported
unconditionally (or in its own try) outside the try that imports `BaseFile`, and
only the `BaseFile` import is wrapped in a try/except that sets a local sentinel
(e.g., BaseFile = None) on ImportError; then adjust the subsequent logic that
inspects tool results to use isinstance(result, MultimodalToolResult) to return
result.text and only use BaseFile-related handling when BaseFile is not None.
Ensure references to MultimodalToolResult and BaseFile in the surrounding
function/class are updated accordingly.

---

Outside diff comments:
In `@lib/crewai/src/crewai/tools/tool_usage.py`:
- Around line 127-137: Reset the tool result attachments at the start of each
execution path to avoid leaking files between calls: in the Tool class methods
use and ause, set self._result_files = [] (or call the existing reset helper if
present) immediately at the top of each method before any early-return branches
(including the isinstance(calling, ToolUsageError) branch and any other
error/early-return checks) so that when those branches return (e.g., returning
error strings) the next ToolResult will not include files from the previous
invocation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 93cdfa73-cf53-4579-9962-b447d540b401

📥 Commits

Reviewing files that changed from the base of the PR and between dd25a32 and a44a840.

📒 Files selected for processing (3)
  • lib/crewai/src/crewai/agents/step_executor.py
  • lib/crewai/src/crewai/tools/tool_usage.py
  • lib/crewai/tests/tools/test_multimodal_tool_result.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • lib/crewai/src/crewai/agents/step_executor.py
  • lib/crewai/tests/tools/test_multimodal_tool_result.py

Comment on lines +674 to +679
try:
from crewai_files.core.types import BaseFile

from crewai.tools.tool_types import MultimodalToolResult
except ImportError:
return None
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

fd tool_types.py
rg -n -C3 'class MultimodalToolResult|FileInput|ImportError|crewai_files' lib/crewai/src/crewai/tools/tool_types.py
rg -n -C3 '_extract_multimodal_files|MultimodalToolResult|crewai_files.core.types' lib/crewai/src/crewai/tools/tool_usage.py
rg -n -C2 'MultimodalToolResult|crewai_files|ImportError' lib/crewai/tests/tools

Repository: crewAIInc/crewAI

Length of output: 9361


🏁 Script executed:

sed -n '670,700p' lib/crewai/src/crewai/tools/tool_usage.py

Repository: crewAIInc/crewAI

Length of output: 1230


Decouple MultimodalToolResult extraction from optional crewai_files import.

Lines 674-679 couple two imports in a single try block: BaseFile (from optional crewai_files) and MultimodalToolResult (from the crewai package). When crewai_files is missing, the entire method returns None, preventing MultimodalToolResult handling and causing fallback to str(result) instead of returning result.text.

The fix moves MultimodalToolResult outside the try block and handles the optional BaseFile separately, allowing proper MultimodalToolResult handling regardless of whether crewai_files is installed.

Suggested fix
-        try:
-            from crewai_files.core.types import BaseFile
-
-            from crewai.tools.tool_types import MultimodalToolResult
-        except ImportError:
-            return None
+        from crewai.tools.tool_types import MultimodalToolResult
+        try:
+            from crewai_files.core.types import BaseFile
+        except ImportError:
+            BaseFile = None  # type: ignore[assignment]
 
         if isinstance(result, MultimodalToolResult):
             self._result_files = dict(result.files)
             return result.text
 
-        if isinstance(result, BaseFile):
+        if BaseFile is not None and isinstance(result, BaseFile):
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/src/crewai/tools/tool_usage.py` around lines 674 - 679, The
try/except currently bundles both imports so a missing optional package
(`BaseFile` from crewai_files.core.types) causes early return and prevents
handling of MultimodalToolResult; update the import block so that `from
crewai.tools.tool_types import MultimodalToolResult` is imported unconditionally
(or in its own try) outside the try that imports `BaseFile`, and only the
`BaseFile` import is wrapped in a try/except that sets a local sentinel (e.g.,
BaseFile = None) on ImportError; then adjust the subsequent logic that inspects
tool results to use isinstance(result, MultimodalToolResult) to return
result.text and only use BaseFile-related handling when BaseFile is not None.
Ensure references to MultimodalToolResult and BaseFile in the surrounding
function/class are updated accordingly.

@konbraphat51 konbraphat51 reopened this May 14, 2026
@konbraphat51 konbraphat51 marked this pull request as draft May 14, 2026 08:20
@konbraphat51 konbraphat51 changed the title Feat+Fix: Let Tools returning multimodal data [WIP] Feat+Fix: Let Tools returning multimodal data May 14, 2026
@konbraphat51 konbraphat51 changed the title [WIP] Feat+Fix: Let Tools returning multimodal data Feat+Fix: Let Tools returning multimodal data May 15, 2026
@konbraphat51 konbraphat51 marked this pull request as ready for review May 15, 2026 07:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant