Skip to content

feat(mcp): forward embedded resources to model providers#2935

Merged
rumpl merged 1 commit into
docker:mainfrom
rumpl:mcp-embedded-resource-media
Jun 1, 2026
Merged

feat(mcp): forward embedded resources to model providers#2935
rumpl merged 1 commit into
docker:mainfrom
rumpl:mcp-embedded-resource-media

Conversation

@rumpl
Copy link
Copy Markdown
Member

@rumpl rumpl commented May 29, 2026

Preserve MCP embedded image, PDF, and text resources as tool-result attachments instead of reducing them to text markers. Convert those attachments into provider-native content blocks for Anthropic, OpenAI, Bedrock, and Gemini, including Anthropic image/document tool_result blocks and OpenAI Responses input_file data URIs. Also ensure tool-result content is never empty and deep-copy document attachments in session branches.

@rumpl rumpl requested a review from a team as a code owner May 29, 2026 20:44
@aheritier aheritier added area/mcp MCP protocol, MCP tool servers, integration area/providers For features/issues/fixes related to LLM providers (Bedrock, LiteLLM, Qwen, custom, etc.) area/tools For features/issues/fixes related to the usage of built-in and MCP tools kind/feat PR adds a new feature (maps to feat: commit prefix) labels May 29, 2026
@docker-agent
Copy link
Copy Markdown

PR Review Failed — The review agent encountered an error and could not complete the review. View logs.

Copy link
Copy Markdown
Contributor

@aheritier aheritier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pkg/model/provider/openai/attachments.goFileData for PDF now sends a data URI instead of raw base64

The PR changes the PDF encoding from raw base64 to a data: URI:

// Before:
b64Data := base64.StdEncoding.EncodeToString(doc.Source.InlineData)
FileData: param.NewOpt(b64Data),

// After (this PR):
dataURI := fmt.Sprintf("data:%s;base64,%s", doc.MimeType, base64.StdEncoding.EncodeToString(doc.Source.InlineData))
FileData: param.NewOpt(dataURI),

The openai-go v3.37.0 SDK godoc for ResponseInputFileParam.FileData (and ResponseInputFileContentParam.FileData) reads:

"The base64-encoded data of the file to be sent to the model."

Raw base64, not a data URI. Sending data:application/pdf;base64,... as the value means the data: prefix will be treated as part of the file content, corrupting the PDF. The previous code and its comment ("not a data URI") were correct per the SDK. This change will silently break PDF forwarding for OpenAI Responses API users.


pkg/model/provider/gemini/client.go — text document MIME type is hardcoded as "text/plain"

In functionResponsePartsFromMultiContent:

} else if docPart.Text !=  {
    parts = append(parts, genai.NewFunctionResponsePartFromBytes([]byte(docPart.Text), "text/plain"))
}

This path is taken for any text/* document (text/markdown, text/html, text/csv, etc.), but the MIME is hardcoded as "text/plain" regardless of the document's actual MimeType. The original MIME is available via part.Document.MimeType and should be used here instead.


pkg/model/provider/bedrock/convert.go — empty text parts appended without filtering

In convertToolResultContent, text parts from MultiContent are appended unconditionally:

case chat.MessagePartTypeText:
    blocks = append(blocks, &types.ToolResultContentBlockMemberText{Value: part.Text})

The Anthropic paths (both standard client.go and beta beta_converter.go) skip whitespace-only text parts before including them in tool-result content. The Bedrock path has no such guard, so a whitespace-only text part produces an empty ToolResultContentBlockMemberText block that Bedrock may reject.


pkg/model/provider/anthropic/attachments.go — Claude capability fallback is unexplained

if !mc.Supports(doc.MimeType) && modelinfo.IsClaude(ctx, store, id) {
    mc = modelinfo.CapsWith(true, true)
}

When the models.dev store has no entry for a Claude model, this silently overrides capability checks to enable all image and PDF types. There is no comment explaining why this heuristic is safe or intentional. A future unknown Claude model that genuinely lacks vision/PDF support would have attachments forwarded to it anyway, resulting in API errors instead of graceful degradation.

Preserve MCP embedded image, PDF, and text resources as tool-result attachments instead of reducing them to text markers. Convert those attachments into provider-native content blocks for Anthropic, OpenAI, Bedrock, and Gemini, including Anthropic image/document tool_result blocks and OpenAI Responses input_file data URIs. Also ensure tool-result content is never empty and deep-copy document attachments in session branches.

Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
@rumpl rumpl force-pushed the mcp-embedded-resource-media branch from 8a85f05 to 93b8b04 Compare June 1, 2026 09:40
@rumpl rumpl merged commit 342276b into docker:main Jun 1, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/mcp MCP protocol, MCP tool servers, integration area/providers For features/issues/fixes related to LLM providers (Bedrock, LiteLLM, Qwen, custom, etc.) area/tools For features/issues/fixes related to the usage of built-in and MCP tools kind/feat PR adds a new feature (maps to feat: commit prefix)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants