feat(mcp): forward embedded resources to model providers#2935
Conversation
|
❌ PR Review Failed — The review agent encountered an error and could not complete the review. View logs. |
aheritier
left a comment
There was a problem hiding this comment.
pkg/model/provider/openai/attachments.go — FileData for PDF now sends a data URI instead of raw base64
The PR changes the PDF encoding from raw base64 to a data: URI:
// Before:
b64Data := base64.StdEncoding.EncodeToString(doc.Source.InlineData)
FileData: param.NewOpt(b64Data),
// After (this PR):
dataURI := fmt.Sprintf("data:%s;base64,%s", doc.MimeType, base64.StdEncoding.EncodeToString(doc.Source.InlineData))
FileData: param.NewOpt(dataURI),The openai-go v3.37.0 SDK godoc for ResponseInputFileParam.FileData (and ResponseInputFileContentParam.FileData) reads:
"The base64-encoded data of the file to be sent to the model."
Raw base64, not a data URI. Sending data:application/pdf;base64,... as the value means the data: prefix will be treated as part of the file content, corrupting the PDF. The previous code and its comment ("not a data URI") were correct per the SDK. This change will silently break PDF forwarding for OpenAI Responses API users.
pkg/model/provider/gemini/client.go — text document MIME type is hardcoded as "text/plain"
In functionResponsePartsFromMultiContent:
} else if docPart.Text != {
parts = append(parts, genai.NewFunctionResponsePartFromBytes([]byte(docPart.Text), "text/plain"))
}This path is taken for any text/* document (text/markdown, text/html, text/csv, etc.), but the MIME is hardcoded as "text/plain" regardless of the document's actual MimeType. The original MIME is available via part.Document.MimeType and should be used here instead.
pkg/model/provider/bedrock/convert.go — empty text parts appended without filtering
In convertToolResultContent, text parts from MultiContent are appended unconditionally:
case chat.MessagePartTypeText:
blocks = append(blocks, &types.ToolResultContentBlockMemberText{Value: part.Text})The Anthropic paths (both standard client.go and beta beta_converter.go) skip whitespace-only text parts before including them in tool-result content. The Bedrock path has no such guard, so a whitespace-only text part produces an empty ToolResultContentBlockMemberText block that Bedrock may reject.
pkg/model/provider/anthropic/attachments.go — Claude capability fallback is unexplained
if !mc.Supports(doc.MimeType) && modelinfo.IsClaude(ctx, store, id) {
mc = modelinfo.CapsWith(true, true)
}When the models.dev store has no entry for a Claude model, this silently overrides capability checks to enable all image and PDF types. There is no comment explaining why this heuristic is safe or intentional. A future unknown Claude model that genuinely lacks vision/PDF support would have attachments forwarded to it anyway, resulting in API errors instead of graceful degradation.
Preserve MCP embedded image, PDF, and text resources as tool-result attachments instead of reducing them to text markers. Convert those attachments into provider-native content blocks for Anthropic, OpenAI, Bedrock, and Gemini, including Anthropic image/document tool_result blocks and OpenAI Responses input_file data URIs. Also ensure tool-result content is never empty and deep-copy document attachments in session branches. Signed-off-by: Djordje Lukic <djordje.lukic@docker.com>
8a85f05 to
93b8b04
Compare
Preserve MCP embedded image, PDF, and text resources as tool-result attachments instead of reducing them to text markers. Convert those attachments into provider-native content blocks for Anthropic, OpenAI, Bedrock, and Gemini, including Anthropic image/document tool_result blocks and OpenAI Responses input_file data URIs. Also ensure tool-result content is never empty and deep-copy document attachments in session branches.