Gemma 4 (e4b) tool calling fails via Ollama OpenAI-compatible API — streaming tool_calls not recognized #20995

@noxgle

Description

Summary

When using Gemma 4 (e4b) via Ollama (ollama/gemma4:e4b) as a provider, the model correctly returns tool_calls in its responses, but OpenCode does not recognize them, so instead of calling a tool the model answers "I do not have the capability to execute system commands".

Reproduction

  1. Configure Ollama as a provider in opencode.json:
{
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "gemma4:e4b": {
          "name": "Gemma 4 e4b",
          "tools": true,
          "limit": {
            "context": 128000,
            "output": 8192
          }
        }
      }
    }
  }
}
  2. Run opencode -m ollama/gemma4:e4b
  3. Ask: "list files in /tmp"

Expected behavior

The model should call the bash tool with ls /tmp.

Actual behavior

The model responds with: "I do not have the capability to execute system commands like listing directory contents."

Root cause analysis

I investigated and found that:

  1. Ollama API works correctly — both native (/api/chat) and OpenAI-compatible (/v1/chat/completions) endpoints return proper tool_calls with finish_reason: "tool_calls" for gemma4.
  2. Streaming also works at the API level — The streaming response includes tool_calls in the delta and ends with finish_reason: "tool_calls".
  3. The issue is on the opencode side — The AI SDK (@ai-sdk/openai-compatible) or opencode is not properly parsing/recognizing the tool calls from gemma4 in streaming mode.
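
Point 3 can be made concrete: in the OpenAI streaming format, a tool call arrives as partial deltas spread across chunks, and the client must reassemble them before anything is recognizable as a call. A minimal sketch of that reassembly (not opencode's or the AI SDK's actual code; chunk shapes assumed to match Ollama's /v1 streaming output):

```python
# Sketch: how OpenAI-style streaming tool calls must be reassembled.
# Each chunk carries a partial delta; "arguments" arrives as string
# fragments keyed by "index" and must be concatenated.
def assemble_tool_calls(chunks):
    calls = {}  # index -> {"name": ..., "arguments": ...}
    for chunk in chunks:
        for tc in chunk.get("tool_calls", []):
            slot = calls.setdefault(tc["index"], {"name": "", "arguments": ""})
            fn = tc.get("function", {})
            if "name" in fn:
                slot["name"] += fn["name"]
            if "arguments" in fn:
                slot["arguments"] += fn["arguments"]
    return [calls[i] for i in sorted(calls)]

# Hand-written deltas in the assumed shape of the stream for the bash call above
deltas = [
    {"tool_calls": [{"index": 0, "function": {"name": "bash", "arguments": ""}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": "{\"command\":"}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": "\"ls /tmp\"}"}}]},
]
print(assemble_tool_calls(deltas))
# [{'name': 'bash', 'arguments': '{"command":"ls /tmp"}'}]
```

If any step of this accumulation mishandles gemma4's chunking (for example, expecting the full arguments string in one delta), the call silently degrades to plain text, which matches the observed behavior.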

API verification

Non-streaming (works):

curl -s http://localhost:11434/v1/chat/completions -d '{
  "model": "gemma4:e4b",
  "messages": [{"role": "user", "content": "list files in /tmp"}],
  "stream": false,
  "tools": [{"type": "function", "function": {"name": "bash", "description": "Execute a shell command", "parameters": {"type": "object", "required": ["command"], "properties": {"command": {"type": "string"}}}}}]
}'
# Returns: finish_reason: "tool_calls", tool_calls: [{function: {name: "bash", arguments: "{\"command\":\"ls /tmp\"}"}}]

Streaming (also works at API level):

# Streaming includes tool_calls in delta and finish_reason: "tool_calls" in final chunk
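
To make that claim checkable offline, here is a minimal sketch that scans OpenAI-style SSE lines for tool_call deltas and the final finish_reason. The sample lines are hand-written in the (assumed) shape of the endpoint's output, not captured verbatim:

```python
# Sketch: minimal SSE parsing of a /v1/chat/completions stream,
# checking for a tool_call delta and finish_reason: "tool_calls".
import json

sse_lines = [  # abbreviated sample chunks (assumed shape)
    'data: {"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"name":"bash","arguments":"{\\"command\\":\\"ls /tmp\\"}"}}]},"finish_reason":null}]}',
    'data: {"choices":[{"delta":{},"finish_reason":"tool_calls"}]}',
    'data: [DONE]',
]

finish_reason = None
saw_tool_call = False
for line in sse_lines:
    payload = line.removeprefix("data: ")
    if payload == "[DONE]":
        break
    choice = json.loads(payload)["choices"][0]
    if choice["delta"].get("tool_calls"):
        saw_tool_call = True
    finish_reason = choice["finish_reason"] or finish_reason

print(saw_tool_call, finish_reason)  # True tool_calls
```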

Environment

  • OpenCode version: 1.3.13
  • Ollama version: 0.20.2 (Docker)
  • Model: gemma4:e4b
  • OS: Linux

Workaround

Using qwen3.5:9b works correctly with tool calling. The issue is specific to gemma4.

Related

  • ollama/ollama#15241 — Gemma4 tool call parsing fixed in Ollama 0.20.2
  • PR #16531 — OpenAI-compatible custom tool compat layer (unmerged)
  • Issue #20669 — Default agent brittle against local tool-call quirks
  • PR #15306 — Ollama-side tool calling rework for Gemma4

Suggested fix

The toolParser compat layer from PR #16531 should be merged to handle gemma4 tool-call format differences, particularly in streaming mode.
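
For illustration only (the actual approach in PR #16531 may differ): one common shape for such a compat layer is a fallback parser that recovers a tool call when the model emits it as a JSON object inside plain text rather than a structured tool_calls field. A hypothetical sketch:

```python
# Hypothetical fallback sketch, NOT PR #16531's actual code: recover a
# tool call that a model emitted as a JSON blob inside text content.
import json
import re

def extract_text_tool_call(text):
    """Return {"name", "arguments"} if text contains a JSON tool call."""
    match = re.search(r'\{.*\}', text, re.DOTALL)
    if not match:
        return None
    try:
        obj = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if "name" in obj and "arguments" in obj:
        return {"name": obj["name"], "arguments": obj["arguments"]}
    return None

content = 'I will run that. {"name": "bash", "arguments": {"command": "ls /tmp"}}'
print(extract_text_tool_call(content))
# {'name': 'bash', 'arguments': {'command': 'ls /tmp'}}
```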

Labels

  • bug: Something isn't working
  • core: Anything pertaining to core functionality of the application (opencode server stuff)
