Extend resolver DI to sampling and roots requests

maxisbey · maxisbey · commit f24a39fdcf26 · 2026-07-01T22:46:03.000Z
Resolvers can now return Sample(...) or ListRoots() in addition to
Elicit: on 2026-07-28 sessions the request batches into the
multi-round-trip InputRequiredResult flow, on 2025-11-25 it goes over
the standalone back-channel request. One rendering produces the
identical wire request on both transports, and marker-routed legacy
sends bypass the deprecated session wrappers so no SEP-2577 warning
fires for the compatibility path.

Sampling and roots results are persisted in request_state like
elicited answers (the client pays for an LLM call once per tool call,
not once per round), pinned to the exact rendered request. Because the
response union cannot always discriminate the two sampling result
shapes, an answer is validated against the marker's expected model
rather than trusting the union member.

The elicitation-only capability check generalizes to a per-kind gate
applied before sending on either transport: sampling, roots, and
elicitation - including sampling.tools when the request carries tools,
reported in full in the -32021 requiredCapabilities payload. This also
gates the previously unchecked 2025 elicitation leg (documented in the
migration guide).

Client gains sampling_capabilities so sampling sub-capabilities like
tools support can be declared alongside sampling_callback.
diff --git a/docs/client/callbacks.md b/docs/client/callbacks.md
@@ -78,6 +78,8 @@ When a client connects it declares its `capabilities`, the mirror image of the s
 | `list_roots_callback=` | `"roots": {"listChanged": true}` |
 | none of them | `{}` |
 
+Sampling sub-capabilities are the one refinement: pass `sampling_capabilities=SamplingCapability(tools=SamplingToolsCapability())` alongside `sampling_callback` when your sampler handles the `tools` / `tool_choice` parameters - servers must see `sampling.tools` declared before sending them.
+
 `logging_callback` and `message_handler` are not in the table. They handle notifications, and notifications need no capability.
 
 The server reads the declaration back with `ctx.session.check_client_capability(...)`. Add a tool that does:
diff --git a/docs/handlers/dependencies.md b/docs/handlers/dependencies.md
@@ -134,12 +134,25 @@ That's the right default for a precondition: no answer, no order. When declining
     to bind to. A question built from such volatile data makes every recorded answer look stale,
     so the server re-asks it on every round until the client's round limit ends the call.
 
+## Ask the client, not the user
+
+Elicitation is one of three questions a resolver can ask - the closed set the multi-round-trip flow allows. The other two go to the **client** rather than the user: return `Sample(...)` to run an LLM call through the client (a `sampling/createMessage` request), or `ListRoots()` to fetch the client's current roots. Neither has an accept/decline outcome - the consumer annotates the result type directly, `CreateMessageResult` (`CreateMessageResultWithTools` when the request carries tools) or `ListRootsResult`:
+
+```python title="server.py" hl_lines="11-16 22"
+--8<-- "docs_src/dependencies/tutorial004.py"
+```
+
+* The framework routes these exactly like `Elicit`: inside the multi-round-trip `tools/call` on **2026-07-28**, over the standalone server->client request on **2025-11-25** - and on either transport it refuses with a `-32021` protocol error when the client never declared the matching capability (`sampling`, `roots`, `elicitation`; `sampling.tools` when the request carries tools).
+* Everything the info box above says about questions applies unchanged: a `Sample` request is matched to its recorded result by its exact rendering, so build it deterministically from the tool's arguments and earlier answers - the client then pays for the LLM call once per tool call, not once per round. The recorded result rides `request_state` for the rest of the call, so a very large completion makes every remaining round-trip heavier.
+* The standalone sampling and roots *features* are deprecated at 2026-07-28 (SEP-2577) - new servers that need the client's model ask through this carrier instead, and servers that don't should integrate with an LLM provider directly. `include_context` values other than `"none"` are themselves deprecated; avoid them.
+
 ## Recap
 
 * `Annotated[T, Resolve(fn)]` on a tool parameter: the SDK runs `fn` and injects its return value.
 * A resolved parameter is invisible to the model and cannot be supplied by a client. Values the model must not invent - prices, identities, permissions - belong here.
 * A resolver's parameters are resolved the same way: the `Context`, another `Resolve(...)`, or a tool argument by name. The graph runs each resolver at most once per round, however many consumers it has; each question is asked exactly once, and any resolver may run again when a call resumes after a question.
 * Bad graphs fail at registration with `InvalidSignature`, not mid-call.
 * Return `Elicit(message, Model)` to ask the user, only when you have to. Unwrapped annotations abort on decline; `ElicitationResult[T]` lets the tool branch.
+* Return `Sample(...)` or `ListRoots()` to ask the client - an LLM completion or the roots list, injected as the plain result.
 
 The state your server builds once at startup, and how a handler reaches it, is the **[Lifespan](lifespan.md)** page.
diff --git a/docs/handlers/multi-round-trip.md b/docs/handlers/multi-round-trip.md
@@ -19,7 +19,7 @@ That's the whole protocol. Every leg is an ordinary request from the client to t
 
 ## The server side
 
-On `@mcp.tool()` you rarely build this by hand: declare a dependency that asks the user and the SDK returns the `InputRequiredResult` for you - that form is the **[Dependencies](dependencies.md)** page. The two forms don't mix: a call has one `input_responses`/`request_state` channel, so a tool that uses `Resolve(...)` parameters cannot also return `InputRequiredResult` from its body. A declared `InputRequiredResult` return is rejected at registration (`InvalidSignature`), and an undeclared one fails the call at runtime. The manual form is the **low-level** `Server`, whose `on_call_tool` handler is allowed to return either result type:
+On `@mcp.tool()` you rarely build this by hand: declare a dependency that asks the user (`Elicit`), samples the client's LLM (`Sample`), or lists its roots (`ListRoots`) and the SDK returns the `InputRequiredResult` for you - that form is the **[Dependencies](dependencies.md)** page. The two forms don't mix: a call has one `input_responses`/`request_state` channel, so a tool that uses `Resolve(...)` parameters cannot also return `InputRequiredResult` from its body. A declared `InputRequiredResult` return is rejected at registration (`InvalidSignature`), and an undeclared one fails the call at runtime. The manual form is the **low-level** `Server`, whose `on_call_tool` handler is allowed to return either result type:
 
 ```python title="server.py" hl_lines="44-47"
 --8<-- "docs_src/mrtr/tutorial001.py"
diff --git a/docs/migration.md b/docs/migration.md
@@ -40,6 +40,21 @@ to receive the `InputRequiredResult` and forward it as its own result calls
 dependencies elicit via `Resolve(...)`: the resolver owns that tool's
 `request_state` channel, and a forwarded result's state would clobber it.
 
+### Resolver-routed requests require the client capability on every protocol version
+
+A v1 server could call `ctx.elicit()`, `create_message()`, or `list_roots()`
+against any client; nothing checked what the client had declared. In v2 the
+`Resolve(...)` markers (`Elicit`, `Sample`, `ListRoots`) enforce the spec's
+egress rule on both transports: if the client never declared the matching
+capability (`elicitation`, `sampling` — plus `sampling.tools` when the request
+carries tools — or `roots`), the call fails with a `-32021`
+`MISSING_REQUIRED_CLIENT_CAPABILITY` JSON-RPC error instead of sending a
+request the client cannot handle. This applies on 2025-11-25 sessions too, so a
+client that answered elicitations without declaring the capability now sees the
+error: declare the capability (the SDK client does this automatically when the
+matching callback is set) or drop the asking dependency. Direct `ctx.elicit()`
+and `ctx.session.*` calls outside resolvers are not gated.
+
 ### `MCPError` raised from an `@mcp.tool()` handler now surfaces as a JSON-RPC error
 
 Raising `MCPError` (or a subclass such as `UrlElicitationRequiredError`) inside
diff --git a/docs_src/dependencies/tutorial004.py b/docs_src/dependencies/tutorial004.py
@@ -0,0 +1,26 @@
+from typing import Annotated
+
+from mcp_types import CreateMessageResult, SamplingMessage, TextContent
+
+from mcp.server import MCPServer
+from mcp.server.mcpserver import Resolve, Sample
+
+mcp = MCPServer("Bookshop")
+
+
+def suggest_title(genre: str) -> Sample:
+    prompt = f"Suggest one {genre} book title. Answer with the title only."
+    return Sample(
+        [SamplingMessage(role="user", content=TextContent(type="text", text=prompt))],
+        max_tokens=50,
+    )
+
+
+@mcp.tool()
+async def recommend_book(
+    genre: str,
+    suggestion: Annotated[CreateMessageResult, Resolve(suggest_title)],
+) -> str:
+    """Recommend a book in the given genre."""
+    title = suggestion.content.text if suggestion.content.type == "text" else "the classics"
+    return f"Today's {genre} pick: {title}"
diff --git a/src/mcp/client/client.py b/src/mcp/client/client.py
@@ -303,6 +303,12 @@ async def main():
     sampling_callback: SamplingFnT | None = None
     """Callback for handling sampling requests."""
 
+    sampling_capabilities: types.SamplingCapability | None = None
+    """Sampling sub-capabilities to declare alongside `sampling_callback` (e.g. tools support).
+
+    Only declared when `sampling_callback` is set; on its own it has no effect.
+    """
+
     list_roots_callback: ListRootsFnT | None = None
     """Callback for handling list roots requests."""
 
@@ -418,6 +424,7 @@ async def _build_session(self, exit_stack: AsyncExitStack) -> ClientSession:
             dispatcher=dispatcher,
             read_timeout_seconds=self.read_timeout_seconds,
             sampling_callback=self.sampling_callback,
+            sampling_capabilities=self.sampling_capabilities,
             list_roots_callback=self.list_roots_callback,
             logging_callback=self.logging_callback,
             message_handler=message_handler,
diff --git a/src/mcp/server/mcpserver/__init__.py b/src/mcp/server/mcpserver/__init__.py
@@ -19,7 +19,9 @@
     DeclinedElicitation,
     Elicit,
     ElicitationResult,
+    ListRoots,
     Resolve,
+    Sample,
 )
 from .resources import DEFAULT_RESOURCE_SECURITY, ResourceSecurity
 from .server import MCPServer, require_client_extension
@@ -33,6 +35,8 @@
     "Icon",
     "Resolve",
     "Elicit",
+    "Sample",
+    "ListRoots",
     "ElicitationResult",
     "AcceptedElicitation",
     "DeclinedElicitation",
diff --git a/src/mcp/server/mcpserver/resolve.py b/src/mcp/server/mcpserver/resolve.py
diff --git a/tests/docs_src/test_dependencies.py b/tests/docs_src/test_dependencies.py
diff --git a/tests/server/mcpserver/test_resolve.py b/tests/server/mcpserver/test_resolve.py