Skip to content

feat: support explicit OAuth config for remote MCP servers#2274

Open
iElsha wants to merge 2 commits intodocker:mainfrom
iElsha:feat/explicit-oauth-config-2248
Open

feat: support explicit OAuth config for remote MCP servers#2274
iElsha wants to merge 2 commits intodocker:mainfrom
iElsha:feat/explicit-oauth-config-2248

Conversation

@iElsha
Copy link
Copy Markdown

@iElsha iElsha commented Mar 28, 2026

Support explicit OAuth configuration for remote MCP servers

Closes #2248
Related: #416

Problem

Remote MCP servers that do not support Dynamic Client Registration (RFC 7591), such as Slack and the GitHub MCP Server, were completely broken. More details on #2248

Solution

Add an oauth block to the remote config that lets users supply pre-registered credentials, and fix the OAuth flow so it no longer fabricates a registration_endpoint when the server doesn't advertise one.

Overall the approach is working but had design decisions taken that you might want to validate

  1. No persistence across session, it uses the inMemory storage, meaning oAuth flows needs to run once per sesssion. (Improvement can be in a follow-up PR)
  2. If the user declines the oAuth flows, he can not retry it for this session (same - improvement in a follow-up PR)
  3. oAuth flow is triggered after the first interaction, not on start-up. I had issues getting the elicitation handler wired up at this moment, if you know a better way happy to discuss.
  4. TLS (https) is required for some provider on the oAuth callback. The user will see a warning on the browser when attempting to validate because a temporary certificate is created on the runtime. Most likely a secured certificate should be used.

Hence this PR is putting the fundation for the OAuth supports but it needs a bit of polishing.

Demo video for slack MCP server: https://streamable.com/y3i7mx

Config format

Below format is following the specs given by @aheritier with one exception, the addition of a tls flag required

toolsets:
  - type: mcp
    remote:
      url: https://mcp.slack.com/mcp
      transport_type: streamable
      oauth:
        clientId: "<your-app-client-id>"
        clientSecret: "<your-app-client-secret>"   # optional, for confidential clients
        callbackPort: 3118                          # optional, pins the local redirect port
        tls: true                                   # optional, required by some providers (e.g. Slack)
        scopes:                                     # optional, overrides server-advertised scopes
          - channels:read
          - chat:write
          - search:read.public

AI generated change-log bellow that helps understanding the changes

Detailed Changes

Config (pkg/config/latest/)

  • Added RemoteOAuthConfig struct with fields ClientID, ClientSecret, CallbackPort, Scopes, TLS
  • Added OAuth *RemoteOAuthConfig field to the Remote struct
  • Added validation: oauth requires type: mcp, a url, a non-empty clientId, and callbackPort >= 0
  • TLS is required for some provider like Slack, a certificate is created on runtime to satisfy the needs.

OAuth flow (pkg/tools/mcp/oauth.go)

  • Removed the fabricated registration_endpointvalidateAndFillDefaults() and createDefaultMetadata() no longer invent an endpoint the server never advertised
  • handleManagedOAuthFlow() now follows three paths: explicit clientId → use directly; server advertises registration_endpoint → dynamic registration; neither → fail with a clear actionable error
  • Explicit credentials force the managed flow regardless of the managed flag
  • scopes from config override authServerMetadata.ScopesSupported when provided
  • On user decline, a UserDeclined sentinel is stored in the token store — RoundTrip checks this first on every subsequent request and returns immediately without re-prompting

Callback server (pkg/tools/mcp/oauth_server.go, oauth_helpers.go)

  • NewCallbackServerWithOptions(port, ...opts) + WithTLS() functional option — wraps the TCP listener with an in-memory self-signed TLS cert (required by Slack, which rejects http:// redirect URIs even for loopback)
  • GetRedirectURI() returns https:// when TLS is enabled
  • Fixed ExchangeCodeForToken(): io.ReadAll was draining the body before json.NewDecoder, making the primary decode path dead code — now uses json.Unmarshal directly

Token store (pkg/tools/mcp/tokenstore.go)

  • Added UserDeclined bool to OAuthToken as a session-scoped sentinel to permanently suppress re-prompting after a decline

Runtime (pkg/runtime/runtime.go)

  • emitToolsProgressively() defers OAuth toolsets — they are not started until RunStream is active and an elicitation handler is available

Wiring (pkg/tools/mcp/mcp.go, pkg/tools/mcp/remote.go, pkg/teamloader/registry.go)

  • NewRemoteToolset() accepts oauthConfig; RequiresOAuth() signals the runtime to defer startup
  • registry.go passes toolset.Remote.OAuth through to the toolset

Schema (agent-schema.json)

  • Added RemoteOAuthConfig definition with required: ["clientId"], minimum: 0 on callbackPort, additionalProperties: false

Testing

Unit tests

  • TestOAuthTransport_UserDeclined — end-to-end against a real httptest.Server: elicitation fires exactly once on first request, zero times on all subsequent requests after decline
  • TestOAuthUserDeclinedSentinel — three parallel subtests: sentinel written to store correctly, scoped to the declined URL only, not misclassified as expired by IsExpired()
  • TestValidateAndFillDefaults — guards against regression of the fabricated endpoint; includes a case asserting RegistrationEndpoint stays empty when the server omits it
  • TestToolset_Validate_RemoteOAuth — covers all validation paths: valid config, missing clientId, negative callbackPort, wrong toolset type, missing url
  • TestEmitStartupInfo_OAuthToolsetDeferred / TestEmitStartupInfo_NonOAuthToolsetStartedEagerly — verify the deferred-start behaviour in the runtime
  • TestSelfSignedTLSConfig, TestGetRedirectURI, TestNewCallbackServer_*, TestBuildAuthorizationURL — cover the new callback server and helper paths

Manual — Slack MCP Server (end-to-end)

Tested against the Slack MCP Server using a Slack app registered in the Docker workspace. Config used:

toolsets:
  - type: mcp
    remote:
      url: https://mcp.slack.com/mcp
      transport_type: streamable
      oauth:
        clientId: "<redacted>"
        clientSecret: "<redacted>"
        callbackPort: 3118
        tls: true
        scopes:
          - channels:read
          - chat:write
          - search:read.users
          - search:read.public

Flow verified:

  1. Agent starts — Slack toolset is deferred (0 tools reported at startup, no no elicitation handler configured error)
  2. First interaction — browser prompt opens, Slack OAuth consent screen loads over https://127.0.0.1:3118/callback
  3. User authorises — token exchanged, Slack tools loaded, agent confirms tool count updated
  4. Successfully called search:read.public to search channels and chat:write to send a message
  5. Session ended, new session started — OAuth prompt appears again (token store is per-session, not persisted)

Decline flow verified:

  1. OAuth prompt appears on first interaction
  2. User clicks "Decline" in the elicitation UI
  3. Agent responds that the Slack toolset is unavailable
  4. User sends a second message — no OAuth prompt appears again for the rest of the session
  5. Confirmed via debug logs: RoundTrip returns OAuth authorization was declined for this session immediately on requests 2+, elicitation handler call count stays at 1

Other minor Design decisions

  • UserDeclined reuses the token store rather than adding state to the transport — the InMemoryTokenStore already acts as the per-server auth state map; storing a sentinel there keeps the short-circuit check in one place (RoundTrip) with no new fields or coordination needed elsewhere.
  • callbackPort: 0 means random port — standard Unix convention for "give me any free port". Validation allows >= 0; the callback server treats > 0 as pinned and 0 as random, avoiding the need for a separate sentinel value.
  • scopes fully override, not merge — merging would require deduplication and could silently add scopes the user didn't intend. Full override gives deterministic control, which matters for security-sensitive scope selection.
  • Explicit clientId skips dynamic registration entirely — if credentials are already provided there is no reason to attempt registration. Trying it first would make the explicit config unreliable if the server happens to also advertise a registration_endpoint.

Add an `oauth` block to the `remote` toolset config allowing users to supply pre-registered OAuth credentials (`clientId`, `clientSecret`, `callbackPort`, `scopes`, `tls`) for MCP servers that do not support Dynamic Client Registration (RFC 7591), such as Slack and GitHub MCP Server.

Previously the OAuth flow fabricated a `registration_endpoint` by appending `/register` to the auth server URL when the server metadata omitted it. This caused a guaranteed 404/302 failure and silently dropped the toolset. The fabrication is removed entirely — servers that omit `registration_endpoint` now receive a clear error directing the user to set `remote.oauth.clientId`.

Additional fixes included in this change:
- OAuth toolsets are deferred at startup and only started once RunStream is active and an elicitation handler is available, preventing the `no elicitation handler configured` failure
- User decline of the OAuth prompt is permanent for the session: a `UserDeclined` sentinel stored in the token store short-circuits RoundTrip on all subsequent requests without re-prompting
- TLS support on the callback server via `WithTLS()` option using an in-memory self-signed cert, required by providers like Slack that reject http:// loopback redirect URIs

Closes docker#2248
Related: docker#416

Signed-off-by: Maxence Dominici <dmaxenc@amazon.com>
@iElsha iElsha requested a review from a team as a code owner March 28, 2026 19:48
@aheritier
Copy link
Copy Markdown
Contributor

/review

Copy link
Copy Markdown

@docker-agent docker-agent bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🔴 CRITICAL

This PR introduces important OAuth functionality but has 2 critical security vulnerabilities that must be fixed before merging:


🔴 HIGH Severity Issues

1. XSS Vulnerability in OAuth Error Page (oauth_server.go:144)

The error message from the OAuth provider is rendered directly into HTML without escaping:

fmt.Fprintf(w, `<!DOCTYPE html>...<p>%s</p>...`, errMsg)

Impact: A malicious or compromised OAuth provider could inject JavaScript via the error or error_description query parameters, leading to XSS when the callback server renders the error page.

Fix: Use html.EscapeString() to sanitize error messages before rendering, or use html/template for automatic escaping.

2. Unprotected HTTP GET Calls (oauth.go:236, 399)

The http.Get(resourceURL) calls use the default HTTP client with no timeout or TLS verification configuration.

Impact:

  • Indefinite hangs if the OAuth metadata server is unresponsive, blocking the entire OAuth flow
  • Potential MITM attacks when fetching metadata over HTTPS without proper certificate verification

Fix: Create a dedicated HTTP client with timeout and proper TLS configuration (similar to the existing metadataClient).


🟡 MEDIUM Severity Issues

3. OAuth Error Messages Leak Server Details (oauth.go:307, 437)

Error wrapping with %w exposes full authorization server errors, potentially leaking database errors, internal paths, etc.

Fix: Sanitize errors to only include actionable information for users.

4. Callback Server Shutdown Race (oauth.go:278)

Deferred callbackServer.Shutdown() may interrupt in-flight OAuth callback responses, causing browser connection reset errors.

Fix: Add a small delay (100ms) before shutdown or use a sync mechanism to wait for handler completion.

5. Missing authServer Validation (oauth.go:234)

No validation before constructing fallback resource URL. If authServer is empty or malformed, produces invalid URLs like "https:///.well-known/oauth-protected-resource".

Fix: Validate authServer is non-empty and well-formed before URL construction.

6. Silently Ignored Read Errors (oauth_helpers.go:80, 150)

io.ReadAll errors are discarded with _, making debugging harder when response bodies are malformed.

Fix: Handle read errors properly and include them in error messages.

7. Fixed Certificate Serial Number (oauth_helpers.go:184)

Self-signed TLS certificate uses fixed serial number of 1 instead of random, enabling fingerprinting.

Fix: Generate random serial numbers per RFC 5280 using crypto/rand.

8. OAuth State Machine Bug (remote_runtime.go:277, 285, 293)

Error paths call ResumeElicitation with "decline" and then return an error, creating inconsistent state where the server thinks the user declined but the caller sees an error.

Fix: Either return the error without calling ResumeElicitation, or call it and return nil.


Summary:

  • 🔴 2 HIGH severity issues (security vulnerabilities requiring immediate fix)
  • 🟡 6 MEDIUM severity issues (robustness and security hygiene)
  • ✅ 3 findings dismissed (not actual bugs in changed code)

All findings have been verified against the actual implementation in the PR.

yunus25jmi1 added a commit to yunus25jmi1/docker-agent that referenced this pull request Mar 29, 2026
This commit fixes 6 issues (4 HIGH, 2 MEDIUM severity) identified in the PR review.
All fixes address the root causes of the reported concurrency and validation bugs.

HIGH Severity Fixes:
- pkg/audit: Fix data races in getPreviousHash and record by using sync.RWMutex and
  ensuring atomic updates to the cryptographic chain.
- pkg/audit: Add error check for crypto/rand.Read to prevent predictable ID generation.
- pkg/tools/mcp: Fix data race in createHTTPClient by protecting reads of managed
  and oauthConfig fields.

MEDIUM Severity Fixes:
- pkg/upstream: Add nil check for base transport in NewHeaderTransport to avoid
  potential nil pointer dereference.
- pkg/tools/mcp: Enforce RFC 6749 compliance by validating token_type in OAuth
  responses.

Verification:
- All pkg/audit tests pass (13/13)
- All pkg/tools/mcp OAuth tests pass (6/6)
- Successful build of project and final binary

Fixes: PR docker#2274
yunus25jmi1 added a commit to yunus25jmi1/docker-agent that referenced this pull request Mar 29, 2026
This commit fixes 6 issues (4 HIGH, 2 MEDIUM severity) identified in the review
of PR docker#2274. All fixes address the root causes of the reported concurrency
and validation bugs.

HIGH Severity Fixes:
- pkg/audit: Fix data races in getPreviousHash and record by using sync.RWMutex and
  ensuring atomic updates to the cryptographic chain.
- pkg/audit: Add error check for crypto/rand.Read to prevent predictable ID generation.
- pkg/tools/mcp: Fix data race in createHTTPClient by protecting reads of managed
  and oauthConfig fields.

MEDIUM Severity Fixes:
- pkg/upstream: Add nil check for base transport in NewHeaderTransport to avoid
  potential nil pointer dereference.
- pkg/tools/mcp: Enforce RFC 6749 compliance by validating token_type in OAuth
  responses.

Verification:
- All pkg/audit tests pass (13/13)
- All pkg/tools/mcp OAuth tests pass (6/6)
- Successful build of project and final binary

Fixes docker#2248
Ref: PR docker#2274
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support explicit OAuth configuration for remote MCP servers

2 participants