StdioClientTransport missing explicit UTF-8 charset in InputStreamReader (same issue as #295, but on client side) #898
Description
Bug description
StdioClientTransport has the same encoding mismatch that was identified in #295 and fixed for StdioServerTransportProvider in #826, but the fix was applied only to the server side. The client transport still does not specify an explicit UTF-8 charset when reading from the subprocess.
In startInboundProcessing, the InputStreamReader is created without specifying a charset:
try (BufferedReader processReader = new BufferedReader(new InputStreamReader(process.getInputStream()))) {
Similarly, in startErrorProcessing:
try (BufferedReader processErrorReader = new BufferedReader(
        new InputStreamReader(process.getErrorStream()))) {
Meanwhile, startOutboundProcessing already correctly specifies UTF-8:
os.write(jsonMessage.getBytes(StandardCharsets.UTF_8));
os.write("\n".getBytes(StandardCharsets.UTF_8));
This is the same inconsistency that #295 reported for StdioServerTransportProvider and that #826 fixed, but only on the server side.
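A minimal sketch of the kind of change this implies, assuming the fix mirrors the server-side one: pass StandardCharsets.UTF_8 to the InputStreamReader. The class and the helpers utf8Reader and readFirstLine are hypothetical names used only for illustration, and a ByteArrayInputStream stands in for process.getInputStream().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class Utf8ReaderSketch {

    // Hypothetical helper (not part of the SDK): always decodes as UTF-8,
    // independent of Charset.defaultCharset().
    static BufferedReader utf8Reader(InputStream in) {
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // Reads the first line from UTF-8 encoded bytes; the byte array stands in
    // for the subprocess output stream.
    static String readFirstLine(byte[] utf8Bytes) {
        try (BufferedReader reader = utf8Reader(new ByteArrayInputStream(utf8Bytes))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Japanese text written with Unicode escapes so the source file itself
        // does not depend on the compiler's source encoding.
        byte[] line = "{\"text\":\"\u3053\u3093\u306B\u3061\u306F\"}\n"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(readFirstLine(line));
    }
}
```

The same two-argument constructor would apply to the reader over process.getErrorStream().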
Steps to reproduce
- Start a JVM with default charset set to something other than UTF-8 (e.g., -Dfile.encoding=COMPAT on Windows with Japanese locale, which resolves to MS932/Shift_JIS)
- Connect to an MCP server via StdioClientTransport
- Call a tool that returns multi-byte UTF-8 characters (e.g., Japanese, Chinese, Korean, emoji) in its response
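Before running the steps above, it may help to confirm which charset the JVM actually resolved. On Java 18 and later the default is UTF-8 unless -Dfile.encoding=COMPAT (or an explicit charset) is passed, so the issue does not reproduce without that flag. The following diagnostic is a sketch; the class name is hypothetical, and native.encoding may be null on JVMs older than Java 17.

```java
import java.nio.charset.Charset;

class DefaultCharsetCheck {

    // Summarizes the charset-related settings the JVM is running with.
    static String describeDefaults() {
        return "defaultCharset=" + Charset.defaultCharset()
                + ", file.encoding=" + System.getProperty("file.encoding")
                + ", native.encoding=" + System.getProperty("native.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describeDefaults());
    }
}
```

Running it once with and once without -Dfile.encoding=COMPAT shows whether the test JVM is in the affected configuration.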
Expected behavior
Multi-byte characters in the server's JSON-RPC response should be decoded correctly, since the MCP stdio transport specification requires UTF-8.
Actual behavior
The InputStreamReader uses Charset.defaultCharset() instead of UTF-8. When the default charset is not UTF-8, the response bytes are decoded with the wrong charset, corrupting multi-byte characters. This corruption can also break the JSON structure itself, resulting in a JsonParseException:
com.fasterxml.jackson.core.JsonParseException: Unexpected character ('\' (code 92)): was expecting double-quote to start field name
For example, with MS932 as the default charset, the last byte of certain UTF-8 characters (0x8B, etc.) is interpreted as an MS932 lead byte, which then consumes the following byte as its trail byte. That following byte may be a character that is significant to JSON, such as \ (0x5C). The parser then sees a different character sequence than the server sent, and JSON parsing fails.
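The byte-swallowing effect can be seen in isolation. The sketch below encodes U+304B (whose UTF-8 form is E3 81 8B) followed by a backslash and the letter n, then decodes the bytes once as UTF-8 and once as MS932. The class and method names are hypothetical, and the MS932 branch assumes the JDK ships the windows-31j charset, so it is guarded with Charset.isSupported.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Ms932LeadByteSketch {

    // UTF-8 bytes for U+304B, '\', 'n': E3 81 8B 5C 6E.
    // The 0x8B byte sits directly before the backslash (0x5C).
    static byte[] sampleBytes() {
        return "\u304B\\n".getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // True if the decoded text still contains a backslash character.
    static boolean keepsBackslash(byte[] bytes, Charset charset) {
        return decode(bytes, charset).indexOf('\\') >= 0;
    }

    public static void main(String[] args) {
        byte[] bytes = sampleBytes();
        System.out.println("UTF-8 keeps backslash: "
                + keepsBackslash(bytes, StandardCharsets.UTF_8));
        // "windows-31j" is the JDK's canonical name for MS932.
        if (Charset.isSupported("windows-31j")) {
            System.out.println("MS932 keeps backslash: "
                    + keepsBackslash(bytes, Charset.forName("windows-31j")));
        }
    }
}
```

Under MS932 the bytes pair up as E3 81 and 8B 5C, so the backslash is absorbed into a double-byte character and never reaches the JSON parser as an escape.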
Environment
- MCP Java SDK version: 1.1.1
- Java version: 21
- OS: Windows 11 (Japanese locale, default charset MS932 with -Dfile.encoding=COMPAT)