Character encoding of stdin/stdout is platform-dependent, leading to Unicode corruption #199

Closed
rnortman opened this issue Feb 10, 2025 · 1 comment

Comments

@rnortman
Contributor

Describe the bug
Character encoding of stdin/stdout on Windows is not UTF-8. (It depends on the Windows locale, but is usually CP437 or CP1252.) This breaks JSON, which requires proper Unicode. The Windows Claude Desktop application, at least, sends UTF-8 encoded data.
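
To illustrate the mismatch described above, here is a minimal Python sketch, independent of any MCP SDK. The message shape and the choice of CP1252 as the reader's code page are assumptions for demonstration only; the actual code page depends on the Windows locale.

```python
import json

# Hypothetical JSON-RPC-style message containing an emoji (illustrative only).
message = {"jsonrpc": "2.0", "method": "notify", "params": {"text": "hello 😀"}}

# The sender writes UTF-8 bytes; ensure_ascii=False keeps the emoji as a
# literal multi-byte sequence instead of a \uXXXX escape.
wire_bytes = json.dumps(message, ensure_ascii=False).encode("utf-8")

# A reader using a legacy code page decodes each byte as a separate character,
# so the four-byte emoji turns into four unrelated characters.
garbled = json.loads(wire_bytes.decode("cp1252", errors="replace"))

# A reader using UTF-8 recovers the original text.
intact = json.loads(wire_bytes.decode("utf-8"))

print(repr(garbled["params"]["text"]))
print(repr(intact["params"]["text"]))
```

The JSON still parses in the garbled case, which is why the problem shows up as silent corruption rather than as a parse error.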

To Reproduce
Steps to reproduce the behavior:

  1. Run Claude Desktop on Windows with an MCP server
  2. Send something with an emoji
  3. Observe corruption

Expected behavior
Unicode characters are not corrupted. (MCP clients/servers should be using UTF-8; see additional context below.)

Screenshots
N/A

Desktop (please complete the following information):

  • OS: Windows 11
  • Browser: None (Claude Desktop)
  • Version: ???

Smartphone (please complete the following information):
N/A

Additional context

The JSON spec requires UTF-8, UTF-16, or UTF-32 encoding, so the default behavior on Windows is out of spec. The MCP protocol spec does not specify encoding and has no handshake to establish it, so this leaves some ambiguity, but I think UTF-8 is by far the most reasonable default in the modern era.
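
As a rough sketch of what "default to UTF-8" could look like on the Python side, the standard streams can be re-wrapped with an explicit encoding instead of relying on the locale. The helper name `utf8_stdio` is hypothetical and this is not necessarily how the fix in #198 is implemented.

```python
import io
import sys


def utf8_stdio(stdin_raw=None, stdout_raw=None):
    """Return (reader, writer) text streams that always use UTF-8.

    Hypothetical helper for illustration. The raw binary streams default to
    the process's stdin/stdout buffers but can be injected for testing.
    """
    if stdin_raw is None:
        stdin_raw = sys.stdin.buffer
    if stdout_raw is None:
        stdout_raw = sys.stdout.buffer
    # An explicit encoding overrides the platform default (e.g. CP1252).
    reader = io.TextIOWrapper(stdin_raw, encoding="utf-8")
    # newline="\n" stops the writer from translating "\n" to "\r\n" on Windows.
    writer = io.TextIOWrapper(stdout_raw, encoding="utf-8", newline="\n")
    return reader, writer


if __name__ == "__main__":
    # Echo stdin to stdout line by line, preserving Unicode on any platform.
    reader, writer = utf8_stdio()
    for line in reader:
        writer.write(line)
        writer.flush()
```

On Python 3.7+, calling `sys.stdin.reconfigure(encoding="utf-8")` and `sys.stdout.reconfigure(encoding="utf-8")` is another way to get a similar effect without creating new wrapper objects.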

I have put up a PR with a fix: #198

@rnortman
Contributor Author

PR with fix: #198

@dsp-ant dsp-ant closed this as completed Feb 18, 2025