Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client hangs on valid server responses #192

Closed
peter-jerry-ye opened this issue Feb 6, 2025 · 5 comments
Closed

Client hangs on valid server responses #192

peter-jerry-ye opened this issue Feb 6, 2025 · 5 comments

Comments

@peter-jerry-ye
Copy link

Describe the bug

  1. The client hangs when the server send a log notification after the initialization phase.
  2. The client hangs when the server sends a response having the id as string.

Both of the actions operations are valid as per specification.

To Reproduce
This repository contains the files.

By modifying server.py then run uv run client.py should give you the behavior.

It also has a client implemented with typescript sdk as comparison.

Expected behavior
The client should perform normally, as the client written with typescript sdk.

Screenshots
If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

  • OS: MacOS
  • Browser [e.g. chrome, safari]: n/a
  • Version [e.g. 22]: n/a

Smartphone (please complete the following information):

  • Device: [e.g. iPhone6]: n/a
  • OS: [e.g. iOS8.1]: n/a
  • Browser [e.g. stock browser, safari]: n/a
  • Version [e.g. 22]: n/a

Additional context
n/a

@g-votte
Copy link

g-votte commented Feb 10, 2025

I investigated this issue and found:

  • The hang problem itself can be resolved by adding a cast in session.py
  • However, before proceeding with any fix, I would like to hear maintainers' thoughts on whether this issue relates to python-sdk, typescript-sdk, or the specification itself

Here is my analysis to share what I found.

Analysis of Hang Problem

Direct cause of the hang seems in those two lines:

self._response_streams[request_id] = response_stream

stream = self._response_streams.pop(message.root.id, None)

What is happening?

  • _response_streams identifies a request by the "integer" request id
  • When receiving the response, the request id in the message (message.root.id) is not cast to integer
  • Thus, if the request id is passed as a str, the program cannot identify the corresponding request id in the dictionary. Hence, the stream is not sent

Those logics make the following line keep waiting forever:

response_or_error = await response_stream_reader.receive()

Discussion

Following the typescript-sdk approach, we can resolve the hang by adding a cast.

Specifically, changing self._response_streams.pop(message.root.id, None) to self._response_streams.pop(int(message.root.id), None) in this line would fix the hang, allowing @peter-jerry-ye's example program to run without issues:

stream = self._response_streams.pop(message.root.id, None)

However, I wonder about the intended design in the MCP spec - why is RequestId allowed to be either string (str) or number (int)?

There are three possibilities:

  • If the schema is designed for developer experience by allowing incrementing integer IDs to be passed as strings, we should add integer casting
    • In this case, the current python-sdk behavior would be considered a bug, and we could create a PR to add the casting along with test cases to verify this behavior
  • If the schema is designed to support non-numeric IDs (e.g., UUIDs), integer casting would be inappropriate
    • In this case, the typescript-sdk behavior would need revision rather than the python-sdk
  • If this design direction is not yet finalized, neither SDK implementation can be considered definitively correct or incorrect
    • This would indicate a specification issue that should be addressed in the specification repository, as ambiguity in RequestId handling could cause interoperability issues

@dsp-ant Could you help guide our understanding of this issue? The above analysis might be missing important aspects of the MCP design, so we'd appreciate your insights on how to best address this.

@peter-jerry-ye
Copy link
Author

However, I wonder about the intended design in the MCP spec - why is RequestId allowed to be either string (str) or number (int)?

The MCP is based on JSON RPC 2.0, where

An identifier established by the Client that MUST contain a String, Number, or NULL value if included. If it is not included it is assumed to be a notification. The value SHOULD normally not be Null and Numbers SHOULD NOT contain fractional parts

@dsp-ant
Copy link
Member

dsp-ant commented Feb 11, 2025

Thank you for the analysis. This is super helpful. We should fix this. We will stick to the JSON RPC standard and hence allow string or a number. Now we should NOT cast to integer but rather allow the _response_streams, since a string can contain a uuid or any other format that cannot be casted.

Now I am not even sure I fully follow WHY the bug is happening. It seems to me that the _response_stream dictionary uses RequestId as key, so it should be able to find any request id. message.root.id is also a RequestId. I have to do more diggig.

If any of you have time to see you can produce a unit tests for this, this would be massive help. There are some examples for creating a test with a client and a server in the repo.

@peter-jerry-ye
Copy link
Author

Now we should NOT cast to integer but rather allow the _response_streams, since a string can contain a uuid or any other format that cannot be casted.

I see, then I misunderstood the JSON rpc protocol. It seems that the id needs to be exactly the same in the sense that they should also be the same type: number or string. Then this might have been a misimplementation on typescript sdk's side.

Closing as this is the expected behavior.

@g-votte
Copy link

g-votte commented Feb 12, 2025

@dsp-ant
Thank you for your guidance!

Let me clarify the situation:

  • Actually, the python-sdk behavior is correct - it properly handles RequestId allowing both string (such as UUID) and number.
    • The hang occurs in python-sdk when uncommenting this line of @peter-jerry-ye's code and using a string RequestId ("1" instead of 1), which is the expected behavior.
  • On the other hand, the hang does not happen in the typescript-sdk, which might be a bug in the typescript-sdk.
    • The typescript-sdk forcibly casts RequestId to Number here, even if the RequestId is a string.

dsp-ant added a commit that referenced this issue Feb 12, 2025
This test ensures that the server properly preserves and returns the same request ID in responses,
which is a fundamental part of the JSON-RPC protocol. The test initializes the server, sends
requests with custom IDs, and verifies that these IDs are correctly returned.

Github-Issue:#192
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants