Question: Do responses need to be correlated with requests? #31

Closed
benfrancis opened this issue Dec 19, 2024 · 11 comments
@benfrancis
Member

@RobWin and @hspaay have suggested that for the more request/response style messages in the protocol it would be useful to be able to correlate responses with specific requests. E.g. so that a Consumer knows that a given propertyReading message corresponds to a particular readProperty message.

Is it important for a Consumer to know exactly which request a given response corresponds to, or is it OK for a Consumer which sends a readProperty message to just make use of the next propertyReading message it receives for the given property?

Which other operations might need this feature?

How would it work?

What additional members are needed?

@benfrancis
Member Author

Note that in my strawman proposal action requests are already assigned a unique identifier (currently a URL) in an actionStatus message which can be used to query the status of the action with a queryAction message (Inspired by the ActionStatus object in the HTTP Basic Profile). However, there is no way to correlate an actionStatus message with an invokeAction message other than comparing timestamps and/or input parameters.

How would this work for properties? Would a Consumer generate an identifier (e.g. requestId) which is included in a readProperty message, which is then also included in a propertyReading message?
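
A minimal sketch of what that could look like: the Consumer generates an identifier, attaches it to the readProperty message, and the Thing copies it into the propertyReading message. The requestId member is the name floated above; the thingID, name, and value members, the example URN, and both helper functions are assumptions for illustration, not an agreed message format.

```python
import json
import uuid


def make_read_property(thing_id: str, name: str) -> dict:
    """Build a readProperty message carrying a Consumer-generated requestId."""
    return {
        "thingID": thing_id,              # assumed member name
        "messageType": "readProperty",
        "name": name,                     # assumed member name
        "requestId": str(uuid.uuid4()),
    }


def make_property_reading(request: dict, value) -> dict:
    """Build the Thing's reply, echoing the requestId from the request."""
    return {
        "thingID": request["thingID"],
        "messageType": "propertyReading",
        "name": request["name"],
        "value": value,                   # assumed member name
        "requestId": request["requestId"],
    }


request = make_read_property("urn:example:lamp1", "on")
reply = make_property_reading(request, True)
print(json.dumps(reply, indent=2))
```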

@RobWin
Collaborator

RobWin commented Dec 19, 2024

Hello,

I believe including a unique message ID and correlation ID in the response message is beneficial, especially in a request/reply pattern. While not every client implementation is required to use these IDs, having them available adds flexibility and helps ensure that more complex use cases are supported. Clients that don't need to correlate responses can simply process the incoming messages without paying attention to the correlation ID. However, for clients that do need to track responses against specific requests, the presence of these IDs makes it easier and more reliable.

WebSockets, by design, enable asynchronous communication, which means that responses may not arrive in the same order as requests were sent. This is where message IDs and correlation IDs become crucial. Message IDs allow you to uniquely identify each message, while correlation IDs ensure that even if responses are received out-of-order, each reply can be properly matched with its corresponding request. This is particularly important when dealing with scenarios where multiple requests are being sent simultaneously, and responses might be processed by the server in parallel.

For example, consider a scenario where a client is controlling a light by issuing a series of commands—turning the light off, turning it on, and changing its color. If the device handling these requests is multi-threaded or experiences any form of processing delay, it’s possible for the responses to be sent back in a different order than the requests were made. In such cases, having a correlation ID in the response would allow the client to correctly match the response (e.g., turning the light on) to the corresponding request, even if the response for "turning the light off" arrives afterward. This ensures that the client can maintain the proper sequence of operations and understand the state of the system correctly.
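
The lamp scenario can be sketched as follows. This is illustrative only: the messageID, correlationID, action, and status member names, the action names, and the simulated reordering are all assumptions, not the behaviour of any particular Thing.

```python
import uuid

# Requests the Consumer has sent, keyed by the ID it generated for each one.
pending = {}


def send(action: str) -> dict:
    """Record an outgoing invokeAction message and return it."""
    message = {
        "messageType": "invokeAction",
        "action": action,                 # assumed member name
        "messageID": str(uuid.uuid4()),
    }
    pending[message["messageID"]] = action
    return message


sent = [send("turnOff"), send("turnOn"), send("setColor")]

# Simulate a Thing that answers out of order, e.g. because of worker threads.
responses = [
    {"messageType": "actionStatus", "status": "completed",
     "correlationID": m["messageID"]}
    for m in (sent[1], sent[2], sent[0])
]

matched = []
for response in responses:
    # Look up the originating request instead of relying on arrival order.
    action = pending.pop(response["correlationID"])
    matched.append(action)
    print(f"{action}: {response['status']}")
```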

Ultimately, incorporating message IDs and correlation IDs provides a layer of robustness that allows for better handling of out-of-order or delayed responses, improving the overall reliability and consistency of the system.

Also, in error-handling use cases it's important to understand which request message has actually failed. Correlation IDs help to ensure that retry logic is tied to the correct message. If a request fails and needs to be retried, the correlation ID ensures that the retry is linked to the original request, avoiding confusion or duplicate processing.

@hspaay
Collaborator

hspaay commented Dec 19, 2024

Adding a few use-cases in favor of request/correlation-id:

  1. A consumer sends three commands to three different devices, or three different inputs on a single device. The consumer is shown the progress of each command so they don't keep repeating it if the device responds a bit slowly. After the third response is received, the user interface indicates that the commands for each action are complete.
    For this to work deterministically, it must be possible to know when each command for each device or action is completed and which device or output it belongs to. Using a request/correlation-id lets the consumer link the response to the request.

  2. The consumer uses a hub or gateway to send a request to a device. A request is first sent to the router/gateway which in turn passes it on to another device. The result is asynchronous. When the hub/gateway receives a response from the device it must route this response back to the consumer. Using the request-IDs the router can identify the consumer to forward the response to.

  3. A device does not provide output on an action but updates a property or sends an event once the action is complete. The availability of a requestID allows the consumer to get notified when the action is complete.
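
Use case 2 could be sketched roughly like this. The Gateway class, its method names, and the consumer labels are hypothetical; a real gateway would hold connection objects rather than strings.

```python
class Gateway:
    """Remembers which consumer sent each request so replies can be routed back."""

    def __init__(self):
        self._routes = {}  # requestID -> consumer identifier

    def forward_request(self, consumer: str, message: dict) -> dict:
        # Record the origin before passing the message on to the device.
        self._routes[message["requestID"]] = consumer
        return message

    def route_response(self, message: dict):
        # Returns the consumer to deliver to, or None for an unknown requestID.
        return self._routes.pop(message["requestID"], None)


gateway = Gateway()
gateway.forward_request("consumer-a", {"messageType": "readProperty", "requestID": "r1"})
gateway.forward_request("consumer-b", {"messageType": "readProperty", "requestID": "r2"})

# The device answers asynchronously, here in reverse order.
first = gateway.route_response({"messageType": "propertyReading", "requestID": "r2"})
second = gateway.route_response({"messageType": "propertyReading", "requestID": "r1"})
print(first, second)
```

Note that this sketch assumes request IDs never collide across consumers, which is only probabilistically true when each consumer generates its own.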

@VigneshVSV

VigneshVSV commented Dec 19, 2024

When I first read the strawman proposal a few months back, I wanted to comment on this specific need.

I think it's absolutely necessary to have a message ID. Apart from asyncio clients, threaded clients and clients shared in multiprocess scenarios will always need a message ID to correlate the responses. These patterns are common in Python and also in SCADA systems and in control and real-time systems based on the Web of Things.

Besides, it will help solve many more problems apart from correlation, including writing nicer logic, logs, etc. It's also only a few more bytes. With a decent serializer and a fast event loop, the overhead will be literally in nanoseconds, if I am not wrong.

I have already given the strawman proposal a shot over ZMQ, but I could not conclude it because my use case was significantly ahead of what the proposal could satisfy. However, it could be very useful to also share the message contract over multiple transport layers apart from WebSocket. So, some day I hope that my message contract will converge to what you guys have, and I absolutely need a message ID.

The following is my current messaging contract, which looks somewhat similar to the strawman proposal, except laid out as a list instead of a dict/JSON:
https://github.com/VigneshVSV/hololinked/blob/ad934c47d6c76bd3a1298eadfa2512c49561e686/hololinked/server/protocols/zmq/message.py#L124

You can also see the message types I support:

https://github.com/VigneshVSV/hololinked/blob/ad934c47d6c76bd3a1298eadfa2512c49561e686/hololinked/server/protocols/zmq/message.py#L11

If I use your sub-protocol, this will go in the messageType field.

I had some issue with encoding a raw byte payload with a different content type than JSON (like raw byte array or an octet stream) within the JSON, so I didn't cast the contract yet into something looking like the proposal of this project.

Below you can see how it is used to run a certain WoT operation:

https://github.com/VigneshVSV/hololinked/blob/ad934c47d6c76bd3a1298eadfa2512c49561e686/hololinked/server/rpc_server.py#L414

https://github.com/VigneshVSV/hololinked/blob/ad934c47d6c76bd3a1298eadfa2512c49561e686/hololinked/server/rpc_server.py#L374

It's not exactly following the standard, but I hope you can see the similarity.

@benfrancis
Member Author

I'm not completely convinced by all of the use cases suggested here:

  • As I understand it TCP (and by extension WebSockets) does at least guarantee that messages are received in the order they were sent, and timestamps can go a long way to help with eventual consistency
  • In the strawman proposal it's already possible to track the status of multiple action requests in parallel
  • If a gateway is responding on behalf of a device and communicating with multiple Consumers it needs to be able to differentiate between Consumers without the use of a requestID
  • A Thing should never assume that a Consumer can understand relationships between interaction affordances, since there is no standardised way to denote this in a Thing Description. It definitely wouldn't make sense to include the requestID from an invokeAction request in a propertyReading response for example.

However, I can see that correlating responses with requests could result in a more robust system and reduce the risk of race conditions and ambiguous responses. It's definitely true that error responses are more useful if they can be correlated with a request!

I think we should probably add a requestID to all Consumer -> Thing messages, which can then be included in Thing -> Consumer messages to correlate responses with requests.

That ID could also be used in a requestID member in the actionStatus message to replace the href member in the strawman proposal (to both correlate actionStatus messages with an invokeAction message and with other actionStatus messages for the same action request). It could also be used in a potential eventSubscription message being discussed in #29.

Open questions:

  • Assuming the Consumer is responsible for generating unique IDs to use as requestIDs, what format should they take? I'm thinking it could be any string but with a recommendation of some version of UUID.
  • If multiple subscribeEvent/subscribeAllEvents or observeProperty/observeAllProperties messages are sent by the same Consumer for the same event/property affordance, which requestID should be included in event/propertyReading messages? The last one received? Or should requestID be omitted from event messages in case of overlapping subscriptions, and omitted from propertyReading messages unless in response to a readProperty message?
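
For the first question, generating a random (version 4) UUID as a string is a one-liner in most languages. A Python sketch, where new_request_id is a hypothetical helper name:

```python
import uuid


def new_request_id() -> str:
    """Return a random version 4 UUID in its canonical string form."""
    return str(uuid.uuid4())


request_id = new_request_id()
print(request_id)
# Parsing the string back confirms the version field.
print(uuid.UUID(request_id).version)
```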

@hspaay
Collaborator

hspaay commented Jan 6, 2025

> If a gateway is responding on behalf of a device and communicating with multiple Consumers it needs to be able to differentiate between Consumers without the use of a requestID.

Maybe you already have a solution for this? I don't see a way to accomplish this though, so I'm asking what is wrong with using the requestID to differentiate consumers?

I've implemented the current proposal but can't get it to work properly without adding this dependency on the requestID to identify the consumer to send a response to.

> I think we should probably add a requestID to all Consumer -> Thing messages, which can then be included in Thing -> Consumer messages to correlate responses with requests.

I welcome this.

> That ID could also be used in a requestID member in the actionStatus message

Yes 👍


open questions:

> ... requestIDs, what format should they take? I'm thinking it could be any string but with a recommendation of some version of UUID.

Currently I'm using shortid (golang) and nanoid (JS). According to the author of nanoid:
"Nano ID is quite comparable to UUID v4 (random-based). It has a similar number of random bits in the ID (126 in Nano ID and 122 in UUID), so it has a similar collision probability:"

So yes, some version of UUID would work nicely. As long as the collision probability is low.
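
To put rough numbers on "similar collision probability", the standard birthday-bound approximation can be evaluated for both bit counts quoted above (122 and 126). This is a back-of-the-envelope sketch; the function name and the choice of one billion IDs are assumptions for illustration.

```python
import math


def collision_probability(n_ids: int, random_bits: int) -> float:
    """Birthday-bound approximation: p = 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2.0 ** random_bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n_ids * (n_ids - 1) / (2.0 * space))


p_uuid = collision_probability(10**9, 122)   # UUID v4
p_nano = collision_probability(10**9, 126)   # Nano ID (default settings, per the quote)
print(f"UUID v4: {p_uuid:.3e}")
print(f"Nano ID: {p_nano:.3e}")
```

Four extra random bits make a collision about 16 times less likely, but both probabilities are vanishingly small at this scale.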

> If multiple subscribeEvent/subscribeAllEvents or observeProperty/observeAllProperties messages are sent by the same Consumer for the same event/property affordance, which requestID should be included in event/propertyReading messages...

Hmm yes, good question. It is a bit of an edge case though. Two options come to mind:

  1. Prefer specific over general. The downside is that a Thing would have to scan all subscriptions of a consumer to find the most specific one.
  2. Use the first match found by the Thing when looking for subscriptions. This is easier on the Thing, but the consumer would have to figure out which subscription it belongs to. (preferred)

My preference is 2 as this is a problem introduced by the consumer by making overlapping subscriptions.

@RobWin
Collaborator

RobWin commented Jan 7, 2025

> As I understand it TCP (and by extension WebSockets) does at least guarantee that messages are received in the order they were sent, and timestamps can go a long way to help with eventual consistency

While transport protocols like TCP (and WebSockets by extension) guarantee message order, this doesn't necessarily ensure order at the application level. For example, if you rapidly turn a lamp on and off, the Thing's implementation might process these actions on different threads. Without proper handling, responses could be returned out of order via the WebSocket connection. To resolve this, a correlationId is essential, allowing the client to reliably match responses to their corresponding requests.
But I think we already agreed to add some sort of correlationId or requestId to responses.

> If a gateway is responding on behalf of a device and communicating with multiple Consumers it needs to be able to differentiate between Consumers without the use of a requestID

In my Client and Server ThingServient implementation, I generate a unique UUID for each WebSocket connection to keep track of multiple consumers. But additionally, I use messageIds (or requestIds) to manage a map of pending requests and responses. This is implemented using a map of messageIds and CompletableDeferred in Kotlin (Promises in JS), allowing efficient tracking of pending requests. The use of messageIds ensures that both the client and server can handle multiple requests concurrently, correlate responses to their originating requests, and gracefully handle timeouts for requests that take too long to complete. Without this messageId/correlationId pair, it would definitely be more difficult to implement.
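
A rough Python analogue of such a pending-request map, using asyncio futures in place of CompletableDeferred, might look like the sketch below. PendingRequests, its methods, and the messageID/correlationID member names are assumptions for illustration, and the transport is replaced by a plain callback.

```python
import asyncio
import uuid


class PendingRequests:
    """Maps message IDs to futures that resolve when the matching reply arrives."""

    def __init__(self, send):
        self._send = send      # callable that transmits a message dict
        self._pending = {}     # messageID -> asyncio.Future

    async def request(self, message: dict, timeout: float) -> dict:
        message_id = str(uuid.uuid4())
        future = asyncio.get_running_loop().create_future()
        self._pending[message_id] = future
        self._send({**message, "messageID": message_id})
        try:
            # Raises asyncio.TimeoutError if no reply arrives in time.
            return await asyncio.wait_for(future, timeout)
        finally:
            # Forget the request whether it succeeded, failed, or timed out.
            self._pending.pop(message_id, None)

    def on_message(self, message: dict) -> None:
        future = self._pending.get(message.get("correlationID"))
        if future is not None and not future.done():
            future.set_result(message)


async def demo():
    outbox = []
    client = PendingRequests(outbox.append)
    slow = asyncio.create_task(
        client.request({"messageType": "readProperty", "name": "on"}, 1.0))
    fast = asyncio.create_task(
        client.request({"messageType": "readProperty", "name": "level"}, 1.0))
    while len(outbox) < 2:     # yield until both requests have been sent
        await asyncio.sleep(0)
    # Answer the second request first to simulate out-of-order replies.
    for sent in reversed(outbox):
        client.on_message({"messageType": "propertyReading", "name": sent["name"],
                           "correlationID": sent["messageID"]})
    return await slow, await fast


slow_reply, fast_reply = asyncio.run(demo())
print(slow_reply["name"], fast_reply["name"])
```

Each caller gets back the reply to its own request regardless of arrival order, and a request that is never answered fails with a timeout instead of waiting forever.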

> A Thing should never assume that a Consumer can understand relationships between interaction affordances, since there is no standardised way to denote this in a Thing Description. It definitely wouldn't make sense to include the requestID from an invokeAction request in a propertyReading response for example.

Totally agree. If an action results in a property being updated, I currently don't see the need to put the requestId of the invokeAction message into a propertyReading message.

> Assuming the Consumer is responsible for generating unique IDs to use as requestIDs, what format should they take? I'm thinking it could be any string but with a recommendation of some version of UUID.

I think we should specify UUIDv4 rather than allowing any ID format.

> If multiple subscribeEvent/subscribeAllEvents or observeProperty/observeAllProperties messages are sent by the same Consumer for the same event/property affordance, which requestID should be included in event/propertyReading messages? The last one received? Or should requestID be omitted from event messages in case of overlapping subscriptions, and omitted from propertyReading messages unless in response to a readProperty message?

I would favor modeling subscribeEvent/observeProperty as idempotent operations, where the most recent request takes precedence. If a subscription already exists for an event or property, it should simply be overwritten by the latest one, without exceptions.
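
On the Thing side, "latest request wins" could be as simple as keying subscriptions by event name. A minimal sketch, in which the Subscriptions class, its methods, and the event name are hypothetical:

```python
class Subscriptions:
    """Tracks at most one subscription per event name for a single Consumer."""

    def __init__(self):
        self._by_event = {}  # event name -> ID of the latest subscribeEvent

    def subscribe(self, event: str, request_id: str) -> None:
        # A repeated subscribeEvent overwrites the earlier entry.
        self._by_event[event] = request_id

    def id_for(self, event: str):
        """Return the ID to echo in event messages, or None if not subscribed."""
        return self._by_event.get(event)


subs = Subscriptions()
subs.subscribe("overheated", "req-1")
subs.subscribe("overheated", "req-2")  # same event again
print(subs.id_for("overheated"))
```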

@benfrancis
Member Author

@RobWin stated in #34 (comment) that he thinks correlationID is a better name than requestID because it better captures the purpose of the identifier in cases where there may be multiple messages sent in response to a single request (e.g. events and property readings for observed properties). I understand the name is inspired by terms used in AMQP (which uses the term "correlation-id") and MQTT (which uses the term "correlation data"). Since I'm not so familiar with these protocols this term initially seemed a bit jargony to me, though I understand it's a fairly widely used term in software.

Do other people think correlationID or requestID is a better name? *

I'm going to file a separate issue regarding messageID.


* In the strawman proposal I used the capitalisation "thingId" but I think I actually prefer "thingID" as some others have been using. Sorry if this causes problems for anyone who already started implementing the strawman proposal!

@benfrancis
Member Author

@RobWin wrote:

> I would favor to model subscribeEvent/observeProperty as idempotent operations, where the most recent request takes precedence. If a subscription already exists for an event or property, it should simply be overwritten by the latest one, without exceptions.

I think this makes the most sense.

Note that we also need to consider the scenario of overlapping subscriptions between subscribeEvent and subscribeAllEvents, or between observeProperty and observeAllProperties.

E.g.

  1. Consumer sends subscribeAllEvents with messageID "1"
  2. Consumer sends subscribeEvent with event name "foo" and messageID "2"
  3. Thing sends event message with event name "foo" and correlationID "2"
  4. Thing sends event message with event name "bar" and correlationID "1"
  5. Consumer sends unsubscribeEvent with event name "foo"
  6. Thing no longer sends event messages with event name "foo" even though the Consumer separately sent a subscribeAllEvents message
  7. Consumer sends unsubscribeEvent with event name "bar"
  8. Thing no longer sends event messages with event name "bar" but may continue to send event messages with the event name "baz"
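
The eight steps above can be replayed against a small model of the Thing's bookkeeping. This is only one possible reading of the scenario: the EventSubscriptions class, its method names, and the exclusion-set approach are assumptions, and the intended semantics were still under discussion.

```python
class EventSubscriptions:
    """Per-Consumer subscription state, following the steps listed above."""

    def __init__(self):
        self._all_id = None      # messageID of subscribeAllEvents, if any
        self._specific = {}      # event name -> messageID of subscribeEvent
        self._excluded = set()   # events unsubscribed while subscribeAllEvents is active

    def subscribe_all(self, message_id):
        self._all_id = message_id
        self._excluded.clear()

    def subscribe(self, event, message_id):
        self._specific[event] = message_id
        self._excluded.discard(event)

    def unsubscribe(self, event):
        self._specific.pop(event, None)
        if self._all_id is not None:
            self._excluded.add(event)

    def correlation_id(self, event):
        """Return the correlationID for an outgoing event, or None if it is not sent."""
        if event in self._specific:
            return self._specific[event]
        if self._all_id is not None and event not in self._excluded:
            return self._all_id
        return None


subs = EventSubscriptions()
subs.subscribe_all("1")                  # step 1
subs.subscribe("foo", "2")               # step 2
step3 = subs.correlation_id("foo")       # step 3
step4 = subs.correlation_id("bar")       # step 4
subs.unsubscribe("foo")                  # step 5
step6 = subs.correlation_id("foo")       # step 6
subs.unsubscribe("bar")                  # step 7
step8_bar = subs.correlation_id("bar")   # step 8
step8_baz = subs.correlation_id("baz")
print(step3, step4, step6, step8_bar, step8_baz)
```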

Based on the discussion in this and other issues I am starting to agree that correlationID is a better name.

@RobWin
Collaborator

RobWin commented Jan 8, 2025

That's indeed a good question. I haven't implemented subscribeAllEvents and didn't come across this issue yet.
It's tricky to decide what should happen if a consumer sends subscribeAllEvents and then subscribeEvent with event name "foo" and unsubscribeEvent with event name "foo".

@benfrancis
Member Author

Answer: Yes, but correlation IDs are optional.

I've filed a separate issue for the question about overlapping subscriptions: #41
