Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Throw on received message if the RoC of an all-nodes request has finished #310

Conversation

david-eckardt-sociomantic
Copy link
Contributor

In several situations, including a node shutdown and client/node protocol mismatch, the client can receive a message for a request-on-conn whose fiber has already terminated. Throw a protocol error to initiate a shutdown to prevent RequestOnConn.setReceivedPayload from attempting to resume the terminated fiber.

Fixes #309.

Copy link

@gavin-norman-sociomantic gavin-norman-sociomantic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice for the commit message to include a high-level summary of what the old and behaviour is, in this scenario.

@@ -252,25 +252,29 @@ public struct RequestOnConnSet

/***************************************************************************

Decrements the `num_active` counter and returns whether it is now 0.
Decrements the `num_active` counter, registers `roc` as inactive and
returns whether it is now 0.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your addition to this sentence has changed the meaning of "it" at this point.

returns whether it is now 0.

Params:
roc = the `RequestOnConn` instance to register as inactive

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "the RequestOnConn instance that has finished" is a bit clearer?

// avoid potentially introducing a bug if there is an unobvious reason
// why it isn't done.
request_on_conn.active = false;
// TODO: Consider `this.all_nodes.remove(roc)` instead of using the

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this would be better as an issue, rather than a code comment. We'll probably never think of it again, if it's left as a comment deep in the code like this.

// using the `active` flag. Not doing this now for a patch release to
// avoid potentially introducing a bug if there is an unobvious reason
// why it isn't done.
request_on_conn.active = false;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does request_on_conn refer to? Looks like it's intended to be an argument, but the function has no arguments.

@@ -439,7 +439,8 @@ public final class RequestSet: IRequestSet
}
body
{
if ( this.getRequestOnConnForNode(connection.remote_address) !is null )
if ( this.getRequestOnConnForNode(connection.remote_address,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh this is a weird case! This should be impossible, right? (If so, this is more like a verify than a protocol error.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing the exception here is merely for the sake of the arguments of getRequestOnConnForNode. I can add a null default for the exception parameter.

If a `RequestOnConn` is registered in the `RequestOnConnSet`, make it
possible to determine if `RequestOnConnSet.finished` has been called with
it.
…shed

Fixes sociomantic-tsunami#309.

When a RoC of a multi-node request finishes, it RoC stays registered in the
`RequestOnConnSet` until all other RoCs have finished, too. During that
time the client can receive a message for this request through this
connection. It then looks the RoC up in the RoCset, finds it and calls
`RequestOnConn.setReceivedPayload`, which attempts to resume the fiber -
but the fiber has terminated so it cannot be resumed, and an assertion
fails.
This is the case of a protocol error, the client has finished handling the
request while the node is still sending messages. Handle it by tracking
which of the RoCs in the RoCset have finished and throwing `ProtocolError`
when attempting to use one.
@david-eckardt-sociomantic
Copy link
Contributor Author

Updated, addressing @gavin-norman-sociomantic's comments. Added issue #311.

@gavin-norman-sociomantic gavin-norman-sociomantic merged commit dcda5cd into sociomantic-tsunami:v4.6.x Mar 15, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants