Consume: prevent loss of data in write buffers / pending batches #16
The neo Consume request has the facility for the client to initiate a clean shutdown, ensuring that all records that have already been popped from the channel are flushed to the client before the request ends.

There are, however, other situations where records can be lost: for example, records still sitting in write buffers or in pending batches when the node shuts down. We should consider ways to fix these cases.

(Created here, not in dmqnode, as the fix will likely require both changes to the storage engine and accompanying protocol changes.)

Comments
One idea would be to modify the process of popping a record from the storage engine. Rather than a destructive pop, it could be a two-step process in which the record is only removed once it has been safely delivered. That way, the ordering of records in the channel would not be altered.
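A minimal sketch of how such a non-destructive, two-step pop might look, assuming a `peek` step that leaves the record in place and a `confirm` step that removes it only once it has been safely handed off; the names `Channel`, `peek`, and `confirm` are illustrative, not the actual dmqnode storage engine API:

```python
from collections import deque
from typing import Optional


class Channel:
    """In-memory channel with a two-step (non-destructive) pop."""

    def __init__(self) -> None:
        self._records: deque = deque()

    def push(self, record: bytes) -> None:
        self._records.append(record)

    def peek(self) -> Optional[bytes]:
        """Step 1: return the oldest record without removing it."""
        return self._records[0] if self._records else None

    def confirm(self) -> None:
        """Step 2: remove the oldest record, once it is known to be safe
        (e.g. flushed to the Consume client)."""
        self._records.popleft()


# Usage: if sending fails, or the node shuts down before confirm() is
# called, the record is still at the front of the channel, so neither
# data nor ordering is lost.
channel = Channel()
channel.push(b"record-1")
record = channel.peek()
if record is not None:
    # ... send `record` to the client and flush the write buffer here ...
    channel.confirm()
```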
To end Consume requests when a node is shut down, a protocol change is not necessary -- just flush the pending batch and send it to the client before ending the request.
An easier way of accomplishing this could be to use a timeout, rather than waiting for requests to report that they have ended.
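A rough sketch of the timeout variant, assuming the node holds a list of its active requests and that each exposes a hypothetical `wait_until_ended` helper; both names are illustrative:

```python
import time


def shutdown_with_grace_period(active_requests, grace_seconds: float = 5.0) -> None:
    """Give in-flight Consume requests a fixed grace period to finish
    flushing, instead of waiting indefinitely for each one to report
    that it has ended."""
    deadline = time.monotonic() + grace_seconds
    for request in active_requests:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # grace period exhausted; proceed with shutdown anyway
        request.wait_until_ended(timeout=remaining)  # hypothetical helper
```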
This could be done at the storage engine level, with the storage engine creating the batch and writing it to a dump file on disk on shutdown if it hasn't been sent. On the other hand, in many cases (that is, if there are sufficiently many records in the storage) the batch is removed from the storage immediately before it is sent, so that wouldn't help if the shutdown intercepts the send, and there is no feedback about whether sending succeeded.
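A sketch of the dump-file idea, under the assumption that an unsent batch can simply be serialised on shutdown and re-loaded on startup; the directory layout, file naming, and `pickle` serialisation here are illustrative only, not the dmqnode dump-file format:

```python
import os
import pickle


def dump_unsent_batch(batch: list, channel_name: str, dump_dir: str = "data/dumps") -> None:
    """On shutdown: persist a batch that was built but never sent."""
    if not batch:
        return
    os.makedirs(dump_dir, exist_ok=True)
    with open(os.path.join(dump_dir, channel_name + ".pending"), "wb") as f:
        pickle.dump(batch, f)


def load_unsent_batch(channel_name: str, dump_dir: str = "data/dumps") -> list:
    """On startup: recover a previously dumped batch so its records can be
    pushed back to the front of the channel."""
    path = os.path.join(dump_dir, channel_name + ".pending")
    if not os.path.exists(path):
        return []
    with open(path, "rb") as f:
        batch = pickle.load(f)
    os.remove(path)
    return batch
```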
That could be a listener event, raised by the storage engine when it shuts down (Nemanja's idea).
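One possible shape for that listener event, assuming an observer-style notifier owned by the storage engine; `ShutdownNotifier`, `register`, and `flush_pending_batch` are illustrative names, not existing interfaces:

```python
class ShutdownNotifier:
    """Raised-on-shutdown event: the storage engine calls shutting_down(),
    which tells every registered listener (e.g. active Consume requests)
    to flush whatever it has buffered."""

    def __init__(self) -> None:
        self._listeners = []

    def register(self, listener) -> None:
        self._listeners.append(listener)

    def shutting_down(self) -> None:
        for listener in self._listeners:
            listener.flush_pending_batch()


class ConsumeRequest:
    def __init__(self, notifier: ShutdownNotifier) -> None:
        self.pending_batch = []
        notifier.register(self)

    def flush_pending_batch(self) -> None:
        # Send any buffered records to the client before the node stops.
        self.pending_batch.clear()
```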
Ahh yes, I was only thinking about the queue in memory. Having to also handle the disk queue would indeed make this more complicated.