-
Notifications
You must be signed in to change notification settings - Fork 609
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(logstore): provide more opportunities to read #20546
base: main
Are you sure you want to change the base?
Conversation
@@ -144,6 +150,11 @@ struct FlushedChunkInfo { | |||
} | |||
|
|||
enum WriteFuture<S: LocalStateStore> { | |||
Paused { | |||
duration: Duration, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should store the sleep future instead of the duration. The sleep future is created in the next_event
call and only store in the future returned from next_event
. However, the next_event
future can be easily dropped because of select
, and every time the read future gets ready, the next_event
future will be created again, and the sleep time is reset. Under this circumstance, the write future will easily starved by the read future.
); | ||
// If buffer 90% full, pause the stream for a while, let downstream do some processing | ||
// to avoid flushing. | ||
if buffer.buffer.len() >= self.buffer_size * 9 / 10 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'd better determine whether to pause by the number of rows rather than the item count.
More importantly, we shouldn't always pause the write future when the buffer is almost full, because in this way, the slowness in the reader side will always block the writer and the up-down stream won't be decoupled.
In my rough design, we should only pause when we transition from a clean in-memory state to flushed state. The clean in-memory state means when no pending data is in storage, and all chunks can be retrieved from the buffer without reading from storage, and the second state is all other circumstances.
More specifically, we may pause for a while only when we were previously in clean in-memory state, and in either of the following scenarios:
- when receive a chunk and will write a chunk to storage
- when receive a checkpoint barrier and going to write all unflushed chunks to storage.
When this happens and we may store the item we receive when we pause, and then after the pause sleep, re-apply the item to buffer and storage.
And when we are not in the clean in-memory state, we don't have to pause.
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
At the writer future sleep briefly after:
Checklist
Documentation
Release note