feat(ui): Implement Timeline lazy-loading #4594
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files:

@@           Coverage Diff           @@
##             main    #4594   +/-   ##
=======================================
  Coverage   85.73%   85.73%
=======================================
  Files         291      292     +1
  Lines       33288    33348    +60
=======================================
+ Hits        28538    28590    +52
- Misses       4750     4758     +8

☔ View full report in Codecov by Sentry.
Force-pushed from 6bb6e8f to 331113c.
…timeline_item`. This patch drastically improves the performance of `TimelineEventHandler::deduplicate_local_timeline_item`.

Before this patch, `rfind_event_item` was used to iterate over all timeline items: for each item in reverse order, if it was an event timeline item, and if it was a local event timeline item, and if it matched the event ID or transaction ID, then a duplicate was found. The problem is the following: all items are traversed. However, local event timeline items are always at the back of the items. Even virtual timeline items come before local event timeline items. Thus, it is not necessary to traverse all items: the iteration can stop as soon as (i) a non-event timeline item is met, or (ii) a non-local event timeline item is met.

This patch updates `TimelineEventHandler::deduplicate_local_timeline_item` to replace the use of `rfind_event_item` with a custom iterator that stops as soon as a non-event timeline item or a non-local event timeline item is met, or, of course, when a local event timeline item is a duplicate. To do so, [`Iterator::try_fold`] is probably the best companion. [`Iterator::try_find`] would have been nice, but it is only available on nightlies, not on stable versions of Rust. However, many methods in `Iterator` use `try_fold`, like `find` or any other method that needs to “short-circuit”. Anyway, `try_fold` works pretty nicely here, and does exactly what we need. Our use of `try_fold` returns a `ControlFlow<Option<(usize, TimelineItem)>, ()>`. After `try_fold`, we call `ControlFlow::break_value`, which returns an `Option`; hence the need to call `Option::flatten` at the end, to get a single `Option` instead of an `Option<Option<(usize, TimelineItem)>>`.

I'm testing this patch with the `test_lazy_back_pagination` test in matrix-org#4594. With 10_000 events in the sync, the test was taking 13s to run on my machine. With this patch, it takes 10s to run: a 23% improvement. This `deduplicate_local_timeline_item` method was taking a large part of the computation according to the profiler. With this patch, the method is barely visible in the profiler, it has become so small.

[`Iterator::try_fold`]: https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.try_fold
[`Iterator::try_find`]: https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.try_find
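The short-circuiting scan can be sketched on a toy item model. The `Item` enum, numeric transaction IDs, and the `find_duplicate_local` helper below are illustrative stand-ins, not the SDK's real types, but the `try_fold`/`ControlFlow::break_value`/`Option::flatten` plumbing has the same shape:

```rust
use std::ops::ControlFlow;

// Toy model of timeline items: Local items sit at the back,
// Remote and Virtual items always come before them.
#[derive(Debug, Clone, PartialEq)]
enum Item {
    Virtual,
    Remote(u64),
    Local(u64), // identified here by a numeric transaction ID
}

// Scan from the back; stop at the first non-local item, since no local
// item can exist before it. `try_fold` does the short-circuiting.
fn find_duplicate_local(items: &[Item], txn_id: u64) -> Option<(usize, Item)> {
    items
        .iter()
        .enumerate()
        .rev()
        .try_fold((), |(), (index, item)| match item {
            // Duplicate found: break with its position and a clone.
            Item::Local(id) if *id == txn_id => {
                ControlFlow::Break(Some((index, item.clone())))
            }
            // Another local item: keep scanning backwards.
            Item::Local(_) => ControlFlow::Continue(()),
            // First non-local item: no duplicate can exist further, stop.
            _ => ControlFlow::Break(None),
        })
        // `break_value` yields `Option<Option<…>>`; flatten to one `Option`.
        .break_value()
        .flatten()
}

fn main() {
    let items = vec![Item::Virtual, Item::Remote(1), Item::Local(7), Item::Local(9)];
    assert_eq!(find_duplicate_local(&items, 9), Some((3, Item::Local(9))));
    assert_eq!(find_duplicate_local(&items, 7), Some((2, Item::Local(7))));
    // Transaction 1 only exists as a Remote item; the scan stops before it.
    assert_eq!(find_duplicate_local(&items, 1), None);
}
```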
Force-pushed from a228029 to d8a7b2a.
This patch updates `TimelineController::subscribe` to use `VectorSubscriber::into_values_and_batched_stream`. It returns the cloned items along with the stream, which saves the need to call `ObservableItems::clone_items`, and thus the clone of all items. tl;dr: `clone_items()` clones… items… and `subscribe()` also clones the items. That's 2 clones. With `into_values_and_batched_stream()`, there is 1 clone.
Force-pushed from c33b8f4 to 95bf243.
…line_item_index_after`. This patch fixes the performance of `AllRemoteEvents::increment_all_timeline_item_index_after` and `decrement_all_timeline_item_index_after`. The code was previously iterating over all items, which is a waste of time. This patch updates the code to iterate over all items in reverse order:

- if `new_timeline_item_index` is 0, all items must be shifted anyway, so all items must be traversed and the iterator direction doesn't matter,
- otherwise, it's unlikely we want to traverse all items: the item has been either inserted or pushed back, so there is no need to traverse the first items; we can also break the iteration as soon as all timeline item indices after `new_timeline_item_index` have been updated.

I'm testing this patch with the `test_lazy_back_pagination` test in matrix-org#4594. With 10_000 events in the sync, the `ObservableItems::push_back` method (which uses `AllRemoteEvents::increment_all_timeline_item_index_after`) was taking 7% of the whole execution time. With this patch, it takes 0.7%.
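The reverse iteration with an early break can be sketched over a toy list of optional timeline item indices. The function name mirrors the patch, but the exact data layout and the `>=` shifting rule are assumptions of this example:

```rust
// Toy model: each remote event optionally points at a timeline item index,
// stored in ascending order of timeline position. Shift every index at or
// after `new_index`, scanning from the back and breaking early.
fn increment_all_timeline_item_index_after(indices: &mut [Option<usize>], new_index: usize) {
    for index in indices.iter_mut().rev() {
        if let Some(i) = index {
            if *i >= new_index {
                *i += 1;
            } else {
                // Indices are ascending, so nothing earlier needs shifting.
                break;
            }
        }
    }
}

fn main() {
    let mut indices = vec![Some(0), None, Some(2), Some(5), Some(6)];
    // Inserting a timeline item at position 5 shifts indices 5 and 6 only;
    // the scan stops as soon as it sees index 2.
    increment_all_timeline_item_index_after(&mut indices, 5);
    assert_eq!(indices, vec![Some(0), None, Some(2), Some(6), Some(7)]);
}
```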
Force-pushed from 95bf243 to 405e834.
This patch improves the performance of `ReadReceiptTimelineUpdate::apply`, which does 2 things: it calls `remove_old_receipt` and `add_new_receipt`. Both of them need a timeline item position. Until this patch, `rfind_event_by_id` was used and was the bottleneck. The improvement is as follows.

First off, when collecting data to create `ReadReceiptTimelineUpdate`, the timeline item position can be known ahead of time by using `EventMeta::timeline_item_index`. This data is not always available, for example if the timeline item isn't created yet, but let's collect it when it exists.

Second, inside `ReadReceiptTimelineUpdate::remove_old_receipt`, we use the timeline item position collected from `EventMeta` if it exists. Otherwise, we fall back to a similar `rfind_event_by_id` pattern, without using intermediate types. It's more straightforward here: we don't need an `EventTimelineItemWithId`, we only need the position. Once the position is known, it is stored in `Self` (!); this is the biggest improvement here. Let's see why.

Finally, inside `ReadReceiptTimelineUpdate::add_new_receipt`, we use the timeline item position collected from `EventMeta` if it exists, similarly to what `remove_old_receipt` does. Otherwise, we fall back to an iterator to find the position. However, instead of iterating over **all** items, we can skip the first ones, up to the position of the timeline item holding the old receipt, i.e. up to the position found by `remove_old_receipt`.

I'm testing this patch with the `test_lazy_back_pagination` test in matrix-org#4594. With 10_000 events in the sync, the `ReadReceipts::maybe_update_read_receipt` method was taking 52% of the whole execution time. With this patch, it takes 8.1%.
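The position-caching pattern can be sketched on a toy type. The `ReceiptUpdate` struct and string event IDs below are hypothetical stand-ins; only the overall pattern mirrors the patch: the removal step remembers the position it found in `self`, and the addition step starts its scan there instead of at index 0:

```rust
// Hypothetical toy model of the pattern described in the patch.
#[derive(Default)]
struct ReceiptUpdate {
    // Position of the item holding the old receipt, once known.
    old_item_pos: Option<usize>,
}

impl ReceiptUpdate {
    fn remove_old_receipt(&mut self, items: &[&str], old_event_id: &str) {
        // Fall back to a reverse scan only if the position wasn't
        // collected ahead of time (the `EventMeta` case in the patch).
        if self.old_item_pos.is_none() {
            self.old_item_pos = items.iter().rposition(|id| *id == old_event_id);
        }
        // … actual receipt removal would happen here …
    }

    fn add_new_receipt(&self, items: &[&str], new_event_id: &str) -> Option<usize> {
        // The new receipt is at or after the old one: skip the first items.
        let skip = self.old_item_pos.unwrap_or(0);
        items
            .iter()
            .enumerate()
            .skip(skip)
            .find_map(|(i, id)| (*id == new_event_id).then_some(i))
    }
}

fn main() {
    let items = ["e1", "e2", "e3", "e4"];
    let mut update = ReceiptUpdate::default();
    update.remove_old_receipt(&items, "e2");
    assert_eq!(update.old_item_pos, Some(1));
    // The scan for the new receipt starts at index 1, not at 0.
    assert_eq!(update.add_new_receipt(&items, "e4"), Some(3));
}
```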
Force-pushed from 405e834 to 2d056c2.
…on`. This patch improves the performance of `RoomEvents::maybe_apply_new_redaction`. This method used to deserialise every event it received, entirely. If the event is not an `m.room.redaction`, the method returns early, so most of the time the event was deserialised for nothing, because most events aren't of kind `m.room.redaction`! This patch first uses `Raw::get_field("type")` to detect the type of the event. If it's an `m.room.redaction`, the event is entirely deserialised; otherwise the method returns. When running `test_lazy_back_pagination` from matrix-org#4594 with 10'000 events, prior to this patch this method took 11% of the execution time. With this patch, it takes 2.5%.
LGTM, and I confirm that it's working properly on Element X iOS.
if self.controller.is_live().await {
    match self.controller.live_lazy_paginate_backwards(num_events).await {
        Some(needed_num_events) => {
Would it maybe be better to have `live_lazy_paginate_backwards` return the right type?
What do you mean by the right type?
Cargo.toml (outdated)
eyeball = { git = "https://github.com/Hywan/eyeball", branch = "feat-im-util-skip", features = ["tracing"] }
eyeball-im = { git = "https://github.com/Hywan/eyeball", branch = "feat-im-util-skip", features = ["tracing"] }
eyeball-im-util = { git = "https://github.com/Hywan/eyeball", branch = "feat-im-util-skip" }
We should probably work on these patches from a matrix-org organisation fork as opposed to your personal one.
Yes! My patch on eyeball has been merged, and I'm about to make a release of `eyeball-im-util`, which would remove the need for this change in our `Cargo.toml` file.
    count: SharedObservable<usize>,
}

impl SkipCount {
The comments on this file are just 👌 !!
❤️
Cool, thanks for the extensive tests around the skip count. Thanks for implementing this, it will be a sweet improvement over what we have right now and will make the storage more usable in the real world 🫶
I think it might be interesting to write a test showing the known limitations, esp. with respect to gaps introduced after the skip count, as we discussed privately. (For instance, a situation where the room cache contains [e1, e2, gap, e3], the timeline shows [e2, e3] because of a skip count=1, but it will have to get back to e1 before filling the gap after e2).
This patch moves the `TimelineStream` type into the new `subscriber` module. The idea is to add more code related to subscribers in the next patches.
This patch renames `TimelineStream` to `TimelineWithDropHandle`, as the former name was too vague and did not clarify what the type was doing. The new name makes it clear that it attaches a `TimelineDropHandle` to a subscriber (since it is part of the `subscriber` module!).
…`Skip`. This patch adds functions like `compute_next`, `compute_next_when_paginating_backwards` and `compute_next_when_paginating_forwards`, which are necessary to correctly compute the `count` value for the `Skip` higher-order stream. This patch adds the associated test suite.
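As a minimal sketch of what one of these computations could look like: the function name comes from the patch, but the body below is an assumption of this example, paginating backwards reveals hidden items by lowering the skip count, never below zero:

```rust
// Assumed behaviour: revealing `page_size` items during a backwards
// pagination means skipping `page_size` fewer items at the front.
fn compute_next_when_paginating_backwards(count: usize, page_size: usize) -> usize {
    // Saturating: once everything is revealed, the count stays at 0.
    count.saturating_sub(page_size)
}

fn main() {
    assert_eq!(compute_next_when_paginating_backwards(10, 3), 7);
    // Requesting more than what is hidden reveals everything.
    assert_eq!(compute_next_when_paginating_backwards(2, 5), 0);
}
```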
This patch adds the `subscriber_skip_count` field to `TimelineMetadata`. It's going to be used to define the `count` value of the `Skip` higher-order stream that is going to be applied to the `Timeline` subscriber.
This patch applies the `Skip` higher-order stream on the `Timeline` subscriber. The method `TimelineController::subscribe_batched` is renamed `subscribe_batched_and_limited` at the same time. The `Skip` stream uses the `TimelineMetadata::subscriber_skip_count` observable value as the `count_stream`. The initial count value is 0. No test needs to be changed so far.
This patch updates `TimelineStateTransaction::commit` to adjust the count value of the `Skip` higher-order stream. This patch also creates `TimelineStateTransaction::new` to simplify the creation of a state transaction.
This patch updates the pagination mechanism of the `Timeline` to support lazy backwards pagination. `Timeline::paginate_backwards` already does different things depending on whether the timeline focus is live or focused. When it's live, the method will first try to paginate backwards lazily, by adjusting the `count` value of the `Skip` stream used by the `Timeline::subscribe` method. If there are not enough items to provide, the greedy pagination will run.
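One plausible shape for this lazy-first fallback, as a toy function: the name `live_lazy_paginate_backwards` appears in the patch, but this body is an assumption. It reveals as many hidden items as the skip count allows, and returns `Some(needed_num_events)` when the greedy pagination still has work to do:

```rust
// Toy sketch: lower the skip count first; report how many events must
// still be fetched greedily (from the event cache or the network).
fn live_lazy_paginate_backwards(skip_count: &mut usize, requested: usize) -> Option<usize> {
    let revealed = (*skip_count).min(requested);
    *skip_count -= revealed;
    let missing = requested - revealed;
    // `None` means the lazy pagination fully satisfied the request.
    (missing > 0).then_some(missing)
}

fn main() {
    let mut skip_count = 3;
    // Both requested events can be revealed by lowering the skip count.
    assert_eq!(live_lazy_paginate_backwards(&mut skip_count, 2), None);
    assert_eq!(skip_count, 1);
    // Only 1 hidden item is left; 4 more events must come from elsewhere.
    assert_eq!(live_lazy_paginate_backwards(&mut skip_count, 5), Some(4));
    assert_eq!(skip_count, 0);
}
```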
This patch renames the (test only) `Timeline::subscribe` method to `Timeline::subscribe_raw`.
This patch gathers the logic of the `Timeline::subscribe` into a single type: `TimelineSubscriber`. The `TimelineController::subscribe_batched_and_limited` method is renamed `subscribe` to match `Timeline::subscribe`. Things are simpler to apprehend. The `TimelineSubscriber` type configures the subscriber/stream in a single place. It takes an `&ObservableItems` and a `&SkipCount`, and configures everything. It also provides a single place to document the behaviour of the subscriber, with the `Skip` higher-order stream.
This patch fixes an issue where 0 was used as the initial value for the `Skip` higher-order stream in the `TimelineSubscriber`. This is wrong, as the `SkipCount` value may have been modified before the `TimelineSubscriber` is created. This patch provides a test to reproduce the problem.
Force-pushed from 760e75c to 69bcdce.
This problem exists with or without the
Changed the title: `Timeline` lazy pagination → `Timeline` lazy-loading
A red fox, relaxing under the snow (credits roeselienraimond.com)
These patches implement lazy pagination for a live `Timeline`. The idea is that `Timeline::subscribe()` keeps returning a `Stream`, but the `Skip` higher-order stream is applied to skip the first items, exactly the `count` first items:

- When new items land in the `Timeline`, they will appear in the subscriber's stream (if they are not part of the skipped items, see `Skip`'s documentation to learn more).
- When the `Timeline` paginates backwards, the `Skip`'s `count` value decreases, revealing more items if there are any. When the `count` reaches 0, it falls back to the event cache pagination.

For the moment, the event cache reads all events from its storage and sends them to the `Timeline`. A follow-up of this pull request is to prevent the event cache from reading all events, and instead read events chunk by chunk. Thus, the `Timeline` does a lazy pagination: when `count` reaches zero, it falls back to the event cache, which will load another chunk if possible; otherwise it will fall back to the network. See the documentation in this pull request to learn more. This is a general overview of the mechanism.

First off, the first patches do a bit of clean-up (removing clones, moving and renaming `TimelineStream` into a new `subscriber` module).

After that, patches create a `SkipCount` type, and use it inside the `Timeline::subscribe()` flow.

Finally, the last patches add integration tests. Unit tests are already written.

Related: `EventCache` storage #3280