impr: propagate VirtualFile metrics via RequestContext #7202
Draft
problame wants to merge 44 commits into main from problame/virtual-file-metrics-no-hashing
Conversation
7744 tests run: 7366 passed, 0 failed, 378 skipped (full report)

Code coverage* (full report)

* collected from Rust tests only

The comment gets automatically updated with the latest test results.

5767947 at 2025-02-27T22:10:58.798Z :recycle:
Branch update history:

- …t wrong, also detached_child and attached_child don't use ::extend())
- c8897e5 to 26da696 (force-push)
- …n; page_service's shard swapping turns out to be painful / requires `Handle: Clone`, don't want that
- fb63bd1 to b688d1e (force-push)
- … Scope's refs into another Arc that can be propagated cheaply
- …ys go the furthest upstack to minimize cloning of Arc<Timeline> in inner loops
For some reason the layer download API never fully got RequestContext-infected. This PR fixes that as a precursor to #6107.
arpad-m reviewed on Feb 26, 2025
I extracted a piece of this PR into a separate PR. That one I definitely want to land.
- …Open stuff into separate struct
- …reference cycles: Storing Arc<Timeline> inside RequestContext risks reference cycles if the RequestContext gets stored in an object that is owned directly or indirectly by Timeline. So, we wrap the TimelineMetrics into an Arc and propagate that instead. This makes it easy for future metrics to be accessed through RequestContext, like we do for storage_io_size here. To make the page_service case happy, where we may be dispatching to a different shard on every successive request and don't want to be cloning from the shared Timeline::metrics on each request, we pre-clone as part of the handle cache miss.
- …sk of reference cycles): bring back no alloc for pagestream
- ebe5493 to ab7271a (force-push)
github-merge-queue bot pushed a commit that referenced this pull request on Feb 28, 2025:
…a special case (#11030)

# Changes

While working on #7202 I found myself needing to cache another expensive Arc::clone inside the timeline::handle::Cache by wrapping it in another Arc.

Before this PR, it seemed like the only expensive thing we were caching was the connection handler tasks' clone of `Arc<Timeline>`. But in fact the GateGuard was another such thing, and it was special-cased in the implementation.

So, this refactoring PR de-special-cases the GateGuard.

# Performance

With this PR we are doing strictly _fewer_ operations per `Cache::get`. The reason is that we wrap the entire `Types::Timeline` into one Arc. Before this PR, there was a separate Arc around the `Arc<Timeline>` and one around the `Arc<GateGuard>`. With this PR, we avoid an allocation per cached item, namely, the separate Arc around the GateGuard.

This PR does not change the amount of shared mutable state. So, all in all, it should be a net positive, albeit probably not noticeable with our small non-NUMA instances and generally high CPU usage per request.

# Reviewing

To understand the refactoring logistics, look at the changes to the unit test types first. Then read the improved module doc comment. Then the remaining changes.

In the future, we could rename things to be even more generic. For example, `Types::TenantMgr` could really be a `Types::Resolver`. And `Types::Timeline` should, to avoid constant confusion in the doc comment, be called `Types::Cached` or `Types::Resolved`, because the `handle` module, after this PR, really doesn't care that we're using it for storing Arcs and GateGuards. Then again, specificity is sometimes more useful than being generic, and writing the module doc comment in a totally generic way would probably also be more confusing than helpful.
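For illustration only, a before/after sketch of that de-special-casing with stand-in types (`CacheEntryBefore`, `ResolvedTimeline`, `CacheEntryAfter` are made-up names, not the actual `timeline::handle` types):

```rust
use std::sync::Arc;

struct Timeline;  // stand-in for the pageserver Timeline
struct GateGuard; // stand-in for the timeline's gate guard

// Before: the GateGuard was special-cased, so a cached entry carried two
// separately allocated Arcs and a cache miss did two allocations.
#[allow(dead_code)]
struct CacheEntryBefore {
    timeline: Arc<Arc<Timeline>>,
    gate_guard: Arc<GateGuard>,
}

// After: the whole resolved value sits behind a single Arc; a cache hit clones
// one refcount and a cache miss performs one allocation.
struct ResolvedTimeline {
    timeline: Arc<Timeline>,
    gate_guard: GateGuard,
}

struct CacheEntryAfter {
    inner: Arc<ResolvedTimeline>,
}

fn main() {
    // A cache miss under the "after" layout: one allocation for the whole entry.
    let entry = CacheEntryAfter {
        inner: Arc::new(ResolvedTimeline {
            timeline: Arc::new(Timeline),
            gate_guard: GateGuard,
        }),
    };
    // A cache hit: clone a single refcount.
    let _per_request = Arc::clone(&entry.inner);
}
```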
Refs

- with_label_value'd counters #6107

Problem
`VirtualFile` currently parses the path it is opened with to identify the `tenant`, `shard`, and `timeline` labels to be used for the `STORAGE_IO_SIZE` metric.

Further, for each read or write call to `VirtualFile`, it uses `with_label_values` to retrieve the correct metrics object, which under the hood is a global hashmap guarded by a parking_lot mutex.

We perform tens of thousands of reads and writes per second on every pageserver instance; thus, doing the mutex lock + hashmap lookup is wasteful.
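To illustrate the cost being described (not the pageserver code itself): each `with_label_values` call hashes the label values and takes the metric vec's internal lock to find the child counter, whereas resolving the child once and reusing it is just an atomic add. A minimal sketch using the `prometheus` crate directly, with made-up metric and label names:

```rust
use prometheus::register_int_counter_vec;

fn main() {
    // Hypothetical stand-in for STORAGE_IO_SIZE; name and labels are illustrative.
    let storage_io_size = register_int_counter_vec!(
        "storage_io_size_sketch",
        "bytes of VirtualFile I/O, by timeline",
        &["tenant_id", "shard_id", "timeline_id"]
    )
    .unwrap();

    // What happens today, conceptually: every read/write resolves the child
    // counter again -> hash the labels + lock the vec's internal map each time.
    for _ in 0..10_000 {
        storage_io_size
            .with_label_values(&["tenant-a", "0001", "timeline-b"])
            .inc_by(8192);
    }

    // The cheaper pattern: resolve the child counter once, then each I/O is a
    // plain atomic add on the cached counter.
    let counter = storage_io_size.with_label_values(&["tenant-a", "0001", "timeline-b"]);
    for _ in 0..10_000 {
        counter.inc_by(8192);
    }
}
```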
Changes
Apply the technique we use for all other timeline-scoped metrics to avoid the repeated `with_label_values`:

- Add it to `TimelineMetrics`.
- Wrap `TimelineMetrics` into an `Arc`.
- Propagate the `Arc<TimelineMetrics>` down to `VirtualFile`, and use `Timeline::metrics::storage_io_size`.

We avoid contention on the `Arc<TimelineMetrics>`'s refcount atomics between different connection handlers for the same timeline. The technique is to add indirection, using another Arc wrapper. To avoid frequent allocations, we store that `Arc<Arc<TimelineMetrics>>` inside the per-connection timeline cache.
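A minimal sketch of the two layers of `Arc`, with hypothetical type and field names (`CachedTimelineMetrics`, `for_request`, and a bare atomic for `storage_io_size` are illustrative; the real code lives in `RequestContext`, `TimelineMetrics`, and the timeline handle cache):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Stand-in for the pageserver's TimelineMetrics.
struct TimelineMetrics {
    storage_io_size: AtomicU64,
}

/// Per-connection cache entry. The outer Arc is allocated once, on handle
/// cache miss, by pre-cloning from the shared Timeline::metrics. Per-request
/// clones afterwards only touch the outer Arc's refcount, which no other
/// connection shares, so there is no cross-connection contention on the
/// Arc<TimelineMetrics> refcount atomics.
struct CachedTimelineMetrics {
    inner: Arc<Arc<TimelineMetrics>>,
}

impl CachedTimelineMetrics {
    /// Pre-clone from the shared Timeline::metrics as part of the cache miss.
    fn on_cache_miss(shared: &Arc<TimelineMetrics>) -> Self {
        Self { inner: Arc::new(Arc::clone(shared)) }
    }

    /// Cheap per-request handle: clones only the connection-local outer Arc.
    fn for_request(&self) -> Arc<Arc<TimelineMetrics>> {
        Arc::clone(&self.inner)
    }
}

/// Stand-in for RequestContext, carrying the metrics down to VirtualFile.
struct RequestContext {
    metrics: Arc<Arc<TimelineMetrics>>,
}

/// VirtualFile-level code accounts I/O against the metrics carried in the
/// RequestContext instead of doing a with_label_values lookup per call.
fn virtual_file_write(ctx: &RequestContext, nbytes: u64) {
    ctx.metrics.storage_io_size.fetch_add(nbytes, Ordering::Relaxed);
}

fn main() {
    let shared = Arc::new(TimelineMetrics { storage_io_size: AtomicU64::new(0) });
    let cached = CachedTimelineMetrics::on_cache_miss(&shared);
    let ctx = RequestContext { metrics: cached.for_request() };
    virtual_file_write(&ctx, 8192);
    assert_eq!(shared.storage_io_size.load(Ordering::Relaxed), 8192);
}
```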
Preliminary refactorings to enable this change:
Performance
I ran the benchmarks in `test_runner/performance/pageserver/pagebench/test_pageserver_max_throughput_getpage_at_latest_lsn.py` on an i3en.3xlarge because that's what we currently run them on.

None of the benchmarks shows a meaningful difference in latency, throughput, or CPU utilization.

I would have expected some improvement in the many-tenants-one-client-each workload, because those workloads all hit that hashmap constantly and clone the same UintCounter/Arc inside of it. But apparently the overhead is minuscule compared to the remaining work we do per getpage.

Yet, since the changes are already made, the added complexity is manageable, and the perf overhead of `with_label_values` is demonstrable in micro-benchmarks, let's have this change anyway. Also, propagating TimelineMetrics through RequestContext might come in handy down the line.
Micro-benchmark that demonstrates the perf impact of `with_label_values`, along with other pitfalls and mitigation techniques around the `metrics`/`prometheus` crate:

- `metrics`/`prometheus` crate multicore scalability pitfalls & workarounds #11019

Alternative Designs
An earlier iteration of this PR stored an `Arc<Arc<Timeline>>` inside `RequestContext`. The problem is that this risks reference cycles if the RequestContext gets stored in an object that is owned directly or indirectly by `Timeline`.

Ideally, we wouldn't be using this mess of Arcs at all and would propagate Rust references instead. But tokio requires tasks to be `'static`, and so we wouldn't be able to propagate references across task boundaries, which would be incompatible with fan-out type code. So, we'd have to be propagating `Cow<>`s instead.
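For concreteness, a hypothetical sketch of that cycle (field names are made up; this is not pageserver code): if something the Timeline owns ends up storing a RequestContext that holds `Arc<Timeline>`, the strong count can never drop to zero and both leak.

```rust
use std::sync::{Arc, Mutex};

/// Earlier-design stand-in: the context holds the timeline it targets.
struct RequestContext {
    timeline: Arc<Timeline>,
}

struct Timeline {
    /// Hypothetical example of state owned by the timeline that (directly or
    /// indirectly) ends up holding on to a RequestContext.
    pending_work: Mutex<Vec<RequestContext>>,
}

fn main() {
    let timeline = Arc::new(Timeline {
        pending_work: Mutex::new(Vec::new()),
    });

    // The context holds the timeline...
    let ctx = RequestContext { timeline: Arc::clone(&timeline) };

    // ...and the timeline now (transitively) holds the context: a cycle.
    timeline.pending_work.lock().unwrap().push(ctx);

    // Dropping our handle does not free the Timeline: the Arc stored inside
    // pending_work keeps the strong count above zero forever.
    drop(timeline);
}
```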