
impr: propagate VirtualFile metrics via RequestContext #7202

Draft · wants to merge 44 commits into main from problame/virtual-file-metrics-no-hashing
Conversation

@problame (Contributor) commented Mar 21, 2024

Refs

Problem

VirtualFile currently parses the path it is opened with to identify the tenant, shard, and timeline labels to be used for the STORAGE_IO_SIZE metric.

Further, for each read or write call to VirtualFile, it uses with_label_values to retrieve the correct metrics object, which under the hood is a global hashmap guarded by a parking_lot mutex.

We perform tens of thousands of reads and writes per second on every pageserver instance; thus, doing the mutex lock + hashmap lookup is wasteful.
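
For illustration, here is a minimal sketch of the per-call lookup pattern this describes, using the prometheus crate directly with hypothetical metric and label names (not the actual pageserver code):

```rust
use once_cell::sync::Lazy;
use prometheus::{register_int_counter_vec, IntCounterVec};

// Hypothetical stand-in for STORAGE_IO_SIZE: a counter vec keyed by
// tenant/shard/timeline labels.
static STORAGE_IO_SIZE: Lazy<IntCounterVec> = Lazy::new(|| {
    register_int_counter_vec!(
        "storage_io_size_bytes",
        "bytes read/written through VirtualFile",
        &["tenant_id", "shard_id", "timeline_id"]
    )
    .unwrap()
});

fn record_read(tenant: &str, shard: &str, timeline: &str, nbytes: u64) {
    // Each call takes the label-map lock and does a hashmap lookup to find
    // the per-(tenant, shard, timeline) counter -- the cost this PR removes
    // from the read/write hot path.
    STORAGE_IO_SIZE
        .with_label_values(&[tenant, shard, timeline])
        .inc_by(nbytes);
}
```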

Changes

Apply the technique we use for all other timeline-scoped metrics to avoid the repeated with_label_values: add the metric to TimelineMetrics.

Wrap TimelineMetrics into an Arc.

Propagate the Arc<TimelineMetrics> down to VirtualFile, and use Timeline::metrics::storage_io_size.

We avoid contention on the Arc<TimelineMetrics>'s refcount atomics between different connection handlers for the same timeline by adding another layer of Arc indirection.
To avoid frequent allocations, we store that Arc<Arc<TimelineMetrics>> inside the per-connection timeline cache, as sketched below.
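
These items sketch out to roughly the following shape; type and field names here are illustrative, not the actual pageserver definitions:

```rust
use std::sync::Arc;
use prometheus::IntCounter;

pub struct TimelineMetrics {
    // Pre-resolved per-timeline counter: no with_label_values on the hot path.
    pub storage_io_size: IntCounter,
}

pub struct RequestContext {
    // Cloned from the per-connection outer Arc for each request/batch: this
    // bumps only the outer refcount, which is private to the connection, not
    // the refcount of the Arc shared by all handlers of the timeline.
    pub metrics: Option<Arc<Arc<TimelineMetrics>>>,
}

// Per-connection timeline cache entry: the inner Arc is cloned from the
// shared Timeline::metrics exactly once, on cache miss, and wrapped in an
// outer Arc so it can be handed out cheaply afterwards.
pub struct CachedTimelineEntry {
    pub metrics: Arc<Arc<TimelineMetrics>>,
}

pub fn on_cache_miss(shared: &Arc<TimelineMetrics>) -> CachedTimelineEntry {
    // One allocation and one clone of the shared Arc per cache miss,
    // instead of per request.
    CachedTimelineEntry {
        metrics: Arc::new(Arc::clone(shared)),
    }
}

pub fn virtual_file_read(ctx: &RequestContext, nbytes: u64) {
    // VirtualFile no longer parses paths or looks up labels; it just
    // increments the counter carried by the RequestContext.
    if let Some(m) = &ctx.metrics {
        m.storage_io_size.inc_by(nbytes);
    }
}
```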

Preliminary refactorings to enable this change:

Performance

I ran the benchmarks in test_runner/performance/pageserver/pagebench/test_pageserver_max_throughput_getpage_at_latest_lsn.py on an i3en.3xlarge because that's what we currently run them on.

None of the benchmarks shows a meaningful difference in latency or throughput or CPU utilization.

I would have expected some improvement in the many-tenants-one-client-each workload because they all hit that hashmap constantly, and clone the same UintCounter / Arc inside of it.

But apparently that overhead is minuscule compared to the remaining work we do per getpage.

Yet, since the changes are already made, the added complexity is manageable, and the perf overhead of with_label_values is demonstrable in micro-benchmarks, let's have this change anyway.
Also, propagating TimelineMetrics through RequestContext might come in handy down the line.

Micro-benchmark that demonstrates perf impact of with_label_values, along with other pitfalls and mitigation techniques around the metrics/prometheus crate:
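
The linked benchmark isn't reproduced here; a minimal criterion-style sketch of that kind of comparison (hypothetical names, assuming the prometheus and criterion crates) might look like this:

```rust
use criterion::{criterion_group, criterion_main, Criterion};
use prometheus::{register_int_counter_vec, IntCounterVec};

fn bench_counters(c: &mut Criterion) {
    let vec: IntCounterVec = register_int_counter_vec!(
        "bench_io_size_bytes",
        "benchmark counter",
        &["tenant_id", "shard_id", "timeline_id"]
    )
    .unwrap();

    // Label lookup on every increment: lock + hashmap lookup each time.
    c.bench_function("with_label_values per inc", |b| {
        b.iter(|| vec.with_label_values(&["t", "s", "tl"]).inc_by(8192))
    });

    // Resolve the labels once, then increment the cached counter.
    let counter = vec.with_label_values(&["t", "s", "tl"]);
    c.bench_function("pre-resolved counter", |b| b.iter(|| counter.inc_by(8192)));
}

criterion_group!(benches, bench_counters);
criterion_main!(benches);
```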

Alternative Designs

An earlier iteration of this PR stored an Arc<Arc<Timeline>> inside RequestContext.
The problem is that this risks reference cycles if the RequestContext gets stored in an object that is owned directly or indirectly by Timeline.
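
A minimal sketch of the cycle that design risks, with illustrative types only: the Timeline (indirectly) owns an object that stashes a RequestContext, and the RequestContext holds an Arc back to the Timeline, so neither refcount can drop to zero.

```rust
use std::sync::{Arc, Mutex};

struct RequestContext {
    // The rejected design: the context keeps the timeline alive.
    timeline: Option<Arc<Timeline>>,
}

struct BackgroundJob {
    // Some object owned (indirectly) by Timeline that stashes a context.
    ctx: RequestContext,
}

struct Timeline {
    jobs: Mutex<Vec<BackgroundJob>>,
}

fn make_cycle() -> Arc<Timeline> {
    let timeline = Arc::new(Timeline { jobs: Mutex::new(Vec::new()) });
    let ctx = RequestContext { timeline: Some(Arc::clone(&timeline)) };
    // Timeline -> BackgroundJob -> RequestContext -> Arc<Timeline>: a cycle,
    // so this Timeline is never dropped even after all external Arcs go away.
    timeline.jobs.lock().unwrap().push(BackgroundJob { ctx });
    timeline
}
```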

Ideally, we wouldn't be using this mess of Arcs at all and would propagate Rust references instead.
But tokio requires tasks to be 'static, so we wouldn't be able to propagate references across task boundaries, which is incompatible with fan-out type code.
So we'd have to propagate Cow<>s instead.

@problame changed the base branch from jcsp/storcon-secrets-mk2 to main on March 21, 2024, 17:47

github-actions bot commented Mar 21, 2024

7744 tests run: 7366 passed, 0 failed, 378 skipped (full report)


Flaky tests (1)

Postgres 17

Code coverage* (full report)

  • functions: 32.8% (8655 of 26380 functions)
  • lines: 48.6% (73263 of 150671 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
5767947 at 2025-02-27T22:10:58.798Z :recycle:

…t wrong, also detached_child and attached_child don't use ::extend())
@problame force-pushed the problame/virtual-file-metrics-no-hashing branch from c8897e5 to 26da696 on February 24, 2025, 13:55
…n; page_service's shard swapping turns out to be painful / requires `Handle: Clone`, don't want that
… Scope's refs into another Arc that can be propagated cheaply
…ys go the furthest upstack to minimize cloning of Arc<Timeline> in inner loops
For some reason the layer download API never fully got RequestContext-infected.

This PR fixes that as a precursor to
- #6107
@problame changed the title from "WIP: STORAGE_IO_SIZE: avoid with_label_values on each IO" to "impr: propagate STORAGE_IO_SIZE metric via RequestContext" on Feb 25, 2025
@problame changed the title from "impr: propagate STORAGE_IO_SIZE metric via RequestContext" to "impr: propagate VirtualFile metrics via RequestContext" on Feb 25, 2025
@problame (Contributor, Author)

I extracted a piece of this PR into

That one I definitely want to land.

@problame (Contributor, Author)

@arpad-m I was able to remove the blanket Arc<ScopeInner> - we still do the arc_arc for new_timeline(), but no longer for the pagestream case, so we save one allocation per getpage batch

b6cc0c3

…reference cycles

Storing Arc<Timeline> inside RequestContext risks reference cycles if
the RequestContext gets stored in an object that is owned directly or
indirectly by Timeline.

So, we wrap the TimelineMetrics into an Arc and propagate that instead.
This makes it easy for future metrics to be accessed through
RequestContext, like we do for storage_io_size here.

To make the page_service case happy, where we may be dispatching to
a different shard on every successive request and don't want to clone
from the shared Timeline::metrics on each request, we pre-clone as
part of handling the cache miss.
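
A rough sketch of that pre-clone on handle-cache miss, with an illustrative per-shard cache map rather than the actual timeline::handle types:

```rust
use std::collections::HashMap;
use std::sync::Arc;

struct TimelineMetrics; // placeholder

struct CacheEntry {
    // Cloned from the shared Timeline::metrics exactly once, on miss.
    metrics: Arc<TimelineMetrics>,
}

// Per-connection cache: successive pagestream requests may hit different
// shards, but each shard's entry already carries its pre-cloned metrics,
// so the per-request path never touches the shared Timeline::metrics.
struct PerConnectionCache {
    by_shard: HashMap<u8, CacheEntry>, // key type is illustrative
}

impl PerConnectionCache {
    fn get_or_insert(
        &mut self,
        shard: u8,
        shared_metrics: &Arc<TimelineMetrics>,
    ) -> &CacheEntry {
        self.by_shard.entry(shard).or_insert_with(|| CacheEntry {
            metrics: Arc::clone(shared_metrics),
        })
    }
}
```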
…sk of reference cycles): bring back no alloc for pagestream
@problame force-pushed the problame/virtual-file-metrics-no-hashing branch from ebe5493 to ab7271a on February 27, 2025, 20:34
github-merge-queue bot pushed a commit that referenced this pull request Feb 28, 2025
…a special case (#11030)

# Changes

While working on
- #7202

I found myself needing to cache another expensive Arc::clone inside
the timeline::handle::Cache by wrapping it in another Arc.

Before this PR, it seemed like the only expensive thing we were caching
was the connection handler tasks' clone of `Arc<Timeline>`.

In fact, the GateGuard was another such thing, but it was special-cased
in the implementation.

So, this refactoring PR de-special-cases the GateGuard.

# Performance

With this PR we do strictly _fewer_ operations per `Cache::get`.
The reason is that we wrap the entire `Types::Timeline` into one Arc.
Before this PR, there was a separate Arc around the `Arc<Timeline>` and
one around the `Arc<GateGuard>`.

With this PR, we avoid an allocation per cached item, namely,
the separate Arc around the GateGuard.

This PR does not change the amount of shared mutable state.

So, all in all, it should be a net positive, albeit probably not
noticeable with our small non-NUMA instances and generally high CPU
usage per request.
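
A rough before/after sketch of the cached item's shape, derived only from this description (types are placeholders, not the actual handle module definitions):

```rust
use std::sync::Arc;

// Placeholder stand-ins for the real types.
struct Timeline;
struct GateGuard;

// Before: the timeline and the gate guard were cached behind separate Arcs,
// with the GateGuard special-cased next to the timeline.
struct CachedBefore {
    timeline: Arc<Arc<Timeline>>,
    gate_guard: Arc<GateGuard>,
}

// After: the whole `Types::Timeline` value lives behind a single Arc, so a
// cached item costs one allocation instead of two and `Cache::get` touches
// one refcount.
struct TimelineAndGuard {
    timeline: Arc<Timeline>,
    gate_guard: GateGuard,
}

struct CachedAfter {
    cached: Arc<TimelineAndGuard>,
}
```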

# Reviewing

To understand the refactoring logistics, look at the changes to the unit
test types first.
Then read the improved module doc comment.
Then the remaining changes.

In the future, we could rename things to be even more generic.
For example, `Types::TenantMgr` could really be a `Types::Resolver`.
And `Types::Timeline` should, to avoid constant confusion in the doc
comment, be called `Types::Cached` or `Types::Resolved`, because the
`handle` module, after this PR, really doesn't care that we're using it
for storing Arcs and GateGuards.

Then again, specificity is sometimes more useful than being generic,
and writing the module doc comment in a totally generic way would
probably also be more confusing than helpful.
Development

Successfully merging this pull request may close these issues:
  • STORAGE_IO_SIZE metric doesn't use pre-with_label_valued counters
2 participants