
[exporter/prometheusremotewrite] Fix data race in batch series state if called concurrently #36524

Conversation

ArthurSens
Member

Description

Fixes a data race when batching time series during export. The problem was that multiple goroutines could call PushMetrics simultaneously, and each call reused the same state address, which could be written without any locks or atomic operations [1][2][3].

Link to tracking issue

The problem was discovered during the review of #35184.

Testing

The first commit adds a test that calls PushMetrics from several goroutines. The second commit fixes the data race so the test passes.
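
For illustration, a minimal sketch of the shape of such a test (run with `go test -race` to surface the race); `newTestExporter` and `generateTestMetrics` are hypothetical stand-ins for the package's real test helpers:

```go
package prometheusremotewriteexporter

import (
	"context"
	"sync"
	"testing"
)

// Sketch only: newTestExporter and generateTestMetrics are hypothetical
// helpers, not the package's actual test setup.
func TestPushMetricsConcurrency(t *testing.T) {
	exporter := newTestExporter(t) // hypothetical: prwExporter wired to a test server

	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			md := generateTestMetrics(100) // hypothetical: builds pmetric.Metrics
			if err := exporter.PushMetrics(context.Background(), md); err != nil {
				t.Error(err)
			}
		}()
	}
	wg.Wait()
}
```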

@ArthurSens
Member Author

I chose atomic instead of mutexes because locks are usually slower.

Int64 was chosen somewhat arbitrarily; I'm not sure which size to pick 😅
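
For illustration, a minimal sketch of what the atomic state could look like (the field name and methods are illustrative, not the exporter's actual code):

```go
package prometheusremotewriteexporter

import "sync/atomic"

// Sketch only: the field name is illustrative, not necessarily the exporter's.
type batchTimeSeriesState struct {
	// Size hint for the next request buffer, stored as an atomic.Int64 so
	// concurrent PushMetrics calls can read and update it without a race.
	nextRequestBufferSize atomic.Int64
}

func (s *batchTimeSeriesState) store(size int64) { s.nextRequestBufferSize.Store(size) }

func (s *batchTimeSeriesState) load() int64 { return s.nextRequestBufferSize.Load() }
```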

ArthurSens force-pushed the prwexporter-batchSeries-concurrencybug branch from 910ad54 to 54b9015 on November 25, 2024 18:52
ArthurSens changed the title from "Prwexporter batch series concurrencybug" to "[exporter/prometheusremotewrite] Fix data race in batch series state if called concurrently" on Nov 25, 2024
ArthurSens force-pushed the prwexporter-batchSeries-concurrencybug branch from 54b9015 to c79b58c on November 25, 2024 18:58
ArthurSens force-pushed the prwexporter-batchSeries-concurrencybug branch from c79b58c to e41c6a2 on November 25, 2024 19:21
@dashpole
Contributor

I chose atomic instead of mutexes because locks are usually slower.

Are you up for benchmarking that? I think we have benchmarks.

@@ -229,7 +228,7 @@ func (prwe *prwExporter) handleExport(ctx context.Context, tsMap map[string]*pro
 	}

 	// Calls the helper function to convert and batch the TsMap to the desired format
-	requests, err := batchTimeSeries(tsMap, prwe.maxBatchSizeBytes, m, &prwe.batchTimeSeriesState)
+	requests, err := batchTimeSeries(tsMap, prwe.maxBatchSizeBytes, m, prwe.batchTimeSeriesState)
Member

Does this actually fix the problem? If the single *batchTimeSeriesState value is used by multiple goroutines calling prwe.PushMetrics() concurrently, it seems that moving to atomic integer access will avoid the data races that would be detected by the runtime, but it wouldn't necessarily make the changes to that state valid. What happens if there are multiple batches processed concurrently that have significantly different sizes?

Member Author

What happens if there are multiple batches processed concurrently that have significantly different sizes?

Good point; they would still share the same state, and their results would conflict. I think I need to go back to the drawing board and think a bit more about how to solve this.

Member Author

I'm rereading the code, and my understanding is that concurrent requests with very distinct batch sizes would constantly fight for the size of the subsequent request.

Can we even do something useful with the batchStateSize if we allow multiple workers? It sounds like this optimization only works for a single worker scenario 🤔

Member

I'm rereading the code, and my understanding is that concurrent requests with very distinct batch sizes would constantly fight for the size of the subsequent request.

That's how I understand it, as well. I also think it likely that each of the three sizes tracked by this state would be decorrelated, though I'm not sure that's any more problematic.

Can we even do something useful with the batchStateSize if we allow multiple workers? It sounds like this optimization only works for a single worker scenario 🤔

I'm not sure this optimization is safe with multiple workers. Would it make more sense to use a sync.Pool of backing stores that can eventually grow to the needed size and get periodically reaped to avoid one-off large batches causing leaks? Similar to what is done in #35184?
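
For illustration, a rough sketch of the sync.Pool idea (the type, field, and helper here are illustrative, not the implementation that eventually landed):

```go
package prometheusremotewriteexporter

import "sync"

// Sketch only: the struct and field stand in for the exporter's batching state.
type batchTimeSeriesState struct {
	nextRequestBufferSize int
}

// Pool of batch states; each state keeps the sizes it has seen and is reused
// across exports instead of being shared between them.
var batchStatePool = sync.Pool{
	New: func() any {
		return &batchTimeSeriesState{nextRequestBufferSize: 128}
	},
}

// Each export borrows a private state from the pool and returns it when done,
// so concurrent PushMetrics calls never write the same size hints.
func withBatchState(export func(*batchTimeSeriesState) error) error {
	state := batchStatePool.Get().(*batchTimeSeriesState)
	defer batchStatePool.Put(state)
	return export(state)
}
```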

Member Author

Using a sync.Pool sounds worth exploring! We could also remove the state altogether and see how much the benchmarks regress.

I'm trying things out and running benchmarks; I'll open new PRs once I have something to show :)

ArthurSens added a commit to ArthurSens/opentelemetry-collector-contrib that referenced this pull request Nov 26, 2024
Signed-off-by: Arthur Silva Sens <[email protected]>
@ArthurSens
Member Author

I did some benchmarks comparing atomic vs. mutex, but they're probably not relevant anymore.

I chose atomic instead of mutexes because locks are usually slower.

Are you up for benchmarking that? I think we have benchmarks.

Yep, so I'm comparing two branches in this benchmark. The first column is this branch, and the second column uses mutexes.

arthursens$ benchstat base=atomic.txt new=mutex.txt 
goos: darwin
goarch: arm64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusremotewriteexporter
cpu: Apple M2 Pro
                          │     base     │                new                │
                          │    sec/op    │   sec/op     vs base              │
Execute/numSample=100-2     47.47µ ±  3%   47.00µ ± 6%       ~ (p=0.093 n=6)
Execute/numSample=1000-2    169.3µ ± 15%   163.7µ ± 8%       ~ (p=0.310 n=6)
Execute/numSample=10000-2   1.326m ± 10%   1.408m ± 5%       ~ (p=0.240 n=6)
geomean                     220.1µ         221.3µ       +0.54%

                          │     base     │                new                 │
                          │     B/op     │     B/op      vs base              │
Execute/numSample=100-2     6.081Ki ± 0%   6.081Ki ± 0%       ~ (p=1.000 n=6)
Execute/numSample=1000-2    22.12Ki ± 0%   22.12Ki ± 0%       ~ (p=0.675 n=6)
Execute/numSample=10000-2   38.52Ki ± 0%   38.51Ki ± 0%       ~ (p=0.121 n=6)
geomean                     17.30Ki        17.30Ki       -0.01%

                          │    base    │                new                 │
                          │ allocs/op  │ allocs/op   vs base                │
Execute/numSample=100-2     79.00 ± 0%   79.00 ± 0%       ~ (p=1.000 n=6) ¹
Execute/numSample=1000-2    83.00 ± 0%   83.00 ± 0%       ~ (p=1.000 n=6) ¹
Execute/numSample=10000-2   83.00 ± 0%   83.00 ± 0%       ~ (p=1.000 n=6) ¹
geomean                     81.64        81.64       +0.00%
¹ all samples are equal

The main difference between the mutex and atomic approaches is that the mutex blocks the whole batching operation for a few nanoseconds, so we never write to the same memory address simultaneously, while the atomic version uses CAS, ensuring we only overwrite the value if another goroutine hasn't changed it in the meantime.
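
As a rough illustration of the two approaches being compared (the names and the grow-only update policy are illustrative, not the exporter's actual logic):

```go
package prometheusremotewriteexporter

import (
	"sync"
	"sync/atomic"
)

// Mutex variant: the whole read-modify-write of the size hint is serialized,
// so two goroutines can never touch the field at the same time.
type mutexState struct {
	mu                    sync.Mutex
	nextRequestBufferSize int
}

func (s *mutexState) update(observed int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if observed > s.nextRequestBufferSize {
		s.nextRequestBufferSize = observed
	}
}

// Atomic variant: a compare-and-swap loop only overwrites the value if no
// other goroutine has changed it since we read it; otherwise it retries.
type atomicState struct {
	nextRequestBufferSize atomic.Int64
}

func (s *atomicState) update(observed int64) {
	for {
		current := s.nextRequestBufferSize.Load()
		if observed <= current {
			return
		}
		if s.nextRequestBufferSize.CompareAndSwap(current, observed) {
			return
		}
	}
}
```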

The results don't show much difference 🤷, so it's just a matter of preference for how the code looks.

Comparing atomic with main, there's no remarkable difference either.

arthursens$ benchstat base=main.txt new=atomic.txt 
goos: darwin
goarch: arm64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusremotewriteexporter
cpu: Apple M2 Pro
                          │     base     │                new                 │
                          │    sec/op    │    sec/op     vs base              │
Execute/numSample=100-2     47.28µ ±  1%   47.47µ ±  3%       ~ (p=0.485 n=6)
Execute/numSample=1000-2    168.3µ ±  6%   169.3µ ± 15%       ~ (p=0.937 n=6)
Execute/numSample=10000-2   1.447m ± 10%   1.326m ± 10%       ~ (p=0.240 n=6)
geomean                     225.8µ         220.1µ        -2.52%

                          │     base     │                new                 │
                          │     B/op     │     B/op      vs base              │
Execute/numSample=100-2     6.081Ki ± 0%   6.081Ki ± 0%       ~ (p=1.000 n=6)
Execute/numSample=1000-2    22.12Ki ± 0%   22.12Ki ± 0%       ~ (p=0.513 n=6)
Execute/numSample=10000-2   38.50Ki ± 0%   38.52Ki ± 0%       ~ (p=0.394 n=6)
geomean                     17.30Ki        17.30Ki       +0.02%

                          │    base    │                new                 │
                          │ allocs/op  │ allocs/op   vs base                │
Execute/numSample=100-2     79.00 ± 0%   79.00 ± 0%       ~ (p=1.000 n=6) ¹
Execute/numSample=1000-2    83.00 ± 0%   83.00 ± 0%       ~ (p=1.000 n=6) ¹
Execute/numSample=10000-2   83.00 ± 0%   83.00 ± 0%       ~ (p=1.000 n=6) ¹
geomean                     81.64        81.64       +0.00%
¹ all samples are equal

@ArthurSens
Member Author

Ok, I've created two alternative PRs to this one.

I couldn't see any relevant differences in the benchmarks, so I'd love to understand if I did anything wrong.

Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions bot added the Stale label on Dec 14, 2024
@ArthurSens
Member Author

Superseded by #36601

ArthurSens closed this on Dec 16, 2024
ArthurSens deleted the prwexporter-batchSeries-concurrencybug branch on December 16, 2024 20:19
bogdandrutu pushed a commit that referenced this pull request Dec 26, 2024
… batch state (#36601)

#### Description
This is an alternative for
#36524
and #36600

This PR does a couple of things:

* Add a test written by @edma2 that shows a data race on the batch state
when running multiple consumers.
* Add a benchmark for PushMetrics, with options to run with a stable
number of metrics or varying metrics.
* Fix the data race by introducing a `sync.Pool` of batch states.

#### Benchmark results

results comparing `main`,
#36600
and this PR:

```console
arthursens$ benchstat main.txt withoutState.txt syncpool.txt 
goos: darwin
goarch: arm64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusremotewriteexporter
cpu: Apple M2 Pro
                            │  main.txt   │          withoutState.txt           │         syncpool.txt         │
                            │   sec/op    │    sec/op     vs base               │   sec/op     vs base         │
PushMetricsVaryingMetrics-2   8.066m ± 5%   13.821m ± 9%  +71.36% (p=0.002 n=6)   8.316m ± 6%  ~ (p=0.065 n=6)

                            │   main.txt   │           withoutState.txt            │            syncpool.txt            │
                            │     B/op     │     B/op       vs base                │     B/op      vs base              │
PushMetricsVaryingMetrics-2   5.216Mi ± 0%   34.436Mi ± 0%  +560.17% (p=0.002 n=6)   5.548Mi ± 0%  +6.36% (p=0.002 n=6)

                            │  main.txt   │       withoutState.txt       │         syncpool.txt         │
                            │  allocs/op  │  allocs/op   vs base         │  allocs/op   vs base         │
PushMetricsVaryingMetrics-2   56.02k ± 0%   56.05k ± 0%  ~ (p=0.721 n=6)   56.04k ± 0%  ~ (p=0.665 n=6)
```

---------

Signed-off-by: Arthur Silva Sens <[email protected]>