Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perf optimization for protobuf instrumentation #6694

Merged
merged 2 commits into from
Feb 25, 2025
Merged

Conversation

vandonr
Copy link
Contributor

@vandonr vandonr commented Feb 21, 2025

Summary of changes

Stop relying on the proto source file to know if we're dealing with an internal protobuf message.
Instead, we check the type at the very beginning, and disable the instrumentation for that type if we detect it's a google type.
This means we can also get rid of all the dark magic we had around making sure we can access the descriptor, because we now know we'll only do it for regular messages.

Reason for change

it's better perf

Implementation details

I use an other OnMethodXxx method to be able to access the raw type, I explained it in the comments.
See this thread for more.

Test coverage

added a path in the integration tests that checks that

Other details

@vandonr vandonr requested review from a team as code owners February 21, 2025 08:45
@andrewlock
Copy link
Member

andrewlock commented Feb 21, 2025

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program. And are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch test with statistical test for significance of 5%
  • Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6694) - mean (69ms)  : 66, 72
     .   : milestone, 69,
    master - mean (69ms)  : 66, 72
     .   : milestone, 69,

    section CallTarget+Inlining+NGEN
    This PR (6694) - mean (1,002ms)  : 980, 1024
     .   : milestone, 1002,
    master - mean (998ms)  : 973, 1022
     .   : milestone, 998,

Loading
gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6694) - mean (103ms)  : 101, 105
     .   : milestone, 103,
    master - mean (102ms)  : 99, 105
     .   : milestone, 102,

    section CallTarget+Inlining+NGEN
    This PR (6694) - mean (674ms)  : 659, 690
     .   : milestone, 674,
    master - mean (678ms)  : 659, 697
     .   : milestone, 678,

Loading
gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6694) - mean (89ms)  : 87, 91
     .   : milestone, 89,
    master - mean (89ms)  : 87, 90
     .   : milestone, 89,

    section CallTarget+Inlining+NGEN
    This PR (6694) - mean (627ms)  : 607, 647
     .   : milestone, 627,
    master - mean (631ms)  : 615, 647
     .   : milestone, 631,

Loading
gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6694) - mean (191ms)  : 188, 195
     .   : milestone, 191,
    master - mean (191ms)  : 187, 196
     .   : milestone, 191,

    section CallTarget+Inlining+NGEN
    This PR (6694) - mean (1,110ms)  : 1086, 1135
     .   : milestone, 1110,
    master - mean (1,114ms)  : 1079, 1149
     .   : milestone, 1114,

Loading
gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6694) - mean (272ms)  : 267, 277
     .   : milestone, 272,
    master - mean (271ms)  : 267, 276
     .   : milestone, 271,

    section CallTarget+Inlining+NGEN
    This PR (6694) - mean (865ms)  : 838, 892
     .   : milestone, 865,
    master - mean (869ms)  : 838, 901
     .   : milestone, 869,

Loading
gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6694) - mean (263ms)  : 258, 269
     .   : milestone, 263,
    master - mean (262ms)  : 258, 266
     .   : milestone, 262,

    section CallTarget+Inlining+NGEN
    This PR (6694) - mean (848ms)  : 810, 886
     .   : milestone, 848,
    master - mean (848ms)  : 816, 880
     .   : milestone, 848,

Loading

@andrewlock
Copy link
Member

andrewlock commented Feb 21, 2025

Benchmarks Report for tracer 🐌

Benchmarks for #6694 compared to master:

  • 1 benchmarks are slower, with geometric mean 1.149
  • All benchmarks have the same allocations

The following thresholds were used for comparing the benchmark speeds:

  • Mann–Whitney U test with statistical test for significance of 5%
  • Only results indicating a difference greater than 10% and 0.3 ns are considered.

Allocation changes below 0.5% are ignored.

Benchmark details

Benchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartStopWithChild net6.0 8.2μs 47.3ns 393ns 0.0162 0.00404 0 5.61 KB
master StartStopWithChild netcoreapp3.1 10.6μs 57.8ns 409ns 0.0163 0.00544 0 5.81 KB
master StartStopWithChild net472 16.4μs 32.7ns 122ns 1.04 0.299 0.0946 6.2 KB
#6694 StartStopWithChild net6.0 8.09μs 46.3ns 347ns 0.0145 0.00727 0 5.61 KB
#6694 StartStopWithChild netcoreapp3.1 10.3μs 58.8ns 455ns 0.0202 0.0101 0 5.8 KB
#6694 StartStopWithChild net472 16.1μs 65.1ns 252ns 1.03 0.294 0.0955 6.21 KB
Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 468μs 202ns 781ns 0 0 0 2.7 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 667μs 424ns 1.47μs 0 0 0 2.7 KB
master WriteAndFlushEnrichedTraces net472 839μs 563ns 2.18μs 0.417 0 0 3.3 KB
#6694 WriteAndFlushEnrichedTraces net6.0 487μs 220ns 825ns 0 0 0 2.7 KB
#6694 WriteAndFlushEnrichedTraces netcoreapp3.1 665μs 512ns 1.98μs 0 0 0 2.7 KB
#6694 WriteAndFlushEnrichedTraces net472 847μs 568ns 2.13μs 0.422 0 0 3.3 KB
Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendRequest net6.0 130μs 173ns 598ns 0.195 0 0 14.47 KB
master SendRequest netcoreapp3.1 151μs 183ns 684ns 0.15 0 0 17.27 KB
master SendRequest net472 0.000551ns 0.000227ns 0.000786ns 0 0 0 0 b
#6694 SendRequest net6.0 130μs 445ns 1.72μs 0.194 0 0 14.47 KB
#6694 SendRequest netcoreapp3.1 143μs 455ns 1.76μs 0.21 0 0 17.27 KB
#6694 SendRequest net472 0.000106ns 7.22E‑05ns 0.00027ns 0 0 0 0 b
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 556μs 2.55μs 9.18μs 0.293 0 0 41.66 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 656μs 3.59μs 22.4μs 0.345 0 0 41.72 KB
master WriteAndFlushEnrichedTraces net472 851μs 3.52μs 13.2μs 8.28 2.48 0.414 53.27 KB
#6694 WriteAndFlushEnrichedTraces net6.0 579μs 2.54μs 9.14μs 0.566 0 0 41.75 KB
#6694 WriteAndFlushEnrichedTraces netcoreapp3.1 694μs 3.83μs 26.5μs 0.336 0 0 41.76 KB
#6694 WriteAndFlushEnrichedTraces net472 855μs 3.08μs 12.3μs 8.28 2.48 0.414 53.28 KB
Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteNonQuery net6.0 1.28μs 1.25ns 4.83ns 0.0147 0 0 1.02 KB
master ExecuteNonQuery netcoreapp3.1 1.74μs 0.867ns 3ns 0.0131 0 0 1.02 KB
master ExecuteNonQuery net472 2.03μs 1.7ns 6.6ns 0.157 0.00101 0 987 B
#6694 ExecuteNonQuery net6.0 1.33μs 2.2ns 12.8ns 0.0141 0 0 1.02 KB
#6694 ExecuteNonQuery netcoreapp3.1 1.78μs 1.65ns 6.41ns 0.0134 0 0 1.02 KB
#6694 ExecuteNonQuery net472 1.97μs 2.8ns 10.9ns 0.156 0.000984 0 987 B
Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master CallElasticsearch net6.0 1.24μs 0.395ns 1.48ns 0.0134 0 0 976 B
master CallElasticsearch netcoreapp3.1 1.56μs 0.234ns 0.844ns 0.0135 0 0 976 B
master CallElasticsearch net472 2.46μs 2.25ns 8.71ns 0.157 0 0 995 B
master CallElasticsearchAsync net6.0 1.37μs 0.627ns 2.34ns 0.013 0 0 952 B
master CallElasticsearchAsync netcoreapp3.1 1.65μs 0.682ns 2.55ns 0.0139 0 0 1.02 KB
master CallElasticsearchAsync net472 2.61μs 1.48ns 5.74ns 0.166 0 0 1.05 KB
#6694 CallElasticsearch net6.0 1.17μs 0.615ns 2.38ns 0.0135 0 0 976 B
#6694 CallElasticsearch netcoreapp3.1 1.54μs 1.01ns 3.93ns 0.0132 0 0 976 B
#6694 CallElasticsearch net472 2.49μs 2.09ns 8.09ns 0.158 0 0 995 B
#6694 CallElasticsearchAsync net6.0 1.25μs 0.44ns 1.65ns 0.0134 0 0 952 B
#6694 CallElasticsearchAsync netcoreapp3.1 1.65μs 0.367ns 1.42ns 0.0141 0 0 1.02 KB
#6694 CallElasticsearchAsync net472 2.58μs 1.68ns 6.29ns 0.166 0 0 1.05 KB
Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteAsync net6.0 1.4μs 0.424ns 1.59ns 0.0133 0 0 952 B
master ExecuteAsync netcoreapp3.1 1.63μs 0.625ns 2.34ns 0.0125 0 0 952 B
master ExecuteAsync net472 1.87μs 1.39ns 5.37ns 0.145 0 0 915 B
#6694 ExecuteAsync net6.0 1.33μs 0.445ns 1.73ns 0.0133 0 0 952 B
#6694 ExecuteAsync netcoreapp3.1 1.67μs 0.976ns 3.78ns 0.0126 0 0 952 B
#6694 ExecuteAsync net472 1.87μs 0.452ns 1.63ns 0.145 0 0 915 B
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendAsync net6.0 4.44μs 1.82ns 7.04ns 0.031 0 0 2.31 KB
master SendAsync netcoreapp3.1 5.29μs 3.2ns 12.4ns 0.037 0 0 2.85 KB
master SendAsync net472 7.54μs 2.15ns 8.04ns 0.494 0 0 3.12 KB
#6694 SendAsync net6.0 4.53μs 2.64ns 10.2ns 0.0317 0 0 2.31 KB
#6694 SendAsync netcoreapp3.1 5.36μs 2.36ns 8.82ns 0.0381 0 0 2.85 KB
#6694 SendAsync net472 7.47μs 1.58ns 6.13ns 0.494 0 0 3.12 KB
Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 1.53μs 0.928ns 3.47ns 0.0231 0 0 1.64 KB
master EnrichedLog netcoreapp3.1 2.34μs 1.04ns 4.05ns 0.0223 0 0 1.64 KB
master EnrichedLog net472 2.56μs 0.73ns 2.83ns 0.249 0 0 1.57 KB
#6694 EnrichedLog net6.0 1.51μs 0.937ns 3.63ns 0.0235 0 0 1.64 KB
#6694 EnrichedLog netcoreapp3.1 2.2μs 0.979ns 3.53ns 0.0219 0 0 1.64 KB
#6694 EnrichedLog net472 2.6μs 1.23ns 4.78ns 0.249 0 0 1.57 KB
Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 112μs 85.3ns 330ns 0 0 0 4.28 KB
master EnrichedLog netcoreapp3.1 121μs 166ns 622ns 0 0 0 4.28 KB
master EnrichedLog net472 151μs 191ns 741ns 0.677 0.226 0 4.46 KB
#6694 EnrichedLog net6.0 112μs 220ns 851ns 0 0 0 4.28 KB
#6694 EnrichedLog netcoreapp3.1 118μs 160ns 620ns 0 0 0 4.28 KB
#6694 EnrichedLog net472 150μs 121ns 454ns 0.679 0.226 0 4.46 KB
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 3.14μs 2.99ns 10.8ns 0.03 0 0 2.2 KB
master EnrichedLog netcoreapp3.1 4.22μs 1.44ns 5.4ns 0.03 0 0 2.2 KB
master EnrichedLog net472 4.92μs 1.66ns 6.44ns 0.319 0 0 2.02 KB
#6694 EnrichedLog net6.0 2.99μs 0.768ns 2.87ns 0.0315 0 0 2.2 KB
#6694 EnrichedLog netcoreapp3.1 4.14μs 2.15ns 8.04ns 0.03 0 0 2.2 KB
#6694 EnrichedLog net472 4.91μs 1.2ns 4.34ns 0.319 0 0 2.02 KB
Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendReceive net6.0 1.43μs 1.04ns 3.91ns 0.0157 0 0 1.14 KB
master SendReceive netcoreapp3.1 1.93μs 0.562ns 2.1ns 0.0153 0 0 1.14 KB
master SendReceive net472 2.15μs 2.04ns 7.9ns 0.183 0 0 1.16 KB
#6694 SendReceive net6.0 1.37μs 0.555ns 2.15ns 0.0158 0 0 1.14 KB
#6694 SendReceive netcoreapp3.1 1.75μs 0.571ns 2.21ns 0.0149 0 0 1.14 KB
#6694 SendReceive net472 2.13μs 0.833ns 3.12ns 0.183 0 0 1.16 KB
Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 2.79μs 0.807ns 3.13ns 0.0224 0 0 1.6 KB
master EnrichedLog netcoreapp3.1 3.72μs 1.34ns 5.17ns 0.0223 0 0 1.65 KB
master EnrichedLog net472 4.3μs 6.19ns 24ns 0.323 0 0 2.04 KB
#6694 EnrichedLog net6.0 2.76μs 1.04ns 3.9ns 0.0221 0 0 1.6 KB
#6694 EnrichedLog netcoreapp3.1 3.78μs 1.36ns 5.09ns 0.0208 0 0 1.65 KB
#6694 EnrichedLog net472 4.18μs 3.11ns 12ns 0.322 0 0 2.04 KB
Benchmarks.Trace.SpanBenchmark - Slower ⚠️ Same allocations ✔️

Slower ⚠️ in #6694

Benchmark diff/base Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.SpanBenchmark.StartFinishSpan‑net472 1.149 591.85 680.08

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartFinishSpan net6.0 447ns 0.887ns 3.43ns 0.00806 0 0 576 B
master StartFinishSpan netcoreapp3.1 600ns 0.831ns 3ns 0.00782 0 0 576 B
master StartFinishSpan net472 590ns 1.96ns 7.33ns 0.0915 0 0 578 B
master StartFinishScope net6.0 482ns 0.943ns 3.53ns 0.0097 0 0 696 B
master StartFinishScope netcoreapp3.1 717ns 1.98ns 7.68ns 0.00962 0 0 696 B
master StartFinishScope net472 886ns 1.84ns 7.12ns 0.105 0 0 658 B
#6694 StartFinishSpan net6.0 465ns 0.633ns 2.28ns 0.00806 0 0 576 B
#6694 StartFinishSpan netcoreapp3.1 565ns 0.597ns 2.23ns 0.0077 0 0 576 B
#6694 StartFinishSpan net472 679ns 0.809ns 3.03ns 0.0916 0 0 578 B
#6694 StartFinishScope net6.0 482ns 0.83ns 3.11ns 0.00977 0 0 696 B
#6694 StartFinishScope netcoreapp3.1 752ns 0.856ns 3.2ns 0.00929 0 0 696 B
#6694 StartFinishScope net472 860ns 1.69ns 6.56ns 0.104 0 0 658 B
Benchmarks.Trace.TraceAnnotationsBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master RunOnMethodBegin net6.0 679ns 0.696ns 2.69ns 0.00986 0 0 696 B
master RunOnMethodBegin netcoreapp3.1 882ns 0.557ns 2.01ns 0.0092 0 0 696 B
master RunOnMethodBegin net472 1.09μs 1.9ns 7.36ns 0.104 0 0 658 B
#6694 RunOnMethodBegin net6.0 683ns 0.775ns 3ns 0.00979 0 0 696 B
#6694 RunOnMethodBegin netcoreapp3.1 926ns 1.81ns 7.02ns 0.00924 0 0 696 B
#6694 RunOnMethodBegin net472 1μs 1.84ns 7.12ns 0.104 0 0 658 B

@lucaspimentel lucaspimentel added type:performance Performance, speed, latency, resource usage (CPU, memory) area:integrations labels Feb 24, 2025
Copy link
Member

@lucaspimentel lucaspimentel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if this is what @tonyredondo recommends 😅

Co-authored-by: Lucas Pimentel <[email protected]>
Copy link
Contributor

@bouwkast bouwkast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Comment on lines +32 to +35
// For performance reasons, we want to do the actual instrumentation work with a Duck constraint,
// but to be able to disable the instrumentation we need the raw type
// so we use 2 different methods to have access to both when we need it.
// Note: Disabling OnMethodBegin means the OnMethodEnd will not be called afterward.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting thanks!

@vandonr vandonr merged commit df89bf4 into master Feb 25, 2025
146 of 148 checks passed
@vandonr vandonr deleted the vandonr/proto branch February 25, 2025 13:10
@github-actions github-actions bot added this to the vNext-v3 milestone Feb 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:integrations type:performance Performance, speed, latency, resource usage (CPU, memory)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants