Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MQE: Add support for histogram_quantile #9929

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

jhesketh
Copy link
Contributor

Also preps support for more classic histogram functions to come. (Will require some re-work, but the basics are there).

Tidies up annotation tests and checks their results between engines. (Since sometimes we emit annotations with results, and sometimes the results are omitted when there is an annotation).

What this PR does

Which issue(s) this PR fixes or relates to

Fixes #

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@jhesketh jhesketh requested review from stevesg, grafanabot and a team as code owners November 18, 2024 00:36
Also preps support for more classic histogram functions to come.
(Will require some re-work, but the basics are there).

Tidies up annotation tests and checks their results between engines.
(Since sometimes we emit annotations with results, and sometimes the
results are omitted when there is an annotation).
@jhesketh jhesketh force-pushed the jhesketh/mqe-histogram-quantile branch from 38d1a77 to d293f3a Compare November 18, 2024 00:40
@jhesketh jhesketh force-pushed the jhesketh/mqe-histogram-quantile branch from d89bbab to 78fc811 Compare November 18, 2024 05:45
@jhesketh jhesketh force-pushed the jhesketh/mqe-histogram-quantile branch from 78fc811 to 5abeee0 Compare November 18, 2024 05:46
Copy link
Contributor

@charleskorn charleskorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still working my way through the implementation and will keep going tomorrow - I have some suggestions for the tests in the meantime

pkg/streamingpromql/engine_test.go Outdated Show resolved Hide resolved
pkg/streamingpromql/engine_test.go Outdated Show resolved Hide resolved
pkg/streamingpromql/engine_test.go Outdated Show resolved Hide resolved
pkg/streamingpromql/engine_test.go Show resolved Hide resolved
pkg/streamingpromql/engine_test.go Outdated Show resolved Hide resolved
pkg/streamingpromql/engine_test.go Outdated Show resolved Hide resolved
@@ -1836,11 +1836,82 @@ func (t *timeoutTestingQueryTracker) Close() error {
return nil
}

func TestAnnotations(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why you've split this test in half?

Copy link
Contributor Author

@jhesketh jhesketh Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like shorter tests and it felt like a good place to split it up since I wanted a function to just test the histogram annotations etc. Happy to rejoin though if you disagree.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely against it, but it is going to cause a bunch of merge conflicts with the upcoming Prometheus 3 changes (which introduce a bunch of new annotations) unless I rebase those.

Perhaps we can keep things as they were in this PR and then split the test in a later PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will create conflicts regardless. Lets see where we are at when either this or the Prometheus 3 PR's are close to merging.

pkg/streamingpromql/engine_test.go Outdated Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add a test for the case where the buckets for an output series change over time (eg. at T=1, buckets are 1, 2 and 5, but at T=2, buckets are 1, 3 and 7).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean by output series here sorry?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry: let's say the input series are:

metric{env="test", le="1"}
metric{env="test", le="2"}
metric{env="test", le="3"}
metric{env="test", le="5"}
metric{env="test", le="7"}

Then these all map to the one output series, {env="test"}.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not sure what you mean by the buckets changing over time sorry? Do you just mean where the output series has labels more than just its __name__?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm imagining a test case like this:

load 6m
  metric{env="test", le="1"} 1 2
  metric{env="test", le="2"} 5 _
  metric{env="test", le="3"} _ 9
  metric{env="test", le="5"} 8 _
  metric{env="test", le="7"} _ 20

eval range from 0 to 6m step 6m histogram_quantile(0.5, metric)
  {env="test"} xxx yyy

@jhesketh jhesketh force-pushed the jhesketh/mqe-histogram-quantile branch from c20de2b to 016d546 Compare November 18, 2024 10:01
Copy link
Contributor

@charleskorn charleskorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work 🙂

I'd like to see some benchmark results for this.

pkg/streamingpromql/operators/functions/histograms.go Outdated Show resolved Hide resolved
pkg/streamingpromql/operators/functions/histograms.go Outdated Show resolved Hide resolved
pkg/streamingpromql/operators/functions/histograms.go Outdated Show resolved Hide resolved
pkg/streamingpromql/operators/functions/histograms.go Outdated Show resolved Hide resolved
} else {
for _, f := range fPoints {
pointIdx := h.timeRange.PointIndex(f.T)
g.pointBuckets[pointIdx] = append(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something to consider, might be something for a follow-up PR: what if we allocated g.pointBuckets[pointIdx] once based on the expected number of buckets? As it stands, we'll keep appending to g.pointBuckets[pointIdx], which may require many expansions of the slice, with all the allocations and copying that entails.

We could assume that if there are any floats present at a point, then all buckets will be present at that point (which should hold true unless the bucket layout changes).

We could also then pre-sort the list of buckets by upperBound, and then directly write to the correct bucket, reducing / eliminating any shuffling required when sorting in bucketQuantile.

The only thing I'm not sure about is how we'd handle the case where some buckets aren't present (eg. because the bucket layout changed mid-query) - we'd need to keep track of which buckets are present somehow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good idea, but I think there are some complicated edge cases as you point out. So I agree better for a followup PR.

Copy link
Contributor

@charleskorn charleskorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good.

I'd like to see some benchmark results, and I'd also like for
histogram_quantile to go behind a feature flag so we can turn it off if need be - there's a lot of complexity here, and while I can't see any obvious issues, it'd be good to have that available if we need it.

@@ -219,7 +218,13 @@ func (h *HistogramFunctionOverInstantVector) accumulateUntilGroupComplete(ctx co
// It is also possible that both series groups are the same.
// The conflict in points is then detected in computeOutputSeriesForGroup.
h.saveNativeHistogramsToGroup(s.Histograms, thisSeriesGroups.nativeHistogramGroup)
h.saveFloatsToGroup(s.Floats, thisSeriesGroups.bucketValue, thisSeriesGroups.classicHistogramGroup)
if thisSeriesGroups.classicHistogramGroup != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is no le label, but there are float values for the series, do we need to emit an annotation or do something like that here?

@@ -136,7 +140,7 @@ func (h *HistogramFunctionOverInstantVector) SeriesMetadata(ctx context.Context)
if !groupExists {
g.labels = series.Labels
g.group = bucketGroupPool.Get()
g.group.firstInputSeriesIdx = innerIdx
g.group.lastInputSeriesIdx = innerIdx
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't quite right - we need to update lastInputSeriesIdx both when the group already exists and when it doesn't, but it's currently only set when the group is created for the first time.

Same comment applies for classic histogram case below.

Might be good to add some tests for the sorting logic to catch stuff like this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants