MQE: Add support for histogram_quantile #9929
Conversation
Also preps support for more classic histogram functions to come. (Will require some re-work, but the basics are there). Tidies up annotation tests and checks their results between engines. (Since sometimes we emit annotations with results, and sometimes the results are omitted when there is an annotation).
Still working my way through the implementation and will keep going tomorrow - I have some suggestions for the tests in the meantime
@@ -1836,11 +1836,82 @@ func (t *timeoutTestingQueryTracker) Close() error {
	return nil
}

func TestAnnotations(t *testing.T) {
Is there a reason why you've split this test in half?
I like shorter tests and it felt like a good place to split it up since I wanted a function to just test the histogram annotations etc. Happy to rejoin though if you disagree.
I'm not entirely against it, but it is going to cause a bunch of merge conflicts with the upcoming Prometheus 3 changes (which introduce a bunch of new annotations) unless I rebase those.
Perhaps we can keep things as they were in this PR and then split the test in a later PR?
I think this will create conflicts regardless. Let's see where we are at when either this or the Prometheus 3 PRs are close to merging.
Could you please add a test for the case where the buckets for an output series change over time (eg. at T=1, buckets are 1, 2 and 5, but at T=2, buckets are 1, 3 and 7).
I'm not sure what you mean by "output series" here, sorry?
Ah sorry: let's say the input series are:
metric{env="test", le="1"}
metric{env="test", le="2"}
metric{env="test", le="3"}
metric{env="test", le="5"}
metric{env="test", le="7"}
Then these all map to the one output series, {env="test"}.
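The grouping described above can be sketched in Go: every input series that differs only in its le label maps to the same output series. This is a hand-rolled illustration using plain string maps, not the engine's actual series-grouping code.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// outputGroupKey builds a grouping key for a classic histogram input
// series by dropping the "le" label, so all buckets of one histogram
// collapse into a single output series.
func outputGroupKey(series map[string]string) string {
	keys := make([]string, 0, len(series))
	for k := range series {
		if k == "le" {
			continue // the bucket bound is not part of the output series
		}
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+series[k])
	}
	return "{" + strings.Join(parts, ", ") + "}"
}

func main() {
	// The five input series from the example all share one output group.
	for _, le := range []string{"1", "2", "3", "5", "7"} {
		fmt.Println(outputGroupKey(map[string]string{"env": "test", "le": le}))
	}
	// Every iteration prints: {env=test}
}
```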
I'm still not sure what you mean by the buckets changing over time, sorry? Do you just mean where the output series has labels beyond just its __name__?
I'm imagining a test case like this:
load 6m
metric{env="test", le="1"} 1 2
metric{env="test", le="2"} 5 _
metric{env="test", le="3"} _ 9
metric{env="test", le="5"} 8 _
metric{env="test", le="7"} _ 20
eval range from 0 to 6m step 6m histogram_quantile(0.5, metric)
{env="test"} xxx yyy
c20de2b
to
016d546
Compare
Nice work 🙂
I'd like to see some benchmark results for this.
} else {
	for _, f := range fPoints {
		pointIdx := h.timeRange.PointIndex(f.T)
		g.pointBuckets[pointIdx] = append(
Something to consider, might be something for a follow-up PR: what if we allocated g.pointBuckets[pointIdx] once, based on the expected number of buckets? As it stands, we'll keep appending to g.pointBuckets[pointIdx], which may require many expansions of the slice, with all the allocations and copying that entails.
We could assume that if there are any floats present at a point, then all buckets will be present at that point (which should hold true unless the bucket layout changes).
We could also then pre-sort the list of buckets by upperBound, and then directly write to the correct bucket, reducing / eliminating any shuffling required when sorting in bucketQuantile.
The only thing I'm not sure about is how we'd handle the case where some buckets aren't present (eg. because the bucket layout changed mid-query) - we'd need to keep track of which buckets are present somehow.
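The pre-allocation idea above could look roughly like the sketch below: allocate each point's bucket slots once and track presence explicitly for buckets that are missing at some points (e.g. after a bucket layout change). All names here (pointBuckets, set) are illustrative, not the engine's actual types.

```go
package main

import "fmt"

// pointBuckets holds one slot per expected bucket at a single point,
// plus a presence bitmap for buckets missing at that point.
type pointBuckets struct {
	values  []float64
	present []bool
}

func newPointBuckets(numBuckets int) pointBuckets {
	return pointBuckets{
		values:  make([]float64, numBuckets), // single allocation per point
		present: make([]bool, numBuckets),
	}
}

// set writes directly into the slot for a bucket, assuming buckets are
// pre-sorted by upperBound so bucketIdx is stable across points.
func (p pointBuckets) set(bucketIdx int, v float64) {
	p.values[bucketIdx] = v
	p.present[bucketIdx] = true
}

func main() {
	p := newPointBuckets(3)
	p.set(0, 1) // le="1"
	p.set(2, 8) // le="5"; le="2" is absent at this point
	fmt.Println(p.values, p.present) // [1 0 8] [true false true]
}
```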
It's a good idea, but I think there are some complicated edge cases as you point out. So I agree better for a followup PR.
This is looking good.
I'd like to see some benchmark results, and I'd also like for histogram_quantile to go behind a feature flag so we can turn it off if need be - there's a lot of complexity here, and while I can't see any obvious issues, it'd be good to have that available if we need it.
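The feature-flag gating asked for above might take a shape like the following sketch. The names (EngineOpts, EnableHistogramQuantile, supportedFunctions) are hypothetical, not Mimir's actual configuration surface; with the flag off, unsupported queries would presumably fall back to the Prometheus engine.

```go
package main

import "fmt"

// EngineOpts is a hypothetical options struct for the streaming engine.
type EngineOpts struct {
	EnableHistogramQuantile bool
}

// supportedFunctions registers histogram_quantile only when the flag is on.
func supportedFunctions(opts EngineOpts) map[string]bool {
	fns := map[string]bool{"rate": true, "sum_over_time": true}
	if opts.EnableHistogramQuantile {
		fns["histogram_quantile"] = true
	}
	return fns
}

func main() {
	fns := supportedFunctions(EngineOpts{EnableHistogramQuantile: false})
	// With the flag off, histogram_quantile is not handled by this engine.
	fmt.Println(fns["histogram_quantile"]) // false
}
```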
@@ -219,7 +218,13 @@ func (h *HistogramFunctionOverInstantVector) accumulateUntilGroupComplete(ctx co
	// It is also possible that both series groups are the same.
	// The conflict in points is then detected in computeOutputSeriesForGroup.
	h.saveNativeHistogramsToGroup(s.Histograms, thisSeriesGroups.nativeHistogramGroup)
	h.saveFloatsToGroup(s.Floats, thisSeriesGroups.bucketValue, thisSeriesGroups.classicHistogramGroup)
	if thisSeriesGroups.classicHistogramGroup != nil {
If there is no le label, but there are float values for the series, do we need to emit an annotation or do something like that here?
@@ -136,7 +140,7 @@ func (h *HistogramFunctionOverInstantVector) SeriesMetadata(ctx context.Context)
	if !groupExists {
		g.labels = series.Labels
		g.group = bucketGroupPool.Get()
-		g.group.firstInputSeriesIdx = innerIdx
+		g.group.lastInputSeriesIdx = innerIdx
This isn't quite right - we need to update lastInputSeriesIdx both when the group already exists and when it doesn't, but it's currently only set when the group is created for the first time.
Same comment applies for classic histogram case below.
Might be good to add some tests for the sorting logic to catch stuff like this.
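The fix being described is, in essence, to move the assignment out of the creation branch so it runs for every series. A minimal sketch, with simplified stand-in types for the engine's bucket-group bookkeeping:

```go
package main

import "fmt"

// bucketGroup tracks the first and last input series indices that map
// to one output series.
type bucketGroup struct {
	firstInputSeriesIdx int
	lastInputSeriesIdx  int
}

func assignSeriesToGroup(groups map[string]*bucketGroup, key string, innerIdx int) *bucketGroup {
	g, exists := groups[key]
	if !exists {
		g = &bucketGroup{firstInputSeriesIdx: innerIdx}
		groups[key] = g
	}
	// Updated unconditionally, so the group keeps tracking its last
	// series even when the group already existed.
	g.lastInputSeriesIdx = innerIdx
	return g
}

func main() {
	groups := map[string]*bucketGroup{}
	for _, idx := range []int{0, 2, 5} {
		assignSeriesToGroup(groups, `{env="test"}`, idx)
	}
	g := groups[`{env="test"}`]
	fmt.Println(g.firstInputSeriesIdx, g.lastInputSeriesIdx) // 0 5
}
```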
What this PR does
Which issue(s) this PR fixes or relates to
Fixes #
Checklist
- CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
- about-versioning.md updated with experimental features.