Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support multiple histogram output formats #325

Open
jmartisk opened this issue Jun 16, 2020 · 10 comments
Open

Support multiple histogram output formats #325

jmartisk opened this issue Jun 16, 2020 · 10 comments
Labels
enhancement New feature or request innovation-candidate Something not in the spec and SmallRye could serve for verifying usefulness before backporting it

Comments

@jmartisk
Copy link
Member

Allow switching histogram output from showing percentiles to showing item counts within defined buckets. For example, a timer will export something like this

starlette_requests_processing_time_seconds_bucket{le="5.0",method="GET",path_template="/metrics"} 12789.0
starlette_requests_processing_time_seconds_bucket{le="7.5",method="GET",path_template="/metrics"} 12789.0
starlette_requests_processing_time_seconds_bucket{le="10.0",method="GET",path_template="/metrics"} 12789.0
starlette_requests_processing_time_seconds_bucket{le="+Inf",method="GET",path_template="/metrics"} 12789.0

Originally reported in MP Metrics as microprofile/microprofile-metrics#587

@jmartisk jmartisk added the innovation-candidate Something not in the spec and SmallRye could serve for verifying usefulness before backporting it label Jun 16, 2020
@jmartisk jmartisk added this to the 3.0.0 milestone Jun 16, 2020
@jmartisk
Copy link
Member Author

Hey @theute + @pilhuhn , so I'd start with timers only for now and implement this in a way that SmallRye will have a io.smallrye.metrics.annotation.Timed annotation that does the same thing as org.eclipse.microprofile.metrics.annotation.Timed, but offers two new parameters:

  • histogramType which is an enum with values PERCENTILES and THRESHOLD_VALUES (default will be PERCENTILES)
  • buckets which will be a float[] and it will define the buckets when histogramType==THRESHOLD_VALUES. They will be considered to be in the same unit as the main unit of the timer, and if this is undefined, there will be some default.
    • Potentially we could apply the buckets field as well when histogramType==PERCENTILES, in which case the array would be interpreted as desired percentiles ([0.01, 0.05, 0.1, 0.5, 0.9, 0.99] etc)

Our SmallRye-specific ExtendedMetadata will also add the same two fields with the same semantics. These fields will only be taken into account for timers, ignored for other metric types.

If this proves useful, it will be backported to MP Metrics spec later.

Do you have any other ideas?

@theute
Copy link

theute commented Jun 16, 2020

Do you have a code example ? That would speak to me better.

@jmartisk
Copy link
Member Author

It would look like this - for annotated metrics you will use

@io.smallrye.metrics.annotation.Timed(histogramType=HistogramType.THRESHOLD_VALUES, 
     buckets=[0.05, 0.1, 0.5])
// instead of @org.eclipse.microprofile.annotation.Timed
public void timedMethod() {
	// your logic
}

for programmatic approach without annotations, you will use

@Inject
MetricRegistry registry;

ExtendedMetadata metadata = new io.smallrye.metrics.ExtendedMetadataBuilder()
	.withName("my-timer")
	.withHistogramType(HistogramType.THRESHOLD_VALUES)
	.withBuckets(0.05, 0.1, 0.5)
	.build();
registry.timer(metadata);

the timer registered this way would be exported the way you requested, with the defined buckets. There will also be some sort of default for the buckets, so you don't have to specify them.

@jmartisk
Copy link
Member Author

Perhaps COUNTS_WITHIN_BUCKETS would be more understandable than THRESHOLD_VALUES, I don't know.

@theute
Copy link

theute commented Jun 16, 2020

Maybe BY_BUCKETS or PER_BUCKET ?
How about those who would want both buckets and percentile ? Can you use twice the annotation (different names ?) ?

Also for buckets it should count all request lower or equals to the threshold, and I suggest to use the name "le" like in other solutions ( https://prometheus.io/docs/practices/histograms/ )

So for buckets of 1, 2, 3 and for a time of 2 this should have
{le='1'} =1
{le='2'} =1
{le='3'} =0
{le='+Inf'} =0

(Also seen here: https://books.google.ch/books?id=EpxVDwAAQBAJ&pg=PA232&lpg=PA232&dq=%22le%22+metrics+latency&source=bl&ots=31_7p39pbT&sig=ACfU3U1fnScK3hkBd17Q6VKe7eeM9VWe1Q&hl=en&sa=X&ved=2ahUKEwjL-7qUrobqAhWBoVwKHY9tAgEQ6AEwAHoECAcQAQ#v=onepage&q=%22le%22%20metrics%20latency&f=false )

@theute
Copy link

theute commented Jun 16, 2020

I may have asked the wrong thing actually...
Maybe we should have a different approach and not a task for SmallRye metrics...

In the end it would be handy to also have the HTTP service path and HTTP method (and buckets) like in the upper example used in Python... Maybe we should just implement a JAX-RS filter... @pilhuhn thoughts ?

@abkieling
Copy link

abkieling commented Jul 23, 2020

This is a very important feature. Having the histogram data available in Prometheus gives a lot of flexibility when querying and aggregating the data. The same doesn't happen when the percentiles are calculated in the client.
When in doubt about design decisions, have a look at the Micrometer library and the Prometheus Java client.

@abkieling
Copy link

abkieling commented Jul 23, 2020

@jmartisk Is there any plan to include this feature in the next release of SmallRye? I need to decide between temporarily use the existing percentiles calculated on the client or wait until it gets released. Thanks.

@jmartisk
Copy link
Member Author

If there's really demand for something like what I described in comments #325 (comment) and #325 (comment) then I am willing to do that, I think it is realistic that it could get into SR Metrics 3.0 (but I can't promise 100 %). However, PRs with contributions are welcome too ;)
Note that it would be marked as an experimental vendor-specific feature and therefore not guaranteed stable.

@abkieling
Copy link

I don't think there are types of histogram. There are output formats: precalculated percentiles (Prometheus calls it Summary) or the raw histogram data (the values in each bucket). Just an idea: a new attribute called output with possible values: summary or histogram.

@jmartisk jmartisk removed this from the 3.0.0 milestone Oct 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request innovation-candidate Something not in the spec and SmallRye could serve for verifying usefulness before backporting it
Projects
None yet
Development

No branches or pull requests

3 participants