-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
src/collector: Introduce Collector abstraction #82
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do folks think of this design? Would this enable your use-cases?
//CC @gagbo @dovreshef @vladvasiliu @phyber
Alternative implementation to #82.
src/collector.rs
Outdated
pub trait Collector<M: Clone> { | ||
fn collect<'a>(&'a self) -> Box<dyn Iterator<Item = (Cow<Descriptor>, Cow<M>)> + 'a>; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the trait one would need to implement to provide a custom collector, e.g. a process collector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this trait is generic over M
, which I think is the type of the metric? How does the Clone
bound work with collectors/registries of type Box<dyn EncodeMetric>
which aren't Clone
? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does the
Clone
bound work with collectors/registries of typeBox<dyn EncodeMetric>
which aren'tClone
? thinking
It doesn't. For now this is just a proposal. We could require Clone
on EnocdeMetric
, though I think the much better idea is to no longer be generic over the metric type. See #82 (comment).
Let me know what you think @sd2k!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah great, that makes sense. I agree that switching to an enum instead of generics/trait objects would simplify things a lot! (Especially since adding a Clone
bound to EncodeMetric
would mean it wasn't object safe, so I don't think we could use trait objects anyway).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Especially since adding a Clone bound to EncodeMetric would mean it wasn't object safe, so I don't think we could use trait objects anyway).
Good point. We would have to drop the Cow<M>
in favor of M
, but then again, let's try to get rid of M
.
src/registry.rs
Outdated
/// struct MyCollector {} | ||
/// | ||
/// impl Collector<Counter> for MyCollector { | ||
/// fn collect<'a>(&'a self) -> Box<dyn Iterator<Item = (Cow<Descriptor>, Cow<Counter>)> + 'a> { | ||
/// let c = Counter::default(); | ||
/// let descriptor = Descriptor::new( | ||
/// "my_counter", | ||
/// "This is my counter", | ||
/// None, | ||
/// None, | ||
/// vec![], | ||
/// ); | ||
/// Box::new(std::iter::once((Cow::Owned(descriptor), Cow::Owned(c)))) | ||
/// } | ||
/// } | ||
/// | ||
/// let my_collector = Box::new(MyCollector{}); | ||
/// | ||
/// let mut registry: Registry<Counter> = Registry::default(); | ||
/// | ||
/// registry.register_collector(my_collector); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is an example on how to implement and register a custom collector.
src/registry.rs
Outdated
@@ -57,26 +59,37 @@ use std::borrow::Cow; | |||
/// # "# EOF\n"; | |||
/// # assert_eq!(expected, String::from_utf8(buffer).unwrap()); | |||
/// ``` | |||
#[derive(Debug)] | |||
pub struct Registry<M = Box<dyn crate::encoding::text::SendSyncEncodeMetric>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the Collector
mechanism I don't think there is need for being abstract over M
any longer. Using a concrete type (e.g. enum
over all metrics in prometheus_client::metrics
) will drastically simplify both the library and its usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #100 proposing the above.
Also //CC @sd2k |
And //CC @08d2 |
Looks like it will work for the process collector use case! Thanks. |
Am I understanding correctly that the user of the lib is expected to reimplement the If so, I think there should be a way to provide an async collector; this will probably require implementing some alternative AsyncRegistry with matching AsyncCollector trait. |
For context, That call is subject to a few layers of (typically sub-second) timeouts and failsafes, and is expected to return more or less immediately. An implementation of If the information returned by |
That's what I was thinking, although this example seems to be reading stuff (memory info) from an external system (though, in this case, it's probably unlikely that it should take long). I have to think a bit more about this; it seems to me that once the collector is added to the registry, there's no easy way to only update it such that it doesn't share state between possibly concurrent calls to the |
callLongGetter? Yeah, it's not obvious to me what that does, exactly... but I wouldn't take too much inspiration from the Java client, which hasn't been updated in a long while, and was never particularly idiomatic. Stick with the Go client, or maybe look at the more active exporters hosted in the Prometheus org.
The nature of the problem demands shared state, for sure. (Or a "clever" workaround to avoid it, which, well.) But does shared state imply async, either necessarily or even ideally? I don't think so, but I could be wrong. |
I was thinking more about
I wouldn't say shared state implies async. I think the issue was the way I was thinking about somehow getting the external (non-shared) state as a collector of the shared state, which would have then required This could probably be implemented the other way around, with the shared state added as a sub-collector to the non-shared one before returning. Which, if |
I've been looking a bit more over this, and I think that, basically, a I think that if the lib provided an implementation of |
Correct.
Can you be more concrete? Can you describe a use-case where one would need an async collector?
Prometheus does not expect the scrape call to return immediately. E.g. from the Prometheus scheduling section:
By default the collection should not be decoupled from the collection. Citing the Prometheus scheduling section:
I don't follow how |
Also note that the |
My use case is exporting metrics that are gathered via multiple HTTP calls. If my HTTP client is async,
Well, the question is whether
See the other point: this is a confusion about whether If it's not, and random long operations can be done inside However, in the second situation, when the actual gathering of the metrics should be done outside of the |
Looking at the Go implementation, specifically the example collector, it's clear that the This means that there's no particular reason to provide an implementation of |
That example describes how a collector might be "bolted on" to a legacy system. It's not indicative of best practice. But the point is largely moot, I think. |
My current examples why a collector would need to support async are:
While both approaches could also refresh the data in the background, having it directly connected to the scrape and being able to fetch fresh data on a scrape provides more reliable resolution than scraping previously scraped data. Now, the data refresh could actually be handled in the http handler, but still it seems like an unnecessary detour. |
As a first iteration, I suggest we support synchronous
In case folks see a large performance hit, e.g. due to thread spawns (even though that should be < 10 microseconds), or (1) and (2) add too much complexity in your |
@kwiesmueller Thanks for sharing. Let me know in case the threading approach is good enough for you for now, i.e. can collect from the many datasources you are targeting within the Prometheus scrape timeout. If not, as mentioned above, let's design an |
Adopt encoding style similar to serde. `EncodeXXX` for a type that can be encoded (see serde's `Serialize`) and `XXXEncoder` for a supported encoding i.e. text and protobuf (see serde's `Serializer`). - Compatible with Rust's additive features. Enabling the `protobuf` feature does not change type signatures. - `EncodeMetric` is trait object safe, and thus `Registry` can use dynamic dispatch. - Implement a single encoding trait `EncodeMetric` per metric type instead of one implementation per metric AND per encoding format. - Leverage `std::fmt::Write` instead of `std::io::Write`. The OpenMetrics text format has to be unicode compliant. The OpenMetrics Protobuf format requires labels to be valid strings. Signed-off-by: Max Inden <[email protected]>
Very interested in this work are you looking for help or what would help move this forward. |
Thanks for the interest. Sorry for the silence. I rebased this pull request on #105. I want to try out the new API with one of my projects (e.g. https://github.com/mxinden/kademlia-exporter/) before merging #105 and this pull request.
I would appreciate feedback on #105 and this pull request. You would be of great help by testing this pull request (it is on top of #105) within your application in need of the |
I have some cycles today will take a look, thanks! |
Signed-off-by: Max Inden <[email protected]>
Signed-off-by: Max Inden <[email protected]>
The `Collector` abstraction allows users to provide additional metrics and their description on each scrape. See also: - https://pkg.go.dev/github.com/prometheus/client_golang/prometheus#hdr-Custom_Collectors_and_constant_Metrics - prometheus#49 - prometheus#29 Signed-off-by: Max Inden <[email protected]>
Signed-off-by: Max Inden <[email protected]>
Iterators returned by a Collector don't have to cross thread boundaries, nor doe their references.
Tested this patch with |
Signed-off-by: Max Inden <[email protected]>
Signed-off-by: Max Inden <[email protected]>
Signed-off-by: Max Inden <[email protected]>
Signed-off-by: Max Inden <[email protected]>
Signed-off-by: Max Inden <[email protected]>
@@ -228,41 +258,58 @@ impl Registry { | |||
.expect("sub_registries not to be empty.") | |||
} | |||
|
|||
/// [`Iterator`] over all metrics registered with the [`Registry`]. | |||
pub fn iter(&self) -> RegistryIterator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renaming RegistryIterator
and making this pub(crate)
is the only breaking change as far as I can tell. Might be worth undoing.
impl<S: EncodeLabelSet, M: EncodeMetric + TypedMetric, T: Iterator<Item = (S, M)>> EncodeMetric | ||
for RefCell<T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very hard to discover and doesn't feel Rust-y to me. Why do we need to depend on RefCell
here?
I think it would be more idiomatic to do:
impl EncodeMetric for Vec<T> where T: EncodeMetric { }
impl EncodeMetric for (S, M) where S: EncodeLabelSet, M: EncodeMetric { }
Then, have the user provide a Vec
instead of being generic over something that can be iterated. That should get rid of the RefCell
and iterator dependency and would compose better with other usecases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An Iterator::next
takes &mut self
in order to be able to e.g. track its position within a Vec
. Unfortunately EncodeMetric::encode
takes &self
.
In this pull request RefCell
is used for interior mutability. It allows calling an Iterator::next
taking &mut self
from an EncodeMetric::encode
providing &self
only.
An alternative approach would be to change EncodeMetric::encode
to take &mut self
. I deemed the change and its implications not worth it, given that Collector
is an abstraction for advanced users only.
All that said, I agree that the usage of RefCell
is neither simple nor intuitive. @thomaseizinger do you see other alternatives to the one above?
The
Collector
abstraction allows users to provide additional metricsand their description on each scrape.
See also:
I am opening this up as a "Draft" for early feedback.