High series cardinality #26

Open
cryptk opened this issue Oct 11, 2016 · 15 comments
Comments

@cryptk
Contributor

cryptk commented Oct 11, 2016

The way that mutator-influxdb-line-protocol.rb works, it creates very high series cardinality. Every individual metric that gets passed in ends up being a different series. Instead, it should be altered so that an individual check is a series, custom tags get applied to that, and the individual metrics that check creates are added as separate fields onto the same series.

Let's take https://github.com/sensu-plugins/sensu-plugins-mysql/blob/master/bin/metrics-mysql-graphite.rb for example. Running this metric check through the mutator extension does get your metrics into InfluxDB, but it does so very inefficiently. That check generates almost 100 different series in InfluxDB after it runs through the mutator extension. What it should do instead is create a single series (perhaps named something like somehostname_mysql_metrics) with ~100 fields on it.
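To make the proposed layout concrete, here is a minimal sketch of collapsing Graphite-style check output into one line-protocol point per timestamp. The helper name `graphite_to_single_series` and the choice to use the last path segment as the field key are assumptions for illustration, not code from the mutator.

```ruby
# Hypothetical helper: collapse Graphite-style check output
# ("path value timestamp" per line) into one line-protocol point per timestamp.
# Assumption: the last dot-separated path segment is unique enough to be a field key.
def graphite_to_single_series(output, measurement)
  points = Hash.new { |hash, timestamp| hash[timestamp] = {} }
  output.each_line do |line|
    path, value, timestamp = line.split
    next if timestamp.nil? # skip blank or malformed lines
    field = path.split('.').last
    points[timestamp][field] = Float(value) # raises on non-numeric values
  end
  points.map do |timestamp, fields|
    field_set = fields.map { |key, val| "#{key}=#{val}" }.join(',')
    # Graphite timestamps are in seconds, so the write needs second precision.
    "#{measurement} #{field_set} #{timestamp}"
  end
end

output = <<~METRICS
  db01.mysql.Threads_connected 12 1476200000
  db01.mysql.Questions 34567 1476200000
  db01.mysql.Slow_queries 3 1476200000
METRICS

puts graphite_to_single_series(output, 'db01_mysql_metrics')
```

With this shape, the three input metrics become three fields on a single point instead of three separate series.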

My big question is, should this change be done to the existing mutator extension (which would likely warrant a major version bump) or should a new mutator extension be created that stores the data in this new way?

Modifying the existing one likely wouldn't be too much of an issue because I do not foresee very many people using it given that it stores data in ways that are fairly antithetical to how influxdb should be used. For reference, check out the influxdb schema design documentation at https://docs.influxdata.com/influxdb/v1.0/concepts/schema_and_data_layout/

@cryptk
Contributor Author

cryptk commented Oct 11, 2016

Another nice option would be to talk with the maintainer of https://github.com/seegno/sensu-influxdb-extension and see if he would like to "donate" that code to sensu-plugins-influxdb as that extension has some other really nice features such as batching. It also has the high series cardinality issue that I mentioned above, and it does not support UDP communication, but it may be more worthwhile to use that as a starting point, resolve those two issues, and then have that be the "official" influxdb extension.
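As a rough illustration of combining the two features mentioned here, batching and UDP, the sketch below buffers line-protocol strings and hands them to a sender as one newline-joined payload. `LineBatcher` and `udp_sender` are hypothetical names, and port 8089 is only an assumed UDP listener address; none of this is taken from the seegno extension.

```ruby
require 'socket'

# Hypothetical buffer: collects line-protocol strings and passes them to a
# callable sender as one newline-joined payload once batch_size lines are queued.
class LineBatcher
  def initialize(batch_size, sender)
    @batch_size = batch_size
    @sender = sender
    @buffer = []
  end

  def push(line)
    @buffer << line
    flush if @buffer.size >= @batch_size
  end

  # Send whatever is buffered, even if the batch is not full.
  def flush
    return if @buffer.empty?
    @sender.call(@buffer.join("\n"))
    @buffer.clear
  end
end

# Hypothetical sender: returns a callable that writes each payload as one datagram.
def udp_sender(host, port)
  socket = UDPSocket.new
  ->(payload) { socket.send(payload, 0, host, port) }
end

# Demo with an in-memory sender; for real traffic one might pass
# udp_sender('127.0.0.1', 8089) instead (assumed listener address).
sent = []
batcher = LineBatcher.new(2, ->(payload) { sent << payload })
batcher.push('db01_mysql_metrics Questions=34567 1476200000')
batcher.push('db01_mysql_metrics Questions=34601 1476200010')
batcher.flush
puts sent
```

Because each UDP payload has to fit in a single datagram, the batch size would need to stay small enough for the network in use.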

@eheydrick
Contributor

There's also https://github.com/jhrv/sensu-influxdb-extension which supports 1.x and batching. I'd love to see these extensions given love and made a first class citizen in the extensions repo.

@nathanhruby

We hit this on day 1 of using the mutator and fixed it, gist of our code is here:
https://gist.github.com/nathanhruby/d8a64f0505477f68ce189ad8b2a88709

Evidently getting our changes upstreamed fell through the cracks; I'll work on this next week.

@dzeleski

@nathanhruby it looks like that gist doesn't support tags, which I think is a major issue for many. I'm going to test it as is and then see if I can get tags working as well.

@nathanhruby

Yes, when I wrote that we didn't need tags as delivered by the check since we do most aggregation based on hostname and check name.

@dzeleski

OK, all good. Unfortunately we need them to correlate between app monitoring and system monitoring using client ID as a tag.
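For reference, a minimal sketch of what attaching tags to a point could look like in line protocol. The helper names and the `client_id` tag key are assumptions for illustration; the escaping follows the line-protocol rule that commas, equals signs, and spaces in tag keys and values are backslash-escaped.

```ruby
# Escape the characters that are special in line-protocol tag keys and values.
def escape_tag(text)
  text.to_s.gsub(/[,= ]/) { |char| "\\#{char}" }
end

# Hypothetical helper: build one point with a sorted tag set.
# The measurement is assumed to need no escaping in this sketch.
def with_tags(measurement, tags, field_set, timestamp)
  tag_set = tags.sort_by { |key, _| key.to_s }
                .map { |key, value| "#{escape_tag(key)}=#{escape_tag(value)}" }
  "#{([measurement] + tag_set).join(',')} #{field_set} #{timestamp}"
end

puts with_tags('mysql_metrics', { client_id: 'db01', env: 'prod east' },
               'Questions=34567', 1476200000)
```

Tagging both application and system points with the same client ID is what would allow them to be correlated at query time.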

@dzeleski

dzeleski commented Dec 1, 2016

@nathanhruby mind sharing what version of Sensu you are currently running with that UDP handler mutator? We are on 26.5 and I cannot get it to send data. Using the mutator "only_check_output" does send data, so I know the handler is functioning.

@luisdavim

We use this plugin only for the checks. To send data to InfluxDB we use https://github.com/PTC-Global/sensu-influxdb-extension. Check out the templates branch: we are working on a template mechanism that translates the metric name into tags, a measurement, and fields.
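To give a sense of what such a template mechanism might do, here is a small sketch. The template syntax is invented for illustration and is not the syntax used on the templates branch: each dot-separated template segment names the role of the metric-name segment in the same position.

```ruby
# Hypothetical template syntax (not the extension's own): "measurement" and
# "field" are reserved roles, "_" skips a segment, and any other word is
# treated as a tag key for the segment in that position.
def apply_template(template, metric_name)
  measurement = []
  field = []
  tags = {}
  template.split('.').zip(metric_name.split('.')).each do |role, segment|
    next if segment.nil? || role == '_' # name shorter than template, or skipped
    case role
    when 'measurement' then measurement << segment
    when 'field' then field << segment
    else tags[role] = segment
    end
  end
  { measurement: measurement.join('_'), tags: tags, field: field.join('_') }
end

p apply_template('host.measurement.field', 'db01.mysql.Questions')
```

Here `db01` would become the value of a `host` tag, `mysql` the measurement, and `Questions` the field key, which keeps the hostname out of the series name.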

@majormoses
Member

@luisdavim any thoughts on seeing it become a first class citizen as mentioned here: #26 (comment) ?

@luisdavim

@majormoses Just give me a repo and I'll push the code as soon as the templates branch is ready :)

@majormoses
Member

@cwjohnston can you create a repo for @luisdavim ?

@luisdavim

It looks like development on the @jhrv InfluxDB extension has been resumed by @terjesannum. Maybe it would be nice to get together and create an official community InfluxDB extension. Ours has strip_metrics and templates, and theirs has proxy mode and multiple handler support.
It would be great to combine our efforts in one central place.

@majormoses
Member

@luisdavim I got someone to create this: https://github.com/sensu-extensions/sensu-extensions-influxdb so we can start transferring it over to be official.

@luisdavim

Cool, do I have write access? Can I push the code there?

@majormoses
Member

@luisdavim no, but you can make PRs. If I had access I would give it to you; you can open an issue in the repo to ask for privileges.


7 participants