Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

node_disk_*: Keep a cumulative count of bytes read and written #37

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

stevenjm
Copy link

The node exporter maintains this as a counter of the total bytes read and written since system boot. RDS enhanced metrics do not give us this information, instead presenting the number of bytes read and written in each monitoring interval.

This means that the node exporter metrics we produce here were previously incorrect. Instead of directly exposing the information provided by the RDS metrics, maintain a counter to which we add the additional bytes read/written on each scrape.

Also, ensure that we only process each event once, as duplicate processing now means that cumulative statistics are incorrect.

Feedback wanted

This change currently does not honour per-instance labels. This is because a CounterVec does not provide a way to have different labels per Counter.

I'm not sure what the best way is to solve this. We could calculate the union set of labels on all instances and add those labels to the CounterVec. Alternatively, if there is a good way to synthesise a new metric with the value of an existing metric, we could use the CounterVec to track the cumulative values and then create new metrics with added labels for export.

A third option is that there might be a better way to implement this change altogether.

I'm interested in some feedback on what kind of solution would be accepted before I put more work into this.

The node exporter maintains this as a counter of the total bytes
read and written since system boot. RDS enhanced metrics do not
give us this information, instead presenting the number of bytes
read and written in each monitoring interval.

This means that the node exporter metrics we produce here were
previously incorrect. Instead of directly exposing the information
provided by the RDS metrics, maintain a counter to which we add the
additional bytes read/written on each scrape.

Also, ensure that we only process each event once, as duplicate
processing now means that cumulative statistics are incorrect.
@it-percona
Copy link

it-percona commented Jan 17, 2020

CLA assistant check
All committers have signed the CLA.

enhanced/metrics.go Show resolved Hide resolved
enhanced/scraper.go Show resolved Hide resolved
enhanced/scraper.go Show resolved Hide resolved
enhanced/scraper.go Show resolved Hide resolved
@@ -11,6 +11,27 @@ import (
"github.com/prometheus/client_golang/prometheus"
)

var (
diskRead = prometheus.NewCounterVec(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diskRead is a global variable (from gochecknoglobals)

},
[]string{"region", "instance", "device"},
)
diskWritten = prometheus.NewCounterVec(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diskWritten is a global variable (from gochecknoglobals)

@stevenjm
Copy link
Author

I fixed the cuddling issues golangcibot complained about.

However, the global variable usage is valid in this case as this is how CounterVec is supposed to be used. The only alternative I can see would be to add fields to sessions.Instance for these accumulated values, which would mean that these fields are defined in sessions/sessions.go for a single use case in enhanced/metrics.go. This would both violate encapsulation by making sessions need to be aware of the implementation of enhanced, and make the added code much more difficult to understand by splitting it into multiple packages.

Copy link
Member

@BupycHuk BupycHuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for you contribution.
As far as I see the problem of this implementation that it collects total number of bytes for all instances and doesn't separate metrics by instances.
Tests are failing and merge conflicts there.
Please update PR or if you can't we will.

@askomorokhov askomorokhov removed their request for review January 11, 2021 11:49
@denisok
Copy link

denisok commented Jul 27, 2021

@stevenjm is this still an issue? maybe you could update PR or we can

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants