[Azure] [container_service] Duplicated documents for the kube_node_status_condition metric #7160

Closed
tetianakravchenko opened this issue Jul 27, 2023 · 7 comments

@tetianakravchenko
Contributor

For the container_service data stream, some documents will be dropped after enabling TSDB due to duplication, mainly for the kube_node_status_condition metric, since each condition is stored in a separate document.

Example: for the same node, status, and condition there are multiple documents with the same metric value:
[Screenshot: duplicated kube_node_status_condition documents for the same node, status, and condition, 2023-07-25]
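Since the screenshot isn't reproduced here, a hypothetical illustration of two such duplicated documents (field names and values are made up for illustration) might look like:

    { "@timestamp": "2023-07-25T14:25:00Z", "node": "aks-agentpool-1", "status": "true", "condition": "Ready", "kube_node_status_condition": 1 }
    { "@timestamp": "2023-07-25T14:25:00Z", "node": "aks-agentpool-1", "status": "true", "condition": "Ready", "kube_node_status_condition": 1 }

With TSDB enabled, only one of these documents survives, since they share the timestamp and all dimension values.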

We checked this together with @zmoog and could easily reproduce this behavior.

cc @tommyers-elastic

@tommyers-elastic
Contributor

Is node status condition a dimension? If so, we will only drop identical docs, which seems OK to me?

@tetianakravchenko
Contributor Author

@tommyers-elastic yes, node, status, and condition are dimensions. The screenshot shows a "good" example; I will try to provide a different one.

I think we shouldn't move the de-duplication logic into TSDB; instead, it should live on the beats side.

@tommyers-elastic
Contributor

We are going to look into the Azure API response here to ensure we are not missing some dimensions. Once that is confirmed, we can come back to the discussion about the appropriate place to do the deduplication, if necessary.

@zmoog zmoog self-assigned this Aug 28, 2023
@zmoog zmoog added the Integration:azure Azure Logs label Aug 28, 2023
zmoog added a commit to zmoog/beats that referenced this issue Sep 3, 2023
When users define dimensions in the config, the current implementation
groups metrics by timestamp and a single dimension.

Grouping by ts + a single dimension can sometimes lead to multiple
documents with the same dimension values. This does not play well with
TSDB, because it expects all documents with the same timestamp to have
a unique combination of dimension values.

I am updating the group by dimensions logic to use all dimensions for
grouping instead of just one.

It is working fine with the test cases I am using, but this needs more
testing and understanding.

Refs: elastic/integrations#7160
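
For context, a minimal sketch of the grouping change the commit describes, written in Go with hypothetical types and names (not the actual beats code):

    package main

    import (
        "fmt"
        "sort"
        "strings"
    )

    // metric is a hypothetical sample: a timestamp plus its dimension values.
    type metric struct {
        timestamp  string
        dimensions map[string]string
    }

    // groupKey builds a key from the timestamp plus ALL dimension name/value
    // pairs (sorted for stability), so two samples end up in the same document
    // only when every dimension matches.
    func groupKey(m metric) string {
        names := make([]string, 0, len(m.dimensions))
        for name := range m.dimensions {
            names = append(names, name)
        }
        sort.Strings(names)
        parts := []string{m.timestamp}
        for _, name := range names {
            parts = append(parts, name+"="+m.dimensions[name])
        }
        return strings.Join(parts, ",")
    }

    func main() {
        samples := []metric{
            {"2023-09-03T12:00:00Z", map[string]string{"node": "aks-1", "status": "true", "condition": "Ready"}},
            {"2023-09-03T12:00:00Z", map[string]string{"node": "aks-1", "status": "true", "condition": "MemoryPressure"}},
        }
        groups := map[string][]metric{}
        for _, s := range samples {
            groups[groupKey(s)] = append(groups[groupKey(s)], s)
        }
        fmt.Println(len(groups)) // 2: one document per unique dimension combination
    }

Grouping on the full set of dimensions guarantees that, for a given timestamp, each document carries a unique combination of dimension values, which is exactly what TSDB expects.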
@zmoog
Contributor

zmoog commented Sep 3, 2023

@tetianakravchenko, I think I now have a good understanding of why we're getting multiple documents with the same dimension values.

I created the PR elastic/beats#36491 and experimented with a solution. I hope to wrap this up soon.

@zmoog
Contributor

zmoog commented Sep 3, 2023

I'm expanding the commit message a little.

The integration/model sets up dimensions in the config:

    resources:
    - resource_group: ""
      resource_type: "Microsoft.ContainerService/managedClusters"
      metrics:
      - name: ["kube_node_status_condition"]
        namespace: "Microsoft.ContainerService/managedClusters"
        ignore_unsupported: true
        timegrain: "PT5M"
        dimensions:
        - name: "node"
          value: "*"
        - name: "status"
          value: "*"
        - name: "condition"
          value: "*"

When we have dimensions, the current implementation groups the metrics response by timestamp + dimension (just one).

Grouping by timestamp + single dimension can sometimes lead to multiple documents with the same dimension values.

Here's an example where we have 10 metric values right before entering the final groupings:

[Screenshot: the 10 metric values before grouping]

At this point, the current implementation starts grouping these 10 metric values by each single dimension. Here are screenshots of the groupings:

[Screenshots: the three groupings, one per single dimension]

This grouping does not play well with TSDB because it expects all documents with the same timestamp to have a unique combination of dimension values.

Here's the result of the grouping: we end up having 1 + 2 + 10 = 13 documents in Elasticsearch:

[Screenshot: the resulting 13 documents in Elasticsearch]
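
To make the arithmetic concrete, here is a sketch in Go with made-up dimension values chosen only to reproduce the group counts above (one node, two statuses, ten distinct conditions):

    package main

    import "fmt"

    // sample is a hypothetical metric value carrying its three dimension values.
    type sample struct{ node, status, condition string }

    func main() {
        // Ten values for the same timestamp: 1 node, 2 statuses, 10 conditions.
        var values []sample
        for i := 0; i < 10; i++ {
            status := "true"
            if i%2 == 1 {
                status = "false"
            }
            values = append(values, sample{"aks-1", status, fmt.Sprintf("condition-%d", i)})
        }

        // The old logic groups the values by ONE dimension at a time and emits
        // a document per group, so the groupings overlap.
        total := 0
        for name, key := range map[string]func(sample) string{
            "node":      func(s sample) string { return s.node },
            "status":    func(s sample) string { return s.status },
            "condition": func(s sample) string { return s.condition },
        } {
            groups := map[string]bool{}
            for _, v := range values {
                groups[key(v)] = true
            }
            fmt.Printf("grouping by %s: %d document(s)\n", name, len(groups))
            total += len(groups)
        }
        fmt.Println("total:", total) // 1 + 2 + 10 = 13 documents for 10 values
    }

Several of those 13 documents end up repeating the same timestamp and dimension combinations, which is why TSDB drops some of them.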

@tetianakravchenko
Contributor Author

Hey @zmoog, thank you for the detailed explanation!

I created the PR #7160 and experimented with a solution. I hope to wrap this up soon.

did you mean this PR - elastic/beats#36491 ?

Do you think it is the same reason for another issue as well - #7621 (note: there are 2 cases)?

@zmoog
Contributor

zmoog commented Sep 4, 2023

I created the PR #7160 and experimented with a solution. I hope to wrap this up soon.

did you mean this PR - elastic/beats#36491 ?

Yep, I'm sorry for pasting the wrong link. I updated the comment with the correct one.

Do you think it is the same reason for another issue as well - #7621 (note: there are 2 cases)?

Uhm, I'm not sure, but probably not. I don't see dimensions in #7621, and this problem happens only when we have them. I added a comment to the issue.
