[Azure] [container_service] Duplicated documents for the kube_node_status_condition metric #7160

Closed
tetianakravchenko opened this issue Jul 27, 2023 · 7 comments

@tetianakravchenko
Contributor

For the container_service data stream, some documents will be dropped after enabling TSDB due to duplication, mainly for the kube_node_status_condition metric, since each condition is stored in a separate document.

Example: for the same node, status, and condition there are multiple documents with the same metric value:
[Screenshot: duplicated kube_node_status_condition documents for the same node, status, and condition, 2023-07-25]
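Since the screenshot isn't reproduced here, a hypothetical illustration of two such duplicated documents (field names and values are made up for illustration) might look like:

    { "@timestamp": "2023-07-25T14:25:00Z", "node": "aks-agentpool-1", "status": "true", "condition": "Ready", "kube_node_status_condition": 1 }
    { "@timestamp": "2023-07-25T14:25:00Z", "node": "aks-agentpool-1", "status": "true", "condition": "Ready", "kube_node_status_condition": 1 }

With TSDB enabled, only one of these documents survives, since they share the timestamp and all dimension values.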

We checked this together with @zmoog and could easily reproduce this behavior.

cc @tommyers-elastic

@tommyers-elastic
Contributor

Is node status condition a dimension? If so, we will only drop identical docs, which seems OK to me?

@tetianakravchenko
Contributor Author

@tommyers-elastic yes, node, status, and condition are dimensions. The screenshot shows a "good" example; I will try to provide a different one.

I think we shouldn't move the de-duplication logic into TSDB; instead, it should live on the beats side.

@tommyers-elastic
Contributor

We are going to look into the Azure API response here to ensure we are not missing some dimensions. Once that is confirmed, we can come back to the discussion about the appropriate place to do the deduplication, if necessary.

@zmoog zmoog self-assigned this Aug 28, 2023
@zmoog zmoog added the Integration:azure Azure Logs label Aug 28, 2023
zmoog added a commit to zmoog/beats that referenced this issue Sep 3, 2023
When users define dimensions in the config, the current implementation
groups metrics by timestamp and a single dimension.

Grouping by ts + a single dimension can sometimes lead to multiple
documents with the same dimension values. This does not play well with
TSDB, because it expects all documents with the same timestamp to have
a unique combination of dimension values.

I am updating the group by dimensions logic to use all dimensions for
grouping instead of just one.

It is working fine with the test cases I am using, but this needs more
testing and understanding.

Refs: elastic/integrations#7160
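
For context, a minimal sketch of the grouping change the commit describes, written in Go with hypothetical types and names (not the actual beats code):

    package main

    import (
        "fmt"
        "sort"
        "strings"
    )

    // metric is a hypothetical sample: a timestamp plus its dimension values.
    type metric struct {
        timestamp  string
        dimensions map[string]string
    }

    // groupKey builds a key from the timestamp plus ALL dimension name/value
    // pairs (sorted for stability), so two samples end up in the same document
    // only when every dimension matches.
    func groupKey(m metric) string {
        names := make([]string, 0, len(m.dimensions))
        for name := range m.dimensions {
            names = append(names, name)
        }
        sort.Strings(names)
        parts := []string{m.timestamp}
        for _, name := range names {
            parts = append(parts, name+"="+m.dimensions[name])
        }
        return strings.Join(parts, ",")
    }

    func main() {
        samples := []metric{
            {"2023-09-03T12:00:00Z", map[string]string{"node": "aks-1", "status": "true", "condition": "Ready"}},
            {"2023-09-03T12:00:00Z", map[string]string{"node": "aks-1", "status": "true", "condition": "MemoryPressure"}},
        }
        groups := map[string][]metric{}
        for _, s := range samples {
            groups[groupKey(s)] = append(groups[groupKey(s)], s)
        }
        fmt.Println(len(groups)) // 2: one document per unique dimension combination
    }

Grouping on the full set of dimensions guarantees that, for a given timestamp, each document carries a unique combination of dimension values, which is exactly what TSDB expects.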
@zmoog
Contributor

zmoog commented Sep 3, 2023

@tetianakravchenko, I think I now have a good understanding of why we're getting multiple documents with the same dimension values.

I created the PR elastic/beats#36491 and experimented with a solution. I hope to wrap this up soon.

@zmoog
Contributor

zmoog commented Sep 3, 2023

I'm expanding the commit message a little.

The integration/model sets up dimensions in the config:

    resources:
    - resource_group: ""
      resource_type: "Microsoft.ContainerService/managedClusters"
      metrics:
      - name: ["kube_node_status_condition"]
        namespace: "Microsoft.ContainerService/managedClusters"
        ignore_unsupported: true
        timegrain: "PT5M"
        dimensions:
        - name: "node"
          value: "*"
        - name: "status"
          value: "*"
        - name: "condition"
          value: "*"

When we have dimensions, the current implementation groups the metrics response by timestamp + dimension (just one).

Grouping by timestamp + single dimension can sometimes lead to multiple documents with the same dimension values.

Here's an example where we have 10 metric values right before entering the final groupings:

[Screenshot: the 10 metric values before grouping]

At this point, the current implementation starts grouping these 10 metric values by each single dimension. Here are screenshots of the groupings:

[Screenshots: the three groupings, one per single dimension]

This grouping does not play well with TSDB because it expects all documents with the same timestamp to have a unique combination of dimension values.

Here's the result of the grouping: we end up having 1 + 2 + 10 = 13 documents in Elasticsearch:

[Screenshot: the resulting 13 documents in Elasticsearch]
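
To make the arithmetic concrete, here is a sketch in Go with made-up dimension values chosen only to reproduce the group counts above (one node, two statuses, ten distinct conditions):

    package main

    import "fmt"

    // sample is a hypothetical metric value carrying its three dimension values.
    type sample struct{ node, status, condition string }

    func main() {
        // Ten values for the same timestamp: 1 node, 2 statuses, 10 conditions.
        var values []sample
        for i := 0; i < 10; i++ {
            status := "true"
            if i%2 == 1 {
                status = "false"
            }
            values = append(values, sample{"aks-1", status, fmt.Sprintf("condition-%d", i)})
        }

        // The old logic groups the values by ONE dimension at a time and emits
        // a document per group, so the groupings overlap.
        total := 0
        for name, key := range map[string]func(sample) string{
            "node":      func(s sample) string { return s.node },
            "status":    func(s sample) string { return s.status },
            "condition": func(s sample) string { return s.condition },
        } {
            groups := map[string]bool{}
            for _, v := range values {
                groups[key(v)] = true
            }
            fmt.Printf("grouping by %s: %d document(s)\n", name, len(groups))
            total += len(groups)
        }
        fmt.Println("total:", total) // 1 + 2 + 10 = 13 documents for 10 values
    }

Several of those 13 documents end up repeating the same timestamp and dimension combinations, which is why TSDB drops some of them.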

@tetianakravchenko
Contributor Author

Hey @zmoog, thank you for the detailed explanation!

I created the PR #7160 and experimented with a solution. I hope to wrap this up soon.

did you mean this PR - elastic/beats#36491 ?

Do you think it is the same reason for another issue as well - #7621 (note: there are 2 cases)?

@zmoog
Contributor

zmoog commented Sep 4, 2023

I created the PR #7160 and experimented with a solution. I hope to wrap this up soon.

did you mean this PR - elastic/beats#36491 ?

Yep, I'm sorry for pasting the wrong link. I updated the comment with the correct one.

Do you think it is the same reason for another issue as well - #7621 (note: there are 2 cases)?

Uhm, I'm not sure, but probably not. I don't see dimensions in #7621, and this problem happens only when we have them. I added a comment to the issue.
