[Inventory] Alerts not matching to K8s entities #202355

Closed
roshan-elastic opened this issue Nov 29, 2024 · 4 comments
Labels
Team:obs-ux-infra_services Observability Infrastructure & Services User Experience Team



roshan-elastic commented Nov 29, 2024

Description

Alerts grouped by the k8s entity ID are either missing from the k8s entities in the Inventory or shown against the wrong entity.

Steps to Replicate

  1. Incorrect alerts showing (video: incorrect.alerts.-.k8s.entities.mp4)
  2. Alerts missing (video: missing.alerts.-.k8s.entities.mp4)
@roshan-elastic roshan-elastic added the Team:obs-ux-infra_services Observability Infrastructure & Services User Experience Team label Nov 29, 2024
@elasticmachine
Contributor

Pinging @elastic/obs-ux-infra_services-team (Team:obs-ux-infra_services)

@roshan-elastic
Author

cc @kpatticha - You can test this on edge-logs if you want to quickly replicate

@jennypavlova jennypavlova self-assigned this Dec 2, 2024
@jennypavlova
Member

Hi @roshan-elastic
I am trying to understand the issue better now and tried it on the edge logs cluster.

First issue:

To summarize: in some cases (such as k8s.cluster.ecs) we see fewer alerts in the Inventory than on the Alerts page we reach by clicking the alerts link. The difference comes from the filter. On the rule page you showed, the filter is orchestrator.cluster.name: "edge-log", but the filter field we pass along contains only "edge-log". So when we get the count we use the filter, and when we navigate to the alerts only the value ("edge-log") is set in the search bar:

Inventory -> Alerts page after count click: is that the expected filter?
Image Image Image

I am asking because I saw that we removed the field on purpose as part of #202188 (last table entry in the PR description). Do we want it back to fix this, or is there something else I am missing here 🤔?
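To make the mismatch concrete, here is a minimal sketch of the two filters. It is a simplified model, not real KQL evaluation, and the sample alert documents (including the field `example.other.field`) are made up for illustration. It only shows how a field-scoped filter and a value-only search can disagree on the count.

```python
# Simplified model of the two filters discussed above. This is NOT real KQL
# evaluation; it only illustrates why the two counts can differ.

def matches_field(alert: dict, field: str, value: str) -> bool:
    """Field-scoped filter, e.g. orchestrator.cluster.name: "edge-log"."""
    return alert.get(field) == value


def matches_free_text(alert: dict, value: str) -> bool:
    """Value-only search: here, a match if any field holds exactly the value."""
    return any(v == value for v in alert.values())


# Hypothetical alert documents; the second one carries the cluster name in an
# unrelated, made-up field.
SAMPLE_ALERTS = [
    {"kibana.alert.status": "active", "orchestrator.cluster.name": "edge-log"},
    {"kibana.alert.status": "active", "host.name": "node-1",
     "example.other.field": "edge-log"},
    {"kibana.alert.status": "active", "host.name": "node-2"},
]

inventory_count = sum(
    matches_field(a, "orchestrator.cluster.name", "edge-log") for a in SAMPLE_ALERTS
)
alerts_page_count = sum(matches_free_text(a, "edge-log") for a in SAMPLE_ALERTS)

print(inventory_count, alerts_page_count)
```

With this sample data the field-scoped count is lower than the value-only count, which is the same shape of discrepancy as in the screenshots.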


Second issue:

I checked the data and compared the rules. The only difference is that this rule uses the metrics data view and not the logs one we have for the cluster rule. I am not super familiar with custom threshold rules, but in theory it should work with different data views (we have the correct mapping 'k8s.deployment.ecs' => 'kubernetes.deployment.name' in Inventory), so I assume something is wrong with the filtering. I ran both queries against the alerts indices we use in Inventory to get the alerts count. The host one, for example, returns results, but the deployment one doesn't:

Image

But on the alerts page I see the alerts:

Image

⚠ Not with the field filter (kubernetes.deployment.name : *) tho:

Image

So maybe this is something to check with the @elastic/response-ops team 🤔.
This is the alert rule: logs cluster link (it's also shown in the second video in the description).

The queries from the screenshot:

# no results
GET .alerts-*/_search
{"size":0,"track_total_hits":false,"query":{"bool":{"filter":[{"term":{"kibana.alert.status":"active"}}]}},"aggs":{"k8s.deployment.ecs":{"composite":{"size":500,"sources":[{"kubernetes.deployment.name":{"terms":{"field":"kubernetes.deployment.name"}}}]}}}}

# returns results
GET .alerts-*/_search
{"size":0,"track_total_hits":false,"query":{"bool":{"filter":[{"term":{"kibana.alert.status":"active"}}]}},"aggs":{"host":{"composite":{"size":500,"sources":[{"host.name":{"terms":{"field":"host.name"}}}]}}}}
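The two queries differ only in the aggregation name and the identity field. A small sketch that generates the request body from the entity type makes that easier to see and to try with other entity types. The function name `build_alert_count_query` and the `IDENTITY_FIELDS` table are hypothetical helpers written for this comment, not the actual Inventory code; the cluster entry in particular is an assumption inferred from the rule filter mentioned earlier.

```python
import json

# Entity type -> identity field. The host and deployment entries come from the
# queries above; the cluster entry is an assumption based on the rule filter.
IDENTITY_FIELDS = {
    "host": "host.name",
    "k8s.deployment.ecs": "kubernetes.deployment.name",
    "k8s.cluster.ecs": "orchestrator.cluster.name",
}


def build_alert_count_query(entity_type: str, page_size: int = 500) -> dict:
    """Build the body for GET .alerts-*/_search for one entity type."""
    field = IDENTITY_FIELDS[entity_type]
    return {
        "size": 0,
        "track_total_hits": False,
        # Only count alerts that are currently active.
        "query": {"bool": {"filter": [{"term": {"kibana.alert.status": "active"}}]}},
        # One composite bucket per distinct value of the identity field.
        "aggs": {
            entity_type: {
                "composite": {
                    "size": page_size,
                    "sources": [{field: {"terms": {"field": field}}}],
                }
            }
        },
    }


host_body = json.dumps(build_alert_count_query("host"), separators=(",", ":"))
deployment_body = json.dumps(
    build_alert_count_query("k8s.deployment.ecs"), separators=(",", ":")
)
print(deployment_body)
```

Since the generated bodies are otherwise identical, an empty result for the deployment query suggests the active alert documents may simply not carry kubernetes.deployment.name, rather than the query shape being wrong.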

@roshan-elastic
Author

roshan-elastic commented Dec 2, 2024

Hey @jennypavlova,

cc @simianhacker

Thanks for picking this up.

First issue:

Ah - good catch here. This predates the simplification of the alerts filtering to filter by just the string of the entity name. Either way, I have no idea why the 'host' rules are matching the cluster as I'm unable to replicate this with other rules.

Here's a vid to show you:

Video

I have a hypothesis that filtering to only show alerts using kibana.alert.group.field : {ID FIELD} would do the trick:

Alerts for a host
Image

Alerts for a kubernetes pod
Image

I'm not 100% sure this is the right solution, but I think it would ensure that we only show alerts that are explicitly grouped by the entity we're showing.
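As a rough sketch of that hypothesis: the helper below builds a search-bar filter from the ID field and, as an extra assumption on my part, also pins kibana.alert.group.value to the entity name. The function names and the sample alert documents are hypothetical, and the in-memory check is only a simplified stand-in for how the alerts app would evaluate the filter.

```python
def build_group_filter(id_field: str, entity_name: str) -> str:
    """Filter string that only matches alerts grouped by the given ID field.

    Pinning kibana.alert.group.value as well is an assumption, not something
    confirmed in this thread.
    """
    return (
        f'kibana.alert.group.field: "{id_field}" '
        f'and kibana.alert.group.value: "{entity_name}"'
    )


def is_grouped_by(alert: dict, id_field: str, entity_name: str) -> bool:
    """Simplified in-memory check of the same condition."""
    groups = alert.get("kibana.alert.group", [])
    return any(
        g.get("field") == id_field and g.get("value") == entity_name for g in groups
    )


# Hypothetical alerts: one grouped by host, one grouped by cluster.
SAMPLE_GROUPED_ALERTS = [
    {"kibana.alert.group": [{"field": "host.name", "value": "node-1"}]},
    {"kibana.alert.group": [{"field": "orchestrator.cluster.name", "value": "edge-logs"}]},
]

cluster_alerts = [
    a
    for a in SAMPLE_GROUPED_ALERTS
    if is_grouped_by(a, "orchestrator.cluster.name", "edge-logs")
]
print(build_group_filter("orchestrator.cluster.name", "edge-logs"))
print(len(cluster_alerts))
```

In this model the host-grouped alert no longer shows up against the cluster, which is the behaviour I'd expect users to find least surprising.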

I like the elegance of just filtering by the entity you're looking for, but I personally don't understand why those host alerts are showing against the cluster (or why the pod alerts in the same cluster don't show up in the alerts app if I simply filter by the cluster name, edge-logs).

I figure if I can't figure it out, our users aren't going to either.

Any thoughts/ideas?

@roshan-elastic roshan-elastic changed the title [Inventory] Alerts not matching to K8s entities [stub] [Inventory] Alerts not matching to K8s entities Dec 4, 2024