[Synthetics] Improve overview page performance !! #201275

Merged: 33 commits, Dec 11, 2024

Conversation


@shahzad31 shahzad31 commented Nov 21, 2024

Summary

Improve overview page performance !!

Right now the UI works for a few hundred to 1,000 monitors, but it starts degrading after that. This PR refactors the queries so that the overview page scales up to 10k-20k monitors easily.

Queries before

Before this PR, we ran a two-step query: first fetch all saved objects, then fetch all summary documents by passing all the IDs from the first phase. This meant that with, say, 20k saved objects, we first had to page through all of them before we could even start fetching summaries. To fetch summary documents, we were using a `top_hits` query, which can be memory-expensive.
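To make the cost of the old flow concrete, here is a minimal TypeScript sketch of the two-step approach described above. It is an illustration under assumptions, not the code this PR removed: `findPage`, `searchLatest`, `buildTopHitsQuery`, the page size of 1000, and the field names `monitor.id`, `monitor.status`, and `@timestamp` are all hypothetical.

```typescript
// Hypothetical shapes for illustration; the real Kibana types and clients differ.
interface SavedMonitor {
  id: string;
  name: string;
}

type FindPage = (
  page: number,
  perPage: number
) => Promise<{ items: SavedMonitor[]; total: number }>;
// Assumed helper: runs the search and returns monitor ID -> latest status.
type SearchLatest = (body: object) => Promise<Map<string, string>>;

// Step 2 request: restrict to the collected IDs, then take the newest
// summary document per monitor with a top_hits sub-aggregation.
function buildTopHitsQuery(monitorIds: string[]): object {
  return {
    size: 0,
    query: { bool: { filter: [{ terms: { 'monitor.id': monitorIds } }] } },
    aggs: {
      monitors: {
        terms: { field: 'monitor.id', size: Math.max(monitorIds.length, 1) },
        aggs: {
          latest: {
            top_hits: {
              size: 1,
              sort: [{ '@timestamp': { order: 'desc' } }],
              _source: ['monitor.status'],
            },
          },
        },
      },
    },
  };
}

// Old flow: the summary search cannot start until every page has been read.
async function fetchStatusesTwoStep(
  findPage: FindPage,
  searchLatest: SearchLatest,
  perPage = 1000
): Promise<Array<SavedMonitor & { status: string }>> {
  const monitors: SavedMonitor[] = [];
  for (let page = 1; ; page++) {
    const { items, total } = await findPage(page, perPage);
    monitors.push(...items);
    if (items.length === 0 || monitors.length >= total) break;
  }
  const statuses = await searchLatest(buildTopHitsQuery(monitors.map((m) => m.id)));
  return monitors.map((m) => ({ ...m, status: statuses.get(m.id) ?? 'unknown' }));
}
```

Under these assumptions, 20k saved objects at 1000 per page means 20 sequential page requests before the summary search can even begin.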

Queries now

In this PR we fetch summaries and saved objects in parallel. Since the summary documents carry the space ID as well, there was no need for a two-step query. Once both requests return, we hydrate saved object data from summary data. We also now use a `top_metrics` query instead of `top_hits` to fetch each monitor's status.
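The new flow can be sketched roughly as follows. This is a minimal TypeScript illustration under assumptions, not the code in this PR: `fetchOverview`, `hydrate`, `buildTopMetricsQuery`, the `meta.space_id`, `monitor.id`, and `monitor.status` field names, the bucket size of 10000, and the `pending` fallback status are all hypothetical.

```typescript
// Hypothetical shapes for illustration; the real Kibana types and clients differ.
interface SavedMonitor {
  id: string;
  name: string;
  tags: string[];
}

interface LatestSummary {
  monitorId: string;
  status: 'up' | 'down';
  timestamp: string;
}

// Filter by space ID instead of by a list of monitor IDs, and read only the
// latest status per monitor with top_metrics rather than whole documents.
// Paging over more buckets than `size` is omitted from this sketch.
function buildTopMetricsQuery(spaceId: string): object {
  return {
    size: 0,
    query: { bool: { filter: [{ term: { 'meta.space_id': spaceId } }] } },
    aggs: {
      monitors: {
        terms: { field: 'monitor.id', size: 10000 },
        aggs: {
          latest: {
            top_metrics: {
              metrics: [{ field: 'monitor.status' }],
              sort: { '@timestamp': 'desc' },
              size: 1,
            },
          },
        },
      },
    },
  };
}

// Join the two result sets by monitor ID; monitors without a summary yet
// are reported as 'pending' (an assumption of this sketch).
function hydrate(monitors: SavedMonitor[], summaries: LatestSummary[]) {
  const byId = new Map(summaries.map((s) => [s.monitorId, s] as const));
  return monitors.map((m) => ({ ...m, status: byId.get(m.id)?.status ?? 'pending' }));
}

async function fetchOverview(
  fetchMonitors: () => Promise<SavedMonitor[]>,
  fetchSummaries: (body: object) => Promise<LatestSummary[]>,
  spaceId: string
) {
  // Neither request needs the other's result, so both start immediately.
  const [monitors, summaries] = await Promise.all([
    fetchMonitors(),
    fetchSummaries(buildTopMetricsQuery(spaceId)),
  ]);
  return hydrate(monitors, summaries);
}
```

The design point is that the summary search no longer waits on saved object paging, so total latency is roughly that of the slower of the two requests rather than their sum.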

I tested on about 20k monitors; the app performs reasonably well after the PR.
image

On a very slow setup, with Kibana running locally against a remote cluster:

After

image

Before

image

@shahzad31 shahzad31 marked this pull request as ready for review November 27, 2024 14:23
@shahzad31 shahzad31 requested a review from a team as a code owner November 27, 2024 14:23
@shahzad31 shahzad31 added the release_note:skip and backport:prev-minor labels Nov 27, 2024

@justinkambic justinkambic left a comment


The functionality looks good to me. I have some recommended code changes/improvements. I'll let @dominiqueclarke leave a review as well, when she's available, before approving.

@botelastic botelastic bot added the ci:project-deploy-observability and Team:obs-ux-management labels Nov 27, 2024
@elasticmachine

Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)


🤖 GitHub comments

Just comment with:

  • /oblt-deploy : Deploy a Kibana instance using the Observability test environments.
  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@shahzad31

/oblt-deploy


@justinkambic justinkambic left a comment


LGTM

@elasticmachine

elasticmachine commented Dec 11, 2024

💚 Build Succeeded

  • Buildkite Build
  • Commit: d9894a9
  • Kibana Serverless Image: docker.elastic.co/kibana-ci/kibana-serverless:pr-201275-d9894a91010f

Metrics [docs]

✅ unchanged

History

@shahzad31 shahzad31 merged commit b4ccb0c into elastic:main Dec 11, 2024
8 checks passed
@shahzad31 shahzad31 deleted the overview-status-performance branch December 11, 2024 19:33
@kibanamachine

Starting backport for target branches: 8.x

https://github.com/elastic/kibana/actions/runs/12283590228

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Dec 11, 2024
Co-authored-by: kibanamachine <[email protected]>
(cherry picked from commit b4ccb0c)
@kibanamachine

💚 All backports created successfully

Status Branch Result
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

CAWilson94 pushed a commit to CAWilson94/kibana that referenced this pull request Dec 12, 2024
Co-authored-by: kibanamachine <[email protected]>
kibanamachine added a commit that referenced this pull request Dec 12, 2024
…03892)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[Synthetics] Improve overview page performance !!
(#201275)](#201275)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)


Co-authored-by: Shahzad <[email protected]>
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this pull request Jan 13, 2025
Co-authored-by: kibanamachine <[email protected]>
Labels
  • backport:prev-minor: Backport to (9.0) the previous minor version (i.e. one version back from main)
  • ci:project-deploy-observability: Create an Observability project
  • release_note:skip: Skip the PR/issue when compiling release notes
  • Team:obs-ux-management: Observability Management User Experience Team
  • v8.18.0
  • v9.0.0
4 participants