Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implements backoff delay in usage counters service #206363

Merged

Conversation

TinaHeiligers
Copy link
Contributor

@TinaHeiligers TinaHeiligers commented Jan 11, 2025

fix #192829

Summary

Usage counters lose counts when there's a version conflict exception to increment counters.
ATM, the service retries incrementCounter immediately on error, which could also fail if another Kibana tries at the same time.
Using a backoff retry reduces the likelihood of a version conflict exception.  

This PR replaces the retry with an exponential backoffDelay that will wait longer between retry attempts.

@TinaHeiligers TinaHeiligers added release_note:skip Skip the PR/issue when compiling release notes backport:all-open Backport to all branches that could still receive a release labels Jan 11, 2025
@kibanamachine
Copy link
Contributor

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#7701

[✅] test/plugin_functional/config.ts: 25/25 tests passed.

see run history

@TinaHeiligers TinaHeiligers marked this pull request as ready for review January 11, 2025 16:32
@TinaHeiligers TinaHeiligers requested a review from a team as a code owner January 11, 2025 16:32
Copy link
Member

@afharo afharo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny comments mostly on our tests

@TinaHeiligers
Copy link
Contributor Author

@elasticmachine merge upstream

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

@kibanamachine
Copy link
Contributor

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#7715

[✅] test/plugin_functional/config.ts: 50/50 tests passed.

see run history

@TinaHeiligers
Copy link
Contributor Author

@elasticmachine merge upstream

kibana: {
usageCounters: {
results: [mockError, 'pass'],
// requires extended test runtime because of exponential backoff delay for retries
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For our future selves, to remind us why this specific test is allowed to run longer than 5s.

Copy link
Contributor Author

@TinaHeiligers TinaHeiligers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self review

Copy link
Member

@afharo afharo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@TinaHeiligers TinaHeiligers enabled auto-merge (squash) January 15, 2025 17:56
@TinaHeiligers TinaHeiligers merged commit fd3b4ed into elastic:main Jan 15, 2025
8 checks passed
@kibanamachine
Copy link
Contributor

Starting backport for target branches: 7.17, 8.16, 8.17, 8.x

https://github.com/elastic/kibana/actions/runs/12794928570

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Jan 15, 2025
fix elastic#192829

## Summary

Usage counters lose counts when there's a version conflict exception to
increment counters.
ATM, the service retries `incrementCounter` immediately on error, which
could also fail if another Kibana tries at the same time.
Using a backoff retry reduces the likelihood of a version conflict
exception.  

This PR replaces the retry with an exponential `backoffDelay` that will
wait longer between retry attempts.

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
(https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7701)

---------

Co-authored-by: Elastic Machine <[email protected]>
(cherry picked from commit fd3b4ed)
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Jan 15, 2025
fix elastic#192829

## Summary

Usage counters lose counts when there's a version conflict exception to
increment counters.
ATM, the service retries `incrementCounter` immediately on error, which
could also fail if another Kibana tries at the same time.
Using a backoff retry reduces the likelihood of a version conflict
exception.  

This PR replaces the retry with an exponential `backoffDelay` that will
wait longer between retry attempts.

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
(https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7701)

---------

Co-authored-by: Elastic Machine <[email protected]>
(cherry picked from commit fd3b4ed)
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Jan 15, 2025
fix elastic#192829

## Summary

Usage counters lose counts when there's a version conflict exception to
increment counters.
ATM, the service retries `incrementCounter` immediately on error, which
could also fail if another Kibana tries at the same time.
Using a backoff retry reduces the likelihood of a version conflict
exception.  

This PR replaces the retry with an exponential `backoffDelay` that will
wait longer between retry attempts.

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
(https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7701)

---------

Co-authored-by: Elastic Machine <[email protected]>
(cherry picked from commit fd3b4ed)
@kibanamachine
Copy link
Contributor

💔 Some backports could not be created

Status Branch Result
7.17 Backport failed because of merge conflicts
8.16
8.17
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Manual backport

To create the backport manually run:

node scripts/backport --pr 206363

Questions ?

Please refer to the Backport tool documentation

@TinaHeiligers
Copy link
Contributor Author

backport to 7.17 not needed.

kibanamachine added a commit that referenced this pull request Jan 15, 2025
…206852)

# Backport

This will backport the following commits from `main` to `8.x`:
- [implements backoff delay in usage counters service
(#206363)](#206363)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Christiane (Tina)
Heiligers","email":"[email protected]"},"sourceCommit":{"committedDate":"2025-01-15T18:42:44Z","message":"implements
backoff delay in usage counters service (#206363)\n\nfix
https://github.com/elastic/kibana/issues/192829\r\n\r\n##
Summary\r\n\r\nUsage counters lose counts when there's a version
conflict exception to\r\nincrement counters.\r\nATM, the service retries
`incrementCounter` immediately on error, which\r\ncould also fail if
another Kibana tries at the same time.\r\nUsing a backoff retry reduces
the likelihood of a version conflict\r\nexception.  \r\n\r\nThis PR
replaces the retry with an exponential `backoffDelay` that will\r\nwait
longer between retry attempts.\r\n\r\n\r\n- [x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common scenarios\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests
changed\r\n(https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7701)\r\n\r\n---------\r\n\r\nCo-authored-by:
Elastic Machine
<[email protected]>","sha":"fd3b4ed590006212855c1a80994649eacbf41f31","branchLabelMapping":{"^v9.0.0$":"main","^v8.18.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","v9.0.0","backport:all-open"],"title":"implements
backoff delay in usage counters
service","number":206363,"url":"https://github.com/elastic/kibana/pull/206363","mergeCommit":{"message":"implements
backoff delay in usage counters service (#206363)\n\nfix
https://github.com/elastic/kibana/issues/192829\r\n\r\n##
Summary\r\n\r\nUsage counters lose counts when there's a version
conflict exception to\r\nincrement counters.\r\nATM, the service retries
`incrementCounter` immediately on error, which\r\ncould also fail if
another Kibana tries at the same time.\r\nUsing a backoff retry reduces
the likelihood of a version conflict\r\nexception.  \r\n\r\nThis PR
replaces the retry with an exponential `backoffDelay` that will\r\nwait
longer between retry attempts.\r\n\r\n\r\n- [x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common scenarios\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests
changed\r\n(https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7701)\r\n\r\n---------\r\n\r\nCo-authored-by:
Elastic Machine
<[email protected]>","sha":"fd3b4ed590006212855c1a80994649eacbf41f31"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/206363","number":206363,"mergeCommit":{"message":"implements
backoff delay in usage counters service (#206363)\n\nfix
https://github.com/elastic/kibana/issues/192829\r\n\r\n##
Summary\r\n\r\nUsage counters lose counts when there's a version
conflict exception to\r\nincrement counters.\r\nATM, the service retries
`incrementCounter` immediately on error, which\r\ncould also fail if
another Kibana tries at the same time.\r\nUsing a backoff retry reduces
the likelihood of a version conflict\r\nexception.  \r\n\r\nThis PR
replaces the retry with an exponential `backoffDelay` that will\r\nwait
longer between retry attempts.\r\n\r\n\r\n- [x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common scenarios\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests
changed\r\n(https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7701)\r\n\r\n---------\r\n\r\nCo-authored-by:
Elastic Machine
<[email protected]>","sha":"fd3b4ed590006212855c1a80994649eacbf41f31"}}]}]
BACKPORT-->

Co-authored-by: Christiane (Tina) Heiligers <[email protected]>
kibanamachine added a commit that referenced this pull request Jan 15, 2025
…206851)

# Backport

This will backport the following commits from `main` to `8.17`:
- [implements backoff delay in usage counters service
(#206363)](#206363)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Christiane (Tina)
Heiligers","email":"[email protected]"},"sourceCommit":{"committedDate":"2025-01-15T18:42:44Z","message":"implements
backoff delay in usage counters service (#206363)\n\nfix
https://github.com/elastic/kibana/issues/192829\r\n\r\n##
Summary\r\n\r\nUsage counters lose counts when there's a version
conflict exception to\r\nincrement counters.\r\nATM, the service retries
`incrementCounter` immediately on error, which\r\ncould also fail if
another Kibana tries at the same time.\r\nUsing a backoff retry reduces
the likelihood of a version conflict\r\nexception.  \r\n\r\nThis PR
replaces the retry with an exponential `backoffDelay` that will\r\nwait
longer between retry attempts.\r\n\r\n\r\n- [x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common scenarios\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests
changed\r\n(https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7701)\r\n\r\n---------\r\n\r\nCo-authored-by:
Elastic Machine
<[email protected]>","sha":"fd3b4ed590006212855c1a80994649eacbf41f31","branchLabelMapping":{"^v9.0.0$":"main","^v8.18.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","v9.0.0","backport:all-open"],"title":"implements
backoff delay in usage counters
service","number":206363,"url":"https://github.com/elastic/kibana/pull/206363","mergeCommit":{"message":"implements
backoff delay in usage counters service (#206363)\n\nfix
https://github.com/elastic/kibana/issues/192829\r\n\r\n##
Summary\r\n\r\nUsage counters lose counts when there's a version
conflict exception to\r\nincrement counters.\r\nATM, the service retries
`incrementCounter` immediately on error, which\r\ncould also fail if
another Kibana tries at the same time.\r\nUsing a backoff retry reduces
the likelihood of a version conflict\r\nexception.  \r\n\r\nThis PR
replaces the retry with an exponential `backoffDelay` that will\r\nwait
longer between retry attempts.\r\n\r\n\r\n- [x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common scenarios\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests
changed\r\n(https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7701)\r\n\r\n---------\r\n\r\nCo-authored-by:
Elastic Machine
<[email protected]>","sha":"fd3b4ed590006212855c1a80994649eacbf41f31"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/206363","number":206363,"mergeCommit":{"message":"implements
backoff delay in usage counters service (#206363)\n\nfix
https://github.com/elastic/kibana/issues/192829\r\n\r\n##
Summary\r\n\r\nUsage counters lose counts when there's a version
conflict exception to\r\nincrement counters.\r\nATM, the service retries
`incrementCounter` immediately on error, which\r\ncould also fail if
another Kibana tries at the same time.\r\nUsing a backoff retry reduces
the likelihood of a version conflict\r\nexception.  \r\n\r\nThis PR
replaces the retry with an exponential `backoffDelay` that will\r\nwait
longer between retry attempts.\r\n\r\n\r\n- [x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common scenarios\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests
changed\r\n(https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7701)\r\n\r\n---------\r\n\r\nCo-authored-by:
Elastic Machine
<[email protected]>","sha":"fd3b4ed590006212855c1a80994649eacbf41f31"}}]}]
BACKPORT-->

Co-authored-by: Christiane (Tina) Heiligers <[email protected]>
kibanamachine added a commit that referenced this pull request Jan 15, 2025
…206850)

# Backport

This will backport the following commits from `main` to `8.16`:
- [implements backoff delay in usage counters service
(#206363)](#206363)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Christiane (Tina)
Heiligers","email":"[email protected]"},"sourceCommit":{"committedDate":"2025-01-15T18:42:44Z","message":"implements
backoff delay in usage counters service (#206363)\n\nfix
https://github.com/elastic/kibana/issues/192829\r\n\r\n##
Summary\r\n\r\nUsage counters lose counts when there's a version
conflict exception to\r\nincrement counters.\r\nATM, the service retries
`incrementCounter` immediately on error, which\r\ncould also fail if
another Kibana tries at the same time.\r\nUsing a backoff retry reduces
the likelihood of a version conflict\r\nexception.  \r\n\r\nThis PR
replaces the retry with an exponential `backoffDelay` that will\r\nwait
longer between retry attempts.\r\n\r\n\r\n- [x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common scenarios\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests
changed\r\n(https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7701)\r\n\r\n---------\r\n\r\nCo-authored-by:
Elastic Machine
<[email protected]>","sha":"fd3b4ed590006212855c1a80994649eacbf41f31","branchLabelMapping":{"^v9.0.0$":"main","^v8.18.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","v9.0.0","backport:all-open"],"title":"implements
backoff delay in usage counters
service","number":206363,"url":"https://github.com/elastic/kibana/pull/206363","mergeCommit":{"message":"implements
backoff delay in usage counters service (#206363)\n\nfix
https://github.com/elastic/kibana/issues/192829\r\n\r\n##
Summary\r\n\r\nUsage counters lose counts when there's a version
conflict exception to\r\nincrement counters.\r\nATM, the service retries
`incrementCounter` immediately on error, which\r\ncould also fail if
another Kibana tries at the same time.\r\nUsing a backoff retry reduces
the likelihood of a version conflict\r\nexception.  \r\n\r\nThis PR
replaces the retry with an exponential `backoffDelay` that will\r\nwait
longer between retry attempts.\r\n\r\n\r\n- [x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common scenarios\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests
changed\r\n(https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7701)\r\n\r\n---------\r\n\r\nCo-authored-by:
Elastic Machine
<[email protected]>","sha":"fd3b4ed590006212855c1a80994649eacbf41f31"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/206363","number":206363,"mergeCommit":{"message":"implements
backoff delay in usage counters service (#206363)\n\nfix
https://github.com/elastic/kibana/issues/192829\r\n\r\n##
Summary\r\n\r\nUsage counters lose counts when there's a version
conflict exception to\r\nincrement counters.\r\nATM, the service retries
`incrementCounter` immediately on error, which\r\ncould also fail if
another Kibana tries at the same time.\r\nUsing a backoff retry reduces
the likelihood of a version conflict\r\nexception.  \r\n\r\nThis PR
replaces the retry with an exponential `backoffDelay` that will\r\nwait
longer between retry attempts.\r\n\r\n\r\n- [x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common scenarios\r\n- [x] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests
changed\r\n(https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7701)\r\n\r\n---------\r\n\r\nCo-authored-by:
Elastic Machine
<[email protected]>","sha":"fd3b4ed590006212855c1a80994649eacbf41f31"}}]}]
BACKPORT-->

Co-authored-by: Christiane (Tina) Heiligers <[email protected]>
@TinaHeiligers TinaHeiligers deleted the kbn192829/usage-counters-backoff-retry branch January 15, 2025 21:14
@mistic mistic added v8.17.2 and removed v8.17.1 labels Jan 21, 2025
viduni94 pushed a commit to viduni94/kibana that referenced this pull request Jan 23, 2025
fix elastic#192829

## Summary

Usage counters lose counts when there's a version conflict exception to
increment counters.
ATM, the service retries `incrementCounter` immediately on error, which
could also fail if another Kibana tries at the same time.
Using a backoff retry reduces the likelihood of a version conflict
exception.  

This PR replaces the retry with an exponential `backoffDelay` that will
wait longer between retry attempts.


- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
(https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/7701)

---------

Co-authored-by: Elastic Machine <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:all-open Backport to all branches that could still receive a release release_note:skip Skip the PR/issue when compiling release notes v8.16.3 v8.17.2 v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

UsageCounters can lose counts due to version_conflict_engine_exception
5 participants