[for analysis] Not working airmail #5217

Closed
wants to merge 24 commits into from

Conversation

fulmicoton
Contributor

No description provided.

guilload and others added 24 commits June 28, 2024 10:18
* also remove excess hits when removing skipped hits

* add mock-test
* Further optimization of validation.

This uses serde_json_borrow to avoid most allocation,
copying, and inserting in hashmap as we deserialize documents.

Before:
validation is taking 10.25% of the CPU.
After:
validation is taking 5.9% of the CPU.

* CR comment. changed error message
includes cardinality aggregation and term aggregation perf improvement
for large "size" parameters
The piece that estimates whether the next request is likely to fail is extremely simplistic for the moment.
It simply counts the number of errors (not taking successes into account) that happened in a given time window.

The reason is that for the moment, we want to use it for persist requests when the WAL is full.
On airmail, the aggressive retry logic of the client was causing a massive gRPC storm on the faulty indexer node,
taking all of its CPU and preventing it from getting out of that state.

In this case, the error estimation logic is very simple: a full WAL guarantees that no further persist request will be successful for a little while.
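
As an illustration of the idea described above (counting errors in a time window and ignoring successes), here is a minimal sketch. It is not the code from this PR: the `ErrorWindow` type, its method names, and the `max_errors` threshold are all hypothetical and chosen only for the example.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical sketch: tracks the timestamps of recent errors only.
/// Successes are deliberately not recorded, as in the description above.
pub struct ErrorWindow {
    /// Length of the sliding time window.
    window: Duration,
    /// Assumed threshold: at this many errors in the window, we expect failure.
    max_errors: usize,
    /// Timestamps of the errors seen so far, oldest first.
    error_times: VecDeque<Instant>,
}

impl ErrorWindow {
    pub fn new(window: Duration, max_errors: usize) -> Self {
        ErrorWindow {
            window,
            max_errors,
            error_times: VecDeque::new(),
        }
    }

    /// Records an error observed at `now`.
    pub fn record_error(&mut self, now: Instant) {
        self.evict_expired(now);
        self.error_times.push_back(now);
    }

    /// Returns true if enough errors happened within the window ending at `now`.
    pub fn is_likely_to_fail(&mut self, now: Instant) -> bool {
        self.evict_expired(now);
        self.error_times.len() >= self.max_errors
    }

    /// Drops errors that are older than the window.
    fn evict_expired(&mut self, now: Instant) {
        while let Some(&oldest) = self.error_times.front() {
            if now.duration_since(oldest) > self.window {
                self.error_times.pop_front();
            } else {
                break;
            }
        }
    }
}
```

A caller could check `is_likely_to_fail` before sending a persist request and skip or delay the request when it returns true, which would avoid hammering an indexer whose WAL is full.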
* docs: using-vector.md: Adjust Vector remap configuration to silence errors/warnings

* docs: using-vector.md: Provide a link to the index configuration code so it doesn't go out of sync
* optimize topn requests

add logic to detect which splits will deliver the top n results for
a request. This is only supported for match_all requests, optionally
sorted on the timestamp field (sort_by).

start_timestamp, end_timestamp, and filters on the timestamp field
are not currently supported, but could be.
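
To make the idea concrete, here is a rough sketch of how splits could be pruned for a match_all request sorted by timestamp in descending order. This is an assumption about the general technique, not the implementation in this PR: `SplitMeta`, its fields, and `splits_needed_for_top_n` are hypothetical names.

```rust
/// Hypothetical split metadata; the field names are assumptions for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct SplitMeta {
    pub split_id: String,
    pub min_timestamp: i64,
    pub max_timestamp: i64,
    pub num_docs: u64,
}

/// Returns the splits that may contribute to the top `n` documents of a
/// match_all query sorted by timestamp, newest first.
///
/// A split can be skipped when other splits already hold at least `n`
/// documents that are all strictly newer than its newest document. This only
/// works because match_all matches every document in every split.
pub fn splits_needed_for_top_n(splits: &[SplitMeta], n: u64) -> Vec<&SplitMeta> {
    splits
        .iter()
        .filter(|candidate| {
            // Count documents living in splits entirely newer than `candidate`.
            let docs_strictly_newer: u64 = splits
                .iter()
                .filter(|other| other.min_timestamp > candidate.max_timestamp)
                .map(|other| other.num_docs)
                .sum();
            docs_strictly_newer < n
        })
        .collect()
}
```

The quadratic scan keeps the sketch short. Sorting the splits by timestamp first would reduce the cost, and a query filter would invalidate the document counts, which is consistent with the restriction to match_all noted above.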

* move to function, refactor
* Using the shard throughput information in the scheduling logic.

* added cli flags
…5198)

Bumps [certifi](https://github.com/certifi/python-certifi) from 2024.2.2 to 2024.7.4.
- [Commits](certifi/python-certifi@2024.02.02...2024.07.04)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
throughput.

Scaling up relies on the short-term average in order to rapidly
react to a change in throughput, while scaling down and the indexing scheduler rely on the long-term average.
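
One way to picture the two averages is a pair of exponentially weighted moving averages with different smoothing factors. The sketch below assumes that approach; the PR text does not say how the averages are computed, and `ThroughputAverages`, the smoothing factors, and the thresholds are all hypothetical.

```rust
/// Hypothetical sketch: two exponentially weighted moving averages over the
/// same throughput samples, one reactive and one slow.
pub struct ThroughputAverages {
    short_term: f64,
    long_term: f64,
    /// Larger factor: the short-term average follows new samples quickly.
    short_alpha: f64,
    /// Smaller factor: the long-term average changes slowly.
    long_alpha: f64,
}

impl ThroughputAverages {
    pub fn new(initial: f64, short_alpha: f64, long_alpha: f64) -> Self {
        ThroughputAverages {
            short_term: initial,
            long_term: initial,
            short_alpha,
            long_alpha,
        }
    }

    /// Feeds one throughput sample into both averages.
    pub fn record(&mut self, sample: f64) {
        self.short_term += self.short_alpha * (sample - self.short_term);
        self.long_term += self.long_alpha * (sample - self.long_term);
    }

    pub fn short_term(&self) -> f64 {
        self.short_term
    }

    pub fn long_term(&self) -> f64 {
        self.long_term
    }

    /// Scale-up decision: uses the short-term average to react rapidly.
    pub fn should_scale_up(&self, threshold: f64) -> bool {
        self.short_term > threshold
    }

    /// Scale-down decision: uses the long-term average to avoid flapping.
    pub fn should_scale_down(&self, threshold: f64) -> bool {
        self.long_term < threshold
    }
}
```

With this split, a sudden burst triggers a scale-up within a few samples, while a brief dip does not immediately trigger a scale-down because the long-term average lags behind.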
@@ -8112,7 +8134,7 @@ dependencies = [
[[package]]
name = "tantivy"
version = "0.23.0"
-source = "git+https://github.com/quickwit-oss/tantivy/?rev=08b9fc0#08b9fc0b3114640ad06c2358c404c474a9eea3c1"
+source = "git+https://github.com/quickwit-oss/tantivy/?rev=13e9885#13e9885dfda8cebf4bfef72f53bf811da8549445"
Contributor Author


tantivy difference?

@@ -324,7 +325,7 @@ quickwit-serve = { path = "quickwit-serve" }
quickwit-storage = { path = "quickwit-storage" }
quickwit-telemetry = { path = "quickwit-telemetry" }

-tantivy = { git = "https://github.com/quickwit-oss/tantivy/", rev = "08b9fc0", default-features = false, features = [
+tantivy = { git = "https://github.com/quickwit-oss/tantivy/", rev = "13e9885", default-features = false, features = [
Contributor Author


tantivy version is changed

@fulmicoton fulmicoton closed this Jul 22, 2024

8 participants