Fixing remote calls for subqueries #1500

Open
wants to merge 3 commits into
base: develop

Conversation

kvpetrov
Contributor

Top-level instant subqueries that get redirected to remote partitions are converted to range calls.
We, however, assume that the remote call should match the original call, i.e. if an instant query call came to us, then the remote call should be instant as well. This is not the case with subqueries. Instead of continuing to call query or query_range depending on which API was originally called, always call query_range, as this API can resolve all cases.
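
To illustrate why a range call can stand in for an instant subquery, here is a minimal sketch of the time-parameter mapping. It assumes that `expr[window:step]` evaluated at time `t` covers `[t - window, t]` at the given step. `RangeParams` and `SubqueryToRange` are hypothetical names, not types from this codebase, and step alignment is deliberately left out.

```scala
// Hypothetical illustration only; these are not the planner's real types.
final case class RangeParams(startSecs: Long, endSecs: Long, stepSecs: Long)

object SubqueryToRange {
  // Assumption: an instant subquery expr[window:step] evaluated at evalTimeSecs
  // needs samples over [evalTimeSecs - windowSecs, evalTimeSecs], one every stepSecs.
  // Alignment of the start time to a multiple of the step is NOT handled here.
  def toRangeParams(evalTimeSecs: Long, windowSecs: Long, stepSecs: Long): RangeParams = {
    require(windowSecs > 0 && stepSecs > 0, "window and step must be positive")
    RangeParams(evalTimeSecs - windowSecs, evalTimeSecs, stepSecs)
  }
}
```

For example, `foo[1h:1m]` evaluated at `t = 7200` would map to start `3600`, end `7200`, step `60` under these assumptions.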

@@ -111,7 +111,12 @@ class MultiPartitionPlanner(partitionLocationProvider: PartitionLocationProvider
} else {
// Single partition but remote, send the entire plan remotely
val remotePartitionEndpoint = partitions.head.endPoint
val httpEndpoint = remotePartitionEndpoint + params.remoteQueryPath.getOrElse("")
val remoteQueryPath = if (params.remoteQueryPath.getOrElse("") == "/api/v1/query") {
Member

Can we fix this at the source, where remoteQueryPath is set, instead of "hacking" it here?

Contributor

@vishramachandran: we've met and discussed this. Subqueries like these expose the problem when raw data lives in multiple partitions:

foo{shard_key=~".*"}[1h:1m]
(foo{shard_key="baz"} + bar{shard_key="bat"})[1h:1m]
sum(foo{shard_key=~".*"}[1h:1m])

In these cases, the MultiPartitionPlanner will walk the plan until it reaches raw data, where it decides which partition to route the PromQL to. The issue is that it always routes the PromQL to the same query API that received the original subquery. So, for example:

api/v1/query
(foo{shard_key="baz"} + bar{shard_key="bat"})[1h:1m]

would have its inner PromQL routed like:

api/v1/query
foo{shard_key="baz"}

where any applicable time params would be dropped. I see two good options:

  1. Add logic to MultiPartitionPlanner specifically for this case: update the QueryContext to point at query_range, then materialize the subquery's children with that context.
  2. Always default to query_range for all child requests.
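
A rough sketch of what option 2 could look like when choosing the remote path. This is not the code in this PR: `RemotePaths` and `childQueryPath` are hypothetical names, and the `/api/v1/query_range` path is assumed from the Prometheus HTTP API convention. Only the instant path is rewritten; any other path passes through unchanged.

```scala
// Hypothetical helper, not part of MultiPartitionPlanner.
object RemotePaths {
  val InstantPath = "/api/v1/query"
  // Assumed path, following the Prometheus HTTP API naming.
  val RangePath = "/api/v1/query_range"

  // Rewrite the instant path to the range path for child requests.
  // A missing path maps to "", mirroring the getOrElse("") in the diff.
  def childQueryPath(originalPath: Option[String]): String =
    originalPath match {
      case Some(InstantPath) => RangePath
      case Some(other)       => other
      case None              => ""
    }
}
```

With this, a child of a subquery that arrived on `/api/v1/query` would be sent to `query_range`, so its time parameters have somewhere to go.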

3 participants