add SQL context field for stress.json #26

Open
mxmarg opened this issue Oct 13, 2023 · 5 comments

Comments

@mxmarg
Contributor

mxmarg commented Oct 13, 2023

Having a field to pass in a SQL context via stress.json would allow us to replay any query that we find in queries.json files or job profiles.

Example stress.json:
{
  "queries": [
    {
      "query": "SELECT * FROM \"SF weather 2018-2019.csv\" LIMIT 50",
      "context": ["Samples", "samples.dremio.com"],
      "frequency": 1
    }
  ]
}
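As a rough illustration of what reading such a file could look like in Go, here is a minimal sketch. The type and function names (StressQuery, StressConfig, ParseStressConfig) are hypothetical and not taken from dremio-stress; only the JSON keys come from the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StressQuery mirrors one entry of the example stress.json.
// Hypothetical type: the real dremio-stress struct may look different.
type StressQuery struct {
	Query     string   `json:"query"`
	Context   []string `json:"context,omitempty"` // proposed new field
	Frequency int      `json:"frequency"`
}

// StressConfig is the top-level document (hypothetical name).
type StressConfig struct {
	Queries []StressQuery `json:"queries"`
}

// ParseStressConfig decodes the raw bytes of a stress.json file.
func ParseStressConfig(data []byte) (StressConfig, error) {
	var cfg StressConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return StressConfig{}, fmt.Errorf("parsing stress.json: %w", err)
	}
	return cfg, nil
}

func main() {
	raw := []byte(`{"queries":[{"query":"SELECT * FROM \"SF weather 2018-2019.csv\" LIMIT 50","context":["Samples","samples.dremio.com"],"frequency":1}]}`)
	cfg, err := ParseStressConfig(raw)
	if err != nil {
		panic(err)
	}
	// The context arrives as an ordered path, one element per level.
	fmt.Println(cfg.Queries[0].Context)
}
```

Entries without a "context" key decode to a nil slice, so existing stress.json files would keep working under this sketch.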

Required code changes for HTTP
Add context field here: https://github.com/rsvihladremio/dremio-stress/blob/main/pkg/protocol/protocol_http.go#L59-L62
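A minimal sketch of what the HTTP change might look like, assuming the request body is built from a struct and sent to Dremio's SQL REST endpoint with "sql" and "context" keys. The names sqlRequest and BuildSQLPayload are hypothetical; the actual struct in protocol_http.go may be shaped differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sqlRequest is a hypothetical request body for the SQL REST call.
// Context is omitted from the JSON when no context is configured.
type sqlRequest struct {
	SQL     string   `json:"sql"`
	Context []string `json:"context,omitempty"`
}

// BuildSQLPayload marshals a query and its optional context path.
// json.Marshal takes care of escaping the double quotes in the SQL text.
func BuildSQLPayload(query string, sqlContext []string) ([]byte, error) {
	return json.Marshal(sqlRequest{SQL: query, Context: sqlContext})
}

func main() {
	body, err := BuildSQLPayload(
		`SELECT * FROM "SF weather 2018-2019.csv" LIMIT 50`,
		[]string{"Samples", "samples.dremio.com"},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Keeping the context as a string slice means the array from stress.json can be passed straight through without re-parsing a dotted path.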

For ODBC:
Add context parameter: https://pkg.go.dev/database/sql#example-DB.ExecContext

@rsvihladremio
Owner

The ODBC example is a different kind of context: the one in ExecContext is Go's context.Context (a Go construct for cancellation and deadlines), not a SQL context. Obviously the HTTP side is trivial; I'm not as sure about ODBC.

@mxmarg
Contributor Author

mxmarg commented Oct 13, 2023

Yeah, I would already be happy with an HTTP context; for ODBC, the context can be ignored (for now).

@rsvihladremio
Owner

I'd like to avoid partial support if I can. This is doable with ODBC (just not as nice); I can do one of the following:

Strategy 1

  1. list the distinct contexts found in the stress.json
  2. for each of those, make a separate connection and call "use" on it
  3. route queries to the appropriate connection

Strategy 2

  1. each time the context "changes", block on the ODBC connection, call USE, then submit the query

The downside of strategy 2 is that it will for sure hurt query throughput when using contexts with ODBC. The downside of strategy 1 is more open files, more connections to manage, and more complexity. I may start with 1, since I'm not sure how to set the context back to default with USE.
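The bookkeeping side of strategy 1 could be sketched roughly as below: derive a USE statement per distinct context and group queries under it, so each group maps to one dedicated connection. This leaves out the ODBC connections themselves. All names (Query, UseStatement, GroupByContext) are hypothetical, and the identifier quoting (double quotes, embedded quotes doubled) is an assumption about what USE accepts.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Query is a hypothetical pairing of SQL text with its context path.
type Query struct {
	SQL     string
	Context []string
}

// quoteIdent wraps one path element in double quotes, doubling any
// embedded quotes. Assumption: this matches the server's identifier rules.
func quoteIdent(part string) string {
	return `"` + strings.ReplaceAll(part, `"`, `""`) + `"`
}

// UseStatement builds the USE statement for a context path.
// An empty context yields "", meaning "stay on the default context".
func UseStatement(sqlContext []string) string {
	if len(sqlContext) == 0 {
		return ""
	}
	quoted := make([]string, len(sqlContext))
	for i, part := range sqlContext {
		quoted[i] = quoteIdent(part)
	}
	return "USE " + strings.Join(quoted, ".")
}

// GroupByContext buckets queries by their USE statement. Each key would
// correspond to one connection that ran that statement once at setup.
func GroupByContext(queries []Query) map[string][]Query {
	groups := make(map[string][]Query)
	for _, q := range queries {
		key := UseStatement(q.Context)
		groups[key] = append(groups[key], q)
	}
	return groups
}

func main() {
	queries := []Query{
		{SQL: `SELECT * FROM "SF weather 2018-2019.csv"`, Context: []string{"Samples", "samples.dremio.com"}},
		{SQL: "SELECT 1"},
		{SQL: `SELECT COUNT(*) FROM "SF weather 2018-2019.csv"`, Context: []string{"Samples", "samples.dremio.com"}},
	}
	groups := GroupByContext(queries)
	keys := make([]string, 0, len(groups))
	for k := range groups {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random; sort for stable output
	for _, k := range keys {
		label := k
		if label == "" {
			label = "(default context, no USE)"
		}
		fmt.Printf("connection %q runs %d queries\n", label, len(groups[k]))
	}
}
```

Queries without a context get their own group with an empty key, which sidesteps the question of how to reset a connection back to the default context.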

@rsvihladremio
Owner

OK, some follow-up.

Tried both methods. ODBC just hard crashes; not sure why yet, will have to revisit.

The HTTP method is not so simple either, as you have to express the path as an array (I guess one could argue you have to push that back on the user):

--data-raw '{
"sql": "SELECT * FROM \"SF weather 2018-2019.csv\"",
"context": [
"Samples",
"samples.dremio.com"
],

this is harder than it looks at first glance

@mxmarg
Contributor Author

mxmarg commented Oct 16, 2023

Hey Ryan,
You make good points about the complexity involved for ODBC, which I had not considered.
This will ultimately come down to whether you see value in this feature for the general audience.
I would hate for you to put a lot of effort into this, just because I asked. I can definitely make do with building my own fork that adds a context for REST calls. I do not have a strong need for it in ODBC.
Regarding the data structure of the context, I would leave the context as an array, since that is also the way it is saved in queries.json and job profiles.
