feat(structured data): query DB v0 #2499
915ad31 to 0cff1e8
Made a first pass, will need to review more once addressed 👍
a17b5b9 to c0fd006
0cff1e8 to d1160f5
LGTM
Not a big fan of the ok_or_else; I generally go with match cases for that. It's a bit more pedantic, but I find it slightly easier to read. No action required, though. This is valid Rust 👍
Ah, I was starting to enjoy them!
Really, feel free to keep them! 👍
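For readers comparing the two styles discussed above, here is a minimal sketch of the same lookup written once with ok_or_else and once with match. The function names, the schema map and the error message are hypothetical and are not taken from the PR.

```rust
use std::collections::HashMap;

// Hypothetical helper: look up a column type, combinator style.
fn column_type_combinator<'a>(
    schema: &'a HashMap<String, String>,
    name: &str,
) -> Result<&'a str, String> {
    schema
        .get(name)
        .map(|t| t.as_str())
        // The closure only runs (and only allocates the message) on None.
        .ok_or_else(|| format!("unknown column: {}", name))
}

// The same lookup, spelled out with an explicit match.
fn column_type_match<'a>(
    schema: &'a HashMap<String, String>,
    name: &str,
) -> Result<&'a str, String> {
    match schema.get(name) {
        Some(t) => Ok(t.as_str()),
        None => Err(format!("unknown column: {}", name)),
    }
}
```

Both functions behave identically; the choice is purely one of readability, as the comments above already conclude.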
e6f6646 to 2a97361
2a97361 to ab048df
Looks overall good. Happy to make a pass on it as well once ready. I think we can optimize a bunch of things a bit more to align with the experiment we did?
Starting to look good. I left some comments but overall this looks great.
I'm quite bearish on the untyped jump through Value on your way to SQL. I would much rather have the kind of types we had in the experiment, and you can probably handle the Boolean <-> 1/0 conversion in the ToSql / FromSql implementation of that type, which would be much, much cleaner?
What we had in the experiment is equally typed as a
Not really, this would be much smaller, right? The idea would be to go from JSON (in DB) to that type, which is much more restricted, as soon as you get out of the DB, and to have typed interactions everywhere instead of Value, which from a typed perspective could be nested, etc.
Ok, so you are suggesting we add an intermediary type that is kind of a "narrowed" value and that implements ToSql? I'm not necessarily against it, but IMO we don't gain much from it: even if it can't be nested or be of a type that we don't support in databases, it can still be of the wrong type for a given query, so there is no type safety per se.
@spolu will go with the narrowed enum. I feel this one is a much stronger argument than the type safety one.
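To make the "narrowed enum" idea concrete, here is a rough, std-only sketch. Everything in it is an assumption for illustration: JsonValue is a local stand-in for the JSON value read from the DB, SqlParam and its variants are made-up names, and the actual ToSql / FromSql trait implementations mentioned above are not shown. It only illustrates rejecting nested values at the boundary and mapping booleans to 1/0.

```rust
/// Local stand-in for a JSON value (hypothetical; not the PR's type).
#[derive(Debug, Clone, PartialEq)]
#[allow(dead_code)]
enum JsonValue {
    Null,
    Bool(bool),
    Int(i64),
    Float(f64),
    String(String),
    Array(Vec<JsonValue>),
    Object(Vec<(String, JsonValue)>),
}

/// Narrowed value: only the flat scalar shapes a table cell can hold.
#[derive(Debug, Clone, PartialEq)]
enum SqlParam {
    Null,
    Integer(i64),
    Real(f64),
    Text(String),
}

impl SqlParam {
    /// Narrow a JSON value; nested shapes are rejected right here.
    fn from_json(v: &JsonValue) -> Result<SqlParam, String> {
        match v {
            JsonValue::Null => Ok(SqlParam::Null),
            // Booleans are stored as 1/0.
            JsonValue::Bool(b) => Ok(SqlParam::Integer(if *b { 1 } else { 0 })),
            JsonValue::Int(i) => Ok(SqlParam::Integer(*i)),
            JsonValue::Float(f) => Ok(SqlParam::Real(*f)),
            JsonValue::String(s) => Ok(SqlParam::Text(s.clone())),
            JsonValue::Array(_) | JsonValue::Object(_) => {
                Err("nested values are not supported".to_string())
            }
        }
    }

    /// Read a stored 1/0 back as a boolean; anything else is not a bool.
    fn as_bool(&self) -> Option<bool> {
        match self {
            SqlParam::Integer(0) => Some(false),
            SqlParam::Integer(1) => Some(true),
            _ => None,
        }
    }
}
```

As noted in the thread, this does not guarantee that a value has the right type for a given column; it only shrinks the set of shapes the rest of the code has to handle.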
82f0942 to cb241b1
Looks great.
One final thing that I think is important to figure out now rather than later.
core/src/databases/database.rs
Outdated
.map(|r| table_schema.schema.get_insert_params(&field_names, r))
.collect::<Result<Vec<_>>>()?
.iter()
.map(|values| match stmt.execute(params_from_iter(values)) {
I think you want to run that at the end, outside of the par_iter. Requires benchmarking.
I suspect this inherently prevents parallelization, because the pool will get filled with rows waiting for execution on the lock that protects that stmt.execute.
I would instead create all params in parallel and then run stmt.execute sequentially, since it is locked anyway.
Can we benchmark the two approaches to convince ourselves of the best one?
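The "build params in parallel, execute sequentially" shape suggested here could look roughly like the sketch below. To stay self-contained it swaps in std scoped threads for rayon's par_iter and a FakeStatement for the real prepared statement; Row, Params, build_params, FakeStatement and insert_rows are all hypothetical names. It shows the structure only and says nothing about which variant is faster, which is what the benchmark is for.

```rust
use std::thread;

// Hypothetical row and parameter types, for illustration only.
type Row = Vec<i64>;
type Params = Vec<String>;

/// Stand-in for the per-row parameter construction (the parallel part).
fn build_params(row: &Row) -> Params {
    row.iter().map(|v| v.to_string()).collect()
}

/// Stand-in for a prepared statement tied to a single connection.
struct FakeStatement {
    executed: Vec<Params>,
}

impl FakeStatement {
    fn execute(&mut self, params: Params) {
        self.executed.push(params);
    }
}

fn insert_rows(rows: &[Row], stmt: &mut FakeStatement, workers: usize) {
    let workers = workers.max(1);
    // Ceiling division; at least 1 so that chunks() never gets a size of 0.
    let chunk_size = ((rows.len() + workers - 1) / workers).max(1);

    // Phase 1: build all params in parallel, one thread per chunk.
    let all_params: Vec<Params> = thread::scope(|s| {
        let handles: Vec<_> = rows
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || chunk.iter().map(build_params).collect::<Vec<Params>>())
            })
            .collect();
        // Joining in spawn order keeps the original row order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker panicked"))
            .collect()
    });

    // Phase 2: execute sequentially on the single statement.
    for params in all_params {
        stmt.execute(params);
    }
}
```

The statement is only ever touched in phase 2, so no worker thread waits on it.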
This is worth the time since this code is likely to be final
Already tried that: params_from_iter complains when it is in the par_iter. I can try again, but I think it's not possible because it needs to own the values (so we'd have to copy, which I believe is worse?).
Creating and executing the stmt cannot be done in the par_iter because conn cannot be sent across threads.
I am happy to benchmark, but I don't see how copying every single row's data can be faster than calling params_from_iter sequentially?
No copying involved
so I'd have to clone() the params (sequentially).
It works with into_iter instead of iter 🤔
Is the former copying the data? Or merely saying that from this point on only the new iterator can reference it?
Ok that actually makes sense
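For anyone landing here with the same iter vs into_iter question, a small sketch of the difference on a Vec of rows. The function names are hypothetical, and consume_owned merely stands in for an API that needs owned values. On a Vec, into_iter moves each inner Vec out (the heap buffer is handed over, not copied), whereas iter only yields references, so getting owned values from it requires clone().

```rust
/// Stand-in for an API that wants to own the values it is given.
fn consume_owned<I: IntoIterator<Item = String>>(values: I) -> usize {
    values.into_iter().map(|s| s.len()).sum()
}

/// into_iter() consumes `rows` and moves each inner Vec out; no String is copied.
fn total_len_by_move(rows: Vec<Vec<String>>) -> usize {
    rows.into_iter().map(consume_owned).sum()
}

/// iter() yields &Vec<String>, so passing owned values on requires clone().
fn total_len_by_clone(rows: &[Vec<String>]) -> usize {
    rows.iter().map(|row| consume_owned(row.clone())).sum()
}

/// Checks that moving a row keeps the same heap buffer (the pointers are
/// only compared, never dereferenced).
fn move_keeps_buffer(rows: Vec<Vec<String>>) -> bool {
    let before: Vec<*const String> = rows.iter().map(|r| r.as_ptr()).collect();
    let after: Vec<*const String> = rows.into_iter().map(|r| r.as_ptr()).collect();
    before == after
}
```

After total_len_by_move(rows), the caller can no longer use rows; that loss of access is the whole cost of the move.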
LGTM \o/
Thanks for bearing with my feedback
🙌 stoked to merge
💯
#2211
Not particularly optimized, no caching, etc.