feat(structured data): query DB v0 #2499
I think you want to run that at the end, outside of the par_iter. Requires benchmarking.
I suspect this inherently prevents parallelization, because the pool will fill up with rows waiting on the lock that protects that stmt.execute.
I would instead create all params in parallel and then run stmt.execute sequentially, since it is locked anyway.
Can we benchmark the two approaches to convince ourselves which one is best?
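For illustration, the two-phase shape suggested above (build all params in parallel, then execute sequentially on one connection) can be sketched with the standard library alone. This is only a sketch under assumptions: `Row`, `Param`, `FakeConn` and `insert_rows` are hypothetical names, `FakeConn` merely records what it is asked to execute, and scoped threads stand in for the par_iter used in the PR.

```rust
use std::thread;

/// Hypothetical row type; not taken from the PR.
pub struct Row {
    pub id: u64,
    pub name: String,
}

/// Hypothetical parameter type that borrows from the row, so no row data is copied.
#[derive(Debug, PartialEq)]
pub enum Param<'a> {
    Int(u64),
    Text(&'a str),
}

/// Hypothetical stand-in for a connection that must stay on one thread.
/// It only records a debug rendering of each parameter list it receives.
pub struct FakeConn {
    pub executed: Vec<String>,
}

impl FakeConn {
    pub fn execute(&mut self, params: &[Param<'_>]) {
        self.executed.push(format!("{:?}", params));
    }
}

/// Phase 1: build the parameter list of every row on worker threads.
/// Phase 2: run the statement sequentially on the calling thread.
pub fn insert_rows(conn: &mut FakeConn, rows: &[Row], workers: usize) {
    let workers = workers.max(1);
    // `chunks` panics on a size of 0, so clamp to at least 1.
    let chunk_size = ((rows.len() + workers - 1) / workers).max(1);

    let all_params: Vec<Vec<Param<'_>>> = thread::scope(|s| {
        // Spawn every worker before joining any of them.
        let handles: Vec<_> = rows
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || {
                    chunk
                        .iter()
                        .map(|r| vec![Param::Int(r.id), Param::Text(r.name.as_str())])
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        // Joining in spawn order keeps the original row order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    });

    // The connection never leaves this thread.
    for params in &all_params {
        conn.execute(params);
    }
}
```

Whether this beats building the params inline still depends on how expensive param construction is relative to the execute call, which is what the benchmark would have to show.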
This is worth the time since this code is likely to be final.
Already tried that: params_from_iter complains when it is in the par_iter. I can try again, but I think it's not possible because it needs to own the value (so we'd have to copy, which I believe is worse?).
Creating and executing the stmt cannot be done in the par_iter because conn cannot be sent across threads.
I am happy to benchmark, but I don't see how copying every single row's data can be faster than calling params_from_iter sequentially?
Yet it's possible: https://github.com/dust-tt/dust/pull/2222/files#diff-bee624fbe402bae0a0c5cbaa790eda4654833e56e5482f62da6ff74d7fb253abR105
No copying involved
to clarify:
so I'd have to clone() the params (sequentially).
It works with into_iter instead of iter 🤔
Is the former copying the data? Or merely saying that from this point on only the new iterator can reference it?
Ok that actually makes sense
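On the into_iter question, a small standard-library check may help. It is only a sketch with a hypothetical function name, and it says nothing about what params_from_iter does internally. It shows that calling into_iter on a Vec<String> moves ownership of each String without copying its heap buffer, after which the original Vec can no longer be used.

```rust
/// Hypothetical helper: returns true if every String kept the same heap
/// buffer after being moved out of the Vec with `into_iter`.
pub fn into_iter_keeps_buffers(values: Vec<String>) -> bool {
    // `iter` only borrows: record where each string's bytes live.
    let before: Vec<*const u8> = values.iter().map(|s| s.as_ptr()).collect();

    // `into_iter` consumes the Vec and yields owned Strings.
    // Moving a String transfers its (pointer, capacity, length) triple;
    // the bytes on the heap are not duplicated.
    let moved: Vec<String> = values.into_iter().collect();

    // `values` is gone from here on; using it would not compile.
    let after: Vec<*const u8> = moved.iter().map(|s| s.as_ptr()).collect();

    before == after
}

/// For contrast, `clone` allocates a fresh buffer for a non-empty String.
pub fn clone_allocates_new_buffer(value: &String) -> bool {
    let copy = value.clone();
    copy.as_ptr() != value.as_ptr()
}
```

So the second reading in the question is the right one for a Vec: the data stays where it is, and only the new iterator is allowed to reference it.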