super db error using from with a parameter #5660

Open
chrismo opened this issue Feb 18, 2025 · 5 comments

Comments

@chrismo

chrismo commented Feb 18, 2025

I know it's still early days on super, but I was trying to swap in super db for zed in some existing scripts and ran across this.

❯ zed init ./test-lake
❯ zed -lake ./test-lake create "a"
❯ super -c "{a:1}" | zed load -lake ./test-lake -use "a" -
❯ zed query -lake ./test-lake "from a | yield this"
❯ zed query -lake ./test-lake -z "op load_pool(pool_name): (from pool_name | yield this) load_pool('a')"
#=> {a:1}
... same setup steps using super db ...
❯ super db query -lake ./test-lake "op load_pool(pool_name): (from pool_name | yield this) load_pool('a')"
pool_name: pool not found at line 1, column 32:
op load_pool(pool_name): (from pool_name | yield this) load_pool('a')
                               ~~~~~~~~~
@philrz
Contributor

philrz commented Feb 18, 2025

Hi @chrismo. Sorry for the trouble. Indeed, some adjustments to from were made in #5378 to make way for SQL coverage, so you're seeing the impact of that here. To get the equivalent of what you had before, the recently added eval() would be used, e.g.,

$ super db query -lake ./test-lake "op load_pool(pool_name): (from eval(pool_name) | yield this) load_pool('a')"
{a:1}

So far that's only subtly shown in the docs in an example here. It's on my to-do list to more formally document eval() and make sure it's shown more prominently in the from operator docs, so your experience is a helpful reminder. Let me know if you have any additional questions in the interim.

@chrismo
Author

chrismo commented Feb 19, 2025

No trouble, I understand the nature of works-in-progress! :)

But is that the intended design for the from operator (vs. a workaround)? If so, I guess I'd vote for it to revert to prior zed behavior in the future - it doesn't seem intuitive to me? Or is there a larger context here for the future of superdb where eval will be understandable in places like this?

@chrismo
Author

chrismo commented Feb 19, 2025

I read over the linked PR ... I can sorta see what the issue is now that it can accept different kinds of arguments ...

@chrismo
Author

chrismo commented Feb 19, 2025

Follow-up question - the original PR mentions a field reference as the pool name, but that didn't work with zq and doesn't work at the moment with super db and eval ... I'm just not sure if it's supposed to.

This makes easy to distinguish a file reference e.g., "from foo.json", from a record deref. e.g., "from [this.pool]"


super db query -z -lake ./test-lake "{name:'a'} | from this.name | yield this"
from operator cannot have parent unless from argument is an expression at line 1, column 19:
{name:'a'} | from this.name | yield this
                  ~~~~~~~~~

super db query -z -lake ./test-lake "{name:'a'} | from eval(this.name) | yield this"
a: cannot open in a data lake environment

@philrz
Contributor

philrz commented Feb 19, 2025

@chrismo: Yes, good find. While I happened to mention #5378 as the first step in these changes that led to where we are now with different from/eval behaviors and how that starts to relate to the new SuperSQL stuff, I neglected to recount the whole history, which also includes the changes in #5437 and #5476. Where it all landed us in the end is that the "field reference" use of eval currently works in a non-lake context, e.g.,

$ super -version
Version: v1.18.0-292-g6d4b913e

$ echo '"hello world"' > file.json

$ echo '{input_file: "file.json"}' | super -c 'from eval(input_file)' -
"hello world"

...but the work hasn't yet been done to support the same with the lake and pools (as you encountered). That would require some new, non-trivial lake-specific functionality, and right now the limited development resources we've got are focused on getting the file-centric use cases all working before we focus again on the lake. I'll make sure this caveat is more clearly communicated when I get the eval docs done. When they're drafted, it'd be great if you were up for looking them over to confirm whether they'd have been helpful in resolving your initial question.
