Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CTE not working #124

Open
anthezium opened this issue Mar 28, 2016 · 23 comments
Open

CTE not working #124

anthezium opened this issue Mar 28, 2016 · 23 comments

Comments

@anthezium
Copy link

This should work, right? (emails.csv is there, works with other queries)
q -d, -H "WITH alias AS (SELECT * FROM ./emails.csv) SELECT * FROM alias"

Instead, I get:
No files matching 'alias' have been found

Do I need to do something special to let it know that the alias is an alias?

@anthezium
Copy link
Author

q version 1.5.0
Copyright (C) 2012-2014 Harel Ben-Attia ([email protected], @harelba on twitter)
http://harelba.github.io/q/

@jungle-boogie
Copy link
Contributor

Does sqlite3 depend on q having support for CTE? If so, maybe q was made prior to CTE's in sqlite3.

@harelba
Copy link
Owner

harelba commented Apr 2, 2016

Hi, sorry for the late reply, i'm on a long vacation...

Most sqlite capabilities are exposed through q without any special needs. However, CTE is not one of them. I'll take a deeper look and see if it's possible to add it.

Thanks
Harel

@anthezium
Copy link
Author

Thanks for getting back on this!

The query works in sqlite3, so if you decide not to add CTEs (at least in the near term), you could just change this sentence on the site:
"All sqlite3 SQL constructs are supported, including joins across files (use an alias for each table)."

And add stuff to the "Limitations" section.

@harelba
Copy link
Owner

harelba commented Apr 3, 2016

Of course, i'll definitely add this to the limitations if I can't come up with something good.

@mahiki
Copy link

mahiki commented Nov 6, 2017

this is a pretty big missing functionality. something like this makes it easier to generate histograms, for instance. I've been using q regularly, such a perfect use-case. But now I want to do more, and my workflow is pushing toward copying file to local postgres db.

If q can't run CTE, and I have to migrate workflow to postgres local db, there is not much sense in continuing to use q because I would already have invested effort in managing a postgres workflow for local csv files.

@bitti
Copy link

bitti commented Nov 10, 2017

I don't really know about CTEs but it seems their usecase which can not be achieved with subqueries is in recursive queries. If you don't need recursive queries you should be fine with using subqueries, which are not supported by q in the from clause but there is a workaround: #102

@mahiki
Copy link

mahiki commented Nov 21, 2017

yes you are right that a subquery satisfies CTE functionality, and I don't need recursive CTE for generic use-cases in my work. I guess the note about sqlite getting CTE functionality gave me hope.

I'm aware of the subquery limitation and the workaround, but based on your reminder I took another look at #102. It works great, although I have to lookup the quoting syntax pretty much every time.

On balance, this is easier than the steps of creating a schema in postgres, and copying the file in.

Still -- if its possible to gain the new sqlite CTE functionality, it would be excellent.

@vi
Copy link

vi commented Feb 11, 2020

Can q recognize that things after AS are not to be interpreted as filenames and just left as is?

@harelba
Copy link
Owner

harelba commented Feb 12, 2020

Can q recognize that things after AS are not to be interpreted as filenames and just left as is?

Yes, AS is supported, and actually needed when performing JOINs

@vi
Copy link

vi commented Feb 12, 2020

$ python bin/q.py  'with eee as ( select 9 ) select 4'
4

$ python3 bin/q.py  'with eee as ( select 9 ) select * from eee'
No files matching 'eee' have been found

@vi
Copy link

vi commented Feb 12, 2020

Maybe special table names like nofile_eee can be used to signal q not to try to read any files?

diff --git a/bin/q.py b/bin/q.py
index b41451f..de773bf 100755
--- a/bin/q.py
+++ b/bin/q.py
@@ -406,6 +406,9 @@ class Sql(object):
                     qtable_name = qtable_name[:qtable_name.index(')')]
                     self.sql_parts[idx + 1] = qtable_name

+                if qtable_name.startswith("nofile_"):
+                    idx += 1
+                    continue
                 self.qtable_names.add(qtable_name)

                 if qtable_name not in self.qtable_name_positions.keys():

Or maybe there can be implemented a mode where each file-to-table mapping is provided separately and individually?

@harelba
Copy link
Owner

harelba commented Feb 12, 2020

@vi The issue is not with reading literal SELECTs (e.g. q "select 9" works well in q). CTE syntax itself is not supported because of complications in analyzing the query strings to detect table names.

Perhaps there's a new package for analyzing the SQL in python which can help with this. I'll do some checking.

@vi
Copy link

vi commented Feb 12, 2020

With the diff above some my use cases which required saving temporary CSV files start working without temporary files.

@harelba
Copy link
Owner

harelba commented Feb 12, 2020

@vi the idea of q is to provide seamless use of SQL, without any need for special casing with other parameters (e.g. user knowledge).

If the CTE use case will get more upvotes, i'd consider it a higher priority for trying to find a proper solution for this.

@EdMUK
Copy link

EdMUK commented Mar 21, 2020

Modifying the suggestion from @vi so it checks to see if qtable_name refers to a file seems to work and avoids the need for special table names. @harelba do you see any issues with this approach.

# check if qtable_name references a file
if os.path.isfile(qtable_name) == False:
	# assume reference to CTE
	idx += 1
	continue

@vi
Copy link

vi commented Mar 21, 2020

@EdMUK Will this lead to less user-friendly error messages in case of misspelled filenames?

@harelba
Copy link
Owner

harelba commented Mar 21, 2020

Hi, I don't think that checking for an actual file would be the best approach.

However, I'm looking into the "filename escaping" solution that was suggested by @vi a month ago. Something along these lines could prove very useful to solve this issue without too much risk or new development.

This would render q's SQL to be less compatible with standard sqlite3, but together with some command line flag to enable it, it could be a reasonable solution.

Another solution i'm looking into is to detect the "WITH X" parts and provide automatic "escaping" behind the scenes, which would be a more transparent solution.

I'll update.

Best of health to everyone during these corona times.
Harel

@harelba
Copy link
Owner

harelba commented Mar 21, 2020

I would like to test any solution before actually releasing a version, since the parsing logic is delicate as it is (I always wanted to use a proper AST parser for this, which would have made solving this trivial, but never got to it because the existing solution worked well enough for most cases).

Anyway, I'll create a branch when done, and ask you politely to download it manually and do some tests before going forward with this. Hope that won't be too much of a hassle for you.

Harel

@harelba
Copy link
Owner

harelba commented Mar 21, 2020

Pushed an initial PR (#223 ) with a fix, but can't really test it on my computer for dumb reasons. Wanted travis to perform the testing, but it seems to have an issue as well with bringing up the osx containers.

I'll update when it's successful. You're welcome to comment on the fix (and find any holes in it...).

Harel

@harelba
Copy link
Owner

harelba commented Mar 23, 2020

@vi @mahiki @EdMUK @bitti @anthezium Would any of you mind manually doing some testing on the branch of PR #223 before i merge it?

I don't have much experience with crazy CTE queries, so want to try and make sure that the fix is good enough and doesn't break anything.

These are complicated times, and I'm sure that all of you have a lot on your mind, so obviously that totally fine if you can't spend time on this.

@harelba
Copy link
Owner

harelba commented Mar 23, 2020

My solution is too simplistic there, it will work only for simple use cases.

I'll work towards the filename "escaping" solution, and update.

@vi
Copy link

vi commented Feb 22, 2021

Is the filename escaping implement in some form?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants