-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issues and new features #4
Comments
There's no support for multiple tuple variables at the moment. Maybe i should make that explicit with an error message |
The problem here is that there's no such conversion into dates yet. Also, it's assumed that the right hand side of the predicate is always a RelationPredicate, I'll investigate it further and see if it can be generalized. |
Yes, i also thought about this issue before. Imo the renaming should be bound to the existential or universal operator, as they introduce a new "scope" where now the variable should be treated differently. |
The problem here is that the R(t) is being skipped, i thought it wouldn't make a difference, but as seen here, it does. I'll take it into account. |
Good! It would meet the needs. |
Perhaps, a way to solve that would be to consider the base relation as |
Perhaps, the generalization could be something similar to valueExpressions from RA grammar: |
@rlaiola fixed on #6. I also generalized the relation predicate positioning. Now, it can be placed anywhere as long as it is in the correct scope. I've added an error message it is missing. Now a query like this works: { r | ∃p (p.a > 1 and R(p) ) and R(r) } |
I think it may be more complicated than that, but i'll take a look at it! Thanks for the suggestion! |
Idk, it looks a bit redundant, also as TRC relies on the truth or satisfaction of the conditions specified in the query, i don't see the benefit of adding these. What do you think? |
I'm not sure if this query makes sense. In this case there's no separation between the condition and the tuple variable, which goes against the very definition of a TRC query, which is of the form:
The pipe separates the condition from the tuple variable, which does not happen in this query. What do you think? |
Indeed, this one does not make sense (copy and paste from another example). The expression that I tried to represent has a rename operator in the projection, as follows: { t.a->z | R(t) } |
This expression could definitely be rewritten in a more elegant way but the main point that I wanted to convey here was that a predicate should be a general boolean value expression, as represented in the Relational Algebra grammar. For instance, this works: sigma (a>1) = true (R) I have the impression that we should try using a similar approach (represent predicates as valueExpressions). |
Great job! Something like this would also work? Note that I refer to { r | ∃p (p.a > r.a and R(p) ) and R(r) } |
Yes! Definitely, as long as you correctly define the RelationPredicate at the right scope, everything is supposed to work as expected. |
Oh, i see. In this case maybe it does make sense. I've got an idea on how to implement it, i'll take a look at it later. Thanks for the suggestion. |
It might help: https://github.com/KPMGE/relax/blob/development/src/db/parser/grammar_ra.pegjs#L418-L440 |
@rlaiola Implemented on #7 { r.a->x, r.b->y, r.c->z | R(r) }
{ r.a->x, r.b, r.c | R(r) }
{ r.a, r.b, r.c | R(r) } I've also added the rename operator(arrow) on the editor tab. |
Oh i see, that's a good point. I'll revisit it later. Thanks for the suggestion. |
@rlaiola Solved in #10 . I've incorporated the valueExpr from RA into the trc grammar. Now the predicate is a valueExpr so with some adjustments these cases are now covered. It's worth pointing that now, the not operator must have parenthesis for predicates, just like the ! do, so i don't think this one is a problem. The tests were adjusted accordingly. I've also took the liberty to come up with a slightly more interesting test case: { t | DEPENDENT(t) and date(t.Bdate) < date('1964-09-15') } This one is a bit more interesting as it actually filters out and returns some tuples based on the condition. The previous example also works, but it returns no tuples(as expected). |
Summary:
|
Fixed on #13 |
Summary:
Now a query like this works: { t.a->z, p.b, k | R(t) and S(p) and T(k) } This example is quite interesting because it shows that any combination of renaming and projecting can also be combined with this implementation (these scenarios were covered with unit tests) As pointed out in the #9, if one tries to define a RelationPredicate twice for any tuple variable in the same scope, an error is shown as before |
@KPMGE, you rock! With joins working the possibilities are endless! 🚀 Some remarks:
-- Works
{ t.a->z | R(t) } -- Does not work: Relation R has a column b
-- Error: could not add column "t.b" because of ambiguity
{ t.a->b | R(t) }
![]()
|
Well done! To not loose track, I write here the missing features that would be nice to have:
-- Dataset: RST
-- Error: Relation predicate must be defined!
{ concat(t.b, t.c)->bla | R(t) }
-- or
{ (t.b || t.c)->bla | R(t) } -- Dataset: RST
-- sigma abs(a)>0 (R)
-- Works!
{ t | R(t) and (abs(a)>0) }
-- Does not work
{ t | R(t) and abs(a)>0 } 2 ) Variable assignment X = { t | R(t) and t.a = 1 }
{ p | S(p) and ∃t (X(j)) and p.b = j.b } |
Fixed on #17 |
Fixed on #18 |
@rlaiola I'm not sure if variable assignment makes sense for TRC. RA and TRC are different by nature, RA is a procedural language, meaning it tells "how to retrieve x" so that, you can "build up" a query by just splitting these steps and saving them into variables. For TRC, it's the complete opposite, as it is a declarative language, meaning it tells "what you want to retrieve" instead of telling the steps to get to it. In this context, there's no really a notion of intermediate steps, the whole TRC expression is an unified logical expression that states the conditions for the data to be retrieved. I see that, in TRC the concept of variables is "encapsulated" in quantified operations, namely ∃ and ∀ In your example, let's say: X = { t | R(t) and t.a = 1 } That part could make sense, you kinda give a "label" to the tuples retrieved from this query. But now, as soon as you wanna use it, it becomes problematic: { p | S(p) and ∃t (X(j)) and p.b = j.b } You can't "inject" a variable into a set of tuples(which is the result of the previous query, stored in X). In TRC it's not possible to use X as "calling a function", the tuples must be quantified and that needs to be done in the TRC query context. As now there's support for multiple tuple variables, this kind of query becomes even easier to represent correctly: { p | S(p) and ∃t (R(t) and t.a = 1 and p.b = t.b) } What do you think? |
@KPMGE, I'm very glad with your answer! It shows that you really got the point! That said, I'll try to make an argument. While TRC doesn't formally support storing intermediate results the way procedural languages do, breaking down complex queries into smaller, reusable parts can greatly enhance legibility, maintainability, and reusability. We can see this modular and structured approach as an extension borrowed from Higher-Order Logic. In certain scenarios this would make TRC queries easier to write, understand, and maintain. We could say that these are as well some of the benefits of relational algebra's variables or views and CTEs in SQL. Not really mandatory but I see it as a interesting feature, especially for learning purposes. And as this implementation is based on the RA one, the realization seems "an inch away". My two cents! |
Testing the latest commit, I found the following issues: -- Generalized projection
-- Works in RA
-- pi (a*2)->doublea (R)
-- Does not work in TRC
-- Error: Relation predicate must be defined!
{ (t.a*2)->doublea | R(t) } -- Value Expressions with date (Do not work in RA not TRC)
-- Error: Cannot read properties of undefined (reading '0')
-- sigma date(Bdate) > date('1990-01-01') (DEPENDENT)
-- { t | DEPENDENT(t) and date(t.Bdate) > date('1990-01-01') }
-- sigma date(Bdate)<date('1964-09-15') (DEPENDENT)
{ t | DEPENDENT(t) and date(t.Bdate) < date('1964-09-15') } This latest one might be my fault 🙄 |
@rlaiola fixed on #19. Thanks for pointing out, unfortunately js by default sets undefined, which broke the program. I've just filtered the result, now it works |
That's very strange, i've already solved this issue before #4 (comment). I recently updated my repository with the code from relax, maybe that's where the issue came from. |
@rlaiola I've solved the issue, indeed it was a recently merged request on relax that caused this issue: dbis-uibk#201. I've opened a PR solving it: dbis-uibk#217 In this repository, i've already merged it though, so now it's working again: |
@KPMGE sorry, this thread is getting quite long. I've found another issue. I supposed that it was working before. -- Works
{ t | R(t) }
-- Do not work
-- Error: at line 2: Expected "!=", "%", "*", "+", "-", "/", "<", "<=", "<>", "=", ">", ">=", "ilike", "like", "||", "}", "≠", "≤", "≥", logical AND, logical IMPLICATION, logical OR, or logical XOR but "∈" found.
{ t | t ∈ R }
-- Do not work
{ t | t in R } |
@rlaiola thanks for pointing out! This issue has to do with the notation of TRC we're using. I've solved the issue #4 (comment) in which i've changed the order of evaluation of an AtomicFormula, so that queries like the one you've suggested would work. But then, the grammar matches incorrectly when you have a RelatonPredicate alone as you've pointed out here. If i change the order back the this issue will be solved, but the other one will go back. I'd suggest removing the R(t) notation of relation predicates and leaving only the last 2, that would work. What do you think? |
@rlaiola Good points! I agree that in SQL and RA, variable assignment is an interesting feature, that greatly reduces the complexity of some queries. But i'm not really sure if that's suitable for TRC, though. As i pointed out, both SQL and RA are procedural languages, that makes their expressions way more "self contained" as they're kind of like "instructions", so in this case it makes sense to me to break down a big query into smaller pieces. Also, because of the "procedureness" of SQL and RA, the expressions have a well defined scope. For example: pi a(sigma a > 3 (R)) In this case, it's clear that the query will be executed from the inner expression R, that will then be filtered and projected. This way, it makes sense to break it down: k = sigma a > 3 (R)
pi a(k) but now, in TRC, the scope for the variable must be defined, and for that a whole TRC query is necessary, for example, the following query(partial) doesn't make sense: R(t) and t.a > 3 So, if you need a whole query every time, you end up creating a whole new scope, for example, the hypothetical query has 2 scopes: x = { t | R(t) and t.a > 3 }
{ t.a | R(t) and x } In this case, the second query does not make sense, as the t for the first and second queries are different, they live in different scopes! So, to mix them up and try to make it work would require "come up with" a new syntax for TRC, which i'm not sure would be good for learning purposes. I think having a TRC query as a whole expression that tells what is to be retrieved is more suitable in this case, as breaking down a query and having to learn a new notation just to use this feature here would be a bit confused. Wouldn't it? For me, one should think of a TRC query having in mind its "declarativeness". I'm open to suggestions, though. |
In this example, the X variable represents a macro for a relation so the expressions could be rewritten as X = { t | R(t) and t.a > 3 } -- in this scope the tuple variable is t
{ k.a | X(k)} -- this is a new scope so the tuple variable is k (t is unknown)
The syntax would continue the same, this would be an alternative way.
Perhaps, we'd need to run a study with users to measure that anyway.
I do agree with you, let's at least keep as a suggestion for future work. |
PR #22 should fix this. |
@KPMGE found an issue when using a column by its index: -- Works
{ t.a, t.c, p.b, p.d | R(t) and S(p) }
-- Works
{ t.[1], t.[3], p.b, p.d | R(t) and S(p) }
-- Does not work
-- Error: invalid projection "t.a, t.c, p.[1], p.[2]": column index "p.[1]" is out of range in schema [t.a : number, t.b : string, t.c : string, p.b : string, p.d : number]; index starts at 1
{ t.a, t.c, p.[1], p.[2] | R(t) and S(p) } |
No, it wont. As i explained, the problem is that abs(something) has the same structure as R(t), both have a string followed by a string enclosed by parenthesis. This only has to do with the order in which a AtomicFormula is evaluated: If a Predicate is first evaluated, then it will try mathing a Predicate and will fail: If you change the order, then, as you've pointed out, it will work again: But then, this issue will come back #4 (comment) In this case, as abs(a) has the structure of a RelationPredicate, the grammar will incorrectly match it here. In this case, there are 2 solutions:
Imo, the second approach is a bit better, but i'm open to suggestions. What do you think? |
This issue has to do with the way the RA implementation was made. The same error occurs when trying do the same on a RA query: The issue is that, the result schema starts from 1. So because a cross join was made, the P's indexes actually start from 4, because you have the 3 R's columns followed by the 2 columns from s. [t.a, t.b, t.c, p.b, p.d] |
Good to know. It is worth submitting an issue in the main repository. |
@KPMGE just to make sure, have you tried running the app with the changes in #22? If not, please checkout and try this branch: https://github.com/rlaiola/relax/tree/add-iff-op_fix-in-op_cleanup All the exemples seem to be working here. -- Works
{ t | R(t) and abs(a)>3 }
-- Works
{ t | abs(a)>3 and R(t) }
-- Works
{ t | t in R and abs(a)>3 }
-- Works
{ t | abs(a)>3 and t in R }
-- Works
{ t | t ∈ R and abs(a)>3 }
-- Works
{ t | abs(a)>3 and t ∈ R } Thanks! |
It might help testing # clone repo, checkout remote branch and apply changes
git clone https://github.com/KPMGE/relax.git \
&& cd relax \
&& git remote add fork_rlaiola https://github.com/rlaiola/relax.git \
&& git fetch fork_rlaiola fix-bags \
&& git fetch fork_rlaiola add-iff-op_fix-in-op_cleanup \
&& git checkout -b fix-bags fork_rlaiola/fix-bags \
&& git checkout -b add-iff-op_fix-in-op_cleanup fork_rlaiola/add-iff-op_fix-in-op_cleanup \
&& git remote remove fork_rlaiola \
&& git checkout implement-trc \
&& git merge fix-bags -m 'merge 1' \
&& git merge add-iff-op_fix-in-op_cleanup -m 'merge 2'
# open in vs code
code .
# Restore repo
git reset --merge HEAD~2 |
@rlaiola sorry, i think i had a problem with cache or something, i've cleaned the cache and reinstalled the application, now it works! Sorry about that and thanks for the help. You've came up with a very interesting solution! congrats. |
@rlaiola since all issues have been resolved and all limitation have been addressed, do you think this issue can be closed? Thanks for the help. |
Below there are some use cases that need further investigation.
Issues (potentially)
New features
X = { t | R(t) and t.a = 1 }
The text was updated successfully, but these errors were encountered: