-
Notifications
You must be signed in to change notification settings - Fork 823
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NEW Add ORM abstraction for "WITH" clauses #10943
NEW Add ORM abstraction for "WITH" clauses #10943
Conversation
491d5b2
to
08ab07e
Compare
feb2260
to
53f3750
Compare
d4eac7c
to
983bd0a
Compare
/** | ||
* Tests that CTE queries have appropriate JOINs for subclass tables etc. | ||
* If `$query->query()->` was replaced with `$query->query->` in DataQuery::with(), this test would throw an exception. | ||
* @doesNotPerformAssertions | ||
*/ | ||
public function testWithUsingDataQueryAppliesRelations() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using @doesNotPerformAssertions
is the correct way to mark that a test passes simply by not throwing an exception. There's another test in the codebase that also does this.
The alternative would be to either:
- add an assertion in here like
$this->assertTrue(true);
which is redundant with the@doesNotPerformAssertions
annotation, or - add an assertion testing the result of the query, which makes it look like we care about the result of the query, which we don't
92b5d80
to
e68f33e
Compare
// Assume it's lower if it's not a proper version number | ||
if (!preg_match('/^(\d+\.){2}\d+$/', $actualVersion)) { | ||
return -1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm doing this for safety against things like a proxy-db which could have a weird version syntax. I'd much rather just assume those don't support it so we can safely use alternative queries rather than throwing exceptions due to unexpected version strings or incorrectly saying it's supported when it isn't.
2ceb05e
to
ca92098
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only looked at the code portion, I haven't looked at tests yet
src/ORM/DataQuery.php
Outdated
* @param string[] $cteFields Aliases for any columns selected in $query which can be referenced in any queries | ||
* UNIONed to the $query and in this query directly, as though they were columns in a real table. | ||
* NOTE: If $query is a DataQuery, then cteFields must be the names of real columns on that DataQuery's data class. | ||
* @param string|string[] $onClause The "ON" clause (escaped SQL statement) for an INNER JOIN on the query. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we should just not have the $onClause param at, seems like we're trying to be helpful by making it a tiny bit easier to add an INNER JOIN at the cost of impurifying this method?
Honestly I think if you're using this API at this point you've probably already written you complex query with a CTE manually using DB::query() and now you're just trying to do it "properly" with more ORM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we should just not have the $onClause param at, seems like we're trying to be helpful by making it a tiny bit easier to add an INNER JOIN at the cost of impurifying this method?
Inner join is (from what I've seen while I was looking into how to implement this) the most common type of join used in CTE queries. i.e. you're using the CTE to filter your result set to include only the records that are discovered by the CTE, and which exist in the main database table you're querying.
So this caters for the most common scenario while allowing alternatives.
Honestly I think if you're using this API at this point you've probably already written you complex query with a CTE manually using DB::query() and now you're just trying to do it "properly" with more ORM
I don't know where that assumption comes from. It hasn't been supported until MySQL 8 so the likelyhood of any DB::query()
using CTEs in existing project codebases is pretty low.
There's a lot of room to use this API in core code now that we're introducing it, and we don't have any purely raw SQL queries using CTEs right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry what I meant is that because CTE's are complex, personally I'd write it out as raw SQL first using DB::query() and then convert it to use the ORM later and validate it against my raw query. I wouldn't start out with alterDataQuery() :-)
I still don't think we should be polluting this method with this inner join "helper", after all you can just go $dataQuery->with(<withParams>)->innerJoin(<innerJoinParams>)
to do the same thing which also reads better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry what I meant is that because CTE's are complex, personally I'd write it out as raw SQL first using DB::query() and then convert it to use the ORM later and validate it against my raw query. I wouldn't start out with alterDataQuery() :-)
Ahh yup that makes sense, fair enough.
I still don't think we should be polluting this method with this inner join "helper", after all you can just go
$dataQuery->with(<withParams>)->innerJoin(<innerJoinParams>)
to do the same thing which also reads better
Inner join will be by far the most common join for a CTE query. Part of the point of having higher level abstractions is that they take care of common scenarios for you, which is what is happening here. So this really is all about convenience.
To put it into perspective:
- The only time you'd want a left join is if you want all of the records in the main query, and are only adding to the result set with the CTE. I can't think of a single scenario where you'd want to to that.
- You'd use a right join only when your CTE is filling in some gaps in your
DataQuery
result set. There's an example of that in the unit tests, where the CTE gets the full date range between the min and max dates in the result set, and adds a bunch of null results for the missing dates. There's really only a handful of cases where you'd want to do something like this, and in probably 90% of those cases you'd more likely just do it directly in PHP instead. - Inner join would be used when you want to effectively filter your result set based on the CTE query. This covers scenarios like getting the ancestors/descendents for a record, or getting all archived versions of records (where finding "wasDeleted" versions in the latest_versions set is the non-recursive CTE, which gets inner joined on ID = version ID).
In the scenarios I've identified so far where we could use CTEs to simplify or optimise queries in our codebase, none of them would use a left or right join.
Regarding the "pollution" of this method, I don't understand which part of it you consider pollution, or how it represents a pollution of the method?
Regarding chaining an innerJoin()
instead of passing in args, one of the main points of having the $onClause
argument is the array syntax, which greatly simplifies the information you need to pass in to create the join in the first place. The join method call would look something like this otherwise: innerJoin(DataObject::getSchema()->sqlColumnForField($dataQuery->dataClass(), 'JoinField') . ' = ' . Convert::symbol2sql([$cteName, 'CTEField'])]);
Yes obviously some of that can be made into variables but it's a lot more fussing about than ['JoinField' => 'CTEField']
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if we keep the $onClause param at very least we kind of need to get rid of array support because it's simply inconsistent with the rest of the ORM, e.g. DataList::innerJoin() which should be more "helpful" given it's the API projects actually use only supports the string param for $onClause not an array param
Once the with() method is down to only supporting string params then at that point all we're doing is $this->query->addInnerJoin($name, $onClause);
so at that point you may as well just go ->with()->innerJoin($onClause)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't understand where you're coming from with this, and especially I don't understand why you keep putting quotes around "helpful" lol. I can't see a world where this isn't useful additional abstraction.
However, given that you also seemingly can't see where I'm coming from, I'll remove this piece of code. Just don't be surprised if in a year's time we have a lot of this
$joinField = DataObject::getSchema()->sqlColumnForField($dataQuery->dataClass(), 'JoinField');
$cteJoinField = Convert::symbol2sql([$cteName, 'CTEField']);
$dataQuery->with(...)->innerJoin("$joinField = $cteJoinField");
that we want to abstract away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also
e.g. DataList::innerJoin() which should be more "helpful" given it's the API projects actually use only supports the string param for $onClause not an array param
I absolutely agree that the DataList
join abstractions are shit and should be made more useful. If adding this array syntax support to those would help you accept this I'd very gladly raise a PR to do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed the $onClause
parameter, though I still say this makes the abstraction considerably less useful.
ca92098
to
4a9d157
Compare
4a9d157
to
900b4c3
Compare
Ugh, now CI is failing lol. I'll take a look at what's going on there. |
931bc08
to
043f610
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there
src/ORM/Queries/SQLSelect.php
Outdated
* @param string[] $cteFields Aliases for any columns selected in $query which can be referenced in any queries | ||
* UNIONed to the $query and in this query directly, as though they were columns in a real table. | ||
*/ | ||
public function addWith(string $name, self $query, array $cteFields = [], bool $recursive = false): static |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public function addWith(string $name, self $query, array $cteFields = [], bool $recursive = false): static | |
public function addWith(string $name, SQLSelect $query, array $cteFields = [], bool $recursive = false): static |
Just make it a little easier to read
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done - you allowed self
as the typehint for union in a recent PR so I've changed that as well to be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a couple of tiny changes to make to not use self
for method param types
043f610
to
94dcd9f
Compare
Note that several times the code or comments in this PR refer to "CTE" - that stands for "Common Table Expression" and is a very very common way to refer to a
WITH
statement in SQL, making it a lot easier to find with a text search since "with" is such a common word.Issue