-
Notifications
You must be signed in to change notification settings - Fork 242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refine ut framework including Part 1 and Part 2 #10861
refine ut framework including Part 1 and Part 2 #10861
Conversation
Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
build |
import org.apache.spark.sql.catalyst.util.{ArrayData, GenericArrayData, MapData, TypeUtils} | ||
import org.apache.spark.sql.internal.SQLConf | ||
import org.apache.spark.sql.rapids.utils.RapidsQueryTestUtil.isNaNOrInf | ||
import org.apache.spark.sql.types._ | ||
import org.apache.spark.unsafe.types.UTF8String | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: extra line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@@ -121,6 +121,12 @@ object RapidsSQLTestsBaseTrait { | |||
"org.apache.spark.sql.rapids.ExecutionPlanCaptureCallback") | |||
.set("spark.sql.warehouse.dir", warehouse) | |||
.set("spark.sql.cache.serializer", "com.nvidia.spark.ParquetCachedBatchSerializer") | |||
.set("spark.sql.session.timeZone", "UTC") | |||
.set("spark.rapids.sql.explain", "ALL") | |||
// uncomment below config to run `strict mode`, where fallback to CPU is treated as fail |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: do we want a parameterized test suit here for both strict mode
and non-strict mode
associated with different RapidsTestSettings
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot more test cases will fail in strict mode as shown in our report, and we'll need even more resource to take care of the failed test cases in strict mode.
Now that we're still struggling with non-strict mode, the result of strict mode is more of a reference than a guide to our daily priority. So it's not mandatory to do so at present? we can add that anytime when we have enough resource to deal with failed test cases in strict mode .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also we should consider how to parallel execute these two modes to avoid CI taking too much time
1072302
to
c612243
Compare
build |
c612243
to
c0ed014
Compare
build |
Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
c0ed014
to
ada5334
Compare
build |
Having two PRs pending that do very similar things is confusing, especially when one of them already has a lot of active comments. Why not post a separate PR that just has the last commit? Yes, there's a small amount of conflicts to resolve when one goes in before the other, but fairly straightforward to resolve. As it is now, this could get approved and merged without addressing all of the comments in #10851. |
Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
Hi Jason, I also considered your approach, but that would result in a PR against binmahone/spark-rapids instread of NVIDIA/spark-rapids ? But I have to admit these two PRs are indeed consufing.. I had to do this because of the deadline of code freeze. "As it is now, this could get approved and merged without addressing all of the comments in #10851. " -- Now that all of the comments are addressed in this PR (#10861), including your latest comment #10851 (comment), let's go with this PR. If you don't have any other comments, please approve it ASAP so that I can merge it in? |
build |
1 similar comment
build |
Please do not force-push changes to a PR. It makes it very difficult for reviewers to track what has changed since they last looked at it. Instead of rebasing, push a merge commit instead.
No, there's two potential ways to do this.
This is a bit of a mess now, because there are review comments on #10851 that are only addressed here, and none of the review comments are here for posterity. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks OK, but should not be merged until either #10851 is merged first or @revans2 and @gerashegalov who had review comments in #10851 are happy with this version of it.
The criteria is met (#10851 (comment)). I'm merging this PR now |
This PR also closes #10859 |
This PR is a superset of #10851, it closes #10850 and #10859
Compared to #10851 . it has one more additional commit to distinguish two modes for expression based suites: vectorized parameter mode and scalar parameter mode.
For RAPIDS expressions that only support scalar parameters (litOnlyTypes) , we have to use scalar parameter mode. For others we prefer to use vectorized parameter mode.
In case I can't get approval from reviewer because of the additional commit before 24.06 code freeze, we can merge #10851 , otherwise merge this PR.