Fix after rebase #90
Annotations
50 errors
DSV2CharVarcharTestSuite.char type values should be padded or trimmed: partitioned columns:
DSV2CharVarcharTestSuite#L1069
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [t], [], false
== Analyzed Logical Plan ==
i: string, c: string
SubqueryAlias testcat.t
+- Project [i#394703, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c#394704, 5, true, false, true) AS c#394705]
+- RelationV2[i#394703, c#394704] testcat.t testcat.t
== Optimized Logical Plan ==
Project [i#394703, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c#394704, 5, true, false, true) AS c#394705]
+- RelationV2[i#394703, c#394704] testcat.t
== Physical Plan ==
*(1) Project [i#394703, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c#394704, 5, true, false, true) AS c#394705]
+- BatchScan testcat.t[i#394703, c#394704] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 1 == == Spark Answer - 0 ==
struct<> struct<>
![1,a ]
|
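For context, a minimal Scala sketch of the query shape this failure exercises. The catalog wiring is an assumption: the suite registers its own "testcat" (an in-memory v2 catalog from Spark's test jars), and "foo" stands in for its table provider; table and column names mirror the plan above.

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .master("local[*]")
      // Assumed wiring: an in-memory partitioned v2 catalog from Spark's test jars.
      .config("spark.sql.catalog.testcat",
        "org.apache.spark.sql.connector.catalog.InMemoryPartitionTableCatalog")
      .getOrCreate()

    spark.sql("CREATE TABLE testcat.t (i STRING, c CHAR(5)) USING foo PARTITIONED BY (c)")
    spark.sql("INSERT INTO testcat.t VALUES ('1', 'a')")
    // CHAR(5) is padded on read via CharVarcharCodegenUtils.readSidePadding (see the
    // plan above), so the expected row is [1, "a    "]; this run returned zero rows.
    spark.sql("SELECT * FROM testcat.t").show(truncate = false)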
DSV2CharVarcharTestSuite.char type values should not be padded when charVarcharAsString is true:
DSV2CharVarcharTestSuite#L1069
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project ['c]
+- 'Filter ('c = abc)
+- 'UnresolvedRelation [t], [], false
== Analyzed Logical Plan ==
c: string
Project [c#394773]
+- Filter (c#394773 = abc)
+- SubqueryAlias testcat.t
+- RelationV2[a#394771, b#394772, c#394773] testcat.t testcat.t
== Optimized Logical Plan ==
Filter (isnotnull(c#394773) AND (c#394773 = abc))
+- RelationV2[c#394773] testcat.t
== Physical Plan ==
*(1) Project [c#394773]
+- *(1) Filter (isnotnull(c#394773) AND (c#394773 = abc))
+- BatchScan testcat.t[c#394773] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 1 == == Spark Answer - 0 ==
struct<> struct<>
![abc]
|
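The second failure flips the legacy flag that disables char padding. Continuing the sketch above, the behavior under test is roughly:

    // With the legacy flag on, CHAR(5) is treated as plain STRING: no read-side
    // padding, so the predicate c = 'abc' matches the stored value exactly.
    spark.conf.set("spark.sql.legacy.charVarcharAsString", "true")
    spark.sql("CREATE TABLE testcat.t (a INT, b INT, c CHAR(5)) USING foo")
    spark.sql("INSERT INTO testcat.t VALUES (1, 2, 'abc')")
    spark.sql("SELECT c FROM testcat.t WHERE c = 'abc'").show()  // expected: one row, "abc"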
DSV2CharVarcharTestSuite.varchar type values length check and trim: partitioned columns:
DSV2CharVarcharTestSuite#L1069
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [t], [], false
== Analyzed Logical Plan ==
i: string, c: string
SubqueryAlias testcat.t
+- RelationV2[i#394794, c#394795] testcat.t testcat.t
== Optimized Logical Plan ==
RelationV2[i#394794, c#394795] testcat.t
== Physical Plan ==
*(1) Project [i#394794, c#394795]
+- BatchScan testcat.t[i#394794, c#394795] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 1 == == Spark Answer - 0 ==
struct<> struct<>
![1,a]
|
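The varchar variant checks write-side length enforcement. A sketch of the rule, under the same assumed setup:

    // VARCHAR(5) enforces length on write, but excess *trailing spaces* are trimmed
    // rather than rejected; over-length non-space input raises an error instead.
    spark.sql("CREATE TABLE testcat.t (i STRING, c VARCHAR(5)) USING foo PARTITIONED BY (c)")
    spark.sql("INSERT INTO testcat.t VALUES ('1', 'a      ')")  // accepted after trimming
    spark.sql("SELECT * FROM testcat.t").show()                 // expected: [1, a]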
DSV2CharVarcharTestSuite.SPARK-34233: char/varchar with null value for partitioned columns:
DSV2CharVarcharTestSuite#L1069
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [t], [], false
== Analyzed Logical Plan ==
i: string, c: string
SubqueryAlias testcat.t
+- Project [i#394842, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c#394843, 5, true, false, true) AS c#394844]
+- RelationV2[i#394842, c#394843] testcat.t testcat.t
== Optimized Logical Plan ==
Project [i#394842, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c#394843, 5, true, false, true) AS c#394844]
+- RelationV2[i#394842, c#394843] testcat.t
== Physical Plan ==
*(1) Project [i#394842, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c#394843, 5, true, false, true) AS c#394844]
+- BatchScan testcat.t[i#394842, c#394843] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 1 == == Spark Answer - 0 ==
struct<> struct<>
![1,null]
|
DSV2CharVarcharTestSuite.char type comparison: partitioned columns:
DSV2CharVarcharTestSuite#L1069
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(('c1 = a), Some(org.apache.spark.sql.Column$$Lambda$4776/0x00007f2861726418@e2f3ab7)), unresolvedalias((a = 'c1), Some(org.apache.spark.sql.Column$$Lambda$4776/0x00007f2861726418@e2f3ab7)), unresolvedalias(('c1 = a ), Some(org.apache.spark.sql.Column$$Lambda$4776/0x00007f2861726418@e2f3ab7)), unresolvedalias(('c1 > a), Some(org.apache.spark.sql.Column$$Lambda$4776/0x00007f2861726418@e2f3ab7)), unresolvedalias('c1 IN (a,b), Some(org.apache.spark.sql.Column$$Lambda$4776/0x00007f2861726418@e2f3ab7)), unresolvedalias(('c1 = 'c2), Some(org.apache.spark.sql.Column$$Lambda$4776/0x00007f2861726418@e2f3ab7)), unresolvedalias(('c1 < 'c2), Some(org.apache.spark.sql.Column$$Lambda$4776/0x00007f2861726418@e2f3ab7)), unresolvedalias('c1 IN ('c2), Some(org.apache.spark.sql.Column$$Lambda$4776/0x00007f2861726418@e2f3ab7)), unresolvedalias(('c1 <=> null), Some(org.apache.spark.sql.Column$$Lambda$4776/0x00007f2861726418@e2f3ab7))]
+- SubqueryAlias testcat.t
+- Project [i#396439, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c1#396440, 2, true, false, true) AS c1#396442, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c2#396441, 5, true, false, true) AS c2#396443]
+- RelationV2[i#396439, c1#396440, c2#396441] testcat.t testcat.t
== Analyzed Logical Plan ==
(c1 = a): boolean, (a = c1): boolean, (c1 = a ): boolean, (c1 > a): boolean, (c1 IN (a, b)): boolean, (c1 = c2): boolean, (c1 < c2): boolean, (c1 IN (c2)): boolean, (c1 <=> NULL): boolean
Project [(c1#396442 = rpad(a, 2, )) AS (c1 = a)#396447, (rpad(a, 2, ) = c1#396442) AS (a = c1)#396448, (rpad(c1#396442, 3, ) = a ) AS (c1 = a )#396449, (c1#396442 > rpad(a, 2, )) AS (c1 > a)#396450, c1#396442 IN (rpad(a, 2, ),rpad(b, 2, )) AS (c1 IN (a, b))#396451, (rpad(c1#396442, 5, ) = c2#396443) AS (c1 = c2)#396452, (rpad(c1#396442, 5, ) < c2#396443) AS (c1 < c2)#396453, rpad(c1#396442, 5, ) IN (c2#396443) AS (c1 IN (c2))#396454, (c1#396442 <=> null) AS (c1 <=> NULL)#396455]
+- SubqueryAlias testcat.t
+- Project [i#396439, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c1#396440, 2, true, false, true) AS c1#396442, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c2#396441, 5, true, false, true) AS c2#396443]
+- RelationV2[i#396439, c1#396440, c2#396441] testcat.t testcat.t
== Optimized Logical Plan ==
Project [(c1#396442 = a ) AS (c1 = a)#396447, (a = c1#396442) AS (a = c1)#396448, (rpad(c1#396442, 3, ) = a ) AS (c1 = a )#396449, (c1#396442 > a ) AS (c1 > a)#396450, c1#396442 IN (a ,b ) AS (c1 IN (a, b))#396451, (rpad(c1#396442, 5, ) = c2#396443) AS (c1 = c2)#396452, (rpad(c1#396442, 5, ) < c2#396443) AS (c1 < c2)#396453, rpad(c1#396442, 5, ) IN (c2#396443) AS (c1 IN (c2))#396454, isnull(c1#396442) AS (c1 <=> NULL)#396455]
+- Project [staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c1#396440, 2, true, false, true) AS c1#396442, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c2#396441, 5, true, false, true) AS c2#396443]
+- RelationV2[c1#396440, c2#396441] testcat.t
== Physical Plan ==
*(1) Project [(c1#396442 = a ) AS (c1 = a)#396447, (a = c1#396442) AS (a = c1)#396448, (rpad(c1#396442, 3, ) = a ) AS (c1 = a )#396449, (c1#396442 > a ) AS (c1 > a)#396450, c1#396442 IN (a ,b ) AS (c1 IN (a, b))#396451, (rpad(c1#396442, 5, ) = c2#396443) AS (c1 = c2)#396452, (rpad(c1#396442, 5, ) < c2#396443) AS (c1 < c2)#396453, rpad(c1#396442, 5, ) IN (c2#396443) AS (c1 IN (c2))#396454, isnull(c1#396442) AS (c1 <=> NULL)#396455]
+- *(1) Project [staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c1#396440, 2, true, false, true) AS c1#396442, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c2#396441, 5, true, false, true) AS c2#396443]
+- BatchScan testcat.t[c1#396440, c2#396441] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 1 == == Spark Answer - 0 ==
struct<> struct<>
![true,true,true,false,true,true,false,true,false]
|
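The analyzed plan above spells out the comparison semantics: the narrower CHAR operand is rpad'ed to the wider length before comparing. The same rule in plain Scala (illustrative only, not Spark's implementation):

    // CHAR comparison pads both operands with spaces to the wider CHAR width first,
    // which is why c1 CHAR(2) = 'a' becomes (c1 = rpad('a', 2, ' ')) in the plan.
    def charEquals(a: String, widthA: Int, b: String, widthB: Int): Boolean = {
      val n = math.max(widthA, widthB)
      a.padTo(n, ' ') == b.padTo(n, ' ')
    }
    charEquals("a", 2, "a", 1)   // true: both padded to "a "
    charEquals("a", 2, "ab", 2)  // false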
DSV2CharVarcharTestSuite.SPARK-34233: char type comparison with null values:
DSV2CharVarcharTestSuite#L1069
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias('c IN (e,null), Some(org.apache.spark.sql.Column$$Lambda$4776/0x00007f2861726418@e2f3ab7))]
+- SubqueryAlias testcat.t
+- Project [i#396527, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c#396528, 2, true, false, true) AS c#396529]
+- RelationV2[i#396527, c#396528] testcat.t testcat.t
== Analyzed Logical Plan ==
(c IN (e, NULL)): boolean
Project [cast(c#396529 as string) IN (cast(e as string),cast(null as string)) AS (c IN (e, NULL))#396537]
+- SubqueryAlias testcat.t
+- Project [i#396527, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c#396528, 2, true, false, true) AS c#396529]
+- RelationV2[i#396527, c#396528] testcat.t testcat.t
== Optimized Logical Plan ==
Project [staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c#396528, 2, true, false, true) IN (e,null) AS (c IN (e, NULL))#396537]
+- RelationV2[c#396528] testcat.t
== Physical Plan ==
*(1) Project [staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c#396528, 2, true, false, true) IN (e,null) AS (c IN (e, NULL))#396537]
+- BatchScan testcat.t[c#396528] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 1 == == Spark Answer - 0 ==
struct<> struct<>
![null]
|
DSV2CharVarcharTestSuite.char/varchar type values length check: partitioned columns of other types:
DSV2CharVarcharTestSuite#L1079
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [t], [], false
== Analyzed Logical Plan ==
i: string, c: string
SubqueryAlias testcat.t
+- Project [i#398623, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c#398624, 5, true, false, true) AS c#398625]
+- RelationV2[i#398623, c#398624] testcat.t testcat.t
== Optimized Logical Plan ==
Project [i#398623, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c#398624, 5, true, false, true) AS c#398625]
+- RelationV2[i#398623, c#398624] testcat.t
== Physical Plan ==
*(1) Project [i#398623, staticinvoke(class org.apache.spark.sql.catalyst.util.CharVarcharCodegenUtils, StringType, readSidePadding, c#398624, 5, true, false, true) AS c#398625]
+- BatchScan testcat.t[i#398623, c#398624] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 1 == == Spark Answer - 0 ==
struct<> struct<>
![1,1 ]
|
DSV2SQLInsertTestSuite.insert with column list - follow table output order + partitioned table:
DSV2SQLInsertTestSuite#L544
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [t1], [], false
== Analyzed Logical Plan ==
c1: int, c2: int, c3: int, c4: int
SubqueryAlias testcat.t1
+- RelationV2[c1#92977, c2#92978, c3#92979, c4#92980] testcat.t1 testcat.t1
== Optimized Logical Plan ==
RelationV2[c1#92977, c2#92978, c3#92979, c4#92980] testcat.t1
== Physical Plan ==
*(1) Project [c1#92977, c2#92978, c3#92979, c4#92980]
+- BatchScan testcat.t1[c1#92977, c2#92978, c3#92979, c4#92980] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 1 == == Spark Answer - 0 ==
!struct<c1:int,c2:int,c3:int,c4:int> struct<>
![1,2,3,4]
|
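The four column-list failures below all reduce to the same SQL shape. A sketch under the same assumed testcat setup:

    spark.sql("CREATE TABLE testcat.t1 (c1 INT, c2 INT, c3 INT, c4 INT) " +
      "USING foo PARTITIONED BY (c4)")
    // Column list matching table output order; the other variants reorder it
    // or resolve by name.
    spark.sql("INSERT INTO testcat.t1 (c1, c2, c3, c4) VALUES (1, 2, 3, 4)")
    spark.sql("SELECT * FROM testcat.t1").show()  // expected: [1,2,3,4]; the runs saw zero rows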
DSV2SQLInsertTestSuite.insert with column list - by name + partitioned table:
DSV2SQLInsertTestSuite#L544
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [t1], [], false
== Analyzed Logical Plan ==
c1: int, c2: int, c3: int, c4: int
SubqueryAlias testcat.t1
+- RelationV2[c1#93238, c2#93239, c3#93240, c4#93241] testcat.t1 testcat.t1
== Optimized Logical Plan ==
RelationV2[c1#93238, c2#93239, c3#93240, c4#93241] testcat.t1
== Physical Plan ==
*(1) Project [c1#93238, c2#93239, c3#93240, c4#93241]
+- BatchScan testcat.t1[c1#93238, c2#93239, c3#93240, c4#93241] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 1 == == Spark Answer - 0 ==
!struct<c1:int,c2:int,c3:int,c4:int> struct<>
![1,2,3,4]
|
DSV2SQLInsertTestSuite.insert overwrite with column list - by name + partitioned table:
DSV2SQLInsertTestSuite#L544
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [t1], [], false
== Analyzed Logical Plan ==
c1: int, c2: int, c3: int, c4: int
SubqueryAlias testcat.t1
+- RelationV2[c1#93438, c2#93439, c3#93440, c4#93441] testcat.t1 testcat.t1
== Optimized Logical Plan ==
RelationV2[c1#93438, c2#93439, c3#93440, c4#93441] testcat.t1
== Physical Plan ==
*(1) Project [c1#93438, c2#93439, c3#93440, c4#93441]
+- BatchScan testcat.t1[c1#93438, c2#93439, c3#93440, c4#93441] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 1 == == Spark Answer - 0 ==
!struct<c2:int,c1:int,c3:int,c4:int> struct<>
![2,1,3,4]
|
DSV2SQLInsertTestSuite.insert with column list - table output reorder + partitioned table:
DSV2SQLInsertTestSuite#L544
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [t1], [], false
== Analyzed Logical Plan ==
c1: int, c2: int, c3: int, c4: int
SubqueryAlias testcat.t1
+- RelationV2[c1#93567, c2#93568, c3#93569, c4#93570] testcat.t1 testcat.t1
== Optimized Logical Plan ==
RelationV2[c1#93567, c2#93568, c3#93569, c4#93570] testcat.t1
== Physical Plan ==
*(1) Project [c1#93567, c2#93568, c3#93569, c4#93570]
+- BatchScan testcat.t1[c1#93567, c2#93568, c3#93569, c4#93570] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 1 == == Spark Answer - 0 ==
!struct<c4:int,c3:int,c2:int,c1:int> struct<>
![4,3,2,1]
|
DSV2SQLInsertTestSuite.SPARK-34223: static partition with null raise NPE:
DSV2SQLInsertTestSuite#L544
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [t], [], false
== Analyzed Logical Plan ==
i: string, c: string
SubqueryAlias testcat.t
+- RelationV2[i#93658, c#93659] testcat.t testcat.t
== Optimized Logical Plan ==
RelationV2[i#93658, c#93659] testcat.t
== Physical Plan ==
*(1) Project [i#93658, c#93659]
+- BatchScan testcat.t[i#93658, c#93659] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 1 == == Spark Answer - 0 ==
struct<> struct<>
![1,null]
|
DSV2SQLInsertTestSuite.SPARK-33474: Support typed literals as partition spec values:
DSV2SQLInsertTestSuite#L544
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project ['name, unresolvedalias(cast('part1 as string), None), unresolvedalias(cast('part2 as string), None), unresolvedalias(cast('part3 as string), None), 'part4, 'part5, 'part6, 'part7]
+- 'UnresolvedRelation [t1], [], false
== Analyzed Logical Plan ==
name: string, part1: string, part2: string, part3: string, part4: string, part5: string, part6: string, part7: string
Project [name#93701, cast(part1#93702 as string) AS part1#93709, cast(part2#93703 as string) AS part2#93711, cast(part3#93704 as string) AS part3#93710, part4#93705, part5#93706, part6#93707, part7#93708]
+- SubqueryAlias testcat.t1
+- RelationV2[name#93701, part1#93702, part2#93703, part3#93704, part4#93705, part5#93706, part6#93707, part7#93708] testcat.t1 testcat.t1
== Optimized Logical Plan ==
Project [name#93701, cast(part1#93702 as string) AS part1#93709, cast(part2#93703 as string) AS part2#93711, cast(part3#93704 as string) AS part3#93710, part4#93705, part5#93706, part6#93707, part7#93708]
+- RelationV2[name#93701, part1#93702, part2#93703, part3#93704, part4#93705, part5#93706, part6#93707, part7#93708] testcat.t1
== Physical Plan ==
*(1) Project [name#93701, cast(part1#93702 as string) AS part1#93709, cast(part2#93703 as string) AS part2#93711, cast(part3#93704 as string) AS part3#93710, part4#93705, part5#93706, part6#93707, part7#93708]
+- BatchScan testcat.t1[name#93701, part1#93702, part2#93703, part3#93704, part4#93705, part5#93706, part6#93707, part7#93708] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 1 == == Spark Answer - 0 ==
struct<> struct<>
![a,2019-01-01,2019-01-01 11:11:11,Spark SQL,p1,2019-01-01,2019-01-01 11:11:11,Spark SQL]
|
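SPARK-33474 is about typed literals in a static partition spec. A simplified sketch with a smaller hypothetical table (testcat.pt) than the suite's t1:

    // date'...' and timestamp'...' literals (and X'...' binary) are valid
    // static partition values.
    spark.sql("CREATE TABLE testcat.pt (name STRING, part1 DATE, part2 TIMESTAMP) " +
      "USING foo PARTITIONED BY (part1, part2)")
    spark.sql("INSERT INTO testcat.pt PARTITION (part1 = date'2019-01-01', " +
      "part2 = timestamp'2019-01-01 11:11:11') VALUES ('a')")
    // Reading back casts the partition values for display: [a, 2019-01-01, 2019-01-01 11:11:11]
    spark.sql("SELECT name, CAST(part1 AS STRING), CAST(part2 AS STRING) FROM testcat.pt").show()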
DSV2SQLInsertTestSuite.SPARK-34556: checking duplicate static partition columns should respect case sensitive conf:
DSV2SQLInsertTestSuite#L544
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [t], [], false
== Analyzed Logical Plan ==
i: int, c: string, C: string
SubqueryAlias testcat.t
+- RelationV2[i#93803, c#93804, C#93805] testcat.t testcat.t
== Optimized Logical Plan ==
RelationV2[i#93803, c#93804, C#93805] testcat.t
== Physical Plan ==
*(1) Project [i#93803, c#93804, C#93805]
+- BatchScan testcat.t[i#93803, c#93804, C#93805] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 1 == == Spark Answer - 0 ==
struct<> struct<>
![1,2,3]
|
DSV2SQLInsertTestSuite.SPARK-41982: treat the partition field as string literal when keepPartitionSpecAsStringLiteral is enabled:
DSV2SQLInsertTestSuite#L544
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [*]
+- 'Filter ('dt = 08)
+- 'UnresolvedRelation [t], [], false
== Analyzed Logical Plan ==
i: string, j: int, dt: string
Project [i#93858, j#93859, dt#93860]
+- Filter (dt#93860 = 08)
+- SubqueryAlias testcat.t
+- RelationV2[i#93858, j#93859, dt#93860] testcat.t testcat.t
== Optimized Logical Plan ==
Filter (isnotnull(dt#93860) AND (dt#93860 = 08))
+- RelationV2[i#93858, j#93859, dt#93860] testcat.t
== Physical Plan ==
*(1) Project [i#93858, j#93859, dt#93860]
+- *(1) Filter (isnotnull(dt#93860) AND (dt#93860 = 08))
+- BatchScan testcat.t[i#93858, j#93859, dt#93860] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 1 == == Spark Answer - 0 ==
struct<> struct<>
![a,10,08]
|
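The SPARK-41982 test drives a legacy flag; a sketch of the intent (the conf name here is my reading of the SPARK-41982 change, so treat it as an assumption):

    // With the flag on, the unquoted partition value 08 is kept as the string "08"
    // instead of being parsed as the integer 8, so dt = '08' finds the row.
    spark.conf.set("spark.sql.legacy.keepPartitionSpecAsStringLiteral", "true")
    spark.sql("CREATE TABLE testcat.t (i STRING, j INT, dt STRING) USING foo PARTITIONED BY (dt)")
    spark.sql("INSERT INTO testcat.t PARTITION (dt = 08) VALUES ('a', 10)")
    spark.sql("SELECT * FROM testcat.t WHERE dt = '08'").show()  // expected: [a,10,08]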
DataFrameWriterV2Suite.Overwrite: overwrite by expression: true:
DataFrameWriterV2Suite#L44
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [testcat, table_name], [], false
== Analyzed Logical Plan ==
id: bigint, data: string
SubqueryAlias testcat.table_name
+- RelationV2[id#553066L, data#553067] testcat.table_name testcat.table_name
== Optimized Logical Plan ==
RelationV2[id#553066L, data#553067] testcat.table_name
== Physical Plan ==
*(1) Project [id#553066L, data#553067]
+- BatchScan testcat.table_name[id#553066L, data#553067] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 3 == == Spark Answer - 0 ==
struct<> struct<>
![1,a]
![2,b]
![3,c]
|
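The DataFrameWriterV2Suite failures all follow the write-then-read pattern below. A sketch: overwrite(lit(true)) matches this first test's name, while the others vary the condition or use overwritePartitions/replace; it assumes testcat.table_name already exists.

    import org.apache.spark.sql.functions.lit
    import spark.implicits._

    val df = Seq((1L, "a"), (2L, "b"), (3L, "c")).toDF("id", "data")
    // Overwrite rows matching the condition (here: everything), then write df.
    df.writeTo("testcat.table_name").overwrite(lit(true))
    // The failing assertion is the read-back: expected (1,a), (2,b), (3,c), got no rows.
    spark.table("testcat.table_name").show()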
DataFrameWriterV2Suite.Overwrite: overwrite by expression: id = 3:
DataFrameWriterV2Suite#L44
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [testcat, table_name], [], false
== Analyzed Logical Plan ==
id: bigint, data: string
SubqueryAlias testcat.table_name
+- RelationV2[id#553140L, data#553141] testcat.table_name testcat.table_name
== Optimized Logical Plan ==
RelationV2[id#553140L, data#553141] testcat.table_name
== Physical Plan ==
*(1) Project [id#553140L, data#553141]
+- BatchScan testcat.table_name[id#553140L, data#553141] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 3 == == Spark Answer - 0 ==
struct<> struct<>
![1,a]
![2,b]
![3,c]
|
DataFrameWriterV2Suite.OverwritePartitions: overwrite conflicting partitions:
DataFrameWriterV2Suite#L44
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [testcat, table_name], [], false
== Analyzed Logical Plan ==
id: bigint, data: string
SubqueryAlias testcat.table_name
+- RelationV2[id#553455L, data#553456] testcat.table_name testcat.table_name
== Optimized Logical Plan ==
RelationV2[id#553455L, data#553456] testcat.table_name
== Physical Plan ==
*(1) Project [id#553455L, data#553456]
+- BatchScan testcat.table_name[id#553455L, data#553456] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 3 == == Spark Answer - 0 ==
struct<> struct<>
![1,a]
![2,b]
![3,c]
|
DataFrameWriterV2Suite.Create: identity partitioned table:
DataFrameWriterV2Suite#L44
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [testcat, table_name], [], false
== Analyzed Logical Plan ==
id: bigint, data: string
SubqueryAlias testcat.table_name
+- RelationV2[id#553994L, data#553995] testcat.table_name testcat.table_name
== Optimized Logical Plan ==
RelationV2[id#553994L, data#553995] testcat.table_name
== Physical Plan ==
*(1) Project [id#553994L, data#553995]
+- BatchScan testcat.table_name[id#553994L, data#553995] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 3 == == Spark Answer - 0 ==
struct<> struct<>
![1,a]
![2,b]
![3,c]
|
DataFrameWriterV2Suite.Replace: basic behavior:
DataFrameWriterV2Suite#L44
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [testcat, table_name], [], false
== Analyzed Logical Plan ==
id: bigint, data: string
SubqueryAlias testcat.table_name
+- RelationV2[id#554214L, data#554215] testcat.table_name testcat.table_name
== Optimized Logical Plan ==
RelationV2[id#554214L, data#554215] testcat.table_name
== Physical Plan ==
*(1) Project [id#554214L, data#554215]
+- BatchScan testcat.table_name[id#554214L, data#554215] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 3 == == Spark Answer - 0 ==
struct<> struct<>
![1,a]
![2,b]
![3,c]
|
DataFrameWriterV2Suite.Replace: partitioned table:
DataFrameWriterV2Suite#L44
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [testcat, table_name], [], false
== Analyzed Logical Plan ==
id: bigint, data: string, even_or_odd: string
SubqueryAlias testcat.table_name
+- RelationV2[id#554298L, data#554299, even_or_odd#554300] testcat.table_name testcat.table_name
== Optimized Logical Plan ==
RelationV2[id#554298L, data#554299, even_or_odd#554300] testcat.table_name
== Physical Plan ==
*(1) Project [id#554298L, data#554299, even_or_odd#554300]
+- BatchScan testcat.table_name[id#554298L, data#554299, even_or_odd#554300] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 3 == == Spark Answer - 0 ==
struct<> struct<>
![4,d,even]
![5,e,odd]
![6,f,even]
|
DataFrameWriterV2Suite.CreateOrReplace: table exists:
DataFrameWriterV2Suite#L44
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'UnresolvedRelation [testcat, table_name], [], false
== Analyzed Logical Plan ==
id: bigint, data: string
SubqueryAlias testcat.table_name
+- RelationV2[id#554421L, data#554422] testcat.table_name testcat.table_name
== Optimized Logical Plan ==
RelationV2[id#554421L, data#554422] testcat.table_name
== Physical Plan ==
*(1) Project [id#554421L, data#554422]
+- BatchScan testcat.table_name[id#554421L, data#554422] class org.apache.spark.sql.connector.catalog.InMemoryBaseTable$InMemoryBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 3 == == Spark Answer - 0 ==
struct<> struct<>
![1,a]
![2,b]
![3,c]
|
DynamicPartitionPruningV2FilterSuiteAEOff.simple inner join triggers DPP with mock-up tables:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 2 partition values
|
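Every SparkException in this suite reports the same broken invariant: a runtime filter may only shrink the set of partition values a scan reports, and here the "before" count is 0, so any nonempty "after" set trips the check. A minimal sketch of that invariant (a plain-Scala mirror of the contract, not Spark's code):

    // Runtime filtering contract: the post-filter partition values must be a
    // subset of the pre-filter ones (same size or smaller, never new values).
    def checkRuntimeFiltering(before: Set[String], after: Set[String]): Unit =
      require(after.subsetOf(before),
        s"During runtime filtering, data source must either report the same number " +
          s"of partition values, or a subset. Before: ${before.size}. After: ${after.size}")

    checkRuntimeFiltering(Set("p1", "p2"), Set("p1"))     // fine: subset
    // checkRuntimeFiltering(Set.empty, Set("p1", "p2"))  // fails, like the runs above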
DynamicPartitionPruningV2FilterSuiteAEOff.DPP triggers only for certain types of query:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 2 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.filtering ratio policy fallback:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 25 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.filtering ratio policy with stats when the broadcast pruning is disabled:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 1 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.partition pruning in broadcast hash joins with aliases:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 1 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.partition pruning in broadcast hash joins:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project ['f.date_id, 'f.product_id, 'f.units_sold, 'f.store_id]
+- 'Filter ('s.country = DE)
+- 'Join Inner, ('f.store_id = 's.store_id)
:- 'SubqueryAlias f
: +- 'UnresolvedRelation [fact_stats], [], false
+- 'SubqueryAlias s
+- 'UnresolvedRelation [dim_stats], [], false
== Analyzed Logical Plan ==
date_id: int, product_id: int, units_sold: int, store_id: int
Project [date_id#84380, product_id#84382, units_sold#84383, store_id#84381]
+- Filter (country#84386 = DE)
+- Join Inner, (store_id#84381 = store_id#84384)
:- SubqueryAlias f
: +- SubqueryAlias testcat.fact_stats
: +- RelationV2[date_id#84380, store_id#84381, product_id#84382, units_sold#84383] testcat.fact_stats testcat.fact_stats
+- SubqueryAlias s
+- SubqueryAlias testcat.dim_stats
+- RelationV2[store_id#84384, state_province#84385, country#84386] testcat.dim_stats testcat.dim_stats
== Optimized Logical Plan ==
Project [date_id#84380, product_id#84382, units_sold#84383, store_id#84381]
+- Join Inner, (store_id#84381 = store_id#84384)
:- Filter (isnotnull(store_id#84381) AND dynamicpruning#84408 [store_id#84381])
: : +- Project [store_id#84384]
: : +- Filter ((isnotnull(country#84386) AND (country#84386 = DE)) AND isnotnull(store_id#84384))
: : +- RelationV2[store_id#84384, state_province#84385, country#84386] testcat.dim_stats
: +- RelationV2[date_id#84380, store_id#84381, product_id#84382, units_sold#84383] testcat.fact_stats
+- Project [store_id#84384]
+- Filter ((isnotnull(country#84386) AND (country#84386 = DE)) AND isnotnull(store_id#84384))
+- RelationV2[store_id#84384, state_province#84385, country#84386] testcat.dim_stats
== Physical Plan ==
*(2) Project [date_id#84380, product_id#84382, units_sold#84383, store_id#84381]
+- *(2) BroadcastHashJoin [store_id#84381], [store_id#84384], Inner, BuildRight, false
:- *(2) Project [date_id#84380, store_id#84381, product_id#84382, units_sold#84383]
: +- *(2) Filter isnotnull(store_id#84381)
: +- BatchScan testcat.fact_stats[date_id#84380, store_id#84381, product_id#84382, units_sold#84383] class org.apache.spark.sql.connector.catalog.InMemoryTableWithV2Filter$InMemoryV2FilterBatchScan RuntimeFilters: [dynamicpruningexpression(true)]
+- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=115768]
+- *(1) Project [store_id#84384]
+- *(1) Filter ((isnotnull(country#84386) AND (country#84386 = DE)) AND isnotnull(store_id#84384))
+- BatchScan testcat.dim_stats[store_id#84384, state_province#84385, country#84386] class org.apache.spark.sql.connector.catalog.InMemoryTableWithV2Filter$InMemoryV2FilterBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 4 == == Spark Answer - 0 ==
struct<> struct<>
![1030,2,10,3]
![1040,2,50,3]
![1050,2,50,3]
![1060,2,50,3]
|
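For reference, the query shape behind the DPP tests here (names lifted from the parsed plan above; the tables are the suite's mock-up v2 tables):

    // A selective filter on the dim table should become a dynamic partition-pruning
    // filter on fact_stats.store_id (the RuntimeFilters entry in the BatchScan above).
    spark.sql(
      """SELECT f.date_id, f.product_id, f.units_sold, f.store_id
        |FROM fact_stats f JOIN dim_stats s ON f.store_id = s.store_id
        |WHERE s.country = 'DE'""".stripMargin).show()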
DynamicPartitionPruningV2FilterSuiteAEOff.broadcast a single key in a HashedRelation:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 100 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.broadcast multiple keys in a LongHashedRelation:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 100 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.broadcast multiple keys in an UnsafeHashedRelation:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 100 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.different broadcast subqueries with identical children:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 100 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.avoid reordering broadcast join keys to match input hash partitioning:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 7 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.dynamic partition pruning ambiguity issue across nested joins:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
java.util.concurrent.ExecutionException: org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 4 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.Make sure dynamic pruning works on uncorrelated queries:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: Exception thrown in awaitResult:
|
DynamicPartitionPruningV2FilterSuiteAEOff.SPARK-32509: Unused Dynamic Pruning filter shouldn't affect canonicalization and exchange reuse:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
CTE [view1]
: +- 'SubqueryAlias view1
: +- 'Project ['f.product_id]
: +- 'Filter ('f.units_sold = 70)
: +- 'SubqueryAlias f
: +- 'UnresolvedRelation [fact_stats], [], false
+- 'Project [*]
+- 'Filter ('v1.product_id = 'v2.product_id)
+- 'Join Inner
:- 'SubqueryAlias v1
: +- 'UnresolvedRelation [view1], [], false
+- 'SubqueryAlias v2
+- 'UnresolvedRelation [view1], [], false
== Analyzed Logical Plan ==
product_id: int, product_id: int
WithCTE
:- CTERelationDef 27, false
: +- SubqueryAlias view1
: +- Project [product_id#85875]
: +- Filter (units_sold#85876 = 70)
: +- SubqueryAlias f
: +- SubqueryAlias testcat.fact_stats
: +- RelationV2[date_id#85873, store_id#85874, product_id#85875, units_sold#85876] testcat.fact_stats testcat.fact_stats
+- Project [product_id#85875, product_id#85879]
+- Filter (product_id#85875 = product_id#85879)
+- Join Inner
:- SubqueryAlias v1
: +- SubqueryAlias view1
: +- CTERelationRef 27, true, [product_id#85875]
+- SubqueryAlias v2
+- SubqueryAlias view1
+- CTERelationRef 27, true, [product_id#85879]
== Optimized Logical Plan ==
Join Inner, (product_id#85875 = product_id#85888)
:- Project [product_id#85875]
: +- Filter ((isnotnull(units_sold#85876) AND (units_sold#85876 = 70)) AND isnotnull(product_id#85875))
: +- RelationV2[date_id#85873, store_id#85874, product_id#85875, units_sold#85876] testcat.fact_stats
+- Project [product_id#85888]
+- Filter ((isnotnull(units_sold#85889) AND (units_sold#85889 = 70)) AND isnotnull(product_id#85888))
+- RelationV2[date_id#85886, store_id#85887, product_id#85888, units_sold#85889] testcat.fact_stats
== Physical Plan ==
*(5) SortMergeJoin [product_id#85875], [product_id#85888], Inner
:- *(2) Sort [product_id#85875 ASC NULLS FIRST], false, 0
: +- Exchange hashpartitioning(product_id#85875, 5), ENSURE_REQUIREMENTS, [plan_id=118793]
: +- *(1) Project [product_id#85875]
: +- *(1) Filter ((isnotnull(units_sold#85876) AND (units_sold#85876 = 70)) AND isnotnull(product_id#85875))
: +- BatchScan testcat.fact_stats[date_id#85873, store_id#85874, product_id#85875, units_sold#85876] class org.apache.spark.sql.connector.catalog.InMemoryTableWithV2Filter$InMemoryV2FilterBatchScan RuntimeFilters: []
+- *(4) Sort [product_id#85888 ASC NULLS FIRST], false, 0
+- ReusedExchange [product_id#85888], Exchange hashpartitioning(product_id#85875, 5), ENSURE_REQUIREMENTS, [plan_id=118793]
== Results ==
!== Correct Answer - 1 == == Spark Answer - 0 ==
struct<> struct<>
![3,3]
|
DynamicPartitionPruningV2FilterSuiteAEOff.Plan broadcast pruning only when the broadcast can be reused:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project ['f.date_id, 'f.store_id, 'f.product_id, 'f.units_sold]
+- 'Filter ('f.date_id <= 1030)
+- 'Join Inner, ('f.store_id = 's.store_id)
:- 'SubqueryAlias f
: +- 'UnresolvedRelation [fact_np], [], false
+- 'SubqueryAlias s
+- 'UnresolvedRelation [code_stats], [], false
== Analyzed Logical Plan ==
date_id: int, store_id: int, product_id: int, units_sold: int
Project [date_id#85958, store_id#85959, product_id#85960, units_sold#85961]
+- Filter (date_id#85958 <= 1030)
+- Join Inner, (store_id#85959 = store_id#85962)
:- SubqueryAlias f
: +- SubqueryAlias testcat.fact_np
: +- RelationV2[date_id#85958, store_id#85959, product_id#85960, units_sold#85961] testcat.fact_np testcat.fact_np
+- SubqueryAlias s
+- SubqueryAlias testcat.code_stats
+- RelationV2[store_id#85962, code#85963] testcat.code_stats testcat.code_stats
== Optimized Logical Plan ==
Project [date_id#85958, store_id#85959, product_id#85960, units_sold#85961]
+- Join Inner, (store_id#85959 = store_id#85962)
:- Filter ((isnotnull(date_id#85958) AND (date_id#85958 <= 1030)) AND isnotnull(store_id#85959))
: +- RelationV2[date_id#85958, store_id#85959, product_id#85960, units_sold#85961] testcat.fact_np
+- Project [store_id#85962]
+- Filter (isnotnull(store_id#85962) AND dynamicpruning#85983 [store_id#85962])
: +- Filter ((isnotnull(date_id#85958) AND (date_id#85958 <= 1030)) AND isnotnull(store_id#85959))
: +- RelationV2[date_id#85958, store_id#85959, product_id#85960, units_sold#85961] testcat.fact_np
+- RelationV2[store_id#85962, code#85963] testcat.code_stats
== Physical Plan ==
*(2) Project [date_id#85958, store_id#85959, product_id#85960, units_sold#85961]
+- *(2) BroadcastHashJoin [store_id#85959], [store_id#85962], Inner, BuildRight, false
:- *(2) Project [date_id#85958, store_id#85959, product_id#85960, units_sold#85961]
: +- *(2) Filter ((isnotnull(date_id#85958) AND (date_id#85958 <= 1030)) AND isnotnull(store_id#85959))
: +- BatchScan testcat.fact_np[date_id#85958, store_id#85959, product_id#85960, units_sold#85961] class org.apache.spark.sql.connector.catalog.InMemoryTableWithV2Filter$InMemoryV2FilterBatchScan RuntimeFilters: []
+- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=118958]
+- *(1) Project [store_id#85962]
+- *(1) Filter isnotnull(store_id#85962)
+- BatchScan testcat.code_stats[store_id#85962, code#85963] class org.apache.spark.sql.connector.catalog.InMemoryTableWithV2Filter$InMemoryV2FilterBatchScan RuntimeFilters: [dynamicpruningexpression(true)]
== Results ==
!== Correct Answer - 4 == == Spark Answer - 0 ==
struct<> struct<>
![1000,1,1,10]
![1010,2,1,10]
![1020,2,1,10]
![1030,3,2,10]
|
DynamicPartitionPruningV2FilterSuiteAEOff.SPARK-32659: Fix the data issue when pruning DPP on non-atomic type:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project ['f.date_id, 'f.product_id, 'f.units_sold, 'f.store_id]
+- 'Filter ('s.country = DE)
+- 'Join Inner, (struct(store_id, 'f.store_id) = struct(store_id, 's.store_id))
:- 'SubqueryAlias f
: +- 'UnresolvedRelation [fact_stats], [], false
+- 'SubqueryAlias s
+- 'UnresolvedRelation [dim_stats], [], false
== Analyzed Logical Plan ==
date_id: int, product_id: int, units_sold: int, store_id: int
Project [date_id#86034, product_id#86036, units_sold#86037, store_id#86035]
+- Filter (country#86040 = DE)
+- Join Inner, (struct(store_id, store_id#86035) = struct(store_id, store_id#86038))
:- SubqueryAlias f
: +- SubqueryAlias testcat.fact_stats
: +- RelationV2[date_id#86034, store_id#86035, product_id#86036, units_sold#86037] testcat.fact_stats testcat.fact_stats
+- SubqueryAlias s
+- SubqueryAlias testcat.dim_stats
+- RelationV2[store_id#86038, state_province#86039, country#86040] testcat.dim_stats testcat.dim_stats
== Optimized Logical Plan ==
Project [date_id#86034, product_id#86036, units_sold#86037, store_id#86035]
+- Join Inner, (struct(store_id, store_id#86035) = struct(store_id, store_id#86038))
:- Filter dynamicpruning#86062 [struct(store_id, store_id#86035)]
: : +- Project [store_id#86038]
: : +- Filter (isnotnull(country#86040) AND (country#86040 = DE))
: : +- RelationV2[store_id#86038, state_province#86039, country#86040] testcat.dim_stats
: +- RelationV2[date_id#86034, store_id#86035, product_id#86036, units_sold#86037] testcat.fact_stats
+- Project [store_id#86038]
+- Filter (isnotnull(country#86040) AND (country#86040 = DE))
+- RelationV2[store_id#86038, state_province#86039, country#86040] testcat.dim_stats
== Physical Plan ==
Project [date_id#86034, product_id#86036, units_sold#86037, store_id#86035]
+- BroadcastHashJoin [struct(store_id, store_id#86035)], [struct(store_id, store_id#86038)], Inner, BuildRight, false
:- Project [date_id#86034, store_id#86035, product_id#86036, units_sold#86037]
: +- BatchScan testcat.fact_stats[date_id#86034, store_id#86035, product_id#86036, units_sold#86037] class org.apache.spark.sql.connector.catalog.InMemoryTableWithV2Filter$InMemoryV2FilterBatchScan RuntimeFilters: [dynamicpruningexpression(struct(store_id, store_id#86035) IN dynamicpruning#86062)]
: +- SubqueryBroadcast dynamicpruning#86062, 0, [struct(store_id, store_id#86038)], [id=#119076]
: +- BroadcastExchange HashedRelationBroadcastMode(List(struct(store_id, input[0, int, true])),false), [plan_id=119075]
: +- Project [store_id#86038]
: +- Filter (isnotnull(country#86040) AND (country#86040 = DE))
: +- BatchScan testcat.dim_stats[store_id#86038, state_province#86039, country#86040] class org.apache.spark.sql.connector.catalog.InMemoryTableWithV2Filter$InMemoryV2FilterBatchScan RuntimeFilters: []
+- ReusedExchange [store_id#86038], BroadcastExchange HashedRelationBroadcastMode(List(struct(store_id, input[0, int, true])),false), [plan_id=119075]
== Results ==
!== Correct Answer - 4 == == Spark Answer - 0 ==
struct<> struct<>
![1030,2,10,3]
![1040,2,50,3]
![1050,2,50,3]
![1060,2,50,3]
|
DynamicPartitionPruningV2FilterSuiteAEOff.SPARK-32817: DPP throws error when the broadcast side is empty:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 25 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.Subquery reuse across the whole plan:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 100 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.SPARK-34436: DPP support LIKE ANY/ALL expression:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 1 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.SPARK-34595: DPP support RLIKE expression:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 4 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.SPARK-34637: DPP side broadcast query stage is created firstly:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 25 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.SPARK-35568: Fix UnsupportedOperationException when enabling both AQE and DPP:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project ['s.store_id, 'f.product_id]
+- 'Filter (('s.country = DE) AND ('s.rn = 1))
+- 'Join Inner, ('f.store_id = 's.store_id)
:- 'SubqueryAlias f
: +- 'Distinct
: +- 'Project [*]
: +- 'UnresolvedRelation [fact_sk], [], false
+- 'SubqueryAlias s
+- 'Project [*, 'ROW_NUMBER() windowspecdefinition('store_id, 'state_province DESC NULLS LAST, unspecifiedframe$()) AS rn#86402]
+- 'UnresolvedRelation [dim_store], [], false
== Analyzed Logical Plan ==
store_id: int, product_id: int
Project [store_id#86407, product_id#86405]
+- Filter ((country#86409 = DE) AND (rn#86402 = 1))
+- Join Inner, (store_id#86404 = store_id#86407)
:- SubqueryAlias f
: +- Distinct
: +- Project [date_id#86403, store_id#86404, product_id#86405, units_sold#86406]
: +- SubqueryAlias testcat.fact_sk
: +- RelationV2[date_id#86403, store_id#86404, product_id#86405, units_sold#86406] testcat.fact_sk testcat.fact_sk
+- SubqueryAlias s
+- Project [store_id#86407, state_province#86408, country#86409, rn#86402]
+- Project [store_id#86407, state_province#86408, country#86409, rn#86402, rn#86402]
+- Window [row_number() windowspecdefinition(store_id#86407, state_province#86408 DESC NULLS LAST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rn#86402], [store_id#86407], [state_province#86408 DESC NULLS LAST]
+- Project [store_id#86407, state_province#86408, country#86409]
+- SubqueryAlias testcat.dim_store
+- RelationV2[store_id#86407, state_province#86408, country#86409] testcat.dim_store testcat.dim_store
== Optimized Logical Plan ==
Project [store_id#86407, product_id#86405]
+- Join Inner, (store_id#86404 = store_id#86407)
:- Aggregate [date_id#86403, store_id#86404, product_id#86405, units_sold#86406], [store_id#86404, product_id#86405]
: +- Filter (isnotnull(store_id#86404) AND dynamicpruning#86431 [store_id#86404])
: : +- Project [store_id#86407]
: : +- Filter ((isnotnull(country#86409) AND (country#86409 = DE)) AND (rn#86402 = 1))
: : +- Window [row_number() windowspecdefinition(store_id#86407, state_province#86408 DESC NULLS LAST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rn#86402], [store_id#86407], [state_province#86408 DESC NULLS LAST]
: : +- Filter isnotnull(store_id#86407)
: : +- RelationV2[store_id#86407, state_province#86408, country#86409] testcat.dim_store
: +- RelationV2[date_id#86403, store_id#86404, product_id#86405, units_sold#86406] testcat.fact_sk
+- Project [store_id#86407]
+- Filter ((isnotnull(country#86409) AND (country#86409 = DE)) AND (rn#86402 = 1))
+- Window [row_number() windowspecdefinition(store_id#86407, state_province#86408 DESC NULLS LAST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rn#86402], [store_id#86407], [state_province#86408 DESC NULLS LAST]
+- WindowGroupLimit [store_id#86407], [state_province#86408 DESC NULLS LAST], row_number(), 1
+- Filter isnotnull(store_id#86407)
+- RelationV2[store_id#86407, state_province#86408, country#86409] testcat.dim_store
== Physical Plan ==
*(4) Project [store_id#86407, product_id#86405]
+- *(4) BroadcastHashJoin [store_id#86404], [store_id#86407], Inner, BuildRight, false
:- *(4) HashAggregate(keys=[date_id#86403, store_id#86404, product_id#86405, units_sold#86406], functions=[], output=[store_id#86404, product_id#86405])
: +- *(4) HashAggregate(keys=[date_id#86403, store_id#86404, product_id#86405, units_sold#86406], functions=[], output=[date_id#86403, store_id#86404, product_id#86405, units_sold#86406])
: +- *(4) Project [date_id#86403, store_id#86404, product_id#86405, units_sold#86406]
: +- *(4) Filter isnotnull(store_id#86404)
: +- BatchScan testcat.fact_sk[date_id#86403, store_id#86404, product_id#86405, units_sold#86406] class org.apache.spark.sql.connector.catalog.InMemoryTableWithV2Filter$InMemoryV2FilterBatchScan RuntimeFilters: [dynamicpruningexpression(true)]
+- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=119826]
+- *(3) Project [store_id#86407]
+- *(3) Filter ((isnotnull(country#86409) AND (country#86409 = DE)) AND (rn#86402 = 1))
+- Window [row_number() windowspecdefinition(store_id#86407, state_province#86408 DESC NULLS LAST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rn#86402], [store_id#86407], [state_province#86408 DESC NULLS LAST]
+- WindowGroupLimit [store_id#86407], [state_province#86408 DESC NULLS LAST], row_number(), 1, Final
+- *(2) Sort [store_id#86407 ASC NULLS FIRST, state_province#86408 DESC NULLS LAST], false, 0
+- Exchange hashpartitioning(store_id#86407, 5), ENSURE_REQUIREMENTS, [plan_id=119816]
+- WindowGroupLimit [store_id#86407], [state_province#86408 DESC NULLS LAST], row_number(), 1, Partial
+- *(1) Sort [store_id#86407 ASC NULLS FIRST, state_province#86408 DESC NULLS LAST], false, 0
+- *(1) Project [store_id#86407, state_province#86408, country#86409]
+- *(1) Filter isnotnull(store_id#86407)
+- BatchScan testcat.dim_store[store_id#86407, state_province#86408, country#86409] class org.apache.spark.sql.connector.catalog.InMemoryTableWithV2Filter$InMemoryV2FilterBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 4 ==   == Spark Answer - 0 ==
 struct<>                   struct<>
![3,2]
![3,2]
![3,2]
![3,2]
|
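For context, the SPARK-35568 query can be approximately reconstructed from the parsed plan above (the exact SQL in the suite may differ; this assumes a SparkSession named spark with fact_sk and dim_store registered):

    // Approximate reconstruction from the parsed plan: a DISTINCT fact side
    // joined to a ROW_NUMBER-deduplicated dimension side. DPP is expected to
    // prune fact_sk.store_id from the broadcast side.
    spark.sql("""
      SELECT s.store_id, f.product_id
      FROM (SELECT DISTINCT * FROM fact_sk) f
      JOIN (SELECT *,
                   ROW_NUMBER() OVER
                     (PARTITION BY store_id ORDER BY state_province DESC) AS rn
            FROM dim_store) s
        ON f.store_id = s.store_id
      WHERE s.country = 'DE' AND s.rn = 1
    """).show()

Note that in the physical plan the pushed filter appears as RuntimeFilters: [dynamicpruningexpression(true)], i.e. no pruning at all, yet the scan still produced zero rows.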
DynamicPartitionPruningV2FilterSuiteAEOff.SPARK-36444: Remove OptimizeSubqueries from batch of PartitionPruning:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 3 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.SPARK-38148: Do not add dynamic partition pruning if there exists static partition pruning:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 5 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.SPARK-38570: Fix incorrect DynamicPartitionPruning caused by Literal:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.scalatest.exceptions.TestFailedException:
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project ['f.store_id, 'f.date_id, 's.state_province]
+- 'Filter ('s.country = US)
+- 'Join Inner, ('f.store_id = 's.store_id)
:- 'SubqueryAlias f
: +- 'Union false, false
: :- 'Project [4 AS store_id#87472, 'date_id, 'product_id]
: : +- 'Filter ('date_id >= 1300)
: : +- 'UnresolvedRelation [fact_sk], [], false
: +- 'Project [5 AS store_id#87473, 'date_id, 'product_id]
: +- 'Filter ('date_id <= 1000)
: +- 'UnresolvedRelation [fact_stats], [], false
+- 'SubqueryAlias s
+- 'UnresolvedRelation [dim_store], [], false
== Analyzed Logical Plan ==
store_id: int, date_id: int, state_province: string
Project [store_id#87472, date_id#87474, state_province#87483]
+- Filter (country#87484 = US)
+- Join Inner, (store_id#87472 = store_id#87482)
:- SubqueryAlias f
: +- Union false, false
: :- Project [4 AS store_id#87472, date_id#87474, product_id#87476]
: : +- Filter (date_id#87474 >= 1300)
: : +- SubqueryAlias testcat.fact_sk
: : +- RelationV2[date_id#87474, store_id#87475, product_id#87476, units_sold#87477] testcat.fact_sk testcat.fact_sk
: +- Project [5 AS store_id#87473, date_id#87478, product_id#87480]
: +- Filter (date_id#87478 <= 1000)
: +- SubqueryAlias testcat.fact_stats
: +- RelationV2[date_id#87478, store_id#87479, product_id#87480, units_sold#87481] testcat.fact_stats testcat.fact_stats
+- SubqueryAlias s
+- SubqueryAlias testcat.dim_store
+- RelationV2[store_id#87482, state_province#87483, country#87484] testcat.dim_store testcat.dim_store
== Optimized Logical Plan ==
Project [store_id#87472, date_id#87474, state_province#87483]
+- Join Inner, (store_id#87472 = store_id#87482)
:- Union false, false
: :- Project [4 AS store_id#87472, date_id#87474]
: : +- Filter (isnotnull(date_id#87474) AND (date_id#87474 >= 1300))
: : +- RelationV2[date_id#87474, store_id#87475, product_id#87476, units_sold#87477] testcat.fact_sk
: +- Project [5 AS store_id#87473, date_id#87478]
: +- Filter (isnotnull(date_id#87478) AND (date_id#87478 <= 1000))
: +- RelationV2[date_id#87478, store_id#87479, product_id#87480, units_sold#87481] testcat.fact_stats
+- Project [store_id#87482, state_province#87483]
+- Filter (((isnotnull(country#87484) AND (country#87484 = US)) AND ((store_id#87482 <=> 4) OR (store_id#87482 <=> 5))) AND isnotnull(store_id#87482))
+- RelationV2[store_id#87482, state_province#87483, country#87484] testcat.dim_store
== Physical Plan ==
*(4) Project [store_id#87472, date_id#87474, state_province#87483]
+- *(4) BroadcastHashJoin [store_id#87472], [store_id#87482], Inner, BuildRight, false
:- Union
: :- *(1) Project [4 AS store_id#87472, date_id#87474]
: : +- *(1) Filter (isnotnull(date_id#87474) AND (date_id#87474 >= 1300))
: : +- BatchScan testcat.fact_sk[date_id#87474, store_id#87475, product_id#87476, units_sold#87477] class org.apache.spark.sql.connector.catalog.InMemoryTableWithV2Filter$InMemoryV2FilterBatchScan RuntimeFilters: []
: +- *(2) Project [5 AS store_id#87473, date_id#87478]
: +- *(2) Filter (isnotnull(date_id#87478) AND (date_id#87478 <= 1000))
: +- BatchScan testcat.fact_stats[date_id#87478, store_id#87479, product_id#87480, units_sold#87481] class org.apache.spark.sql.connector.catalog.InMemoryTableWithV2Filter$InMemoryV2FilterBatchScan RuntimeFilters: []
+- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=120848]
+- *(3) Project [store_id#87482, state_province#87483]
+- *(3) Filter (((isnotnull(country#87484) AND (country#87484 = US)) AND ((store_id#87482 <=> 4) OR (store_id#87482 <=> 5))) AND isnotnull(store_id#87482))
+- BatchScan testcat.dim_store[store_id#87482, state_province#87483, country#87484] class org.apache.spark.sql.connector.catalog.InMemoryTableWithV2Filter$InMemoryV2FilterBatchScan RuntimeFilters: []
== Results ==
!== Correct Answer - 2 ==   == Spark Answer - 0 ==
 struct<>                   struct<>
![4,1300,California]
![5,1000,Texas]
|
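The optimized plan above shows why this test expects no dynamic pruning: the fact-side join keys are the literals 4 and 5 from the Union, so the optimizer injects static pruning, ((store_id <=> 4) OR (store_id <=> 5)), into the dim_store scan instead of a DPP subquery. An approximate reconstruction of the query (same caveats as the sketch above):

    // Approximate reconstruction: literal store_id values on the fact side
    // should yield static, not dynamic, partition pruning on dim_store.
    spark.sql("""
      SELECT f.store_id, f.date_id, s.state_province
      FROM (
        SELECT 4 AS store_id, date_id, product_id FROM fact_sk    WHERE date_id >= 1300
        UNION ALL
        SELECT 5 AS store_id, date_id, product_id FROM fact_stats WHERE date_id <= 1000
      ) f
      JOIN dim_store s ON f.store_id = s.store_id
      WHERE s.country = 'US'
    """).show()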
DynamicPartitionPruningV2FilterSuiteAEOff.SPARK-38674: Remove useless deduplicate in SubqueryBroadcastExec:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 1 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.SPARK-39338: Remove dynamic pruning subquery if pruningKey's references is empty:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 5 partition values
|
DynamicPartitionPruningV2FilterSuiteAEOff.SPARK-39217: Makes DPP support the pruning side has Union:
DynamicPartitionPruningV2FilterSuiteAEOff#L1
org.apache.spark.SparkException: During runtime filtering, data source must either report the same number of partition values, or a subset of partition values from the original. Before: 0 partition values. After: 5 partition values
|