Automatically gather jstacks for Spark applications #5

Closed
wants to merge 1,012 commits
e6d44e4
[SPARK-51100][ML][PYTHON][CONNECT] Replace transformer wrappers with …
zhengruifeng Feb 6, 2025
13eaf41
[SPARK-50799][PYTHON] Refine the docstring of rlike, length, octet_le…
drexler-sky Feb 6, 2025
0b7c4f1
[SPARK-50922][ML][PYTHON][CONNECT][FOLLOW-UP] Import pyspark.core mod…
HyukjinKwon Feb 6, 2025
93e9f6e
[SPARK-51084][SQL] Assign appropriate error class for `negativeScaleN…
asl3 Feb 6, 2025
96bd8f1
[SPARK-50605][CONNECT] Support SQL API mode for easier migration to S…
HyukjinKwon Feb 6, 2025
9dbbb5b
[SPARK-51096][SQL][TESTS] Splitting TransformWithStateSuite into Unsa…
ericm-db Feb 6, 2025
d36edda
[SPARK-51105][ML][PYTHON][CONNECT][TESTS] Add parity test for ml func…
zhengruifeng Feb 6, 2025
88485a3
[SPARK-51050][SQL][TESTS] Add group by alias tests to the group-by-al…
mihailoale-db Feb 6, 2025
84e924e
[SPARK-51010][SQL] Fix AlterColumnSpec not reporting resolved status …
ctring Feb 6, 2025
cfa8dbd
[SPARK-51057][SS] Remove scala option based variant API for value state
anishshri-db Feb 6, 2025
aefaa66
[SPARK-49371][SPARK-51086][SQL][CONNECT][DOCS] Fix various doc SQL is…
hvanhovell Feb 6, 2025
b968ce1
[SPARK-43415][CONNECT][SQL] Implement `KVGDS.agg` with custom `mapVal…
xupefei Feb 6, 2025
7416ae9
[SPARK-49698][CONNECT][SQL] Add ClassicOnly annotation for classic on…
hvanhovell Feb 6, 2025
a79ba48
[SPARK-51042][SQL] Read and write the month and days fields of interv…
jonathan-albrecht-ibm Feb 6, 2025
e89b19f
[SPARK-51104][DOC] Self-host JavaScript and CSS in Spark website
gengliangwang Feb 6, 2025
5a925c6
[SPARK-51104][DOC][FOLLOWUP] Self-host docsearch.min.css in Spark web…
gengliangwang Feb 6, 2025
8d18df3
[SPARK-51099][PYTHON] Add logs when the Python worker looks stuck
ueshin Feb 7, 2025
f4b729d
[SPARK-51107][CORE] Refactor CommandBuilderUtils#join to reuse the li…
RocMarshal Feb 7, 2025
935c2b0
[SPARK-51101][ML][PYTHON][CONNECT][TESTS] Add doctest for `pyspark.ml…
zhengruifeng Feb 7, 2025
7a1dcb3
[SPARK-51126][PS][DOCS] Optimize the memory usage in kde examples
zhengruifeng Feb 7, 2025
616baa8
[SPARK-51108][INFRA] Install Python packages for `yarn` module in `ma…
LuciferYang Feb 7, 2025
20bd4aa
[SPARK-51128][DOC] Self host docsearch.min.css.map in Spark website
gengliangwang Feb 7, 2025
a01ae99
[SPARK-49531][PYTHON][CONNECT][INFRA][FOLLOW-UP] Match pandas version…
HyukjinKwon Feb 7, 2025
f5f7c36
[SPARK-51093][SQL][TESTS] Fix minor endianness issues in tests
jonathan-albrecht-ibm Feb 7, 2025
bafd007
[SPARK-50075][FOLLOWUP][DOCS] Add table-valued function API docs
ueshin Feb 7, 2025
4305bb1
[SPARK-51131][SQL] Throw exception when SQL Script is found inside EX…
miland-db Feb 7, 2025
ad13a88
[SPARK-48353][SQL][FOLLOWUP] Enable ANSI for SQL Scripting execution …
miland-db Feb 7, 2025
717027d
[SPARK-51129][DOC] Fix code tab switching in Spark Website
gengliangwang Feb 7, 2025
df89c8e
[SPARK-51087][PYTHON][CONNECT] Raise a warning when memory-profiler i…
xinrong-meng Feb 7, 2025
3ef58c3
[SPARK-51103][PYTHON][DOCS] Add DataFrame conversion to table argumen…
xinrong-meng Feb 7, 2025
099a59c
[SPARK-51080][ML][PYTHON][CONNECT] Fix save/load for `PowerIterationC…
zhengruifeng Feb 8, 2025
5531497
[SPARK-50945][ML][PYTHON][CONNECT] Support Summarizer and SummaryBuil…
zhengruifeng Feb 8, 2025
a5e905d
[SPARK-51127][PYTHON] Kill the Python worker on idle timeout
ueshin Feb 8, 2025
af92420
[SPARK-51048][CORE] Support stop java spark context with exit code
prathit06 Feb 8, 2025
ba7849e
[SPARK-51130][YARN][TESTS] Run the test cases related to `connect` in…
LuciferYang Feb 8, 2025
301b666
[SPARK-51065][SQL] Disallowing non-nullable schema when Avro encoding…
ericm-db Feb 9, 2025
736737e
[SPARK-51138][PYTHON][CONNECT][TESTS] Skip pyspark.sql.tests.connect.…
HyukjinKwon Feb 10, 2025
d0848aa
[SPARK-51135][SQL] Fix ViewResolverSuite for ANSI modes
vladimirg-db Feb 10, 2025
59a6ca5
[SPARK-48239][INFRA][FOLLOWUP] Update the release docker image to fol…
cloud-fan Feb 10, 2025
c7edcae
[SPARK-51109][SQL] CTE in subquery expression as grouping column
cloud-fan Feb 10, 2025
2fc278d
[SPARK-51133][BUILD] Upgrade Apache `commons-pool2` to 2.12.1
wayneguow Feb 10, 2025
34e6e44
[SPARK-51139][ML][CONNECT] Refine error class `MLAttributeNotAllowedE…
zhengruifeng Feb 10, 2025
9f86647
[SPARK-50881][PYTHON] Use cached schema where possible in conenct dat…
garlandz-db Feb 10, 2025
dd15330
[SPARK-50953][PYTHON][CONNECT] Add support for non-literal paths in V…
harshmotw-db Feb 10, 2025
59dd406
[SPARK-48516][PYTHON][CONNECT] Turn on Arrow optimization for Python …
xinrong-meng Feb 10, 2025
e823afa
[SPARK-50917][EXAMPLES] Add Pi Scala example to work both for Connect…
yaooqinn Feb 10, 2025
222dd81
[SPARK-51073][SQL] Remove `Unstable` from `SparkSessionExtensionsProv…
dongjoon-hyun Feb 10, 2025
066abfa
[SPARK-50753][PYTHON][DOCS] Add pyspark plotting to API documentation
xinrong-meng Feb 10, 2025
56c7879
[SPARK-51145][BUILD][TESTS] Upgrade `mysql-connector-j`, `MySQL` dock…
wayneguow Feb 10, 2025
cd9c509
[SPARK-51023][CORE] log remote address on RPC exception
Feb 10, 2025
165e9d0
[SPARK-51106][CORE][TESTS] Use `try-with-resources` to ensure resourc…
LuciferYang Feb 10, 2025
8071868
[SPARK-51140][ML] Sort the params before saving
zhengruifeng Feb 11, 2025
22d2eb3
[SPARK-51143][PYTHON] Pin `plotly<6.0.0` and `torch<2.6.0`
zhengruifeng Feb 11, 2025
c91fdd4
Revert "[SPARK-51138][PYTHON][CONNECT][TESTS] Skip pyspark.sql.tests.…
HyukjinKwon Feb 11, 2025
9d88020
[SPARK-51112][CONNECT] Avoid using pyarrow's `to_pandas` on an empty …
vicennial Feb 11, 2025
937decc
[SPARK-51119][SQL] Readers on executors resolving EXISTS_DEFAULT shou…
szehon-ho Feb 11, 2025
bdb7704
[SPARK-51146][INFRA] Publish a new Spark distribution with Spark Conn…
cloud-fan Feb 11, 2025
6a71f76
[SPARK-51142][ML][CONNECT] ML protobufs clean up
zhengruifeng Feb 11, 2025
54959ab
[SPARK-51148][BUILD] Upgrade `zstd-jni` to 1.5.6-10
dongjoon-hyun Feb 11, 2025
cea79dc
[SPARK-51132][ML][BUILD] Upgrade `JPMML` to 1.7.1
wayneguow Feb 11, 2025
e7ceb5b
[MINOR][DOCS] Fix incorrect description of constraint on spark.sql.ad…
JoshRosen Feb 11, 2025
6a668cd
[SPARK-51150][ML] Explicitly pass the session in meta algorithm writers
zhengruifeng Feb 11, 2025
394458f
[SPARK-51147][SS] Refactor streaming related classes to a dedicated s…
bogao007 Feb 11, 2025
6416297
[SPARK-51153][BUILD] Remove unused `javax.activation:activation` depe…
wayneguow Feb 11, 2025
d3be2a7
[SPARK-51154][BUILD][TESTS] Remove unused `jopt` test dependency
cnauroth Feb 11, 2025
e7709f0
[SPARK-51155][CORE] Make `SparkContext` show total runtime after stop…
dongjoon-hyun Feb 11, 2025
5135329
[SPARK-51119][SQL][FOLLOW-UP] Readers on executors resolving EXISTS_D…
szehon-ho Feb 11, 2025
1ba759f
[SPARK-51157][SQL] Add missing @varargs Scala annotation for Scala fu…
yaooqinn Feb 11, 2025
533d9c3
[SPARK-51146][INFRA] Publish a new Spark distribution with Spark Conn…
cloud-fan Feb 11, 2025
ba3e271
[SPARK-45891][SQL][FOLLOWUP] Disable `spark.sql.variant.allowReadingS…
pan3793 Feb 11, 2025
ad8222a
[SPARK-51158][YARN][TESTS] No longer restrict the test cases related …
LuciferYang Feb 11, 2025
c66cc0e
[SPARK-50940][SPARK-50941][ML][PYTHON][CONNECT][FOLLOW-UP] Directly r…
zhengruifeng Feb 11, 2025
c9720af
[SPARK-50557][CONNECT][SQL] Support RuntimeConfig.contains(..) in Sca…
hvanhovell Feb 11, 2025
d07b560
[SPARK-51164][CORE][TESTS] Fix `CallerContext` test by enabling `hado…
dongjoon-hyun Feb 11, 2025
b21428f
[SPARK-51159][INFRA] Pin `plotly<6.0.0` and `torch<2.6.0` in release …
zhengruifeng Feb 12, 2025
6a4e043
[SPARK-51165][CORE] Enable `spark.master.rest.enabled` by default
dongjoon-hyun Feb 12, 2025
6f52d5a
[SPARK-51127][PYTHON][FOLLOWUP] Update version for the configs
ueshin Feb 12, 2025
e2c81ad
[SPARK-51164][CORE][TESTS][FOLLOWUP] Add hadoop.caller.context.enable…
cnauroth Feb 12, 2025
9793587
[SPARK-51170][PYTHON][CONNECT][TESTS] Separate local and local-cluste…
HyukjinKwon Feb 12, 2025
00b2bc4
[SPARK-51136][CORE] Set `CallerContext` for History Server
cnauroth Feb 12, 2025
f52a679
[SPARK-51114][SQL] Refactor PullOutNondeterministic rule
mihailoale-db Feb 12, 2025
decb677
[SPARK-51172][SS] Rename to spark.sql.optimizer.pruneFiltersCanPruneS…
HyukjinKwon Feb 12, 2025
274dc5e
[SPARK-51173][TESTS] Add `configName` Scalastyle rule
dongjoon-hyun Feb 12, 2025
2cd5365
[SPARK-42746][SQL][FOLLOWUP] Improve the golden files by print the he…
beliefer Feb 12, 2025
16784fa
[SPARK-50854][SS] Make path fully qualified before passing it to File…
vrozov Feb 12, 2025
00f1057
[SPARK-50582][SQL][PYTHON] Add quote builtin function
sarutak Feb 12, 2025
d6fe024
[MINOR][INFRA] List pip installation before test in python macos test…
zhengruifeng Feb 12, 2025
749eae4
[SPARK-51171][BUILD] Upgrade `checkstyle` to 10.21.2
wayneguow Feb 12, 2025
8bacf99
[SPARK-51141][ML][CONNECT][DOCS] Document the Support of ML on Connect
zhengruifeng Feb 12, 2025
42ecabf
[SPARK-51175][CORE] Make `Master` show elapsed time when removing dri…
dongjoon-hyun Feb 12, 2025
cb2732d
[SPARK-51146][INFRA][FOLLOWUP] Use awk to update release scripts
cloud-fan Feb 12, 2025
207390b
[SPARK-51008][SQL] Add ResultStage for AQE
liuzqt Feb 12, 2025
7aad4d0
[SPARK-51160][SQL] Refactor literal function resolution
mihailotim-db Feb 12, 2025
03868a8
[SPARK-51163][BUILD] Exclude duplicated jars from connect-repl
pan3793 Feb 12, 2025
d0e07ac
[SPARK-51113][SQL] Fix correctness with UNION/EXCEPT/INTERSECT inside…
vladimirg-db Feb 12, 2025
6440bde
Revert "[SPARK-51113][SQL] Fix correctness with UNION/EXCEPT/INTERSEC…
dongjoon-hyun Feb 12, 2025
340bc00
[SPARK-51184][CORE] Remove `TaskState.LOST` logic from `TaskScheduler…
dongjoon-hyun Feb 12, 2025
dc56829
[MINOR][SQL][TESTS] Remove redundant space at `PropagateEmptyRelation…
WweiL Feb 13, 2025
f749742
[SPARK-50960][PYTHON][CONNECT] Add `InvalidPlanInput` to Spark Connec…
itholic Feb 13, 2025
d669575
[SPARK-51188][BUILD] Upgrade Arrow to 18.2.0
LuciferYang Feb 13, 2025
3bc516e
[SPARK-51190][ML][PYTHON][CONNECT] Fix TreeEnsembleModel.treeWeights
zhengruifeng Feb 13, 2025
b3e64ad
[SPARK-51059][ML][CONNECT][DOCS] Document how ALLOWED_ATTRIBUTES works
zhengruifeng Feb 13, 2025
97372e0
[SPARK-51189][CORE] Promote `JobFailed` to `DeveloperApi`
pan3793 Feb 13, 2025
74d88b6
[SPARK-51185][CORE] Revert simplifications to PartitionedFileUtil API…
LukasRupprecht Feb 13, 2025
8e39b7f
[SPARK-51197][ML][PYTHON][CONNECT][TESTS] Unit test clean up
zhengruifeng Feb 13, 2025
0a10232
[SPARK-51193][CORE] Upgrade Netty to 4.1.118.Final and netty-tcnative…
dongjoon-hyun Feb 13, 2025
ae2d2e8
[SPARK-51195][BUILD][K8S] Upgrade `kubernetes-client` to 7.1.0
wayneguow Feb 13, 2025
3d4f6b4
[SPARK-51194][BUILD] Upgrade `scalafmt` to 3.8.6
wayneguow Feb 13, 2025
920486d
[SPARK-51198][CORE][DOCS] Revise `defaultMinPartitions` function desc…
dongjoon-hyun Feb 13, 2025
80fb9d0
[SPARK-51200][BUILD] Add SparkR deprecation info to `README.md` and `…
dongjoon-hyun Feb 13, 2025
18ff7b3
[SPARK-51181][SQL] Enforce determinism when pulling out non determini…
mihailoale-db Feb 13, 2025
e92e12a
[SPARK-51067][SQL] Revert session level collation for DML queries and…
dejankrak-db Feb 13, 2025
0d6bdce
[SPARK-51113][SQL] Fix correctness with UNION/EXCEPT/INTERSECT inside…
vladimirg-db Feb 13, 2025
49f9ecf
[SPARK-51205][BUILD][TESTS] Upgrade `bytebuddy` to 1.17.0 to support …
dongjoon-hyun Feb 13, 2025
e397207
[SPARK-51208][SQL] `ColumnDefinition.toV1Column` should preserve `EXI…
szehon-ho Feb 13, 2025
ffd58bc
[SPARK-51209][CORE] Improve `getCurrentUserName` to handle Java 24+
dongjoon-hyun Feb 14, 2025
78b1754
[SPARK-51210][CORE] Add `--enable-native-access=ALL-UNNAMED` to Java …
dongjoon-hyun Feb 14, 2025
2e0c02e
[SPARK-51177][PYTHON][CONNECT] Add `InvalidCommandInput` to Spark Con…
itholic Feb 14, 2025
c76ed49
[SPARK-51201][SQL] Make Partitioning Hints support byte and short values
yaooqinn Feb 14, 2025
55c65d9
[SPARK-51204][BUILD] Upgrade `sbt-assembly` to 2.3.1
wayneguow Feb 14, 2025
83ee2a0
[SPARK-51119][SQL][FOLLOW-UP] ColumnDefinition.toV1Column should pres…
szehon-ho Feb 14, 2025
e840255
[SPARK-51209][CORE][FOLLOWUP] Use `user.name` system property first a…
dongjoon-hyun Feb 14, 2025
7992a2f
[SPARK-51186][PYTHON] Add `StreamingPythonRunnerInitializationExcepti…
itholic Feb 14, 2025
47edf4b
[SPARK-51213][SQL] Keep Expression class info when resolving hint par…
yaooqinn Feb 14, 2025
d6ad779
[SPARK-51216][BUILD] Remove the useless `bigtop-dist` profile and the…
LuciferYang Feb 14, 2025
09b93bd
[SPARK-51214][ML][PYTHON][CONNECT] Don't eagerly remove the cached mo…
zhengruifeng Feb 15, 2025
f601eb7
[SPARK-51146][SQL][FOLLOWUP] Respect system env `SPARK_CONNECT_MODE` …
cloud-fan Feb 15, 2025
b2cbdaa
[SPARK-51226][BUILD] Upgrade `extra-enforcer-rules` to support Java 24
dongjoon-hyun Feb 16, 2025
c36916f
[SPARK-51225][BUILD] Upgrade several Maven plugins to the latest vers…
dongjoon-hyun Feb 16, 2025
0bcff44
[SPARK-51222][SQL] Optimize ReplaceCurrentLike
szehon-ho Feb 16, 2025
b8d395e
[SPARK-50098][PYTHON][CONNECT][FOLLOWUP] Fix PySpark Connect `_minimu…
dongjoon-hyun Feb 16, 2025
3f4ba72
[SPARK-51227][PYTHON][CONNECT] Fix PySpark Connect `_minimum_grpc_ver…
dongjoon-hyun Feb 16, 2025
f798e7a
[SPARK-51212][PYTHON] Add a separated PySpark package for Spark Conne…
ueshin Feb 17, 2025
e5a0ee9
[SPARK-51233][BUILD] Update `pom.xml` to use `https` for `licenses` e…
dongjoon-hyun Feb 17, 2025
b3dac88
[SPARK-51215][ML][PYTHON][CONNECT] Add a helper function to invoke he…
zhengruifeng Feb 17, 2025
d75a7d6
[SPARK-51217][ML][CONNECT] ML model helper constructor clean up
zhengruifeng Feb 17, 2025
ad3f13e
[SPARK-51231][BUILD] Add `--enable-native-access=ALL-UNNAMED` to `.mv…
dongjoon-hyun Feb 17, 2025
ec90862
[SPARK-51235][K8S][DOCS] Update `YuniKorn` docs with `1.6.1`
dongjoon-hyun Feb 17, 2025
b63b90e
[SPARK-51183][SQL] Link to Parquet spec in Variant docs
cashmand Feb 17, 2025
4c37a7a
[SPARK-51238][K8S][INFRA][DOCS] Upgrade Volcano to 1.11.0
dongjoon-hyun Feb 17, 2025
0c30983
[SPARK-51218][SQL] Avoid map/flatMap in NondeterministicExpressionCol…
vladimirg-db Feb 17, 2025
0002cdd
[SPARK-42746][SQL][FOLLOWUP] Correct the comments for SupportsOrderin…
beliefer Feb 17, 2025
8423d74
[SPARK-51232][PYTHON][DOCS] Remove PySpark 3.3 and older logic from `…
dongjoon-hyun Feb 17, 2025
e1f7851
[SPARK-50692][SQL][FOLLOWUP] Add comments for LPAD and RPAD
beliefer Feb 17, 2025
8a75e12
[SPARK-51228][SQL] Introduce subquery normalization to NormalizePlan
vladimirg-db Feb 17, 2025
17b9431
[SPARK-51237][SS] Add API details for new transformWithState helper A…
anishshri-db Feb 17, 2025
479e0b3
[SPARK-51192][CONNECT] Expose `processWithoutResponseObserverForTesti…
vicennial Feb 17, 2025
bd2b478
[SPARK-50849][CONNECT] Add example project to demonstrate Spark Conne…
vicennial Feb 17, 2025
aeea738
[SPARK-51085][SQL] Restore SQLContext Companion
hvanhovell Feb 17, 2025
2c76dff
[SPARK-51234][PYTHON][DOCS] Document an import change in `from pyspar…
zhengruifeng Feb 18, 2025
ef0685a
[SPARK-51152][PYTHON][SQL][DOCS] Add usage examples for the get_json_…
Feb 18, 2025
aa37f89
[SPARK-51219][SQL] Fix `ShowTablesExec.isTempView` to work with non-`…
ostronaut Feb 18, 2025
6dbd12a
[SPARK-51241][SQL][TESTS] Add test cases with ignore nulls for ANY_VALUE
beliefer Feb 18, 2025
a8b694f
[SPARK-50767][SQL] Remove codegen of `from_json`
cloud-fan Feb 18, 2025
500bf78
[SPARK-51246][SQL] Make InTypeCoercion produce resolved Casts
vladimirg-db Feb 18, 2025
ead7d58
[SPARK-48114][SQL] Move subquery validation out of CheckAnalysis
vladimirg-db Feb 18, 2025
4134e9f
[SPARK-51178][CONNECT][PYTHON] Raise proper PySpark error instead of …
itholic Feb 18, 2025
489ba0d
[SPARK-51242][CONENCT][PYTHON] Improve Column performance when DQC is…
itholic Feb 18, 2025
dc3fb50
[SPARK-51176][PYTHON][CONNECT] Meet consistency for unexpected errors…
itholic Feb 18, 2025
aa12070
[MINOR][DOCS] Add missing backticks in `Upgrading from PySpark 3.5 to…
zhengruifeng Feb 18, 2025
a44b613
[SPARK-51202][ML][PYTHON] Pass the session in meta algorithm python w…
zhengruifeng Feb 19, 2025
b32d3f7
[SPARK-51240][BUILD] Upgrade commons-codec to 1.18.0
LuciferYang Feb 19, 2025
0af25b8
[SPARK-51239][INFRA] Upgrade Github Action image for `TPCDSQueryBench…
wayneguow Feb 19, 2025
2c62899
[SPARK-51083][CORE] Modify JavaUtils to not swallow InterruptedExcept…
neilramaswamy Feb 19, 2025
53c326b
[SPARK-51255][INFRA] Install dependencies related to docs in release …
HyukjinKwon Feb 19, 2025
5759882
[SPARK-50655][SS] Move virtual col family related mapping into db lay…
anishshri-db Feb 19, 2025
26febf7
[SPARK-51247][SQL] Move SubstituteExecuteImmediate to 'resolution' ba…
dusantism-db Feb 19, 2025
cecafac
[SPARK-51257][SQL][TESTS] Add order-by-alias.sql
mihailoale-db Feb 19, 2025
948840b
[MINOR][PYTHON][DOCS] Refine the docstring of `VariantVal`
zhengruifeng Feb 20, 2025
2beb7ed
[SPARK-51260][SQL] Move V2ExpressionBuilder and PushableExpression to…
gengliangwang Feb 20, 2025
6603a4e
[SPARK-51254][PYTHON][CONNECT] Disallow --master with Spark Connect URL
HyukjinKwon Feb 20, 2025
35442ba
[SPARK-51150][ML][FOLLOW-UP] Pass session in `OneVsRestParams.saveImpl`
zhengruifeng Feb 20, 2025
f447c43
[SPARK-51219][SQL][TESTS][FOLLOWUP] ShowTablesExec` remove `ArrayImpl…
ostronaut Feb 20, 2025
a661f9f
[SPARK-51259][SQL] Refactor natural and using join keys computation
mihailotim-db Feb 20, 2025
a867a9e
[SPARK-51259][SQL][FOLLOWUP] Improve performance of natural and using…
mihailotim-db Feb 20, 2025
fb17856
[SPARK-48530][SQL] Support for local variables in SQL Scripting
dusantism-db Feb 20, 2025
62f0d29
[SPARK-51266][CORE] Remove the unused definition of `private[spark] o…
LuciferYang Feb 20, 2025
bbb9c2c
[SPARK-47208][DOCS][FOLLOWUP] Replace `spark.driver.minMemoryOverhead…
yaooqinn Feb 20, 2025
da1854e
[SPARK-51097][SS] Adding state store instance metrics for last upload…
zecookiez Feb 20, 2025
d6ca11e
[SPARK-51092][SS] Skip the v1 FlatMapGroupsWithState tests with timeo…
jonathan-albrecht-ibm Feb 20, 2025
42ab97a
[SPARK-51249][SS] Fixing the NoPrefixKeyStateEncoder and Avro encodin…
ericm-db Feb 21, 2025
1d72acc
[SPARK-50864][TESTS] Disable slow tests
ueshin Feb 21, 2025
f37be89
[SPARK-51222][SQL][FOLLOW-UP] Optimize ReplaceCurrentLike
szehon-ho Feb 21, 2025
4ffc398
[SPARK-51119][SQL][FOLLOW-UP] Add fallback to ResolveDefaultColumnsUt…
szehon-ho Feb 21, 2025
45900c4
[SPARK-48516][PYTHON][FOLLOW-UP] Add a note in migration guide about …
HyukjinKwon Feb 21, 2025
46e12a4
[SPARK-51267][CONNECT] Match local Spark Connect server logic between…
HyukjinKwon Feb 21, 2025
3d76e0b
[SPARK-51275][PYTHON][ML][CONNECT] Session propagation in python read…
zhengruifeng Feb 21, 2025
b70b904
[SPARK-51284][SQL] Fix SQL Script execution for empty result
davidm-db Feb 21, 2025
140a69b
[SPARK-51283][SQL][TESTS] Add test cases for LZ4 and SNAPPY for text
beliefer Feb 21, 2025
e1842c7
[SPARK-51263][CORE][SQL][TESTS] Clean up unnecessary `invokePrivate` …
LuciferYang Feb 21, 2025
eb4a28b
[SPARK-51274][PYTHON] PySparkLogger should respect the expected keywo…
ueshin Feb 21, 2025
9ac566d
[SPARK-51276][PYTHON] Enable spark.sql.execution.arrow.pyspark.enable…
HyukjinKwon Feb 22, 2025
666f45d
[SPARK-51279][CONNECT] Avoid constant sleep for waiting Spark Connect…
HyukjinKwon Feb 22, 2025
30f4f4e
[SPARK-51258][SQL] Remove unnecessary inheritance from SQLConfHelper
beliefer Feb 22, 2025
7e9547c
[SPARK-51156][CONNECT] Static token authentication support in Spark C…
Kimahriman Feb 23, 2025
9a1f921
[SPARK-51258][SQL][FOLLOWUP] Remove unnecessary inheritance from SQLC…
beliefer Feb 23, 2025
3027968
[SPARK-51292][SQL] Remove unnecessary inheritance from PlanTestBase, …
beliefer Feb 23, 2025
0b5b0d5
[SPARK-51293][CORE][SQL][SS][MLLIB][TESTS] Cleanup unused private fun…
LuciferYang Feb 23, 2025
b04e9c4
[SPARK-51294][SQL][CONNECT][TESTS] Improve the readability by split t…
beliefer Feb 24, 2025
f1d78dc
[SPARK-51297][DOCS] Fixed the scope of the query option in sql-data-s…
llphxd Feb 24, 2025
c6097c7
[MINOR][DOCS] Clarify spark.remote and spark.master in pyspark-connec…
HyukjinKwon Feb 24, 2025
3515b20
[SPARK-51300][PS][DOCS] Fix broken link for `ps.sql`
itholic Feb 24, 2025
67a337e
[SPARK-51278][PYTHON] Use appropriate structure of JSON format for `P…
itholic Feb 24, 2025
a084d64
[SPARK-50098][PYTHON][FOLLOW-UP] Update _minimum_googleapis_common_pr…
HyukjinKwon Feb 24, 2025
ee8e10f
[SPARK-50015][PYTHON][FOLLOW-UP] Update _minimum_grpc_version in setu…
HyukjinKwon Feb 24, 2025
968542f
[SPARK-51304][DOCS][PYTHON] Use `getCondition` instead of `getErrorCl…
itholic Feb 24, 2025
7e5bf72
[SPARK-51078][SPARK-50963][ML][PYTHON][CONNECT][TESTS][FOLLOW-UP] Add…
zhengruifeng Feb 24, 2025
48fc0fb
[SPARK-49912] Refactor simple CASE statement to evaluate the case var…
dusantism-db Feb 24, 2025
60bcc71
[SPARK-51095][CORE][SQL] Include caller context for hdfs audit logs f…
sririshindra Feb 24, 2025
9de3b7c
[SPARK-51156][CONNECT][FOLLOWUP] Remove unused `private val AUTH_TOKE…
LuciferYang Feb 24, 2025
9f637b5
[SPARK-51305][SQL][CONNECT] Improve `SparkConnectPlanExecution.create…
beliefer Feb 24, 2025
ea15a95
[SPARK-50692][SQL] Add the LPAD and RPAD pushdown support for H2
beliefer Feb 24, 2025
d4b4cfc
[SPARK-51099][PYTHON][FOLLOWUP] Avoid logging when selector.select re…
ueshin Feb 24, 2025
529e887
[SPARK-50914][PYTHON][CONNECT] Match GRPC dependencies for Python-onl…
HyukjinKwon Feb 25, 2025
352d1ed
[SPARK-51306][TESTS] Fix test errors caused by improper DROP TABLE/VI…
yaooqinn Feb 25, 2025
7feb911
[SPARK-50795][SQL][FOLLOWUP] Set isParsing to false for the timestamp…
yaooqinn Feb 25, 2025
0184c5b
[SPARK-50785][SQL] Refactor FOR statement to utilize local variables …
dusantism-db Feb 25, 2025
4f5ee80
[SPARK-51311][BUILD] Promote bcprov-jdk18on to compile scope
pan3793 Feb 25, 2025
93d10c2
[SPARK-50856][SS][PYTHON][CONNECT] Spark Connect Support for Transfor…
jingz-db Feb 26, 2025
8e8eccb
[SPARK-51317][PYTHON] Require pandas as well for Arrow-optimized Pyth…
HyukjinKwon Feb 26, 2025
85f05e3
[SPARK-51313][PYTHON] Fix timestamp format for PySparkLogger
ueshin Feb 26, 2025
14c1378
[SPARK-51221][CONNECT][TESTS] Use unresolvable host name in SparkConn…
vrozov Feb 26, 2025
381d5bc
[SPARK-51312][SQL] Fix createDataFrame from RDD[Row]
mihailom-db Feb 26, 2025
7d7a05b
[SPARK-51265][SQL] IncrementalExecution should set the command execut…
cloud-fan Feb 26, 2025
87fd092
[SPARK-51282][ML][PYTHON][CONNECT] Optimize OneVsRestModel transform …
zhengruifeng Feb 26, 2025
971b9a4
[SPARK-51309][BUILD] Upgrade rocksdbjni to 9.10.0
wayneguow Feb 26, 2025
ff7b4a4
[SPARK-51315][SQL] Enabling object level collations by default
dejankrak-db Feb 26, 2025
9efaa91
[SPARK-51277][PYTHON] Implement 0-arg implementation in Arrow-optimiz…
HyukjinKwon Feb 27, 2025
04af02f
[SPARK-51273][SQL] Spark Connect Call Procedure runs the procedure twice
szehon-ho Feb 27, 2025
9898446
[SPARK-51324][SQL] Fix nested FOR statement throwing error if empty r…
dusantism-db Feb 27, 2025
727167a
[SPARK-51206][PYTHON][CONNECT] Move Arrow conversion helpers out of S…
wengh Feb 27, 2025
2fca41c
[SPARK-50792][SQL][TESTS][FOLLOWUP] Test case should reuse the exists…
beliefer Feb 27, 2025
bbd4b96
[SPARK-51329][ML][PYTHON] Add `numFeatures` for clustering models
zhengruifeng Feb 27, 2025
53fc763
[SPARK-51316][PYTHON] Allow Arrow batches in bytes instead of number …
HyukjinKwon Feb 27, 2025
3ba50cd
[SPARK-51323][PYTHON] Duplicate "total" on Py SQL metrics
sebastianhillig-db Feb 27, 2025
70d532a
[SPARK-51302][CONNECT] Spark Connect supports JDBC should use the Dat…
beliefer Feb 27, 2025
b47b004
[SPARK-51333][ML][PYTHON][CONNECT] Unwrap `InvocationTargetException`…
zhengruifeng Feb 27, 2025
e61a885
[SPARK-51322][SQL] Better error message for streaming subquery expres…
cloud-fan Feb 27, 2025
07e6a06
[SPARK-50994][CORE] Perform RDD conversion under tracked execution
BOOTMGR Feb 27, 2025
412da42
[SPARK-51310][SQL] Resolve the type of default string producing expre…
stefankandic Feb 27, 2025
a3671e5
[SPARK-51281][SQL] DataFrameWriterV2 should respect the path option
cloud-fan Feb 27, 2025
89e05a8
[SPARK-51326][CONNECT] Remove LazyExpression proto message
ueshin Feb 27, 2025
b05f18b
[SPARK-51278][FOLLOWUP][DOCS] Update JSON format from documentation
itholic Feb 28, 2025
71422d1
[SPARK-51303][SQL][TESTS] Extend `ORDER BY` testing coverage
mihailoale-db Feb 28, 2025
88addf4
[SPARK-51337][SQL] Add maxRows to CTERelationDef and CTERelationRef
vladimirg-db Feb 28, 2025
3478e6b
[SPARK-51270][SQL] Support UUID type in Variant
cashmand Feb 28, 2025
ddd0af6
[SPARK-51339][BUILD] Remove `IllegalImportsChecker` for `s.c.Seq/Inde…
LuciferYang Feb 28, 2025
208a7ee
[SPARK-49756][SQL][FOLLOWUP] Use correct pgsql datetime fields when p…
cloud-fan Feb 28, 2025
5b45671
[SPARK-51316][PYTHON][FOLLOW-UP] Revert unrelated changes and mark ma…
HyukjinKwon Feb 28, 2025
2bb6398
Add ThreadDumpCollector
roczei Oct 3, 2024
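The substance of this PR is the last commit above, which adds a ThreadDumpCollector for periodically capturing jstack-style thread dumps from running Spark applications. A minimal in-process sketch of the underlying idea, using only the JDK — the class and method names here are illustrative and are not the PR's actual API:

```java
import java.util.Map;

// Illustrative sketch: render a jstack-like snapshot of all live threads
// from inside the JVM, the core primitive a thread-dump collector builds on.
public class ThreadDumpSketch {
    public static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            // One header line per thread, mirroring jstack's format loosely.
            sb.append('"').append(t.getName())
              .append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dumpAllThreads());
    }
}
```

A real collector would run this on a schedule (or on demand via a signal/RPC) and write each snapshot to the executor's log directory; the sketch only shows the capture step.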
The diff you're trying to view is too large. We only load the first 3000 changed files.
3 changes: 2 additions & 1 deletion .github/PULL_REQUEST_TEMPLATE
Original file line number Diff line number Diff line change
@@ -33,7 +33,8 @@ Please clarify why the changes are needed. For instance,

### Does this PR introduce _any_ user-facing change?
<!--
Note that it means *any* user-facing change including all aspects such as the documentation fix.
Note that it means *any* user-facing change including all aspects such as new features, bug fixes, or other behavior changes. Documentation-only updates are not considered user-facing changes.

If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
If possible, please also clarify if this is a user-facing change compared to the released Spark versions or within the unreleased branches such as master.
If no, write 'No'.
22 changes: 9 additions & 13 deletions .github/labeler.yml
@@ -93,9 +93,9 @@ SQL:
- changed-files:
- all-globs-to-any-file: [
'**/sql/**/*',
'!python/pyspark/sql/avro/**/*',
'!python/pyspark/sql/streaming/**/*',
'!python/pyspark/sql/tests/streaming/test_streaming*.py'
'!python/**/avro/**/*',
'!python/**/protobuf/**/*',
'!python/**/streaming/**/*'
]
- any-glob-to-any-file: [
'common/unsafe/**/*',
@@ -119,7 +119,7 @@ AVRO:
- changed-files:
- any-glob-to-any-file: [
'connector/avro/**/*',
'python/pyspark/sql/avro/**/*'
'python/**/avro/**/*'
]

DSTREAM:
@@ -152,18 +152,16 @@ ML:
MLLIB:
- changed-files:
- any-glob-to-any-file: [
'**/spark/mllib/**/*',
'mllib-local/**/*',
'python/pyspark/mllib/**/*'
'**/mllib/**/*',
'mllib-local/**/*'
]

STRUCTURED STREAMING:
- changed-files:
- any-glob-to-any-file: [
'**/sql/**/streaming/**/*',
'connector/kafka-0-10-sql/**/*',
'python/pyspark/sql/streaming/**/*',
'python/pyspark/sql/tests/streaming/test_streaming*.py',
'python/pyspark/sql/**/streaming/**/*',
'**/*streaming.R'
]

@@ -225,14 +223,12 @@ CONNECT:
- changed-files:
- any-glob-to-any-file: [
'sql/connect/**/*',
'connector/connect/**/*',
'python/pyspark/sql/**/connect/**/*',
'python/pyspark/ml/**/connect/**/*'
'python/**/connect/**/*'
]

PROTOBUF:
- changed-files:
- any-glob-to-any-file: [
'connector/protobuf/**/*',
'python/pyspark/sql/protobuf/**/*'
'python/**/protobuf/**/*'
]
4 changes: 2 additions & 2 deletions .github/workflows/benchmark.yml
@@ -68,7 +68,7 @@ jobs:
tpcds-1g-gen:
name: "Generate an input dataset for TPCDSQueryBenchmark with SF=1"
if: contains(inputs.class, 'TPCDSQueryBenchmark') || contains(inputs.class, '*')
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
env:
SPARK_LOCAL_IP: localhost
steps:
@@ -105,7 +105,7 @@
uses: actions/checkout@v4
with:
repository: databricks/tpcds-kit
ref: 2a5078a782192ddb6efbcead8de9973d6ab4f069
ref: 1b7fb7529edae091684201fab142d956d6afd881
path: ./tpcds-kit
- name: Build tpcds-kit
if: steps.cache-tpcds-sf-1.outputs.cache-hit != 'true'
168 changes: 130 additions & 38 deletions .github/workflows/build_and_test.yml

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions .github/workflows/build_branch35.yml
@@ -22,6 +22,7 @@ name: "Build (branch-3.5, Scala 2.13, Hadoop 3, JDK 8)"
on:
schedule:
- cron: '0 11 * * *'
workflow_dispatch:

jobs:
run-build:
@@ -37,6 +38,7 @@ jobs:
envs: >-
{
"SCALA_PROFILE": "scala2.13",
"PYSPARK_IMAGE_TO_TEST": "",
"PYTHON_TO_TEST": "",
"ORACLE_DOCKER_IMAGE_NAME": "gvenzl/oracle-xe:21.3.0"
}
2 changes: 2 additions & 0 deletions .github/workflows/build_branch35_python.yml
@@ -22,6 +22,7 @@ name: "Build / Python-only (branch-3.5)"
on:
schedule:
- cron: '0 11 * * *'
workflow_dispatch:

jobs:
run-build:
@@ -36,6 +37,7 @@ jobs:
hadoop: hadoop3
envs: >-
{
"PYSPARK_IMAGE_TO_TEST": "",
"PYTHON_TO_TEST": ""
}
jobs: >-
53 changes: 53 additions & 0 deletions .github/workflows/build_branch40.yml
@@ -0,0 +1,53 @@
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

name: "Build (branch-4.0, Scala 2.13, Hadoop 3, JDK 17)"

on:
schedule:
- cron: '0 12 * * *'
workflow_dispatch:

jobs:
run-build:
permissions:
packages: write
name: Run
uses: ./.github/workflows/build_and_test.yml
if: github.repository == 'apache/spark'
with:
java: 17
branch: branch-4.0
hadoop: hadoop3
envs: >-
{
"SCALA_PROFILE": "scala2.13",
"PYSPARK_IMAGE_TO_TEST": "",
"PYTHON_TO_TEST": "",
"ORACLE_DOCKER_IMAGE_NAME": "gvenzl/oracle-free:23.6-slim"
}
jobs: >-
{
"build": "true",
"sparkr": "true",
"tpcds-1g": "true",
"docker-integration-tests": "true",
"k8s-integration-tests": "true",
"lint" : "true"
}
57 changes: 57 additions & 0 deletions .github/workflows/build_branch40_java21.yml
@@ -0,0 +1,57 @@
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

name: "Build (branch-4.0, Scala 2.13, Hadoop 3, JDK 21)"

on:
schedule:
- cron: '0 5 * * *'
workflow_dispatch:

jobs:
run-build:
permissions:
packages: write
name: Run
uses: ./.github/workflows/build_and_test.yml
if: github.repository == 'apache/spark'
with:
java: 21
branch: branch-4.0
hadoop: hadoop3
envs: >-
{
"PYSPARK_IMAGE_TO_TEST": "python-311",
"PYTHON_TO_TEST": "python3.11",
"SKIP_MIMA": "true",
"SKIP_UNIDOC": "true",
"DEDICATED_JVM_SBT_TESTS": "org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormatV1Suite,org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormatV2Suite,org.apache.spark.sql.execution.datasources.orc.OrcSourceV1Suite,org.apache.spark.sql.execution.datasources.orc.OrcSourceV2Suite"
}
jobs: >-
{
"build": "true",
"pyspark": "true",
"sparkr": "true",
"tpcds-1g": "true",
"docker-integration-tests": "true",
"yarn": "true",
"k8s-integration-tests": "true",
"buf": "true",
"ui": "true"
}
35 changes: 35 additions & 0 deletions .github/workflows/build_branch40_maven.yml
@@ -0,0 +1,35 @@
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

name: "Build / Maven (branch-4.0, Scala 2.13, Hadoop 3, JDK 17)"

on:
schedule:
- cron: '0 14 * * *'
workflow_dispatch:

jobs:
run-build:
permissions:
packages: write
name: Run
uses: ./.github/workflows/maven_test.yml
if: github.repository == 'apache/spark'
with:
branch: branch-4.0
36 changes: 36 additions & 0 deletions .github/workflows/build_branch40_maven_java21.yml
@@ -0,0 +1,36 @@
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

name: "Build / Maven (branch-4.0, Scala 2.13, Hadoop 3, JDK 21)"

on:
schedule:
- cron: '0 14 * * *'
workflow_dispatch:

jobs:
run-build:
permissions:
packages: write
name: Run
uses: ./.github/workflows/maven_test.yml
if: github.repository == 'apache/spark'
with:
branch: branch-4.0
java: 21
53 changes: 53 additions & 0 deletions .github/workflows/build_branch40_non_ansi.yml
@@ -0,0 +1,53 @@
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

name: "Build / Non-ANSI (branch-4.0, Hadoop 3, JDK 17, Scala 2.13)"

on:
schedule:
- cron: '0 2 * * *'
workflow_dispatch:

jobs:
run-build:
permissions:
packages: write
name: Run
uses: ./.github/workflows/build_and_test.yml
if: github.repository == 'apache/spark'
with:
java: 17
branch: branch-4.0
hadoop: hadoop3
envs: >-
{
"PYSPARK_IMAGE_TO_TEST": "python-311",
"PYTHON_TO_TEST": "python3.11",
        "SPARK_ANSI_SQL_MODE": "false"
      }
jobs: >-
{
"build": "true",
"docs": "true",
"pyspark": "true",
"sparkr": "true",
"tpcds-1g": "true",
"docker-integration-tests": "true",
"yarn": "true"
}