[SPARK-50394][PYTHON][INFRA] Reduce parallelism in Pure Python library builds

### What changes were proposed in this pull request?

This PR proposes to reduce the parallelism of the Pandas API on Spark tests in the Pure Python library builds from 4 to 2.

### Why are the changes needed?

To make the following scheduled builds more robust:

https://github.com/apache/spark/actions/workflows/build_python_connect.yml
https://github.com/apache/spark/actions/workflows/build_python_connect35.yml

They currently fail with out-of-memory (OOM) errors.
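
As a rough illustration of why a lower `--parallelism` value can help here: the setting caps how many test tasks are alive at once, so peak memory pressure tends to drop with it. The sketch below is not how `python/run-tests` is implemented. `peak_concurrency` and the dummy task are hypothetical names used only for illustration, and threads stand in for the real test processes.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(num_tasks: int, parallelism: int) -> int:
    """Run num_tasks dummy tasks with at most `parallelism` workers and
    return the highest number of tasks observed running at the same time."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def task() -> None:
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        # Hypothetical stand-in for a memory-hungry test module.
        time.sleep(0.05)
        with lock:
            running -= 1

    # The pool never runs more than `parallelism` tasks concurrently.
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        futures = [pool.submit(task) for _ in range(num_tasks)]
        for future in futures:
            future.result()
    return peak


if __name__ == "__main__":
    for p in (4, 2):
        print(f"parallelism={p}: peak concurrent tasks = {peak_concurrency(8, p)}")
```

The observed peak never exceeds the configured parallelism, so halving it halves the upper bound on concurrently running tasks, at the cost of a longer wall-clock run.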

### Does this PR introduce _any_ user-facing change?

No, this is a test-only change.

### How was this patch tested?

The scheduled builds will be monitored after this change is merged:

https://github.com/apache/spark/actions/workflows/build_python_connect.yml
https://github.com/apache/spark/actions/workflows/build_python_connect35.yml

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #48932 from HyukjinKwon/reduce-parallelism.

Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
HyukjinKwon committed Nov 22, 2024
1 parent 5e076ef commit f9d2f42
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/build_python_connect.yml
@@ -93,7 +93,7 @@ jobs:
 # Several tests related to catalog requires to run them sequencially, e.g., writing a table in a listener.
 ./python/run-tests --parallelism=1 --python-executables=python3 --modules pyspark-connect,pyspark-ml-connect
 # None of tests are dependent on each other in Pandas API on Spark so run them in parallel
-./python/run-tests --parallelism=4 --python-executables=python3 --modules pyspark-pandas-connect-part0,pyspark-pandas-connect-part1,pyspark-pandas-connect-part2,pyspark-pandas-connect-part3
+./python/run-tests --parallelism=2 --python-executables=python3 --modules pyspark-pandas-connect-part0,pyspark-pandas-connect-part1,pyspark-pandas-connect-part2,pyspark-pandas-connect-part3
 # Stop Spark Connect server.
 ./sbin/stop-connect-server.sh
2 changes: 1 addition & 1 deletion .github/workflows/build_python_connect35.yml
@@ -98,7 +98,7 @@ jobs:
 # Run branch-3.5 tests
 ./python/run-tests --parallelism=1 --python-executables=python3 --modules pyspark-connect
 # None of tests are dependent on each other in Pandas API on Spark so run them in parallel
-./python/run-tests --parallelism=4 --python-executables=python3 --modules pyspark-pandas-connect,pyspark-pandas-slow-connect
+./python/run-tests --parallelism=2 --python-executables=python3 --modules pyspark-pandas-connect,pyspark-pandas-slow-connect
 - name: Upload test results to report
 if: always()
 uses: actions/upload-artifact@v4
