
[FEA] make deployed artifact pom self-contained without parent poms #10051

Open
gerashegalov opened this issue Dec 14, 2023 · 4 comments
Labels
build (Related to CI / CD or cleanly building), tech debt

Comments

@gerashegalov
Collaborator

gerashegalov commented Dec 14, 2023

As of 23.12 and earlier, we rely on separately deployed intermediate parent poms such as:

rapids-4-spark-parent/
rapids-4-spark-private-parent/

Moreover, there are separate Scala 2.13 parent poms under https://oss.sonatype.org/content/repositories/snapshots/com/nvidia/:

rapids-4-spark-private-parent_2.12/
rapids-4-spark-private-parent_2.13/
rapids-4-spark-private_2.12/
rapids-4-spark-private_2.13/

There is deliberately no shared parent pom, in order to avoid dependencies between pipelines.

There is additional complexity from the fact that we do not deploy:deploy the artifacts with all related projects, but instead selectively deploy artifacts with deploy:deploy-file, which makes the process susceptible to accidental omissions of intermediate dependencies.

This issue proposes to fold parent pom definitions into the deployed leaf poms, with the effect that there is no implicit dependency on any internal pom. The self-contained pom should contain just enough detail for the artifact to be used as a dependency by external projects, and for dependency:get to not require any intermediate poms. This may be achievable via flatten:flatten or similar custom logic, whose output would then be used with deploy:deploy-file.
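As a rough illustration, the flatten step and the subsequent deploy-file call could be chained as below. This is a dry-run sketch: the module name, version, jar path, and the flatten.mode property value are assumptions for illustration, not the actual build settings.

```shell
#!/usr/bin/env bash
# Sketch (untested): flatten the leaf pom so the deployed pom has no parent,
# then deploy the flattened pom alongside the jar with deploy:deploy-file.
set -euo pipefail

MODULE=dist                         # hypothetical leaf module
REPO_URL=file:/tmp/local-release-repo

# flatten-maven-plugin writes .flattened-pom.xml into the module basedir by
# default; an "ossrh" flatten mode keeps just the metadata Central requires.
FLATTEN_CMD="mvn -pl $MODULE org.codehaus.mojo:flatten-maven-plugin:flatten -Dflatten.mode=ossrh"

DEPLOY_CMD="mvn deploy:deploy-file -Durl=$REPO_URL \
  -DpomFile=$MODULE/.flattened-pom.xml \
  -Dfile=$MODULE/target/rapids-4-spark_2.12-24.02.0.jar"

# echo instead of executing so this stays a dry-run illustration
echo "$FLATTEN_CMD"
echo "$DEPLOY_CMD"
```

With a flattened pom deployed, an external consumer resolves the artifact without ever touching rapids-4-spark-parent or the other intermediate poms.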

Note that this issue is only about the poms that are published externally to Central for other projects to consume spark-rapids as a dependency, which we already massage before deploying.

Internally, we should feel free to maintain whatever pom structure is most suitable for building spark-rapids.

@gerashegalov added the feature request (New feature or request), ? - Needs Triage (Need team to review and classify), and build (Related to CI / CD or cleanly building) labels on Dec 14, 2023
@mattahrens removed the ? - Needs Triage (Need team to review and classify) label on Dec 19, 2023
@NvTimLiu
Collaborator

NvTimLiu commented Jan 2, 2024

@gerashegalov I'm adding a dependency check for the pre-merge and nightly build jobs, following the method we discussed:

1. Create a fresh local dir such as /tmp/local-release-repo
2. mvn deploy:deploy-file -Durl=file:/tmp/local-release-repo xxx \
3. Create another fresh local directory, /tmp/test-get-dest
4. Create a fresh empty dir /tmp/m2-cache on every iteration, so that there are no cached side effects from one dependency:get to the next
5. Execute dependency:get xxx
    -Dmaven.repo.local=/tmp/m2-cache
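The steps above could be sketched as the following dry-run loop. The artifact coordinates are placeholders for the real GAVs, and the echo would be dropped to actually execute the check:

```shell
#!/usr/bin/env bash
# Sketch of the isolation check: each iteration gets a throwaway local repo
# so one dependency:get cannot mask missing poms for the next.
set -euo pipefail

REPO_URL=file:/tmp/local-release-repo
ARTIFACTS=(
  "com.nvidia:rapids-4-spark_2.12:24.02.0"     # hypothetical coordinates
  "com.nvidia:rapids-4-spark_2.13:24.02.0"
)

for gav in "${ARTIFACTS[@]}"; do
  M2_CACHE=$(mktemp -d /tmp/m2-cache.XXXXXX)   # fresh cache per iteration
  GET_CMD="mvn dependency:get -Dartifact=$gav \
    -DremoteRepositories=$REPO_URL \
    -Dmaven.repo.local=$M2_CACHE"
  echo "$GET_CMD"                              # dry-run illustration
  rm -rf "$M2_CACHE"
done
```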

The dependency check still fails on jdk-profiles on the current branch-24.02. Are you going to fix it first?

@gerashegalov
Collaborator Author

Hi @NvTimLiu

> The dependency check still failed on jdk-profiles on current branch-24.02, are you going to fix it first?

The test should mimic the release deploy logic irrespective of whether this issue (eliminating parent and intermediate poms) is fixed.

If the test is failing, it means the test does not yet reflect the fact that deploy-file for jdk-profiles is currently required.

Ideally the test should share the same script:

###### Deploy the parent pom file ######
$DEPLOY_CMD -Dfile=./pom.xml -DpomFile=./pom.xml
###### Deploy the artifact jar(s) ######
$DEPLOY_CMD -DpomFile=$POM_FILE \
-Dfile=$FPATH-$DEFAULT_CUDA_CLASSIFIER.jar \
-Dsources=$FPATH-sources.jar \
-Djavadoc=$FPATH-javadoc.jar \
-Dfiles=$DEPLOY_FILES \
-Dtypes=$DEPLOY_TYPES \
-Dclassifiers=$CLASSIFIERS

just with different parameters. It would be helpful to create a file listing the artifacts required for the release. Then we could use it as the input to both the release and the test script.
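As a sketch of that idea, assuming a hypothetical one-path-per-line list format (the paths and list-file handling below are illustrative, not the real deploy.sh interface):

```shell
#!/usr/bin/env bash
# Sketch: a plain-text list of release artifacts drives the same deploy-file
# logic in both the release script and the test script.
set -euo pipefail

LIST_FILE=$(mktemp)
cat > "$LIST_FILE" <<'EOF'
jdk-profiles/pom.xml
dist/target/rapids-4-spark_2.12-24.02.0.jar
EOF

DEPLOY_CMD="mvn deploy:deploy-file -Durl=file:/tmp/local-release-repo"
DEPLOYED=0
while IFS= read -r path; do
  case "$path" in
    *.xml) CMD="$DEPLOY_CMD -Dfile=$path -DpomFile=$path" ;;     # intermediate pom
    *.jar) CMD="$DEPLOY_CMD -Dfile=$path -DpomFile=dist/pom.xml" ;;
    *)     continue ;;
  esac
  echo "$CMD"                      # dry-run; drop echo to actually deploy
  DEPLOYED=$((DEPLOYED + 1))
done < "$LIST_FILE"
rm -f "$LIST_FILE"
```

Once jdk-profiles no longer needs its own pom deployed, its line is simply dropped from the list while the script stays unchanged.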

Once this issue is fixed we will drop the jdk-profiles module from this list, but the test remains valid.

@NvTimLiu
Collaborator

NvTimLiu commented Jan 4, 2024


@gerashegalov Thanks for the explanation.

I agree we can reuse the deploy script [spark-rapids/jenkins/deploy.sh](https://github.com/NVIDIA/spark-rapids/blob/ed1fa9f0c92f8c7c1a4cbdedf2b4d61db6d1beca/jenkins/deploy.sh#L97-L107) to do dependency check for the to-be-released artifacts.

A file listing the artifacts needs an extra script for deploy.sh to work with. E.g., given a
file list of: pom.xml, dist/pom.xml, rapids.jar, rapids-cuda11.jar, deploy.sh should parse out of it the parent pom, the dist pom, the classifier jar, the main jar, etc.
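A minimal sketch of that parsing step, with the example list hard-coded and the -cuda11 classifier naming assumed purely for illustration:

```shell
#!/usr/bin/env bash
# Sketch: derive the main jar plus the -Dfiles/-Dtypes/-Dclassifiers
# side-artifact lists from a file list like the example above.
set -euo pipefail

FILES=""; TYPES=""; CLASSIFIERS=""; MAIN_JAR=""
for path in pom.xml dist/pom.xml rapids.jar rapids-cuda11.jar; do
  case "$path" in
    *-cuda11.jar)                          # hypothetical classifier jar
      FILES="${FILES:+$FILES,}$path"
      TYPES="${TYPES:+$TYPES,}jar"
      CLASSIFIERS="${CLASSIFIERS:+$CLASSIFIERS,}cuda11"
      ;;
    *.jar) MAIN_JAR="$path" ;;             # main artifact
    *.xml) : ;;                            # poms get their own deploy-file call
  esac
done
echo "main=$MAIN_JAR files=$FILES types=$TYPES classifiers=$CLASSIFIERS"
```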

Overall, we need to refactor deploy.sh [spark-rapids/jenkins/deploy.sh](https://github.com/NVIDIA/spark-rapids/blob/ed1fa9f0c92f8c7c1a4cbdedf2b4d61db6d1beca/jenkins/deploy.sh#L97-L107) to work with the list file.

Let me check whether we should break this issue into separate tasks.

@gerashegalov
Collaborator Author

gerashegalov commented Jan 6, 2024

@NvTimLiu I'll file a separate issue for discussing a reproducible deploy test. To reiterate, we will need this test regardless of what our poms look like.

This issue (#10051) is about a potential simplification of the published poms.
