bazel fails to download packages when using a self-signed certificate #10

Open
adammoody opened this issue Feb 17, 2021 · 11 comments
Comments

@adammoody

I'm trying to build the Open-CE conda packages for my system, but it fails when building tensorboard. In particular, I'm trying to run the following command using a fresh clone:

./open-ce/open-ce/open-ce build env --output_folder /path/to/condabuild --mpi_types system ./open-ce-environments/envs/opence-env.yaml

I get an error like the following:

ERROR: Analysis of target '//tensorboard/pip_package:build_pip_package' failed; build aborted: no such package '@com_google_guava//': java.io.IOException: Error downloading [https://mirror.bazel.build/repo1.maven.org/maven2/com/google/guava/guava/25.1-jre/guava-25.1-jre.jar, https://repo1.maven.org/maven2/com/google/guava/guava/25.1-jre/guava-25.1-jre.jar] to /<snip>/27bd16533e5e4780bf8639cfd2308872/external/com_google_guava/guava-25.1-jre.jar: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

I found that the workaround listed at bazelbuild/bazel#5741 (comment) seems to help if I modify tensorboard-feedstock/buildscripts/set_python_path_for_bazelrc.sh to point to my cacerts file, e.g.,

cat > "$BAZEL_RC_DIR/python_configure.bazelrc" << EOF
startup --host_jvm_args=-Djavax.net.ssl.trustStore=/path/to/cacerts \
        --host_jvm_args=-Djavax.net.ssl.trustStorePassword=<password>
EOF

Is there an open-ce option to register a private cacerts file like this across the feedstocks?

@jayfurmanek
Contributor

Is there an open-ce option to register a private cacerts file like this across the feedstocks?

Currently, no. Typically we use the ca-certs from Anaconda where we can, but maybe you've found a build leak here.

For TensorFlow, we had to add a patch to not hardcode the ca bundle location:
https://github.com/open-ce/tensorflow-feedstock/blob/master/recipe/0107-do-not-hardcode-ca-cert-location.patch

We then set CURL_CA_BUNDLE in an activate script to pick up the anaconda certs:
https://github.com/open-ce/tensorflow-feedstock/blob/master/scripts/activate.sh#L41-L43

but we've not seen any other issues related to ca-certs. What OS are you using? And do you have the ca-certs in a different location? I'm wondering if bazel typically finds the OS-level certs for us but not for you.

Does it work for you if you set the below?

--host_jvm_args=-Djavax.net.ssl.trustStore=${CONDA_PREFIX}/ssl/cacert.pem

@adammoody
Author

adammoody commented Feb 17, 2021

Thanks, @jayfurmanek . In our case, the problem shows up due to a custom signed certificate we have for our web proxy. Rather than download files directly from the internet, everything passes through our web proxy, and that proxy has its own signed certificate. That certificate is not included in any public cacerts file. Instead, we have added it to a private cacerts file that is placed in a non-standard path on the file system.

There is some description in the linked issue above, and in this thread as well:
https://groups.google.com/g/bazel-discuss/c/13uPDObyfQg

I'm guessing anyone routing traffic through a web proxy with a custom signed certificate might run into a similar issue.
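
For anyone in the same situation, here is a minimal sketch of one way a private cacerts file like the one described above could be produced: copy the JDK's cacerts file and import the proxy's CA certificate into the copy with keytool. The function name, the alias local-proxy-ca, and the KEYTOOL override variable are hypothetical and chosen only for illustration; the paths are placeholders, and the default password assumes a stock JDK cacerts file.

```shell
#!/bin/sh
# Hypothetical helper: copy a JDK cacerts file and import one extra CA
# certificate into the copy. Set KEYTOOL to override which keytool is run.
# Usage: make_private_cacerts <jdk-cacerts> <proxy-cert.pem> <output-cacerts> [storepass]
make_private_cacerts() {
    src=$1
    cert=$2
    out=$3
    storepass=${4:-changeit}   # default password of a stock JDK cacerts file

    # Work on a copy so the JDK's own cacerts file is left untouched.
    cp "$src" "$out" || return 1

    "${KEYTOOL:-keytool}" -importcert -noprompt \
        -alias local-proxy-ca \
        -file "$cert" \
        -keystore "$out" \
        -storepass "$storepass"
}
```

It could be called as make_private_cacerts /path/to/jdk/cacerts /path/to/proxy-ca.pem /path/to/cacerts, and the resulting file is what -Djavax.net.ssl.trustStore would then point to.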

@adammoody
Author

I am building on a Red Hat system within a Singularity image that I think was produced from the Open-CE Dockerfile. I need to double-check, but I think we used this Dockerfile as a base:
https://github.com/open-ce/open-ce/blob/master/images/builder-cuda-ppc64le/Dockerfile.cuda-10.2

@jayfurmanek
Contributor

Ah, I see.
You may need to add the same thing for TensorFlow in the bazelrc, as well as set CURL_CA_BUNDLE in the TensorFlow activate script to point to your certificate. That last part is for when/if TF pulls things from TensorFlow Hub or any other online sources.

We don't have a current way of centrally controlling this, but we would accept any contributions to add it.
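
As a rough illustration of the CURL_CA_BUNDLE part, an activate-script fragment could look something like the sketch below. This is not the content of the actual open-ce activate script; the function name and the LOCAL_CA_BUNDLE variable are hypothetical. It only relies on curl honoring the CURL_CA_BUNDLE environment variable and on the ${CONDA_PREFIX}/ssl/cacert.pem location mentioned earlier in this thread.

```shell
#!/bin/sh
# Hypothetical activate-script fragment: point curl at a CA bundle, but only
# if the bundle file actually exists. LOCAL_CA_BUNDLE is an illustrative
# variable name, not an existing open-ce setting.
use_ca_bundle() {
    # Prefer a site-specific bundle; otherwise fall back to the conda one.
    bundle=${LOCAL_CA_BUNDLE:-${CONDA_PREFIX:-}/ssl/cacert.pem}
    if [ -f "$bundle" ]; then
        CURL_CA_BUNDLE=$bundle
        export CURL_CA_BUNDLE
    else
        echo "CA bundle not found: $bundle" >&2
        return 1
    fi
}
```

Leaving CURL_CA_BUNDLE untouched when the file is missing avoids replacing a working setting with a broken path.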

@adammoody
Author

Thanks, @jayfurmanek. I'm still investigating on my side whether I can do this more cleanly, perhaps by registering our local cert in the various files where conda and bazel already look. If not, one option might be to define a new open-ce option that takes the path to a local cacerts file and updates the various feedstocks to use that path.

@jayfurmanek
Contributor

Yeah, we already use the CUDA_HOME environment variable widely to specify where the cudatoolkit lives, so we could do something similar for optional ca-cert locations.

@adammoody
Author

I'm wondering whether things might work if I add our local cert to the ${CONDA_PREFIX}/ssl/cacert.pem file. It sounds like conda refers to that file, and that you've also redirected TensorFlow's curl command to it.

Do you also have bazel pointing to that, or does bazel use the cacerts from its built-in jdk?

@jayfurmanek
Contributor

Right, by default Bazel uses the ca-certs from the JVM it runs on, so you need the parameters you posted above to tell it to use different ones.
The certificates from Anaconda are used at runtime (where bazel and java are not present) when TensorFlow needs to use curl to access remote resources.
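
For the runtime (curl) side only, the idea of adding a local cert to ${CONDA_PREFIX}/ssl/cacert.pem could be sketched as below. The function name is hypothetical, and the sketch assumes both files are PEM text and that the bundle ends with a newline. It does not affect Bazel, which reads the JVM trust store, and a later update of the conda package that provides the bundle may overwrite the change.

```shell
#!/bin/sh
# Hypothetical helper: append a PEM certificate to a CA bundle unless the
# bundle already contains it. A backup copy is written next to the bundle.
# Usage: append_ca_cert <cert.pem> <bundle.pem>
append_ca_cert() {
    cert=$1
    bundle=$2

    # Use the first base64 line of the certificate as a cheap fingerprint.
    first_line=$(grep -v -e '-----' -e '^$' "$cert" | head -n 1)
    if [ -z "$first_line" ]; then
        echo "no certificate data found in $cert" >&2
        return 1
    fi

    # Skip the append if that line is already somewhere in the bundle.
    if grep -qF -e "$first_line" "$bundle" 2>/dev/null; then
        return 0
    fi

    cp "$bundle" "$bundle.bak" || return 1
    cat "$cert" >> "$bundle"
}
```

A call would look like append_ca_cert /path/to/proxy-ca.pem "${CONDA_PREFIX}/ssl/cacert.pem". Running it twice leaves the bundle unchanged the second time.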

@adammoody
Author

I started down the hard way of creating a patch for the bazelrc file in each feedstock repo, and then found that I could just add these settings once to a global bazelrc file, e.g.:

echo "startup --host_jvm_args=-Djavax.net.ssl.trustStore=/path/to/cacerts" >> ~/.bazelrc
echo "startup --host_jvm_args=-Djavax.net.ssl.trustStorePassword=changeit" >> ~/.bazelrc

In fact, there are two global resource files that one can use for this:
https://docs.bazel.build/versions/master/guide.html#bazelrc-the-bazel-configuration-file

/etc/bazel.bazelrc
~/.bazelrc

Bazel reads those files in addition to the project resource file and combines the settings. That greatly simplifies the process of pointing bazel to use a different trustStore file.
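
Since the echo commands above append a new copy of each line every time they run, a small wrapper can make the change repeatable. The following is only a sketch; the function name and its argument order are hypothetical, and the password default simply mirrors the changeit value used above.

```shell
#!/bin/sh
# Hypothetical helper: append the trustStore startup options to a bazelrc
# file only if they are not already present.
# Usage: add_truststore_to_bazelrc <cacerts-path> [password] [bazelrc]
add_truststore_to_bazelrc() {
    truststore=$1
    password=${2:-changeit}
    bazelrc=${3:-$HOME/.bazelrc}

    for line in \
        "startup --host_jvm_args=-Djavax.net.ssl.trustStore=$truststore" \
        "startup --host_jvm_args=-Djavax.net.ssl.trustStorePassword=$password"
    do
        # -x matches the whole line, -F treats the pattern as a fixed string.
        if ! grep -qxF -- "$line" "$bazelrc" 2>/dev/null; then
            printf '%s\n' "$line" >> "$bazelrc"
        fi
    done
}
```

Calling add_truststore_to_bazelrc /path/to/cacerts writes to ~/.bazelrc by default. Passing /etc/bazel.bazelrc as the third argument would target the system-wide file instead, which normally requires root.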

@jgallucci32

@adammoody This is great info. Thanks for pointing out the default locations for the .bazelrc file.

@adammoody
Author

Thanks, @jgallucci32. I'm content with this as a solution, so we can close the issue. A one- or two-line change to my .bazelrc file is easy. It may be helpful to include this as a troubleshooting tip somewhere in the docs in case others run into the same problem.

npanpaliya pushed a commit to npanpaliya/tensorboard-feedstock that referenced this issue Apr 28, 2021