-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-48867][BUILD] Upgrade okhttp
to 4.12.0, okio
to 3.9.0 and esdk-obs-java
to 3.24.3
#47795
base: master
Are you sure you want to change the base?
Conversation
okhttp
to 4.12.0 and okio
to 3.6.0okhttp
to 4.12.0 and okio
to 3.9.0
I have just seen that @melin opened the following Hadoop pull requests: HADOOP-19224 (3.4.1)Upgrade esdk to the latest version 3.24.3 apache:trunk: apache/hadoop#7003 where the esdk-obs-java library will be upgraded to 3.24.3. This is good news because it depends on okhttp 4.12.0: https://mvnrepository.com/artifact/com.huaweicloud/esdk-obs-java/3.24.3 It will be part of the next Hadoop release: 3.4.1. Currently Spark depends on Hadoop 3.4.0. If Spark upgrades to Hadoop 3.4.1, my changes in hadoop-cloud/pom.xml can be reverted |
@@ -154,6 +154,9 @@ json4s-scalap_2.13/4.0.7//json4s-scalap_2.13-4.0.7.jar | |||
jsr305/3.0.0//jsr305-3.0.0.jar | |||
jta/1.1//jta-1.1.jar | |||
jul-to-slf4j/2.0.16//jul-to-slf4j-2.0.16.jar | |||
kotlin-stdlib-jdk7/2.0.10//kotlin-stdlib-jdk7-2.0.10.jar | |||
kotlin-stdlib-jdk8/2.0.10//kotlin-stdlib-jdk8-2.0.10.jar | |||
kotlin-stdlib/2.0.10//kotlin-stdlib-2.0.10.jar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a little bit crazy to pull another JVM lang runtime into Spark classpath by just consuming an HTTP library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I agree that it is bad that we have to add kotlin-stdlib/2.0.10//kotlin-stdlib-2.0.10.jar as a dependency. I have just removed the other two extra dependencies because we do not need them. Related comment: #47795 (comment)
As I mentioned in the pull request's description the kubernetes-client's maintainers do not want upgrade to okhttp 4.x because it's based on Kotlin, they recommend to exclude 3.x. Related documentation:
https://github.com/fabric8io/kubernetes-client/blob/main/doc/KubernetesClientWithIPv6Clusters.md
Currently I do not see other solution to resolve these CVEs: CVE-2021-0341, CVE-2023-0833
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kubernetes-httpclient-vertx replace kubernetes-httpclient-okhttp to avoid this problem
fabric8io/kubernetes-client#2764
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@melin the mockserver is not ported over..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hadoop uses the v3 mockwebserver in test dependencies only, which also seems to be the only place that koitlin comes in.
it's a little bit crazy to pull another JVM lang runtime into Spark classpath by just consuming an HTTP librar
mmm. maybe they've found koitlin a good language for coding; as more projects use then it will becomes less unusual -instead becoming just another dependency pain point à la guava
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@roczei and @melin her fabric8io/kubernetes-client#5632 is the POC for replacing the MockWebServer but as the PR it is not merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
@@ -231,7 +231,9 @@ | |||
<!-- org.fusesource.leveldbjni will be used except on arm64 platform. --> | |||
<leveldbjni.group>org.fusesource.leveldbjni</leveldbjni.group> | |||
<kubernetes-client.version>6.13.3</kubernetes-client.version> | |||
<okio.version>1.17.6</okio.version> | |||
<okio.version>3.9.0</okio.version> | |||
<okhttp.version>4.12.0</okhttp.version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have uncompressed the kotlin-stdlib-2.0.10.jar file and all class files are part of the kotlin directory / package:
[roczei@roczei-MBP16 2.0.10]$ jar tf kotlin-stdlib-2.0.10.jar | head
META-INF/
META-INF/MANIFEST.MF
META-INF/kotlin-stdlib.kotlin_module
kotlin/
kotlin/ArrayIntrinsicsKt.class
kotlin/BuilderInference.class
kotlin/CharCodeJVMKt.class
kotlin/CharCodeKt.class
kotlin/CompareToKt.class
kotlin/ContextFunctionTypeParams.class
[roczei@roczei-MBP16 2.0.10]$
All Spark unit tests have passed in this separate pull request: roczei#4. Currently I can only validate these, I hope it will be enough.
a202e72
to
bc59cfb
Compare
I have updated again this pull request and excluded the
https://mvnrepository.com/artifact/com.huaweicloud/esdk-obs-java/3.24.3 |
|
@dongjoon-hyun, could you please take a look again? I have answered all of the questions. What do you suggest, how should we go further with this pull request? |
921ad50
to
57a028a
Compare
I have uploaded a new version, excluded the vulnerable esdk-obs-java version and added the good one: 3.24.3 which depends on okhttp 4.12.0 Main benefit: there will be no user facing change because the org.apache.hadoop:hadoop-huaweicloud dependency will be not excluded. |
yes, we are getting closer now. if you update the commit message, it can be easier to follow what and why you have done things. add that esdk-obs-java is updated. and add that kubernet-client have now decided to updated okhttp from 3.x to 4.x for there upcoming version 7. |
I would like to make sure that I understand it correctly, so I should not use force push in the future when I change something in the code, each update should be a separate commit with a specific commit message which can help you to follow what I did, please confirm. |
update that and add that kubernet-client have now decided to updated okhttp from 3.x to 4.x for there upcoming version 7. "The kubernetes-client's maintainers do not want upgrade to okhttp 4.x |
okhttp
to 4.12.0 and okio
to 3.9.0okhttp
to 4.12.0, okio
to 3.9.0 and esdk-obs-java
to 3.24.3
Thanks @bjornjorgensen! Applied your recommendations. |
it seams that huawei have not updating there hadoop-huaweicloud client to esdk version 3.24.3 in master https://github.com/huaweicloud/obsa-hdfs/blob/2a6357f6689c731dacf1b28c025d462e9be0d6f4/hadoop-huaweicloud/pom.xml#L32 can it be that it don't work or ? |
</exclusion> | ||
</exclusions> | ||
</dependency> | ||
<dependency> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will it actually work with this removal? if not best to stop trying to restore it and exclude all huaweicloud support with the release note/spark docs saying "explicitly import it"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late reply
will it actually work with this removal?
I do not have access to Huawei Cloud, therefore I could not test it with an obs bucket. Found this documentation for Spark testing: https://support.huaweicloud.com/intl/en-us/devg-dli/dli_09_0205.html#section6
I don't think apache spark have any tests for this, does hadoop have it
I did some research and found only Hadoop unit tests but these are disabled by default. Related documentation:
Similar configuration has to be created if somebody has such credentials:
$ cat src/test/resources/auth-keys.xml
<configuration>
<property>
<name>fs.contract.test.fs.obs</name>
<value>obs://testobscontract</value>
</property>
<property>
<name>fs.obs.access.key</name>
<value>secret</value>
</property>
<property>
<name>fs.obs.secret.key</name>
<value>secret</value>
</property>
</configuration>
$
Just to be on the safe side, I agree that we should do what @steveloughran suggested above:
if not best to stop trying to restore it and exclude all huaweicloud support with the release note/spark docs saying "explicitly import it"
@pan3793 / @panbingkun / @dongjoon-hyun / @melin / @bjornjorgensen what is your opinion about this suggestion? If you agre as well, I would like to implement these:
- Exclude the whole org.apache.hadoop:hadoop-huaweicloud artifact instead of com.huaweicloud:esdk-obs-java
- Update the "Does this PR introduce any user-facing change" section and mention that it includes this user facing change
- Request to add it to the release notes but I do not know what is the proper way to do it in case of Apache Spark. Please share with me if you know the official process for this. Thanks!
…`esdk-obs-java` to 3.24.3 ### What changes were proposed in this pull request? This PR aims to upgrade `okhttp` to 4.12.0, `okio` to 3.9.0 and `esdk-obs-java` to 3.24.3. ### Why are the changes needed? okhttp depends on okio which has to be upgraded as well. The new okhttp version fixes the following vulnerabilities: 1) CVE-2023-0833 - A flaw was found in Red Hat's AMQ-Streams, which ships a version of the OKHttp component with an information disclosure flaw via an exception triggered by a header containing an illegal value. This issue could allow an authenticated attacker to access information outside of their regular permissions. CVSSv3 Score:- 5.5(Medium) https://nvd.nist.gov/vuln/detail/CVE-2023-0833 2) CVE-2021-0341 - In verifyHostName of OkHostnameVerifier.java, there is a possible way to accept a certificate for the wrong domain due to improperly used crypto. This could lead to remote information disclosure with no additional execution privileges needed. User interaction is not needed for exploitation. CVSSv3 Score:- 7.5(High) https://nvd.nist.gov/vuln/detail/CVE-2021-0341 square/okhttp#6724 There are two places in the Spark repository where the okhttp dependency comes in as transitive dependency: 1) [INFO] +- org.apache.hadoop:hadoop-cloud-storage:jar:3.4.0:compile [INFO] | +- org.apache.hadoop:hadoop-annotations:jar:3.4.0:compile [INFO] | +- org.apache.hadoop:hadoop-aliyun:jar:3.4.0:compile [INFO] | | +- com.aliyun.oss:aliyun-sdk-oss:jar:3.13.2:compile [INFO] | | | +- org.jdom:jdom2:jar:2.0.6:compile [INFO] | | | +- com.aliyun:aliyun-java-sdk-core:jar:4.5.10:compile [INFO] | | | | +- org.ini4j:ini4j:jar:0.5.4:compile [INFO] | | | | +- io.opentracing:opentracing-api:jar:0.33.0:compile [INFO] | | | | \- io.opentracing:opentracing-util:jar:0.33.0:compile [INFO] | | | | \- io.opentracing:opentracing-noop:jar:0.33.0:compile [INFO] | | | +- com.aliyun:aliyun-java-sdk-ram:jar:3.1.0:compile [INFO] | | | \- com.aliyun:aliyun-java-sdk-kms:jar:2.11.0:compile [INFO] | | \- org.codehaus.jettison:jettison:jar:1.5.4:compile [INFO] | +- org.apache.hadoop:hadoop-azure-datalake:jar:3.4.0:compile [INFO] | | \- com.microsoft.azure:azure-data-lake-store-sdk:jar:2.3.9:compile [INFO] | \- org.apache.hadoop:hadoop-huaweicloud:jar:3.4.0:compile [INFO] | \- com.huaweicloud:esdk-obs-java:jar:3.20.4.2:compile [INFO] | +- com.jamesmurty.utils:java-xmlbuilder:jar:1.2:compile [INFO] | +- com.squareup.okhttp3:okhttp:jar:3.14.2:compile [INFO] | \- com.squareup.okio:okio:jar:1.17.6:compile The Hadoop team has attempted to remove okhttp from their codebase: remove okhttp usage: https://issues.apache.org/jira/browse/HADOOP-18890 Unfortunately the hadoop-huaweicloud dependency is still there which pulls in the vulnerable okhttp 3.x version. https://github.com/apache/hadoop/blob/trunk/hadoop-cloud-storage-project/hadoop-cloud-storage/pom.xml#L137C19-L137C37 Proposed solution for this: com.huaweicloud:esdk-obs-java:jar:3.20.4.2 is vulnerable due to okhttp 3.x (CVE-2023-0833, CVE-2021-0341), it has to be upgraded to 3.24.3 which depends on okhttp 4.12.0 2) [INFO] +- org.apache.spark:spark-kubernetes_2.13:jar:4.0.0-SNAPSHOT:compile [INFO] | +- io.fabric8:kubernetes-httpclient-okhttp:jar:6.13.3:compile [INFO] | | +- io.fabric8:kubernetes-client-api:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-core:jar:6.13.3:compile [INFO] | | | | \- io.fabric8:kubernetes-model-common:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-gatewayapi:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-resource:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-rbac:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-admissionregistration:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-apps:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-autoscaling:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-apiextensions:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-batch:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-certificates:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-coordination:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-discovery:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-events:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-extensions:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-flowcontrol:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-networking:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-metrics:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-policy:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-scheduling:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-storageclass:jar:6.13.3:compile [INFO] | | | +- io.fabric8:kubernetes-model-node:jar:6.13.3:compile [INFO] | | | \- org.snakeyaml:snakeyaml-engine:jar:2.7:compile [INFO] | | +- com.squareup.okhttp3:okhttp:jar:3.12.12:compile [INFO] | | | \- com.squareup.okio:okio:jar:1.17.6:compile [INFO] | | \- com.squareup.okhttp3:logging-interceptor:jar:3.12.12:compile kubernet-client maintainers have decided to update okhttp from 3.x to 4.x in their upcoming version 7: fabric8io/kubernetes-client#5778 My proposed solution based on the above finding: Exclude the 3.x version and switch to use okhttp 4.x. Source: https://github.com/fabric8io/kubernetes-client/blob/main/doc/KubernetesClientWithIPv6Clusters.md It is binary backwards compatible with okhttp 3.x. More details are here: https://square.github.io/okhttp/upgrading_to_okhttp_4/ ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CIs. ### Was this patch authored or co-authored using generative AI tooling? No.
a206eef
to
5045380
Compare
FYI fabric8io/kubernetes-client#6661 feat: user Vert.x as the default HttpClient implementation |
Now we have a PR for upgrading to version 7.0.0 #49066 |
What changes were proposed in this pull request?
This PR aims to upgrade
okhttp
to 4.12.0,okio
to 3.9.0 andesdk-obs-java
to 3.24.3.Why are the changes needed?
okhttp depends on okio which has to be upgraded as well.
The new okhttp version fixes the following vulnerabilities:
CVE-2023-0833 - A flaw was found in Red Hat's AMQ-Streams,
which ships a version of the OKHttp component with an
information disclosure flaw via an exception triggered
by a header containing an illegal value. This issue could allow
an authenticated attacker to access information outside of their
regular permissions.
CVSSv3 Score:- 5.5(Medium)
https://nvd.nist.gov/vuln/detail/CVE-2023-0833
CVE-2021-0341 - In verifyHostName of OkHostnameVerifier.java,
there is a possible way to accept a certificate for the wrong
domain due to improperly used crypto. This could lead to remote
information disclosure with no additional execution privileges
needed. User interaction is not needed for exploitation.
CVSSv3 Score:- 7.5(High)
https://nvd.nist.gov/vuln/detail/CVE-2021-0341
square/okhttp#6724
There are two places in the Spark repository where the okhttp dependency comes
in as transitive dependency:
[INFO] +- org.apache.hadoop:hadoop-cloud-storage:jar:3.4.0:compile
[INFO] | +- org.apache.hadoop:hadoop-annotations:jar:3.4.0:compile
[INFO] | +- org.apache.hadoop:hadoop-aliyun:jar:3.4.0:compile
[INFO] | | +- com.aliyun.oss:aliyun-sdk-oss:jar:3.13.2:compile
[INFO] | | | +- org.jdom:jdom2:jar:2.0.6:compile
[INFO] | | | +- com.aliyun:aliyun-java-sdk-core:jar:4.5.10:compile
[INFO] | | | | +- org.ini4j:ini4j:jar:0.5.4:compile
[INFO] | | | | +- io.opentracing:opentracing-api:jar:0.33.0:compile
[INFO] | | | | - io.opentracing:opentracing-util:jar:0.33.0:compile
[INFO] | | | | - io.opentracing:opentracing-noop:jar:0.33.0:compile
[INFO] | | | +- com.aliyun:aliyun-java-sdk-ram:jar:3.1.0:compile
[INFO] | | | - com.aliyun:aliyun-java-sdk-kms:jar:2.11.0:compile
[INFO] | | - org.codehaus.jettison:jettison:jar:1.5.4:compile
[INFO] | +- org.apache.hadoop:hadoop-azure-datalake:jar:3.4.0:compile
[INFO] | | - com.microsoft.azure:azure-data-lake-store-sdk:jar:2.3.9:compile
[INFO] | - org.apache.hadoop:hadoop-huaweicloud:jar:3.4.0:compile
[INFO] | - com.huaweicloud:esdk-obs-java:jar:3.20.4.2:compile
[INFO] | +- com.jamesmurty.utils:java-xmlbuilder:jar:1.2:compile
[INFO] | +- com.squareup.okhttp3:okhttp:jar:3.14.2:compile
[INFO] | - com.squareup.okio:okio:jar:1.17.6:compile
The Hadoop team has attempted to remove okhttp from their codebase:
remove okhttp usage: https://issues.apache.org/jira/browse/HADOOP-18890
Unfortunately the hadoop-huaweicloud dependency is still there which
pulls in the vulnerable okhttp 3.x version.
https://github.com/apache/hadoop/blob/trunk/hadoop-cloud-storage-project/hadoop-cloud-storage/pom.xml#L137C19-L137C37
Proposed solution for this:
com.huaweicloud:esdk-obs-java:jar:3.20.4.2 is vulnerable due to
okhttp 3.x (CVE-2023-0833, CVE-2021-0341), it has to be upgraded to 3.24.3
which depends on okhttp 4.12.0
[INFO] +- org.apache.spark:spark-kubernetes_2.13:jar:4.0.0-SNAPSHOT:compile
[INFO] | +- io.fabric8:kubernetes-httpclient-okhttp:jar:6.13.3:compile
[INFO] | | +- io.fabric8:kubernetes-client-api:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-core:jar:6.13.3:compile
[INFO] | | | | - io.fabric8:kubernetes-model-common:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-gatewayapi:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-resource:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-rbac:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-admissionregistration:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-apps:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-autoscaling:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-apiextensions:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-batch:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-certificates:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-coordination:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-discovery:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-events:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-extensions:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-flowcontrol:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-networking:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-metrics:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-policy:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-scheduling:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-storageclass:jar:6.13.3:compile
[INFO] | | | +- io.fabric8:kubernetes-model-node:jar:6.13.3:compile
[INFO] | | | - org.snakeyaml:snakeyaml-engine:jar:2.7:compile
[INFO] | | +- com.squareup.okhttp3:okhttp:jar:3.12.12:compile
[INFO] | | | - com.squareup.okio:okio:jar:1.17.6:compile
[INFO] | | - com.squareup.okhttp3:logging-interceptor:jar:3.12.12:compile
kubernet-client maintainers have decided to update okhttp from 3.x to 4.x in their upcoming version 7:
fabric8io/kubernetes-client#5778
My proposed solution based on the above finding:
Exclude the 3.x version and switch to use okhttp 4.x. Source:
https://github.com/fabric8io/kubernetes-client/blob/main/doc/KubernetesClientWithIPv6Clusters.md
It is binary backwards compatible with okhttp 3.x. More details are here:
https://square.github.io/okhttp/upgrading_to_okhttp_4/
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Pass the CIs.
Was this patch authored or co-authored using generative AI tooling?
No.