Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-36695][Formats] Bump Parquet dependency to 1.14.4 #25646

Merged
merged 3 commits into from
Jan 17, 2025

Conversation

MartijnVisser
Copy link
Contributor

What is the purpose of the change

  • Bump Parquet dependency

Brief change log

  • Update POM and NOTICE files

Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): yes
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

@flinkbot
Copy link
Collaborator

flinkbot commented Nov 12, 2024

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@MartijnVisser
Copy link
Contributor Author

@flinkbot run azure

@snuyanzin
Copy link
Contributor

snuyanzin commented Nov 25, 2024

Hey @MartijnVisser
it seems the reason of failure

Nov 12 18:17:39 java.util.ServiceConfigurationError: com.fasterxml.jackson.databind.Module: Provider com.fasterxml.jackson.datatype.jdk8.Jdk8Module not found
Nov 12 18:17:39 	at java.base/java.util.ServiceLoader.fail(ServiceLoader.java:589)

(could be seen at https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=63635&view=logs&j=e60d5d3b-2e2f-5bb7-484a-16135ea73209&t=9363ed9e-7ac2-5835-c98d-4fdb89b19329&l=20959)

is that with parquet-jackson 1.13.x there were jackson-jaxrs and jackson-xc coming

flink/flink-formats/flink-parquet$ ../../mvnw dependency:tree | grep jackson
[INFO] |  +- org.apache.flink:flink-shaded-jackson:jar:2.15.3-19.0:provided
[INFO] |     \- com.fasterxml.jackson.core:jackson-core:jar:2.15.3:compile
[INFO] |  +- org.apache.parquet:parquet-jackson:jar:1.13.1:runtime
[INFO] |  |  +- org.codehaus.jackson:jackson-jaxrs:jar:1.8.3:provided
[INFO] |  |  \- org.codehaus.jackson:jackson-xc:jar:1.8.3:provided
[INFO] |  +- org.codehaus.jackson:jackson-core-asl:jar:1.9.13:provided
[INFO] |  +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:provided
[INFO] |  \- com.fasterxml.jackson.core:jackson-databind:jar:2.15.3:compile
[INFO] |     \- com.fasterxml.jackson.core:jackson-annotations:jar:2.15.3:compile

and with 1.14 it doesn't

flink/flink-formats/flink-parquet$ ../../mvnw dependency:tree | grep jackson
[INFO] |  +- org.apache.flink:flink-shaded-jackson:jar:2.15.3-19.0:provided
[INFO] |     \- com.fasterxml.jackson.core:jackson-core:jar:2.15.3:compile
[INFO] |  +- org.apache.parquet:parquet-jackson:jar:1.14.4:runtime
[INFO] |  +- com.fasterxml.jackson.datatype:jackson-datatype-jdk8:jar:2.15.3:compile
[INFO] |  |  +- org.codehaus.jackson:jackson-jaxrs:jar:1.8.3:provided
[INFO] |  |  \- org.codehaus.jackson:jackson-xc:jar:1.8.3:provided
[INFO] |  +- org.codehaus.jackson:jackson-core-asl:jar:1.9.13:provided
[INFO] |  +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:provided
[INFO] |  \- com.fasterxml.jackson.core:jackson-databind:jar:2.15.3:compile
[INFO] |     \- com.fasterxml.jackson.core:jackson-annotations:jar:2.15.3:compile

thus since it seems flink-python relies on it then need to add extra dependencies there i guess

@davidradl
Copy link
Contributor

Reviewed by Chi on 28/11/24 business as usual progressing with committer involved.

@MartijnVisser
Copy link
Contributor Author

Reviewed by Chi on 28/11/24 business as usual progressing with committer involved.

I do think that it's very smart to NOT approve PRs where the CI hasn't passed, especially if the failure in CI is related to the change the PR is introducing.

Copy link
Contributor

@davidradl davidradl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

waiting for the CI to pass

@davidradl
Copy link
Contributor

Reviewed by Chi on 28/11/24 business as usual progressing with committer involved.

I do think that it's very smart to NOT approve PRs where the CI hasn't passed, especially if the failure in CI is related to the change the PR is introducing.

@MartijnVisser yes I had previously approved this - thinking it was straight forward. I did not revoke this approval when I realised the CI was failing (I have now). This is a good reminder to revoke my approval when I get more information and also check the the CI succeeds before approving. Thanks for the feedback.

@MartijnVisser MartijnVisser force-pushed the bump-parquet-dependency branch from cd96919 to 7bfeb9f Compare November 29, 2024 15:51
<dependency>
<groupId>com.fasterxml.jackson.datatype</groupId>
<artifactId>jackson-datatype-jdk8</artifactId>
<version>2.15.3</version>
Copy link
Contributor

@snuyanzin snuyanzin Nov 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<version>2.15.3</version>
<version>${jackson-bom.version}</version>

I think it's better to reuse defined properties, will ease dependency bumps

@snuyanzin
Copy link
Contributor

there is one minor comment from my side,
the rest is lgtm 👍

Copy link
Contributor

@tomncooper tomncooper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, agree with @snuyanzin that we should ref properties wherever possible.

@davidradl
Copy link
Contributor

Reviewed by Chi on 05/12/24 Approve - looking for committer to merge

@snuyanzin
Copy link
Contributor

snuyanzin commented Dec 5, 2024

Reviewed by Chi on 05/12/24 Approve - looking for committer to merge

@davidradl please do not confuse people with such comments
as mentioned above there is still one comment expecting a tiny change before merging
#25646 (comment)

UPD: I would consider a good approach to have an agreement on approval between participating reviewers before making decision that a committer is required for merging

@davidradl
Copy link
Contributor

Reviewed by Chi on 05/12/24 Approve - looking for committer to merge

@davidradl please do not confuse people with such comments as mentioned above there is still one comment expecting a tiny change before merging #25646 (comment)

UPD: I would consider a good approach to have an agreement on approval between participating reviewers before making decision that a committer is required for merging

@snuyanzin thanks for the feedback. I see how this could be misleading. I guess PRs could have
1- no reviewers
2- some reviewers
3- some reviewers with a committer involved - like this one.

So if there is an active committer moving the PR through - then this comment could be misleading as it implies something additional is needed, but everything is progressing. Maybe we just say business as usual or in hand, in this case; just to show we have looked at it.

Maybe we have another comment. Needs review approvals for the 2nd case.
And needs committer review / merge, if the reviewers have all approved.

Am I understanding your concern correctly? WDYT?

@snuyanzin
Copy link
Contributor

snuyanzin commented Dec 5, 2024

@davidradl not sure that this is the right place to discuss it here...
I have nothing against your comments in general.

My point is: if there is already a comment requesting some changes
then all the other comments after that one with "approve" without referring to the requesting changes
might lead to wrong understanding that everything is approved and losing that requesting change comment...
I guess same is happening here and now since in general this discussion has nothing to do with the PR
so may be you can have a special wiki page accumulating such feedback/comments

Copy link
Contributor

@snuyanzin snuyanzin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since there was unplanned discussion
I would bubble up the actual feedback comment
#25646 (comment)

the rest is lgtm

@davidradl
Copy link
Contributor

davidradl commented Dec 6, 2024

@snuyanzin I agree that this is not the right place for discussion. Maybe add a comment to https://cwiki.apache.org/confluence/display/FLINK/Community+Health+Initiative+(CHI)+workgroup and/or we could use the dev list for the discussion so everyone can be involved.

@snuyanzin snuyanzin force-pushed the bump-parquet-dependency branch from 7bfeb9f to ef87a8a Compare January 16, 2025 22:41
@snuyanzin
Copy link
Contributor

Since 2.0 is coming to the end let's speed it up a bit

I resolved the conflicts and made use of property

@davidradl , @tomncooper can you please rereview this change since you've already looked at it

@@ -377,6 +377,12 @@ under the License.
<artifactId>commons-lang3</artifactId>
</dependency>

<dependency>
<groupId>com.fasterxml.jackson.datatype</groupId>
Copy link
Contributor

@davidradl davidradl Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice that https://github.com/FasterXML/jackson-datatype-jdk8?tab=readme-ov-file is deprecated. It recommends that the version from https://github.com/FasterXML/jackson-modules-java8 should be used. Is there a reason we have to use the deprecated maven dependancy?

Copy link
Contributor

@snuyanzin snuyanzin Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidradl what made you thinking we use deprecated?

I notice that https://github.com/FasterXML/jackson-datatype-jdk8?tab=readme-ov-file is deprecated

If you look at https://github.com/FasterXML/jackson-modules-java8
especially in README at this section https://github.com/FasterXML/jackson-modules-java8
there is

To include modules, you use some or all of:

<!--	Java 8 Datatypes	-->
<dependency>
    <groupId>com.fasterxml.jackson.datatype</groupId>
    <artifactId>jackson-datatype-jdk8</artifactId>
</dependency>

https://github.com/FasterXML/jackson-modules-java8?tab=readme-ov-file#maven-dependencies

so same as we have in this PR's pom

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes you are right - sorry for the confusion. I did not notice the jackson version we are using.

@snuyanzin snuyanzin merged commit d8cd559 into apache:master Jan 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants