Druid - Spark interoperation is problematic due to Netty dependency mismatch #4390
Comments
I think we should patch Spark to make it work with Netty 4.1, and just call out, in the Spark batch indexer for the Druid version which pulls in 4.1 everywhere, that it only works with patched Spark.

Also @b-slim, because it's related to Hadoop.
It seems this will also break anyone using https://github.com/SparklineData/spark-druid-olap.
Not sure about https://github.com/SparklineData/spark-druid-olap, but for https://github.com/metamx/druid-spark-batch publishing
Looking quickly at https://github.com/SparklineData/spark-druid-olap, it does not seem to be impacted by this anyway (I don't see a dependency on druid-processing).

Reiterating @nishantmonu51's question: what is forcing us to upgrade Netty at this point?
@himanshug probably nothing right now (not sure about druid-sql), but conceptually I'm against this approach. IMO huge projects should drive each other to update to newer versions of libraries like Netty, not stop each other from updating because others are not updating. In the comments here: https://issues.apache.org/jira/browse/SPARK-19552, Spark committer Sean Owen said that they are not that strongly against updating Netty in Spark anymore, so IMO the solution is to update Spark to Netty 4.1, not to downgrade Druid to 4.0.
@leventov yes, keeping dependency versions up to date is good in general. But in the current case, given that there is no definite need, I would let Spark get updated first and then update Druid, rather than trying to find workarounds for now.
@himanshug we're looking to upgrade the http-client: metamx/http-client#29
For the record, Spark 2.3 updated its Netty dependency to 4.1.
This issue has been marked as stale due to 280 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the [email protected] list. Thank you for your contributions.

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.
A Spark task (https://github.com/metamx/druid-spark-batch) which uses Druid fails with `AbstractMethodError`. It doesn't seem that Spark will upgrade to Netty 4.1 soon, so the proposed solution is to isolate Druid's usage of Netty somehow, e.g. via shading.
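The shading approach mentioned above could look roughly like the following Maven Shade Plugin fragment. This is only a sketch: the plugin version and the relocated package name (`org.apache.druid.shaded.io.netty`) are hypothetical, not something decided in this issue.

```xml
<!-- Sketch only: relocate Druid's Netty 4.1 classes into a private package
     so they cannot clash with the Netty 4.0 that Spark puts on the classpath. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.1.0</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>io.netty</pattern>
            <shadedPattern>org.apache.druid.shaded.io.netty</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a relocation like this, Druid's bytecode is rewritten to reference the shaded package, so Spark's unshaded Netty 4.0 and Druid's shaded Netty 4.1 can coexist in one JVM.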
Modules which currently depend on Netty 4.1 in Druid:
- druid-sql
- druid-rocketmq
- druid-avro-extensions
- druid-services (via druid-sql)
- druid-histogram (via druid-sql)
- druid-indexing-service (via hadoop-client)
- druid-indexing-hadoop
But pretty much everything in Druid is going to depend on Netty 4.1 via http-client; see metamx/http-client#29.
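To tell which Netty line actually won on a given classpath, a small probe can be useful. The sketch below assumes that `io.netty.handler.codec.http.HttpUtil` exists only in Netty 4.1 (its methods lived on `HttpHeaders` in 4.0); the class and method names in the probe itself are otherwise plain JDK.

```java
// Sketch: report which Netty line is resolvable on the current classpath.
// Assumption: io.netty.handler.codec.http.HttpUtil was introduced in Netty 4.1,
// so a successful lookup implies 4.1+, and a failed lookup implies 4.0 or no Netty.
public class NettyProbe {
    static String nettyLine() {
        try {
            Class.forName("io.netty.handler.codec.http.HttpUtil");
            return "4.1+";
        } catch (ClassNotFoundException e) {
            return "4.0 or absent";
        }
    }

    public static void main(String[] args) {
        System.out.println("Netty on classpath: " + nettyLine());
    }
}
```

Running this from inside a Spark executor versus a Druid task would show which side's Netty jar shadows the other, which is exactly the mismatch behind the `AbstractMethodError` above.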
@gianm @drcrallen @himanshug any thoughts?