
Spark 2.2.0 compatibility #8

Open
glaphil opened this issue Dec 8, 2017 · 2 comments

Comments

@glaphil

glaphil commented Dec 8, 2017

Hello Takeshi,

I'm working on a POC with kinesis + spark structured streaming.

I read that Spark Structured Streaming is marked GA only as of 2.2.0, so I'd like to use your lib with Spark 2.2.0, but I get the following error:

```
17/12/08 13:43:09 ERROR StreamExecution: Query [id = 95bccb80-d234-449b-aaa7-ab19dda71d2b, runId = c6f1c512-7e63-4ead-ba42-1fe67318ddb2] terminated with error
java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.json.JSONOptions.<init>(Lscala/collection/immutable/Map;)V
at org.apache.spark.sql.execution.datasources.json.JsonKinesisValueFormat.buildReader(JsonKinesisValueFormat.scala:105)
at org.apache.spark.sql.kinesis.KinesisSource.getBatch(KinesisSource.scala:477)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatch$2$$anonfun$apply$7.apply(StreamExecution.scala:607)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatch$2$$anonfun$apply$7.apply(StreamExecution.scala:603)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at org.apache.spark.sql.execution.streaming.StreamProgress.foreach(StreamProgress.scala:25)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at org.apache.spark.sql.execution.streaming.StreamProgress.flatMap(StreamProgress.scala:25)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatch$2.apply(StreamExecution.scala:603)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatch$2.apply(StreamExecution.scala:603)
at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:279)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatch(StreamExecution.scala:602)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(StreamExecution.scala:306)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$apply$mcZ$sp$1.apply(StreamExecution.scala:294)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$apply$mcZ$sp$1.apply(StreamExecution.scala:294)
at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:279)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1.apply$mcZ$sp(StreamExecution.scala:294)
at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:56)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches(StreamExecution.scala:290)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:206)
```

Would you know how I can fix it? Or should I go back to Spark 2.1.0 (the lib runs fine on it)?
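For reference, this is roughly how I pin the app to Spark 2.1.0 in sbt. The connector's artifact coordinates below are a placeholder, not the real ones — substitute whatever you use to bring the lib onto the classpath:

```scala
// build.sbt — pin Spark to 2.1.0, the version the connector was compiled against.
// Mixing it with Spark 2.2.0 jars causes the NoSuchMethodError above, because the
// JSONOptions constructor signature changed between the two releases.
val sparkVersion = "2.1.0"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion % Provided,
  "org.apache.spark" %% "spark-sql"  % sparkVersion % Provided,
  // Placeholder coordinates for this Kinesis connector — replace with the actual ones
  "example.group"    %% "spark-sql-kinesis" % "x.y.z"
)
```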

I saw that you're no longer maintaining this lib. You referred to Databricks, but I can't find any link to download a lib from their site or from GitHub... It seems the only way to use their connector is through their integrated data analysis platform...
Would you know if Kinesis will soon be "officially" supported by Spark?

Thanks!

Philippe

@kavyaMariGowda

Hi,

Is this issue resolved?

@glaphil
Author

glaphil commented Jul 13, 2018

I went back to Spark 2.1.0.
I hope Databricks will open-source their connector...
Be careful with this lib: on every run it creates DynamoDB checkpoint tables in the N. Virginia (us-east-1) region (due to the KCL library used under the hood), and you can't configure their names or region...
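To see which checkpoint tables the KCL has left behind, you can list the DynamoDB tables in us-east-1 with the AWS CLI (assuming your credentials are already configured; the table name in the second command is an example, not one this lib actually creates):

```shell
# List DynamoDB tables in N. Virginia, where the KCL creates its checkpoint tables
aws dynamodb list-tables --region us-east-1

# Delete a leftover checkpoint table once you have identified it
aws dynamodb delete-table --region us-east-1 --table-name my-checkpoint-table
```

Note that DynamoDB tables incur cost even when idle, so it's worth cleaning these up after test runs.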
