Performance improvement for BinaryFileReader and ImageReader
Temporary workaround for a bug introduced in Spark 2.1 (a regression from 2.0).
imatiach-msft authored and elibarzilay committed Aug 15, 2017
1 parent d8eb988 commit 2426bf0
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions src/readers/src/main/scala/BinaryFileReader.scala
@@ -40,6 +40,7 @@ object BinaryFileReader {
     var data: RDD[(String, Array[Byte])] = null
     try {
       val streams = spark.sparkContext.binaryFiles(path, spark.sparkContext.defaultParallelism)
+        .repartition(spark.sparkContext.defaultParallelism)

       // Create files RDD and load bytes
       data = if (!inspectZip) {
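The commit message does not say what the Spark 2.1 bug is. One plausible reading, stated here as an assumption, is that the `minPartitions` argument to `binaryFiles` is not reliably honored, so the added `.repartition(...)` forces the partition count explicitly. A minimal standalone sketch of that pattern follows; the `RepartitionSketch` object and its `readStreams` helper are hypothetical names used only for illustration and are not part of this repository.

```scala
import org.apache.spark.input.PortableDataStream
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object RepartitionSketch {
  // Hypothetical helper (not from the commit): reads files as binary streams,
  // then forces the partition count with an explicit repartition.
  def readStreams(spark: SparkSession, path: String): RDD[(String, PortableDataStream)] = {
    val parallelism = spark.sparkContext.defaultParallelism
    spark.sparkContext
      .binaryFiles(path, parallelism) // minPartitions is only a suggested minimum
      .repartition(parallelism)       // shuffles into exactly `parallelism` partitions
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[4]")
      .appName("repartition-sketch")
      .getOrCreate()
    try {
      // args(0) is assumed to be a directory of binary files
      val streams = readStreams(spark, args(0))
      println(s"partitions: ${streams.getNumPartitions}")
    } finally {
      spark.stop()
    }
  }
}
```

Note that `repartition` triggers a shuffle, so the cost is only worthwhile when the work done per file afterwards (for example decoding images) outweighs moving the stream handles between partitions.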
