feat: Additional interval joins implementations #169
base: master
var buffer_l_list: mutable.HashMap[(Int, Int), LListElem[V]] = mutable.HashMap[(Int, Int), LListElem[V]]()
var l_list: Array[LListElem[V]] = Array()
var h_list: ArrayBuffer[HListElem] = ArrayBuffer()
//var l_list_storage: ArrayBuffer[LListElem[V]] = ArrayBuffer[LListElem[V]]()
Remove this commented-out line.
override def postConstruct(domains: Option[Int]): Unit = {
//if (!built) {
Remove commented-out blocks.
@@ -87,8 +87,11 @@ object IntervalTreeJoinOptimChromosomeImpl extends Serializable {
}
.collect()

val domainsStringParam : String = spark.sqlContext.getConf(InternalParams.domains,
null )
Rather use None here instead of null?
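One way to act on this, as a minimal sketch: wrap the possibly-null string in `Option(...)`, which yields `None` for `null`, and parse it to an `Int` in the same step. The object `DomainsParam` and its `parse` method are hypothetical names used only for illustration; they are not part of the PR.

```scala
import scala.util.Try

object DomainsParam {
  // Hypothetical helper, not part of the PR.
  // Option(raw) turns a null reference into None; Try(...).toOption
  // turns a non-numeric value into None instead of throwing.
  def parse(raw: String): Option[Int] =
    Option(raw).flatMap(s => Try(s.trim.toInt).toOption)
}
```

With this shape, the value read via `getConf(InternalParams.domains, null)` could be passed through `DomainsParam.parse` once, and the rest of the code would never see a `null`.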
val intervalTree = {
- val tree = new IntervalHolderChromosome[InternalRow](localIntervals, intervalHolderClassName)
+ val tree = if (domainsStringParam != null) new IntervalHolderChromosome[InternalRow](localIntervals, intervalHolderClassName, Option.apply(domainsStringParam.toInt)) else new IntervalHolderChromosome[InternalRow](localIntervals, intervalHolderClassName)
Better to use Scala pattern matching on an Option (Some/None) here instead of the null check.
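A rough sketch of what the pattern-matching version might look like. `Holder` is a hypothetical stand-in for `IntervalHolderChromosome` that mirrors only the `domains: Option[Int] = None` parameter from the diff, and `HolderFactory.build` is an illustrative name, not existing code.

```scala
object HolderFactory {
  // Hypothetical stand-in for IntervalHolderChromosome; only the
  // optional `domains` constructor parameter is modelled.
  final case class Holder(className: String, domains: Option[Int] = None)

  // Match on the optional config value instead of comparing with null.
  def build(className: String, domainsParam: Option[String]): Holder =
    domainsParam match {
      case Some(d) => Holder(className, Some(d.toInt))
      case None    => Holder(className)
    }
}
```

Because the constructor already accepts an `Option[Int]`, the match could arguably be collapsed to `Holder(className, domainsParam.map(_.toInt))`; which form reads better is a matter of taste.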
@@ -3,7 +3,7 @@ package org.biodatageeks.sequila.rangejoins.methods.IntervalTree
import org.biodatageeks.sequila.rangejoins.IntervalTree.Interval
import org.biodatageeks.sequila.rangejoins.methods.base.BaseIntervalHolder

- class IntervalHolderChromosome[T](allRegions: Array[(String,Interval[Int],T)], intervalHolderClassName:String) extends Serializable {
+ class IntervalHolderChromosome[T](allRegions: Array[(String,Interval[Int],T)], intervalHolderClassName:String, domains: Option[Int] = None) extends Serializable {
I would suggest passing a generic Conf-type object (a key/value map) instead of passing each parameter explicitly.
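To illustrate the idea, a minimal sketch of such a key/value configuration object. `JoinConf`, `getInt`, and `Holder` are hypothetical names invented for this example; only the key string is taken from the diff.

```scala
import scala.util.Try

object JoinConfSketch {
  // Key as it appears in InternalParams in this PR.
  val DomainsKey = "spark.biodatageeks.rangejoin.domains"

  // Hypothetical key/value configuration wrapper (not existing code).
  final case class JoinConf(values: Map[String, String] = Map.empty) {
    // Returns None when the key is absent or the value is not an Int.
    def getInt(key: String): Option[Int] =
      values.get(key).flatMap(v => Try(v.toInt).toOption)
  }

  // Hypothetical holder: takes the whole conf and picks what it needs,
  // so new parameters do not change the constructor signature.
  class Holder(conf: JoinConf) {
    val domains: Option[Int] = conf.getInt(DomainsKey)
  }
}
```

The trade-off is that the holder's dependencies become less visible in its signature, so the set of keys it reads would need to be documented.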
@@ -54,7 +54,6 @@ object InternalParams {
final val maxGap = "spark.biodatageeks.rangejoin.maxGap"
final val minOverlap = "spark.biodatageeks.rangejoin.minOverlap"
final val intervalHolderClass = "spark.biodatageeks.rangejoin.intervalHolderClass"

final val domains = "spark.biodatageeks.rangejoin.domains"
Rather name it spark.biodatageeks.rangejoin.iitii.domainsNum.
Add documentation to: https://biodatageeks.github.io/sequila/docs/configuration/join/
Also add a section on the available algorithms in tabular form:
https://biodatageeks.github.io/sequila/docs/algorithms/join/