Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improvements / bugfix in capped-regions partitioner #587

Merged
merged 4 commits into from
Sep 28, 2016

Conversation

ryan-williams
Copy link
Member

  • surface alternate coverage-computation strategy, "exploding" reads into per-base, depth-1 tuples, controllable by a flag
  • fix a bug with how read-starts are tallied given loci strings that start and stop in the interior of contigs
  • rename the loci-partitioners

@coveralls
Copy link

Coverage Status

Coverage increased (+0.08%) to 78.911% when pulling a8973a9 on ryan-williams:cci into 5fff3ea on hammerlab:master.

@@ -53,11 +52,28 @@ case class ContigCoverageIterator(halfWindowSize: Int,
else
lastStart = start

ends.enqueue(end)
numAdded += 1
if (end + halfWindowSize > locus) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I follow this change but not 100%, could you add some comments here about what should be happening?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, good idea, added a bunch!

also while explaining myself, found an edge case that I still wasn't handling correctly here, so fixed and added a test for it. thanks for prompting!

lowerBound = math.max(0, start - halfWindowSize)
upperBound = math.min(contigLength, end + halfWindowSize)
// Iterator over loci spanned by the current region, including the half-window buffer on each end.
val readLocusIterator = new LociIterator(Iterator(Interval(lowerBound, upperBound)).buffered)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: mix of region and read here ... I don't love our use of region as our name for things with reference coordinates ... but until there is something better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea good point, done

@@ -26,88 +24,41 @@ class CoverageRDDSuiteRegistrar extends KryoTestRegistrar {
}
}

class RegionRDDSuite
class CoverageRDDSuite
extends GuacFunSuite
with RegionsRDDUtil
with ContigLengthsUtil {

override def registrar = "org.hammerlab.guacamole.readsets.rdd.CoverageRDDSuiteRegistrar"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't matter much here since this class is defined in this file, but you can do classOf[CoverageRDDSuiteRegistrar].getCanonicalName to protect against re-namings or packages moving around.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great idea, changed here and elsewhere

@coveralls
Copy link

Coverage Status

Coverage increased (+0.08%) to 78.911% when pulling b133a5f on ryan-williams:cci into 5fff3ea on hammerlab:master.

@ryan-williams ryan-williams merged commit dcf6488 into hammerlab:master Sep 28, 2016
@ryan-williams ryan-williams deleted the cci branch September 28, 2016 19:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants