fixed bug in handling soft clipping at read start, updated tutorial #121
This fix is in response to issues #120 and #119.
The changes to Cigar.pm fix the handling of soft clipping at the start of read alignments. The original code appears to have assumed that the alignment start coordinate in the BAM/SAM file corresponds to the first base of the read, which isn't true for soft-clipped reads. I tested this on some data with adapter contamination, which causes widespread soft clipping and hence an overestimation of unique insertion sites; with this fix, bwa gives much more similar results with and without adapter trimming. It would be useful for someone else to double-check the logic in Cigar.pm and make sure I haven't missed something.
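For anyone reviewing the logic, here is a minimal sketch of the coordinate relationship described above. It is written in Python purely for illustration and is not the code in Cigar.pm; the function names `leading_soft_clip` and `unclipped_start` are hypothetical. It assumes the SAM convention that the reported start position is 1-based and refers to the first reference-aligned base, so any soft-clipped bases at the start of the read sit before that position.

```python
import re

# One CIGAR element: a length followed by a single operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def leading_soft_clip(cigar):
    """Return the number of soft-clipped bases at the start of a CIGAR string."""
    for length, op in CIGAR_OP.findall(cigar):
        if op == "H":
            # Hard clips may precede a soft clip; skip over them.
            continue
        if op == "S":
            return int(length)
        # Any other operation means the read does not start with a soft clip.
        return 0
    return 0


def unclipped_start(pos, cigar):
    """Project the first base of the read onto the reference.

    pos is the 1-based alignment start as reported in the SAM record.
    (Illustrative helper, not part of Bio-TraDIS.)
    """
    return pos - leading_soft_clip(cigar)
```

With this sketch, a record with start 100 and CIGAR `5S45M` has its first read base projected to position 95, whereas `50M` or `45M5S` at the same start stay at 100.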
I have also updated the Bio-TraDIS tutorial to reflect changes to ENA and the fact that bwa is now the default mapper (i.e. I've included the --smalt flag in the bacteria_tradis call).