Add UMI support #72

remiolsen · 2024-02-08T13:40:03Z

Reconciled conflict with First version of the explore command #59
Added anglerfish-explore command as entrypoint

remiolsen · 2024-02-08T13:57:15Z

@alneberg This is the last feature before the release. I would like to merge this before #71, if you don't mind. I guess my solution to reconciling our code is to split the baby in half and then sew her back together again (if that analogy works). My point is that it's a bit ugly, but it works.

alneberg · 2024-02-09T10:47:27Z

anglerfish/demux/demux.py

    """
    nt = re.compile("\*n([atcg])")
    nts = "".join(re.findall(nt, cs_string))
-
+    if umi_before > 0:
+        nts = nts[umi_before:]


Does this really work? If there are substitutions, deletions, or insertions in the alignment, both the reference and the query nucleotide will be in the cs string, right?

For example:
cg:Z:6M2D21M cs:Z::1*at:2*ac:1-ac:21
Or am I missing something?

It works if you add the correct number of N's to the template sequence. So we need to know that the index region is, e.g., exactly 10 nt of index plus 9 nt of UMI. That way we can read the sequence directly from the cs string. An unrelated example:

cs:Z::33*nt*nc*nc*ng*ng*na*ng*na:9

Insertions and deletions will of course affect this perfect picture, but I don't think too terribly.

Fair enough. I think we could use the fact that we're looking specifically for cases where there is an n to the left inside the *na* section. But we can leave that for later.
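To make the exchange above concrete, here is a minimal sketch of the extraction being discussed. It assumes the template carries a run of N's at the index/UMI position, so the cs tag reports each of those positions as a `*n<base>` substitution (reference base first, read base second). The helper name `extract_n_bases` is hypothetical and is not claimed to match the function in demux.py; only the `umi_before` slice is taken from the diff.

```python
import re

# Matches a cs substitution whose reference base is 'n' and captures the read base.
N_SUBSTITUTION = re.compile(r"\*n([atcg])")


def extract_n_bases(cs_string: str, umi_before: int = 0) -> str:
    """Return the read bases aligned against N's in the template.

    Hypothetical helper for illustration; `umi_before` drops that many
    leading bases, mirroring the slice shown in the diff.
    """
    nts = "".join(N_SUBSTITUTION.findall(cs_string))
    if umi_before > 0:
        nts = nts[umi_before:]
    return nts


# Example from the thread: eight N positions in the template.
masked = ":33*nt*nc*nc*ng*ng*na*ng*na:9"
print(extract_n_bases(masked))                # tccggaga
print(extract_n_bases(masked, umi_before=3))  # ggaga

# Ordinary mismatches and deletions have no 'n' reference base, so they are ignored.
print(repr(extract_n_bases(":1*at:2*ac:1-ac:21")))  # ''
```

An insertion or deletion inside the N run would shorten or shift the extracted string, which is the caveat raised in the thread.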

alneberg · 2024-02-09T10:55:21Z

setup.py

@@ -38,6 +38,7 @@
    entry_points={
        "console_scripts": [
            "anglerfish=anglerfish.anglerfish:anglerfish",
+            "anglerfish-explore=anglerfish.explore.cli:main",


Nice! Although I don't know what the effect of this is. Does it make the explore command executable or add it to the path?

With conda, and I guess with most package managers, both! So now this will work:

    % anglerfish-explore --help
    Usage: anglerfish-explore [OPTIONS]

    Options:
      -f, --fastq TEXT  Fastq file to align  [required]
      ...
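As a rough illustration of what the added setup.py line declares: a console_scripts entry has the form name=module:function. At install time, an installer such as pip generates a small executable named after the left-hand side in the environment's bin directory, which is why the command is both executable and on the path. The sketch below only parses the entry string with the standard library (Python 3.9 or later); it does not import anglerfish.

```python
from importlib.metadata import EntryPoint

# The same "name=module:function" line as in setup.py, split into its parts.
ep = EntryPoint(
    name="anglerfish-explore",
    value="anglerfish.explore.cli:main",
    group="console_scripts",
)

print(ep.name)    # command name created by the installer
print(ep.module)  # module imported when the command runs
print(ep.attr)    # callable invoked inside that module
```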

I'm thinking that for a 1.0 release we could unify these commands. I'm open to having them as subcommands, e.g. anglerfish run and anglerfish explore.
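A minimal sketch of what such a unified command could look like, with `run` and `explore` as subcommands. This is not the project's code: argparse is used only to keep the example dependency-free, `build_parser` and the help strings are hypothetical, and the only option reproduced is the `--fastq` flag from the help output above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical unified CLI: `anglerfish run` and `anglerfish explore`."""
    parser = argparse.ArgumentParser(prog="anglerfish")
    subcommands = parser.add_subparsers(dest="command", required=True)

    # Placeholder for the existing demultiplexing command; its options are omitted.
    subcommands.add_parser("run", help="Demultiplex reads (placeholder)")

    # Only the --fastq option shown in the help output is reproduced here.
    explore = subcommands.add_parser("explore", help="Explore reads (placeholder)")
    explore.add_argument("-f", "--fastq", required=True, help="Fastq file to align")
    return parser


args = build_parser().parse_args(["explore", "--fastq", "reads.fastq.gz"])
print(args.command, args.fastq)  # explore reads.fastq.gz
```

With this layout a single console_scripts entry would be enough, and each subcommand keeps its own options.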

alneberg

Great work! This breaks the explore command a tiny bit for UMIs. I've located the issue and should be able to create a fix for it, in a separate PR I guess. It's your call whether it should be included in this release or not. If I can fix it quickly, I think it would make sense to include it.

remiolsen · 2024-02-13T09:21:47Z

I was thinking of doing a release on Friday; if you can't get a fix out in time, it's not a huge deal.

Also, I think the adaptor-handling parts of this code could do with simplifying or refactoring. For instance, the separate I7 and I5 handling could be DRY'd up. But that's for another time.

remiolsen added 9 commits January 25, 2024 16:08

First UMI support experiment

9308f2c

Fix umi detection

a3a7cdf

minor simplification

cc7fd2c

Merge branch 'master' of https://github.com/remiolsen/anglerfish into umi_support

e01aa3c

Slowly unbreaking merge

92483fa

Merge reconciliation part 2

7713574

Added anglerfish-explore endpoint

51523bd

Merge reconciliation part 3

9f06527

Minor cleanup of comments

17a9667

remiolsen added this to the 0.6.1 milestone Feb 8, 2024

remiolsen requested a review from alneberg February 8, 2024 13:40

Keep devcontainer.json ugly

a52fd06

alneberg reviewed Feb 9, 2024

Changed to anglerfish-explore command

4b027d8

alneberg approved these changes Feb 13, 2024

remiolsen merged commit 580fc97 into master Feb 13, 2024
12 checks passed

remiolsen deleted the umi_support branch February 13, 2024 09:22

remiolsen mentioned this pull request Mar 5, 2024

Support indices with UMIs #38

Closed
