phylogenetic build fails because of missing nextalign #2
Comments
I used
The context here is that I'm assuming
Ah, the workflow was created before Nextclade v3 was released. I think we'd want to migrate it to
Thanks! I'll check out that guide.
* `output.insertions` will be a TSV file now
* `--reference` is now spelled `--input-ref`
* `--genemap` is now spelled `--input-annotation`
* `--retry-reverse-complement` is no longer supported
* `--output-insertions` is now spelled `--output-tsv`

Note: dropping `--retry-reverse-complement` is the one that I am most unsure about, but this version completes this step.
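The renames in this commit message can be sketched as a small translation table. This is only an illustration of the mapping listed above: the function name `migrate_nextalign_args` and the example file names are hypothetical, and dropping `--retry-reverse-complement` simply follows the commit message, which itself flags that choice as uncertain.

```python
# Flag renames copied from the commit message; the helper itself is a
# hypothetical sketch, not part of the workflow or of Nextclade.
RENAMED_FLAGS = {
    "--reference": "--input-ref",
    "--genemap": "--input-annotation",
    "--output-insertions": "--output-tsv",
}
# Treated here as a switch that takes no value (assumption).
DROPPED_FLAGS = {"--retry-reverse-complement"}


def migrate_nextalign_args(args):
    """Translate nextalign-style arguments into the spellings listed above."""
    migrated = []
    for arg in args:
        # Support both "--flag value" and "--flag=value" spellings.
        flag, sep, value = arg.partition("=")
        if flag in DROPPED_FLAGS:
            continue
        # Unknown flags and plain values pass through unchanged.
        migrated.append(RENAMED_FLAGS.get(flag, flag) + sep + value)
    return migrated


if __name__ == "__main__":
    old_args = [
        "--reference", "reference.fasta",      # hypothetical paths
        "--genemap=genemap.gff",
        "--retry-reverse-complement",
        "--output-insertions", "insertions.tsv",
        "sequences.fasta",
    ]
    print(["nextclade", "run", *migrate_nextalign_args(old_args)])
```

In a Snakemake rule the same renames would be applied by hand in the `shell:` block; the helper is only meant to make the mapping explicit.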
Initially, the workflow failed with the following error:

```
Error:
0: When reading genome annotation
1: When reading file: "config/hku1/genemap.gff"
2: Attempted to parse the genome annotation as JSON and as GFF, but both attempts failed:
JSON error: invalid type: string "NC_006577.2\tfeature\tsource\t1\t29926\t.\t+\t.\tgene=nuc NC_006577.2\tfeature\tgene\t206\t13600\t.\t+\t.\tgene=ORF1a NC_006577.2\tfeature\tgene\t13600\t21753\t.\t+\t.\tgene=ORF1b NC_006577.2\tfeature\tgene\t21773\t22933\t.\t+\t.\tgene=HE NC_006577.2\tfeature\tgene\t22942\t27012\t.\t+\t.\tgene=Spike NC_006577.2\tfeature\tgene\t22978\t25221\t.\t+\t.\tgene=S1 NC_006577.2\tfeature\tgene\t27051\t27380\t.\t+\t.\tgene=S2 NC_006577.2\tfeature\tgene\t27051\t27380\t.\t+\t.\tgene=ORF4 NC_006577.2\tfeature\tgene\t27373\t27621\t.\t+\t.\tgene=E NC_006577.2\tfeature\tgene\t27633\t28304\t.\t+\t.\tgene=M NC_006577.2\tfeature\tgene\t28320\t29645\t.\t+\t.\tgene=N NC_006577.2\tfeature\tgene\t28342\t28959\t.\t+\t.\tgene=N2", expected struct GeneMap at line 2 column 1
GFF3 error: When processing gene, 'N': When processing feature group 'N' ('N') of type 'gene': genes must consist of exactly one feature: Expected exactly one element, but found: 2
2: Location: /workdir/packages/nextclade/src/gene/gene_map.rs:56
```

While looking at the referenced file, and comparing it to the other `genemap.gff` files in the config, I noticed that all the others used `gene_name` for everything after the first `gene` line. I changed this file to match, and the workflow got past the point where it was previously erroring out.

I have no idea why this worked; hopefully somebody will explain in the code review.
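For reference, the edit described above can be sketched as a small script. It assumes one reading of the description: the first feature line that carries a `gene=` attribute is left alone, and every later `gene=` attribute is renamed to `gene_name=`. The function name `use_gene_name_after_first` is hypothetical, and the sketch does not explain why the change satisfies Nextclade's GFF parser.

```python
def use_gene_name_after_first(gff_text):
    """Rename `gene=` to `gene_name=` on every feature line after the first
    one that carries a `gene=` attribute (assumed reading of the fix)."""
    out = []
    seen_first = False
    for line in gff_text.splitlines():
        cols = line.split("\t")
        # Feature lines have 9 tab-separated columns; pass comments,
        # blank lines, and anything malformed through untouched.
        if line.startswith("#") or len(cols) != 9:
            out.append(line)
            continue
        attrs = cols[8].split(";")
        has_gene = any(a.startswith("gene=") for a in attrs)
        if has_gene and seen_first:
            attrs = [
                "gene_name=" + a[len("gene="):] if a.startswith("gene=") else a
                for a in attrs
            ]
            cols[8] = ";".join(attrs)
        elif has_gene:
            seen_first = True
        out.append("\t".join(cols))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Two records copied from the error message above.
    sample = (
        "NC_006577.2\tfeature\tsource\t1\t29926\t.\t+\t.\tgene=nuc\n"
        "NC_006577.2\tfeature\tgene\t206\t13600\t.\t+\t.\tgene=ORF1a\n"
    )
    print(use_gene_name_after_first(sample), end="")
```

Comparing the output against one of the other `genemap.gff` files in the config is a quick way to check that the assumed reading matches what those files actually contain.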
We should probably also document the Nextalign-like usage in the main Nextclade docs, i.e. using Nextclade v3 without a dataset and providing the individual input files directly. Documenting it better would allow for a smoother transition for v2 users and would also highlight that Nextclade v3 can be used as an aligner even where there is no dataset for a particular organism.

Update: I created an issue: nextstrain/nextclade#1456
Current Behavior
Possible solution
Based on the archived repo, `nextalign` was moved into `nextclade` -- but the page linked for `nextalign-cli` 404s.

I'm guessing the right answer here is to update the Snakemake file to either replace the `nextalign` call with `nextclade run` with some set of options, or (looking at the `zika` repo) convert things over to using `augur` for the alignment?

@kimandrews any insight you can provide would be appreciated!