Unitig-caller produces different number of unitigs when compared to DBGWAS #25

kristinakordova · 2023-09-23T16:33:45Z

I am running

unitig-caller --call --reads input_reads.txt --out output_folder --threads 76 --pyseer

and

./DBGWAS -strains input_strains.txt -keepNA -output output_folder -nb-cores 76

The two input files list the same assembled genomes, with NA as the phenotype. I was expecting an identical number of nodes in the graph, but the counts differ by several million: 2,251,639 (unitig-caller) versus 7,022,727 (DBGWAS). Does unitig-caller apply a filtering threshold? Where does the difference come from?

johnlees · 2023-09-25T13:49:37Z

To build the graph, DBGWAS uses GATB and unitig-caller uses Bifrost, so I would not guarantee that the two graphs are identical. I also don't know whether the default k-mer length is the same in both tools. If you want to compare more thoroughly, I would suggest running Bifrost and BCALM directly on your dataset. I don't think unitig-caller does any additional filtering.
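One way to make that comparison concrete is to check whether the two unitig sets cover the same k-mers, since two tools can split the same graph into a different number of unitigs. The following is a minimal sketch, not part of either tool: the function names are hypothetical, it assumes both outputs are available as plain FASTA files of unitig sequences, and it assumes both graphs were built with the same k. It holds every k-mer in memory, so it is only suitable for a small test subset of genomes.

```python
# Hypothetical helper for comparing two sets of unitigs (not part of
# unitig-caller, DBGWAS, Bifrost, or BCALM).

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(seq):
    """Return the lexicographically smaller of a sequence and its reverse complement."""
    seq = seq.upper()
    rev_comp = seq.translate(_COMPLEMENT)[::-1]
    return min(seq, rev_comp)


def read_fasta_sequences(path):
    """Read all sequences from a FASTA file, joining wrapped lines."""
    sequences, current = [], []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if current:
                    sequences.append("".join(current))
                    current = []
            else:
                current.append(line)
    if current:
        sequences.append("".join(current))
    return sequences


def canonical_kmers(unitigs, k):
    """Collect the set of canonical k-mers contained in a list of unitigs."""
    kmers = set()
    for unitig in unitigs:
        for i in range(len(unitig) - k + 1):
            kmers.add(canonical(unitig[i:i + k]))
    return kmers


def compare_unitig_sets(unitigs_a, unitigs_b, k):
    """Summarise how two unitig sets differ at the unitig and k-mer level."""
    canon_a = {canonical(u) for u in unitigs_a}
    canon_b = {canonical(u) for u in unitigs_b}
    kmers_a = canonical_kmers(unitigs_a, k)
    kmers_b = canonical_kmers(unitigs_b, k)
    return {
        "unitigs_a": len(canon_a),
        "unitigs_b": len(canon_b),
        "shared_unitigs": len(canon_a & canon_b),
        "kmers_a": len(kmers_a),
        "kmers_b": len(kmers_b),
        "shared_kmers": len(kmers_a & kmers_b),
    }
```

For example, `compare_unitig_sets(read_fasta_sequences("a.fasta"), read_fasta_sequences("b.fasta"), k)` with two placeholder file names and the k used to build both graphs. If the k-mer sets match but the unitig counts differ, the tools are segmenting the same graph differently; if the k-mer sets differ, the graphs themselves differ (for instance because of a different k or different handling of the input).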
