Greetings everyone, and thank you for providing this fantastic pipeline.
I've encountered an issue while working with a eukaryote genome in multifasta format. Phame converts the genome to a single-line fasta format by concatenating the sequences, which is fine. However, when I analyzed the results, all genes were mapped to the first chromosome. I traced the error to the {genome}_cds.gff file and the CDScoords.txt file.
If my understanding is correct, the issue arises because Phame is not adjusting the coordinates in the original .gff file, which corresponds to a non-concatenated fasta genome. As a result, I only identified genes with SNPs on the first chromosome.
To address this, I'm concatenating the genome and generating a new .gff file that corresponds to the concatenated genome. I believe this will resolve the issue.
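The workaround described above could be sketched roughly as follows. This is a minimal illustration, not Phame's own code: it assumes the sequences are joined in file order with no spacer between them (the `spacer` parameter covers the case where they are not), and the function names and the `concatenated` sequence ID are hypothetical.

```python
from typing import Dict, Iterable, List


def contig_offsets(fasta_lines: Iterable[str], spacer: int = 0) -> Dict[str, int]:
    """Map each sequence ID to its 0-based start offset in the concatenated genome.

    Assumption: sequences are joined in file order, with `spacer` filler bases
    between consecutive sequences (0 means plain concatenation).
    """
    offsets: Dict[str, int] = {}
    position = 0   # running start offset of the current sequence
    current = None
    length = 0     # bases seen so far in the current sequence
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                position += length + spacer
            current = line[1:].split()[0]  # ID is the first token of the header
            offsets[current] = position
            length = 0
        else:
            length += len(line)
    return offsets


def shift_gff(gff_lines: Iterable[str], offsets: Dict[str, int],
              new_seqid: str = "concatenated") -> List[str]:
    """Rewrite GFF feature lines so start/end refer to the concatenated genome."""
    shifted: List[str] = []
    for line in gff_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            shifted.append(line)  # keep comments and directives unchanged
            continue
        fields = line.split("\t")
        if len(fields) != 9 or fields[0] not in offsets:
            raise ValueError(f"unexpected GFF line: {line!r}")
        offset = offsets[fields[0]]
        fields[3] = str(int(fields[3]) + offset)  # start (1-based, inclusive)
        fields[4] = str(int(fields[4]) + offset)  # end (1-based, inclusive)
        fields[0] = new_seqid  # hypothetical ID for the single joined sequence
        shifted.append("\t".join(fields))
    return shifted
```

For example, with a 6 bp first chromosome, a CDS at positions 1-3 of the second chromosome would move to positions 7-9 of the concatenated sequence. The sequence ID written in the first column would need to match whatever name the concatenated fasta actually uses.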
Thank you for finding this. I think we should resolve this by making sure that coordinates are correctly transferred when phame processes them. For the workaround, did you rerun the concatenated fasta through the annotation pipeline? Also, would you be able to post the CDScoords.txt file and the original gff file here, so that we can document the error and fix it? Also, apologies for the tardy response.
Hi! Yes, I reran the concatenated fasta with the new (home-made) gff, and Phame worked correctly.
I cleaned the folder, so I don't have the CDScoords.txt file or any other output from the first run. Sorry about that.
If you want, I can give you the input files (genome, gff, and some fastq files) so you can run it yourself.
Hi, sorry for the tardy response. Yes, it would be great if you could post the input files (or accession IDs) so that we can recreate the issue here. Thank you.