Hi,
Thanks for an awesome piece of software. I have used TranscriptClean on large-scale assemblies with high success before.
However, I am now running a metatranscriptomics project in which I am only interested in reads that map to the COI gene of the taxon of interest. I have 6504 reads that map to the full-length reference COI sequence. I have sorted the .sam file with samtools, but when I run TranscriptClean I get the error "list index out of range". When I inspect the mapping in a genome map viewer, it looks good, albeit with some gaps here and there. Still, all my reads are within the reference.
I sort the sample with:
samtools sort -O sam -T sample.sort -o sample.sort.sam mapped1.sam
I then run this command:
python ....../TranscriptClean.py --sam sample.sort.sam --genome mygenome.fasta --out outfile
The program then returns:
list index out of range Took 0:00:54 to process transcript batch. Took 0:00:00 to combine all outputs.
Below is a snippet from the sorted .sam file.
@HD VN:1.0 SO:coordinate @SQ SN:Facetotecta LN:1527 @RG ID:Unpaired_reads_assembled_against_Facetotecta SM: @PG ID:samtools PN:samtools VN:1.14 CL:samtools sort -O sam -T sample.sort -o sample.sort.sam mapped1_sorted_Facetotecta_cut_extraction.sam m54057_190926_040405/25100833/ccs_1 0 Facetotecta 1 255 2M1P2M1P1M5P1M10P1M3P1M2P1M5P1M4P1M1P1M2P1M2P1M1P1M1P1M1P2M3P1M2P1M2P1M1P1M2P1M1P1M2P4M3P1M1P1M1P1M2P2M1P1M2P1M5P2M3P1M1P1M2P2M4P2M1P1M2P1M2P1M8P1M3P1M21P2M4P1M5P1M2P1M2P1M3P1M4P1M5P1M3P1M5P1M1P1M3P2M3P1M1P1M6P1M1P1M3P1M2P1M1P1M10P1M2P1M18P1M1P2M4P1M6P1M1P1M8P2M3P2M11P1M2P1M6P1M2P1M6P1M3P2M2P1M7P1M6P1M1P1M1P1M1P1M1P1M9P1M8P3M5P1M1P1M1P1M1P1M1P2M7P1M3P1M1P1M1P1M2P1M1P1M14P1M1P1M4P1M4P1M1P1M12P3M2P1M6P1M1P1M3P2M2P1M1P1M1P2M3P1M3P2M3P1M3P2M1P1M4P2M23P1M4P1M4P1M8P1M2P1M1P1M1P1M17P1M1P1M5P1M3P1M1P1M16P1M1P1M1P2M3P1M5P1M1P3M1P3M1P1M2P2M4P2M1P1M1P1M5P2M3P1M7P1M5P1M2P1M2P1M1P1M1P1M1P1M1P1M3P2M1P1M30P1M1P1M2P1M1P2M8P1M3P1M8P1M1P1M1P1M8P2M1P2M1P1M1P1M4P2M5P1M1P1M4P1M10P1M5P1M4P1M5P1M2P1M10P1M1P1M1P1M3P1M4P1M4P2M1P2M9P2M4P2M3P2M2P1M1P1M3P2M2P1M2P2M2P3M3P1M17P1M4P1M1P1M3P2M2P1M4P1M8P1M1P1M1P1M2P1M2P3M1P1M3P1M4P1M1P2M2P1M1P1M3P1M3P1M3P1M4P1M3P1M1P1M3P1M1P1M5P2M2P1M1P2M1P1M3P1M1P1M1P1M3P1M8P2M1P2M1P1M2P2M11P1M3P2M8P1M1P2M14P1M14P2M8P1M1P2M3P1M4P1M5P1M1P1M1P1M1P1M2P2M6P1M1P1M1P1M1P1M4P1M2P1M3P1M6P1M2P1M2P2M2P1M1P3M1P1M2P1M6P1M2P1M1P1M2P1M9P1M2P2M4P1M4P1M5P1M6P1M1P1M1P1M3P1M3P1M4P1M7P1M8P1M8P1M9P1M16P1M2P1M18P1M4P1M12P1M6P1M3P1M3P1M2P1M6P1M1P1M12P1M1P1M1P1M12P2M7P1M1P1M3P1M3P1M1P2M1P1M4P1M3P1M3P1M1P1M7P1M3P1M2P2M21P1M6P1M3P1M1P2M29P2M2P1M2P1M1P1M81P2M1P1M4P2M1P1M2P1M2P1M1P2M1P1M1P1M1P2M1P1M4P2M1P1M2P2M10P1M3P1M3P1M1P1M4P1M1P1M1P1M1P1M2P1M1P1M1P2M6P1M9P1M3P1M2P2M3P1M7P1M2P1M3P1M1P3M4P1M6P1M2P1M1P2M1P1M3P1M1P1M9P1M1P1M1P2M3P1M4P2M3P3M1P1M10P1M8P1M4P1M2P1M4P1M2P1M2P1M4P2M5P1M2P1M5P1M1P2M3P1M1P1M1P2M13P1M1P1M1P1M2P1M2P1M12P1M9P1M1P1M1P1M1P1M2P1M3P2M2P1M2P1M8P1M1P2M1P2M1P3M10P2M4P1M2P1M4P1M4P1M1P1M8P1M2P1M1P1M4P2M1P2M2P1M2P1M3P1M9P1M5P2M4P2M17P1M1P1M13P1M2P1M3P1M11P1M2P1M10P1M2P1M22P1M1P1
M19P1M4P1M3P1M14P1M5P1M3P1M2P1M3P1M5P1M12P1M11P1M2P1M2P1M6P1M2P1M10P1M1P1M9P1M3P1M1P1M4P1M2P1M2P1M1P1M12P1M3P1M2P1M1P1M1P1M2P1M2P1M3P1M2P1M4P1M5P2M1P1M2P1M2P1M1P1M2P1M3P1M3P1M6P1M1P1M3P1M2P1M6P1M3P1M6P1M1P1M3P1M1P1M1P1M4P1M4P1M8P1M6P1M1P1M1P1M2P2M * 0 0 ATGAAACGATGATTATTTTCCACTAACCACAAAGACATTGGTACAATGTACTTTATCCTGGGAGCGTGATCAGGTATAATCGGTACTGGTATAAGAATACTTATTCGAAGGGAACTAGGTCAACCCGGTAGACTTATTGGTAATGACCAAATTTACAACGTAATTGTTACAGCTCATGCATTTATCATAATTTTCTTTATAGTTATACCTATTATAATTGGAGGCTTTGGCAATTGGCTTGTTCCTCTTATAATTGGAGCTCCTGATATAGCCTTCCCTCGAATAAACAATATAAGATTTTGACTTCTTCCTCCTTCCCTCTCTCTTCTTTTATCAAGAAGATTAACTGAATCTGGAGTTGGAACAGGATGAACAGTTTACCCTCCTCTTTCAAGTAATATTGCCCACAGTGGTATTTCCGTTGACTTAGCTATCTTCTCACTCCATTTGGCAGGAGCAAGATCAATTTTAGGTGCCATTAATTTCATTACTACTATCATCAATATACGTAATAAAATAATCACAATAGACCGATTACCTCTATTTGTATGATCAGTTTTCATCACAGCGTTTCTCC * RG:Z:Unpaired_reads_assembled_against_Facetotecta m54057_190926_040405/7602703/ccs_2 0 Facetotecta 1 255 2M1P2M1P1M5P1M10P1M3P1M2P1M5P1M4P1M1P1M2P1M2P1M1P1M1P1M1P2M3P1M2P1M2P1M1P1M2P1M1P1M2P4M3P1M1P1M1P1M2P2M1P1M2P1M5P2M3P1M1P1M2P2M4P2M1P1M2P1M2P1M8P1M3P1M21P2M4P1M5P1M2P1M2P1M3P1M4P1M5P1M3P1M5P1M1P1M3P2M3P1M1P1M6P1M1P1M3P1M2P1M1P1M10P1M2P1M18P1M1P2M4P1M6P1M1P1M8P2M3P2M11P1M2P1M6P1M2P1M6P1M3P2M2P1M7P1M6P1M1P1M1P1M1P1M1P1M9P1M8P3M5P1M1P1M1P1M1P1M1P2M7P1M3P1M1P1M1P1M2P1M1P1M14P1M1P1M4P1M4P1M1P1M12P3M2P1M6P1M1P1M3P2M2P1M1P1M1P2M3P1M3P2M3P1M3P2M1P1M4P2M23P1M4P1M4P1M8P1M2P1M1P1M1P1M17P1M1P1M5P1M3P1M1P1M16P1M1P1M1P2M3P1M5P1M1P3M1P3M1P1M2P2M4P2M1P1M1P1M5P2M3P1M7P1M5P1M2P1M2P1M1P1M1P1M1P1M1P1M3P2M1P1M30P1M1P1M2P1M1P2M8P1M3P1M8P1M1P1M1P1M8P2M1P2M1P1M1P1M4P2M5P1M1P1M4P1M10P1M5P1M4P1M5P1M2P1M10P1M1P1M1P1M3P1M4P1M4P2M1P2M9P2M4P2M3P2M2P1M1P1M3P2M2P1M2P2M2P3M3P1M17P1M4P1M1P1M3P2M2P1M4P1M8P1M1P1M1P1M2P1M2P3M1P1M3P1M4P1M1P2M2P1M1P1M3P1M3P1M3P1M4P1M3P1M1P1M3P1M1P1M5P2M2P1M1P2M1P1M3P1M1P1M1P1M3P1M8P2M1P2M1P1M2P2M11P1M3P2M8P1M1P2M14P1M14P2M8P1M1P2M3P1M4P1M5P1M1P1M1P1M1P1M2P2M6P1M1P1M1P1M1P1M4P1M2P1M3P1M6P1M2P1M2P2M2P1D1P3D1P1D2P1M6P1M2P1M1I1M2P1M8P1I1M2P2M4P1M4P1M5P1M
6P1M1P1M1P1M3P1M3P1M4P1M4P3I1M8P1M8P1M9P1M16P1M2P1M18P1M4P1M12P1M6P1M3P1M3P1M2P1M6P1M1P1M12P1M1P1M1P1M12P2M7P1M1P1M3P1M3P1M1P2M1P1M4P1M3P1M3P1M1P1M7P1M3P1M2P2M21P1M6P1M3P1M1P2M29P2M2P1M2P1M1P1M81P2M1P1M4P2M1P1M2P1M2P1M1P2M1P1M1P1M1P2M1P1M4P2M1P1M2P2M10P1M3P1M3P1M1P1M4P1M1P1M1P1M1P1M2P1M1P1M1P2M6P1M9P1M3P1M2P2M3P1M7P1M2P1M3P1M1P3M4P1M6P1M2P1M1P2M1P1M3P1M1P1M9P1M1P1M1P2M3P1M4P2M3P3M1P1M10P1M8P1M4P1M2P1M4P1M2P1M2P1M4P2M5P1M2P1M5P1M1P2M3P1M1P1M1P2M13P1M1P1M1P1M2P1M2P1M12P1M9P1M1P1M1P1M1P1M2P1M3P2M2P1M2P1M8P1M1P2M1P2M1P3M10P2M4P1M2P1M4P1M4P1M1P1M8P1M2P1M1P1M4P2M1P2M2P1M2P1M3P1M9P1M5P2M4P2M17P1M1P1M13P1M2P1M3P1M11P1M2P1M10P1M2P1M22P1M1P1M19P1M4P1M3P1M14P1M5P1M3P1M2P1M3P1M5P1M12P1M11P1M2P1M2P1M6P1M2P1M10P1M1P1M9P1M3P1M1P1M4P1M2P1M2P1M1P1M12P1M3P1M2P1M1P1M1P1M2P1M2P1M3P1M2P1M4P1M5P2M1P1M2P1M2P1M1P1M2P1M3P1M3P1M6P1M1P1M3P1M2P1M6P1M3P1M6P1M1P1M3P1M1P1M1P1M4P1M4P1M8P1M6P1M1P1M1P1M2P2M4P3M2P1M2P1I1M3P1M4P1M20P1D4P1M2P1M1P1M1P2M13P1M6P2M2P1M5P2M2P2M1P1M1P1M2P1M1P1M1P1M1P1M1P1M4P3M1P2M1P1M2P1M1P1M3P6M4P1M1P2M1P2M2P2M2P1M5P1M1P1M13P1M2P1M1P1M1P1M1P1M8P1M8P1M2P1M4P1M3P1M1P2M14P3M1P1M4P1M3P1M2P1M2P1M12P2M1P3M2P2M2P2M11P1M1P1M2P1D6P1D3P1D7P1D2P1D14P1D3P1M2P1M1P1M3P2M1P1M4P2M2P3M1P2M1P1M1P2M1P1M1P1M2P2M2P2M1P2M8P2M1P3M1P1M1P5M2P1M8P2M1P1M1P2M2P1M1P2M1P1M2P1M1P1M1P3M2P2M1P1M1P1M1P2M2P1M8P1M1P1M1P2M3P1M1P3M2P1M4P1M2P2M2P1M1P1M1P1M1P2M3P1M2P2M51P1M1P1M1P1M1P1M2P2M1P2M208P1M1P1M1P1M1P1M1P1M1P1M2P1M1P1M1P2M9P1M1P2M1P2M4P1M1P1M4P1M1P1M1P1M3P1M3P1M1P1M1P1M1P1M1P1M14P1M4P1M * 0 0 
ATGAAACGATGATTATTTTCAACCAATCATAAAGATATTGGAACTATATATATAATATTCGGCGCCTGATCCGGCACTATAGGAGTGGCAATAAGAATAATTATCCGTAGAGAACTAGGGCAACCCGGTTCTCTAATTGGTAACGATCAAATCTATAATGTAATTGTAACTGCCCACGCCTTTATCATAATTTTCTTTATAGTAATACCAATCATAATTGGAGGATTTGGAAACTGACTAATTCCTCTGATATTAGGATCCCCTGATATAGCATTTCCACGGATAAATAACATAAGATTCTGACTACTCCCCCCATCATTAATTCTTTTAATTAGAAGAAGACTAACAGAAAGGGGGGTAGGAACAGGATGAACGGTCTATCCTCCTCTTTCAAGAAATATCTCTCATAGAGGAGTCTCAGTAGACATGGCCATCTTCTCCCTCCACTTAGCTGGAGCAAGATCCATTTTAGGAGCCATTAATTTTATTACTACGATCATTAATATACGCAACAAAAACCTTTCTTTTGACCGTCTACCATTATTAGTATGATCTATCTTTATTACTACTATCCTTTTACTACTTTCTTTACCAGTACTTGCCGGAGCTATTACCATACTATTAACAGATCGAAATATTAATACTTCATTCTTTGATCCAGGTGGGGATCCTGTATTATATCAACATCTATTTTGATTTTTCGGACACCCAGAAGTTTATATTTTAATTCTACCAGGGTTTGGAATAGTTTCCCACATTATTAGACAAGAAAG *
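The CIGAR strings in the snippet above consist mostly of short `M` runs separated by `P` (padding) operations, with a few `I` and `D` operations in the second read. Whether that is related to the error is not established here, but a per-read tally of CIGAR operations could help when comparing this file with one that runs cleanly. The following is a minimal sketch that assumes a plain-text, tab-delimited SAM file; the function names are illustrative and not part of TranscriptClean or samtools.

```python
import re
from collections import Counter

# One CIGAR element is a length followed by a single operation letter
# (M, I, D, N, S, H, P, = or X in the SAM specification).
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_op_counts(cigar):
    """Return the summed length per CIGAR operation for one CIGAR string."""
    counts = Counter()
    for length, op in CIGAR_ELEMENT.findall(cigar):
        counts[op] += int(length)
    return counts


def summarize_sam(path):
    """Yield (read_name, Counter) for each alignment record in a text SAM file."""
    with open(path) as handle:
        for line in handle:
            if line.startswith("@"):
                continue  # skip header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 11 or fields[5] == "*":
                continue  # malformed record, or no CIGAR available
            yield fields[0], cigar_op_counts(fields[5])
```

For example, `cigar_op_counts("2M1P2M1P1M5P1M10P")`, the first eight elements of the first record, gives 6 bases of `M` and 17 of `P`. Looping over `summarize_sam("sample.sort.sam")` would print the same tally for every read.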
Any ideas about what is going wrong? It looks like TranscriptClean cannot run without a proper genome map or chromosome list, but I wanted to ask in case others are getting the same error.
I appreciate any help, and I am open to other solutions.
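One thing that can be ruled out quickly is a mismatch between the reference name in the SAM header (`Facetotecta` in the `@SQ` line above) and the sequence names in `mygenome.fasta`. The sketch below only compares names; it assumes a text SAM file and a FASTA file whose record name is the first whitespace-delimited token after `>`. The helper names are hypothetical, and this is not how TranscriptClean itself validates its input.

```python
def sam_reference_names(sam_path):
    """Collect reference names from the SN: tags of the @SQ header lines."""
    names = set()
    with open(sam_path) as handle:
        for line in handle:
            if not line.startswith("@"):
                break  # the header ends where the first record starts
            if line.startswith("@SQ"):
                for field in line.rstrip("\n").split("\t")[1:]:
                    if field.startswith("SN:"):
                        names.add(field[3:])
    return names


def fasta_record_names(fasta_path):
    """Collect sequence names: the first token after '>' on each header line."""
    names = set()
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                tokens = line[1:].split()
                if tokens:
                    names.add(tokens[0])
    return names


def missing_references(sam_path, fasta_path):
    """Return names present in the SAM header but absent from the FASTA."""
    return sam_reference_names(sam_path) - fasta_record_names(fasta_path)
```

An empty set from `missing_references("sample.sort.sam", "mygenome.fasta")` would mean every reference named in the SAM header is also present in the FASTA file.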
Niklas