Errors using figaro with v3-v4 sequencing file MiSeq #37
Comments
This error usually happens because of reads that were pre-trimmed and of varying length. Do you know if that was the case here?
Quoted message from vehamel, sent Thursday, May 13, 2021, 7:23 AM:
Hi!
This is the first time I am using figaro. I am not a regular Python user, which is why I am asking for help. I don't know what to do with this output or how to interpret it. I was using figaro to help me choose how to trim my sequences because I find the quality poor.
Thanks a lot for your help!
Here is the output of the run:
Forward read files appear to be of different lengths or of varied lengths. {(300, 0.7550505050505051), (299, 0.945050505050505), (300, 0.6905050505050505), (299, 0.8581818181818182), (300, 0.8383838383838383), (299, 0.9797979797979798), (299, 1.3761616161616161), (299, 0.854949494949495), (299, 0.9090909090909091), (299, 1.9526262626262625), (299, 1.0460606060606061), (299, 0.9923232323232323), (300, 0.3405050505050505), (299, 1.9267676767676767), (299, 0.7550505050505051), (299, 1.0233333333333334), (300, 1.5252525252525253), (299, 0.831919191919192), (300, 0.8145454545454546), (299, 0.7963636363636364), (299, 0.8504040404040404), (300, 0.6540404040404041), (300, 0.7146464646464646), (299, 0.9465656565656566), (300, 0.797979797979798), (299, 1.0925252525252525), (300, 0.6440404040404041), (300, 1.4343434343434343), (300, 0.7337373737373738), (299, 0.8819191919191919), (299, 0.753939393939394), (299, 0.9716161616161616), (299, 1.2044444444444444), (299, 1.0953535353535353), (299, 0.8723232323232324), (299, 0.8145454545454546), (299, 0.8686868686868686), (300, 0.6944444444444444), (299, 0.9166666666666666), (299, 1.8226262626262626), (300, 0.7167676767676767), (299, 0.9837373737373737), (299, 1.1268686868686868), (299, 1.0920202020202021), (300, 0.7135353535353536), (299, 0.7975757575757575), (299, 1.7006060606060607)}
Reverse read files appear to be of different lengths or of varied lengths. {(300, 0.6524242424242425), (300, 0.7228282828282828), (300, 0.34454545454545454), (300, 0.2771717171717172), (300, 0.5175757575757576), (300, 0.805959595959596), (300, 0.39555555555555555), (300, 0.5716161616161616), (300, 0.5268686868686868), (300, 0.19989898989898988), (300, 0.2832323232323232), (300, 0.5268686868686869), (300, 0.34383838383838383), (300, 0.21575757575757576), (300, 0.5425252525252525), (300, 0.8117171717171717), (300, 0.7348484848484849), (300, 0.4302020202020202), (300, 0.4011111111111111), (300, 0.49737373737373736), (299, 0.8988888888888888), (300, 0.4908080808080808), (300, 0.7632323232323233), (300, 0.9586868686868687), (300, 0.5066666666666667), (300, 0.6475757575757576), (300, 0.16353535353535353), (300, 0.45202020202020204), (300, 0.6666666666666667), (300, 0.612020202020202), (300, 0.3106060606060606), (300, 0.9995959595959596), (300, 0.5732323232323232), (300, 0.7272727272727273), (300, 0.6565656565656566), (300, 0.553030303030303), (300, 0.4670707070707071), (300, 0.38383838383838387), (300, 0.8771717171717172), (300, 0.547070707070707), (300, 0.8484848484848485), (299, 0.9389898989898989), (300, 0.5276767676767676), (300, 0.7070707070707071), (300, 0.7485858585858586)}
Forward reads appear to not be of consistent length. {(300, 0.7550505050505051), (299, 0.945050505050505), (300, 0.6905050505050505), (299, 0.8581818181818182), (300, 0.8383838383838383), (299, 0.9797979797979798), (299, 1.3761616161616161), (299, 0.854949494949495), (299, 0.9090909090909091), (299, 1.9526262626262625), (299, 1.0460606060606061), (299, 0.9923232323232323), (300, 0.3405050505050505), (299, 1.9267676767676767), (299, 0.7550505050505051), (299, 1.0233333333333334), (300, 1.5252525252525253), (299, 0.831919191919192), (300, 0.8145454545454546), (299, 0.7963636363636364), (299, 0.8504040404040404), (300, 0.6540404040404041), (300, 0.7146464646464646), (299, 0.9465656565656566), (300, 0.797979797979798), (299, 1.0925252525252525), (300, 0.6440404040404041), (300, 1.4343434343434343), (300, 0.7337373737373738), (299, 0.8819191919191919), (299, 0.753939393939394), (299, 0.9716161616161616), (299, 1.2044444444444444), (299, 1.0953535353535353), (299, 0.8723232323232324), (299, 0.8145454545454546), (299, 0.8686868686868686), (300, 0.6944444444444444), (299, 0.9166666666666666), (299, 1.8226262626262626), (300, 0.7167676767676767), (299, 0.9837373737373737), (299, 1.1268686868686868), (299, 1.0920202020202021), (300, 0.7135353535353536), (299, 0.7975757575757575), (299, 1.7006060606060607)}
Reverse reads appear to not be of consistent length. {(300, 0.6524242424242425), (300, 0.7228282828282828), (300, 0.34454545454545454), (300, 0.2771717171717172), (300, 0.5175757575757576), (300, 0.805959595959596), (300, 0.39555555555555555), (300, 0.5716161616161616), (300, 0.5268686868686868), (300, 0.19989898989898988), (300, 0.2832323232323232), (300, 0.5268686868686869), (300, 0.34383838383838383), (300, 0.21575757575757576), (300, 0.5425252525252525), (300, 0.8117171717171717), (300, 0.7348484848484849), (300, 0.4302020202020202), (300, 0.4011111111111111), (300, 0.49737373737373736), (299, 0.8988888888888888), (300, 0.4908080808080808), (300, 0.7632323232323233), (300, 0.9586868686868687), (300, 0.5066666666666667), (300, 0.6475757575757576), (300, 0.16353535353535353), (300, 0.45202020202020204), (300, 0.6666666666666667), (300, 0.612020202020202), (300, 0.3106060606060606), (300, 0.9995959595959596), (300, 0.5732323232323232), (300, 0.7272727272727273), (300, 0.6565656565656566), (300, 0.553030303030303), (300, 0.4670707070707071), (300, 0.38383838383838387), (300, 0.8771717171717172), (300, 0.547070707070707), (300, 0.8484848484848485), (299, 0.9389898989898989), (300, 0.5276767676767676), (300, 0.7070707070707071), (300, 0.7485858585858586)}
Traceback (most recent call last):
  File "C:\Users\veham18\figaro\figaro\figaro.py", line 218, in <module>
    main()
  File "C:\Users\veham18\figaro\figaro\figaro.py", line 210, in main
    resultTable, forwardCurve, reverseCurve = trimParameterPrediction.performAnalysisLite(parameters.inputDirectory.value, parameters.minimumCombinedReadLength.value, subsample = parameters.subsample.value, percentile = parameters.percentile.value, forwardPrimerLength=parameters.forwardPrimerLength.value, reversePrimerLength=parameters.reversePrimerLength.value, namingStandardAlias=fileNamingStandard)
  File "C:\Users\veham18\figaro\figaro\trimParameterPrediction.py", line 448, in performAnalysisLite
    forwardReadLength, reverseReadLength = checkReadLengths(fastqList)
  File "C:\Users\veham18\figaro\figaro\trimParameterPrediction.py", line 407, in checkReadLengths
    raise fastqHandler.FastqValidationError("Unable to validate fastq files enough to perform this operation. Please check log for specific error(s).")
fastqHandler.FastqValidationError: Unable to validate fastq files enough to perform this operation. Please check log for specific error(s).
|
Hi! No, I tried to trim them, but I gave figaro the original files, so they were not trimmed. It does seem they are of varying length (299 or 300), which I think is to be expected, no? A one-nucleotide difference is not a big difference... What can I do about that? |
Hello, I too am trying to use figaro for the first time and have been able to get it to now run but am getting a similar output. These reads were already trimmed of primers and barcodes. Since we used phasing in our primers, I am not surprised that I have varied lengths of forward and reverse reads. Does figaro require reads to be of the same length? |
Hello! Me too! I will need to remove the first part of the sequences because they must be primers. I forgot to do it, and now I am thinking of adding that step to my script! |
Hi! I still cannot use the tool! Can you help me? |
Hello, I was able to get FIGARO to work by first running fastqc and multiqc to determine the length that I wanted to trim to and make all reads the same length. I then used trimmomatic to get all the reads the same length. Trimmomatic has the option to crop at a certain length and drop reads that are shorter or you can choose to crop at the shortest sequencing read length; that's what I did. I then used FIGARO on the trimmed reads and once reads were all a consistent length, it ran fine. Hope this is helpful. |
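Editor's sketch: the "determine the length to trim to" step above can be approximated in Python by tallying read lengths and taking the shortest one. This is only an illustration and not part of FIGARO, FastQC, or Trimmomatic. The function name `read_length_counts` and the sample records are hypothetical, and the code assumes plain four-line FASTQ records.

```python
from collections import Counter
from typing import Iterable


def read_length_counts(fastq_lines: Iterable[str]) -> Counter:
    """Tally sequence lengths from four-line FASTQ records."""
    counts = Counter()
    for index, line in enumerate(fastq_lines):
        # In a four-line record the sequence is the second line
        # (zero-based indices 1, 5, 9, ...).
        if index % 4 == 1:
            counts[len(line.rstrip("\r\n"))] += 1
    return counts


# Hypothetical example data: two reads of 5 bases and one of 4.
example = [
    "@read1", "ACGTA", "+", "IIIII",
    "@read2", "ACGT", "+", "IIII",
    "@read3", "TTGCA", "+", "IIIII",
]
counts = read_length_counts(example)
shortest = min(counts)  # candidate crop length
print(dict(counts), shortest)
```

For a real gzipped file, the handle returned by `gzip.open(path, "rt")` can be passed in place of the list, since it yields text lines.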
Hello! I understand! But isn't the goal of using FIGARO to optimize where we should trim our sequences? Maybe I don't understand correctly? |
Hello, FIGARO helps to choose parameters for the filterAndTrim function in DADA2. For FIGARO to work, however, the reads going into it must be one consistent length. So for example, I had reads that ranged from 269-281 bases. I cropped all reads to 269 and then used those trimmed reads in FIGARO. The output of FIGARO then provided what it determined to be optimal settings for the truncLen and maxEE settings in DADA2. I still am hoping that eventually FIGARO will be able to handle varying lengths. |
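Editor's sketch: the cropping described above (reads of 269 to 281 bases all cut to 269) might look like the following in Python. This imitates the effect of cropping and dropping short reads. It is not Trimmomatic's implementation, and `crop_reads` and the sample reads are hypothetical names and data chosen for illustration.

```python
from typing import Iterable, Iterator, Tuple

# One FASTQ record: header, sequence, separator, quality string.
FastqRecord = Tuple[str, str, str, str]


def crop_reads(records: Iterable[FastqRecord], length: int) -> Iterator[FastqRecord]:
    """Crop each read to `length` bases and drop reads shorter than that."""
    for header, sequence, separator, quality in records:
        if len(sequence) < length:
            continue  # too short to reach the target length
        # Sequence and quality strings must stay the same length.
        yield header, sequence[:length], separator, quality[:length]


# Hypothetical reads of 281, 269 and 260 bases.
reads = [
    ("@r1", "A" * 281, "+", "I" * 281),
    ("@r2", "C" * 269, "+", "I" * 269),
    ("@r3", "G" * 260, "+", "I" * 260),
]
cropped = list(crop_reads(reads, 269))
print([len(sequence) for _, sequence, _, _ in cropped])
```

When the crop length is the shortest read length, as in the comment above, no reads are dropped. For paired-end data, a read dropped from one file would also need its mate removed from the other file. This sketch does not handle that, which is one reason to prefer a dedicated trimmer run in paired-end mode.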
Thanks a lot for the explanation! I will try that ;) |
Thanks for the community support. Sorry for being away for a bit, new baby over the last few weeks has been keeping me occupied. I agree very much with the approach above: if your reads only differ by a slight bit of length (a few bases here and there), just pretrim them to the shortest length, since you don't want to be selecting trimming parameters that are in the area where trimming may have happened to some reads. If your reads differ in length by a lot due to quality trimming, I recommend not doing that quality trim, as the purpose of FIGARO is to optimize the DADA2 native quality trimming methods. |
Well, now I am wondering, don't you have to trim to a consistent length in order to use FIGARO? Or is that now not the case? Thanks! |
All the reads in a given direction should be the same length going into FIGARO. Forward and reverse reads don’t need to be the same length, but all the forward and all the reverse reads should be the same length.
(Reply to Janet's comment above, sent Wednesday, June 2, 2021, 7:56 PM.)
|
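Editor's sketch: the rule stated above (one length per direction, with forward and reverse allowed to differ) can be expressed as a small check. This is an illustration only and does not reproduce FIGARO's own `checkReadLengths`. The function name `inconsistent_directions` and the sample lengths are hypothetical.

```python
from typing import Dict, Iterable, Set


def inconsistent_directions(lengths_by_direction: Dict[str, Iterable[int]]) -> Dict[str, Set[int]]:
    """Return the directions whose reads have more than one distinct length."""
    problems = {}
    for direction, lengths in lengths_by_direction.items():
        distinct = set(lengths)
        if len(distinct) > 1:
            problems[direction] = distinct
    return problems


# Hypothetical lengths: forward reads mix 299 and 300, reverse reads are uniform.
observed = {"forward": [300, 299, 300, 299], "reverse": [300, 300, 300]}
print(inconsistent_directions(observed))

# Different lengths between directions are fine as long as each direction is uniform.
print(inconsistent_directions({"forward": [269, 269], "reverse": [250, 250]}))
```

An empty result means every direction has a single read length, which is the condition described in the reply above.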
Thanks! |
Hi Janetw, reading your comments really helped me work through my Illumina v3-v4 data, but I have some trouble and doubts about trimming my sequences to the same length. Since there are no adapters in my fastq files, I suppose I only have to use the "CROP" step in Trimmomatic. Is this correct? Hoping you can help me. |
Brenda, yup passing
@michael-weinstein congrats on the sprog! 😺
Remember, and as above, |