Is filterAndTrim effect to NA results or because minBoot? What parameter setting should be used appropriately? #2008
Comments
I wouldn't suggest recalibrating the parameters, which I guess in this case would be reducing The most common cause of chunks of the data having |
Thank you for taking your valuable time to give us your advice. I removed adapters and low complexity sequences. I have tried reducing
I forgot to mention earlier that: (1) we used single-read sequencing (the resulting sequence length is between 100 and 150 bp), and (2) the metabarcoding primer is trnL-P6. We have tried a large reference database (299,137 accession numbers). When we restricted the reference database to flora from the study area, the results were still the same. Thus, I don't think the reference database is the problem.
To clarify what I said above, I would not suggest "recalibrating the parameters, which I guess in this case would be reducing minBoot to below 50."
Large is not the same as comprehensive when it comes to reference databases. There may be taxa that are in your data but not well-represented in the reference database. I am not very familiar with |
Could you please clarify if this means that the minBoot value of 50 shouldn't be lowered further? Apologies for asking again, I just want to avoid any confusion.
Initially, we used the pre-made rCRUX database and then filtered out species/genera that are not represented in the study area. After running BLAST on the NA sequences, we found that the plant families and genera we identified matched those in GenBank. Importantly, these sequences are also present in our reference database. All the best,
Don't lower
Are the hits you get when BLAST-ing the NA ASVs against nt hitting the trnL gene? The other way that NA gets assigned is if the sequence is not discriminatory enough, that is, the sequence is similar to several taxa at (e.g.) the genus level, which will usually yield an NA assignment at that level. If you are seeing NA results at the more resolved levels but definite assignments at the higher levels (class, order, etc.), then this is likely what is going on.
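One way to check for the pattern described above is to count NA assignments per rank. The following is a minimal base-R sketch, not a dada2 recipe: the matrix `tax` is a made-up stand-in for the taxonomy table (with `outputBootstraps = TRUE` that table would be `plant.taxa$tax`), and the rank names are assumptions, since they depend on how the reference database is formatted.

```r
# Toy stand-in for the taxonomy matrix (hypothetical ranks and taxa).
tax <- matrix(
  c("Eukaryota", "Streptophyta", "Poales",  "Poaceae", "Poa",
    "Eukaryota", "Streptophyta", "Poales",  "Poaceae", NA,
    "Eukaryota", "Streptophyta", "Fabales", NA,        NA,
    NA,          NA,             NA,        NA,        NA),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ASV", 1:4),
                  c("Kingdom", "Phylum", "Order", "Family", "Genus"))
)

# Fraction of ASVs left unassigned (NA) at each rank.
na_frac <- colMeans(is.na(tax))
print(round(na_frac, 2))

# Number of assigned ranks per ASV. Because NAs only occur as a trailing run
# in this table, this equals the deepest assigned rank (0 = NA everywhere).
deepest <- rowSums(!is.na(tax))
print(table(deepest))
```

If the NA fraction is low at the higher ranks and climbs only toward family or genus, that points to sequences that are not discriminatory enough at those levels. ASVs that are NA at every rank are a different case and are the ones worth BLAST-ing.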
Dear dada2 enthusiasts,
I'm pretty new to dada2, so I would like to hear some opinions from those who are experts and use dada2 regularly about an issue in my analysis.
I have tried filterAndTrim with several parameter setups (e.g. truncQ = 2, 5, or 7; minLen = 50 or 75) and found that
out.plant <- filterAndTrim(plant.raw, filtered_plant, maxN=0, minLen = 50, maxEE=2, truncQ=2, rm.phix=TRUE, compress=TRUE, multithread=TRUE)
gives the best results so far.
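For comparing filtering setups, one simple check is the share of reads each sample keeps. This is only a sketch: filterAndTrim returns a matrix with the columns reads.in and reads.out, but the matrix `out` below is a made-up stand-in for out.plant, with hypothetical file names and counts.

```r
# Toy stand-in for out.plant (one row per input file; counts are invented).
out <- matrix(c(10000, 8200,
                12000, 5400),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("sampleA.fastq.gz", "sampleB.fastq.gz"),
                              c("reads.in", "reads.out")))

# Percentage of reads that survived filtering, per sample.
pct_kept <- round(100 * out[, "reads.out"] / out[, "reads.in"], 1)
print(cbind(out, pct_kept))
```

A sample that loses most of its reads at this step deserves a look before blaming later steps such as taxonomy assignment.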
I followed the dada2 tutorial steps until I reached the assignTaxonomy function.
Then I use:
plant.taxa <- assignTaxonomy(seqtab.nochim, plant.ref, multithread = TRUE, tryRC = TRUE, minBoot=50, outputBootstraps = TRUE)
After running assignTaxonomy with minBoot=50, about half of the assignments came back as NA (which is not a good sign?).
I'm wondering whether I missed a step or whether I should recalibrate the parameters. If so, which parameter values are actually appropriate?
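Because the call above sets outputBootstraps = TRUE, the result is a list with a taxonomy matrix (`$tax`) and a bootstrap matrix (`$boot`) rather than a single matrix. The sketch below uses a made-up list `res` in place of plant.taxa, with hypothetical ranks and values, to look at the bootstrap support behind the NA calls. It assumes that `$boot` still reports values that fall below minBoot, which should be verified on the real output.

```r
# Toy stand-in for plant.taxa (hypothetical ranks, taxa, and bootstrap values).
ranks <- c("Order", "Family", "Genus")
res <- list(
  tax = matrix(c("Poales",  "Poaceae", "Poa",
                 "Poales",  "Poaceae", NA,
                 "Fabales", NA,        NA),
               nrow = 3, byrow = TRUE, dimnames = list(NULL, ranks)),
  boot = matrix(c(100, 98, 87,
                  100, 91, 46,
                   72, 31, 12),
                nrow = 3, byrow = TRUE, dimnames = list(NULL, ranks))
)

# Bootstrap support at the positions that were reported as NA.
na_boot <- res$boot[is.na(res$tax)]
print(summary(na_boot))

# NA calls that fell within 10 points of the minBoot = 50 cutoff.
near_miss <- sum(na_boot >= 40)
print(near_miss)
```

This is meant as a diagnostic for where support drops off, not as an argument for lowering minBoot.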
Looking forward to hearing valuable advice from you guys and please don't hesitate to ask any further questions.
Thank you very much :)