One thing that stands out is the low quality of roughly the first 40 base pairs of the reads, the same region that has been reported to have a high percentage of N calls.
We could consider trimming out this region.
During trimming, the minimum read length could be set to 50 and the Phred quality threshold set to 30.
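To make the trimming suggestion concrete, here is a minimal Python sketch of the logic, not the behaviour of any particular trimming tool. It assumes Phred+33 encoded FASTQ quality strings, and the function names and the `head_crop` parameter are hypothetical. It crops the first 40 bases, trims the 3' end while the base quality is below 30, and discards reads that end up shorter than 50.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_read(seq, qual, head_crop=40, min_quality=30, min_length=50):
    """Return the trimmed (seq, qual) pair, or None if the read becomes too short."""
    # Drop the leading region that showed low quality and a high N count.
    seq, qual = seq[head_crop:], qual[head_crop:]

    # Trim bases from the 3' end while they fall below the quality threshold.
    scores = phred_scores(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    seq, qual = seq[:end], qual[:end]

    # Discard reads shorter than the minimum length after trimming.
    if len(seq) < min_length:
        return None
    return seq, qual


if __name__ == "__main__":
    # Toy read: 100 bases, the last 10 with very low quality ('#' is Phred 2).
    read_seq = "A" * 100
    read_qual = "I" * 90 + "#" * 10
    print(trim_read(read_seq, read_qual))
```

In practice a dedicated trimming tool would do this on the whole FASTQ file; the sketch only shows how the three settings interact. Note that cropping 40 bases and then requiring 50 means any read shorter than 90 bases is discarded.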
Warning: Our data is characterized by low per base sequence content and abnormal GC content across the genome, and a majority of the sequences are overrepresented and duplicated.
Mitigation (Things that could have been done):
Overrepresentation - normalization during library preparation
Study the genome to know whether it is characterized by repeats and whether it is AT rich or GC rich
Some sequences are contaminated with adapter sequences and removing the adapters will improve the quality score of the sequences.
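For the point about checking whether the genome is AT rich or GC rich, a simple way to get a feel for it is to compute the GC fraction of a reference sequence or of individual reads. The sketch below is an illustration only; the function name `gc_fraction` is hypothetical, and ignoring N and other ambiguous bases is an assumption.

```python
def gc_fraction(seq):
    """Return the fraction of called bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    # Ignore N and other ambiguity codes so they do not dilute the estimate.
    called = [base for base in seq if base in "ACGT"]
    if not called:
        return 0.0
    gc = sum(1 for base in called if base in "GC")
    return gc / len(called)


if __name__ == "__main__":
    for example in ("ATATATGC", "GGCCGGCC", "ATNNGC"):
        print(example, gc_fraction(example))
```

Comparing the distribution of this value across reads with the value expected for the organism can help show whether the unusual GC profile reflects the genome itself or something like contamination.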
The thing to note is that our data has abnormal GC content, low per base sequence content, high sequence duplication and high levels of overrepresented sequences.
Considering that this is an RNA-Seq library, the overrepresentation may be due to very abundant transcripts as opposed to the usual conclusion of PCR enrichment bias.
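One way to look into this is to list the most frequent read sequences and check what they are, for example by searching them against known transcripts. The sketch below is a rough illustration using exact-match counting; the function name and the `min_fraction` threshold are assumptions chosen for the example, not values taken from the report.

```python
from collections import Counter


def overrepresented(reads, min_fraction=0.001):
    """Return (sequence, count, fraction) for sequences above min_fraction of all reads."""
    total = len(reads)
    if total == 0:
        return []
    counts = Counter(reads)
    # most_common() yields sequences from most to least frequent.
    return [
        (seq, n, n / total)
        for seq, n in counts.most_common()
        if n / total > min_fraction
    ]


if __name__ == "__main__":
    toy_reads = ["AAAA"] * 6 + ["CCCC"] * 3 + ["GGGG"]
    for seq, n, frac in overrepresented(toy_reads, min_fraction=0.25):
        print(seq, n, frac)
```

If the top sequences turn out to be a handful of highly expressed transcripts, the duplication is more likely biological than a PCR artefact.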
View the HTML summary of the QC report and tell us what it means. What do we need to do based on the results?