-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Help to plot 'QQplot' with command 'fish$qqp()' and Coverage problem #8
Comments
my warning message
|
These warnings are fine - they just are letting you know that your ranges
have different seqlengths. If the ranges are from the same genome build
then no worries. This can happen often when you are using slightly
different versions of the same genome build (eg with different alternate
contigs), or you didn't initialize your granges with a seqlengths. To get
rid of the warnings just make sure that events, eligible, and genes have
the same seqlengths (see GenomicRanges package).
Is this happening with the tutorial data? Shouldn't be
…On Fri, Dec 6, 2019 at 9:16 AM a00101 ***@***.***> wrote:
my warning message
fish = Fish(hypotheses = genes[, 'gene_name'], events = events, eligible =
eligible, idcol = 'Tumor_Sample_Barcode')
FishHook 2019-12-06 23:15:20: Tallying events over eligible subsets of
hypotheses
FishHook 2019-12-06 23:15:20: Overlapping with covered intervals
FishHook 2019-12-06 23:15:20: Finished overlapping with covered intervals
FishHook 2019-12-06 23:15:21: Applying idcap of 1 on idcol
Tumor_Sample_Barcode.
FishHook 2019-12-06 23:15:21: Computing event counts
FishHook 2019-12-06 23:15:21: Finished counting events
FishHook 2019-12-06 23:15:21: Marking eligible vs. non-eligible events
FishHook 2019-12-06 23:15:21: Aggregating covariates over eligible subset
of hypotheses
FishHook 2019-12-06 23:15:21: Overlapping with covered intervals
FishHook 2019-12-06 23:15:22: Finished overlapping with covered intervals
Warning messages:
1: In gr.findoverlaps(hypotheses, covered, verbose = verbose > 1,
max.chunk = max.chunk, :
findOverlaps applied to ranges with non-identical seqlengths
2: In gr.findoverlaps(query, subject, ...) :
findOverlaps applied to ranges with non-identical seqlengths
3: In gr.findoverlaps(query, subject, ...) :
findOverlaps applied to ranges with non-identical seqlengths
4: In gr.findoverlaps(events, ov, max.chunk = max.chunk, mc.cores =
mc.cores, :
findOverlaps applied to ranges with non-identical seqlengths
5: In gr.findoverlaps(query, subject, ...) :
findOverlaps applied to ranges with non-identical seqlengths
6: In gr.findoverlaps(hypotheses, covered, verbose = verbose > 1,
max.chunk = max.chunk, :
findOverlaps applied to ranges with non-identical seqlengths
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#8>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AC73JEJXOII7YSEBNNQ47FDQXJNCRANCNFSM4JWYAXGQ>
.
|
What happens when you run fish$qqp()?
You can make a coverage file from a bam file using bedtools or deeptools or
Rsamtools any number of tools / packages that output a bedgraph, .wig,
bigwig, GRanges, tsv, csv, etc. This is outside of fishHook.
…On Fri, Dec 6, 2019 at 9:00 AM a00101 ***@***.***> wrote:
1.
I ran test data (luad.maf and others).
Results are the same in the manual.
but 'fish$qqp()' not works.
how should I do that?
1. I have paired sequencing data. how to make coverage file?
one way is extracting common and another is extracting all.
which is right?
Thanks you for your reply in advance.
Thanks.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#8>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AC73JEP64IJRPJLTOJ65RZDQXJLJFANCNFSM4JWYAXGQ>
.
|
No. But thanks. |
You can use normal coverage to determine the "eligible" territory eg for a
custom exome assay or panel.
The other approach is to use the assay manufacturer provided map of the
covered territory eg a bed file.
If you are using a custom targeted panel with unknown genomic distribution
then you have to compute the eligible yourself via some coverage
calculations eg generating per sample base level coverage via deeptools,
bedtools, samtools etc and then deciding on a cutoff eg 40X that you
require at a base to be in order for a mutation to be detectable in that
sample, and then finding all bases that are detectable on average in for
example >=90% of samples. These operations can be done via GenomicRanges
/ rtracklayer packages, and you can pipe the output to fishHook.
…On Sun, Dec 8, 2019 at 8:54 AM a00101 ***@***.***> wrote:
No. But thanks.
My point is "should I make pooled normal coverage?"
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#8>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AC73JEN7GMI2DSRCB3DUCHTQXT4AHANCNFSM4JWYAXGQ>
.
|
You mean 'convert Normal bam to coverage called eligible' ? Thanks. |
The eligible territory is just a set of non-overlapping intervals i.e. a
mask, a bed file. You can compute it from coverage, e.g. by finding all
the regions above a certain coverage in your assay, or all bases that are
covered at least 40X in >90% of patients. Whatever you do, in the end,
the eligible region should be a set of intervals
…On Thu, Dec 12, 2019 at 1:55 AM a00101 ***@***.***> wrote:
You mean 'convert Normal bam to coverage called eligible' ?
Thanks.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#8>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AC73JEJCNNIZGR7RWEC5HDTQYHN4JANCNFSM4JWYAXGQ>
.
|
For WGS the eligible territory can in theory be the whole genome, however
there are "bad areas" that are defined by regions of low mappability or
erroneous mapping / repetitive sequence where you will routinely see
recurrent false positive variants because of alignment errors. We mask
these out in hg19 by defining the eligible territory based around Heng Li's
mask, which throws out ~650Mbp of genome. Unfortunately no such mask
exists for hg38 - though we are working on some alternatives. In the
meantime, you can use mappability track from UCSC and define WGS eligible
around uniquely mapping regions eg 50mers.
For WES and panels you want to base your eligible territory on the part of
the genome that the assay reliably covers - that is more empirical and
assay specific / data driven. i.e. you can use the coverage data from your
WES / panel to identify bases that are reliably captured (eg in 95% of
experiments) .
Hope that makes sense.
…On Thu, Dec 19, 2019 at 12:36 AM a00101 ***@***.***> wrote:
Thanks. But still i don't understand exactly for making coverage.
First of all, I know which tool to use to create the coverage.
But what I still don't know is whether I can define the GRCh38 genome file
as eligible territory. In other words, should GRCh38 gtf be converted to
bed to bigwig?
Or, read coverage varies from sample to sample, so I have to use different
coverage files for each sample.
I'm very thank you for your kindness.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#8>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AC73JEM4VORFWBYOHCJMCJDQZMB4NANCNFSM4JWYAXGQ>
.
|
I ran test data (luad.maf and others).
Results are the same in the manual.
but 'fish$qqp()' not works.
how should I do that?
one way is extracting common and another is extracting all.
which is right?
Thanks you for your reply in advance.
Thanks.
The text was updated successfully, but these errors were encountered: