The input data set is a tidy representation of a differential gene transcript abundance analysis
To install:
For linux systems, in order to exploit multi-threading, from R write:
fileConn<-file("~/.R/Makevars")
writeLines(c( "CXX14FLAGS += -O3","CXX14FLAGS += -DSTAN_THREADS", "CXX14FLAGS += -pthread"), fileConn)
close(fileConn)
Then, install with
devtools::install_github("stemangiola/ppcSeq")
You can get the test dataset with
ppcSeq::counts
You can convert a list of BAM/SAM files into a tidy data frame of annotated counts
counts.ppc =
ppcSeq::counts %>%
mutate(is_significant = FDR < 0.01) %>%
ppc_seq(
formula = ~ Label,
significance_column = PValue,
do_check_column = is_significant,
value_column = value,
percent_false_positive_genes = "5%"
)
## [1] "2019-08-08 21:29:49 AEST"
## [1] "2019-08-08 21:32:44 AEST"
## [1] "2019-08-08 21:33:06 AEST"
## [1] "2019-08-08 21:36:35 AEST"
The new posterior predictive check has been added to the original data frame
counts.ppc
## # A tibble: 15 x 4
## symbol `sample wise data` plot `tot deleterious outliers`
## <chr> <list<df[,11]>> <list> <int>
## 1 SLC16A12 [21 × 11] <gg> 0
## 2 CYP1A1 [21 × 11] <gg> 1
## 3 ART3 [21 × 11] <gg> 1
## 4 DIO2 [21 × 11] <gg> 0
## 5 OR51E2 [21 × 11] <gg> 0
## 6 MUC16 [21 × 11] <gg> 0
## 7 CCNA1 [21 × 11] <gg> 0
## 8 LYZ [21 × 11] <gg> 1
## 9 PPM1H [21 × 11] <gg> 0
## 10 SUSD5 [21 × 11] <gg> 0
## 11 TPRG1 [21 × 11] <gg> 0
## 12 EPB42 [21 × 11] <gg> 0
## 13 LRRC38 [21 × 11] <gg> 0
## 14 SUSD4 [21 × 11] <gg> 0
## 15 MMP8 [21 × 11] <gg> 1
The new data frame contains plots for each gene
We can visualise the top five differentially transcribed genes
counts.ppc %>%
slice(1:2) %>%
pull(plot) %>%
cowplot::plot_grid(plotlist = ., align = "v", ncol = 1, axis="b", rel_widths = 1 )