
RNA Seq Differential Gene Expression Analysis

Roberto Vera Alvarez edited this page Jan 28, 2019 · 1 revision

This validation was designed to evaluate a Differential Gene Expression Analysis (DGA) workflow, following the approach published in "Empirical assessment of analysis workflows for differential expression analysis of human samples using RNA-Seq" by Williams et al.

We tested five workflows for the DGA. All workflows used the same aligner (STAR) and quantification tool (TPMCalculator) but a different DGA method: edgeR, DESeq2, SAMseq, the union of the genes identified by all DGA tools, and the intersection of the genes identified by all DGA tools. Recall and precision values were calculated with the approach described in the paper, using the expressions:


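Recall = |Gr ∩ Gd| / |Gr|

Precision = |Gr ∩ Gd| / |Gd|

where Gr is the set of genes in the reference and Gd is the set of genes identified by the workflow.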
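The calculation can be illustrated with a minimal Python sketch, assuming the usual set-based definitions (recall = |Gr ∩ Gd| / |Gr|, precision = |Gr ∩ Gd| / |Gd|). The function names and gene identifiers below are hypothetical and are not taken from the workflow's scripts; the sketch also shows how the union and intersection gene sets could be built from the per-tool results.

```python
def recall_precision(reference, detected):
    """Return (recall, precision) for two collections of gene identifiers."""
    g_r, g_d = set(reference), set(detected)
    shared = g_r & g_d
    # Guard against empty sets to avoid division by zero.
    recall = len(shared) / len(g_r) if g_r else 0.0
    precision = len(shared) / len(g_d) if g_d else 0.0
    return recall, precision


def combine_calls(calls_by_tool):
    """Return (union, intersection) of the gene sets reported by each tool."""
    gene_sets = [set(genes) for genes in calls_by_tool.values()]
    if not gene_sets:
        return set(), set()
    return set.union(*gene_sets), set.intersection(*gene_sets)


# Made-up gene identifiers, for illustration only (not real results).
reference = {"g1", "g2", "g3", "g4"}
calls = {
    "edgeR": {"g1", "g2", "g5"},
    "DESeq2": {"g1", "g2", "g3"},
    "SAMseq": {"g1", "g6"},
}

union, intersection = combine_calls(calls)
workflows = {**calls, "union": union, "intersection": intersection}
for name, genes in workflows.items():
    recall, precision = recall_precision(reference, genes)
    print(f"{name}: recall={recall:.2f} precision={precision:.2f}")
```

In this toy example the union trades precision for recall, while the intersection does the opposite, which is the general behavior to expect when combining gene lists this way.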

We plotted our results on Figure 5a of the paper to compare them with the published results. Although our recall values fall below the fitted line, our precision values lie above it, indicating better precision than that published in the paper; see final plots here.

This analysis tests the whole workflow: STAR, TPMCalculator, and a DGA tool. We used the same tools as the paper for alignment (STAR) and DGA (edgeR, DESeq2 and SAMseq); the only difference is the quantification step, which uses TPMCalculator. Additionally, we used the paper's scripts and parameters for the DGA tools, all of which are R packages.

We observed an increase in precision despite using the same first and last steps as the STAR-quantification-DGA workflows published in the paper. We concluded that the increase in precision is due to the introduction of the TPMCalculator tool.

Our workflow is based on a set of Jupyter Notebooks and CWL workflows. The workflows execute the analysis using the following tools:

  • FastQC, for pre-processing quality control
  • Trimmomatic, for read trimming
  • STAR, for read alignment
  • RSeQC, for alignment quality control
  • TPMCalculator, for mRNA abundance quantification
  • DESeq2, for DGA
  • edgeR, for DGA
  • SAMseq, for DGA

See full results at: https://ftp.ncbi.nlm.nih.gov/pub/RNASeqWF/