Latest version 0.0.3
IRSim simulates Illumina RNA sequencing (RNA-seq) datasets with pseudo-random intron retention events in coding RNA transcripts. The simulated datasets can be used to evaluate and benchmark intron retention detection software and workflows. IRSim is written in Python 3 and uses DWGSIM for NGS read simulation.
The percent intron retention (PIR) of each retained intron is calculated as 100 × the mean number of retention reads divided by the sum of retention reads and spliced intron reads, following Braunschweig et al., 2014.
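The PIR formula above can be sketched in a few lines of Python. This is an illustration only, not IRSim's own code: the function name and arguments are hypothetical, and it assumes that "retention reads" are counted at each exon-intron junction and averaged, and that the denominator is that mean plus the spliced read count.

```python
def percent_intron_retention(retention_reads, spliced_reads):
    """Return the PIR (in percent) for one intron.

    retention_reads: read counts supporting retention, e.g. one count per
        exon-intron junction (assumption; averaged below).
    spliced_reads: count of reads spanning the spliced exon-exon junction.
    Returns None when no reads cover the intron, since PIR is undefined.
    """
    if not retention_reads:
        raise ValueError("retention_reads must contain at least one count")
    mean_retention = sum(retention_reads) / len(retention_reads)
    total = mean_retention + spliced_reads
    if total == 0:
        return None
    return 100.0 * mean_retention / total
```

For example, with 10 and 30 retention reads at the two junctions and 60 spliced reads, the mean retention count is 20 and the PIR is 100 × 20 / 80 = 25.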
- Python 3 (3.6.7 or above)
- numpy (1.16.2 or above)
- pyfaidx (0.5.5.2 or above)
- DWGSIM (0.1.11 or above)
git clone https://github.com/cytham/irsim.git
cd irsim
chmod +x irsim
Make sure you have the following input files ready:
- A reference genome file in FASTA format
- A cDNA file in FASTA format
- An annotation file in GTF format
- A tab-delimited FPKM model file (Column 1 - gene ID \t Column 2 - FPKM value)
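If you do not already have an FPKM model file, the two-column layout described above can be produced with a short script. This is a minimal sketch: the function name is hypothetical, and it assumes the file has no header row, which is not stated here and should be checked against IRSim's expectations.

```python
import csv


def write_fpkm_model(path, fpkm_by_gene):
    """Write a tab-delimited file: column 1 = gene ID, column 2 = FPKM value.

    fpkm_by_gene: mapping of gene ID (str) to FPKM value (number).
    Assumption: no header row is written.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for gene_id, fpkm in fpkm_by_gene.items():
            writer.writerow([gene_id, fpkm])
```

Calling `write_fpkm_model("FPKM_model.tsv", {"GENE_A": 12.5, "GENE_B": 0.0})` writes one line per gene; the gene IDs here are placeholders and should match the IDs used in your annotation GTF file.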
- Add the path to the DWGSIM directory to the configuration file (config.ini)
DWGSIM directory = /path/to/DWGSIM_directory
- Edit the other parameters as needed
Number of threads = 10
Number of replicates for sample A (Experimental sample) = 1
Number of replicates for sample B (Control sample) = 1
.
.
.
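Before running a simulation, it can help to check the edited settings. The sketch below reads `name = value` lines like the ones shown above into a dictionary. It is not how IRSim itself parses its configuration; the function name is hypothetical, and the full file may contain sections or comments that this simple reader only skips.

```python
def parse_settings(text):
    """Collect 'name = value' lines from configuration text into a dict.

    Lines without '=' (blank lines, section headers, filler) and lines
    starting with '#' or ';' are ignored. All values are kept as strings.
    """
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        # Split on the first '=' so the value may itself contain '='.
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip()
    return settings
```

With the settings shown above, `parse_settings(text)["Number of threads"]` returns the string `"10"`, and the `"DWGSIM directory"` entry can be passed to `os.path.isdir` to confirm that the path exists.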
/path/to/irsim ref_genome.fa cDNA.fa annotation.gtf FPKM_model.tsv config.ini output_directory
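The same command can be launched from a Python script, for example when generating several datasets in a row. This is a sketch under the assumption that `irsim` takes exactly the six positional arguments shown above, in that order; the helper names are hypothetical.

```python
import os
import subprocess


def build_irsim_command(irsim_path, ref_genome, cdna, gtf, fpkm_model,
                        config, output_dir):
    """Return the argument list in the order shown in the command above."""
    return [irsim_path, ref_genome, cdna, gtf, fpkm_model, config, output_dir]


def missing_inputs(paths):
    """Return the input paths that are not existing regular files."""
    return [path for path in paths if not os.path.isfile(path)]


def run_irsim(irsim_path, ref_genome, cdna, gtf, fpkm_model, config,
              output_dir):
    """Check that the input files exist, then run IRSim."""
    missing = missing_inputs([ref_genome, cdna, gtf, fpkm_model, config])
    if missing:
        raise FileNotFoundError("Missing input files: " + ", ".join(missing))
    command = build_irsim_command(irsim_path, ref_genome, cdna, gtf,
                                  fpkm_model, config, output_dir)
    # check=True raises CalledProcessError if IRSim exits with an error.
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.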
- Gzipped FASTQ paired-read files for each sample/replicate.
- A report file showing the Percent Intron Retention (PIR) for each intron in each sample/replicate.
See Releases
- Tham Cheng Yong - cytham
This project is licensed under the GNU General Public License. See LICENSE for details.