Output of Freddie's Snakemake pipeline

The folder structure is as follows:

genes: main folder
genes/<gene>/: one folder per gene. We have three genes:
- AR: the gene we are primarily interested in
- PLSCR3: a short gene that nevertheless has 13 isoforms
- E2F1: a gene with only a single isoform
genes/<gene>/<sample>/: one folder per sample:
- P000R000: First ONT flow cell, first run
- P000R001: First ONT flow cell, second run
- P000R002: Second ONT flow cell
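A minimal sketch of how the layout above could be enumerated. It assumes the sample folders are nested under the gene folders (main folder, then gene, then sample) and that the main folder is reachable as `genes` from the working directory; the helper name `sample_dirs` is hypothetical and not part of the pipeline.

```python
from pathlib import Path

# Gene and sample names as listed above.
GENES = ["AR", "PLSCR3", "E2F1"]
SAMPLES = ["P000R000", "P000R001", "P000R002"]


def sample_dirs(root="genes"):
    """Return the expected per-sample folders under the main folder."""
    # Assumption: sample folders are nested as <root>/<gene>/<sample>.
    return [Path(root) / gene / sample for gene in GENES for sample in SAMPLES]


# Report which of the expected folders are present on disk.
for folder in sample_dirs():
    print(folder, "exists" if folder.is_dir() else "missing")
```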

Then, under each sample folder we have:

Files BEFORE running Freddie

gene.fasta: FASTA file of the gene
reads.fasta: FASTA file of the real reads that map to the gene's genomic region according to a minimap2 whole-genome mapping
transcripts.fasta: FASTA file of the Ensembl transcripts of this gene. The forward strand is shown here.
transcripts.tsv: Each record is a transcript. The columns are:
- Transcript ID
- Chromosome (all should be the same)
- Strand on reference
- Comma-separated list of the transcript's exon intervals on the forward strand of the gene
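The four columns above could be read along these lines. This is only a sketch under two assumptions that the description does not confirm: that the file has no header row, and that each interval is written as `start-end`. The names `Transcript`, `parse_intervals`, and `read_transcripts` are hypothetical.

```python
import csv
from typing import NamedTuple


class Transcript(NamedTuple):
    transcript_id: str
    chrom: str
    strand: str
    exons: list  # list of (start, end) integer pairs


def parse_intervals(field):
    """Parse 'start-end,start-end,...' into (start, end) integer pairs."""
    # Assumption: each interval is written as start-end; adjust if the file differs.
    pairs = []
    for item in field.split(","):
        start, end = item.strip().split("-")
        pairs.append((int(start), int(end)))
    return pairs


def read_transcripts(path):
    """Read transcripts.tsv: ID, chromosome, strand, exon intervals."""
    # Assumption: tab-separated, one transcript per line, no header row.
    with open(path, newline="") as handle:
        rows = csv.reader(handle, delimiter="\t")
        return [Transcript(r[0], r[1], r[2], parse_intervals(r[3])) for r in rows if r]
```

A call such as `read_transcripts("transcripts.tsv")` inside a sample folder would then return one `Transcript` per record.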
training*: Files generated by NanoSim while analyzing the error profile of the real reads. See the NanoSim documentation for more details.
simulated_error_profile and simulated.log: Files generated by NanoSim. They contain the log of the errors simulated in each read and the overall simulation log.
simulated_reads.fasta: Reads generated by NanoSim
simulated_reads.oriented.fasta: The same NanoSim reads but oriented on the forward strand of the gene/transcript
simulated_reads.oriented.tsv: Similar to transcripts.tsv but for the simulated reads:
- Read name
- Transcript ID
- Original strand on transcript/gene
- Comma-separated list of the transcript's exon intervals on the forward strand of the gene
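As a quick sanity check on the simulation, the reads in this file can be tallied by their transcript of origin. The sketch below assumes the file is tab-separated with no header row; the function name `reads_per_transcript` and the example rows are made up for illustration.

```python
import csv
from collections import Counter


def reads_per_transcript(lines):
    """Count simulated reads per transcript ID from tab-separated lines."""
    counts = Counter()
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) >= 2:
            counts[row[1]] += 1  # the second column holds the transcript ID
    return counts


# Made-up rows for illustration only; real rows come from simulated_reads.oriented.tsv.
example = ["read_1\tT1\t+\t0-100", "read_2\tT1\t-\t0-100", "read_3\tT2\t+\t0-50"]
print(reads_per_transcript(example))
```

An open file handle can be passed in place of the example list, since `csv.reader` accepts any iterable of lines.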

simulated_reads.oriented.paf: The PAF alignment file. Standard PAF tags are used. In addition, the tag oc:c:1 marks an alignment that is used in the optimal chain.
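To keep only the alignments that belong to the optimal chain, the PAF lines can be filtered on that tag. A minimal sketch, assuming the tag appears verbatim as `oc:c:1` among the optional fields that follow the 12 mandatory PAF columns; the function name and the example lines are hypothetical.

```python
def optimal_chain_alignments(lines):
    """Yield PAF records (as lists of fields) that carry the oc:c:1 tag."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # PAF has 12 mandatory columns; optional tags follow them.
        if len(fields) >= 12 and "oc:c:1" in fields[12:]:
            yield fields


# Made-up PAF lines for illustration only.
example = [
    "read_1\t1000\t0\t500\t+\tgene\t5000\t100\t600\t480\t500\t60\toc:c:1",
    "read_1\t1000\t500\t900\t+\tgene\t5000\t2000\t2400\t380\t400\t60",
]
for record in optimal_chain_alignments(example):
    print(record[0], record[7], record[8])  # read name, target start, target end
```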
simulated_reads.oriented.dot: DOT file from Freddie plotting
simulated_reads.oriented.pdf: PDF file from Freddie plotting. View this.
simulated_reads.oriented.<read name>.dot: DOT file from Freddie plotting with only the annotations for <read name> kept
simulated_reads.oriented.<read name>.pdf: PDF file from Freddie plotting with only the annotations for <read name> kept. View this.