Output of Freddie's Snakemake pipeline

The folder structure is as follows:

  • genes: main folder
  • genes/<gene>/: folder per gene. We have three genes:
    • AR: the gene we are interested in
    • PLSCR3: short gene but has 13 isoforms
    • E2F1: gene with only a single isoform
  • genes/<gene>/<sample>/: folder per sample:
    • P000R000: First ONT flow cell, first run
    • P000R001: First ONT flow cell, second run
    • P000R002: Second ONT flow cell
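
As a rough illustration of the layout above, the following Python sketch enumerates every sample folder. It assumes the sample folders sit directly inside each gene folder and that the root folder is named `genes` relative to the working directory; the helper name `sample_dirs` is hypothetical and not part of the pipeline.

```python
from pathlib import Path

# Gene and sample names as listed in this document.
GENES = ["AR", "PLSCR3", "E2F1"]
SAMPLES = ["P000R000", "P000R001", "P000R002"]


def sample_dirs(root="genes"):
    """Return one path per gene/sample pair (assumed layout: <root>/<gene>/<sample>)."""
    return [Path(root) / gene / sample for gene in GENES for sample in SAMPLES]


if __name__ == "__main__":
    for directory in sample_dirs():
        print(directory)
```

With three genes and three samples, this yields nine folders.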

Then, under each sample folder we have:

Files BEFORE running Freddie:

  • gene.fasta: FASTA file of the gene
  • reads.fasta: FASTA file of the real reads that map to the gene's genomic region according to a minimap2 whole-genome mapping
  • transcripts.fasta: FASTA file of the ENSEMBL transcripts of this gene. The forward strand is shown here.
  • transcripts.tsv: Each record is a transcript. The columns are:
    • Transcript ID
    • Chromosome (all should be the same)
    • Strand on reference
    • Comma-separated list of the transcript's exon intervals, given on the forward strand of the gene
  • training*: Files generated by NanoSim while analyzing the error profile of the real reads. See the NanoSim documentation for more details.
  • simulated_error_profile and simulated.log: Files generated by NanoSim, containing the log of the errors simulated in each read and the overall run log.
  • simulated_reads.fasta: Reads generated by NanoSim
  • simulated_reads.oriented.fasta: The same NanoSim reads but oriented on the forward strand of the gene/transcript
  • simulated_reads.oriented.tsv: Similar to transcripts.tsv but for the simulated reads:
    • Read name
    • Transcript ID
    • Original strand on transcript/gene
    • Comma-separated list of the transcript's exon intervals, given on the forward strand of the gene
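
The four-column layout of transcripts.tsv can be read with a few lines of Python. This is a minimal sketch under some assumptions that this document does not confirm: the file is tab-separated (suggested only by the `.tsv` extension), it has no header line, and the intervals are split on commas but otherwise left as raw strings because their internal notation is not specified here. The function name `read_transcripts_tsv` and the example row are hypothetical.

```python
import csv
import io


def read_transcripts_tsv(handle):
    """Parse rows of: transcript ID, chromosome, strand, comma-separated intervals."""
    records = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        transcript_id, chromosome, strand, intervals = row[:4]
        records.append(
            {
                "transcript_id": transcript_id,
                "chromosome": chromosome,
                "strand": strand,
                # Intervals are kept as raw strings; their notation is not documented here.
                "intervals": [item for item in intervals.split(",") if item],
            }
        )
    return records


# Made-up row for illustration only; the interval notation shown is a guess.
example = "TX1\tchrX\t+\t100-200,300-450\n"
records = read_transcripts_tsv(io.StringIO(example))
print(records[0]["transcript_id"], records[0]["intervals"])
```

The same approach should carry over to simulated_reads.oriented.tsv, with the read name as an extra leading column.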

Files AFTER running Freddie:

  • simulated_reads.oriented.paf: The PAF alignment file. Standard PAF tags are used; in addition, the oc:c:1 tag marks an alignment that is used in the optimal chain.
  • simulated_reads.oriented.dot: DOT file from Freddie plotting
  • simulated_reads.oriented.pdf: PDF file from Freddie plotting. View this.
  • simulated_reads.oriented.<read name>.dot: DOT file from Freddie plotting with only the annotations for <read name> kept
  • simulated_reads.oriented.<read name>.pdf: PDF file from Freddie plotting with only the annotations for <read name> kept. View this.
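
To pull out only the alignments that belong to the optimal chain, the PAF file can be filtered on the oc:c:1 tag described above. The sketch below relies on the general PAF layout (12 mandatory tab-separated columns followed by optional tags) and matches the tag literally as written in this document. The function name `optimal_chain_alignments` and the example lines are hypothetical.

```python
def optimal_chain_alignments(lines):
    """Return the split fields of every PAF line that carries the oc:c:1 tag."""
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # not a complete PAF record
        # Optional tags start after the 12 mandatory columns.
        if "oc:c:1" in fields[12:]:
            kept.append(fields)
    return kept


# Made-up PAF lines for illustration only.
example_lines = [
    "read_1\t1000\t0\t500\t+\tAR\t5000\t100\t600\t480\t500\t60\toc:c:1\n",
    "read_1\t1000\t500\t900\t+\tAR\t5000\t2000\t2400\t380\t400\t60\n",
]
for fields in optimal_chain_alignments(example_lines):
    print(fields[0], fields[7], fields[8])
```

To run it on a real file, pass an open file handle, for example `optimal_chain_alignments(open("simulated_reads.oriented.paf"))`.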