
Running FICLE

SziKayLeung edited this page Oct 18, 2023 · 4 revisions



FICLE is run from a single executable Python 3 script. The src folder contains all the auxiliary scripts and functions it requires.


Python-related libraries

  • gtfparse (v1.2.1)
  • pandas (v1.1.5)
  • NumPy (v1.19.5)


  • SQANTI3 (> v5.0)
  • CPAT (v3.0.2)
  • gtfToGenePred and genePredToBed
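
Before going further, it can help to confirm that gtfToGenePred and genePredToBed are reachable from the shell. The following is a minimal sketch, not part of FICLE: the function name missing_tools is made up for illustration, and the executable names for SQANTI3 and CPAT depend on how they were installed, so they are left for the caller to supply.

```python
import shutil

# Tool names taken from the list above. Anything else passed in (for example
# a SQANTI3 or CPAT executable name) is the caller's own assumption.
REQUIRED_TOOLS = ("gtfToGenePred", "genePredToBed")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

Calling missing_tools() with no arguments returns an empty list when both UCSC utilities are installed and on PATH.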


  1. Install Anaconda or Miniconda.
  2. Clone the FICLE git repository into a folder of your choice.
  3. Create a conda environment from the ficle.condaEnv.yml file available in the main FICLE folder, then activate it:
conda env create -f ficle.condaEnv.yml
source activate ficle


Pre-requisites

  1. Download the reference genome annotation of interest in GTF format, available from GENCODE or CHESS.

  2. Run SQANTI3 QC and filtering to generate a filtered classification file. See the SQANTI3 Git repository for more details.

  3. Run CPAT on the long-read-derived FASTA (preferably the one from SQANTI3), which can be obtained by following the long-read processing pipeline, to generate the CPAT output file passed to --cpat. The CPAT options used are:
-x Mouse_Hexamer.tsv -d Mouse_logitModel.RData -g <path/to/longRead.fasta> --min-orf=50 --top-orf=50 -o <path/to/output/directory>

  4. Generate a sorted bed file from the long-read-derived GTF:
gtfToGenePred <path/to/longRead.gtf> longRead.genePred
genePredToBed longRead.genePred longRead.bed12
sort -k1,1 -k2,2n longRead.bed12 > longRead_sorted.bed12
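
The sorted file can be sanity-checked before it is passed to --input_bed. The sketch below is an illustration only, not something FICLE provides: bed12_problems is a hypothetical helper that checks each record has 12 tab-separated fields and that records are ordered by chromosome name and then numerically by start. It compares chromosome names by code point, which matches sort under the C locale; under other locales the expected order may differ.

```python
def bed12_problems(path):
    """Return a list of problems found in a BED12 file (empty if none)."""
    problems = []
    previous = None  # (chromosome, start) of the last valid record
    with open(path) as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # ignore blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) != 12:
                problems.append(
                    "line %d: expected 12 fields, found %d" % (number, len(fields))
                )
                continue
            try:
                key = (fields[0], int(fields[1]))
            except ValueError:
                problems.append("line %d: start is not an integer" % number)
                continue
            # Assumes C-locale ordering of chromosome names (see lead-in).
            if previous is not None and key < previous:
                problems.append("line %d: record is out of order" % number)
            previous = key
    return problems
```

An empty list from bed12_problems("longRead_sorted.bed12") suggests the file is well-formed and sorted as expected.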

Getting ready

Before running FICLE, you will need to:

  1. Activate the ficle conda environment:
-bash-4.2$ source activate ficle
  2. Add the FICLE scripts to your PATH:
-bash-4.2$ FICLE_ROOT=<path/to/cloned/github/FICLE/>
-bash-4.2$ export PATH=$PATH:${FICLE_ROOT}
-bash-4.2$ export PATH=$PATH:${FICLE_ROOT}/reference

FICLE arguments and usage

FICLE accepts the following arguments:

usage: [-h] [-n GENENAME] [-r REFERENCE] [-b INPUT_BED]
                [-g INPUT_GTF] [-c INPUT_CLASS] [--cpat CPAT] [-o OUTPUT_DIR]

Full Isoform Characterisation from (Targeted) Long-read Experiments

optional arguments:
  -h, --help            show this help message and exit
  -n GENENAME, --genename GENENAME
                        Target gene symbol
  -r REFERENCE, --reference REFERENCE
                        Gene reference annotation (<gene>_gencode.gtf)
  -b INPUT_BED, --input_bed INPUT_BED
                        Input bed file of all the final transcripts in long-
                        read derived transcriptome.
  -g INPUT_GTF, --input_gtf INPUT_GTF
                        Input gtf file of all the final transcripts in long-
                        read derived transcriptome.
  -c INPUT_CLASS, --input_class INPUT_CLASS
                        SQANTI classification file
  --cpat CPAT           File generated from CPAT
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                        Output path for the annotation and associated files
  -v, --version         Display program version number.

Mandatory arguments
  1. --genename : the target gene symbol of interest (e.g. App/APP); it must match the associated_gene column of the output SQANTI classification file exactly, including case
  2. --reference : target gene reference gtf (see Pre-requisite 1)
  3. --input_bed : long-read transcriptome sorted bed file (see Pre-requisite 4)
  4. --input_gtf : long-read gtf (from SQANTI3 filtering)
  5. --input_class : SQANTI filtering classification file (see Pre-requisite 2)
  6. --output_dir : path to output directory
Optional arguments
  1. --cpat : CPAT output file (see Pre-requisite 3)
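
Because --genename has to match the associated_gene column, it can be worth checking the classification file first. The sketch below is a hedged illustration rather than FICLE code: gene_name_matches is a hypothetical helper, and it assumes the classification file is tab-separated with a header row containing associated_gene. It returns how often each case variant of the requested symbol occurs, so a mismatch such as App versus APP is easy to spot.

```python
import csv
from collections import Counter


def gene_name_matches(classification_path, gene_name):
    """Count associated_gene values equal to gene_name, ignoring case.

    The keys of the returned Counter are the spellings found in the file.
    """
    counts = Counter()
    with open(classification_path, newline="") as handle:
        # Assumption: tab-separated file with a header row.
        reader = csv.DictReader(handle, delimiter="\t")
        if "associated_gene" not in (reader.fieldnames or []):
            raise ValueError("no associated_gene column in " + classification_path)
        for row in reader:
            name = row["associated_gene"] or ""
            if name.lower() == gene_name.lower():
                counts[name] += 1
    return counts
```

If the spelling given to --genename is missing from the result while another variant is present, use the variant found in the file.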

Usage example

To characterise Trem2 using FICLE, pass the following arguments to the FICLE script:
--genename=Trem2 \
    --reference=<path/to/gencode_reference.gtf> \
    --input_bed=<path/to/longRead_sorted.bed12> \
    --input_gtf=<path/to/longRead.gtf> \
    --input_class=<path/to/SQANTI_classification.txt> \
    --cpat=<path/to/> \
    --output_dir=<path/to/output/directory>
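
When FICLE is run for several genes, the argument list can be assembled programmatically. The following is a rough sketch under stated assumptions: build_ficle_args is a hypothetical helper, not part of FICLE, and it only builds the documented long options in --flag=value form. The path to the FICLE script itself is not assumed here and must be supplied by the caller.

```python
def build_ficle_args(genename, reference, input_bed, input_gtf, input_class,
                     output_dir, cpat=None):
    """Return the FICLE command-line arguments as a list of strings."""
    args = [
        "--genename=" + genename,
        "--reference=" + reference,
        "--input_bed=" + input_bed,
        "--input_gtf=" + input_gtf,
        "--input_class=" + input_class,
        "--output_dir=" + output_dir,
    ]
    # --cpat is the only optional argument described above.
    if cpat is not None:
        args.append("--cpat=" + cpat)
    return args
```

The resulting list can be appended to the script path and passed to subprocess.run, for example subprocess.run([ficle_script] + args, check=True), where ficle_script is a variable holding the location of the FICLE script in your clone.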