shiquan edited this page Nov 20, 2020 · 42 revisions

PISA is a suite of programs for processing and interacting with single-cell high-throughput sequencing data. It aims to convert different kinds of single-cell data into a universal file format with high performance. It is flexible and NOT designed for a specific library or platform. Users can combine it with current state-of-the-art software to normalize and analyze single-cell sequencing data.

INSTALL

$ git clone https://github.com/shiquan/PISA
$ cd PISA
$ make

SYNOPSIS

PISA parse -config read_structure.json -1 reads.fq -report fastq_report.csv reads_1.fq.gz reads_2.fq.gz

PISA sam2bam -report alignment_report.csv in.sam -o out.bam

PISA anno -gtf refdata-cellranger-GRCh38-3.0.0/genes/genes.gtf -o anno.bam -@ 5 -report anno_report.csv aln.bam

PISA corr -tag UR -new-tag UB -tags-block CB,GN -cr -o final.bam -@ 5 anno.bam

PISA count -tag CB -anno-tag GN -umi UB -outdir raw_gene_expression -@ 5 final.bam
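The five commands above can be chained in a small wrapper script. This is a minimal sketch, not part of PISA: the `run` helper, the `pisa_pipeline` function, and the `DRY_RUN` variable are hypothetical names introduced here for illustration. The commands and file names are copied verbatim from the synopsis and are placeholders. The synopsis does not show how `in.sam` is produced, so that step is left as a comment. By default the script only prints each command.

```shell
#!/usr/bin/env bash
set -euo pipefail

# DRY_RUN=1 (the default in this sketch) prints each command without running it.
# Set DRY_RUN=0 to actually execute the commands (requires PISA on PATH).
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo a command, then run it unless in dry-run mode.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Hypothetical wrapper around the synopsis commands, in the order listed there.
pisa_pipeline() {
    run PISA parse -config read_structure.json -1 reads.fq -report fastq_report.csv reads_1.fq.gz reads_2.fq.gz

    # Not shown in the synopsis: producing in.sam from reads.fq,
    # presumably with an external aligner of your choice.

    run PISA sam2bam -report alignment_report.csv in.sam -o out.bam
    run PISA anno -gtf refdata-cellranger-GRCh38-3.0.0/genes/genes.gtf -o anno.bam -@ 5 -report anno_report.csv aln.bam
    run PISA corr -tag UR -new-tag UB -tags-block CB,GN -cr -o final.bam -@ 5 anno.bam
    run PISA count -tag CB -anno-tag GN -umi UB -outdir raw_gene_expression -@ 5 final.bam
}

pisa_pipeline
```

In a real run, each step's input file would presumably be the previous step's output, so adjust the placeholder file names before setting `DRY_RUN=0`.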

CHANGELOG

  • v0.7 2020/11/20

    • Introduce the PCR deduplication method rmdup.
    • Mask read and qual field as * instead of sequence for secondary alignments in the BAM file.
  • v0.6 2020/10/29

    • PISA attrcnt: skip secondary alignments before counting reads.
    • PISA anno: fix segmentation fault when loading a malformed GTF.
  • v0.5 2020/08/27

    • Add PISA bam2frag function (experimental).
    • PISA anno: skip secondary alignments when counting total reads.
  • v0.4 2020/07/14

    • PISA sam2bam: add a mapping quality adjustment method.
    • Rewrite the UMI correction index structure to reduce memory use.
    • Fix bugs.
  • v0.4alpha 2020/05/2

    • PISA anno: use the UCSC binning scheme instead of a linear search when querying gene regions for reads. Fix a bug that misannotated antisense reads.
    • PISA count: use MEX output instead of a plain cell-vs-gene table.
  • v0.3 2020/03/26

    • Fix bugs and improve performance.
  • 0.0.0.9999 2019/05/19

    • Init.