Physlr constructs a de novo physical map using linked reads from 10X Genomics or stLFR. This physical map can then be used to scaffold an existing assembly to yield chromosome-level contiguity.
We recommend using pypy3 rather than regular python3 for speed.
pip3 install --user git+https://github.com/bcgsc/physlr
git clone https://github.com/bcgsc/physlr
cd physlr/src && make install
To install Physlr in a specified directory:
pip3 install --user git+https://github.com/bcgsc/physlr
git clone https://github.com/bcgsc/physlr
cd physlr/src && make install PREFIX=/opt/physlr
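After installing into a custom prefix, the executables usually need to be reachable from your shell. The following is a minimal sketch, assuming that `make install PREFIX=/opt/physlr` places executables under `$PREFIX/bin`; that layout is a common convention and is not confirmed here, so adjust the path if your install differs.

```shell
# Assumption: executables are installed under $PREFIX/bin (common convention, not confirmed).
PREFIX=/opt/physlr

# Prepend $PREFIX/bin to PATH only if it is not already there.
case ":$PATH:" in
  *":$PREFIX/bin:"*) ;;                          # already on PATH, nothing to do
  *) PATH="$PREFIX/bin:$PATH"; export PATH ;;
esac

# Report whether physlr-make can now be resolved, without failing if it cannot.
if command -v physlr-make >/dev/null 2>&1; then
  echo "physlr-make found at $(command -v physlr-make)"
else
  echo "physlr-make not found on PATH; check the install prefix"
fi
```

To make the change permanent, the same lines can go in your shell startup file.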
To construct a physical map, you need linked reads from 10X Genomics or stLFR. In addition, to visualize the correctness and contiguity of the physical map, you need a reference genome.
In this example, the linked reads and reference genome are called linkedreads.fq.gz and reference.fa, respectively. The linked reads are from stLFR, so we specify minimizer_overlap=stLFR to use the default value for stLFR reads.
cd experiment
bin/physlr-make physical-map lr=linkedreads ref=reference minimizer_overlap=stLFR
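A missing input file is easier to diagnose before a long run starts. The sketch below defines `check_physical_map_inputs`, a hypothetical helper that is not part of Physlr. It assumes that `lr=NAME` refers to `NAME.fq.gz` and `ref=NAME` refers to `NAME.fa` in the current directory, as the file names in the example suggest.

```shell
# Hypothetical helper (not part of Physlr): report missing inputs for a physical-map run.
# Assumes lr=NAME maps to NAME.fq.gz and ref=NAME maps to NAME.fa in the current directory.
check_physical_map_inputs() {
  lr_name=$1
  ref_name=$2
  missing=0
  for f in "$lr_name.fq.gz" "$ref_name.fa"; do
    if [ ! -e "$f" ]; then
      echo "missing input: $f" >&2
      missing=1
    fi
  done
  # Return 0 when every input exists, 1 otherwise.
  return "$missing"
}
```

For example, `check_physical_map_inputs linkedreads reference` returns a non-zero status and names the missing file if either input is absent, so it can be chained in front of the `physlr-make` command with `&&`.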
To scaffold a draft assembly, you need linked reads from 10X Genomics or stLFR, and an existing assembly. In addition, to calculate Quast summary metrics for the Physlr scaffolded assembly, you need a reference genome.
In this example, the linked reads, draft assembly, and reference genome are called linkedreads.fq.gz, draft.fa, and reference.fa, respectively. The linked reads are from 10X Genomics, so we specify minimizer_overlap=10X to use the default value for 10X Genomics reads.
cd experiment
bin/physlr-make scaffolds lr=linkedreads ref=reference draft=draft minimizer_overlap=10X
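If your data files have other names or live elsewhere, one option is to symlink them into the working directory under the names used in the example. The sketch below is an illustration only: `abspath` and `stage_inputs` are hypothetical helpers, and it assumes that `physlr-make` looks for linkedreads.fq.gz, draft.fa, and reference.fa in the directory it runs from.

```shell
# Hypothetical helpers (not part of Physlr).
# Assumption: physlr-make expects the inputs in the directory it is run from.

# Print an absolute path for an existing file, using only POSIX tools.
abspath() {
  printf '%s/%s\n' "$(cd "$(dirname "$1")" && pwd)" "$(basename "$1")"
}

# stage_inputs WORKDIR READS DRAFT REFERENCE
# Link the three inputs into WORKDIR under the file names used in the example.
stage_inputs() {
  workdir=$1
  mkdir -p "$workdir" || return 1
  ln -sf "$(abspath "$2")" "$workdir/linkedreads.fq.gz" &&
  ln -sf "$(abspath "$3")" "$workdir/draft.fa" &&
  ln -sf "$(abspath "$4")" "$workdir/reference.fa"
}
```

A call such as `stage_inputs experiment reads/sample.fq.gz asm/contigs.fa ref/genome.fa` (the source paths here are placeholders) leaves the originals untouched and creates only symlinks in `experiment`.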
See the help page for further information.
bin/physlr-make help
- lr.physlr.physical-map.path: Paths of barcodes (backbones).
- lr.physlr.physical-map.ref.n10.paf.gz.*.pdf: Various graphs showing the contiguity and correctness of the backbones with respect to the reference.
- draft.physlr.fa: Physlr scaffolded assembly using the physical map.
- draft.physlr.quast.tsv: Quast metrics comparing the Physlr scaffolded assembly against the reference.
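After a run, it can help to confirm which of the output files listed above were produced. The sketch below defines `report_outputs`, a hypothetical helper that is not part of Physlr. It assumes the `lr` and `draft` prefixes in the listed file names correspond to the values passed as `lr=` and `draft=`; if your run names its files differently, pass the prefixes you actually see.

```shell
# Hypothetical helper (not part of Physlr): report which documented outputs exist.
# report_outputs [LR_PREFIX] [DRAFT_PREFIX]; the defaults are the literal names in the list above.
report_outputs() {
  lr_prefix=${1:-lr}
  draft_prefix=${2:-draft}
  for pattern in \
    "$lr_prefix.physlr.physical-map.path" \
    "$lr_prefix.physlr.physical-map.ref.n10.paf.gz.*.pdf" \
    "$draft_prefix.physlr.fa" \
    "$draft_prefix.physlr.quast.tsv"
  do
    found=0
    for f in $pattern; do            # unquoted on purpose so the * pattern expands
      if [ -e "$f" ]; then
        echo "found: $f"
        found=1
      fi
    done
    [ "$found" -eq 1 ] || echo "missing: $pattern"
  done
}
```

Run it from the working directory, for example `report_outputs linkedreads draft`, to see one `found:` or `missing:` line per expected output.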
This project uses:
- btl_bloomfilter: BTL C/C++ common Bloom filters for bioinformatics projects, implemented by Justin Chu
- nthash: rolling hash implementation by Hamid Mohamadi
- readfq: fast multi-line FASTA/Q reader API, implemented by Heng Li
- robin-map: C++ implementation of a fast hash map and hash set using robin hood hashing, by Thibaut G.