Use multiset jaccard #1106
Workflow file for this run
on:
  push:
    branches: [ main ]
    paths-ignore:
      - '**/*.md'
  pull_request:
    branches: [ main ]
    paths-ignore:
      - '**/*.md'
name: build and test
jobs:
  build_and_test:
    runs-on: ubuntu-20.04
    steps:
      - uses: actions/checkout@v2
      - name: Install required packages
        run: sudo apt-get update && sudo apt-get install -y
          git
          bash
          cmake
          make
          g++
          python3-dev
          libatomic-ops-dev
          autoconf
          libgsl-dev
          zlib1g-dev
          libdeflate-dev
          libhts-dev
          samtools
          libjemalloc-dev
          bedtools
      - name: Init and update submodules
        run: git submodule update --init --recursive
      - name: Build wfmash
        run: cmake -H. -Bbuild -D CMAKE_BUILD_TYPE=Debug -DWFA_PNG_AND_TSV=ON && cmake --build build -- -j 2
      - name: Test mapping coverage with 8 yeast genomes (PAF output)
        run: ASAN_OPTIONS=detect_leaks=1:symbolize=1 LSAN_OPTIONS=verbosity=0:log_threads=1 build/bin/wfmash data/scerevisiae8.fa.gz -p 95 -n 7 -m -L -Y '#' > scerevisiae8.paf; scripts/test.sh data/scerevisiae8.fa.gz.fai scerevisiae8.paf 0.92
      - name: Test mapping+alignment with a subset of the LPA dataset (PAF output)
        run: ASAN_OPTIONS=detect_leaks=1:symbolize=1 LSAN_OPTIONS=verbosity=0:log_threads=1 build/bin/wfmash data/LPA.subset.fa.gz -n 10 -L > LPA.subset.paf && head LPA.subset.paf
      - name: Test mapping+alignment with a subset of the LPA dataset (SAM output)
        run: ASAN_OPTIONS=detect_leaks=1:symbolize=1 LSAN_OPTIONS=verbosity=0:log_threads=1 build/bin/wfmash data/LPA.subset.fa.gz -N -a -L > LPA.subset.sam && samtools view LPA.subset.sam -bS | samtools sort > LPA.subset.bam && samtools index LPA.subset.bam && samtools view LPA.subset.bam | head | cut -f 1-9
      - name: Test mapping+alignment with short reads (500 bps) to a reference (SAM output)
        run: ASAN_OPTIONS=detect_leaks=1:symbolize=1 LSAN_OPTIONS=verbosity=0:log_threads=1 build/bin/wfmash data/reference.fa.gz data/reads.500bps.fa.gz -s 0.5k -N -a > reads.500bps.sam && samtools view reads.500bps.sam -bS | samtools sort > reads.500bps.bam && samtools index reads.500bps.bam && samtools view reads.500bps.bam | head
      - name: Test mapping+alignment with short reads (255bps) (PAF output)
        run: ASAN_OPTIONS=detect_leaks=1:symbolize=1 LSAN_OPTIONS=verbosity=0:log_threads=1 build/bin/wfmash data/reads.255bps.fa.gz -w 16 -s 100 -L > reads.255bps.paf && head reads.255bps.paf
      - name: Install Rust and Cargo
        uses: actions-rs/toolchain@v1
        with:
          toolchain: stable
          override: true
      - name: Install wgatools
        run: cargo install --git https://github.com/wjwei-handsome/wgatools.git
      - name: Install pafcheck
        run: cargo install --git https://github.com/ekg/pafcheck.git
      - name: Run wfmash and generate PAF
        run: build/bin/wfmash -t 8 -T SGDref -Q S288C -Y '#' data/scerevisiae8.fa.gz > test.paf
      - name: check PAF coordinates and extended CIGAR validity
        run: pafcheck --query-fasta data/scerevisiae8.fa.gz --paf test.paf
      - name: Convert PAF to MAF using wgatools
        run: wgatools paf2maf --target data/scerevisiae8.fa.gz --query data/scerevisiae8.fa.gz test.paf > test.maf
      - name: Check if MAF file is not empty
        run: |
          if [ -s test.maf ]; then
            echo "MAF file is not empty. Test passed."
          else
            echo "MAF file is empty. Test failed."
            exit 1
          fi