A collection of scripts I have written and found useful for various bioinformatics tasks.
HHpred_automate.py
HHpred_automate.py -- Automates HHpred analysis for a set of protein fasta sequences
cat_fastq_index.py
cat_fastq_index.py -- Join read sample index onto read ID in a fastq file
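As a rough illustration of the idea (not the script's actual code), the sketch below appends the sequence from a matching index read to each read ID. It assumes the reads and index reads are in two parallel fastq files in the same order; the function names and the `:` separator are hypothetical.

```python
from itertools import islice


def fastq_records(lines):
    """Yield (header, sequence, plus, quality) tuples from fastq lines."""
    it = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in islice(it, 4)]
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated fastq record")
        yield tuple(record)


def join_index_to_id(read_lines, index_lines, sep=":"):
    """Yield fastq records whose read ID has the index sequence appended.

    Assumes both inputs list the same reads in the same order.
    """
    pairs = zip(fastq_records(read_lines), fastq_records(index_lines))
    for (header, seq, plus, qual), index in pairs:
        # Keep any comment after the first space in the header untouched.
        read_id, _, comment = header.partition(" ")
        new_header = f"{read_id}{sep}{index[1]}"
        if comment:
            new_header += " " + comment
        yield "\n".join((new_header, seq, plus, qual))
```

Both arguments can be open file handles or lists of lines, so the same function works on small in-memory examples.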
generate_project_dir.py
generate_project_dir.py -- Set up the folder structure for a new project directory
organise_files.sh
organise_files.sh -- Re-arrange files for easier processing
design_sgRNA.py
design_sgRNA.py -- Design sgRNA for oligo synthesis based on several metrics
generate_sbatch.py
generate_sbatch.py -- Generate a boilerplate sbatch file for running jobs on the SLURM scheduler
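For orientation, a boilerplate sbatch file could be generated along these lines. This is a minimal sketch, not the script's actual output: the function name `make_sbatch`, its parameters, and the default resource values are assumptions, and only standard `#SBATCH` directives are used.

```python
def make_sbatch(job_name, command, cpus=1, mem="4G", time="01:00:00"):
    """Return the text of a simple sbatch script (hypothetical defaults)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem}",
        f"#SBATCH --time={time}",
        # %j is replaced by the job ID when SLURM writes the log file.
        f"#SBATCH --output={job_name}_%j.log",
        "",
        "set -euo pipefail",
        "",
        command,
    ]
    return "\n".join(lines) + "\n"
```

The returned string can be written to a file and submitted with `sbatch`.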
rename_files.sh
rename_files.sh -- Batch rename files
arrange_fastq.sh
dl-sra-fastqs.sh
dl-sra-fastqs.sh -- Download fastq files from SRA
get_genes_within_regions.R
get_genes_within_regions.R -- Download genes within regions of a BED file
split-multifasta.pl
split-multifasta.pl -- Split a multi-fasta file by sequence
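The splitting logic can be sketched as follows. Note that this is Python rather than the Perl the script is written in, and it is only an illustration: the function name and the choice to identify each record by the first word of its header are assumptions.

```python
def split_multifasta(lines):
    """Yield (sequence_id, record_text) for each record in a multi-fasta."""
    seq_id, chunk = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if seq_id is not None:
                yield seq_id, "\n".join(chunk) + "\n"
            seq_id = (line[1:].split() or ["unnamed"])[0]
            chunk = [line]
        elif seq_id is not None and line:
            chunk.append(line)
    if seq_id is not None:
        yield seq_id, "\n".join(chunk) + "\n"
```

Each yielded `record_text` could then be written to its own file, for example one named after `sequence_id`.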
aspera_dl.py
aspera_dl.py -- Download files from SRA using Aspera
extract_fasta_records.py
extract_fasta_records.py -- Extract records from fasta files
match_pairs.sh
match_pairs.sh -- Match pairs of sequences (e.g. Read1 Read2 fastq files) into a csv
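One way to picture the pairing step is shown below, in Python rather than shell. It is a sketch under an assumed naming convention (`_R1` / `_R2` followed by `_` or `.`); the function names and the csv column names are hypothetical.

```python
import csv
import io
import re

# Assumed convention: the read number appears as _R1 or _R2 before "_" or ".".
PAIR_TAG = re.compile(r"_R([12])(?=[_.])")


def match_pairs(filenames):
    """Return sorted (sample, read1, read2) tuples for complete pairs only."""
    pairs = {}
    for name in sorted(filenames):
        m = PAIR_TAG.search(name)
        if m is None:
            continue
        sample = name[:m.start()]
        pairs.setdefault(sample, {})[m.group(1)] = name
    return [
        (sample, reads["1"], reads["2"])
        for sample, reads in sorted(pairs.items())
        if "1" in reads and "2" in reads
    ]


def pairs_to_csv(rows):
    """Serialise the matched pairs as csv text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["sample", "read1", "read2"])
    writer.writerows(rows)
    return buf.getvalue()
```

Files without a mate, or without a recognisable read tag, are skipped rather than reported.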
test_indexes.R
test_indexes.R -- Determine index clashes from a list of sequences
aspera_dl.sh
aspera_dl.sh -- Download files from SRA using Aspera
filter_tsv.sh
filter_tsv.sh -- Filter tsv file based on your criteria
merge_bigwig.sh
merge_bigwig.sh -- Merge multiple bigwigs into a single file
calculate_index_distance.R
calculate_index_distance.R -- Plot a heatmap of pairwise Hamming distances between a list of indexes
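The quantity behind the heatmap can be sketched in a few lines. This example is in Python rather than R, leaves out the plotting, and assumes all indexes have the same length; the function names are illustrative.

```python
from itertools import combinations


def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("indexes must be the same length")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(indexes):
    """Return a symmetric matrix of pairwise Hamming distances."""
    n = len(indexes)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = hamming(indexes[i], indexes[j])
    return matrix
```

Small off-diagonal values flag index pairs that a few sequencing errors could confuse.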
generate_bash_runner.py
generate_bash_runner.py -- Generate a generic launcher script for batch-launching samples on the SLURM scheduler
oligo_calc.py
oligo_calc.py -- Calculate oligo properties based on the 5' to 3' sequence
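Which properties the script reports is not stated here, so the sketch below only shows a few common ones as an assumption: length, GC content, reverse complement, and a melting temperature from the Wallace rule (2 °C per A/T plus 4 °C per G/C, a rough estimate suited to short oligos). The function name and dictionary keys are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def oligo_properties(seq):
    """Return basic properties of a DNA oligo given 5' to 3'."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    gc = seq.count("G") + seq.count("C")
    at = len(seq) - gc
    return {
        "length": len(seq),
        "gc_percent": 100.0 * gc / len(seq),
        "reverse_complement": seq.translate(COMPLEMENT)[::-1],
        # Wallace rule: Tm = 2 * (A + T) + 4 * (G + C), in degrees Celsius.
        "tm_wallace_c": 2 * at + 4 * gc,
    }
```

For longer oligos a nearest-neighbour model would give a better melting temperature than the Wallace rule used here.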