for f in envs/*.yaml; do mamba env create -f "$f"; done
- Use snakemake_PMC/ to download files.
snakemake --cores 20 --use-conda -s snakemake_PMC/Snakefile
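Before committing 20 cores to a long download, it can help to preview what the workflow would do. The sketch below wraps Snakemake's dry-run flag (`-n`) in a small function. `preview_workflow` is a hypothetical name, not part of this repository, and the default of 20 cores simply mirrors the command above.

```shell
# Hypothetical helper (not part of this repository).
# Prints the jobs Snakemake would schedule for a workflow without running them.
preview_workflow() {
    local snakefile="$1"
    local cores="${2:-20}"
    # -n is Snakemake's dry-run flag; -p additionally prints each job's shell command.
    snakemake -n -p --cores "$cores" --use-conda -s "$snakefile"
}
```

For example, `preview_workflow snakemake_PMC/Snakefile` lists the download jobs, and `preview_workflow ner.smk 4` previews the NER training workflow with 4 cores.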
- Prepare files with the scripts/make_test_corpus.py script and save them to the corpus/ directory.
- Adjust the GPU cores used for training in the ner.smk file.
- Adjust config.yaml to set the labels to train on and the number of epochs.
rm -rf NER*; snakemake --cores 20 --use-conda -s ner.smk
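To decide which GPUs to enter in ner.smk, it helps to see which devices are currently idle. This is a minimal sketch, assuming NVIDIA GPUs with `nvidia-smi` on the PATH. `idle_gpus` and its 100 MiB default threshold are hypothetical choices, not part of this repository.

```shell
# Hypothetical helper (not part of this repository).
# Prints the index of every GPU whose used memory is below a threshold in MiB.
idle_gpus() {
    local max_used_mib="${1:-100}"
    # With csv,noheader,nounits each output line looks like "0, 35" (index, used MiB).
    nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits |
        awk -F', *' -v max="$max_used_mib" '$2 + 0 < max + 0 { print $1 }'
}
```

Calling `idle_gpus` prints one GPU index per line. `idle_gpus 1000` relaxes the threshold to 1000 MiB.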
- Adjust config.yaml to set the labels to train on and the number of epochs.
- Adjust the GPU cores used for training in the rel.smk file.
rm -rf REL*; snakemake --cores 20 --use-conda -s rel.smk
snakemake --cores 20 --use-conda -s ner_pred.smk
snakemake --cores 20 --use-conda -s rel_pred.smk
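The training and prediction commands above can be chained so that a failure in one stage stops the rest. This is a sketch that assumes the stages are meant to run in the order listed (NER training, REL training, NER prediction, REL prediction). `train_and_predict` is a hypothetical name, not part of this repository.

```shell
# Hypothetical wrapper (not part of this repository).
# Clears old NER*/REL* outputs, trains both models, then runs both prediction
# workflows. The && chain stops at the first command that fails.
train_and_predict() {
    local cores="${1:-20}"
    rm -rf NER* && snakemake --cores "$cores" --use-conda -s ner.smk &&
    rm -rf REL* && snakemake --cores "$cores" --use-conda -s rel.smk &&
    snakemake --cores "$cores" --use-conda -s ner_pred.smk &&
    snakemake --cores "$cores" --use-conda -s rel_pred.smk
}
```

Run it from the repository root as `train_and_predict` (20 cores) or `train_and_predict 8`. Note that, like the original commands, it deletes everything matching `NER*` and `REL*` in the current directory.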
- Run ip.smk to download and annotate assemblies.
- Use scripts/ip_slurm.sh to run it with Slurm.
- Run the xgboost.smk Snakemake pipeline; it is configured in the config.yaml file.
snakemake --cores 40 --use-conda -s xgboost.smk
- Then analyze with the notebooks analyze_xgboost_binary.ipynb and analyze_xgboost_binary_gain.ipynb for binary classification with weight or gain as the metric, respectively.
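The notebooks can also be executed without opening Jupyter, which is convenient on a remote machine. This is a sketch, assuming `jupyter nbconvert` is available in the active environment. `run_notebooks` is a hypothetical helper name, not part of this repository. By default nbconvert writes the executed copy as `<name>.nbconvert.ipynb` beside the input.

```shell
# Hypothetical helper (not part of this repository).
# Executes each notebook given as an argument, stopping at the first failure.
run_notebooks() {
    local nb
    for nb in "$@"; do
        # --execute runs all cells; --to notebook keeps the .ipynb format.
        jupyter nbconvert --to notebook --execute "$nb" || return 1
    done
}
```

For example, `run_notebooks analyze_xgboost_binary.ipynb analyze_xgboost_binary_gain.ipynb` executes both XGBoost notebooks in turn. The same call works for analyze_evolution.ipynb.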
- Create the evolution dataset using scripts/create_evolution_dataset.py.
- Run evolution.smk to make alignments and calculate selective pressures.
snakemake --cores 20 --use-conda -s evolution.smk
- Analyze with analyze_evolution.ipynb.