From 44cf2dd4bb9a6c8aa4d8ad66aace45cf52358ead Mon Sep 17 00:00:00 2001
From: Jesse Bloom
Date: Wed, 18 Dec 2024 09:36:39 -0800
Subject: [PATCH] add citation and clean up README (#74)

* add citation and clean up README

* re-run pipeline
---
 README.md | 11 +-
 config.yaml | 9 +-
 docs/index.html | 5 +-
 docs/notebooks/analyze_variant_counts.html | 8952 +++++++++++++++++
 ...nfigure_dms_viz_domainIII_CR57_escape.html | 22 +-
 ..._viz_extended_intermediate_cell_entry.html | 30 +-
 ...figure_dms_viz_ph_domain_RVC20_escape.html | 30 +-
 ...nfigure_dms_viz_prefusion_17C7_escape.html | 30 +-
 ...igure_dms_viz_prefusion_RVA122_escape.html | 38 +-
 ...configure_dms_viz_prefusion_ab_escape.html | 30 +-
 ...onfigure_dms_viz_prefusion_cell_entry.html | 22 +-
 homepage/index.md | 4 +-
 homepage/public/appendix.html | 5 +-
 .../notebooks/analyze_variant_counts.html | 8952 +++++++++++++++++
 ...nfigure_dms_viz_domainIII_CR57_escape.html | 22 +-
 ..._viz_extended_intermediate_cell_entry.html | 30 +-
 ...figure_dms_viz_ph_domain_RVC20_escape.html | 30 +-
 ...nfigure_dms_viz_prefusion_17C7_escape.html | 30 +-
 ...igure_dms_viz_prefusion_RVA122_escape.html | 38 +-
 ...configure_dms_viz_prefusion_ab_escape.html | 30 +-
 ...onfigure_dms_viz_prefusion_cell_entry.html | 22 +-
 results/dms-viz/domainIII_CR57_escape.json | 2 +-
 .../extended_intermediate_cell_entry.json | 2 +-
 results/dms-viz/ph_domain_RVC20_escape.json | 2 +-
 results/dms-viz/prefusion_17C7_escape.json | 2 +-
 results/dms-viz/prefusion_RVA122_escape.json | 2 +-
 results/dms-viz/prefusion_ab_escape.json | 2 +-
 results/dms-viz/prefusion_cell_entry.json | 2 +-
 28 files changed, 18258 insertions(+), 98 deletions(-)
 create mode 100644 docs/notebooks/analyze_variant_counts.html
 create mode 100644 homepage/public/notebooks/analyze_variant_counts.html

diff --git a/README.md b/README.md
index 83efbef..c855f47 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,6 @@
-# Deep mutational scanning of the Rabies glycoprotein (G) Pasteur Strain using a barcoded pseudotyped lentiviral platform
+# Pseudovirus deep mutational scanning of the rabies glycoprotein (G) from the Pasteur strain
 Study by Arjun Aditham, Caelan Radford, Caleb Carr, and Jesse Bloom.
+Please see [Aditham et al (2024)](https://www.biorxiv.org/content/10.1101/2024.12.17.628970v1) for full details about the study.

 This repo contains data and analyses from deep mutational scanning experiments on the Rabies glycoprotein (G).
 All experiments were performed on the Pasteur strain of rabies [NC_001542.1](https://www.ncbi.nlm.nih.gov/nuccore/NC_001542.1).
@@ -42,9 +43,9 @@ Due to space, only some results are tracked. For those that are not, see the [.g
 The pipeline builds HTML documentation for the pipeline in [./docs/](docs), and a nicely formatted set is put in [./homepage/](homepage). These docs are rendered for viewing at [https://dms-vep.org/RABV_Pasteur_G_DMS/](https://dms-vep.org/RABV_Pasteur_G_DMS/) as stated above.

 ### Non-pipeline analyses
-Additional analyses run outside the core pipeline are in [./non-pipeline_analyses/](non-pipeline_analyses), and are described by README files within that subdirectory.
-
-[./Additional_Notebooks](https://github.com/dms-vep/RABV_Pasteur_G_DMS/tree/main/non-pipeline_analyses/Additional_Notebooks) contains notebooks and raw for most of the figures in the manuscript.
+Additional analyses run outside the core pipeline are in [./non-pipeline_analyses/](non-pipeline_analyses), and are described by README files within that subdirectory:
+ - [./non-pipeline_analyses/Additional_Notebooks](non-pipeline_analyses/Additional_Notebooks) contains notebooks and raw for most of the figures in the manuscript.
+ - [./non-pipeline_analyses/RABV_nextstrain](non-pipeline_analyses/RABV_nextstrain) contains notebooks and raw for most of the figures in the manuscript.

 ## Running the pipeline
 To run the pipeline, build the conda environment `dms-vep-pipeline-3` in the `environment.yml` file of [dms-vep-pipeline-3](https://github.com/dms-vep/dms-vep-pipeline-3), activate it, and run [snakemake](https://snakemake.readthedocs.io/), such as:
@@ -55,3 +56,5 @@ To run the pipeline, build the conda environment `dms-vep-pipeline-3` in the `en
 To run on the Hutch cluster via [slurm](https://slurm.schedmd.com/), you can run the file [run_Hutch_cluster.bash](run_Hutch_cluster.bash):

     sbatch -c 32 run_Hutch_cluster.bash
+
+Note that if you are just cloning this repo and want to re-run it without having to obtain and re-parse all the FASTQ files, you can use the pre-existing barcode count files by setting the `use_precomputed_barcode_counts` key in [config.yaml](config.yaml) to `true`. If you are running the pipeline not on the Fred Hutch server with the FASTQs, this is the recommended approach (otherwise you will need to download the FASTQs and re-assign the paths in `barcode_runs`).
diff --git a/config.yaml b/config.yaml
index 181763a..7443799 100644
--- a/config.yaml
+++ b/config.yaml
@@ -29,7 +29,14 @@ github_blob_url: https://github.com/dms-vep/RABV_Pasteur_G_DMS/blob/main
 # Some descriptions and metadata about the analysis.
 description: Deep mutational scanning of rabies G (Pasteur strain)
 year: 2024
-authors: Arjun Aditham, Caelan Radford, Caleb Carr, and Jesse Bloom
+authors: "[Aditham et al](https://www.biorxiv.org/content/10.1101/2024.12.17.628970v1)"
+
+# ----------------------------------------------------------------------------
+# Set the `use_precomputed_barcode_counts` option to `true` if you want to
+# re-run this pipeline from the barcode counts already calculated from the
+# FASTQs rather than re-running the barcode counting.
+# ----------------------------------------------------------------------------
+use_precomputed_barcode_counts: false

 # ----------------------------------------------------------------------------
 # Site numbering, mutation classification, and neut standards
diff --git a/docs/index.html b/docs/index.html
index be26a28..9502020 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -1,5 +1,5 @@
 Deep mutational scanning of rabies G (Pasteur strain)
-Analysis by Arjun Aditham, Caelan Radford, Caleb Carr, and Jesse Bloom (2024)
+Analysis by Aditham et al (2024)
 See https://github.com/dms-vep/RABV_Pasteur_G_DMS for full code.
 Contents
 Count barcodes for variants
 Analysis notebooks
+Data files
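
The README change in this patch recommends setting `use_precomputed_barcode_counts` to `true` in `config.yaml` when re-running the pipeline without the FASTQ files. Below is a minimal sketch of flipping that key while leaving the file's comments untouched. It assumes the key sits on its own top-level line, as in the `config.yaml` hunk of this patch. The helper name `set_precomputed_counts` is hypothetical and not part of the pipeline.

```python
import re

# Hypothetical helper (not part of dms-vep-pipeline-3): rewrite only the value
# of `use_precomputed_barcode_counts`, leaving every other line and comment
# intact. Assumes the key starts at the beginning of a line.
_KEY_PATTERN = re.compile(
    r"^(use_precomputed_barcode_counts:[ \t]*)(\S+)", re.MULTILINE
)


def set_precomputed_counts(config_text: str, enabled: bool) -> str:
    """Return `config_text` with the key set to `true` or `false`."""
    value = "true" if enabled else "false"
    new_text, n_subs = _KEY_PATTERN.subn(
        lambda m: m.group(1) + value, config_text, count=1
    )
    if n_subs == 0:
        raise KeyError("use_precomputed_barcode_counts not found in config text")
    return new_text


# Small in-memory example mirroring the lines added to config.yaml.
example = (
    "year: 2024\n"
    "# re-run this pipeline from the barcode counts already calculated\n"
    "use_precomputed_barcode_counts: false\n"
)
updated = set_precomputed_counts(example, True)
print(updated)
```

To apply this to a checkout, one could read `config.yaml` with `pathlib.Path.read_text`, pass the text through the helper, and write the result back before running snakemake as the README describes.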