Name		Name	Last commit message	Last commit date
parent directory ..
PacBio_amplicon.gb		PacBio_amplicon.gb
PacBio_feature_parse_specs.yaml		PacBio_feature_parse_specs.yaml
PacBio_runs.csv		PacBio_runs.csv
README.md		README.md
barcode_runs.csv		barcode_runs.csv
neutralization_standard_barcodes.csv		neutralization_standard_barcodes.csv
polyclonal_config.yaml		polyclonal_config.yaml
reference_site_to_region.csv		reference_site_to_region.csv

README.md

Input data

PacBio full-length variant sequencing

PacBio_runs.csv contains information on PacBio runs linking the barcodes to the mutations. It has the following columns:

library: name of the library sequenced
run: date of the pacbio library submission (use this date to refer to experimental notebook)
fastq: FASTQ file from running CCS

The amplicons itself is defined in PacBio_amplicon.gb, and PacBio_feature_parse_specs.yaml specifies how to parse that with alignparse.

Illumina barcode sequencing

barcode_runs.csv has the Illumina barcode-sequencing runs used to count barcodes in different conditions. It describes the samples, which should be named as clearly as possible. It has the following columns:

date: date experiment was performed in YYYY-MM-DD encoding.
virus_batch: batch of virus used for the experiment.
library: which virus library was used.
sample_type: can be one of the following:
- VSVG_control: entry mediated by VSVG
- no-antibody_control: entry mediated by VEP of interest
- antibody: encompasses sera and antibodies
antibody: name of the antibody if this sample has sample_type of antibody
antibody_concentration: concentration of antibody if this sample has sample_type of antibody. For sera, should be a fraction < 1 giving dilution (not a dilution factor).
replicate: experimental replicate.
fastq_R1: path to R1 FASTQ file, or semi-colon de-limited list of multiple FASTQs
exclude_after_counts: set to yes if barcode run should be excluded after counting barcodes
notes: any other notes about the sample.

The file neutralization_standard_barcodes.csv has the barcodes for the neutralization standard that is not expected to be neutralized by the sera or antibodies.

Assign sites to protein regions (domains)

reference_site_to_region.csv assigns sites in the protein to a region (protein domain).

Configuration for `polyclonal` analysis

polyclonal_config.yaml specifies how the analysis with polyclonal is done. For each antibody listed in barcode_runs.csv, specify:

max_epitopes: the maximum number of epitopes to test. The fitting keeps testing more epitopes up to this max until additional epitopes have no additional value.
n_bootstrap_samples: number of bootstrap samples to use.
min_epitope_activity_to_include: keep adding epitopes until activity <= this.
fit_kwargs: keyword arguments passed to Polyclonal.fit
plot_kwargs: keyword arguments passed to Polyclonal.plot_mut_escape.

Validation neutralization assays

validation_IC50s.csv contains mutations for which we want to calculate IC50s using the polyclonal fits and compare to measured values in validation experiments. Required columns are "antibody", "aa_substitutions", "measured IC50". If multiple mutations, they should be space delimited in "aa_substitutions" columns.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

data

data

README.md

Input data

PacBio full-length variant sequencing

Illumina barcode sequencing

Assign sites to protein regions (domains)

Configuration for `polyclonal` analysis

Validation neutralization assays

Files

data

Directory actions

More options

Directory actions

More options

Latest commit

History

data

Folders and files

parent directory

README.md

Input data

PacBio full-length variant sequencing

Illumina barcode sequencing

Assign sites to protein regions (domains)

Configuration for polyclonal analysis

Validation neutralization assays

Configuration for `polyclonal` analysis