Refinements to phylogenetic build [#2]
* Switch Auspice to default color-by=region
* Remove extra `metadata_columns` (and associated config) when calling
  `augur export v2`
* Move `--columns` arg of `augur traits` into config
* Add `--metadata-id-columns` arg to `augur traits` invocation --
  fixes issue where traits were being improperly inferred
* Use a mid-point tree rooting
genehack committed Jul 31, 2024
1 parent 6f2158b commit ebf1fde
Showing 7 changed files with 24 additions and 19 deletions.
10 changes: 8 additions & 2 deletions phylogenetic/defaults/auspice_config.json
@@ -31,6 +31,11 @@
"key": "country",
"title": "Country",
"type": "categorical"
+ },
+ {
+ "key": "host",
+ "title": "Host",
+ "type": "categorical"
}
],
"geo_resolutions": [
@@ -39,13 +44,14 @@
],
"display_defaults": {
"map_triplicate": true,
- "color_by": "clade"
+ "color_by": "region"
},
"filters": [
"clade",
"region",
"country",
- "author"
+ "author",
+ "host"
],
"metadata_columns": [
"author"
4 changes: 2 additions & 2 deletions phylogenetic/defaults/config.yaml
@@ -19,5 +19,5 @@ refine:
clock_filter_iqd: 4
ancestral:
inference: "joint"
- export:
- metadata_columns: "strain division location"
+ traits:
+ columns: "region"
File renamed without changes.
6 changes: 4 additions & 2 deletions phylogenetic/rules/annotate_phylogeny.smk
@@ -57,7 +57,8 @@ rule traits:
output:
node_data = "results/{gene}/traits.json",
params:
- columns = "region"
+ columns = config["traits"]["columns"],
+ strain_id = config["strain_id_field"],
log:
"logs/{gene}/traits.txt",
benchmark:
@@ -66,9 +67,10 @@ rule traits:
"""
augur traits \
--tree {input.tree:q} \
+ --metadata-id-columns {params.strain_id:q} \
--metadata {input.metadata:q} \
--output {output.node_data:q} \
- --columns {params.columns:q} \
+ --columns {params.columns} \
--confidence \
2> {log:q}
"""
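
Note on the `--columns` change above: dropping `:q` matters once `traits.columns` in the config holds more than one space-separated column name. The sketch below is illustrative only; it assumes that Snakemake's `:q` modifier behaves like Python's `shlex.quote`, and `render` is a hypothetical helper, not part of the workflow. It shows how the shell would tokenize the value with and without quoting.

```python
import shlex


def render(columns: str, quoted: bool) -> list[str]:
    """Return the argv a POSIX shell would produce for `--columns <value>`.

    `quoted=True` mimics `{params.columns:q}` (assumed to act like shlex.quote);
    `quoted=False` mimics the bare `{params.columns}` placeholder.
    """
    value = shlex.quote(columns) if quoted else columns
    return shlex.split(f"augur traits --columns {value}")


# Quoted: the whole string reaches the program as a single argument.
print(render("region country", quoted=True))
# Unquoted: the shell splits on whitespace, giving one argument per column.
print(render("region country", quoted=False))
```

With a single column such as `"region"` both forms yield the same argument list, so the difference only shows up when the config value lists several columns.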
2 changes: 2 additions & 0 deletions phylogenetic/rules/construct_phylogeny.smk
@@ -48,11 +48,13 @@ rule refine:
augur refine \
--tree {input.tree:q} \
--alignment {input.alignment:q} \
--root mid_point \
--metadata {input.metadata:q} \
--metadata-id-columns {params.strain_id:q} \
--output-tree {output.tree:q} \
--output-node-data {output.node_data:q} \
--coalescent {params.coalescent:q} \
--timetree \
--date-confidence \
--date-inference {params.date_inference:q} \
--clock-filter-iqd {params.clock_filter_iqd:q} \
2 changes: 0 additions & 2 deletions phylogenetic/rules/export.smk
@@ -19,7 +19,6 @@ rule export:
auspice_json = "auspice/yellow-fever-virus_{gene}.json"
params:
strain_id = config["strain_id_field"],
- metadata_columns = config["export"]["metadata_columns"]
log:
"logs/{gene}/export.txt",
benchmark:
@@ -32,7 +31,6 @@ rule export:
--metadata-id-columns {params.strain_id:q} \
--node-data {input.branch_lengths:q} {input.traits:q} {input.nt_muts:q} {input.aa_muts:q} \
--colors {input.colors:q} \
- --metadata-columns {params.metadata_columns:q} \
--auspice-config {input.auspice_config:q} \
--include-root-sequence-inline \
--output {output.auspice_json:q} \
19 changes: 8 additions & 11 deletions phylogenetic/rules/prepare_sequences.smk
@@ -49,24 +49,21 @@ rule filter:
rule align:
input:
sequences="results/genome/filtered.fasta",
reference=config["files"]["reference_fasta"],
reference=config["files"]["reference_gb"],
genemap=config["files"]["genemap"],
output:
alignment="results/{gene}/aligned.fasta",
insertions="results/{gene}/insertions.tsv",
log:
"logs/{gene}/align.txt",
benchmark:
"benchmarks/{gene}/align.txt"
shell:
"""
(
nextclade run \
--input-ref {input.reference:q} \
--input-annotation {input.genemap:q} \
--output-fasta - \
--output-tsv {output.insertions:q} \
{input.sequences:q} \
| seqkit seq -i > {output.alignment:q} \
) 2> {log:q}
augur align \
--sequences {input.sequences} \
--reference-sequence {input.reference} \
--output {output.alignment} \
--fill-gaps \
--remove-reference \
2> {log:q}
"""
