Finish modernization #20

Open. genehack wants to merge 20 commits into main.

Conversation

genehack (Contributor) commented Nov 16, 2024

Description of proposed changes

Preview builds on staging:

This PR adds the following changes:

  • Pull Nextclade dataset during ingest; use that for genotype/clade assignment
  • Add upload action to ingest workflow
  • Convert nextclade and phylogenetic workflows to download data from S3
  • Set an explicit, hard-coded clock rate of 2e-04 ± 1e-05 for both genome and prM-E builds
  • Modify builds so that only the genome build runs with --timetree
  • Fix authors versus abbreviated authors so that Auspice tips are labeled with the abbreviated version
  • Add a frequencies panel to both builds
  • Filter out probable Clade I travel cases (country=China and dates around 2016-2017)
  • Add color.tsv generation, based on what is in the dengue repo, for a better country-level color/geography matchup
  • Add CI and ingest-to-phylo GitHub Actions
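
As a rough illustration of the travel-case filter in the list above, here is a minimal Python sketch. It is not the code in this PR: the column names `country` and `date`, the exact 2016-2017 year window, and the example rows (invented strain names) are all assumptions made for illustration.

```python
import csv
import io


def is_probable_travel_case(row, country="China", years=("2016", "2017")):
    """Return True when a metadata row matches the exclusion rule.

    Assumes hypothetical `country` and `date` columns, with dates
    written year-first (for example 2016-03-18 or 2016-XX-XX).
    """
    return row.get("country") == country and row.get("date", "")[:4] in years


def drop_travel_cases(rows):
    """Keep every row that is not flagged as a probable travel case."""
    return [row for row in rows if not is_probable_travel_case(row)]


# Invented example metadata; strain names A, B, C are placeholders.
EXAMPLE_TSV = (
    "strain\tcountry\tdate\n"
    "A\tChina\t2016-03-18\n"
    "B\tBrazil\t2017-01-05\n"
    "C\tChina\t2019-XX-XX\n"
)

rows = list(csv.DictReader(io.StringIO(EXAMPLE_TSV), delimiter="\t"))
kept = drop_travel_cases(rows)
kept_strains = [row["strain"] for row in kept]
print(kept_strains)
```

Matching on the year prefix keeps the check usable for partially ambiguous dates, at the cost of treating "around 2016-2017" as exactly those two calendar years.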

Related issue(s)

Checklist

  • Checks pass

...instead of depending on having one built locally.

Also corrects nextclade dataset name in the config.

This makes the metadata annotations in the Auspice visualization look
much nicer.

This is _largely_ copy-pasta-ed from the `measles` repo; the one
significant change is I enabled the frequencies on both the whole
genome and prM-E builds, and I set the `min_date` param to
`2017-01-01`, because virtually all of the post-1927 samples are dated
2017 and later.
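
One way to sanity-check a `min_date` choice like the one described above is to measure what share of sample dates fall on or after the cutoff. The sketch below is only an illustration: the function name and the example dates are hypothetical, and it assumes ISO-formatted dates, skipping ambiguous ones such as `2017-XX-XX`.

```python
from datetime import date


def fraction_on_or_after(dates, cutoff):
    """Fraction of parseable ISO dates that fall on or after `cutoff`."""
    parsed = []
    for value in dates:
        try:
            parsed.append(date.fromisoformat(value))
        except ValueError:
            # Skip ambiguous or malformed dates such as "2017-XX-XX".
            continue
    if not parsed:
        return 0.0
    return sum(d >= cutoff for d in parsed) / len(parsed)


# Invented example dates, not taken from the real dataset.
example_dates = ["2017-03-01", "2018-06-15", "1927-07-01", "2017-XX-XX"]
share = fraction_on_or_after(example_dates, date(2017, 1, 1))
print(f"{share:.2f} of parseable dates are on or after the cutoff")
```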

Also a small script tweak.

This is the full NCBI yellow-fever dataset (taxon ID 11089) as
downloaded on 21 Nov 2024.

CI _could_ work just fine without embedding this data in the repo, but
having a hard-coded example-data file means that we don't depend on
NCBI for successful CI runs.
@genehack genehack marked this pull request as ready for review November 21, 2024 22:13

Successfully merging this pull request may close these issues.

Modernize repo