Update and standardize Nextclade datasets for all lineages and gene segments #186

Open
huddlej opened this issue Oct 15, 2024 · 0 comments
Labels
enhancement New feature or request


huddlej commented Oct 15, 2024

Context

Only H3N2 and H1N1pdm currently have Nextclade datasets for all 8 gene segments; B/Vic has only HA and NA datasets, and B/Yam has only HA. The default HA and NA datasets for H3N2 and H1N1pdm use more modern reference strains (e.g., A/Darwin/6/2021 for H3N2), while the other segments use older reference strains (e.g., A/NewYork/392/2004 for all other H3N2 genes).

Only the HA and NA datasets include shortcut aliases like flu_h3n2_ha, while the other gene segments have only a single longer name like nextstrain/flu/h3n2/ns.
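
To make the naming gap concrete, here is a minimal Python sketch of what a uniform scheme could look like. It assumes the two examples above generalize to `nextstrain/flu/<lineage>/<segment>` for the long name and `flu_<lineage>_<segment>` for the alias. That pattern, the lowercase segment labels, and the helper names are illustrative assumptions, not the actual Nextclade dataset index.

```python
# Assumed lowercase labels for the 8 influenza gene segments (hypothetical;
# the real dataset index may label them differently).
SEGMENTS = ["pb2", "pb1", "pa", "ha", "np", "na", "mp", "ns"]


def dataset_name(lineage: str, segment: str) -> str:
    """Long dataset name, following the pattern of nextstrain/flu/h3n2/ns."""
    return f"nextstrain/flu/{lineage}/{segment}"


def shortcut_alias(lineage: str, segment: str) -> str:
    """Shortcut alias, following the pattern of flu_h3n2_ha."""
    return f"flu_{lineage}_{segment}"


def names_for_lineage(lineage: str) -> dict[str, tuple[str, str]]:
    """Map each segment to its (long name, alias) pair for one lineage."""
    return {
        segment: (dataset_name(lineage, segment), shortcut_alias(lineage, segment))
        for segment in SEGMENTS
    }


if __name__ == "__main__":
    for segment, (name, alias) in names_for_lineage("h3n2").items():
        print(f"{segment}: {name} (alias: {alias})")
```

Under this assumption every segment would get both forms, instead of only HA and NA having an alias.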

Description

We should update the Nextclade datasets for all lineages and genes to use the same modern reference strains and provide the same shortcut aliases.

This means getting additional gene sequences and coordinates for A/Darwin/6/2021 (H3N2), A/Wisconsin/588/2019 (H1N1pdm), and B/Brisbane/60/2008 (B/Vic). For completeness, we could add all remaining genes for B/Wisconsin/01/2010 (B/Yam). We wouldn't use the Yam dataset for surveillance analyses, but it could be helpful for historical analyses.
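
As a rough way to scope the work, the sketch below lists which segments would still need sequences and coordinates for each proposed reference strain. The lineage keys, segment labels, and the assumption that the existing B/Vic and B/Yam datasets already use the strains named above are illustrative guesses drawn from this issue, not checked against the dataset repository.

```python
# Assumed lowercase labels for the 8 gene segments (hypothetical).
ALL_SEGMENTS = ["pb2", "pb1", "pa", "ha", "np", "na", "mp", "ns"]

# Reference strains proposed in this issue, keyed by a hypothetical lineage label.
TARGET_REFERENCE = {
    "h3n2": "A/Darwin/6/2021",
    "h1n1pdm": "A/Wisconsin/588/2019",
    "vic": "B/Brisbane/60/2008",
    "yam": "B/Wisconsin/01/2010",
}

# Segments assumed to already have a dataset built on the target reference.
# For H3N2 and H1N1pdm the other six segments exist but use older references.
ALREADY_ON_TARGET = {
    "h3n2": {"ha", "na"},
    "h1n1pdm": {"ha", "na"},
    "vic": {"ha", "na"},
    "yam": {"ha"},
}


def segments_to_add(lineage: str) -> list[str]:
    """Segments that still need sequences and coordinates for the target reference."""
    done = ALREADY_ON_TARGET[lineage]
    return [segment for segment in ALL_SEGMENTS if segment not in done]


if __name__ == "__main__":
    for lineage, strain in TARGET_REFERENCE.items():
        todo = ", ".join(segments_to_add(lineage))
        print(f"{lineage} ({strain}): {todo}")
```

Under these assumptions, H3N2, H1N1pdm, and B/Vic each need six additional segments, and B/Yam needs seven.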

If we wanted to update the references to newer strains for all subtypes, this could be a good time to make that change, too.

@huddlej huddlej added the enhancement New feature or request label Oct 15, 2024
@huddlej huddlej self-assigned this Oct 15, 2024