Add H3N2 HA emerging clades #228
Updates the Auspice JSON tree to include emerging clades for H3N2 HA including J.1.1, J.2.1, and J.2.2. Related to nextstrain/seasonal-flu#181 which updates the Nextclade dataset workflow to produce these new annotations.
- "clades": 30,
+ "clades": 28,
  "customClades": {
- "subclade": 36,
- "short-clade": 30
+ "subclade": 34,
+ "short-clade": 28,
+ "emerging_subclade": 37
I noticed that the number of "big" clades, subclades and short clades (as counted on the tree nodes) all decreased by 2. Not sure if that's something expected or not.
Good eye. The old tree has clades 3C and 3C.2a1b, which are missing from the new tree. 3C only had one sample in the old tree and has no samples in the new tree. 3C.2a1b has no samples in either tree, but it was annotated in the old tree and not in the new tree. I suspect that the workflow dropped these clades during subsampling, as we now sample more of the newer sequences.
![image](https://private-user-images.githubusercontent.com/85372/366658107-7f7ba6fb-c45b-4530-ba4b-6dfeaadc9f95.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk4MDQzMDMsIm5iZiI6MTczOTgwNDAwMywicGF0aCI6Ii84NTM3Mi8zNjY2NTgxMDctN2Y3YmE2ZmItYzQ1Yi00NTMwLWJhNGItNmRmZWFhZGM5Zjk1LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE3VDE0NTMyM1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWZkNWM2M2FmNTY0MzZkYTFiZmI2Zjg1OTg3YjcwMTgxOWY2MTJhMmUwOWQxOTllZmE5NTEyNGZlMjQ1ZTU5NzEmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.SR3obBaa-FX35TlgOPclQ_d0KylRJY0z2Bg-9GAog7Y)
(Which is to say that for the "recent H3N2 HA" dataset, those missing clades are not a blocking issue for this PR.)
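For anyone wanting to repeat the old-vs-new comparison, here is a minimal sketch of how the dropped clades could be listed. It assumes the usual Auspice v2 layout, where each node stores attributes under `node_attrs` as `{"value": ...}` and child nodes under `children`; the attribute key `clade_membership` and the toy trees are illustrative assumptions, not the real dataset.

```python
from collections import Counter


def iter_nodes(root):
    """Yield every node of an Auspice-style tree (iterative depth-first walk)."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(node.get("children", []))


def attr_values(tree, attr):
    """Count how many nodes carry each value of the given node attribute."""
    counts = Counter()
    for node in iter_nodes(tree):
        entry = node.get("node_attrs", {}).get(attr)
        if isinstance(entry, dict) and "value" in entry:
            counts[entry["value"]] += 1
    return counts


def missing_values(old_tree, new_tree, attr):
    """Return attribute values present in the old tree but absent from the new one."""
    return sorted(set(attr_values(old_tree, attr)) - set(attr_values(new_tree, attr)))


# Toy trees standing in for the "tree" object of the old and new Auspice JSON.
old_tree = {
    "name": "root",
    "node_attrs": {"clade_membership": {"value": "3C"}},
    "children": [
        {"name": "a", "node_attrs": {"clade_membership": {"value": "3C.2a1b"}}},
        {"name": "b", "node_attrs": {"clade_membership": {"value": "J.2"}}},
    ],
}
new_tree = {
    "name": "root",
    "node_attrs": {"clade_membership": {"value": "J.2"}},
    "children": [
        {"name": "b", "node_attrs": {"clade_membership": {"value": "J.2"}}},
    ],
}

dropped = missing_values(old_tree, new_tree, "clade_membership")
print(dropped)
```

With real files, the two arguments would be the `tree` objects loaded with `json.load` from the old and new tree JSON, and the same function can be run for `subclade` or `short-clade`.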
In SC2 and mpox workflows, I force-include at least one representative sequence for each clade I want to include in a build so that all clades I want are represented. Maybe you could adopt some strategy like this to have less randomness involved?
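A rough sketch of that force-include idea, not the actual SC2 or mpox implementation: pick one strain per wanted clade from the metadata and write the names to an include list. The strain names, the `(strain, clade)` record shape, and the wanted-clade list are all hypothetical; the resulting list could be handed to whatever force-include mechanism the workflow's subsampling step offers (for example the `--include` option of `augur filter`, if that is what the workflow uses).

```python
def pick_representatives(records, wanted_clades):
    """Pick the first strain seen for each wanted clade.

    records: iterable of (strain, clade) pairs, e.g. parsed from a metadata TSV.
    Returns (chosen, missing): a clade -> strain dict, and the sorted list of
    wanted clades for which no strain was found.
    """
    wanted = set(wanted_clades)
    chosen = {}
    for strain, clade in records:
        if clade in wanted and clade not in chosen:
            chosen[clade] = strain
    missing = sorted(wanted - set(chosen))
    return chosen, missing


# Hypothetical metadata rows; real strain names would come from the workflow.
records = [
    ("strain_A", "J.2"),
    ("strain_B", "3C"),
    ("strain_C", "J.2"),
    ("strain_D", "J.2.1"),
]
wanted_clades = ["3C", "3C.2a1b", "J.2", "J.2.1"]

chosen, missing = pick_representatives(records, wanted_clades)

# One strain name per line, the usual shape of a force-include list.
include_lines = [chosen[clade] for clade in sorted(chosen)]
print("\n".join(include_lines))
print("clades without any representative:", missing)
```

Clades reported as missing (such as one with no samples at all) would still need a different fix, since there is no sequence to force-include.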
## 2024-08-08T05:08:21Z

Fix numbering of RBD sites it the `pathogen.json`. The relevant positions were indexed 1-based, when they should have been indexed 0-based.
- Fix numbering of RBD sites it the `pathogen.json`. The relevant positions were indexed 1-based, when they should have been indexed 0-based.
+ Fix numbering of RBD sites in the `pathogen.json`. The relevant positions were indexed 1-based, when they should have been indexed 0-based.
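As a small illustration of the indexing fix that changelog entry describes, the sketch below shifts 1-based positions to 0-based ones. The position values and the toy sequence are made up, and the actual field in `pathogen.json` that holds the RBD sites is not shown here.

```python
def to_zero_based(positions):
    """Convert 1-based positions to 0-based indices, rejecting invalid input."""
    if any(p < 1 for p in positions):
        raise ValueError("1-based positions must be >= 1")
    return [p - 1 for p in positions]


# Hypothetical 1-based site numbers and a toy amino acid sequence.
one_based_sites = [1, 5, 10]
zero_based_sites = to_zero_based(one_based_sites)

sequence = "MKTIIALSYI"
# Using the 1-based numbers directly as indices would read one residue too far
# (and run past the end for site 10); the converted indices hit the intended ones.
residues = [sequence[i] for i in zero_based_sites]
print(zero_based_sites, residues)
```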
This has been lying around for a month - was it waiting for anything in particular other than a merge from us? I've reviewed and have a few comments - not necessarily blocking but might be nice to address anyways.
I can't seem to sort by emerging subclade, why is that @ivan-aksamentov?
![image](https://private-user-images.githubusercontent.com/25161793/377099833-7fb1f8e2-ae1e-4872-8d9a-9b966c3f5dcb.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk4MDQzMDMsIm5iZiI6MTczOTgwNDAwMywicGF0aCI6Ii8yNTE2MTc5My8zNzcwOTk4MzMtN2ZiMWY4ZTItYWUxZS00ODcyLThkOWEtOWI5NjZjM2Y1ZGNiLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE3VDE0NTMyM1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWYwNTkxOTllY2Y2MDU2ODg2Y2MzYWE1ODc3NjhlMjdjMjA5ZDNmOGZmN2UzMDhlNjkwYjc5NWY4YTRkZmY0YjEmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.FKvNw8SLEem2ETpt9O64FTOta1NDjIYZID3UH2mrjCo)
@huddlej what distinguishes an emerging subclade from a subclade? Why have that extra column? Does emerging mean provisional and hence what is meant by J.2.1 might change in the future? Otherwise why not just designate as a proper new clade?
Maybe the display name shouldn't have that underscore (`emerging_subclade`) but be `Emerging subclade`. Also, a short description would be nice for the tooltip. Right now it's empty:
Something like this is possible:
Lastly, it would be nice to maybe add some new example sequences that are part of these new emerging clades.
Here's the tree with coloring by emerging clades:
![Brave Browser 2024-10-16 16 38 28](https://private-user-images.githubusercontent.com/25161793/377103910-0bb2a97a-5d19-4d87-b771-d644b5ca8620.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk4MDQzMDMsIm5iZiI6MTczOTgwNDAwMywicGF0aCI6Ii8yNTE2MTc5My8zNzcxMDM5MTAtMGJiMmE5N2EtNWQxOS00ZDg3LWI3NzEtZDY0NGI1Y2E4NjIwLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE3VDE0NTMyM1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWYwMWQ0ZmU0ZDk0NDJjMzU3OGMwMmRlMTYyMDc0NmQ5NjBmY2Y4MjcwZDFlNGIxMDMwOTIxMmMzOGY5NzdiZGYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.BYfH3cMwcDTZ4NXl2H4UoJvTXAC8eb5BLWGm-ENkH7U)
It's literally because the text doesn't wrap, which forces the sort asc/desc icons out of view. If I make the text wrap, you can see/use them. It's conventional to allow clicking the column name/text itself to toggle through the sort states (asc, desc, none), which would at least restore functionality if not the indicators.
Yep, if the text does not fit, it will push the arrows out of view. This is a CSS bug. (As a funny workaround, you can scroll them back in if you select the text and drag the selection all the way to the right.) The easiest fix is to pick names that are short words or even abbreviations/acronyms, space-separated instead of underscore-separated. The explanation can be tucked into details in the tooltip. But it's true that I need to return to the table at some point; it is one of the oldest components and can definitely use some love. Some more discussion is in nextstrain/nextclade#1537.
@corneliusroemer I shared some initial context in a related issue that may be helpful background for this PR. This PR is waiting on two things:
Thanks @huddlej for the response, a PR here is enough to have a "prerelease" dataset that's available through for example: https://master.clades.nextstrain.org/?dataset-server=gh:@add-h3n2-ha-emerging-clades@ (and an equivalent invocation of I'll convert this PR to draft state then as it's not actually ready to be reviewed/merged at this point in time.
I was hoping for something a little more visible to users of the web UI, like an H3N2 HA dataset with both "official" and "experimental" labels on the production website. This would allow folks to use emerging annotations ahead of the various WHO meetings, before they've been released officially. But I'm happy to discuss any potential solutions.
Few thoughts:
preview: https://master.clades.nextstrain.org/?dataset-server=gh:@add-h3n2-ha-emerging-clades@