Add CCHFV to nextclade_data. #199

anna-parker · 2024-05-15T18:25:21Z

Using the https://github.com/neherlab/CCHFV repository and NCBI Virus, I created nextclade_data datasets for CCHFV that can be used with nextclade run.

Auspice trees for the three segments can be built:

- independently from each other
- dependent on each other (choosing only samples with all segments to allow for the creation of tanglegrams)
- dependent on each other with additional recombination site inference (using TreeKnit to infer ARGs we can better estimate branch lengths).
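For the second option, the sample selection amounts to intersecting the sets of sample IDs available per segment. A minimal sketch, assuming each segment comes with a list of sample IDs; the function name and the sample IDs are hypothetical and not taken from the CCHFV workflow:

```python
from typing import Dict, Iterable, List


def samples_with_all_segments(samples_by_segment: Dict[str, Iterable[str]]) -> List[str]:
    """Return the sorted sample IDs that appear in every segment."""
    id_sets = [set(ids) for ids in samples_by_segment.values()]
    if not id_sets:
        return []
    # Keep only the IDs shared by all segments.
    return sorted(set.intersection(*id_sets))


# Hypothetical sample IDs, for illustration only.
samples_by_segment = {
    "L": ["sampleA", "sampleB", "sampleC"],
    "M": ["sampleA", "sampleC"],
    "S": ["sampleA", "sampleC", "sampleD"],
}
complete = samples_with_all_segments(samples_by_segment)
print(complete)
```

Only samples present in all three segment trees can be connected in a tanglegram, which is why incomplete samples are dropped here.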

I chose the second option for now, but this can be changed.

Additionally, I chose to name only 3 genes: RdRp (RNA-dependent RNA polymerase, product: putative polyprotein), GCP (product: glycoprotein precursor) and NP (product: nucleoprotein).

We may also want to name the non-structural S protein (NSS); details are in https://www.mdpi.com/1999-4915/8/4/106.

corneliusroemer

Excellent! It works! Preview here: https://clades.nextstrain.org/?dataset-server=https://raw.githubusercontent.com/nextstrain/nextclade_data/45c6711e83448fa39ddd2996902f4ba4cf9bbd9d/data_output

ivan-aksamentov · 2024-05-15T21:53:35Z

@anna-parker I fixed a few bugs in file declarations and added changelogs. We disabled automated CI for forks due to security concerns, so I pushed the processed files (data_output/) myself.

The datasets can be accessed if you provide the --server CLI arg or the dataset-server URL param:

https://clades.nextstrain.org/?dataset-server=gh:anna-parker/nextclade_data@cchfv@data_output&dataset-name=nextstrain/cchfv/linked/S
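As a rough sketch, the URL above can be assembled from its two query parameters. The parameter names `dataset-server` and `dataset-name` and their values are copied from the link; the helper function itself is hypothetical:

```python
from urllib.parse import urlencode


def nextclade_web_url(dataset_server: str, dataset_name: str,
                      base: str = "https://clades.nextstrain.org/") -> str:
    """Build a Nextclade Web link that points at a custom dataset server."""
    # Keep ':', '/' and '@' unescaped so the link matches the form used above.
    query = urlencode(
        {"dataset-server": dataset_server, "dataset-name": dataset_name},
        safe=":/@",
    )
    return f"{base}?{query}"


url = nextclade_web_url(
    "gh:anna-parker/nextclade_data@cchfv@data_output",
    "nextstrain/cchfv/linked/S",
)
print(url)
```

Swapping the second argument for another dataset path would presumably give the link for another segment, assuming its path follows the same pattern.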

If you have access to the nextstrain org, then it makes sense to work directly in the nextstrain/nextclade_data repo. This way checks will run automatically. If you don't have access, Richard can probably arrange it.

ivan-aksamentov · 2024-05-15T21:57:01Z

Please thoroughly consider:

- Which collection/organization you want this dataset to be in. Right now it's in the nextstrain collection, even though you are pushing from a fork. For third parties we recommend using the community collection. This is mostly political, and meant to avoid dramas like: who will be allowed to make changes? Who will maintain it? Who decides what the clades/lineages are if there's no consensus?
- The path of each dataset, in particular in relation to which clades/flavors/hosts are there now and which ones you want to add in the future. This is a technical and bioinformatics decision. Paths are immutable: you cannot change paths or delete datasets later. See the docs/ for more details.
- If the dataset is not well tested and/or there are any concerns regarding quality or correctness, then it is appropriate to set .attributes.experimental = true in pathogen.json.
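For the last point, a minimal sketch of setting the flag programmatically. The key path `.attributes.experimental` is taken from the comment above; the helper name and the toy file contents are hypothetical, and a real pathogen.json contains many more fields:

```python
import json
import tempfile
from pathlib import Path


def mark_experimental(pathogen_json: Path) -> dict:
    """Set .attributes.experimental = true in the given pathogen.json."""
    data = json.loads(pathogen_json.read_text(encoding="utf-8"))
    # Create the "attributes" object if it is missing, then set the flag.
    data.setdefault("attributes", {})["experimental"] = True
    pathogen_json.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
    return data


# Toy file with made-up contents, for illustration only.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "pathogen.json"
    path.write_text(json.dumps({"attributes": {"name": "CCHFV segment S"}}), encoding="utf-8")
    updated = mark_experimental(path)
    reloaded = json.loads(path.read_text(encoding="utf-8"))

print(reloaded["attributes"])
```

Editing the file by hand works just as well; the only point is that the flag lives under the `attributes` object.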

anna-parker · 2024-05-16T05:26:55Z

Thanks so much @ivan-aksamentov! @rneher do you have any concerns about CCHFV being in the nextstrain collection? We can also discuss offline if that is easier.

Add CCHFV to nextclade_data.

d986f95

anna-parker mentioned this pull request May 15, 2024

Add CCHFV to loculus loculus-project/loculus#1920

Merged

10 tasks

ivan-aksamentov added 4 commits May 15, 2024 23:36

fix: file declarations in pathogen.json

99218cd

fix: add changelogs

10e2c7c

Merge remote-tracking branch 'origin/master' into cchfv

b1304a7

chore: rebuild

45c6711

corneliusroemer reviewed May 15, 2024

anna-parker added 3 commits May 16, 2024 14:51

Fix: incorrect examples for segment L and adapt NCBI gene naming for …

237be43

…nextclade (i.e. rename gene as CDS and give genes of interest `Name` label).

Update index.json accordingly.

29ead67

Remove duplicate CDS entries.

1eea91b

corneliusroemer had a problem deploying to refs/pull/200/merge May 16, 2024 14:27 — with GitHub Actions Error
