Merge pull request #73 from nextstrain/sc2-update-2022-04-18
SC2 update 2023-04-18
corneliusroemer authored Apr 19, 2023
2 parents 2ab8a4f + 332964e commit 46e43a2
Showing 17 changed files with 93,083 additions and 1 deletion.
162 changes: 161 additions & 1 deletion CHANGELOG.md
@@ -1,5 +1,165 @@
# CHANGELOG

## 2023-04-18

### New SARS-CoV-2 dataset version (tag `2023-04-18T12:00:00Z`)

#### `SARS-CoV-2` and `SARS-CoV-2-21L`

- New Nextstrain clade 23B (XBB.1.16) was added; see this [ncov PR](https://github.com/nextstrain/ncov/pull/1059) for a detailed discussion of this clade.
- The `SARS-CoV-2` dataset now shows the WHO variant name in the web results table instead of the unaliased Pango lineage. This is a web-display-only change: all output files (web/CLI) are unchanged. The unaliased Pango lineage is still available in the `SARS-CoV-2-21L` dataset.
- Pango lineages designated between 2023-03-15 and 2023-04-17 are now included; unfold below to see a list with designation dates:

<details>
<summary> Newly included lineages, with designation date in parentheses</summary>

- ET.1 (2023-03-16)
- XBC.1.6 (2023-03-16)
- XBB.1.17 (2023-03-18)
- XBB.1.17.1 (2023-03-18)
- XBB.1.17.2 (2023-03-18)
- XBB.1.5.22 (2023-03-18)
- XBB.1.5.23 (2023-03-18)
- XBB.1.18 (2023-03-18)
- XBB.1.18.1 (2023-03-18)
- XBB.1.5.24 (2023-03-18)
- XBB.1.5.25 (2023-03-18)
- XBU (2023-03-19)
- XBB.1.19 (2023-03-19)
- XBB.1.19.1 (2023-03-19)
- XBB.1.20 (2023-03-19)
- XBV (2023-03-19)
- DV.1.1 (2023-03-19)
- CP.8 (2023-03-19)
- XBW (2023-03-19)
- XBY (2023-03-19)
- XBB.1.5.26 (2023-03-19)
- EU.1 (2023-03-19)
- EU.1.1 (2023-03-19)
- XBF.8.1 (2023-03-19)
- XBB.1.21 (2023-03-19)
- XBC.2.1 (2023-03-19)
- XBB.1.22 (2023-03-19)
- XBB.1.22.1 (2023-03-19)
- XBB.1.22.2 (2023-03-19)
- XBB.1.9.5 (2023-03-19)
- XBB.1.9.4 (2023-03-19)
- XBB.1.23 (2023-03-19)
- XBJ.1 (2023-03-19)
- XBJ.1.1 (2023-03-19)
- XBJ.2 (2023-03-19)
- XBJ.3 (2023-03-19)
- XBJ.4 (2023-03-19)
- XBB.2.7 (2023-03-19)
- XBB.1.27 (2023-03-19)
- XBB.1.24 (2023-03-20)
- XBB.1.25 (2023-03-20)
- XBB.1.26 (2023-03-20)
- CH.1.1.16 (2023-03-20)
- CH.1.1.17 (2023-03-20)
- DV.5 (2023-03-20)
- EV.1 (2023-03-21)
- EW.1 (2023-03-21)
- EW.2 (2023-03-21)
- EW.3 (2023-03-21)
- EY.1 (2023-03-21)
- EZ.1 (2023-03-21)
- XBB.1.16.1 (2023-03-22)
- XBB.1.5.27 (2023-03-22)
- EK.3 (2023-03-22)
- EK.2.1 (2023-03-22)
- XBB.1.5.28 (2023-03-22)
- BE.11 (2023-03-22)
- BE.12 (2023-03-22)
- BE.13 (2023-03-22)
- FA.1 (2023-03-22)
- FA.2 (2023-03-22)
- DR.2 (2023-03-22)
- FB.1 (2023-03-22)
- FB.2 (2023-03-22)
- BQ.1.1.72 (2023-03-22)
- FC.1 (2023-03-22)
- FD.1 (2023-03-22)
- FD.1.1 (2023-03-22)
- FE.1 (2023-03-22)
- XBB.1.5.29 (2023-03-22)
- XBB.1.5.30 (2023-03-22)
- FF.1 (2023-03-22)
- CP.8.1 (2023-03-23)
- BN.1.3.9 (2023-03-23)
- XBB.1.5.31 (2023-03-24)
- XBB.1.5.32 (2023-03-24)
- XBB.1.5.33 (2023-03-24)
- XBB.2.3.1 (2023-03-24)
- XBB.2.3.2 (2023-03-24)
- CH.1.1.18 (2023-03-24)
- XBB.1.28 (2023-03-24)
- XBB.1.29 (2023-03-24)
- XBB.1.5.34 (2023-03-24)
- XBB.1.30 (2023-03-24)
- XBB.1.5.35 (2023-03-24)
- XBB.1.5.36 (2023-03-24)
- BL.1.5 (2023-03-24)
- BQ.1.26.2 (2023-03-24)
- XBF.9 (2023-03-24)
- FG.1 (2023-03-24)
- FG.2 (2023-03-24)
- FG.3 (2023-03-24)
- CH.1.1.19 (2023-03-24)
- CH.1.1.20 (2023-03-24)
- CM.8.1.3 (2023-03-24)
- CM.8.1.4 (2023-03-24)
- DN.1.1.3 (2023-03-24)
- DN.1.1.4 (2023-03-24)
- XBB.2.8 (2023-03-24)
- XBB.2.7.1 (2023-03-24)
- BF.5.5 (2023-03-24)
- XBB.1.5.37 (2023-03-27)
- FD.2 (2023-03-27)
- XBB.1.5.38 (2023-03-27)
- FH.1 (2023-03-27)
- XBB.1.5.39 (2023-03-27)
- XBZ (2023-03-28)
- CH.1.1.21 (2023-03-29)
- CH.1.1.22 (2023-03-29)
- FJ.1 (2023-03-29)
- BN.1.3.10 (2023-03-30)
- FK.1 (2023-03-31)
- BQ.1.1.73 (2023-04-03)
- XCA (2023-04-03)
- EG.2 (2023-04-04)
- XBB.1.31 (2023-04-04)
- XBB.1.32 (2023-04-04)
- FL.1 (2023-04-04)
- XBB.2.3.3 (2023-04-04)
- XBB.2.3.4 (2023-04-04)
- XBB.1.5.40 (2023-04-04)
- FM.1 (2023-04-04)
- FM.2 (2023-04-04)
- FL.2 (2023-04-04)
- BQ.1.1.74 (2023-04-05)
- FN.1 (2023-04-05)
- EG.1.1 (2023-04-07)
- XBB.1.28.1 (2023-04-07)
- CJ.1.2 (2023-04-07)
- CJ.1.3 (2023-04-07)
- XBF.1.1 (2023-04-08)
- XBF.10 (2023-04-08)
- FP.1 (2023-04-11)
- BF.42 (2023-04-11)
- CH.1.1.23 (2023-04-11)
- XBB.2.3.5 (2023-04-12)
- FK.1.1 (2023-04-13)
- XCB (2023-04-14)
- FQ.1 (2023-04-17)
- XBB.1.33 (2023-04-18)

</details>

#### `SARS-CoV-2-no-recomb` will no longer be updated

Starting with this update, the `SARS-CoV-2-no-recomb` dataset, an auxiliary dataset with niche usage, will no longer be updated. This dataset was created to mitigate a bias in earlier Nextclade versions toward attaching incomplete sequences to recombinants. This bias was fixed in Nextclade version 2.13.0 (see this [CHANGELOG entry](https://github.com/nextstrain/nextclade/blob/master/CHANGELOG.md#attach-sequences-to-a-priori-most-likely-node-if-reference-tree-contains-placement_prior) for a detailed discussion). Furthermore, with the majority of currently circulating lineages being recombinant (mostly XBB), this dataset is no longer adequate. Users should use the main `SARS-CoV-2` dataset instead.

## 2023-04-02

### New dataset version (tag `2023-04-02T12:00:00Z`)
@@ -23,7 +183,7 @@ The B/Vic annotation of the HA segment was fixed -- it was previously off by 3 n
#### SARS-CoV-2 datasets

- Placement priors: Every tree node is now annotated with a `placement_prior`, an approximate probability (on log10 scale) that a random sequence is attached to this node. For this dataset, the prior was calculated after placing 300k sequences on the tree. A value of `-10` is assigned when no sequence in the sample attached to a node. The placement priors will improve placement accuracy of incomplete sequences (such as Spike only), but only with a recent version of Nextclade (probably 2.13.0 and above). In that release, we will introduce a new placement tie-breaking feature: when a query sequence can attach to multiple nodes with an equal number of mismatches, the sequence will be attached to the reference tree node with the highest prior. This is in contrast to the previous naive tie-breaking logic, which always chose the node with the fewest parent nodes. This led to a bias towards attaching to recombinants. See <https://github.com/neherlab/nextclade_data_workflows/pull/38> for the code calculating the placement priors, and <https://github.com/nextstrain/nextclade/pull/1119> to see how the priors are used in Nextclade.
- Pango lineages desiganted between 2023-02-24 and 2023-03-15 are now included, unfold below to see a list of them:
- Pango lineages designated between 2023-02-24 and 2023-03-15 are now included, unfold below to see a list of them:

<details>
<summary> Newly included lineages, with designation date in parentheses</summary>
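The placement-prior logic described in the SARS-CoV-2 entry above can be sketched roughly as follows. This is an illustrative Python sketch, not Nextclade's actual implementation: the helpers `log10_priors` and `pick_node`, the node names, and the counts are hypothetical, and the way priors are derived from attachment counts (log10 of the fraction of sampled sequences attached to a node) is an assumption. Only the `-10` floor and the tie-breaking rule are taken from the changelog text.

```python
import math

# Value the changelog says is used when no sampled sequence attached to a node
NO_HIT_PRIOR = -10.0


def log10_priors(attach_counts):
    """Turn per-node attachment counts into log10 priors.

    Assumed formula (not confirmed by the changelog): log10(count / total).
    """
    total = sum(attach_counts.values())
    return {
        node: math.log10(count / total) if count > 0 else NO_HIT_PRIOR
        for node, count in attach_counts.items()
    }


def pick_node(mismatches, priors):
    """Pick the node with the fewest mismatches; break ties by highest prior."""
    best = min(mismatches.values())
    tied = [node for node, n in mismatches.items() if n == best]
    return max(tied, key=lambda node: priors.get(node, NO_HIT_PRIOR))


# Hypothetical counts from placing a sample of sequences on a tiny tree
counts = {"BA.5": 900, "XBB": 100, "XAY": 0}
priors = log10_priors(counts)

# A partial (e.g. Spike-only) query may match several nodes equally well
mismatches = {"BA.5": 2, "XAY": 2, "XBB": 5}
print(pick_node(mismatches, priors))
```

In this toy case the two tied candidates are separated by their priors, so the query lands on the commonly observed node rather than on a recombinant that no sampled sequence attached to.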
@@ -0,0 +1,18 @@
##gff-version 3
##sequence-region MN908947 1 29903
# Gene map (genome annotation) of SARS-CoV-2 in GFF format.
# For gene map purposes we only need some of the columns. We substitute unused values with "." as per GFF spec.
# See GFF format reference at https://www.ensembl.org/info/website/upload/gff.html
# seqname source feature start end score strand frame attribute
MN908947 GenBank gene 266 13468 . + . gene_name=ORF1a
MN908947 GenBank gene 13468 21555 . + . gene_name=ORF1b
MN908947 GenBank gene 25393 26220 . + . gene_name=ORF3a
MN908947 GenBank gene 21563 25384 . + . gene_name=S
MN908947 GenBank gene 26245 26472 . + . gene_name=E
MN908947 GenBank gene 26523 27191 . + . gene_name=M
MN908947 GenBank gene 27202 27387 . + . gene_name=ORF6
MN908947 GenBank gene 27394 27759 . + . gene_name=ORF7a
MN908947 GenBank gene 27756 27887 . + . gene_name=ORF7b
MN908947 GenBank gene 27894 28259 . + . gene_name=ORF8
MN908947 GenBank gene 28274 29533 . + . gene_name=N
MN908947 GenBank gene 28284 28577 . + . gene_name=ORF9b
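To illustrate how a gene map like the one above might be consumed, here is a minimal Python sketch that reads the nine GFF columns into a dictionary of gene coordinates. The `parse_gene_map` helper is hypothetical and is not part of Nextclade. It splits on any whitespace so that it works whether the columns are separated by tabs or spaces, and only three of the gene lines are embedded for brevity.

```python
GENE_MAP = """\
##gff-version 3
# seqname source feature start end score strand frame attribute
MN908947 GenBank gene 266 13468 . + . gene_name=ORF1a
MN908947 GenBank gene 21563 25384 . + . gene_name=S
MN908947 GenBank gene 28274 29533 . + . gene_name=N
"""


def parse_gene_map(text):
    """Return {gene_name: (start, end)} with 1-based, inclusive coordinates."""
    genes = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        fields = line.split(None, 8)  # at most nine columns
        if len(fields) < 9 or fields[2] != "gene":
            continue
        start, end = int(fields[3]), int(fields[4])
        # The attribute column holds semicolon-separated key=value pairs
        attrs = dict(
            kv.split("=", 1) for kv in fields[8].strip().split(";") if "=" in kv
        )
        name = attrs.get("gene_name")
        if name:
            genes[name] = (start, end)
    return genes


genes = parse_gene_map(GENE_MAP)
for name, (start, end) in genes.items():
    length = end - start + 1  # GFF ranges include both endpoints
    print(name, start, end, length)
```

Because GFF coordinates are inclusive at both ends, the `S` entry spans 25384 - 21563 + 1 = 3822 nucleotides, which is a whole number of codons (1274).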
@@ -0,0 +1 @@
Country (Institute),Target,Oligonucleotide,Sequence