Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #90 from nextstrain/update-sc2
Update SC2 datasets
Showing 18 changed files with 47,043 additions and 11 deletions.
18 changes: 18 additions & 0 deletions
data/datasets/sars-cov-2-21L/references/BA.2/versions/2023-09-21T12:00:00Z/files/genemap.gff
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
##gff-version 3 | ||
##sequence-region MN908947 1 29903 | ||
# Gene map (genome annotation) of SARS-CoV-2 in GFF format. | ||
# For gene map purpses we only need some of the columns. We substitute unused values with "." as per GFF spec. | ||
# See GFF format reference at https://www.ensembl.org/info/website/upload/gff.html | ||
# seqname source feature start end score strand frame attribute | ||
MN908947 GenBank gene 266 13468 . + . gene_name=ORF1a | ||
MN908947 GenBank gene 13468 21555 . + . gene_name=ORF1b | ||
MN908947 GenBank gene 25393 26220 . + . gene_name=ORF3a | ||
MN908947 GenBank gene 21563 25384 . + . gene_name=S | ||
MN908947 GenBank gene 26245 26472 . + . gene_name=E | ||
MN908947 GenBank gene 26523 27191 . + . gene_name=M | ||
MN908947 GenBank gene 27202 27387 . + . gene_name=ORF6 | ||
MN908947 GenBank gene 27394 27759 . + . gene_name=ORF7a | ||
MN908947 GenBank gene 27756 27887 . + . gene_name=ORF7b | ||
MN908947 GenBank gene 27894 28259 . + . gene_name=ORF8 | ||
MN908947 GenBank gene 28274 29533 . + . gene_name=N | ||
MN908947 GenBank gene 28284 28577 . + . gene_name=ORF9b |
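The gene map above can be read with a few lines of Python. The following is a minimal sketch, not Nextclade's own parser: the names `Gene` and `parse_gene_map` are hypothetical, and the code splits on any whitespace because the rendered diff does not show whether the columns are tab-separated (the GFF3 format itself uses tabs). Only a subset of the rows is embedded to keep the example short.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int  # 1-based, inclusive (GFF convention)
    end: int    # 1-based, inclusive
    strand: str


def parse_gene_map(text: str) -> list[Gene]:
    """Parse the nine-column gene map, skipping directives and comments."""
    genes = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # "##" directives and "#" comments carry no features
        fields = line.split()  # assumption: no whitespace inside any column
        if len(fields) != 9:
            raise ValueError(f"expected 9 columns, got {len(fields)}: {line!r}")
        _seq, _src, _feature, start, end, _score, strand, _frame, attribute = fields
        attrs = dict(pair.split("=", 1) for pair in attribute.split(";") if pair)
        genes.append(Gene(attrs["gene_name"], int(start), int(end), strand))
    return genes


# A subset of the rows from genemap.gff shown in the diff.
GENE_MAP = """\
##gff-version 3
##sequence-region MN908947 1 29903
MN908947 GenBank gene 266 13468 . + . gene_name=ORF1a
MN908947 GenBank gene 13468 21555 . + . gene_name=ORF1b
MN908947 GenBank gene 21563 25384 . + . gene_name=S
MN908947 GenBank gene 28274 29533 . + . gene_name=N
MN908947 GenBank gene 28284 28577 . + . gene_name=ORF9b
"""

genes = parse_gene_map(GENE_MAP)
for gene in genes:
    length = gene.end - gene.start + 1  # both ends are inclusive
    print(f"{gene.name}: {length} nt, remainder mod 3 = {length % 3}")
```

Because both coordinates are inclusive, a gene's length is `end - start + 1`. In this file ORF1a ends at position 13468, the same position where ORF1b starts, and ORF9b lies entirely within N, so a consumer of the gene map should not assume that features are non-overlapping or sorted by start position.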
1 change: 1 addition & 0 deletions
data/datasets/sars-cov-2-21L/references/BA.2/versions/2023-09-21T12:00:00Z/files/primers.csv
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Country (Institute),Target,Oligonucleotide,Sequence |
446 changes: 446 additions & 0 deletions
data/datasets/sars-cov-2-21L/references/BA.2/versions/2023-09-21T12:00:00Z/files/qc.json
Large diffs are not rendered by default.
500 changes: 500 additions & 0 deletions
...tasets/sars-cov-2-21L/references/BA.2/versions/2023-09-21T12:00:00Z/files/reference.fasta
Large diffs are not rendered by default.
76 changes: 76 additions & 0 deletions
...tasets/sars-cov-2-21L/references/BA.2/versions/2023-09-21T12:00:00Z/files/sequences.fasta
Large diffs are not rendered by default.
25 changes: 25 additions & 0 deletions
data/datasets/sars-cov-2-21L/references/BA.2/versions/2023-09-21T12:00:00Z/files/tag.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
{ | ||
"tag": "2023-09-21T12:00:00Z", | ||
"comment": "Update to include lineage BA.2.86", | ||
"compatibility": { | ||
"nextcladeCli": { | ||
"min": "1.10.0", | ||
"max": null | ||
}, | ||
"nextcladeWeb": { | ||
"min": "1.13.0", | ||
"max": null | ||
} | ||
}, | ||
"enabled": true, | ||
"files": { | ||
"geneMap": "genemap.gff", | ||
"primers": "primers.csv", | ||
"qc": "qc.json", | ||
"reference": "reference.fasta", | ||
"sequences": "sequences.fasta", | ||
"tree": "tree.json", | ||
"virusPropertiesJson": "virus_properties.json" | ||
}, | ||
"metadata": {} | ||
} |
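The `compatibility` block in `tag.json` gives a minimum and maximum client version for each Nextclade client. The sketch below shows one way such bounds could be evaluated. It is an assumed interpretation, not Nextclade's actual logic: the diff does not show how the field is consumed, so the inclusive bounds, the reading of `null` as "no limit", and the helper names `parse_version` and `is_compatible` are all hypothetical.

```python
import json

# The compatibility section of tag.json as shown in the diff.
TAG_JSON = """
{
  "tag": "2023-09-21T12:00:00Z",
  "compatibility": {
    "nextcladeCli": {"min": "1.10.0", "max": null},
    "nextcladeWeb": {"min": "1.13.0", "max": null}
  }
}
"""


def parse_version(version: str) -> tuple[int, ...]:
    """Turn "1.10.0" into (1, 10, 0) so that comparison is numeric."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(tag: dict, client: str, version: str) -> bool:
    """Assumed semantics: min and max are inclusive, null means unbounded."""
    bounds = tag["compatibility"][client]
    current = parse_version(version)
    if bounds["min"] is not None and current < parse_version(bounds["min"]):
        return False
    if bounds["max"] is not None and current > parse_version(bounds["max"]):
        return False
    return True


tag = json.loads(TAG_JSON)
print(is_compatible(tag, "nextcladeCli", "1.9.0"))
print(is_compatible(tag, "nextcladeCli", "1.10.0"))
print(is_compatible(tag, "nextcladeWeb", "1.10.0"))
```

Versions are compared as integer tuples rather than strings because a plain string comparison would rank `"1.9.0"` above `"1.10.0"`.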
1 change: 1 addition & 0 deletions
data/datasets/sars-cov-2-21L/references/BA.2/versions/2023-09-21T12:00:00Z/files/tree.json
Large diffs are not rendered by default.
1 change: 1 addition & 0 deletions
.../sars-cov-2-21L/references/BA.2/versions/2023-09-21T12:00:00Z/files/virus_properties.json
Large diffs are not rendered by default.
18 changes: 18 additions & 0 deletions
data/datasets/sars-cov-2/references/MN908947/versions/2023-09-21T12:00:00Z/files/genemap.gff
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
##gff-version 3 | ||
##sequence-region MN908947 1 29903 | ||
# Gene map (genome annotation) of SARS-CoV-2 in GFF format. | ||
# For gene map purpses we only need some of the columns. We substitute unused values with "." as per GFF spec. | ||
# See GFF format reference at https://www.ensembl.org/info/website/upload/gff.html | ||
# seqname source feature start end score strand frame attribute | ||
MN908947 GenBank gene 266 13468 . + . gene_name=ORF1a | ||
MN908947 GenBank gene 13468 21555 . + . gene_name=ORF1b | ||
MN908947 GenBank gene 25393 26220 . + . gene_name=ORF3a | ||
MN908947 GenBank gene 21563 25384 . + . gene_name=S | ||
MN908947 GenBank gene 26245 26472 . + . gene_name=E | ||
MN908947 GenBank gene 26523 27191 . + . gene_name=M | ||
MN908947 GenBank gene 27202 27387 . + . gene_name=ORF6 | ||
MN908947 GenBank gene 27394 27759 . + . gene_name=ORF7a | ||
MN908947 GenBank gene 27756 27887 . + . gene_name=ORF7b | ||
MN908947 GenBank gene 27894 28259 . + . gene_name=ORF8 | ||
MN908947 GenBank gene 28274 29533 . + . gene_name=N | ||
MN908947 GenBank gene 28284 28577 . + . gene_name=ORF9b |
37 changes: 37 additions & 0 deletions
data/datasets/sars-cov-2/references/MN908947/versions/2023-09-21T12:00:00Z/files/primers.csv
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
Country (Institute),Target,Oligonucleotide,Sequence | ||
Charité (Germany),RdRp,Charité_RdRp_F,GTGARATGGTCATGTGTGGCGG | ||
Charité (Germany),RdRp,Charité_S_RdRp_P,CAGGTGGAACCTCATCAGGAGATGC | ||
Charité (Germany),RdRp,Charité_RdRp_R,CARATGTTAAASACACTATTAGCATA | ||
Charité (Germany),E,Charité_E_F,ACAGGTACGTTAATAGTTAATAGCGT | ||
Charité (Germany),E,Charité_E_P,ACACTAGCCATCCTTACTGCGCTTCG | ||
Charité (Germany),E,Charité_E_R,ATATTGCAGCAGTACGCACACA | ||
Charité (Germany),N,Charité_N_F,CACATTGGCACCCGCAATC | ||
Charité (Germany),N,Charité_N_P,ACTTCCTCAAGGAACAACATTGCCA | ||
Charité (Germany),N,Charité_N_R,GAGGAACGAGAAGAGGCTTG | ||
HKU (Hong Kong),ORF1b-nsp14,HKU_ORF_F,TGGGGYTTTACRGGTAACCT | ||
HKU (Hong Kong),ORF1b-nsp14,HKU_ORF_P,TAGTTGTGATGCWATCATGACTAG | ||
HKU (Hong Kong),ORF1b-nsp14,HKU_ORF_R,AACRCGCTTAACAAAGCACTC | ||
HKU (Hong Kong),N,HKU_N_F,TAATCAGACAAGGAACTGATTA | ||
HKU (Hong Kong),N,HKU_N_P,GCAAATTGTGCAATTTGCGG | ||
HKU (Hong Kong),N,HKU_N_R,CGAAGGTGTGACTTCCATG | ||
China CDC (China),N,ChinaCDC_N_F,GGGGAACTTCTCCTGCTAGAAT | ||
China CDC (China),N,ChinaCDC_N_P,TTGCTGCTGCTTGACAGATT | ||
China CDC (China),N,ChinaCDC_N_R,CAGACATTTTGCTCTCAAGCTG | ||
China CDC (China),ORF1ab-nsp10,ChinaCDC_ORF_F,CCCTGTGGGTTTTACACTTAA | ||
China CDC (China),ORF1ab-nsp10,ChinaCDC_ORF_P,CCGTCTGCGGTATGTGGAAAGGTTATGG | ||
China CDC (China),ORF1ab-nsp10,ChinaCDC_ORF_R,ACGATTGTGCATCAGCTGA | ||
US CDC (United States),N1,USCDC_N1_F,GACCCCAAAATCAGCGAAAT | ||
US CDC (United States),N1,USCDC_N1_P,ACCCCGCATTACGTTTGGTGGACC | ||
US CDC (United States),N1,USCDC_N1_R,TCTGGTTACTGCCAGTTGAATCTG | ||
US CDC (United States),N2,USCDC_N2_F,TTACAAACATTGGCCGCAAA | ||
US CDC (United States),N2,USCDC_N2_P,ACAATTTGCCCCCAGCGCTTCAG | ||
US CDC (United States),N2,USCDC_N2_R,GCGCGACATTCCGAAGAA | ||
US CDC (United States),N3,USCDC_N3_F,GGGAGCCTTGAATACACCAAAA | ||
US CDC (United States),N3,USCDC_N3_P,AYCACATTGGCACCCGCAATCCTG | ||
US CDC (United States),N3,USCDC_N3_R,TGTAGCACGATTGCAGCATTG | ||
"Institut Pasteur, Paris (France)",RdRp,Pasteur_IP2_F,ATGAGCTTAGTCCTGTTG | ||
"Institut Pasteur, Paris (France)",RdRp,Pasteur_IP2_P,AGATGTCTTGTGCTGCCGGTA | ||
"Institut Pasteur, Paris (France)",RdRp,Pasteur_IP2_R,CTCCCTTTGTTGTGTTGT | ||
"Institut Pasteur, Paris (France)",RdRp,Pasteur_IP4_F,GGTAACTGGTATGATTTCG | ||
"Institut Pasteur, Paris (France)",RdRp,Pasteur_IP4_P,TCATACAAACCACGCCAGG | ||
"Institut Pasteur, Paris (France)",RdRp,Pasteur_IP4_R,CTGGTCAAGGTTAATATAGG |
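Two details of `primers.csv` matter when reading it programmatically: the Institut Pasteur rows quote their first field because it contains a comma, and several sequences use IUPAC ambiguity codes (`R`, `Y`, `S`, `W`) rather than plain bases. The sketch below reads two of the rows with Python's `csv` module and expands a degenerate primer into a regular expression. It only illustrates the file format; how Nextclade itself uses the primers is not shown in this diff, and `primer_to_regex` is a hypothetical helper.

```python
import csv
import io
import re

# Standard IUPAC nucleotide codes mapped to regex fragments.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def primer_to_regex(sequence: str) -> re.Pattern:
    """Expand ambiguity codes so the primer can be matched against plain bases."""
    return re.compile("".join(IUPAC[base] for base in sequence.upper()))


# The header and two rows copied from primers.csv in the diff.
PRIMERS_CSV = '''Country (Institute),Target,Oligonucleotide,Sequence
Charité (Germany),RdRp,Charité_RdRp_F,GTGARATGGTCATGTGTGGCGG
"Institut Pasteur, Paris (France)",RdRp,Pasteur_IP2_F,ATGAGCTTAGTCCTGTTG
'''

rows = list(csv.DictReader(io.StringIO(PRIMERS_CSV)))
for row in rows:
    pattern = primer_to_regex(row["Sequence"])
    print(row["Country (Institute)"], row["Oligonucleotide"], pattern.pattern)
```

Using `csv.DictReader` instead of splitting on commas keeps `Institut Pasteur, Paris (France)` as a single field. The column header reads `Country (Institute)`, although the values in the file appear as the institute followed by the country in parentheses.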