Updated H3N2 reference to A/Perth/16/2009. #13

Closed (4 commits)
3 changes: 1 addition & 2 deletions Snakefile
@@ -11,7 +11,7 @@ lineages = ['h3n2', 'h1n1pdm']
resolutions = ['2y']

def reference_strain(wildcards):
-references = {'h3n2':"A/Beijing/32/1992",
+references = {'h3n2':"A/Perth/16/2009",
'h1n1pdm':"A/California/07/2009",
'vic':"B/HongKong/02/1993",
'yam':"B/Singapore/11/1994"
@@ -630,7 +630,6 @@ rule aggregate_cluster_trees:
--trees {input.trees} \
--output {output.tree}
"""

rule clean:
message: "Removing directories: {params}"
params:
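
Reviewer note on the `reference_strain` change in the Snakefile hunk above: the diff shows only the dictionary, not how the function uses it. The sketch below assumes the function returns the entry keyed by a `lineage` wildcard. That lookup, the module-level `REFERENCES` name, and the `SimpleNamespace` stand-in for Snakemake's wildcards object are assumptions made for illustration, not code from this PR.

```python
from types import SimpleNamespace

# Strain names are copied from the diff; the lookup logic is an assumption.
REFERENCES = {
    "h3n2": "A/Perth/16/2009",
    "h1n1pdm": "A/California/07/2009",
    "vic": "B/HongKong/02/1993",
    "yam": "B/Singapore/11/1994",
}


def reference_strain(wildcards):
    """Return the reference strain for the lineage in `wildcards` (hypothetical body)."""
    return REFERENCES[wildcards.lineage]


# SimpleNamespace stands in for the wildcards object Snakemake would pass.
print(reference_strain(SimpleNamespace(lineage="h3n2")))
```

With a lookup of this shape, only the `h3n2` entry changes behavior; the other three lineages keep their existing references.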
170 changes: 91 additions & 79 deletions config/reference_h3n2_ha.gb
@@ -1,88 +1,100 @@
LOCUS A/Beijing/32/1992 1701 bp DNA VRL 02-MAY-2006
DEFINITION Influenza A virus (A/Beijing/32/1992(H3N2)) hemagglutinin gene,
complete cds.
ACCESSION U26830
VERSION U26830.1 GI:857407
LOCUS KJ609206 1727 bp cRNA linear VRL 25-MAR-2015
DEFINITION Influenza A virus (A/Perth/16/2009(H3N2)) segment 4 hemagglutinin
(HA) gene, complete cds.
ACCESSION KJ609206
VERSION KJ609206.1
KEYWORDS .
SOURCE Influenza A virus (A/Beijing/32/1992(H3N2))
ORGANISM Influenza A virus (A/Beijing/32/1992(H3N2))
SOURCE Influenza A virus (A/Perth/16/2009(H3N2))
ORGANISM Influenza A virus (A/Perth/16/2009(H3N2))
Viruses; ssRNA viruses; ssRNA negative-strand viruses;
Orthomyxoviridae; Influenzavirus A.
REFERENCE 1 (bases 1 to 1701)
AUTHORS Muster,T., Ferko,B., Klima,A., Purtscher,M., Trkola,A., Schulz,P.,
Grassauer,A., Engelhardt,O.G., Garcia-Sastre,A., Palese,P. and
Katinger,H.
TITLE Mucosal model of immunization against human immunodeficiency virus
type 1 with a chimeric influenza virus
JOURNAL J. Virol. 69 (11), 6678-6686 (1995)
PUBMED 7474077
REFERENCE 2 (bases 1 to 1701)
AUTHORS Muster,T. and Klima,A.
REFERENCE 1 (bases 1 to 1727)
AUTHORS Oler,A.J. and Fabozzi,G.
TITLE Innate immune response of BEAS-2B to influenza A (H3N2) viruses
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 1727)
AUTHORS Oler,A.J. and Fabozzi,G.
TITLE Direct Submission
JOURNAL Submitted (11-MAY-1995) Thomas Muster, Institute of Applied
Microbiology, Nussdorfer Laende 11, Vienna, A-1190, Austria
JOURNAL Submitted (14-MAR-2014) Office of Cyber Infrastructure and
Computational Biology, National Institute of Allergy and Infectious
Diseases, 31 Center Drive, Room 3B62, Bethesda, MD 20892, USA
COMMENT GenBank Accession Numbers KJ609203-KJ609210 represent sequences
from the 8 segments of Influenza A virus (A/Perth/16/2009(H3N2)).

##Assembly-Data-START##
Assembly Method :: SOAPdenovo2-bin-LINUX-generic-r240
Coverage :: 3903
Sequencing Technology :: Illumina
##Assembly-Data-END##
FEATURES Location/Qualifiers
source 1..1701
/mol_type="genomic RNA"
/lab_host="MDBK cells"
/db_xref="taxon:380950"
source 1..1727
/organism="Influenza A virus (A/Perth/16/2009(H3N2))"
/mol_type="viral cRNA"
/strain="A/Perth/16/2009"
/serotype="H3N2"
/strain="A/Beijing/32/1992"
/host="Homo sapiens"
/organism="Influenza A virus (A/Beijing/32/1992(H3N2))"
CDS 1..1701
/db_xref="GI:857408"
/product="hemagglutinin"
/db_xref="taxon:654811"
/segment="4"
/country="Australia"
/collection_date="07-Apr-2009"
/note="passage details: E6, BEAS-2B 1"
misc_feature 1..1727
/db_xref="IRD:IRD-Perth.4"
gene 9..1709
/gene="HA"
CDS 9..1709
/gene="HA"
/function="receptor binding and fusion protein"
/codon_start=1
/translation="MKTIIALSYILCLVFAQKLPGNDNSTATLCLGHHAVPNGTLVKTI
TNDQIEVTNATELVQSSSTGRICDSPHRILDGKNCTLIDALLGDPHCDGFQNKEWDLFV
ERSKAYSNCYPYDVPDYASLRSLVASSGTLEFINEDFNWTGVAQDGGSYACKRGSVNSF
FSRLNWLHKSEYKYPALNVTMPNNGKFDKLYIWGVHHPSTDRDQTSLYVRASGRVTVST
KRSQQTVTPNIGSRPWVRGQSSRISIYWTIVKPGDILLINSTGNLIAPRGYFKIRNGKS
SIMRSDAPIGTCSSECITPNGSIPNDKPFQNVNRITYGACPRYVKQNTLKLATGMRNVP
EKQTRGIFGAIAGFIENGWEGMVDGWYGFRHQNSEGTGQAADLKSTQAAIDQINGKLNR
LIEKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDS
EMNKLFEKTRKQLRENAEDMGNGCFKIYHKCDNACIGSIRNGTYDHDVYRDEALNNRFQ
IKGVELKSGYKDWILWISFAISCFLLCVVLLGFIMWACQKGNIRCNICI"
/protein_id="AAA87553.1"
CDS 1..48
/product="Signal peptide"
/gene="SigPep"
CDS 49..1035
/product="HA1 protein"
/gene="HA1"
CDS 1036..1698
/product="HA2 protein"
/gene="HA2"

ORIGIN
1 atgaagacta tcattgcttt gagctacatt ttatgtctgg ttttcgctca aaaacttccc
61 ggaaatgaca acagcacagc aacgctgtgc ctgggacatc atgcagtgcc aaacggaacg
121 ctagtgaaaa caatcacgaa tgatcaaatt gaagtgacta atgctactga gctggttcag
181 agttcctcaa caggtagaat atgcgacagt cctcaccgaa tccttgatgg aaaaaactgc
241 acactgatag atgctctatt gggagaccct cattgtgatg gcttccaaaa taaggaatgg
301 gacctttttg ttgaacgcag caaagcttac agcaactgtt acccttatga tgtaccggat
361 tatgcctccc ttaggtcact agttgcctca tcaggcaccc tggagtttat caatgaagac
421 ttcaattgga ctggagtcgc tcaggatggg ggaagctatg cttgcaaaag gggatctgtt
481 aacagtttct ttagtagatt gaattggttg cacaaatcag aatacaaata tccagcgctg
541 aacgtgacta tgccaaacaa tggcaaattt gacaaattgt acatttgggg ggttcaccac
601 ccgagcacgg acagagacca aaccagccta tatgttcgag catcagggag agtcacagtc
661 tctaccaaaa gaagccaaca aactgtaacc ccgaatatcg ggtctagacc ctgggtaagg
721 ggtcagtcca gtagaataag catctattgg acaatagtaa aaccgggaga catacttttg
781 attaatagca cagggaatct aattgctcct cggggttact tcaaaatacg aaatgggaaa
841 agctcaataa tgaggtcaga tgcacccatt ggcacctgca gttctgaatg catcactcca
901 aatggaagca ttcccaatga caaacctttt caaaatgtaa acaggatcac atatggggcc
961 tgccccagat atgttaagca aaacactctg aaattggcaa cagggatgcg gaatgtacca
1021 gagaaacaaa ctagaggcat attcggcgca atcgcaggtt tcatagaaaa tggttgggag
1081 ggaatggtag acggttggta cggtttcagg catcaaaatt ctgagggcac aggacaagca
1141 gcagatctta aaagcactca agcagcaatc gaccaaatca acgggaaact gaataggtta
1201 atcgagaaaa cgaacgagaa attccatcaa atcgaaaaag aattctcaga agtagaaggg
1261 agaattcagg acctcgagaa atatgttgaa gacactaaaa tagatctctg gtcttacaac
1321 gcggagcttc ttgttgccct ggagaaccaa catacaattg atcttactga ctcagaaatg
1381 aacaaactgt ttgaaaaaac aaggaagcaa ctgagggaaa atgctgagga catgggcaat
1441 ggttgcttca aaatatacca caaatgtgac aatgcctgca tagggtcaat cagaaatgga
1501 acttatgacc atgatgtata cagagacgaa gcattaaaca accggttcca gatcaaaggt
1561 gttgagctga agtcaggata caaagattgg atcctgtgga tttcctttgc catatcatgc
1621 tttttgcttt gtgttgtttt gctggggttc atcatgtggg cctgccaaaa aggcaacatt
1681 aggtgtaaca tttgcatttg a
/product="hemagglutinin"
/protein_id="AHX37629.1"
/translation="MKTIIALSYILCLVFAQKLPGNDNSTATLCLGHHAVPNGTIVKT
ITNDQIEVTNATELVQSSSTGEICDSPHQILDGKNCTLIDALLGDPQCDGFQNKKWDL
FVERSKAYSNCYPYDVPDYASLRSLVASSGTLEFNNESFNWTGVTQNGTSSACIRRSK
NSFFSRLNWLTHLNFKYPALNVTMPNNEQFDKLYIWGVLHPGTDKDQIFLYAQASGRI
TVSTKRSQQTVSPNIGSRPRVRNIPSRISIYWTIVKPGDILLINSTGNLIAPRGYFKI
RSGKSSIMRSDAPIGKCNSECITPNGSIPNDKPFQNVNRITYGACPRYVKQNTLKLAT
GMRNVPEKQTRGIFGAIAGFIENGWEGMVDGWYGFRHQNSEGRGQAADLKSTQAAIDQ
INGKLNRLIGKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQ
HTIDLTDSEMNKLFEKTKKQLRENAEDMGNGCFKIYHKCDNACIGSIRNGTYDHDVYR
DEALNNRFQIKGVELKSGYKDWILWISFAISCFLLCVALLGFIMWACQKGNIRCNICI
"
sig_peptide 9..56
/gene="HA"
mat_peptide 57..1043
/gene="HA"
/product="HA1"
mat_peptide 1044..1706
/gene="HA"
/product="HA2"
ORIGIN
1 tattaaccat gaagactatc attgctttga gctacattct atgtctggtt ttcgctcaaa
61 aacttcctgg aaatgacaac agcacggcaa cgctgtgcct tgggcaccat gcagtaccaa
121 acggaacgat agtgaaaaca atcacgaatg accaaattga agttactaat gctactgagc
181 tggttcagag ttcctcaaca ggtgaaatat gcgacagtcc tcatcagatc cttgatggaa
241 aaaactgcac actaatagat gctctattgg gagaccctca gtgtgatggc ttccaaaata
301 agaaatggga cctttttgtt gaacgcagca aagcctacag caactgttac ccttatgatg
361 tgccggatta tgcctccctt aggtcactag ttgcctcatc cggcacactg gagtttaaca
421 atgaaagctt caattggact ggagtcactc aaaacggaac aagctctgct tgcataagga
481 gatctaaaaa cagtttcttt agtagattga attggttgac ccacttaaac ttcaaatacc
541 cagcattgaa cgtgactatg ccaaacaatg aacaatttga caaattgtac atttgggggg
601 ttctccaccc gggtacggac aaagaccaaa tcttcctgta tgctcaagca tcaggaagaa
661 tcacagtctc taccaaaaga agccaacaaa ccgtaagccc gaatatcgga tctagaccca
721 gagtaaggaa tatccctagc agaataagca tctattggac aatagtaaaa ccgggagaca
781 tacttttgat taacagcaca gggaatctaa ttgctcctag gggttacttc aaaatacgaa
841 gtgggaaaag ctcaataatg agatcagatg cacccattgg caaatgcaat tctgaatgca
901 tcactccaaa tggaagcatt cccaatgaca aaccattcca aaatgtaaac aggatcacat
961 acggggcctg tcccagatat gttaagcaaa acactctgaa attggcaaca gggatgcgaa
1021 atgtaccaga gaaacaaact agaggcatat ttggcgcaat cgcgggtttc atagaaaatg
1081 gttgggaggg aatggtggat ggttggtacg gtttcaggca tcaaaattct gagggaagag
1141 gacaagcagc agatctcaaa agcactcaag cagcaatcga tcaaatcaat gggaagctga
1201 atagattgat cgggaaaacc aacgagaaat tccatcagat tgaaaaagaa ttctcagaag
1261 tcgaagggag aattcaggac cttgagaaat atgttgagga cactaaaata gatctctggt
1321 catacaacgc ggagcttctt gttgccctgg agaaccaaca tacaattgat ctaactgact
1381 cagaaatgaa caaactgttt gaaaaaacaa agaagcaact gagggaaaat gctgaggata
1441 tgggcaatgg ttgtttcaaa atataccaca aatgtgacaa tgcctgcata ggatcaatca
1501 gaaatggaac ttatgaccac gatgtataca gagatgaagc attaaacaac cggtttcaga
1561 tcaagggagt tgagctgaag tcagggtaca aagattggat cctatggatt tcctttgcca
1621 tatcatgttt tttgctttgt gttgctttgt tggggttcat catgtgggcc tgccaaaaag
1681 gcaacattag gtgcaacatt tgcatttgag tgcattaatt aaaaaca
//
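
Reviewer note on the HA record above: the new A/Perth/16/2009 sequence starts 8 nt upstream of the start codon, so every HA feature coordinate moves by 8 relative to the old A/Beijing/32/1992 record. Anything downstream that hard-codes nucleotide positions against the old reference would need the same shift. The sketch below checks this using only the coordinates shown in the diff. The dictionary and function names are illustrative. Note that the old record annotated the signal peptide, HA1, and HA2 as `CDS` features, while the new one uses `sig_peptide` and `mat_peptide`.

```python
# Feature coordinates (1-based, inclusive) copied from the two GenBank records in this diff.
old_ha = {"CDS": (1, 1701), "sig_peptide": (1, 48), "HA1": (49, 1035), "HA2": (1036, 1698)}
new_ha = {"CDS": (9, 1709), "sig_peptide": (9, 56), "HA1": (57, 1043), "HA2": (1044, 1706)}


def offsets(old, new):
    """Return the set of (start shift, end shift) pairs over the features in `old`."""
    return {(new[k][0] - old[k][0], new[k][1] - old[k][1]) for k in old}


def span_length(span):
    """Length in nucleotides of a 1-based inclusive (start, end) span."""
    start, end = span
    return end - start + 1


shift = offsets(old_ha, new_ha)
cds_nt = span_length(new_ha["CDS"])

print(shift)            # a single pair means every feature moved by the same amount
print(cds_nt)           # coding length in nucleotides, including the stop codon
print(cds_nt // 3 - 1)  # number of amino acids, excluding the stop codon
```

The coding length is unchanged between the two records; only the position of the coding region within the sequence differs.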
136 changes: 67 additions & 69 deletions config/reference_h3n2_mp.gb
@@ -1,84 +1,82 @@
LOCUS CY113678 999 bp DNA VRL 04-APR-2012
DEFINITION Influenza A virus (A/Beijing/32/1992(H3N2)) matrix protein 2 (M2)
and matrix protein 1 (M1) genes, complete cds.
ACCESSION CY113678
VERSION CY113678.1
LOCUS KJ609209 1001 bp cRNA linear VRL 25-MAR-2015
DEFINITION Influenza A virus (A/Perth/16/2009(H3N2)) segment 7 matrix protein
2 (M2) and matrix protein 1 (M1) genes, complete cds.
ACCESSION KJ609209
VERSION KJ609209.1
KEYWORDS .
SOURCE Influenza A virus (A/Beijing/32/1992(H3N2))
ORGANISM Influenza A virus (A/Beijing/32/1992(H3N2))
SOURCE Influenza A virus (A/Perth/16/2009(H3N2))
ORGANISM Influenza A virus (A/Perth/16/2009(H3N2))
Viruses; ssRNA viruses; ssRNA negative-strand viruses;
Orthomyxoviridae; Influenzavirus A.
REFERENCE 1 (bases 1 to 999)
AUTHORS Wentworth,D.E., Dugan,V., Halpin,R., Lin,X., Bera,J., Ghedin,E.,
Fedorova,N., Overton,L., Tsitrin,T., Stockwell,T., Amedeo,P.,
Bishop,B., Chen,H., Edworthy,P., Gupta,N., Katzel,D., Li,K.,
Schobel,S., Shrivastava,S., Thovarai,V., Wang,S., Westgeest,K.B.,
van Beek,R., Bestebroer,T.M., de Jong,J.C., Rimmelzwaan,G.F.,
Osterhaus,A.D.M.E., Fouchier,R.A.M., Bao,Y., Sanders,R.,
Dernovoy,D., Kiryutin,B., Lipman,D.J. and Tatusova,T.
TITLE The NIAID Influenza Genome Sequencing Project
REFERENCE 1 (bases 1 to 1001)
AUTHORS Oler,A.J. and Fabozzi,G.
TITLE Innate immune response of BEAS-2B to influenza A (H3N2) viruses
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 999)
CONSRTM The NIAID Influenza Genome Sequencing Consortium
REFERENCE 2 (bases 1 to 1001)
AUTHORS Oler,A.J. and Fabozzi,G.
TITLE Direct Submission
JOURNAL Submitted (04-APR-2012) on behalf of JCVI/Erasmus Medical
College/NCBI, National Center for Biotechnology Information, NIH,
Bethesda, MD 20894, USA
COMMENT This work was supported by the National Institute of Allergy and
Infectious Diseases (NIAID), Genome Sequencing Centers for
Infectious Diseases (GSCID) program.
JOURNAL Submitted (14-MAR-2014) Office of Cyber Infrastructure and
Computational Biology, National Institute of Allergy and Infectious
Diseases, 31 Center Drive, Room 3B62, Bethesda, MD 20892, USA
COMMENT GenBank Accession Numbers KJ609203-KJ609210 represent sequences
from the 8 segments of Influenza A virus (A/Perth/16/2009(H3N2)).

##Assembly-Data-START##
Assembly Method :: SOAPdenovo2-bin-LINUX-generic-r240
Coverage :: 5075
Sequencing Technology :: Illumina
##Assembly-Data-END##
FEATURES Location/Qualifiers
source 1..999
/bio_material="CEIRS#CIP047BE3292#"
/collection_date="1992"
/country="China: Beijing"
/db_xref="taxon:380950"
/host="Homo sapiens"
/lab_host="xtMK3 MDCK1 passage(s)"
source 1..1001
/organism="Influenza A virus (A/Perth/16/2009(H3N2))"
/mol_type="viral cRNA"
/organism="Influenza A virus (A/Beijing/32/1992(H3N2))"
/segment="7"
/strain="A/Perth/16/2009"
/serotype="H3N2"
/strain="A/Beijing/32/1992"
misc_feature 1..999
/db_xref="IRD:NIGSP_CEIRS_CIP047_RFH3_00116.MP"
gene 11..992
/host="Homo sapiens"
/db_xref="taxon:654811"
/segment="7"
/country="Australia"
/collection_date="07-Apr-2009"
/note="passage details: E6, BEAS-2B 1"
misc_feature 1..1001
/db_xref="IRD:IRD-Perth.7"
gene 15..996
/gene="M2"
CDS join(11..36,725..992)
/codon_start=1
CDS join(15..40,729..996)
/gene="M2"
/codon_start=1
/product="matrix protein 2"
/protein_id="AFG99679.1"
/translation="MSLLTEVETPIRNEWGCRCNDSSDPLVVAASIIGILHLILWILDR
LFFKCIYRLFKHGLKRGPSTEGVPESMREEYRKEQQNAVDADDSHFVSIELE"
gene 11..769
/protein_id="AHX37633.1"
/translation="MSLLTEVETPIRNEWGCRCNDSSDPLVVAANIIGILHLILWILD
RLFFKCVYRLFKHGLKRGPSTEGVPESMREEYRKEQQNAVDADDSHFVSIELE"
gene 15..773
/gene="M1"
CDS 11..769
/codon_start=1
CDS 15..773
/gene="M1"
/codon_start=1
/product="matrix protein 1"
/protein_id="AFG99678.1"
/translation="MSLLTEVETYVLSIVPSGPLKAEIAQRLEDVFAGKNTDLEALMEW
LKTRPILSPLTKGILGFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDRAVKLYRKLK
REITFHGAKEIALSYSAGALASCMGLIYNRMGAVTTEVAFGLVCATCEQIADSQHRSHR
QMVATTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAEAMEIASQARQMVQAMRAIGTH
PSSSAGLKDDLLENLQTYQKRMGVQMQRFK"
ORIGIN
1 atattgaaag atgagccttc taaccgaggt cgaaacgtat gttctctcta tcgttccatc
61 aggccccctc aaagccgaaa tcgcgcagag acttgaagat gtctttgctg ggaaaaacac
121 agatcttgag gctctcatgg aatggctaaa gacaagacca atcctgtcac ctctgactaa
181 ggggattttg gggtttgtgt tcacgctcac cgtgcccagt gagcgaggac tgcagcgtag
241 acgctttgtc caaaatgccc tcaatgggaa tggggatcca aataacatgg acagagcagt
301 taaactatat agaaaactta agagggagat tacattccat ggggccaaag aaatagcact
361 cagttattct gctggtgcac ttgccagttg catgggcctc atatacaaca gaatgggggc
421 tgtaaccact gaagtggcct ttggcctggt atgtgcaaca tgtgaacaga ttgctgactc
481 ccagcacagg tctcataggc aaatggtggc aacaaccaat ccattaataa ggcatgagaa
541 cagaatggtt ttggccagca ctacagctaa ggctatggag caaatggctg gatcaagtga
601 gcaggcagcg gaggccatgg aaattgctag tcaggccagg caaatggtgc aggcaatgag
661 agccattggg actcatccta gctccagtgc tggtctaaaa gatgatcttc ttgaaaattt
721 gcagacctat cagaaacgaa tgggggtgca gatgcaacga ttcaagtgac ccgcttgttg
781 ttgctgcgag tatcattggg atattgcact tgatattgtg gattcttgat cgtctttttt
841 tcaaatgcat ctatcgactc ttcaaacacg gcctgaaaag agggccttct acggaaggag
901 tacctgagtc tatgagggaa gaatatcgaa aggaacagca gaatgctgtg gatgctgacg
961 acagtcattt tgtcagcata gagctggagt aaaaaacta
/protein_id="AHX37632.1"
/translation="MSLLTEVETYVLSIVPSGPLKAEIAQRLEDVFAGKNTDLEALME
WLKTRPILSPLTKGILGFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDKAVKLYRK
LKREITFHGAKEIALSYSAGALASCMGLIYNRMGAVTTEVAFGLVCATCEQIADSQHR
SHRQMVATTNPLIKHENRMVLASTTAKAMEQMAGSSEQAAEAMEIASQARQMVQAMRA
IGTHPSSSTGLRDDLLENLQTYQKRMGVQMQRFK"
ORIGIN
1 gtagatgttg aaagatgagc cttctaaccg aggtcgaaac gtatgttctc tctatcgttc
61 catcaggccc cctcaaagcc gagatcgcgc agagacttga agatgtcttt gctgggaaaa
121 acacagatct tgaggctctc atggaatggc taaagacaag accaattctg tcacctctga
181 ctaaggggat tttggggttt gtgttcacgc tcaccgtgcc cagtgagcga ggactgcagc
241 gtagacgctt tgtccaaaat gccctcaatg ggaatggaga cccaaataac atggacaaag
301 cagttaaact gtataggaaa cttaagagag agataacgtt ccatggggcc aaagaaatag
361 ctctcagtta ttccgctggt gcacttgcca gttgcatggg cctcatatac aataggatgg
421 gggctgtaac cactgaagtg gcatttggcc tggtatgtgc aacatgtgag cagattgctg
481 actcccagca caggtctcat aggcagatgg tggcaacaac caatccatta ataaaacatg
541 agaacagaat ggttttggcc agcactacag ctaaggctat ggagcaaatg gctggatcaa
601 gtgaacaggc agcggaggcc atggagattg ctagtcaggc caggcagatg gtgcaggcaa
661 tgagagccat tgggactcat cctagttcca gtactggttt aagagatgat cttcttgaaa
721 atttgcagac ctatcagaaa cgaatggggg tgcagatgca acgattcaag tgacccgctt
781 gttgttgccg cgaatatcat tgggatcttg cacttgatat tgtggattct tgatcgtctt
841 tttttcaaat gcgtctatcg actcttcaaa cacggcctta aaagaggccc ttctacggaa
901 ggagtacctg agtctatgag ggaagaatat cgaaaggaac agcagaatgc tgtggatgct
961 gacgacagtc attttgtcag catagagttg gagtaaaaaa c
//
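
Reviewer note on the MP record above: the M1 and M2 coordinates all move by 4 nt (for example `11..769` becomes `15..773`), and the spliced M2 CDS keeps the same exon lengths. The sketch below parses the location strings shown in the diff and compares coding lengths. The small parser is illustrative only and handles just the plain `a..b` and `join(...)` forms that appear here, not the full GenBank location syntax.

```python
def parse_location(loc):
    """Parse 'a..b' or 'join(a..b,c..d)' into a list of (start, end) tuples."""
    inner = loc[len("join("):-1] if loc.startswith("join(") else loc
    return [tuple(int(x) for x in part.split("..")) for part in inner.split(",")]


def coding_length(loc):
    """Total nucleotides covered by a 1-based inclusive location string."""
    return sum(end - start + 1 for start, end in parse_location(loc))


# Location strings copied from the old and new records in this diff.
old_mp = {"M2": "join(11..36,725..992)", "M1": "11..769"}
new_mp = {"M2": "join(15..40,729..996)", "M1": "15..773"}

for gene in new_mp:
    nt = coding_length(new_mp[gene])
    # Coding length should be identical before and after the reference swap.
    assert nt == coding_length(old_mp[gene])
    # Columns: gene, nucleotides including stop codon, amino acids excluding stop.
    print(gene, nt, nt // 3 - 1)
```

The amino acid counts this prints can be compared against the lengths of the `/translation` qualifiers in the new record as a quick consistency check.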