# eng-sla

## opus-2020-07-06.zip

* dataset: opus
* model: transformer
* source language(s): eng
* target language(s): bel bel_Latn bos_Latn bul bul_Latn ces csb_Latn dsb hrv hsb mkd orv_Cyrl pol rue rus slv srp_Cyrl srp_Latn ukr
* pre-processing: normalization + SentencePiece (spm32k,spm32k)
* a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
* download: opus-2020-07-06.zip
* test set translations: opus-2020-07-06.test.txt
* test set scores: opus-2020-07-06.eval.txt
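Because this is a multilingual model, every input sentence must start with the `>>id<<` token naming the target language, where `id` is one of the target language IDs listed above. A minimal sketch of the required preprocessing step (the helper name is illustrative, not part of the release):

```python
def add_lang_token(sentence: str, target_lang: str) -> str:
    """Prepend the sentence-initial >>id<< token the model expects.

    `target_lang` must be one of the valid target language IDs
    listed above (e.g. "rus", "pol", "ukr").
    """
    return f">>{target_lang}<< {sentence}"

# Example: the same English sentence, prepared for two different targets.
batch = [
    add_lang_token("How are you?", "rus"),
    add_lang_token("How are you?", "pol"),
]
print(batch[0])  # >>rus<< How are you?
print(batch[1])  # >>pol<< How are you?
```

The token is added before SentencePiece segmentation; without it the model has no way to know which of the 19 target languages to produce.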

### Benchmarks

| testset | BLEU | chr-F |
|---------|------|-------|
| Tatoeba-test.eng-bel.eng.bel | 21.8 | 0.476 |
| Tatoeba-test.eng-bul.eng.bul | 46.2 | 0.646 |
| Tatoeba-test.eng-ces.eng.ces | 41.3 | 0.614 |
| Tatoeba-test.eng-csb.eng.csb | 1.3 | 0.195 |
| Tatoeba-test.eng-dsb.eng.dsb | 2.2 | 0.065 |
| Tatoeba-test.eng-hbs.eng.hbs | 1.0 | 0.080 |
| Tatoeba-test.eng-hsb.eng.hsb | 4.4 | 0.237 |
| Tatoeba-test.eng-mkd.eng.mkd | 43.0 | 0.625 |
| Tatoeba-test.eng.multi | 39.5 | 0.601 |
| Tatoeba-test.eng-orv.eng.orv | 0.6 | 0.115 |
| Tatoeba-test.eng-pol.eng.pol | 40.9 | 0.627 |
| Tatoeba-test.eng-rue.eng.rue | 0.9 | 0.106 |
| Tatoeba-test.eng-rus.eng.rus | 39.2 | 0.602 |
| Tatoeba-test.eng-slv.eng.slv | 17.9 | 0.349 |
| Tatoeba-test.eng-ukr.eng.ukr | 37.3 | 0.588 |

## opus-2020-07-14.zip

* dataset: opus
* model: transformer
* source language(s): eng
* target language(s): bel bel_Latn bos_Latn bul bul_Latn ces csb_Latn dsb hrv hsb mkd orv_Cyrl pol rue rus slv srp_Cyrl srp_Latn ukr
* pre-processing: normalization + SentencePiece (spm32k,spm32k)
* a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
* download: opus-2020-07-14.zip
* test set translations: opus-2020-07-14.test.txt
* test set scores: opus-2020-07-14.eval.txt

### Benchmarks

| testset | BLEU | chr-F |
|---------|------|-------|
| Tatoeba-test.eng-bel.eng.bel | 22.0 | 0.485 |
| Tatoeba-test.eng-bul.eng.bul | 45.7 | 0.643 |
| Tatoeba-test.eng-ces.eng.ces | 41.0 | 0.612 |
| Tatoeba-test.eng-csb.eng.csb | 3.2 | 0.217 |
| Tatoeba-test.eng-dsb.eng.dsb | 1.6 | 0.168 |
| Tatoeba-test.eng-hsb.eng.hsb | 9.4 | 0.304 |
| Tatoeba-test.eng-mkd.eng.mkd | 43.4 | 0.628 |
| Tatoeba-test.eng.multi | 39.8 | 0.599 |
| Tatoeba-test.eng-orv.eng.orv | 0.4 | 0.013 |
| Tatoeba-test.eng-pol.eng.pol | 40.6 | 0.626 |
| Tatoeba-test.eng-rue.eng.rue | 0.3 | 0.017 |
| Tatoeba-test.eng-rus.eng.rus | 39.1 | 0.599 |
| Tatoeba-test.eng-slv.eng.slv | 18.4 | 0.354 |
| Tatoeba-test.eng-ukr.eng.ukr | 37.2 | 0.585 |

## opus-2020-07-27.zip

* dataset: opus
* model: transformer
* source language(s): eng
* target language(s): bel bel_Latn bos_Latn bul bul_Latn ces csb_Latn dsb hrv hsb mkd orv_Cyrl pol rue rus slv srp_Cyrl srp_Latn ukr
* pre-processing: normalization + SentencePiece (spm32k,spm32k)
* a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
* download: opus-2020-07-27.zip
* test set translations: opus-2020-07-27.test.txt
* test set scores: opus-2020-07-27.eval.txt

### Benchmarks

| testset | BLEU | chr-F |
|---------|------|-------|
| newssyscomb2009-engces.eng.ces | 20.1 | 0.483 |
| news-test2008-engces.eng.ces | 17.2 | 0.456 |
| newstest2009-engces.eng.ces | 18.5 | 0.474 |
| newstest2010-engces.eng.ces | 18.8 | 0.479 |
| newstest2011-engces.eng.ces | 19.7 | 0.480 |
| newstest2012-engces.eng.ces | 17.7 | 0.457 |
| newstest2012-engrus.eng.rus | 26.7 | 0.546 |
| newstest2013-engces.eng.ces | 20.8 | 0.483 |
| newstest2013-engrus.eng.rus | 20.7 | 0.487 |
| newstest2015-encs-engces.eng.ces | 20.7 | 0.492 |
| newstest2015-enru-engrus.eng.rus | 23.8 | 0.528 |
| newstest2016-encs-engces.eng.ces | 23.2 | 0.511 |
| newstest2016-enru-engrus.eng.rus | 22.1 | 0.513 |
| newstest2017-encs-engces.eng.ces | 18.5 | 0.469 |
| newstest2017-enru-engrus.eng.rus | 24.4 | 0.535 |
| newstest2018-encs-engces.eng.ces | 18.6 | 0.474 |
| newstest2018-enru-engrus.eng.rus | 21.3 | 0.518 |
| newstest2019-encs-engces.eng.ces | 19.6 | 0.480 |
| newstest2019-enru-engrus.eng.rus | 23.2 | 0.499 |
| Tatoeba-test.eng-bel.eng.bel | 22.2 | 0.484 |
| Tatoeba-test.eng-bul.eng.bul | 46.0 | 0.646 |
| Tatoeba-test.eng-ces.eng.ces | 41.9 | 0.617 |
| Tatoeba-test.eng-csb.eng.csb | 3.0 | 0.214 |
| Tatoeba-test.eng-dsb.eng.dsb | 1.3 | 0.162 |
| Tatoeba-test.eng-hbs.eng.hbs | 40.0 | 0.613 |
| Tatoeba-test.eng-hsb.eng.hsb | 13.7 | 0.316 |
| Tatoeba-test.eng-mkd.eng.mkd | 43.9 | 0.632 |
| Tatoeba-test.eng.multi | 40.3 | 0.604 |
| Tatoeba-test.eng-orv.eng.orv | 0.4 | 0.011 |
| Tatoeba-test.eng-pol.eng.pol | 40.9 | 0.629 |
| Tatoeba-test.eng-rue.eng.rue | 0.3 | 0.012 |
| Tatoeba-test.eng-rus.eng.rus | 39.7 | 0.605 |
| Tatoeba-test.eng-slv.eng.slv | 18.7 | 0.353 |
| Tatoeba-test.eng-ukr.eng.ukr | 37.3 | 0.588 |

## opus2m-2020-08-01.zip

* dataset: opus2m
* model: transformer
* source language(s): eng
* target language(s): bel bel_Latn bos_Latn bul bul_Latn ces csb_Latn dsb hrv hsb mkd orv_Cyrl pol rue rus slv srp_Cyrl srp_Latn ukr
* pre-processing: normalization + SentencePiece (spm32k,spm32k)
* a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
* download: opus2m-2020-08-01.zip
* test set translations: opus2m-2020-08-01.test.txt
* test set scores: opus2m-2020-08-01.eval.txt

### Benchmarks

| testset | BLEU | chr-F |
|---------|------|-------|
| newssyscomb2009-engces.eng.ces | 20.1 | 0.484 |
| news-test2008-engces.eng.ces | 17.7 | 0.461 |
| newstest2009-engces.eng.ces | 19.1 | 0.479 |
| newstest2010-engces.eng.ces | 19.3 | 0.483 |
| newstest2011-engces.eng.ces | 20.4 | 0.486 |
| newstest2012-engces.eng.ces | 18.3 | 0.461 |
| newstest2012-engrus.eng.rus | 27.4 | 0.551 |
| newstest2013-engces.eng.ces | 21.5 | 0.489 |
| newstest2013-engrus.eng.rus | 20.9 | 0.490 |
| newstest2015-encs-engces.eng.ces | 21.1 | 0.496 |
| newstest2015-enru-engrus.eng.rus | 24.5 | 0.536 |
| newstest2016-encs-engces.eng.ces | 23.6 | 0.515 |
| newstest2016-enru-engrus.eng.rus | 23.0 | 0.519 |
| newstest2017-encs-engces.eng.ces | 19.2 | 0.474 |
| newstest2017-enru-engrus.eng.rus | 25.0 | 0.541 |
| newstest2018-encs-engces.eng.ces | 19.3 | 0.479 |
| newstest2018-enru-engrus.eng.rus | 22.3 | 0.526 |
| newstest2019-encs-engces.eng.ces | 20.4 | 0.486 |
| newstest2019-enru-engrus.eng.rus | 24.0 | 0.506 |
| Tatoeba-test.eng-bel.eng.bel | 22.9 | 0.489 |
| Tatoeba-test.eng-bul.eng.bul | 46.7 | 0.652 |
| Tatoeba-test.eng-ces.eng.ces | 42.7 | 0.624 |
| Tatoeba-test.eng-csb.eng.csb | 1.4 | 0.210 |
| Tatoeba-test.eng-dsb.eng.dsb | 1.4 | 0.165 |
| Tatoeba-test.eng-hbs.eng.hbs | 40.3 | 0.616 |
| Tatoeba-test.eng-hsb.eng.hsb | 14.3 | 0.344 |
| Tatoeba-test.eng-mkd.eng.mkd | 44.1 | 0.635 |
| Tatoeba-test.eng.multi | 41.0 | 0.610 |
| Tatoeba-test.eng-orv.eng.orv | 0.3 | 0.014 |
| Tatoeba-test.eng-pol.eng.pol | 42.0 | 0.637 |
| Tatoeba-test.eng-rue.eng.rue | 0.3 | 0.012 |
| Tatoeba-test.eng-rus.eng.rus | 40.5 | 0.612 |
| Tatoeba-test.eng-slv.eng.slv | 18.8 | 0.357 |
| Tatoeba-test.eng-ukr.eng.ukr | 38.8 | 0.600 |