zle-zle

opus-2020-07-27.zip

  • dataset: opus
  • model: transformer
  • source language(s): bel bel_Latn orv_Cyrl rus ukr
  • target language(s): bel bel_Latn orv_Cyrl rus ukr
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required (id = a valid target language ID)
  • download: opus-2020-07-27.zip
  • test set translations: opus-2020-07-27.test.txt
  • test set scores: opus-2020-07-27.eval.txt
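Since the model is multilingual on both the source and target side, every input sentence must carry the sentence-initial target-language token described above. A minimal sketch of that tagging step (the helper name is ours, not part of the release):

```python
def add_target_token(sentence: str, target_lang: str) -> str:
    """Prepend the sentence-initial >>id<< token that multilingual
    OPUS-MT models use to select the target language."""
    return f">>{target_lang}<< {sentence}"

# e.g. request a Ukrainian translation of a Russian sentence
tagged = add_target_token("Как дела?", "ukr")
print(tagged)  # >>ukr<< Как дела?
```

The tagged string, not the raw sentence, is what gets passed through normalization and SentencePiece before decoding.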

Benchmarks

testset BLEU chr-F
Tatoeba-test.bel-rus.bel.rus 57.1 0.758
Tatoeba-test.bel-ukr.bel.ukr 55.5 0.751
Tatoeba-test.multi.multi 58.0 0.742
Tatoeba-test.orv-rus.orv.rus 5.8 0.226
Tatoeba-test.orv-ukr.orv.ukr 2.5 0.161
Tatoeba-test.rus-bel.rus.bel 50.5 0.714
Tatoeba-test.rus-orv.rus.orv 0.3 0.129
Tatoeba-test.rus-ukr.rus.ukr 63.9 0.794
Tatoeba-test.ukr-bel.ukr.bel 51.3 0.719
Tatoeba-test.ukr-orv.ukr.orv 0.3 0.106
Tatoeba-test.ukr-rus.ukr.rus 68.7 0.825

opus-2020-09-26.zip

  • dataset: opus
  • model: transformer
  • source language(s): bel bel_Latn eng orv_Cyrl rue rus ukr
  • target language(s): bel bel_Latn eng orv_Cyrl rue rus ukr
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required (id = a valid target language ID)
  • download: opus-2020-09-26.zip
  • test set translations: opus-2020-09-26.test.txt
  • test set scores: opus-2020-09-26.eval.txt

Benchmarks

testset BLEU chr-F
newstest2012-engrus.eng.rus 26.4 0.538
newstest2012-ruseng.rus.eng 29.8 0.568
newstest2013-engrus.eng.rus 20.1 0.479
newstest2013-ruseng.rus.eng 23.8 0.512
newstest2014-ruen-ruseng.rus.eng 26.4 0.550
newstest2015-enru-engrus.eng.rus 22.5 0.515
newstest2015-enru-ruseng.rus.eng 25.1 0.528
newstest2016-enru-engrus.eng.rus 21.5 0.502
newstest2016-enru-ruseng.rus.eng 23.8 0.521
newstest2017-enru-engrus.eng.rus 23.5 0.521
newstest2017-enru-ruseng.rus.eng 27.3 0.546
newstest2018-enru-engrus.eng.rus 20.2 0.506
newstest2018-enru-ruseng.rus.eng 23.1 0.517
newstest2019-enru-engrus.eng.rus 22.0 0.484
newstest2019-ruen-ruseng.rus.eng 26.0 0.536
Tatoeba-test.bel-eng.bel.eng 36.9 0.565
Tatoeba-test.bel-rus.bel.rus 57.3 0.755
Tatoeba-test.bel-ukr.bel.ukr 54.3 0.743
Tatoeba-test.eng-bel.eng.bel 19.2 0.450
Tatoeba-test.eng-orv.eng.orv 0.3 0.137
Tatoeba-test.eng-rue.eng.rue 0.3 0.149
Tatoeba-test.eng-rus.eng.rus 38.5 0.595
Tatoeba-test.eng-ukr.eng.ukr 36.5 0.577
Tatoeba-test.multi.multi 47.5 0.655
Tatoeba-test.orv-eng.orv.eng 6.8 0.209
Tatoeba-test.orv-rus.orv.rus 4.8 0.226
Tatoeba-test.orv-ukr.orv.ukr 3.7 0.180
Tatoeba-test.rue-eng.rue.eng 17.7 0.353
Tatoeba-test.rus-bel.rus.bel 44.0 0.658
Tatoeba-test.rus-eng.rus.eng 50.1 0.659
Tatoeba-test.rus-orv.rus.orv 0.2 0.153
Tatoeba-test.rus-ukr.rus.ukr 63.1 0.789
Tatoeba-test.ukr-bel.ukr.bel 47.6 0.683
Tatoeba-test.ukr-eng.ukr.eng 49.3 0.653
Tatoeba-test.ukr-orv.ukr.orv 0.4 0.136
Tatoeba-test.ukr-rus.ukr.rus 68.9 0.824

opus-2020-10-04.zip

  • dataset: opus
  • model: transformer
  • source language(s): bel bel_Latn eng orv_Cyrl rue rus ukr
  • target language(s): bel bel_Latn eng orv_Cyrl rue rus ukr
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required (id = a valid target language ID)
  • download: opus-2020-10-04.zip
  • test set translations: opus-2020-10-04.test.txt
  • test set scores: opus-2020-10-04.eval.txt

Benchmarks

testset BLEU chr-F
newstest2012-engrus.eng.rus 26.4 0.538
newstest2012-ruseng.rus.eng 29.6 0.568
newstest2013-engrus.eng.rus 20.4 0.481
newstest2013-ruseng.rus.eng 23.8 0.513
newstest2014-ruen-ruseng.rus.eng 26.4 0.550
newstest2015-enru-engrus.eng.rus 22.4 0.515
newstest2015-enru-ruseng.rus.eng 25.2 0.529
newstest2016-enru-engrus.eng.rus 21.6 0.505
newstest2016-enru-ruseng.rus.eng 23.9 0.521
newstest2017-enru-engrus.eng.rus 23.6 0.524
newstest2017-enru-ruseng.rus.eng 27.5 0.546
newstest2018-enru-engrus.eng.rus 20.4 0.509
newstest2018-enru-ruseng.rus.eng 23.3 0.518
newstest2019-enru-engrus.eng.rus 22.3 0.487
newstest2019-ruen-ruseng.rus.eng 26.0 0.537
Tatoeba-test.bel-eng.bel.eng 36.7 0.562
Tatoeba-test.bel-rus.bel.rus 56.9 0.753
Tatoeba-test.bel-ukr.bel.ukr 54.5 0.742
Tatoeba-test.eng-bel.eng.bel 19.3 0.452
Tatoeba-test.eng-orv.eng.orv 0.3 0.138
Tatoeba-test.eng-rue.eng.rue 0.8 0.152
Tatoeba-test.eng-rus.eng.rus 38.9 0.597
Tatoeba-test.eng-ukr.eng.ukr 36.3 0.578
Tatoeba-test.multi.multi 47.5 0.656
Tatoeba-test.orv-eng.orv.eng 6.3 0.210
Tatoeba-test.orv-rus.orv.rus 5.3 0.224
Tatoeba-test.orv-ukr.orv.ukr 3.1 0.175
Tatoeba-test.rue-eng.rue.eng 18.1 0.351
Tatoeba-test.rus-bel.rus.bel 44.0 0.659
Tatoeba-test.rus-eng.rus.eng 50.4 0.662
Tatoeba-test.rus-orv.rus.orv 0.2 0.149
Tatoeba-test.rus-ukr.rus.ukr 63.0 0.789
Tatoeba-test.ukr-bel.ukr.bel 46.7 0.680
Tatoeba-test.ukr-eng.ukr.eng 49.5 0.655
Tatoeba-test.ukr-orv.ukr.orv 0.4 0.138
Tatoeba-test.ukr-rus.ukr.rus 68.8 0.824

opus-2021-02-23.zip

  • dataset: opus
  • model: transformer
  • source language(s): bel orv rus ukr
  • target language(s): bel orv rus ukr
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required (id = a valid target language ID)
  • valid language labels: >>eng<< >>ukr<< >>rus<< >>bel<< >>bel_Latn<<
  • download: opus-2021-02-23.zip
  • test set translations: opus-2021-02-23.test.txt
  • test set scores: opus-2021-02-23.eval.txt
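This release lists its accepted language labels explicitly, so the token-tagging step can be guarded against typos. A hedged sketch, with the label set copied from the bullet above (the helper itself is illustrative, not part of the release):

```python
# Labels accepted by the opus-2021-02-23 release, per the list above.
VALID_LABELS = {"eng", "ukr", "rus", "bel", "bel_Latn"}

def tag_for(target_lang: str, sentence: str) -> str:
    """Prepend the sentence-initial >>id<< token, rejecting unknown IDs."""
    if target_lang not in VALID_LABELS:
        raise ValueError(f"unknown target language ID: {target_lang!r}")
    return f">>{target_lang}<< {sentence}"

print(tag_for("bel", "Добры дзень"))  # >>bel<< Добры дзень
```

Rejecting unknown IDs up front is safer than letting the model silently treat a misspelled token as ordinary input text.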

Benchmarks

testset BLEU chr-F #sent #words BP
newstest2012.eng-rus 26.4 0.538 3003 64830 0.999
newstest2012.rus-eng 29.6 0.568 3003 72812 0.983
newstest2013.eng-rus 20.4 0.481 3000 58560 0.989
newstest2013.rus-eng 23.8 0.513 3000 64505 0.991
newstest2014-ruen.rus-eng 26.4 0.550 3003 69190 0.994
newstest2015-enru.eng-rus 22.4 0.515 2818 55915 1.000
newstest2015-enru.rus-eng 25.2 0.529 2818 64744 0.953
newstest2016-enru.eng-rus 21.6 0.505 2998 62018 1.000
newstest2016-enru.rus-eng 23.9 0.521 2998 69278 0.983
newstest2017-enru.eng-rus 23.6 0.524 3001 60255 1.000
newstest2017-enru.rus-eng 27.5 0.546 3001 69033 0.968
newstest2018-enru.eng-rus 20.4 0.509 3000 61920 1.000
newstest2018-enru.rus-eng 23.3 0.518 3000 71723 0.969
newstest2019-enru.eng-rus 22.3 0.487 1997 48153 0.937
newstest2019-ruen.rus-eng 26.0 0.537 2000 42875 0.964
Tatoeba-test.bel_Latn-rus 1.3 0.138 6 60 0.838
Tatoeba-test.bel_Latn-ukr 2.7 0.135 8 61 0.878
Tatoeba-test.bel-rus 56.5 0.751 2500 18815 0.984
Tatoeba-test.bel-ukr 54.4 0.742 2355 15138 0.999
Tatoeba-test.multi-multi 55.0 0.730 10000 63367 1.000
Tatoeba-test.orv-rus 5.3 0.224 171 1259 0.994
Tatoeba-test.orv-ukr 3.1 0.175 973 5423 1.000
Tatoeba-test.rus-bel 44.0 0.657 2500 18750 0.999
Tatoeba-test.rus-bel_Latn 1.0 0.008 6 64 0.571
Tatoeba-test.rus-orv 0.2 0.150 171 1174 1.000
Tatoeba-test.rus-ukr 62.6 0.786 10000 59963 0.993
Tatoeba-test.ukr-bel 46.6 0.679 2355 15166 1.000
Tatoeba-test.ukr-bel_Latn 1.4 0.007 8 61 0.897
Tatoeba-test.ukr-orv 0.4 0.138 973 5037 1.000
Tatoeba-test.ukr-rus 68.2 0.821 10000 60129 0.995
tico19-test.eng-rus 20.2 0.484 2100 55837 0.921

opusTCv20210807+bt_transformer-big_2022-03-07.zip

Benchmarks

testset BLEU chr-F #sent #words BP
Tatoeba-test-v2021-08-07.bel_Latn-rus 0.9 3.687 6 60 1.000
Tatoeba-test-v2021-08-07.bel_Latn-ukr 2.0 3.999 8 61 1.000
Tatoeba-test-v2021-08-07.bel-rus 68.1 0.82181 2500 18815 0.998
Tatoeba-test-v2021-08-07.bel-ukr 65.4 0.80926 2355 15138 1.000
Tatoeba-test-v2021-08-07.multi-multi 67.9 0.81487 10000 64058 1.000
Tatoeba-test-v2021-08-07.orv-rus 5.5 0.24450 171 1259 0.931
Tatoeba-test-v2021-08-07.orv-ukr 2.7 0.18156 973 5423 1.000
Tatoeba-test-v2021-08-07.rus-bel 50.4 0.67310 2500 18750 1.000
Tatoeba-test-v2021-08-07.rus-bel_Latn 1.2 5.017 6 64 0.984
Tatoeba-test-v2021-08-07.rus-orv 0.4 0.18013 171 1174 1.000
Tatoeba-test-v2021-08-07.rus-rus 40.2 0.62962 2500 16799 0.986
Tatoeba-test-v2021-08-07.rus-ukr 69.9 0.83606 10000 59963 1.000
Tatoeba-test-v2021-08-07.ukr-bel 58.6 0.75005 2355 15166 1.000
Tatoeba-test-v2021-08-07.ukr-bel_Latn 2.2 5.440 8 61 1.000
Tatoeba-test-v2021-08-07.ukr-orv 0.5 0.14586 973 5037 1.000
Tatoeba-test-v2021-08-07.ukr-rus 75.3 0.86640 10000 60129 0.999
Tatoeba-test-v2021-08-07.ukr-ukr 32.6 0.58763 824 4198 1.000

opusTCv20210807+bt_transformer-big_2022-03-12.zip

Benchmarks

testset BLEU chr-F #sent #words BP
Tatoeba-test-v2021-08-07.bel_Latn-rus 0.9 5.113 6 60 1.000
Tatoeba-test-v2021-08-07.bel_Latn-ukr 2.3 8.819 8 61 1.000
Tatoeba-test-v2021-08-07.bel-rus 68.4 0.82205 2500 18815 0.995
Tatoeba-test-v2021-08-07.bel-ukr 64.7 0.80092 2355 15138 1.000
Tatoeba-test-v2021-08-07.multi-multi 68.0 0.81499 10000 64058 1.000
Tatoeba-test-v2021-08-07.orv-rus 5.6 0.25954 171 1259 0.955
Tatoeba-test-v2021-08-07.orv-ukr 3.3 0.18942 973 5423 1.000
Tatoeba-test-v2021-08-07.rus-bel 50.6 0.67028 2500 18750 1.000
Tatoeba-test-v2021-08-07.rus-bel_Latn 1.2 5.021 6 64 0.984
Tatoeba-test-v2021-08-07.rus-orv 0.4 0.17795 171 1174 1.000
Tatoeba-test-v2021-08-07.rus-rus 40.8 0.63458 2500 16799 0.987
Tatoeba-test-v2021-08-07.rus-ukr 70.1 0.83647 10000 59963 1.000
Tatoeba-test-v2021-08-07.ukr-bel 58.4 0.74512 2355 15166 1.000
Tatoeba-test-v2021-08-07.ukr-bel_Latn 2.0 5.239 8 61 1.000
Tatoeba-test-v2021-08-07.ukr-orv 0.5 0.14517 973 5037 1.000
Tatoeba-test-v2021-08-07.ukr-rus 75.5 0.86852 10000 60129 1.000
Tatoeba-test-v2021-08-07.ukr-ukr 33.5 0.59463 824 4198 1.000