- dataset: opus
- model: transformer
- source language(s): bel bel_Latn orv_Cyrl rus ukr
- target language(s): bel bel_Latn orv_Cyrl rus ukr
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-27.zip
- test set translations: opus-2020-07-27.test.txt
- test set scores: opus-2020-07-27.eval.txt

testset | BLEU | chr-F |
---|---|---|
Tatoeba-test.bel-rus.bel.rus | 57.1 | 0.758 |
Tatoeba-test.bel-ukr.bel.ukr | 55.5 | 0.751 |
Tatoeba-test.multi.multi | 58.0 | 0.742 |
Tatoeba-test.orv-rus.orv.rus | 5.8 | 0.226 |
Tatoeba-test.orv-ukr.orv.ukr | 2.5 | 0.161 |
Tatoeba-test.rus-bel.rus.bel | 50.5 | 0.714 |
Tatoeba-test.rus-orv.rus.orv | 0.3 | 0.129 |
Tatoeba-test.rus-ukr.rus.ukr | 63.9 | 0.794 |
Tatoeba-test.ukr-bel.ukr.bel | 51.3 | 0.719 |
Tatoeba-test.ukr-orv.ukr.orv | 0.3 | 0.106 |
Tatoeba-test.ukr-rus.ukr.rus | 68.7 | 0.825 |
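
The language-token requirement listed above can be sketched as a small preprocessing step. This is only an illustration: the helper name `add_language_token` and the `VALID_TARGET_IDS` set are hypothetical, with the IDs copied from the target language list of the opus-2020-07-27 model; other releases in this README list different target languages.

```python
# Target-language IDs as listed for the opus-2020-07-27 model (assumption:
# adjust this set for other releases, which list different targets).
VALID_TARGET_IDS = {"bel", "bel_Latn", "orv_Cyrl", "rus", "ukr"}


def add_language_token(sentence: str, target_id: str) -> str:
    """Prefix a source sentence with the >>id<< token of the target language."""
    if target_id not in VALID_TARGET_IDS:
        raise ValueError(f"unknown target language ID: {target_id!r}")
    # The token comes first, separated from the sentence by a single space.
    return f">>{target_id}<< {sentence.strip()}"


# Ukrainian source sentence to be translated into Russian.
print(add_language_token("Добрий день!", "rus"))
```

The prefixed string is the source-side input; how it is tokenized afterwards depends on the toolkit used to run the model.
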
- dataset: opus
- model: transformer
- source language(s): bel bel_Latn eng orv_Cyrl rue rus ukr
- target language(s): bel bel_Latn eng orv_Cyrl rue rus ukr
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-09-26.zip
- test set translations: opus-2020-09-26.test.txt
- test set scores: opus-2020-09-26.eval.txt

testset | BLEU | chr-F |
---|---|---|
newstest2012-engrus.eng.rus | 26.4 | 0.538 |
newstest2012-ruseng.rus.eng | 29.8 | 0.568 |
newstest2013-engrus.eng.rus | 20.1 | 0.479 |
newstest2013-ruseng.rus.eng | 23.8 | 0.512 |
newstest2014-ruen-ruseng.rus.eng | 26.4 | 0.550 |
newstest2015-enru-engrus.eng.rus | 22.5 | 0.515 |
newstest2015-enru-ruseng.rus.eng | 25.1 | 0.528 |
newstest2016-enru-engrus.eng.rus | 21.5 | 0.502 |
newstest2016-enru-ruseng.rus.eng | 23.8 | 0.521 |
newstest2017-enru-engrus.eng.rus | 23.5 | 0.521 |
newstest2017-enru-ruseng.rus.eng | 27.3 | 0.546 |
newstest2018-enru-engrus.eng.rus | 20.2 | 0.506 |
newstest2018-enru-ruseng.rus.eng | 23.1 | 0.517 |
newstest2019-enru-engrus.eng.rus | 22.0 | 0.484 |
newstest2019-ruen-ruseng.rus.eng | 26.0 | 0.536 |
Tatoeba-test.bel-eng.bel.eng | 36.9 | 0.565 |
Tatoeba-test.bel-rus.bel.rus | 57.3 | 0.755 |
Tatoeba-test.bel-ukr.bel.ukr | 54.3 | 0.743 |
Tatoeba-test.eng-bel.eng.bel | 19.2 | 0.450 |
Tatoeba-test.eng-orv.eng.orv | 0.3 | 0.137 |
Tatoeba-test.eng-rue.eng.rue | 0.3 | 0.149 |
Tatoeba-test.eng-rus.eng.rus | 38.5 | 0.595 |
Tatoeba-test.eng-ukr.eng.ukr | 36.5 | 0.577 |
Tatoeba-test.multi.multi | 47.5 | 0.655 |
Tatoeba-test.orv-eng.orv.eng | 6.8 | 0.209 |
Tatoeba-test.orv-rus.orv.rus | 4.8 | 0.226 |
Tatoeba-test.orv-ukr.orv.ukr | 3.7 | 0.180 |
Tatoeba-test.rue-eng.rue.eng | 17.7 | 0.353 |
Tatoeba-test.rus-bel.rus.bel | 44.0 | 0.658 |
Tatoeba-test.rus-eng.rus.eng | 50.1 | 0.659 |
Tatoeba-test.rus-orv.rus.orv | 0.2 | 0.153 |
Tatoeba-test.rus-ukr.rus.ukr | 63.1 | 0.789 |
Tatoeba-test.ukr-bel.ukr.bel | 47.6 | 0.683 |
Tatoeba-test.ukr-eng.ukr.eng | 49.3 | 0.653 |
Tatoeba-test.ukr-orv.ukr.orv | 0.4 | 0.136 |
Tatoeba-test.ukr-rus.ukr.rus | 68.9 | 0.824 |
- dataset: opus
- model: transformer
- source language(s): bel bel_Latn eng orv_Cyrl rue rus ukr
- target language(s): bel bel_Latn eng orv_Cyrl rue rus ukr
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-10-04.zip
- test set translations: opus-2020-10-04.test.txt
- test set scores: opus-2020-10-04.eval.txt

testset | BLEU | chr-F |
---|---|---|
newstest2012-engrus.eng.rus | 26.4 | 0.538 |
newstest2012-ruseng.rus.eng | 29.6 | 0.568 |
newstest2013-engrus.eng.rus | 20.4 | 0.481 |
newstest2013-ruseng.rus.eng | 23.8 | 0.513 |
newstest2014-ruen-ruseng.rus.eng | 26.4 | 0.550 |
newstest2015-enru-engrus.eng.rus | 22.4 | 0.515 |
newstest2015-enru-ruseng.rus.eng | 25.2 | 0.529 |
newstest2016-enru-engrus.eng.rus | 21.6 | 0.505 |
newstest2016-enru-ruseng.rus.eng | 23.9 | 0.521 |
newstest2017-enru-engrus.eng.rus | 23.6 | 0.524 |
newstest2017-enru-ruseng.rus.eng | 27.5 | 0.546 |
newstest2018-enru-engrus.eng.rus | 20.4 | 0.509 |
newstest2018-enru-ruseng.rus.eng | 23.3 | 0.518 |
newstest2019-enru-engrus.eng.rus | 22.3 | 0.487 |
newstest2019-ruen-ruseng.rus.eng | 26.0 | 0.537 |
Tatoeba-test.bel-eng.bel.eng | 36.7 | 0.562 |
Tatoeba-test.bel-rus.bel.rus | 56.9 | 0.753 |
Tatoeba-test.bel-ukr.bel.ukr | 54.5 | 0.742 |
Tatoeba-test.eng-bel.eng.bel | 19.3 | 0.452 |
Tatoeba-test.eng-orv.eng.orv | 0.3 | 0.138 |
Tatoeba-test.eng-rue.eng.rue | 0.8 | 0.152 |
Tatoeba-test.eng-rus.eng.rus | 38.9 | 0.597 |
Tatoeba-test.eng-ukr.eng.ukr | 36.3 | 0.578 |
Tatoeba-test.multi.multi | 47.5 | 0.656 |
Tatoeba-test.orv-eng.orv.eng | 6.3 | 0.210 |
Tatoeba-test.orv-rus.orv.rus | 5.3 | 0.224 |
Tatoeba-test.orv-ukr.orv.ukr | 3.1 | 0.175 |
Tatoeba-test.rue-eng.rue.eng | 18.1 | 0.351 |
Tatoeba-test.rus-bel.rus.bel | 44.0 | 0.659 |
Tatoeba-test.rus-eng.rus.eng | 50.4 | 0.662 |
Tatoeba-test.rus-orv.rus.orv | 0.2 | 0.149 |
Tatoeba-test.rus-ukr.rus.ukr | 63.0 | 0.789 |
Tatoeba-test.ukr-bel.ukr.bel | 46.7 | 0.680 |
Tatoeba-test.ukr-eng.ukr.eng | 49.5 | 0.655 |
Tatoeba-test.ukr-orv.ukr.orv | 0.4 | 0.138 |
Tatoeba-test.ukr-rus.ukr.rus | 68.8 | 0.824 |
- dataset: opus
- model: transformer
- source language(s): bel orv rus ukr
- target language(s): bel orv rus ukr
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
- valid language labels: `>>eng<<` `>>ukr<<` `>>rus<<` `>>bel<<` `>>bel_Latn<<`
- download: opus-2021-02-23.zip
- test set translations: opus-2021-02-23.test.txt
- test set scores: opus-2021-02-23.eval.txt

testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
newstest2012.eng-rus | 26.4 | 0.538 | 3003 | 64830 | 0.999 |
newstest2012.rus-eng | 29.6 | 0.568 | 3003 | 72812 | 0.983 |
newstest2013.eng-rus | 20.4 | 0.481 | 3000 | 58560 | 0.989 |
newstest2013.rus-eng | 23.8 | 0.513 | 3000 | 64505 | 0.991 |
newstest2014-ruen.rus-eng | 26.4 | 0.550 | 3003 | 69190 | 0.994 |
newstest2015-enru.eng-rus | 22.4 | 0.515 | 2818 | 55915 | 1.000 |
newstest2015-enru.rus-eng | 25.2 | 0.529 | 2818 | 64744 | 0.953 |
newstest2016-enru.eng-rus | 21.6 | 0.505 | 2998 | 62018 | 1.000 |
newstest2016-enru.rus-eng | 23.9 | 0.521 | 2998 | 69278 | 0.983 |
newstest2017-enru.eng-rus | 23.6 | 0.524 | 3001 | 60255 | 1.000 |
newstest2017-enru.rus-eng | 27.5 | 0.546 | 3001 | 69033 | 0.968 |
newstest2018-enru.eng-rus | 20.4 | 0.509 | 3000 | 61920 | 1.000 |
newstest2018-enru.rus-eng | 23.3 | 0.518 | 3000 | 71723 | 0.969 |
newstest2019-enru.eng-rus | 22.3 | 0.487 | 1997 | 48153 | 0.937 |
newstest2019-ruen.rus-eng | 26.0 | 0.537 | 2000 | 42875 | 0.964 |
Tatoeba-test.bel_Latn-rus | 1.3 | 0.138 | 6 | 60 | 0.838 |
Tatoeba-test.bel_Latn-ukr | 2.7 | 0.135 | 8 | 61 | 0.878 |
Tatoeba-test.bel-rus | 56.5 | 0.751 | 2500 | 18815 | 0.984 |
Tatoeba-test.bel-ukr | 54.4 | 0.742 | 2355 | 15138 | 0.999 |
Tatoeba-test.multi-multi | 55.0 | 0.730 | 10000 | 63367 | 1.000 |
Tatoeba-test.orv-rus | 5.3 | 0.224 | 171 | 1259 | 0.994 |
Tatoeba-test.orv-ukr | 3.1 | 0.175 | 973 | 5423 | 1.000 |
Tatoeba-test.rus-bel | 44.0 | 0.657 | 2500 | 18750 | 0.999 |
Tatoeba-test.rus-bel_Latn | 1.0 | 0.008 | 6 | 64 | 0.571 |
Tatoeba-test.rus-orv | 0.2 | 0.150 | 171 | 1174 | 1.000 |
Tatoeba-test.rus-ukr | 62.6 | 0.786 | 10000 | 59963 | 0.993 |
Tatoeba-test.ukr-bel | 46.6 | 0.679 | 2355 | 15166 | 1.000 |
Tatoeba-test.ukr-bel_Latn | 1.4 | 0.007 | 8 | 61 | 0.897 |
Tatoeba-test.ukr-orv | 0.4 | 0.138 | 973 | 5037 | 1.000 |
Tatoeba-test.ukr-rus | 68.2 | 0.821 | 10000 | 60129 | 0.995 |
tico19-test.eng-rus | 20.2 | 0.484 | 2100 | 55837 | 0.921 |
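
The BP column is not defined in this README. Assuming it reports the standard BLEU brevity penalty, the sketch below shows how that factor is computed from the total hypothesis and reference lengths in tokens. The function name `brevity_penalty` is hypothetical, and the lengths in the example are made up for illustration, not taken from the tables.

```python
import math


def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty for corpus-level token counts."""
    if hyp_len <= 0:
        return 0.0  # empty output: the penalty drives BLEU to zero
    if hyp_len >= ref_len:
        return 1.0  # output at least as long as the reference: no penalty
    # Shorter output is penalized exponentially in the length ratio.
    return math.exp(1.0 - ref_len / hyp_len)


# Illustrative lengths only: 90 output tokens against 100 reference tokens.
print(round(brevity_penalty(90, 100), 3))
```

Under this definition, a BP of 1.000 means the system output was not shorter than the reference, while values below 1 indicate shorter output.
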
- dataset: opusTCv20210807+bt
- model: transformer-big
- source language(s): bel bel_Latn orv_Cyrl rus ukr
- target language(s): bel bel_Latn orv_Cyrl rus ukr
- raw source language(s): bel orv rus ukr
- raw target language(s): bel orv rus ukr
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
- valid language labels:
- download: opusTCv20210807+bt_transformer-big_2022-03-07.zip
- test set translations: opusTCv20210807+bt_transformer-big_2022-03-07.test.txt
- test set scores: opusTCv20210807+bt_transformer-big_2022-03-07.eval.txt

testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
Tatoeba-test-v2021-08-07.bel_Latn-rus | 0.9 | 3.687 | 6 | 60 | 1.000 |
Tatoeba-test-v2021-08-07.bel_Latn-ukr | 2.0 | 3.999 | 8 | 61 | 1.000 |
Tatoeba-test-v2021-08-07.bel-rus | 68.1 | 0.82181 | 2500 | 18815 | 0.998 |
Tatoeba-test-v2021-08-07.bel-ukr | 65.4 | 0.80926 | 2355 | 15138 | 1.000 |
Tatoeba-test-v2021-08-07.multi-multi | 67.9 | 0.81487 | 10000 | 64058 | 1.000 |
Tatoeba-test-v2021-08-07.orv-rus | 5.5 | 0.24450 | 171 | 1259 | 0.931 |
Tatoeba-test-v2021-08-07.orv-ukr | 2.7 | 0.18156 | 973 | 5423 | 1.000 |
Tatoeba-test-v2021-08-07.rus-bel | 50.4 | 0.67310 | 2500 | 18750 | 1.000 |
Tatoeba-test-v2021-08-07.rus-bel_Latn | 1.2 | 5.017 | 6 | 64 | 0.984 |
Tatoeba-test-v2021-08-07.rus-orv | 0.4 | 0.18013 | 171 | 1174 | 1.000 |
Tatoeba-test-v2021-08-07.rus-rus | 40.2 | 0.62962 | 2500 | 16799 | 0.986 |
Tatoeba-test-v2021-08-07.rus-ukr | 69.9 | 0.83606 | 10000 | 59963 | 1.000 |
Tatoeba-test-v2021-08-07.ukr-bel | 58.6 | 0.75005 | 2355 | 15166 | 1.000 |
Tatoeba-test-v2021-08-07.ukr-bel_Latn | 2.2 | 5.440 | 8 | 61 | 1.000 |
Tatoeba-test-v2021-08-07.ukr-orv | 0.5 | 0.14586 | 973 | 5037 | 1.000 |
Tatoeba-test-v2021-08-07.ukr-rus | 75.3 | 0.86640 | 10000 | 60129 | 0.999 |
Tatoeba-test-v2021-08-07.ukr-ukr | 32.6 | 0.58763 | 824 | 4198 | 1.000 |
- dataset: opusTCv20210807+bt
- model: transformer-big
- source language(s): bel bel_Latn orv_Cyrl rus ukr
- target language(s): bel bel_Latn orv_Cyrl rus ukr
- raw source language(s): bel orv rus ukr
- raw target language(s): bel orv rus ukr
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
- valid language labels:
- download: opusTCv20210807+bt_transformer-big_2022-03-12.zip
- test set translations: opusTCv20210807+bt_transformer-big_2022-03-12.test.txt
- test set scores: opusTCv20210807+bt_transformer-big_2022-03-12.eval.txt

testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
Tatoeba-test-v2021-08-07.bel_Latn-rus | 0.9 | 5.113 | 6 | 60 | 1.000 |
Tatoeba-test-v2021-08-07.bel_Latn-ukr | 2.3 | 8.819 | 8 | 61 | 1.000 |
Tatoeba-test-v2021-08-07.bel-rus | 68.4 | 0.82205 | 2500 | 18815 | 0.995 |
Tatoeba-test-v2021-08-07.bel-ukr | 64.7 | 0.80092 | 2355 | 15138 | 1.000 |
Tatoeba-test-v2021-08-07.multi-multi | 68.0 | 0.81499 | 10000 | 64058 | 1.000 |
Tatoeba-test-v2021-08-07.orv-rus | 5.6 | 0.25954 | 171 | 1259 | 0.955 |
Tatoeba-test-v2021-08-07.orv-ukr | 3.3 | 0.18942 | 973 | 5423 | 1.000 |
Tatoeba-test-v2021-08-07.rus-bel | 50.6 | 0.67028 | 2500 | 18750 | 1.000 |
Tatoeba-test-v2021-08-07.rus-bel_Latn | 1.2 | 5.021 | 6 | 64 | 0.984 |
Tatoeba-test-v2021-08-07.rus-orv | 0.4 | 0.17795 | 171 | 1174 | 1.000 |
Tatoeba-test-v2021-08-07.rus-rus | 40.8 | 0.63458 | 2500 | 16799 | 0.987 |
Tatoeba-test-v2021-08-07.rus-ukr | 70.1 | 0.83647 | 10000 | 59963 | 1.000 |
Tatoeba-test-v2021-08-07.ukr-bel | 58.4 | 0.74512 | 2355 | 15166 | 1.000 |
Tatoeba-test-v2021-08-07.ukr-bel_Latn | 2.0 | 5.239 | 8 | 61 | 1.000 |
Tatoeba-test-v2021-08-07.ukr-orv | 0.5 | 0.14517 | 973 | 5037 | 1.000 |
Tatoeba-test-v2021-08-07.ukr-rus | 75.5 | 0.86852 | 10000 | 60129 | 1.000 |
Tatoeba-test-v2021-08-07.ukr-ukr | 33.5 | 0.59463 | 824 | 4198 | 1.000 |