- dataset: opusTCv20210807+bt
- model: transformer-big
- source language(s): ces dsb hsb pol
- target language(s): bel bel_Latn orv_Cyrl rus ukr
- raw source language(s): ces dsb hsb pol
- raw target language(s): bel orv rus ukr
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of >>id<< (id = valid target language ID)
- valid language labels:
- download: opusTCv20210807+bt_transformer-big_2022-03-07.zip
- test set translations: opusTCv20210807+bt_transformer-big_2022-03-07.test.txt
- test set scores: opusTCv20210807+bt_transformer-big_2022-03-07.eval.txt
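The language token requirement above can be sketched as a small preprocessing helper. This is a minimal illustration, not part of the released tooling: the function name `add_language_token` is hypothetical, the set of accepted IDs is assumed to match the target language list above (the label list itself is not given here), and separating the token from the sentence with a single space is also an assumption.

```python
# Assumption: valid IDs mirror the "target language(s)" entry above.
VALID_TARGETS = {"bel", "bel_Latn", "orv_Cyrl", "rus", "ukr"}


def add_language_token(sentence: str, target_id: str) -> str:
    """Prefix a source sentence with the >>id<< token for the chosen target language."""
    if target_id not in VALID_TARGETS:
        raise ValueError(f"unknown target language ID: {target_id!r}")
    # Assumption: one space between the token and the sentence text.
    return f">>{target_id}<< {sentence.strip()}"


if __name__ == "__main__":
    # A Polish source sentence marked for translation into Ukrainian.
    print(add_language_token("Dzień dobry.", "ukr"))
```

The prefixed string is what would then be passed through normalization and SentencePiece segmentation before translation.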
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
newstest2012.ces-rus | 20.6 | 0.49166 | 3003 | 64830 | 0.997 |
newstest2013.ces-rus | 26.8 | 0.53763 | 3000 | 58560 | 0.973 |
Tatoeba-test-v2021-08-07.ces-bel | 33.8 | 0.51349 | 31 | 181 | 0.966 |
Tatoeba-test-v2021-08-07.ces-rus | 55.0 | 0.72626 | 2934 | 17743 | 0.985 |
Tatoeba-test-v2021-08-07.ces-ukr | 51.0 | 0.68280 | 1787 | 8854 | 0.997 |
Tatoeba-test-v2021-08-07.dsb-rus | 26.9 | 0.48948 | 24 | 124 | 1.000 |
Tatoeba-test-v2021-08-07.dsb-ukr | 8.8 | 0.34208 | 3 | 13 | 1.000 |
Tatoeba-test-v2021-08-07.hsb-rus | 19.4 | 0.40929 | 38 | 281 | 0.859 |
Tatoeba-test-v2021-08-07.hsb-ukr | 3.5 | 0.16605 | 8 | 126 | 0.585 |
Tatoeba-test-v2021-08-07.multi-multi | 52.8 | 0.70695 | 10000 | 58091 | 0.987 |
Tatoeba-test-v2021-08-07.pol-bel | 28.7 | 0.51885 | 287 | 1730 | 1.000 |
Tatoeba-test-v2021-08-07.pol-bel_Latn | 3.8 | 0.847 | 2 | 16 | 0.794 |
Tatoeba-test-v2021-08-07.pol-orv | 4.5 | 0.24322 | 7 | 31 | 1.000 |
Tatoeba-test-v2021-08-07.pol-rus | 54.5 | 0.72518 | 3543 | 21971 | 0.992 |
Tatoeba-test-v2021-08-07.pol-ukr | 48.1 | 0.67885 | 2519 | 13493 | 0.998 |
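The BP column in the table above is the BLEU brevity penalty, which is 1.000 when the system output is at least as long as the reference and drops below 1 when the output is shorter. The sketch below shows the standard formula; the function name and the token counts in the example are hypothetical and are not taken from the evaluation files.

```python
import math


def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty for corpus-level token counts."""
    if hyp_len == 0:
        return 0.0  # empty output: maximal penalty
    if hyp_len >= ref_len:
        return 1.0  # output is not shorter than the reference: no penalty
    return math.exp(1.0 - ref_len / hyp_len)


if __name__ == "__main__":
    # Hypothetical counts: 90 output tokens against 100 reference tokens.
    print(round(brevity_penalty(90, 100), 3))
```

Low BP values on very small test sets, such as those with only a handful of sentences, mean the output was much shorter than the reference, which also depresses the BLEU score for that row.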
- dataset: opusTCv20210807+bt
- model: transformer-big
- source language(s): ces dsb hsb pol
- target language(s): bel bel_Latn orv_Cyrl rus ukr
- raw source language(s): ces dsb hsb pol
- raw target language(s): bel orv rus ukr
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of >>id<< (id = valid target language ID)
- valid language labels:
- download: opusTCv20210807+bt_transformer-big_2022-03-19.zip
- test set translations: opusTCv20210807+bt_transformer-big_2022-03-19.test.txt
- test set scores: opusTCv20210807+bt_transformer-big_2022-03-19.eval.txt
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
newstest2012.ces-rus | 21.0 | 0.49476 | 3003 | 64830 | 0.989 |
newstest2013.ces-rus | 27.2 | 0.54197 | 3000 | 58560 | 0.963 |
Tatoeba-test-v2021-08-07.ces-bel | 33.7 | 0.51728 | 31 | 181 | 1.000 |
Tatoeba-test-v2021-08-07.ces-rus | 55.6 | 0.72736 | 2934 | 17743 | 0.979 |
Tatoeba-test-v2021-08-07.ces-ukr | 52.5 | 0.69628 | 1787 | 8854 | 0.997 |
Tatoeba-test-v2021-08-07.dsb-rus | 23.4 | 0.48769 | 24 | 124 | 1.000 |
Tatoeba-test-v2021-08-07.dsb-ukr | 9.7 | 0.29332 | 3 | 13 | 1.000 |
Tatoeba-test-v2021-08-07.hsb-rus | 26.9 | 0.45080 | 38 | 281 | 0.907 |
Tatoeba-test-v2021-08-07.hsb-ukr | 7.8 | 0.36314 | 8 | 126 | 0.701 |
Tatoeba-test-v2021-08-07.multi-multi | 53.3 | 0.71039 | 10000 | 58091 | 0.985 |
Tatoeba-test-v2021-08-07.pol-bel | 28.9 | 0.50513 | 287 | 1730 | 0.994 |
Tatoeba-test-v2021-08-07.pol-bel_Latn | 4.0 | 1.122 | 2 | 16 | 0.936 |
Tatoeba-test-v2021-08-07.pol-orv | 4.1 | 0.25181 | 7 | 31 | 1.000 |
Tatoeba-test-v2021-08-07.pol-rus | 54.3 | 0.72718 | 3543 | 21971 | 0.989 |
Tatoeba-test-v2021-08-07.pol-ukr | 48.4 | 0.68147 | 2519 | 13493 | 0.997 |
- dataset: opusTCv20210807+bt
- model: transformer-big
- source language(s): ces dsb hsb pol
- target language(s): bel bel_Latn orv_Cyrl rus ukr
- raw source language(s): ces dsb hsb pol
- raw target language(s): bel orv rus ukr
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of >>id<< (id = valid target language ID)
- valid language labels:
- download: opusTCv20210807+bt_transformer-big_2022-03-23.zip
- test set translations: opusTCv20210807+bt_transformer-big_2022-03-23.test.txt
- test set scores: opusTCv20210807+bt_transformer-big_2022-03-23.eval.txt
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
newstest2012.ces-rus | 21.2 | 0.49627 | 3003 | 64830 | 0.987 |
newstest2013.ces-rus | 27.5 | 0.54297 | 3000 | 58560 | 0.962 |
Tatoeba-test-v2021-08-07.ces-bel | 32.7 | 0.50195 | 31 | 181 | 1.000 |
Tatoeba-test-v2021-08-07.ces-rus | 55.6 | 0.72859 | 2934 | 17743 | 0.979 |
Tatoeba-test-v2021-08-07.ces-ukr | 52.3 | 0.69750 | 1787 | 8854 | 0.996 |
Tatoeba-test-v2021-08-07.dsb-rus | 23.9 | 0.47404 | 24 | 124 | 1.000 |
Tatoeba-test-v2021-08-07.dsb-ukr | 9.4 | 0.27363 | 3 | 13 | 1.000 |
Tatoeba-test-v2021-08-07.hsb-rus | 36.2 | 0.47720 | 38 | 281 | 0.945 |
Tatoeba-test-v2021-08-07.hsb-ukr | 23.0 | 0.38722 | 8 | 126 | 0.800 |
Tatoeba-test-v2021-08-07.multi-multi | 53.5 | 0.71203 | 10000 | 58091 | 0.985 |
Tatoeba-test-v2021-08-07.pol-bel | 28.2 | 0.49950 | 287 | 1730 | 0.999 |
Tatoeba-test-v2021-08-07.pol-bel_Latn | 3.5 | 4.508 | 2 | 16 | 0.717 |
Tatoeba-test-v2021-08-07.pol-orv | 4.0 | 0.24981 | 7 | 31 | 1.000 |
Tatoeba-test-v2021-08-07.pol-rus | 54.6 | 0.72814 | 3543 | 21971 | 0.988 |
Tatoeba-test-v2021-08-07.pol-ukr | 49.0 | 0.68496 | 2519 | 13493 | 0.997 |
- dataset: opusTCv20210807+xb+bt
- model: transformer-big
- source language(s): ces dsb hsb pol slk
- target language(s): bel bel_Latn orv_Cyrl rus ukr
- raw source language(s): ces dsb hsb pol slk
- raw target language(s): bel orv rus ukr
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of >>id<< (id = valid target language ID)
- valid language labels:
- download: opusTCv20210807+xb+bt_transformer-big_2022-05-08.zip
- test set translations: opusTCv20210807+xb+bt_transformer-big_2022-05-08.test.txt
- test set scores: opusTCv20210807+xb+bt_transformer-big_2022-05-08.eval.txt
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
flores101.ces-ukr | 23.5 | 0.53099 | 1012 | 22810 | 0.967 |
flores101.slk-ukr | 22.7 | 0.52645 | 1012 | 22810 | 0.987 |
newstest2012.ces-rus | 21.3 | 0.49883 | 3003 | 64830 | 0.992 |
newstest2013.ces-rus | 27.9 | 0.54568 | 3000 | 58560 | 0.965 |
Tatoeba-test-v2021-08-07.multi-multi | 53.5 | 0.71063 | 10000 | 58091 | 0.984 |