- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): asm awa ben bho gom guj hif_Latn hin jdt_Cyrl kur_Arab kur_Latn mai mar npi ori oss pan_Guru pes pes_Latn pes_Thaa pnb pus rom san_Deva sin snd_Arab tgk_Cyrl tly_Latn urd zza
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-14.zip
- test set translations: opus-2020-07-14.test.txt
- test set scores: opus-2020-07-14.eval.txt

testset | BLEU | chr-F |
---|---|---|
Tatoeba-test.eng-asm.eng.asm | 1.7 | 0.181 |
Tatoeba-test.eng-awa.eng.awa | 0.2 | 0.041 |
Tatoeba-test.eng-ben.eng.ben | 14.6 | 0.440 |
Tatoeba-test.eng-bho.eng.bho | 0.4 | 0.101 |
Tatoeba-test.eng-fas.eng.fas | 2.9 | 0.216 |
Tatoeba-test.eng-guj.eng.guj | 14.8 | 0.346 |
Tatoeba-test.eng-hif.eng.hif | 1.1 | 0.090 |
Tatoeba-test.eng-hin.eng.hin | 16.1 | 0.445 |
Tatoeba-test.eng-jdt.eng.jdt | 8.0 | 0.016 |
Tatoeba-test.eng-kok.eng.kok | 4.1 | 0.006 |
Tatoeba-test.eng-kur.eng.kur | 3.8 | 0.118 |
Tatoeba-test.eng-lah.eng.lah | 0.4 | 0.033 |
Tatoeba-test.eng-mai.eng.mai | 10.9 | 0.398 |
Tatoeba-test.eng-mar.eng.mar | 18.6 | 0.445 |
Tatoeba-test.eng.multi | 12.7 | 0.374 |
Tatoeba-test.eng-nep.eng.nep | 0.7 | 0.028 |
Tatoeba-test.eng-ori.eng.ori | 1.4 | 0.185 |
Tatoeba-test.eng-oss.eng.oss | 2.1 | 0.203 |
Tatoeba-test.eng-pan.eng.pan | 5.3 | 0.322 |
Tatoeba-test.eng-pus.eng.pus | 0.4 | 0.109 |
Tatoeba-test.eng-rom.eng.rom | 0.9 | 0.213 |
Tatoeba-test.eng-san.eng.san | 0.9 | 0.093 |
Tatoeba-test.eng-sin.eng.sin | 10.8 | 0.370 |
Tatoeba-test.eng-snd.eng.snd | 2.4 | 0.251 |
Tatoeba-test.eng-tgk.eng.tgk | 6.5 | 0.328 |
Tatoeba-test.eng-tly.eng.tly | 0.6 | 0.018 |
Tatoeba-test.eng-urd.eng.urd | 10.9 | 0.387 |
Tatoeba-test.eng-zza.eng.zza | 0.6 | 0.033 |
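
The language-token requirement listed above can be sketched as a small helper. This is a minimal illustration, not part of the released tooling: the function name `add_language_token` is hypothetical, the ID list is copied from the "target language(s)" entry, and the single space between the token and the sentence is an assumption.

```python
# Target language IDs copied from the "target language(s)" entry above.
TARGET_LANGS = (
    "asm awa ben bho gom guj hif_Latn hin jdt_Cyrl kur_Arab kur_Latn mai mar "
    "npi ori oss pan_Guru pes pes_Latn pes_Thaa pnb pus rom san_Deva sin "
    "snd_Arab tgk_Cyrl tly_Latn urd zza"
).split()


def add_language_token(sentence: str, lang_id: str) -> str:
    """Prefix a source sentence with the >>id<< target language token.

    Hypothetical helper for illustration only.
    """
    if lang_id not in TARGET_LANGS:
        raise ValueError(f"unsupported target language ID: {lang_id!r}")
    # Assumption: one space separates the token from the sentence text.
    return f">>{lang_id}<< {sentence.strip()}"


if __name__ == "__main__":
    print(add_language_token("How are you?", "hin"))
    print(add_language_token("How are you?", "pan_Guru"))
```

Rejecting unknown IDs up front keeps a mistyped token from being passed to the model as ordinary input text.
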
- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): asm awa ben bho gom guj hif_Latn hin jdt_Cyrl kur_Arab kur_Latn mai mar npi ori oss pan_Guru pes pes_Latn pes_Thaa pnb pus rom san_Deva sin snd_Arab tgk_Cyrl tly_Latn urd zza
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-19.zip
- test set translations: opus-2020-07-19.test.txt
- test set scores: opus-2020-07-19.eval.txt

testset | BLEU | chr-F |
---|---|---|
Tatoeba-test.eng-asm.eng.asm | 1.7 | 0.194 |
Tatoeba-test.eng-awa.eng.awa | 0.2 | 0.031 |
Tatoeba-test.eng-ben.eng.ben | 14.9 | 0.447 |
Tatoeba-test.eng-bho.eng.bho | 0.4 | 0.110 |
Tatoeba-test.eng-fas.eng.fas | 3.3 | 0.219 |
Tatoeba-test.eng-guj.eng.guj | 17.3 | 0.366 |
Tatoeba-test.eng-hif.eng.hif | 1.0 | 0.079 |
Tatoeba-test.eng-hin.eng.hin | 16.5 | 0.451 |
Tatoeba-test.eng-jdt.eng.jdt | 9.5 | 0.136 |
Tatoeba-test.eng-kok.eng.kok | 8.1 | 0.040 |
Tatoeba-test.eng-kur.eng.kur | 3.1 | 0.123 |
Tatoeba-test.eng-lah.eng.lah | 0.9 | 0.036 |
Tatoeba-test.eng-mai.eng.mai | 9.8 | 0.374 |
Tatoeba-test.eng-mar.eng.mar | 19.3 | 0.456 |
Tatoeba-test.eng.multi | 13.1 | 0.380 |
Tatoeba-test.eng-nep.eng.nep | 0.9 | 0.037 |
Tatoeba-test.eng-ori.eng.ori | 1.3 | 0.190 |
Tatoeba-test.eng-oss.eng.oss | 2.2 | 0.194 |
Tatoeba-test.eng-pan.eng.pan | 8.5 | 0.337 |
Tatoeba-test.eng-pus.eng.pus | 1.0 | 0.123 |
Tatoeba-test.eng-rom.eng.rom | 1.3 | 0.221 |
Tatoeba-test.eng-san.eng.san | 1.0 | 0.106 |
Tatoeba-test.eng-sin.eng.sin | 10.8 | 0.382 |
Tatoeba-test.eng-snd.eng.snd | 2.8 | 0.205 |
Tatoeba-test.eng-tgk.eng.tgk | 6.9 | 0.324 |
Tatoeba-test.eng-tly.eng.tly | 0.6 | 0.024 |
Tatoeba-test.eng-urd.eng.urd | 11.8 | 0.396 |
Tatoeba-test.eng-zza.eng.zza | 0.5 | 0.033 |
- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): asm awa ben bho gom guj hif_Latn hin jdt_Cyrl kur_Arab kur_Latn mai mar npi ori oss pan_Guru pes pes_Latn pes_Thaa pnb pus rom san_Deva sin snd_Arab tgk_Cyrl tly_Latn urd zza
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-26.zip
- test set translations: opus-2020-07-26.test.txt
- test set scores: opus-2020-07-26.eval.txt

testset | BLEU | chr-F |
---|---|---|
newsdev2014-enghin.eng.hin | 6.8 | 0.318 |
newsdev2019-engu-engguj.eng.guj | 5.7 | 0.274 |
newstest2014-hien-enghin.eng.hin | 9.9 | 0.344 |
newstest2019-engu-engguj.eng.guj | 6.5 | 0.284 |
Tatoeba-test.eng-asm.eng.asm | 2.3 | 0.219 |
Tatoeba-test.eng-awa.eng.awa | 0.3 | 0.026 |
Tatoeba-test.eng-ben.eng.ben | 15.3 | 0.454 |
Tatoeba-test.eng-bho.eng.bho | 0.3 | 0.078 |
Tatoeba-test.eng-fas.eng.fas | 3.5 | 0.222 |
Tatoeba-test.eng-guj.eng.guj | 17.7 | 0.367 |
Tatoeba-test.eng-hif.eng.hif | 1.1 | 0.078 |
Tatoeba-test.eng-hin.eng.hin | 16.7 | 0.455 |
Tatoeba-test.eng-jdt.eng.jdt | 0.8 | 0.000 |
Tatoeba-test.eng-kok.eng.kok | 6.6 | 0.006 |
Tatoeba-test.eng-kur.eng.kur | 2.6 | 0.113 |
Tatoeba-test.eng-lah.eng.lah | 0.7 | 0.092 |
Tatoeba-test.eng-mai.eng.mai | 9.8 | 0.371 |
Tatoeba-test.eng-mar.eng.mar | 19.8 | 0.462 |
Tatoeba-test.eng.multi | 13.3 | 0.384 |
Tatoeba-test.eng-nep.eng.nep | 0.4 | 0.013 |
Tatoeba-test.eng-ori.eng.ori | 1.4 | 0.209 |
Tatoeba-test.eng-oss.eng.oss | 2.3 | 0.179 |
Tatoeba-test.eng-pan.eng.pan | 6.9 | 0.329 |
Tatoeba-test.eng-pus.eng.pus | 1.5 | 0.122 |
Tatoeba-test.eng-rom.eng.rom | 1.8 | 0.224 |
Tatoeba-test.eng-san.eng.san | 1.5 | 0.108 |
Tatoeba-test.eng-sin.eng.sin | 9.9 | 0.378 |
Tatoeba-test.eng-snd.eng.snd | 4.5 | 0.337 |
Tatoeba-test.eng-tgk.eng.tgk | 7.1 | 0.337 |
Tatoeba-test.eng-tly.eng.tly | 0.4 | 0.015 |
Tatoeba-test.eng-urd.eng.urd | 11.9 | 0.398 |
Tatoeba-test.eng-zza.eng.zza | 0.4 | 0.026 |
- dataset: opus2m
- model: transformer
- source language(s): eng
- target language(s): asm awa ben bho gom guj hif_Latn hin jdt_Cyrl kur_Arab kur_Latn mai mar npi ori oss pan_Guru pes pes_Latn pes_Thaa pnb pus rom san_Deva sin snd_Arab tgk_Cyrl tly_Latn urd zza
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus2m-2020-08-01.zip
- test set translations: opus2m-2020-08-01.test.txt
- test set scores: opus2m-2020-08-01.test.txt is listed above; scores: opus2m-2020-08-01.eval.txt

testset | BLEU | chr-F |
---|---|---|
newsdev2014-enghin.eng.hin | 6.7 | 0.326 |
newsdev2019-engu-engguj.eng.guj | 6.0 | 0.283 |
newstest2014-hien-enghin.eng.hin | 10.4 | 0.353 |
newstest2019-engu-engguj.eng.guj | 6.6 | 0.282 |
Tatoeba-test.eng-asm.eng.asm | 2.7 | 0.249 |
Tatoeba-test.eng-awa.eng.awa | 0.4 | 0.122 |
Tatoeba-test.eng-ben.eng.ben | 15.3 | 0.459 |
Tatoeba-test.eng-bho.eng.bho | 3.7 | 0.161 |
Tatoeba-test.eng-fas.eng.fas | 3.4 | 0.227 |
Tatoeba-test.eng-guj.eng.guj | 18.5 | 0.365 |
Tatoeba-test.eng-hif.eng.hif | 1.0 | 0.064 |
Tatoeba-test.eng-hin.eng.hin | 17.0 | 0.461 |
Tatoeba-test.eng-jdt.eng.jdt | 3.9 | 0.122 |
Tatoeba-test.eng-kok.eng.kok | 5.5 | 0.059 |
Tatoeba-test.eng-kur.eng.kur | 4.0 | 0.125 |
Tatoeba-test.eng-lah.eng.lah | 0.3 | 0.008 |
Tatoeba-test.eng-mai.eng.mai | 9.3 | 0.445 |
Tatoeba-test.eng-mar.eng.mar | 20.7 | 0.473 |
Tatoeba-test.eng.multi | 13.7 | 0.392 |
Tatoeba-test.eng-nep.eng.nep | 0.6 | 0.060 |
Tatoeba-test.eng-ori.eng.ori | 2.4 | 0.193 |
Tatoeba-test.eng-oss.eng.oss | 2.1 | 0.174 |
Tatoeba-test.eng-pan.eng.pan | 9.7 | 0.355 |
Tatoeba-test.eng-pus.eng.pus | 1.0 | 0.126 |
Tatoeba-test.eng-rom.eng.rom | 1.3 | 0.230 |
Tatoeba-test.eng-san.eng.san | 1.3 | 0.101 |
Tatoeba-test.eng-sin.eng.sin | 11.7 | 0.384 |
Tatoeba-test.eng-snd.eng.snd | 2.8 | 0.180 |
Tatoeba-test.eng-tgk.eng.tgk | 8.1 | 0.353 |
Tatoeba-test.eng-tly.eng.tly | 0.5 | 0.015 |
Tatoeba-test.eng-urd.eng.urd | 12.3 | 0.409 |
Tatoeba-test.eng-zza.eng.zza | 0.5 | 0.025 |
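
To compare scores across releases, the pipe-separated tables above can be read into a dictionary. The following is a rough sketch: `parse_scores` is a hypothetical helper, and the sample rows are copied from the opus2m-2020-08-01 table. It assumes every data row has exactly three cells (testset, BLEU, chr-F).

```python
from typing import Dict, Tuple

# Three rows copied from the opus2m-2020-08-01 table, with its header lines.
SAMPLE = """\
testset | BLEU | chr-F |
---|---|---|
Tatoeba-test.eng-hin.eng.hin | 17.0 | 0.461 |
Tatoeba-test.eng-mar.eng.mar | 20.7 | 0.473 |
Tatoeba-test.eng.multi | 13.7 | 0.392 |
"""


def parse_scores(table_text: str) -> Dict[str, Tuple[float, float]]:
    """Map each testset name to its (BLEU, chr-F) pair.

    Hypothetical helper for illustration only.
    """
    scores = {}
    for line in table_text.splitlines():
        # Drop the outer pipes, then split the row into its cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 3:
            continue
        name, bleu, chrf = cells
        try:
            scores[name] = (float(bleu), float(chrf))
        except ValueError:
            # Header and separator rows have no numeric cells; skip them.
            continue
    return scores


if __name__ == "__main__":
    scores = parse_scores(SAMPLE)
    best = max(scores, key=lambda name: scores[name][0])
    print(best, scores[best])
```
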