- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad lad_Latn lat_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-14.zip
- test set translations: opus-2020-07-14.test.txt
- test set scores: opus-2020-07-14.eval.txt

testset | BLEU | chr-F |
---|---|---|
Tatoeba-test.eng-arg.eng.arg | 1.5 | 0.120 |
Tatoeba-test.eng-ast.eng.ast | 17.1 | 0.384 |
Tatoeba-test.eng-cat.eng.cat | 47.1 | 0.666 |
Tatoeba-test.eng-cos.eng.cos | 3.1 | 0.274 |
Tatoeba-test.eng-egl.eng.egl | 0.2 | 0.105 |
Tatoeba-test.eng-ext.eng.ext | 4.9 | 0.243 |
Tatoeba-test.eng-fra.eng.fra | 44.1 | 0.629 |
Tatoeba-test.eng-frm.eng.frm | 1.2 | 0.207 |
Tatoeba-test.eng-gcf.eng.gcf | 0.3 | 0.092 |
Tatoeba-test.eng-glg.eng.glg | 43.1 | 0.635 |
Tatoeba-test.eng-hat.eng.hat | 28.3 | 0.509 |
Tatoeba-test.eng-ita.eng.ita | 44.8 | 0.669 |
Tatoeba-test.eng-lad.eng.lad | 5.2 | 0.276 |
Tatoeba-test.eng-lat.eng.lat | 11.9 | 0.376 |
Tatoeba-test.eng-lij.eng.lij | 1.3 | 0.172 |
Tatoeba-test.eng-lld.eng.lld | 0.9 | 0.211 |
Tatoeba-test.eng-lmo.eng.lmo | 0.3 | 0.150 |
Tatoeba-test.eng-mfe.eng.mfe | 68.0 | 0.848 |
Tatoeba-test.eng.multi | 37.2 | 0.583 |
Tatoeba-test.eng-mwl.eng.mwl | 2.7 | 0.356 |
Tatoeba-test.eng-oci.eng.oci | 7.7 | 0.286 |
Tatoeba-test.eng-pap.eng.pap | 43.9 | 0.641 |
Tatoeba-test.eng-pms.eng.pms | 1.8 | 0.177 |
Tatoeba-test.eng-por.eng.por | 40.7 | 0.632 |
Tatoeba-test.eng-roh.eng.roh | 2.2 | 0.247 |
Tatoeba-test.eng-ron.eng.ron | 39.7 | 0.626 |
Tatoeba-test.eng-scn.eng.scn | 0.7 | 0.132 |
Tatoeba-test.eng-spa.eng.spa | 48.8 | 0.679 |
Tatoeba-test.eng-vec.eng.vec | 2.2 | 0.222 |
Tatoeba-test.eng-wln.eng.wln | 6.2 | 0.213 |
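
Because the model is multilingual on the target side, every source sentence needs the `>>id<<` prefix before it is passed to the model. The following is a minimal sketch of that step, not part of the released tooling: the helper name `add_language_token` is hypothetical, the ID set is copied from the "target language(s)" line of this model card, and the single space between token and sentence is an assumption.

```python
# Target language IDs copied from the "target language(s)" line of this model card.
TARGET_LANGUAGE_IDS = frozenset(
    "arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad "
    "lad_Latn lat_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms "
    "por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn".split()
)


def add_language_token(sentence: str, target_id: str) -> str:
    """Prefix a source sentence with the >>id<< token (hypothetical helper)."""
    if target_id not in TARGET_LANGUAGE_IDS:
        raise ValueError(f"unknown target language ID: {target_id!r}")
    # Assumption: one space separates the token from the sentence text.
    return f">>{target_id}<< {sentence.strip()}"


# Same English input, two different target languages.
print(add_language_token("How are you?", "fra"))
print(add_language_token("How are you?", "cat"))
```

Rejecting unknown IDs early is a design choice for the sketch; it surfaces a mistyped label before any translation is attempted.
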
- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad lad_Latn lat_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-20.zip
- test set translations: opus-2020-07-20.test.txt
- test set scores: opus-2020-07-20.eval.txt

testset | BLEU | chr-F |
---|---|---|
Tatoeba-test.eng-arg.eng.arg | 1.5 | 0.117 |
Tatoeba-test.eng-ast.eng.ast | 17.7 | 0.382 |
Tatoeba-test.eng-cat.eng.cat | 47.4 | 0.665 |
Tatoeba-test.eng-cos.eng.cos | 3.1 | 0.297 |
Tatoeba-test.eng-egl.eng.egl | 0.9 | 0.113 |
Tatoeba-test.eng-ext.eng.ext | 7.9 | 0.277 |
Tatoeba-test.eng-fra.eng.fra | 44.6 | 0.632 |
Tatoeba-test.eng-frm.eng.frm | 1.1 | 0.214 |
Tatoeba-test.eng-gcf.eng.gcf | 0.4 | 0.101 |
Tatoeba-test.eng-glg.eng.glg | 43.1 | 0.638 |
Tatoeba-test.eng-hat.eng.hat | 30.0 | 0.528 |
Tatoeba-test.eng-ita.eng.ita | 45.0 | 0.670 |
Tatoeba-test.eng-lad.eng.lad | 6.2 | 0.285 |
Tatoeba-test.eng-lat.eng.lat | 11.9 | 0.376 |
Tatoeba-test.eng-lij.eng.lij | 1.7 | 0.189 |
Tatoeba-test.eng-lld.eng.lld | 0.5 | 0.201 |
Tatoeba-test.eng-lmo.eng.lmo | 0.8 | 0.192 |
Tatoeba-test.eng-mfe.eng.mfe | 83.6 | 0.909 |
Tatoeba-test.eng-msa.eng.msa | 30.9 | 0.546 |
Tatoeba-test.eng.multi | 37.6 | 0.585 |
Tatoeba-test.eng-mwl.eng.mwl | 3.2 | 0.327 |
Tatoeba-test.eng-oci.eng.oci | 7.8 | 0.286 |
Tatoeba-test.eng-pap.eng.pap | 41.4 | 0.613 |
Tatoeba-test.eng-pms.eng.pms | 2.0 | 0.182 |
Tatoeba-test.eng-por.eng.por | 40.8 | 0.633 |
Tatoeba-test.eng-roh.eng.roh | 4.0 | 0.262 |
Tatoeba-test.eng-ron.eng.ron | 40.1 | 0.628 |
Tatoeba-test.eng-scn.eng.scn | 1.6 | 0.175 |
Tatoeba-test.eng-spa.eng.spa | 48.8 | 0.680 |
Tatoeba-test.eng-vec.eng.vec | 2.6 | 0.237 |
Tatoeba-test.eng-wln.eng.wln | 6.8 | 0.228 |
- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad lad_Latn lat_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-27.zip
- test set translations: opus-2020-07-27.test.txt
- test set scores: opus-2020-07-27.eval.txt

testset | BLEU | chr-F |
---|---|---|
newsdev2016-enro-engron.eng.ron | 26.9 | 0.562 |
newsdiscussdev2015-enfr-engfra.eng.fra | 29.7 | 0.572 |
newsdiscusstest2015-enfr-engfra.eng.fra | 34.9 | 0.607 |
newssyscomb2009-engfra.eng.fra | 27.6 | 0.565 |
newssyscomb2009-engita.eng.ita | 28.7 | 0.586 |
newssyscomb2009-engspa.eng.spa | 29.3 | 0.567 |
news-test2008-engfra.eng.fra | 25.0 | 0.535 |
news-test2008-engspa.eng.spa | 26.9 | 0.546 |
newstest2009-engfra.eng.fra | 26.3 | 0.555 |
newstest2009-engita.eng.ita | 28.4 | 0.581 |
newstest2009-engspa.eng.spa | 28.6 | 0.566 |
newstest2010-engfra.eng.fra | 29.2 | 0.572 |
newstest2010-engspa.eng.spa | 33.5 | 0.597 |
newstest2011-engfra.eng.fra | 30.7 | 0.589 |
newstest2011-engspa.eng.spa | 34.6 | 0.597 |
newstest2012-engfra.eng.fra | 29.0 | 0.572 |
newstest2012-engspa.eng.spa | 34.6 | 0.598 |
newstest2013-engfra.eng.fra | 29.6 | 0.563 |
newstest2013-engspa.eng.spa | 31.5 | 0.574 |
newstest2016-enro-engron.eng.ron | 25.4 | 0.544 |
Tatoeba-test.eng-arg.eng.arg | 1.6 | 0.126 |
Tatoeba-test.eng-ast.eng.ast | 18.0 | 0.399 |
Tatoeba-test.eng-cat.eng.cat | 47.7 | 0.669 |
Tatoeba-test.eng-cos.eng.cos | 2.9 | 0.284 |
Tatoeba-test.eng-egl.eng.egl | 0.2 | 0.076 |
Tatoeba-test.eng-ext.eng.ext | 11.0 | 0.280 |
Tatoeba-test.eng-fra.eng.fra | 44.5 | 0.632 |
Tatoeba-test.eng-frm.eng.frm | 0.8 | 0.214 |
Tatoeba-test.eng-gcf.eng.gcf | 0.4 | 0.108 |
Tatoeba-test.eng-glg.eng.glg | 43.7 | 0.641 |
Tatoeba-test.eng-hat.eng.hat | 29.6 | 0.525 |
Tatoeba-test.eng-ita.eng.ita | 45.0 | 0.670 |
Tatoeba-test.eng-lad.eng.lad | 6.2 | 0.286 |
Tatoeba-test.eng-lat.eng.lat | 11.9 | 0.377 |
Tatoeba-test.eng-lij.eng.lij | 1.7 | 0.178 |
Tatoeba-test.eng-lld.eng.lld | 0.8 | 0.201 |
Tatoeba-test.eng-lmo.eng.lmo | 1.1 | 0.201 |
Tatoeba-test.eng-mfe.eng.mfe | 91.9 | 0.956 |
Tatoeba-test.eng-msa.eng.msa | 30.9 | 0.546 |
Tatoeba-test.eng.multi | 37.5 | 0.585 |
Tatoeba-test.eng-mwl.eng.mwl | 3.8 | 0.339 |
Tatoeba-test.eng-oci.eng.oci | 7.7 | 0.290 |
Tatoeba-test.eng-pap.eng.pap | 42.0 | 0.626 |
Tatoeba-test.eng-pms.eng.pms | 2.0 | 0.184 |
Tatoeba-test.eng-por.eng.por | 41.0 | 0.634 |
Tatoeba-test.eng-roh.eng.roh | 3.8 | 0.245 |
Tatoeba-test.eng-ron.eng.ron | 40.4 | 0.630 |
Tatoeba-test.eng-scn.eng.scn | 1.6 | 0.177 |
Tatoeba-test.eng-spa.eng.spa | 48.9 | 0.681 |
Tatoeba-test.eng-vec.eng.vec | 3.1 | 0.232 |
Tatoeba-test.eng-wln.eng.wln | 5.1 | 0.218 |
- dataset: opus2m
- model: transformer
- source language(s): eng
- target language(s): arg ast cat cos egl ext fra frm_Latn gcf_Latn glg hat ind ita lad lad_Latn lat_Latn lij lld_Latn lmo max_Latn mfe min mwl oci pap pms por roh ron scn spa tmw_Latn vec wln zlm_Latn zsm_Latn
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus2m-2020-08-01.zip
- test set translations: opus2m-2020-08-01.test.txt
- test set scores: opus2m-2020-08-01.eval.txt

testset | BLEU | chr-F |
---|---|---|
newsdev2016-enro-engron.eng.ron | 27.1 | 0.565 |
newsdiscussdev2015-enfr-engfra.eng.fra | 29.9 | 0.574 |
newsdiscusstest2015-enfr-engfra.eng.fra | 35.3 | 0.609 |
newssyscomb2009-engfra.eng.fra | 27.7 | 0.567 |
newssyscomb2009-engita.eng.ita | 28.6 | 0.586 |
newssyscomb2009-engspa.eng.spa | 29.8 | 0.569 |
news-test2008-engfra.eng.fra | 25.0 | 0.536 |
news-test2008-engspa.eng.spa | 27.1 | 0.548 |
newstest2009-engfra.eng.fra | 26.7 | 0.557 |
newstest2009-engita.eng.ita | 28.9 | 0.583 |
newstest2009-engspa.eng.spa | 28.9 | 0.567 |
newstest2010-engfra.eng.fra | 29.6 | 0.574 |
newstest2010-engspa.eng.spa | 33.8 | 0.598 |
newstest2011-engfra.eng.fra | 30.9 | 0.590 |
newstest2011-engspa.eng.spa | 34.8 | 0.598 |
newstest2012-engfra.eng.fra | 29.1 | 0.574 |
newstest2012-engspa.eng.spa | 34.9 | 0.600 |
newstest2013-engfra.eng.fra | 30.1 | 0.567 |
newstest2013-engspa.eng.spa | 31.8 | 0.576 |
newstest2016-enro-engron.eng.ron | 25.9 | 0.548 |
Tatoeba-test.eng-arg.eng.arg | 1.6 | 0.120 |
Tatoeba-test.eng-ast.eng.ast | 17.2 | 0.389 |
Tatoeba-test.eng-cat.eng.cat | 47.6 | 0.668 |
Tatoeba-test.eng-cos.eng.cos | 4.3 | 0.287 |
Tatoeba-test.eng-egl.eng.egl | 0.9 | 0.101 |
Tatoeba-test.eng-ext.eng.ext | 8.7 | 0.287 |
Tatoeba-test.eng-fra.eng.fra | 44.9 | 0.635 |
Tatoeba-test.eng-frm.eng.frm | 1.0 | 0.225 |
Tatoeba-test.eng-gcf.eng.gcf | 0.7 | 0.115 |
Tatoeba-test.eng-glg.eng.glg | 44.9 | 0.648 |
Tatoeba-test.eng-hat.eng.hat | 30.9 | 0.533 |
Tatoeba-test.eng-ita.eng.ita | 45.4 | 0.673 |
Tatoeba-test.eng-lad.eng.lad | 5.6 | 0.279 |
Tatoeba-test.eng-lat.eng.lat | 12.1 | 0.380 |
Tatoeba-test.eng-lij.eng.lij | 1.4 | 0.183 |
Tatoeba-test.eng-lld.eng.lld | 0.5 | 0.199 |
Tatoeba-test.eng-lmo.eng.lmo | 0.7 | 0.187 |
Tatoeba-test.eng-mfe.eng.mfe | 83.6 | 0.909 |
Tatoeba-test.eng-msa.eng.msa | 31.3 | 0.549 |
Tatoeba-test.eng.multi | 38.0 | 0.588 |
Tatoeba-test.eng-mwl.eng.mwl | 2.7 | 0.322 |
Tatoeba-test.eng-oci.eng.oci | 8.2 | 0.293 |
Tatoeba-test.eng-pap.eng.pap | 46.7 | 0.663 |
Tatoeba-test.eng-pms.eng.pms | 2.1 | 0.194 |
Tatoeba-test.eng-por.eng.por | 41.2 | 0.635 |
Tatoeba-test.eng-roh.eng.roh | 2.6 | 0.237 |
Tatoeba-test.eng-ron.eng.ron | 40.6 | 0.632 |
Tatoeba-test.eng-scn.eng.scn | 1.6 | 0.181 |
Tatoeba-test.eng-spa.eng.spa | 49.5 | 0.685 |
Tatoeba-test.eng-vec.eng.vec | 1.6 | 0.223 |
Tatoeba-test.eng-wln.eng.wln | 7.1 | 0.250 |
- dataset: opus1m+bt
- model: transformer-align
- source language(s): eng
- target language(s): arg ast cat cbk cos egl ext fra frm gcf glg hat ita lad lat lij lld lmo mfe mol mwl oci osp pap pms pob por roh ron scn spa vec wln
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- valid language labels: >>acf<< >>aoa<< >>arg<< >>ast<< >>cat<< >>cbk<< >>cbk_Latn<< >>ccd<< >>cks<< >>cos<< >>cri<< >>crs<< >>dlm<< >>drc<< >>egl<< >>ext<< >>fab<< >>fax<< >>fra<< >>frc<< >>frm<< >>frm_Latn<< >>fro<< >>frp<< >>fur<< >>gcf<< >>gcf_Latn<< >>gcr<< >>glg<< >>hat<< >>idb<< >>ist<< >>ita<< >>itk<< >>kea<< >>kmv<< >>lad<< >>lad_Latn<< >>lat<< >>lat_Latn<< >>lij<< >>lld<< >>lld_Latn<< >>lmo<< >>lou<< >>mcm<< >>mfe<< >>mol<< >>mwl<< >>mxi<< >>mzs<< >>nap<< >>nrf<< >>oci<< >>osc<< >>osp<< >>osp_Latn<< >>pap<< >>pcd<< >>pln<< >>pms<< >>pob<< >>por<< >>pov<< >>pre<< >>pro<< >>qbb<< >>qhr<< >>rcf<< >>rgn<< >>roh<< >>ron<< >>ruo<< >>rup<< >>ruq<< >>scf<< >>scn<< >>sdc<< >>sdn<< >>spa<< >>spq<< >>spx<< >>src<< >>srd<< >>sro<< >>tmg<< >>tvy<< >>vec<< >>vkp<< >>wln<< >>xfa<< >>xum<<
- download: opus1m+bt-2021-04-10.zip
- test set translations: opus1m+bt-2021-04-10.test.txt
- test set scores: opus1m+bt-2021-04-10.eval.txt

testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
newsdev2016-enro.eng-ron | 21.4 | 0.524 | 1999 | 51566 | 0.971 |
newsdiscussdev2015-enfr.eng-fra | 27.7 | 0.556 | 1500 | 27986 | 1.000 |
newsdiscusstest2015-enfr.eng-fra | 32.1 | 0.588 | 1500 | 28027 | 0.994 |
newssyscomb2009.eng-fra | 26.6 | 0.558 | 502 | 12334 | 1.000 |
newssyscomb2009.eng-ita | 27.4 | 0.578 | 502 | 11551 | 1.000 |
newssyscomb2009.eng-spa | 28.8 | 0.565 | 502 | 12506 | 0.983 |
news-test2008.eng-fra | 23.8 | 0.527 | 2051 | 52685 | 0.995 |
news-test2008.eng-spa | 26.3 | 0.541 | 2051 | 52596 | 0.997 |
newstest2009.eng-fra | 24.9 | 0.544 | 2525 | 69278 | 0.976 |
newstest2009.eng-ita | 27.3 | 0.572 | 2525 | 63474 | 1.000 |
newstest2009.eng-spa | 27.8 | 0.560 | 2525 | 68114 | 0.998 |
newstest2010.eng-fra | 27.1 | 0.559 | 2489 | 66043 | 0.985 |
newstest2010.eng-spa | 32.2 | 0.588 | 2489 | 65522 | 0.993 |
newstest2011.eng-fra | 29.2 | 0.576 | 3003 | 80626 | 0.969 |
newstest2011.eng-spa | 33.8 | 0.591 | 3003 | 79476 | 0.978 |
newstest2012.eng-fra | 27.3 | 0.560 | 3003 | 78011 | 0.984 |
newstest2012.eng-spa | 33.5 | 0.590 | 3003 | 79006 | 0.962 |
newstest2013.eng-fra | 27.7 | 0.549 | 3000 | 70037 | 0.972 |
newstest2013.eng-spa | 30.3 | 0.566 | 3000 | 70528 | 0.948 |
newstest2016-enro.eng-ron | 20.8 | 0.510 | 1999 | 49094 | 0.984 |
Tatoeba-test.eng-arg | 12.4 | 0.328 | 105 | 405 | 1.000 |
Tatoeba-test.eng-ast | 24.4 | 0.476 | 99 | 720 | 0.980 |
Tatoeba-test.eng-cat | 44.5 | 0.648 | 1631 | 12342 | 0.989 |
Tatoeba-test.eng-cbk | 4.4 | 0.253 | 1498 | 10591 | 0.968 |
Tatoeba-test.eng-cos | 39.5 | 0.680 | 5 | 45 | 0.931 |
Tatoeba-test.eng-egl | 0.4 | 0.118 | 84 | 438 | 1.000 |
Tatoeba-test.eng-ext | 11.4 | 0.345 | 69 | 353 | 1.000 |
Tatoeba-test.eng-fra | 39.8 | 0.605 | 10000 | 80759 | 0.974 |
Tatoeba-test.eng-frm | 2.1 | 0.221 | 18 | 211 | 1.000 |
Tatoeba-test.eng-gcf | 0.8 | 0.118 | 99 | 560 | 0.989 |
Tatoeba-test.eng-glg | 41.5 | 0.627 | 1008 | 7828 | 0.986 |
Tatoeba-test.eng-hat | 33.1 | 0.549 | 64 | 416 | 0.978 |
Tatoeba-test.eng-ita | 42.5 | 0.651 | 10000 | 65498 | 0.953 |
Tatoeba-test.eng-lad | 7.5 | 0.288 | 629 | 3354 | 1.000 |
Tatoeba-test.eng-lad_Latn | 8.0 | 0.314 | 582 | 3097 | 1.000 |
Tatoeba-test.eng-lat | 10.4 | 0.371 | 10000 | 74902 | 0.930 |
Tatoeba-test.eng-lij | 4.0 | 0.278 | 94 | 711 | 0.983 |
Tatoeba-test.eng-lld | 1.0 | 0.213 | 21 | 228 | 0.973 |
Tatoeba-test.eng-lmo | 8.8 | 0.317 | 17 | 124 | 1.000 |
Tatoeba-test.eng-mfe | 83.6 | 0.905 | 7 | 36 | 1.000 |
Tatoeba-test.eng-multi | 35.1 | 0.564 | 10000 | 74243 | 0.964 |
Tatoeba-test.eng-mwl | 7.8 | 0.505 | 4 | 21 | 1.000 |
Tatoeba-test.eng-oci | 9.9 | 0.330 | 841 | 5219 | 0.910 |
Tatoeba-test.eng-osp | 13.9 | 0.331 | 3 | 20 | 1.000 |
Tatoeba-test.eng-pap | 49.0 | 0.673 | 70 | 376 | 1.000 |
Tatoeba-test.eng-pms | 14.3 | 0.359 | 268 | 2244 | 0.944 |
Tatoeba-test.eng-por | 41.6 | 0.640 | 10000 | 75353 | 0.971 |
Tatoeba-test.eng-roh | 22.5 | 0.476 | 16 | 198 | 1.000 |
Tatoeba-test.eng-ron | 33.6 | 0.580 | 5000 | 36833 | 0.970 |
Tatoeba-test.eng-scn | 38.9 | 0.482 | 4 | 42 | 1.000 |
Tatoeba-test.eng-spa | 45.4 | 0.657 | 10000 | 77291 | 0.974 |
Tatoeba-test.eng-vec | 5.6 | 0.315 | 19 | 127 | 0.927 |
Tatoeba-test.eng-wln | 11.9 | 0.299 | 89 | 520 | 0.951 |
tico19-test.eng-fra | 33.2 | 0.588 | 2100 | 64655 | 0.978 |
tico19-test.eng-pob | 41.4 | 0.686 | 2100 | 62729 | 0.947 |
tico19-test.eng-por | 40.7 | 0.683 | 2100 | 62729 | 0.959 |
tico19-test.eng-spa | 42.4 | 0.681 | 2100 | 66591 | 0.950 |
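
To compare scores across test sets or releases, the pipe-separated benchmark rows can be loaded into plain Python structures. This is a rough sketch under stated assumptions: the function name `parse_score_rows` is hypothetical, only the first three columns (test set, BLEU, chr-F) are read, and the sample rows are copied from the 2021-04-10 table in this document.

```python
from typing import Dict, List


def parse_score_rows(table_text: str) -> List[Dict]:
    """Parse 'testset | BLEU | chr-F | ...' rows into dictionaries (hypothetical helper)."""
    rows = []
    for line in table_text.strip().splitlines():
        # Drop the trailing pipe, then split the row into trimmed cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip short lines, the header row, and the ---|--- separator row.
        if len(cells) < 3 or cells[0] == "testset" or set(cells[0]) <= {"-", ":"}:
            continue
        rows.append(
            {"testset": cells[0], "bleu": float(cells[1]), "chrf": float(cells[2])}
        )
    return rows


# Sample rows copied from the opus1m+bt-2021-04-10 benchmark table.
SAMPLE = """
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
Tatoeba-test.eng-fra | 39.8 | 0.605 | 10000 | 80759 | 0.974 |
Tatoeba-test.eng-spa | 45.4 | 0.657 | 10000 | 77291 | 0.974 |
Tatoeba-test.eng-egl | 0.4 | 0.118 | 84 | 438 | 1.000 |
"""

rows = parse_score_rows(SAMPLE)
best = max(rows, key=lambda r: r["bleu"])
print(best["testset"], best["bleu"])
```

The same function accepts the three-column tables of the 2020 releases, since the extra `#sent`, `#words`, and `BP` columns are simply ignored.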