eng-inc

opus-2020-06-28.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb rom sin snd_Arab urd
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial target-language token is required, in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-06-28.zip
  • test set translations: opus-2020-06-28.test.txt
  • test set scores: opus-2020-06-28.eval.txt
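
The sentence-initial token requirement above can be sketched as a small helper. This is illustrative code of our own, not part of the released model; the `VALID_TARGETS` set is transcribed from the target-language list above.

```python
# Illustrative helper (not part of the OPUS-MT release): prepend the
# required >>id<< target-language token to an English source sentence.

# Transcribed from the "target language(s)" list above.
VALID_TARGETS = {
    "asm", "awa", "ben", "bho", "gom", "guj", "hif_Latn", "hin", "mai",
    "mar", "npi", "ori", "pan_Guru", "pnb", "rom", "sin", "snd_Arab", "urd",
}

def add_lang_token(sentence: str, target: str) -> str:
    """Prefix a sentence with the sentence-initial >>id<< token."""
    if target not in VALID_TARGETS:
        raise ValueError(f"unknown target language ID: {target!r}")
    return f">>{target}<< {sentence}"

print(add_lang_token("How are you?", "hin"))  # >>hin<< How are you?
```

The token must come before any other text, since the model reads it as the first input symbol to select the target language.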

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-asm.eng.asm 3.0 0.245
Tatoeba-test.eng-awa.eng.awa 0.4 0.098
Tatoeba-test.eng-ben.eng.ben 16.5 0.481
Tatoeba-test.eng-bho.eng.bho 0.8 0.110
Tatoeba-test.eng-guj.eng.guj 19.9 0.393
Tatoeba-test.eng-hif.eng.hif 0.5 0.022
Tatoeba-test.eng-hin.eng.hin 17.4 0.463
Tatoeba-test.eng-kok.eng.kok 8.1 0.006
Tatoeba-test.eng-lah.eng.lah 0.2 0.001
Tatoeba-test.eng-mai.eng.mai 7.6 0.374
Tatoeba-test.eng-mar.eng.mar 20.4 0.464
Tatoeba-test.eng.multi 17.0 0.442
Tatoeba-test.eng-nep.eng.nep 1.0 0.102
Tatoeba-test.eng-ori.eng.ori 2.2 0.198
Tatoeba-test.eng-pan.eng.pan 8.4 0.343
Tatoeba-test.eng-rom.eng.rom 0.3 0.185
Tatoeba-test.eng-sin.eng.sin 9.5 0.368
Tatoeba-test.eng-snd.eng.snd 6.8 0.343
Tatoeba-test.eng-urd.eng.urd 12.5 0.414

opus-2020-07-06.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb rom san_Deva sin snd_Arab urd
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial target-language token is required, in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-07-06.zip
  • test set translations: opus-2020-07-06.test.txt
  • test set scores: opus-2020-07-06.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-asm.eng.asm 3.6 0.277
Tatoeba-test.eng-awa.eng.awa 0.4 0.144
Tatoeba-test.eng-ben.eng.ben 15.9 0.466
Tatoeba-test.eng-bho.eng.bho 0.6 0.152
Tatoeba-test.eng-guj.eng.guj 20.9 0.380
Tatoeba-test.eng-hif.eng.hif 0.6 0.032
Tatoeba-test.eng-hin.eng.hin 17.2 0.461
Tatoeba-test.eng-kok.eng.kok 3.3 0.022
Tatoeba-test.eng-lah.eng.lah 0.3 0.007
Tatoeba-test.eng-mai.eng.mai 8.9 0.392
Tatoeba-test.eng-mar.eng.mar 20.1 0.463
Tatoeba-test.eng.multi 16.8 0.439
Tatoeba-test.eng-nep.eng.nep 0.6 0.058
Tatoeba-test.eng-ori.eng.ori 2.2 0.187
Tatoeba-test.eng-pan.eng.pan 9.6 0.351
Tatoeba-test.eng-rom.eng.rom 0.4 0.188
Tatoeba-test.eng-san.eng.san 1.5 0.111
Tatoeba-test.eng-sin.eng.sin 9.1 0.370
Tatoeba-test.eng-snd.eng.snd 1.9 0.235
Tatoeba-test.eng-urd.eng.urd 12.7 0.412

opus-2020-07-26.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb rom san_Deva sin snd_Arab urd
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial target-language token is required, in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-07-26.zip
  • test set translations: opus-2020-07-26.test.txt
  • test set scores: opus-2020-07-26.eval.txt

Benchmarks

testset BLEU chr-F
newsdev2014-enghin.eng.hin 7.5 0.337
newsdev2019-engu-engguj.eng.guj 6.3 0.282
newstest2014-hien-enghin.eng.hin 11.0 0.358
newstest2019-engu-engguj.eng.guj 7.1 0.291
Tatoeba-test.eng-asm.eng.asm 3.7 0.260
Tatoeba-test.eng-awa.eng.awa 0.4 0.144
Tatoeba-test.eng-ben.eng.ben 16.0 0.466
Tatoeba-test.eng-bho.eng.bho 0.6 0.143
Tatoeba-test.eng-guj.eng.guj 20.2 0.375
Tatoeba-test.eng-hif.eng.hif 0.5 0.040
Tatoeba-test.eng-hin.eng.hin 17.3 0.462
Tatoeba-test.eng-kok.eng.kok 3.3 0.044
Tatoeba-test.eng-lah.eng.lah 0.2 0.005
Tatoeba-test.eng-mai.eng.mai 9.3 0.385
Tatoeba-test.eng-mar.eng.mar 19.9 0.461
Tatoeba-test.eng.multi 16.6 0.436
Tatoeba-test.eng-nep.eng.nep 0.7 0.067
Tatoeba-test.eng-ori.eng.ori 2.2 0.196
Tatoeba-test.eng-pan.eng.pan 7.0 0.342
Tatoeba-test.eng-rom.eng.rom 0.4 0.187
Tatoeba-test.eng-san.eng.san 1.7 0.109
Tatoeba-test.eng-sin.eng.sin 9.1 0.365
Tatoeba-test.eng-snd.eng.snd 5.6 0.343
Tatoeba-test.eng-urd.eng.urd 12.9 0.411

opus2m-2020-08-01.zip

  • dataset: opus2m
  • model: transformer
  • source language(s): eng
  • target language(s): asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb rom san_Deva sin snd_Arab urd
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial target-language token is required, in the form >>id<< (id = a valid target language ID)
  • download: opus2m-2020-08-01.zip
  • test set translations: opus2m-2020-08-01.test.txt
  • test set scores: opus2m-2020-08-01.eval.txt

Benchmarks

testset BLEU chr-F
newsdev2014-enghin.eng.hin 8.2 0.342
newsdev2019-engu-engguj.eng.guj 6.5 0.293
newstest2014-hien-enghin.eng.hin 11.4 0.364
newstest2019-engu-engguj.eng.guj 7.2 0.296
Tatoeba-test.eng-asm.eng.asm 2.7 0.277
Tatoeba-test.eng-awa.eng.awa 0.5 0.132
Tatoeba-test.eng-ben.eng.ben 16.7 0.470
Tatoeba-test.eng-bho.eng.bho 4.3 0.227
Tatoeba-test.eng-guj.eng.guj 17.5 0.373
Tatoeba-test.eng-hif.eng.hif 0.6 0.028
Tatoeba-test.eng-hin.eng.hin 17.7 0.469
Tatoeba-test.eng-kok.eng.kok 1.7 0.000
Tatoeba-test.eng-lah.eng.lah 0.3 0.028
Tatoeba-test.eng-mai.eng.mai 15.6 0.429
Tatoeba-test.eng-mar.eng.mar 21.3 0.477
Tatoeba-test.eng.multi 17.3 0.448
Tatoeba-test.eng-nep.eng.nep 0.8 0.081
Tatoeba-test.eng-ori.eng.ori 2.2 0.208
Tatoeba-test.eng-pan.eng.pan 8.0 0.347
Tatoeba-test.eng-rom.eng.rom 0.4 0.197
Tatoeba-test.eng-san.eng.san 0.5 0.108
Tatoeba-test.eng-sin.eng.sin 9.1 0.364
Tatoeba-test.eng-snd.eng.snd 4.4 0.284
Tatoeba-test.eng-urd.eng.urd 13.3 0.423

opus1m+bt-2021-04-13.zip

  • dataset: opus1m+bt
  • model: transformer-align
  • source language(s): eng
  • target language(s): asm awa ben bho dty gbm gom guj hif hin mai mar nep npi ori pan pnb rmn rmy rom san sin snd urd
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial target-language token is required, in the form >>id<< (id = a valid target language ID from the label list below)
  • valid language labels: >>aee<< >>aeq<< >>anp<< >>anr<< >>asm<< >>awa<< >>bdv<< >>ben<< >>ben_Cyrl<< >>ben_Deva<< >>ben_Gujr<< >>bfb<< >>bfy<< >>bfz<< >>bgc<< >>bgd<< >>bge<< >>bgq<< >>bgw<< >>bha<< >>bhb<< >>bhd<< >>bhe<< >>bhi<< >>bho<< >>bht<< >>bhu<< >>bjj<< >>bkk<< >>bmj<< >>bns<< >>bpx<< >>bpy<< >>bra<< >>btv<< >>ccp<< >>cdh<< >>cdi<< >>cdj<< >>cih<< >>clh<< >>ctg<< >>dcc<< >>dgo<< >>dhd<< >>dhn<< >>dho<< >>div<< >>dmk<< >>dml<< >>doi<< >>dry<< >>dty<< >>dub<< >>duh<< >>dwz<< >>emx<< >>gas<< >>gbk<< >>gbl<< >>gbm<< >>gda<< >>gdx<< >>ggg<< >>ghr<< >>gig<< >>gjk<< >>gju<< >>glh<< >>gom<< >>gra<< >>guj<< >>gwc<< >>gwf<< >>gwt<< >>haj<< >>hca<< >>hif<< >>hif_Latn<< >>hii<< >>hin<< >>hlb<< >>hnd<< >>hne<< >>hno<< >>hns<< >>hoj<< >>jat<< >>jdg<< >>jml<< >>jnd<< >>jns<< >>kas<< >>kbu<< >>keq<< >>key<< >>kfr<< >>kfs<< >>kft<< >>kfu<< >>kfv<< >>kfx<< >>kfy<< >>khn<< >>khw<< >>kjo<< >>kls<< >>knn<< >>kok<< >>kra<< >>ksy<< >>kvx<< >>kxp<< >>kyw<< >>lah<< >>lbm<< >>lhl<< >>lmn<< >>lss<< >>luv<< >>mag<< >>mai<< >>mar<< >>mby<< >>mjl<< >>mjz<< >>mkb<< >>mke<< >>mki<< >>mtr<< >>mup<< >>mve<< >>mvy<< >>mwr<< >>nag<< >>nep<< >>nhh<< >>nli<< >>nlx<< >>noe<< >>noi<< >>npi<< >>odk<< >>omr<< >>ori<< >>ort<< >>ory<< >>pan<< >>pan_Guru<< >>paq<< >>pcl<< >>pgg<< >>phd<< >>phl<< >>phr<< >>pli<< >>plk<< >>plp<< >>pmh<< >>pmu<< >>pnb<< >>pnb_Guru<< >>psh<< >>psi<< >>psu<< >>pwr<< >>qpp<< >>raj<< >>rei<< >>rhg<< >>rjs<< >>rkt<< >>rmc<< >>rmf<< >>rmi<< >>rml<< >>rmn<< >>rmo<< >>rmq<< >>rmt<< >>rmw<< >>rmy<< >>rom<< >>rtw<< >>rwr<< >>san<< >>san_Deva<< >>saz<< >>sbn<< >>sck<< >>scl<< >>sdg<< >>sdr<< >>shd<< >>sin<< >>sjp<< >>skr<< >>smm<< >>smv<< >>snd<< >>snd_Arab<< >>soi<< >>spv<< >>srx<< >>ssi<< >>sts<< >>swv<< >>syl<< >>tdb<< >>the<< >>thl<< >>thq<< >>thr<< >>tkb<< >>tkt<< >>tnv<< >>tra<< >>trw<< >>urd<< >>ush<< >>vaa<< >>vah<< >>vas<< >>vav<< >>ved<< >>vgr<< >>wbr<< >>wry<< >>wsv<< >>wtm<< >>xhe<< >>xka<< >>xnr<<
  • download: opus1m+bt-2021-04-13.zip
  • test set translations: opus1m+bt-2021-04-13.test.txt
  • test set scores: opus1m+bt-2021-04-13.eval.txt
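
Several languages in the valid-label list appear both bare and script-qualified (e.g. >>ben<<, >>ben_Cyrl<<, >>ben_Deva<<). A small sketch, using our own helper name and an excerpt of the label list above, shows one way to look up all valid tokens for a bare ISO-639-3 code:

```python
# Our own helper (not part of OPUS-MT): list the valid target labels for a
# bare ISO-639-3 code, with the bare form (if present) sorted first.

# Excerpt of the valid-label list above.
VALID_LABELS = ["ben", "ben_Cyrl", "ben_Deva", "hin", "pan", "pan_Guru",
                "snd", "snd_Arab", "urd"]

def candidates(iso3: str, labels=VALID_LABELS):
    """Return every label matching the code, bare form first, rest sorted."""
    matches = (l for l in labels if l == iso3 or l.startswith(iso3 + "_"))
    return sorted(matches, key=lambda l: (l != iso3, l))

print(candidates("ben"))  # ['ben', 'ben_Cyrl', 'ben_Deva']
print(candidates("pan"))  # ['pan', 'pan_Guru']
```

Whether the bare or script-qualified token gives better output for a given language is not stated in the model card; the benchmark rows below only cover a subset of labels.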

Benchmarks

testset BLEU chr-F #sent #words BP (brevity penalty)
newsdev2014.eng-hin 8.4 0.363 520 9538 1.000
newsdev2019-engu.eng-guj 7.6 0.312 1998 39137 0.810
newstest2014-hien.eng-hin 12.0 0.384 2507 60878 1.000
newstest2019-engu.eng-guj 7.9 0.320 998 21927 0.806
Tatoeba-test.eng-asm 3.5 0.256 117 569 1.000
Tatoeba-test.eng-awa 0.4 0.084 279 1148 1.000
Tatoeba-test.eng-ben 9.9 0.446 2500 11654 1.000
Tatoeba-test.eng-bho 2.0 0.246 42 244 1.000
Tatoeba-test.eng-gbm 0.3 0.075 39 153 1.000
Tatoeba-test.eng-guj 20.8 0.418 154 824 1.000
Tatoeba-test.eng-hif 0.7 0.038 36 231 1.000
Tatoeba-test.eng-hin 17.0 0.466 5000 32904 1.000
Tatoeba-test.eng-kok 8.1 0.005 1 6 1.000
Tatoeba-test.eng-lah 0.2 0.018 32 182 1.000
Tatoeba-test.eng-mai 7.8 0.304 8 19 1.000
Tatoeba-test.eng-mar 22.1 0.504 10000 58667 0.985
Tatoeba-test.eng-multi 16.3 0.451 10000 59570 1.000
Tatoeba-test.eng-nep 0.7 0.104 115 413 1.000
Tatoeba-test.eng-ori 0.3 0.003 33 205 1.000
Tatoeba-test.eng-pan 6.2 0.312 87 603 1.000
Tatoeba-test.eng-rom 2.9 0.253 671 4974 1.000
Tatoeba-test.eng-san 1.0 0.107 144 389 1.000
Tatoeba-test.eng-sin 8.1 0.350 45 234 1.000
Tatoeba-test.eng-snd 6.5 0.334 4 18 1.000
Tatoeba-test.eng-urd 10.5 0.390 1663 12154 1.000
tico19-test.eng-ben 6.7 0.376 2100 51751 1.000
tico19-test.eng-hin 18.1 0.432 2100 62738 1.000
tico19-test.eng-mar 7.5 0.364 2100 50881 0.844
tico19-test.eng-nep 7.6 0.407 2100 48706 1.000
tico19-test.eng-urd 8.4 0.329 2100 65363 0.943
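
The benchmark rows above follow a fixed column layout (testset, BLEU, chr-F, #sent, #words, BP), so they can be parsed programmatically. The sketch below is our own code, not part of the release; the two sample rows are copied from the table above.

```python
# Our own sketch: parse whitespace-separated benchmark rows into records
# and pick the test set with the highest BLEU.

from typing import NamedTuple

class Score(NamedTuple):
    testset: str
    bleu: float
    chrf: float
    sents: int
    words: int
    bp: float  # brevity penalty

def parse_row(line: str) -> Score:
    """Split one benchmark row into typed fields."""
    name, bleu, chrf, sents, words, bp = line.split()
    return Score(name, float(bleu), float(chrf), int(sents), int(words), float(bp))

rows = [
    "Tatoeba-test.eng-mar 22.1 0.504 10000 58667 0.985",
    "Tatoeba-test.eng-hin 17.0 0.466 5000 32904 1.000",
]
best = max(map(parse_row, rows), key=lambda r: r.bleu)
print(best.testset, best.bleu)  # Tatoeba-test.eng-mar 22.1
```

Note that tiny test sets (e.g. eng-kok with 1 sentence, eng-snd with 4) make the corresponding BLEU and chr-F figures unreliable, so any such comparison should weight by #sent.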