opus2m-2020-08-01.zip
dataset: opus2m
model: transformer
source language(s): eng
target language(s): bod brx brx_Latn cjy_Hans cjy_Hant cmn cmn_Hans cmn_Hant gan lzh lzh_Hans mya nan wuu yue yue_Hans yue_Hant zho zho_Hans zho_Hant
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form of >>id<< (id = valid target language ID)
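As a minimal sketch of the token requirement above, the helper below prefixes a source sentence with a `>>id<<` token. The function name `add_language_token` and the single space between the token and the sentence are assumptions made for illustration; the README only states that the token must be sentence-initial.

```python
def add_language_token(sentence: str, target_id: str) -> str:
    """Prefix a source sentence with the >>id<< target-language token.

    Hypothetical helper: the single separating space is an assumption,
    not something the README specifies.
    """
    target_id = target_id.strip()
    if not target_id:
        raise ValueError("target_id must be a non-empty language ID")
    return f">>{target_id}<< {sentence.strip()}"


# Example: request Simplified Mandarin output for an English sentence.
tagged = add_language_token("How are you?", "cmn_Hans")
print(tagged)  # >>cmn_Hans<< How are you?
```

The tagged string is what would then go through normalization and SentencePiece segmentation before translation.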
download: opus2m-2020-08-01.zip
test set translations: opus2m-2020-08-01.test.txt
test set scores: opus2m-2020-08-01.eval.txt
| testset | BLEU | chr-F |
|---|---|---|
| newsdev2017-enzh-engzho.eng.zho | 23.5 | 0.217 |
| newstest2017-enzh-engzho.eng.zho | 23.2 | 0.223 |
| newstest2018-enzh-engzho.eng.zho | 25.0 | 0.230 |
| newstest2019-enzh-engzho.eng.zho | 20.2 | 0.225 |
| Tatoeba-test.eng-bod.eng.bod | 0.4 | 0.147 |
| Tatoeba-test.eng-brx.eng.brx | 0.5 | 0.012 |
| Tatoeba-test.eng.multi | 25.7 | 0.223 |
| Tatoeba-test.eng-mya.eng.mya | 0.2 | 0.222 |
| Tatoeba-test.eng-zho.eng.zho | 29.2 | 0.249 |
opus1m+bt-2021-04-10.zip
dataset: opus1m+bt
model: transformer-align
source language(s): eng
target language(s): bod brx cjy cmn dng gan hak hsn lzh mya nan nst wuu yue
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form of >>id<< (id = valid target language ID)
valid language labels: >>acn<< >>adi<< >>adl<< >>adx<< >>aeu<< >>ahk<< >>aim<< >>ajz<< >>anl<< >>anm<< >>aot<< >>aph<< >>apt<< >>atb<< >>aub<< >>aza<< >>bap<< >>bca<< >>bee<< >>bfc<< >>bfs<< >>bft<< >>bfu<< >>bgg<< >>bgr<< >>bhj<< >>biu<< >>blk<< >>bod<< >>bqh<< >>brd<< >>bro<< >>brx<< >>brx_Latn<< >>bwe<< >>bxd<< >>byh<< >>byo<< >>byw<< >>bzi<< >>cbl<< >>cda<< >>cdf<< >>cdm<< >>cdn<< >>cdo<< >>cek<< >>cfm<< >>cgk<< >>chx<< >>cik<< >>cjy<< >>cjy_Hans<< >>cjy_Hant<< >>ckh<< >>clk<< >>clt<< >>cmn<< >>cmn_Hans<< >>cmn_Hant<< >>cmr<< >>cna<< >>cnb<< >>cnc<< >>cng<< >>cnh<< >>cnk<< >>cnw<< >>cpx<< >>csh<< >>csv<< >>csy<< >>ctd<< >>ctn<< >>cur<< >>cuw<< >>cvg<< >>czh<< >>czo<< >>czt<< >>dao<< >>der<< >>dhi<< >>dis<< >>dka<< >>dln<< >>dng<< >>drd<< >>dre<< >>drq<< >>dus<< >>duu<< >>dzl<< >>dzo<< >>eky<< >>emg<< >>enu<< >>ero<< >>ers<< >>gan<< >>ghe<< >>ghh<< >>ghk<< >>ght<< >>gnb<< >>goe<< >>gqi<< >>gro<< >>grt<< >>gvr<< >>hak<< >>hak_Hani<< >>hle<< >>hlt<< >>hmr<< >>hni<< >>how<< >>hpo<< >>hra<< >>hsn<< >>hsn_Hani<< >>hut<< >>iii<< >>int<< >>jda<< >>jee<< >>jih<< >>jiq<< >>jiu<< >>jiy<< >>jkm<< >>jkr<< >>jmn<< >>jna<< >>jnl<< >>jul<< >>jya<< >>kac<< >>kaf<< >>kbg<< >>kdq<< >>kdv<< >>kfk<< >>kfw<< >>kgj<< >>kgy<< >>khg<< >>kif<< >>kip<< >>kix<< >>kjl<< >>kjp<< >>kjt<< >>kjz<< >>kkf<< >>kkt<< >>kle<< >>klr<< >>kmm<< >>kpp<< >>ksw<< >>kte<< >>ktp<< >>kvl<< >>kvq<< >>kvu<< >>kvy<< >>kxf<< >>kxk<< >>kyu<< >>kzq<< >>lae<< >>lax<< >>lbf<< >>lbj<< >>lbr<< >>lep<< >>lgh<< >>lhi<< >>lhm<< >>lhp<< >>lhu<< >>lif<< >>lis<< >>lkc<< >>lkh<< >>llh<< >>lmk<< >>loy<< >>lpn<< >>lpo<< >>lrr<< >>lsh<< >>lsi<< >>ltc<< >>luk<< >>lus<< >>lwm<< >>lwu<< >>lya<< >>lzh<< >>lzh_Hans<< >>lzn<< >>mgp<< >>mhu<< >>mhx<< >>mjw<< >>mni<< >>mnp<< >>mpz<< >>mrd<< >>mrg<< >>mrh<< >>mro<< >>muk<< >>mvm<< >>mwq<< >>mxj<< >>mya<< >>nan<< >>nao<< >>nbc<< >>nbe<< >>nbi<< >>nbt<< >>nbu<< >>ncd<< >>nct<< >>neh<< >>nes<< >>new<< >>njb<< >>njh<< >>njm<< >>njn<< >>njo<< >>njz<< >>nkb<< >>nkd<< >>nkh<< >>nki<< >>nma<< 
>>nme<< >>nmf<< >>nmh<< >>nmm<< >>nmo<< >>nmy<< >>nng<< >>nnl<< >>nnp<< >>nos<< >>npa<< >>nph<< >>npo<< >>nre<< >>nri<< >>nru<< >>nsa<< >>nsd<< >>nsf<< >>nsm<< >>nst<< >>nst_Latn<< >>nty<< >>nuf<< >>nun<< >>nwc<< >>nxq<< >>nzm<< >>obr<< >>och<< >>ola<< >>ole<< >>onp<< >>pck<< >>pdu<< >>phh<< >>pho<< >>phq<< >>pkh<< >>pmi<< >>pmj<< >>pmx<< >>prx<< >>pub<< >>pum<< >>pwo<< >>pww<< >>pyx<< >>pyy<< >>pzn<< >>qvy<< >>qxs<< >>raa<< >>rab<< >>raf<< >>rah<< >>ral<< >>raq<< >>rau<< >>rav<< >>raw<< >>rgk<< >>ria<< >>rji<< >>rki<< >>rmz<< >>rnl<< >>rnp<< >>ruh<< >>sbu<< >>sch<< >>scp<< >>scu<< >>sdp<< >>sez<< >>sgk<< >>sgp<< >>sgt<< >>shl<< >>sip<< >>sjl<< >>skj<< >>slt<< >>smh<< >>smt<< >>spt<< >>ssk<< >>suv<< >>suz<< >>sxg<< >>syw<< >>taj<< >>tcl<< >>tcn<< >>tco<< >>tcp<< >>tcz<< >>tdg<< >>tdh<< >>tge<< >>tgf<< >>tgj<< >>thf<< >>ths<< >>tij<< >>tji<< >>tjs<< >>tpe<< >>tpq<< >>tro<< >>trp<< >>tsj<< >>tsk<< >>ttz<< >>tvn<< >>tvt<< >>twm<< >>txg<< >>txo<< >>ugo<< >>umn<< >>usi<< >>vap<< >>vay<< >>wea<< >>wly<< >>wme<< >>wuh<< >>wuu<< >>wxa<< >>xac<< >>xct<< >>xkf<< >>xkz<< >>xns<< >>xsr<< >>xzh<< >>ybh<< >>ybi<< >>ybk<< >>ych<< >>ycl<< >>ycp<< >>ygp<< >>yhl<< >>yif<< >>yig<< >>yik<< >>yim<< >>yip<< >>yiq<< >>yit<< >>yiu<< >>yiv<< >>yix<< >>yiz<< >>ykl<< >>ykn<< >>ykt<< >>yku<< >>ylm<< >>ylo<< >>ymc<< >>ymd<< >>ymh<< >>ymi<< >>ymq<< >>ymx<< >>ymz<< >>yna<< >>ypa<< >>ypb<< >>ypg<< >>yph<< >>ypm<< >>ypn<< >>ypo<< >>ypp<< >>ypz<< >>ysd<< >>ysg<< >>ysn<< >>yso<< >>ysy<< >>yta<< >>ytl<< >>ytp<< >>yue<< >>yue_Hans<< >>yue_Hant<< >>ywl<< >>ywq<< >>ywt<< >>ywu<< >>yyz<< >>yzk<< >>zal<< >>zau<< >>zhb<< >>zho<< >>zkr<< >>zom<< >>zyp<<
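Because the label list is long, it can help to check a target ID against it before building the input. The sketch below parses a label line into a set and tests membership. The names `parse_labels` and `is_valid_target` are hypothetical, and `excerpt` is a short sample taken from the list above, not the full list.

```python
import re

# Matches one label such as >>mya<< or >>brx_Latn<< and captures the ID.
LABEL_PATTERN = re.compile(r">>([A-Za-z_]+)<<")


def parse_labels(label_line: str) -> set:
    """Extract the language IDs from a 'valid language labels' line."""
    return set(LABEL_PATTERN.findall(label_line))


def is_valid_target(lang_id: str, labels: set) -> bool:
    """Return True if lang_id appears among the parsed labels."""
    return lang_id in labels


# Short excerpt of the label list above (the real list is much longer).
excerpt = ">>bod<< >>brx<< >>brx_Latn<< >>cmn_Hans<< >>mya<< >>zho<<"
labels = parse_labels(excerpt)

print(is_valid_target("mya", labels))  # True
print(is_valid_target("fra", labels))  # False
```

Note that a label being accepted does not imply useful output quality: the benchmark tables report scores for only a small subset of these IDs.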
download: opus1m+bt-2021-04-10.zip
test set translations: opus1m+bt-2021-04-10.test.txt
test set scores: opus1m+bt-2021-04-10.eval.txt
| testset | BLEU | chr-F | #sent | #words | BP |
|---|---|---|---|---|---|
| Tatoeba-test.eng-bod | 0.3 | 0.170 | 38 | 42 | 1.000 |
| Tatoeba-test.eng-brx | 0.4 | 0.015 | 12 | 43 | 1.000 |
| Tatoeba-test.eng-brx_Latn | 1.5 | 0.039 | 3 | 15 | 1.000 |
| Tatoeba-test.eng-cjy_Hans | 15.4 | 0.148 | 2 | 21 | 1.000 |
| Tatoeba-test.eng-cjy_Hant | 8.1 | 0.040 | 1 | 4 | 1.000 |
| Tatoeba-test.eng-cmn | 10.1 | 0.295 | 15 | 87 | 1.000 |
| Tatoeba-test.eng-cmn_Hans | 30.8 | 0.263 | 4195 | 48021 | 0.900 |
| Tatoeba-test.eng-cmn_Hant | 30.2 | 0.261 | 4418 | 46882 | 0.809 |
| Tatoeba-test.eng-dng | 0.8 | 0.006 | 6 | 25 | 1.000 |
| Tatoeba-test.eng-gan | 10.3 | 0.106 | 1 | 14 | 0.846 |
| Tatoeba-test.eng-lzh | 0.2 | 0.024 | 98 | 702 | 1.000 |
| Tatoeba-test.eng-lzh_Hans | 0.4 | 0.008 | 3 | 38 | 1.000 |
| Tatoeba-test.eng-multi | 25.7 | 0.194 | 10000 | 106007 | 0.932 |
| Tatoeba-test.eng-mya | 0.2 | 0.167 | 216 | 1219 | 1.000 |
| Tatoeba-test.eng-nan | 2.7 | 0.020 | 2 | 10 | 1.000 |
| Tatoeba-test.eng-nst | 0.1 | 0.033 | 805 | 5779 | 1.000 |
| Tatoeba-test.eng-wuu | 4.2 | 0.066 | 203 | 2335 | 1.000 |
| Tatoeba-test.eng-yue_Hans | 4.7 | 0.067 | 630 | 8155 | 1.000 |
| Tatoeba-test.eng-yue_Hant | 4.0 | 0.070 | 431 | 4183 | 1.000 |
| Tatoeba-test.eng-zho | 27.1 | 0.231 | 10000 | 110457 | 0.904 |
| tico19-test.eng-mya | 1.9 | 0.225 | 2100 | 32295 | 1.000 |
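The BP column most likely reports the BLEU brevity penalty; the README does not define it, so treat that reading as an assumption. Under the standard BLEU definition, the penalty is 1 when the system output is at least as long as the reference and decays exponentially when it is shorter. A minimal sketch, with `brevity_penalty` as a hypothetical helper name:

```python
import math


def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty for corpus-level token counts.

    hyp_len: total length of the system output (hypothesis)
    ref_len: total length of the reference
    """
    if hyp_len <= 0:
        return 0.0  # empty output: the penalty collapses to zero
    if hyp_len >= ref_len:
        return 1.0  # output is long enough, no penalty
    return math.exp(1.0 - ref_len / hyp_len)


# An output 10% shorter than the reference is penalized by roughly 10%.
print(round(brevity_penalty(90, 100), 3))
print(brevity_penalty(120, 100))
```

Read this way, a BP of 1.000 means the output was not shorter than the reference, while a value such as 0.809 indicates noticeably short output.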
opus4m+btTCv20210807-2021-09-30.zip
dataset: opus4m+btTCv20210807
model: transformer
source language(s): eng
target language(s): bod brx cjy cmn cnh dng dzo gan hak hsn iii kac ksw lus lzh mni mya nan new nst pck wuu yue
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form of >>id<< (id = valid target language ID)
valid language labels: >>acn<< >>adi<< >>adl<< >>adx<< >>aeu<< >>ahk<< >>aim<< >>ajz<< >>anl<< >>anm<< >>aot<< >>aph<< >>apt<< >>atb<< >>aub<< >>aza<< >>bap<< >>bca<< >>bee<< >>bfc<< >>bfs<< >>bft<< >>bfu<< >>bgg<< >>bgr<< >>bhj<< >>biu<< >>blk<< >>bod<< >>bqh<< >>brd<< >>bro<< >>brx<< >>brx_Latn<< >>bwe<< >>bxd<< >>byh<< >>byo<< >>byw<< >>bzi<< >>cbl<< >>cda<< >>cdf<< >>cdm<< >>cdn<< >>cdo<< >>cek<< >>cfm<< >>cgk<< >>chx<< >>cik<< >>cjy<< >>cjy_Hans<< >>cjy_Hant<< >>ckh<< >>clk<< >>clt<< >>cmn<< >>cmn_Hans<< >>cmn_Hant<< >>cmr<< >>cna<< >>cnb<< >>cnc<< >>cng<< >>cnh<< >>cnk<< >>cnw<< >>cpx<< >>csh<< >>csv<< >>csy<< >>ctd<< >>ctn<< >>cur<< >>cuw<< >>cvg<< >>czh<< >>czo<< >>czt<< >>dao<< >>der<< >>dhi<< >>dis<< >>dka<< >>dln<< >>dng<< >>drd<< >>dre<< >>drq<< >>dus<< >>duu<< >>dzl<< >>dzo<< >>eky<< >>emg<< >>enu<< >>ero<< >>ers<< >>gan<< >>ghe<< >>ghh<< >>ghk<< >>ght<< >>gnb<< >>goe<< >>gqi<< >>gro<< >>grt<< >>gvr<< >>hak<< >>hak_Hani<< >>hle<< >>hlt<< >>hmr<< >>hni<< >>how<< >>hpo<< >>hra<< >>hsn<< >>hsn_Hani<< >>hut<< >>iii<< >>int<< >>jda<< >>jee<< >>jih<< >>jiq<< >>jiu<< >>jiy<< >>jkm<< >>jkr<< >>jmn<< >>jna<< >>jnl<< >>jul<< >>jya<< >>kac<< >>kaf<< >>kbg<< >>kdq<< >>kdv<< >>kfk<< >>kfw<< >>kgj<< >>kgy<< >>khg<< >>kif<< >>kip<< >>kix<< >>kjl<< >>kjp<< >>kjt<< >>kjz<< >>kkf<< >>kkt<< >>kle<< >>klr<< >>kmm<< >>kpp<< >>ksw<< >>kte<< >>ktp<< >>kvl<< >>kvq<< >>kvu<< >>kvy<< >>kxf<< >>kxk<< >>kyu<< >>kzq<< >>lae<< >>lax<< >>lbf<< >>lbj<< >>lbr<< >>lep<< >>lgh<< >>lhi<< >>lhm<< >>lhp<< >>lhu<< >>lif<< >>lis<< >>lkc<< >>lkh<< >>llh<< >>lmk<< >>loy<< >>lpn<< >>lpo<< >>lrr<< >>lsh<< >>lsi<< >>ltc<< >>luk<< >>lus<< >>lwm<< >>lwu<< >>lya<< >>lzh<< >>lzh_Hans<< >>lzn<< >>mgp<< >>mhu<< >>mhx<< >>mjw<< >>mni<< >>mnp<< >>mpz<< >>mrd<< >>mrg<< >>mrh<< >>mro<< >>muk<< >>mvm<< >>mwq<< >>mxj<< >>mya<< >>nan<< >>nao<< >>nbc<< >>nbe<< >>nbi<< >>nbt<< >>nbu<< >>ncd<< >>nct<< >>neh<< >>nes<< >>new<< >>njb<< >>njh<< >>njm<< >>njn<< >>njo<< >>njz<< >>nkb<< >>nkd<< >>nkh<< >>nki<< >>nma<< 
>>nme<< >>nmf<< >>nmh<< >>nmm<< >>nmo<< >>nmy<< >>nng<< >>nnl<< >>nnp<< >>nos<< >>npa<< >>nph<< >>npo<< >>nre<< >>nri<< >>nru<< >>nsa<< >>nsd<< >>nsf<< >>nsm<< >>nst<< >>nst_Latn<< >>nty<< >>nuf<< >>nun<< >>nwc<< >>nxq<< >>nzm<< >>obr<< >>och<< >>ola<< >>ole<< >>onp<< >>pck<< >>pdu<< >>phh<< >>pho<< >>phq<< >>pkh<< >>pmi<< >>pmj<< >>pmx<< >>prx<< >>pub<< >>pum<< >>pwo<< >>pww<< >>pyx<< >>pyy<< >>pzn<< >>qvy<< >>qxs<< >>raa<< >>rab<< >>raf<< >>rah<< >>ral<< >>raq<< >>rau<< >>rav<< >>raw<< >>rgk<< >>ria<< >>rji<< >>rki<< >>rmz<< >>rnl<< >>rnp<< >>ruh<< >>sbu<< >>sch<< >>scp<< >>scu<< >>sdp<< >>sez<< >>sgk<< >>sgp<< >>sgt<< >>shl<< >>sip<< >>sjl<< >>skj<< >>slt<< >>smh<< >>smt<< >>spt<< >>ssk<< >>suv<< >>suz<< >>sxg<< >>syw<< >>taj<< >>tcl<< >>tcn<< >>tco<< >>tcp<< >>tcz<< >>tdg<< >>tdh<< >>tge<< >>tgf<< >>tgj<< >>thf<< >>ths<< >>tij<< >>tji<< >>tjs<< >>tpe<< >>tpq<< >>tro<< >>trp<< >>tsj<< >>tsk<< >>ttz<< >>tvn<< >>tvt<< >>twm<< >>txg<< >>txo<< >>ugo<< >>umn<< >>usi<< >>vap<< >>vay<< >>wea<< >>wly<< >>wme<< >>wuh<< >>wuu<< >>wxa<< >>xac<< >>xct<< >>xkf<< >>xkz<< >>xns<< >>xsr<< >>xzh<< >>ybh<< >>ybi<< >>ybk<< >>ych<< >>ycl<< >>ycp<< >>ygp<< >>yhl<< >>yif<< >>yig<< >>yik<< >>yim<< >>yip<< >>yiq<< >>yit<< >>yiu<< >>yiv<< >>yix<< >>yiz<< >>ykl<< >>ykn<< >>ykt<< >>yku<< >>ylm<< >>ylo<< >>ymc<< >>ymd<< >>ymh<< >>ymi<< >>ymq<< >>ymx<< >>ymz<< >>yna<< >>ypa<< >>ypb<< >>ypg<< >>yph<< >>ypm<< >>ypn<< >>ypo<< >>ypp<< >>ypz<< >>ysd<< >>ysg<< >>ysn<< >>yso<< >>ysy<< >>yta<< >>ytl<< >>ytp<< >>yue<< >>yue_Hans<< >>yue_Hant<< >>ywl<< >>ywq<< >>ywt<< >>ywu<< >>yyz<< >>yzk<< >>zal<< >>zau<< >>zhb<< >>zho<< >>zkr<< >>zom<< >>zyp<<
download: opus4m+btTCv20210807-2021-09-30.zip
test set translations: opus4m+btTCv20210807-2021-09-30.test.txt
test set scores: opus4m+btTCv20210807-2021-09-30.eval.txt
| testset | BLEU | chr-F | #sent | #words | BP |
|---|---|---|---|---|---|
| Tatoeba-test-v2021-08-07.eng-multi | 25.0 | 0.209 | 10000 | 106133 | 0.959 |
| Tatoeba-test-v2021-08-07.multi-multi | 25.0 | 0.209 | 10000 | 106133 | 0.959 |
| tico19-test.eng-mya | 6.2 | 0.422 | 2100 | 32295 | 0.684 |
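The tico19-test.eng-mya row above pairs a higher BLEU score with a BP of 0.684. Assuming BP is the standard BLEU brevity penalty (the README does not say so explicitly), a value below 1 can be inverted to estimate how short the output was relative to the reference. The helper name `length_ratio_from_bp` is hypothetical.

```python
import math


def length_ratio_from_bp(bp: float) -> float:
    """Hypothesis/reference length ratio implied by a brevity penalty.

    Inverts BP = exp(1 - ref_len / hyp_len). For bp == 1.0 the result is 1.0,
    which is only a lower bound: the output may have been longer.
    """
    if not 0.0 < bp <= 1.0:
        raise ValueError("bp must be in the interval (0, 1]")
    return 1.0 / (1.0 - math.log(bp))


# BP of 0.684 from the tico19-test.eng-mya row.
ratio = length_ratio_from_bp(0.684)
print(f"{ratio:.3f}")
```

Under that assumption, the output for this test set would be a little under three quarters of the reference length, so the BLEU score is being held down by length as well as by n-gram precision.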