dataset: opus1m+bt
model: transformer-align
source language(s): eng
target language(s): afh avk bzt dws epo ido ile ina jbo ldn lfn nov qya sjn tlh tzl vol
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form of >>id<< (id = valid target language ID)
valid language labels: >>afh<< >>afh_Latn<< >>avk<< >>avk_Latn<< >>bzt<< >>bzt_Latn<< >>dws<< >>dws_Latn<< >>epo<< >>ido<< >>ido_Latn<< >>igs<< >>ile<< >>ile_Latn<< >>ina<< >>ina_Latn<< >>jbo<< >>jbo_Cyrl<< >>jbo_Latn<< >>ldn<< >>ldn_Latn<< >>lfn<< >>lfn_Cyrl<< >>lfn_Latn<< >>neu<< >>nov<< >>nov_Latn<< >>qya<< >>qya_Latn<< >>rmv<< >>sjn<< >>sjn_Latn<< >>tlh<< >>tlh_Latn<< >>tzl<< >>tzl_Latn<< >>vol<< >>vol_Latn<< >>zbl<<
download: opus1m+bt-2021-04-10.zip
test set translations: opus1m+bt-2021-04-10.test.txt
test set scores: opus1m+bt-2021-04-10.eval.txt
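The language-token requirement above can be illustrated with a small helper. This is a minimal sketch, not part of the model distribution: the function name `add_language_token` and the single space placed between the token and the sentence are assumptions, and the label set is simply copied from the list of valid language labels.

```python
import re

# Valid labels copied from the "valid language labels" entry of this model.
LABEL_LINE = (
    ">>afh<< >>afh_Latn<< >>avk<< >>avk_Latn<< >>bzt<< >>bzt_Latn<< >>dws<< "
    ">>dws_Latn<< >>epo<< >>ido<< >>ido_Latn<< >>igs<< >>ile<< >>ile_Latn<< "
    ">>ina<< >>ina_Latn<< >>jbo<< >>jbo_Cyrl<< >>jbo_Latn<< >>ldn<< >>ldn_Latn<< "
    ">>lfn<< >>lfn_Cyrl<< >>lfn_Latn<< >>neu<< >>nov<< >>nov_Latn<< >>qya<< "
    ">>qya_Latn<< >>rmv<< >>sjn<< >>sjn_Latn<< >>tlh<< >>tlh_Latn<< >>tzl<< "
    ">>tzl_Latn<< >>vol<< >>vol_Latn<< >>zbl<<"
)
VALID_LABELS = set(re.findall(r">>(\w+)<<", LABEL_LINE))


def add_language_token(sentence: str, lang_id: str) -> str:
    """Prefix an English source sentence with the >>id<< target-language token.

    Hypothetical helper: the separator (one space) is an assumption.
    """
    if lang_id not in VALID_LABELS:
        raise ValueError(f"unknown target language ID: {lang_id!r}")
    return f">>{lang_id}<< {sentence.strip()}"


if __name__ == "__main__":
    # Request Esperanto output for an English sentence.
    print(add_language_token("Good morning.", "epo"))
```

Rejecting unknown IDs up front is a design choice of the sketch; how the model itself reacts to a missing or invalid token is not documented here.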
| testset | BLEU | chr-F | #sent | #words | BP |
|---|---|---|---|---|---|
| Tatoeba-test.eng-afh | 1.5 | 0.108 | 10 | 46 | 1.000 |
| Tatoeba-test.eng-avk | 0.3 | 0.128 | 167 | 970 | 1.000 |
| Tatoeba-test.eng-bzt | 0.9 | 0.136 | 62 | 354 | 1.000 |
| Tatoeba-test.eng-dws | 0.9 | 0.107 | 10 | 40 | 1.000 |
| Tatoeba-test.eng-epo | 36.5 | 0.593 | 10000 | 76402 | 0.997 |
| Tatoeba-test.eng-ido | 5.4 | 0.309 | 1968 | 13078 | 1.000 |
| Tatoeba-test.eng-ido_Latn | 5.4 | 0.309 | 1967 | 13072 | 1.000 |
| Tatoeba-test.eng-ile | 0.7 | 0.115 | 1711 | 10655 | 0.832 |
| Tatoeba-test.eng-ina | 5.3 | 0.266 | 5000 | 44642 | 0.973 |
| Tatoeba-test.eng-jbo | 0.2 | 0.117 | 5000 | 35293 | 1.000 |
| Tatoeba-test.eng-jbo_Cyrl | 1.5 | 0.000 | 1 | 9 | 1.000 |
| Tatoeba-test.eng-jbo_Latn | 0.2 | 0.117 | 4996 | 35278 | 1.000 |
| Tatoeba-test.eng-ldn | 0.3 | 0.080 | 101 | 630 | 0.953 |
| Tatoeba-test.eng-lfn | 1.6 | 0.167 | 3297 | 24468 | 0.914 |
| Tatoeba-test.eng-lfn_Cyrl | 0.1 | 0.008 | 847 | 6075 | 0.975 |
| Tatoeba-test.eng-lfn_Latn | 2.0 | 0.220 | 2450 | 18393 | 0.893 |
| Tatoeba-test.eng-multi | 12.1 | 0.308 | 10000 | 69052 | 1.000 |
| Tatoeba-test.eng-nov | 1.7 | 0.263 | 198 | 1303 | 1.000 |
| Tatoeba-test.eng-qya | 0.8 | 0.114 | 116 | 485 | 1.000 |
| Tatoeba-test.eng-qya_Latn | 0.8 | 0.116 | 115 | 481 | 1.000 |
| Tatoeba-test.eng-sjn | 0.4 | 0.095 | 44 | 196 | 1.000 |
| Tatoeba-test.eng-tlh | 0.0 | 0.130 | 5000 | 21301 | 1.000 |
| Tatoeba-test.eng-tzl | 0.5 | 0.123 | 166 | 642 | 1.000 |
| Tatoeba-test.eng-tzl_Latn | 0.5 | 0.123 | 165 | 640 | 1.000 |
| Tatoeba-test.eng-vol | 0.3 | 0.128 | 1549 | 7884 | 1.000 |
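The BP values reported with the scores above are BLEU brevity penalties. As a rough illustration, and assuming the standard BLEU definition (the evaluation scripts used for these scores are not shown here), the penalty depends only on the total hypothesis length and the total reference length. The function name `brevity_penalty` is hypothetical.

```python
import math


def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty for total hypothesis/reference lengths.

    Returns 1.0 when the hypothesis is at least as long as the reference,
    otherwise exp(1 - ref_len / hyp_len). An empty hypothesis scores 0.0.
    """
    if hyp_len <= 0:
        return 0.0
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)


if __name__ == "__main__":
    # Output as long as the reference: no penalty is applied.
    print(brevity_penalty(100, 100))
    # Output shorter than the reference: the penalty drops below 1.
    print(brevity_penalty(90, 100))
```

Under this definition, a BP below 1.000 (for example 0.832 for Tatoeba-test.eng-ile) means the system output was shorter overall than the reference translations.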
opus4m+btTCv20210807-2021-09-30.zip
dataset: opus4m+btTCv20210807
model: transformer
source language(s): eng
target language(s): afh avk bzt dws epo ido ile ina jbo ldn lfn nov qya sjn tlh tzl vol
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form of >>id<< (id = valid target language ID)
valid language labels: >>afh<< >>afh_Latn<< >>avk<< >>avk_Latn<< >>bzt<< >>bzt_Latn<< >>dws<< >>dws_Latn<< >>epo<< >>ido<< >>ido_Latn<< >>igs<< >>ile<< >>ile_Latn<< >>ina<< >>ina_Latn<< >>jbo<< >>jbo_Cyrl<< >>jbo_Latn<< >>ldn<< >>ldn_Latn<< >>lfn<< >>lfn_Cyrl<< >>lfn_Latn<< >>neu<< >>nov<< >>nov_Latn<< >>qya<< >>qya_Latn<< >>rmv<< >>sjn<< >>sjn_Latn<< >>tlh<< >>tlh_Latn<< >>tzl<< >>tzl_Latn<< >>vol<< >>vol_Latn<< >>zbl<<
download: opus4m+btTCv20210807-2021-09-30.zip
test set translations: opus4m+btTCv20210807-2021-09-30.test.txt
test set scores: opus4m+btTCv20210807-2021-09-30.eval.txt
| testset | BLEU | chr-F | #sent | #words | BP |
|---|---|---|---|---|---|
| Tatoeba-test-v2021-08-07.eng-multi | 22.2 | 0.430 | 10000 | 71346 | 1.000 |
| Tatoeba-test-v2021-08-07.multi-multi | 22.2 | 0.430 | 10000 | 71346 | 1.000 |