opus-2020-06-28.zip
dataset: opus
model: transformer
source language(s): eng
target language(s): ces csb_Latn pol
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (id = valid target language ID)
download: opus-2020-06-28.zip
test set translations: opus-2020-06-28.test.txt
test set scores: opus-2020-06-28.eval.txt
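The language-token requirement above can be illustrated with a small sketch. The helper name `add_language_token` and the example sentence are hypothetical, and the single space between the token and the sentence is an assumption based on common usage rather than something this README states.

```python
def add_language_token(sentence: str, target_id: str) -> str:
    """Prefix a source sentence with the >>id<< target-language token.

    Assumption: the token is followed by one space and then the sentence.
    """
    return f">>{target_id}<< {sentence.strip()}"


# Target language IDs listed for this release.
TARGETS = ["ces", "csb_Latn", "pol"]

source = "The weather is nice today."  # hypothetical input sentence
for target in TARGETS:
    print(add_language_token(source, target))
```

The same English sentence is tagged once per target language, so one multilingual model can be steered to each output language by changing only the prefix.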
| testset | BLEU | chr-F |
|---|---|---|
| newssyscomb2009-engces.eng.ces | 19.6 | 0.478 |
| news-test2008-engces.eng.ces | 16.9 | 0.453 |
| newstest2009-engces.eng.ces | 17.8 | 0.468 |
| newstest2010-engces.eng.ces | 18.1 | 0.472 |
| newstest2011-engces.eng.ces | 19.4 | 0.474 |
| newstest2012-engces.eng.ces | 17.4 | 0.454 |
| newstest2013-engces.eng.ces | 20.5 | 0.480 |
| newstest2015-encs-engces.eng.ces | 20.3 | 0.485 |
| newstest2016-encs-engces.eng.ces | 22.9 | 0.505 |
| newstest2017-encs-engces.eng.ces | 18.4 | 0.464 |
| newstest2018-encs-engces.eng.ces | 18.0 | 0.466 |
| newstest2019-encs-engces.eng.ces | 19.4 | 0.474 |
| Tatoeba-test.eng-ces.eng.ces | 41.8 | 0.615 |
| Tatoeba-test.eng-csb.eng.csb | 1.4 | 0.190 |
| Tatoeba-test.eng.multi | 41.3 | 0.619 |
| Tatoeba-test.eng-pol.eng.pol | 40.6 | 0.623 |
opus-2020-07-27.zip
dataset: opus
model: transformer
source language(s): eng
target language(s): ces csb_Latn dsb hsb pol
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (id = valid target language ID)
download: opus-2020-07-27.zip
test set translations: opus-2020-07-27.test.txt
test set scores: opus-2020-07-27.eval.txt
| testset | BLEU | chr-F |
|---|---|---|
| newssyscomb2009-engces.eng.ces | 19.8 | 0.480 |
| news-test2008-engces.eng.ces | 17.1 | 0.453 |
| newstest2009-engces.eng.ces | 17.9 | 0.470 |
| newstest2010-engces.eng.ces | 18.3 | 0.474 |
| newstest2011-engces.eng.ces | 19.1 | 0.474 |
| newstest2012-engces.eng.ces | 17.4 | 0.452 |
| newstest2013-engces.eng.ces | 20.1 | 0.478 |
| newstest2015-encs-engces.eng.ces | 19.8 | 0.485 |
| newstest2016-encs-engces.eng.ces | 22.8 | 0.504 |
| newstest2017-encs-engces.eng.ces | 18.6 | 0.465 |
| newstest2018-encs-engces.eng.ces | 18.1 | 0.467 |
| newstest2019-encs-engces.eng.ces | 19.3 | 0.472 |
| Tatoeba-test.eng-ces.eng.ces | 41.5 | 0.614 |
| Tatoeba-test.eng-csb.eng.csb | 3.1 | 0.207 |
| Tatoeba-test.eng-dsb.eng.dsb | 1.8 | 0.157 |
| Tatoeba-test.eng-hsb.eng.hsb | 4.6 | 0.186 |
| Tatoeba-test.eng.multi | 40.9 | 0.616 |
| Tatoeba-test.eng-pol.eng.pol | 40.8 | 0.623 |
opus2m-2020-08-02.zip
dataset: opus2m
model: transformer
source language(s): eng
target language(s): ces csb_Latn dsb hsb pol
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (id = valid target language ID)
download: opus2m-2020-08-02.zip
test set translations: opus2m-2020-08-02.test.txt
test set scores: opus2m-2020-08-02.eval.txt
| testset | BLEU | chr-F |
|---|---|---|
| newssyscomb2009-engces.eng.ces | 20.6 | 0.488 |
| news-test2008-engces.eng.ces | 18.3 | 0.466 |
| newstest2009-engces.eng.ces | 19.8 | 0.483 |
| newstest2010-engces.eng.ces | 19.8 | 0.486 |
| newstest2011-engces.eng.ces | 20.6 | 0.489 |
| newstest2012-engces.eng.ces | 18.6 | 0.464 |
| newstest2013-engces.eng.ces | 22.3 | 0.495 |
| newstest2015-encs-engces.eng.ces | 21.7 | 0.502 |
| newstest2016-encs-engces.eng.ces | 24.5 | 0.521 |
| newstest2017-encs-engces.eng.ces | 20.1 | 0.480 |
| newstest2018-encs-engces.eng.ces | 19.9 | 0.483 |
| newstest2019-encs-engces.eng.ces | 21.2 | 0.490 |
| Tatoeba-test.eng-ces.eng.ces | 43.7 | 0.632 |
| Tatoeba-test.eng-csb.eng.csb | 1.2 | 0.188 |
| Tatoeba-test.eng-dsb.eng.dsb | 1.5 | 0.167 |
| Tatoeba-test.eng-hsb.eng.hsb | 5.7 | 0.199 |
| Tatoeba-test.eng.multi | 42.8 | 0.632 |
| Tatoeba-test.eng-pol.eng.pol | 43.2 | 0.641 |
opus1m+bt-2021-04-10.zip
dataset: opus1m+bt
model: transformer-align
source language(s): eng
target language(s): ces csb dsb hsb pol
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (id = valid target language ID)
valid language labels: >>ces<< >>csb<< >>csb_Latn<< >>czk<< >>dsb<< >>hsb<< >>pol<< >>pox<< >>slk<< >>szl<<
download: opus1m+bt-2021-04-10.zip
test set translations: opus1m+bt-2021-04-10.test.txt
test set scores: opus1m+bt-2021-04-10.eval.txt
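Because only the labels listed above are accepted as target tokens, it can help to check a label before tagging input. The sketch below parses a `valid language labels` line and rejects unknown IDs. `parse_labels` and `tag_sentence` are hypothetical helper names, not part of any released tooling, and the space after the token is an assumption.

```python
import re

# The label line as it appears in the release notes for this model.
LABEL_LINE = (
    "valid language labels: >>ces<< >>csb<< >>csb_Latn<< >>czk<< "
    ">>dsb<< >>hsb<< >>pol<< >>pox<< >>slk<< >>szl<<"
)


def parse_labels(line: str) -> list[str]:
    """Extract the IDs between >> and << from a label line."""
    return re.findall(r">>([^<>\s]+)<<", line)


VALID_LABELS = frozenset(parse_labels(LABEL_LINE))


def tag_sentence(sentence: str, label: str) -> str:
    """Prefix a sentence with >>label<<, refusing labels the model does not list."""
    if label not in VALID_LABELS:
        raise ValueError(
            f"unknown target label {label!r}; expected one of {sorted(VALID_LABELS)}"
        )
    return f">>{label}<< {sentence.strip()}"


print(tag_sentence("Good morning.", "hsb"))
```

Failing early on an unlisted label is a design choice for this sketch; how the model itself reacts to an unknown token is not described here.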
| testset | BLEU | chr-F | #sent | #words | BP |
|---|---|---|---|---|---|
| newssyscomb2009.eng-ces | 19.7 | 0.476 | 502 | 10032 | 0.976 |
| news-test2008.eng-ces | 16.4 | 0.448 | 2051 | 42484 | 0.978 |
| newstest2009.eng-ces | 17.3 | 0.462 | 2525 | 55533 | 0.981 |
| newstest2010.eng-ces | 17.6 | 0.466 | 2489 | 52958 | 0.979 |
| newstest2011.eng-ces | 19.0 | 0.472 | 3003 | 65653 | 0.950 |
| newstest2012.eng-ces | 16.8 | 0.446 | 3003 | 65456 | 0.934 |
| newstest2013.eng-ces | 20.0 | 0.475 | 3000 | 57250 | 0.955 |
| newstest2015-encs.eng-ces | 19.6 | 0.481 | 2656 | 45931 | 1.000 |
| newstest2016-encs.eng-ces | 22.1 | 0.498 | 2999 | 57013 | 0.985 |
| newstest2017-encs.eng-ces | 18.0 | 0.460 | 3005 | 54461 | 0.970 |
| newstest2018-encs.eng-ces | 17.7 | 0.462 | 2983 | 54772 | 0.992 |
| newstest2019-encs.eng-ces | 18.7 | 0.469 | 1997 | 43373 | 0.971 |
| Tatoeba-test.eng-ces | 39.9 | 0.601 | 10000 | 65287 | 0.983 |
| Tatoeba-test.eng-csb | 6.0 | 0.208 | 27 | 243 | 0.811 |
| Tatoeba-test.eng-dsb | 22.5 | 0.394 | 34 | 184 | 1.000 |
| Tatoeba-test.eng-hsb | 30.6 | 0.458 | 40 | 207 | 1.000 |
| Tatoeba-test.eng-multi | 39.4 | 0.606 | 10000 | 65263 | 0.970 |
| Tatoeba-test.eng-pol | 40.0 | 0.618 | 10000 | 64899 | 0.959 |
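The BP column most likely reports the BLEU brevity penalty, which scales the score down when the system output is shorter than the reference. A minimal sketch of the standard definition follows. The function name and the lengths in the usage lines are made up for illustration and are not taken from the evaluation files.

```python
import math


def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty for corpus-level token counts.

    hyp_len: total tokens in the system output
    ref_len: total tokens in the reference
    """
    if hyp_len <= 0:
        return 0.0  # empty output gets no credit
    if hyp_len >= ref_len:
        return 1.0  # no penalty when output is at least as long as the reference
    return math.exp(1.0 - ref_len / hyp_len)


# Hypothetical lengths, chosen only to show both branches.
print(brevity_penalty(100, 100))
print(round(brevity_penalty(90, 100), 3))
```

Under this definition a BP of 1.000 means the output was at least as long as the reference, and values below 1 indicate shorter output.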
opus4m+btTCv20210807-2021-09-30.zip
dataset: opus4m+btTCv20210807
model: transformer
source language(s): eng
target language(s): ces csb dsb hsb pol slk szl
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (id = valid target language ID)
valid language labels: >>ces<< >>csb<< >>csb_Latn<< >>czk<< >>dsb<< >>hsb<< >>pol<< >>pox<< >>slk<< >>szl<<
download: opus4m+btTCv20210807-2021-09-30.zip
test set translations: opus4m+btTCv20210807-2021-09-30.test.txt
test set scores: opus4m+btTCv20210807-2021-09-30.eval.txt
| testset | BLEU | chr-F | #sent | #words | BP |
|---|---|---|---|---|---|
| newssyscomb2009.eng-ces | 20.6 | 0.489 | 502 | 10032 | 0.980 |
| news-test2008.eng-ces | 18.1 | 0.464 | 2051 | 42484 | 0.983 |
| newstest2009.eng-ces | 18.9 | 0.478 | 2525 | 55533 | 0.982 |
| newstest2010.eng-ces | 19.6 | 0.486 | 2489 | 52958 | 0.986 |
| newstest2011.eng-ces | 20.8 | 0.489 | 3003 | 65653 | 0.956 |
| newstest2012.eng-ces | 18.2 | 0.462 | 3003 | 65456 | 0.935 |
| newstest2013.eng-ces | 22.0 | 0.494 | 3000 | 57250 | 0.961 |
| newstest2015-encs.eng-ces | 21.0 | 0.495 | 2656 | 45931 | 1.000 |
| newstest2016-encs.eng-ces | 24.5 | 0.518 | 2999 | 57013 | 0.991 |
| newstest2017-encs.eng-ces | 19.6 | 0.476 | 3005 | 54461 | 0.976 |
| newstest2018-encs.eng-ces | 19.4 | 0.480 | 2983 | 54772 | 1.000 |
| newstest2019-encs.eng-ces | 20.8 | 0.488 | 1997 | 43373 | 0.980 |
| Tatoeba-test-v2021-08-07.eng-multi | 39.1 | 0.602 | 10000 | 65766 | 0.987 |
| Tatoeba-test-v2021-08-07.multi-multi | 39.1 | 0.602 | 10000 | 65766 | 0.987 |