
eng-afa

opus-2020-07-06.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): acm afb amh apc ara arq ary arz hau_Latn heb kab mlt rif_Latn shy_Latn som tir
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-07-06.zip
  • test set translations: opus-2020-07-06.test.txt
  • test set scores: opus-2020-07-06.eval.txt
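Because the model is multilingual on the target side, each source sentence must be prefixed with the >>id<< token listed above. A minimal sketch of that preprocessing step, with hub usage shown only as hedged comments (the Hugging Face model name `Helsinki-NLP/opus-mt-en-afa` is an assumption, not stated in this card):

```python
# Valid target-language IDs as listed in this model card.
VALID_TARGETS = {
    "acm", "afb", "amh", "apc", "ara", "arq", "ary", "arz",
    "hau_Latn", "heb", "kab", "mlt", "rif_Latn", "shy_Latn", "som", "tir",
}

def with_lang_token(sentence: str, target: str) -> str:
    """Prefix a source sentence with the sentence-initial >>id<< token."""
    if target not in VALID_TARGETS:
        raise ValueError(f"unknown target language ID: {target}")
    return f">>{target}<< {sentence}"

# Hypothetical usage via Hugging Face transformers (model name is an assumption):
# from transformers import MarianMTModel, MarianTokenizer
# tok = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-afa")
# model = MarianMTModel.from_pretrained("Helsinki-NLP/opus-mt-en-afa")
# batch = tok([with_lang_token("Hello world", "heb")], return_tensors="pt")
# print(tok.batch_decode(model.generate(**batch), skip_special_tokens=True))

print(with_lang_token("Hello world", "heb"))  # -> >>heb<< Hello world
```

Passing an ID outside the card's target list raises an error early, rather than producing an untagged input the model would translate into an arbitrary target language.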

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-amh.eng.amh 9.6 0.502
Tatoeba-test.eng-ara.eng.ara 11.5 0.402
Tatoeba-test.eng-hau.eng.hau 10.1 0.450
Tatoeba-test.eng-heb.eng.heb 31.3 0.542
Tatoeba-test.eng-kab.eng.kab 1.2 0.179
Tatoeba-test.eng-mlt.eng.mlt 15.0 0.525
Tatoeba-test.eng.multi 13.8 0.364
Tatoeba-test.eng-rif.eng.rif 1.6 0.072
Tatoeba-test.eng-shy.eng.shy 0.8 0.066
Tatoeba-test.eng-som.eng.som 0.0 0.294
Tatoeba-test.eng-tir.eng.tir 2.4 0.233

opus-2020-07-26.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): acm afb amh apc ara arq ary arz hau_Latn heb kab mlt rif_Latn shy_Latn som tir
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-07-26.zip
  • test set translations: opus-2020-07-26.test.txt
  • test set scores: opus-2020-07-26.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-amh.eng.amh 10.6 0.513
Tatoeba-test.eng-ara.eng.ara 11.2 0.397
Tatoeba-test.eng-hau.eng.hau 8.2 0.429
Tatoeba-test.eng-heb.eng.heb 31.3 0.541
Tatoeba-test.eng-kab.eng.kab 1.2 0.175
Tatoeba-test.eng-mlt.eng.mlt 17.0 0.532
Tatoeba-test.eng.multi 13.7 0.363
Tatoeba-test.eng-rif.eng.rif 1.5 0.109
Tatoeba-test.eng-shy.eng.shy 0.7 0.093
Tatoeba-test.eng-som.eng.som 16.0 0.272
Tatoeba-test.eng-tir.eng.tir 2.6 0.238

opus2m-2020-08-01.zip

  • dataset: opus2m
  • model: transformer
  • source language(s): eng
  • target language(s): acm afb amh apc ara arq ary arz hau_Latn heb kab mlt rif_Latn shy_Latn som tir
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<< (id = a valid target language ID)
  • download: opus2m-2020-08-01.zip
  • test set translations: opus2m-2020-08-01.test.txt
  • test set scores: opus2m-2020-08-01.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-amh.eng.amh 11.6 0.504
Tatoeba-test.eng-ara.eng.ara 12.0 0.404
Tatoeba-test.eng-hau.eng.hau 10.2 0.429
Tatoeba-test.eng-heb.eng.heb 32.3 0.551
Tatoeba-test.eng-kab.eng.kab 1.6 0.191
Tatoeba-test.eng-mlt.eng.mlt 17.7 0.551
Tatoeba-test.eng.multi 14.4 0.375
Tatoeba-test.eng-rif.eng.rif 1.7 0.103
Tatoeba-test.eng-shy.eng.shy 0.8 0.090
Tatoeba-test.eng-som.eng.som 16.0 0.429
Tatoeba-test.eng-tir.eng.tir 2.7 0.238
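The three releases can be compared on the multilingual Tatoeba test set (Tatoeba-test.eng.multi); a small sketch using only the BLEU and chr-F figures from the benchmark tables above:

```python
# (BLEU, chr-F) on Tatoeba-test.eng.multi, copied from the tables above.
releases = {
    "opus-2020-07-06": (13.8, 0.364),
    "opus-2020-07-26": (13.7, 0.363),
    "opus2m-2020-08-01": (14.4, 0.375),
}

# Pick the release with the highest multilingual BLEU score.
best = max(releases, key=lambda name: releases[name][0])
print(best)  # -> opus2m-2020-08-01
```

On these aggregate figures the opus2m release leads on both BLEU and chr-F, though per-language scores (e.g. eng-hau, eng-som) shift noticeably between releases.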