afa-eng

opus-2020-07-03.zip

  • dataset: opus
  • model: transformer
  • source language(s): acm afb amh apc apc_Latn ara ara_Latn arq arq_Latn ary arz eng heb kab mlt rif_Latn shy_Latn som tir
  • target language(s): eng hau_Latn
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required, in the form >>id<< (where id is a valid target language ID); see the usage sketch after this list
  • download: opus-2020-07-03.zip
  • test set translations: opus-2020-07-03.test.txt
  • test set scores: opus-2020-07-03.eval.txt
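
Because this first release translates into more than one target language (eng and hau_Latn), every input line has to start with the token of the desired target. Below is a minimal usage sketch with the Hugging Face MarianMT wrapper; the Hub model ID and the example sentence are assumptions for illustration, not part of this release:

```python
# Minimal sketch, assuming the checkpoint is published on the Hugging Face Hub
# under the hypothetical ID "Helsinki-NLP/opus-mt-afa-en".
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-afa-en"  # assumption: the actual ID may differ
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# The sentence-initial >>id<< token selects the target language.
src = [">>eng<< שלום עולם"]  # Hebrew input, English output requested

batch = tokenizer(src, return_tensors="pt", padding=True)
out = model.generate(**batch)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```

The later releases below translate into eng only, so the >>id<< prefix is not needed for them.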

Benchmarks

testset                         BLEU   chr-F
Tatoeba-test.amh-eng.amh.eng     3.0   0.233
Tatoeba-test.ara-eng.ara.eng    13.3   0.374
Tatoeba-test.hau-eng.hau.eng     0.2   0.083
Tatoeba-test.heb-eng.heb.eng    14.9   0.397
Tatoeba-test.kab-eng.kab.eng     0.4   0.108
Tatoeba-test.mlt-eng.mlt.eng     9.2   0.330
Tatoeba-test.multi.eng          26.0   0.451
Tatoeba-test.rif-eng.rif.eng     0.5   0.076
Tatoeba-test.shy-eng.shy.eng     0.1   0.027
Tatoeba-test.som-eng.som.eng     0.0   0.097
Tatoeba-test.tir-eng.tir.eng     3.1   0.215

opus-2020-07-26.zip

  • dataset: opus
  • model: transformer
  • source language(s): acm afb amh apc ara arq ary arz hau_Latn heb kab mlt rif_Latn shy_Latn som tir
  • target language(s): eng
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • download: opus-2020-07-26.zip
  • test set translations: opus-2020-07-26.test.txt
  • test set scores: opus-2020-07-26.eval.txt

Benchmarks

testset                         BLEU   chr-F
Tatoeba-test.amh-eng.amh.eng    35.1   0.553
Tatoeba-test.ara-eng.ara.eng    34.5   0.526
Tatoeba-test.hau-eng.hau.eng    11.5   0.307
Tatoeba-test.heb-eng.heb.eng    41.2   0.578
Tatoeba-test.kab-eng.kab.eng     3.7   0.203
Tatoeba-test.mlt-eng.mlt.eng    40.5   0.586
Tatoeba-test.multi.eng          25.4   0.450
Tatoeba-test.rif-eng.rif.eng     3.5   0.143
Tatoeba-test.shy-eng.shy.eng     1.4   0.141
Tatoeba-test.som-eng.som.eng    16.2   0.285
Tatoeba-test.tir-eng.tir.eng    13.9   0.337
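
The BLEU and chr-F columns are taken from the released *.eval.txt files. In principle they can be recomputed from the *.test.txt translations with sacrebleu; here is a minimal sketch, assuming hypotheses and references have already been split out of the test file into parallel lists (the exact sacrebleu options behind the official scores are not stated here):

```python
# Minimal sketch: corpus-level BLEU and chr-F with sacrebleu.
# Assumption: hyps/refs were extracted from the released *.test.txt file.
from sacrebleu.metrics import BLEU, CHRF

hyps = ["The weather is nice today.", "He reads a book."]
refs = [["The weather is nice today.", "He is reading a book."]]  # one reference stream

print(BLEU().corpus_score(hyps, refs))   # corpus BLEU over all segments
print(CHRF().corpus_score(hyps, refs))   # chrF (character n-gram F-score)
```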

opus2m-2020-07-31.zip

  • dataset: opus2m
  • model: transformer
  • source language(s): acm afb amh apc ara arq ary arz hau_Latn heb kab mlt rif_Latn shy_Latn som tir
  • target language(s): eng
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • download: opus2m-2020-07-31.zip
  • test set translations: opus2m-2020-07-31.test.txt
  • test set scores: opus2m-2020-07-31.eval.txt

Benchmarks

testset                         BLEU   chr-F
Tatoeba-test.amh-eng.amh.eng    35.9   0.550
Tatoeba-test.ara-eng.ara.eng    36.6   0.543
Tatoeba-test.hau-eng.hau.eng    11.9   0.327
Tatoeba-test.heb-eng.heb.eng    42.7   0.591
Tatoeba-test.kab-eng.kab.eng     4.3   0.213
Tatoeba-test.mlt-eng.mlt.eng    44.3   0.618
Tatoeba-test.multi.eng          27.1   0.464
Tatoeba-test.rif-eng.rif.eng     3.5   0.141
Tatoeba-test.shy-eng.shy.eng     0.6   0.125
Tatoeba-test.som-eng.som.eng    23.6   0.472
Tatoeba-test.tir-eng.tir.eng    13.1   0.328

opus4m-2020-08-12.zip

  • dataset: opus4m
  • model: transformer
  • source language(s): acm afb amh apc ara arq ary arz hau_Latn heb kab mlt rif_Latn shy_Latn som tir
  • target language(s): eng
  • pre-processing: normalization + SentencePiece (spm32k,spm32k); see the tokenization sketch after this list
  • download: opus4m-2020-08-12.zip
  • test set translations: opus4m-2020-08-12.test.txt
  • test set scores: opus4m-2020-08-12.eval.txt
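
All releases share the same pre-processing: normalization followed by SentencePiece segmentation with 32k vocabularies on each side. Below is a minimal sketch of the SentencePiece step only; the file name source.spm and the example sentence are assumptions for illustration, and the normalization step is not reproduced here:

```python
# Minimal sketch of the SentencePiece segmentation step.
# Assumptions: the released zip ships the source-side model as "source.spm",
# and the input is an illustrative Maltese sentence.
import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file="source.spm")

text = "It-temp huwa sabiħ illum."
pieces = sp.encode(text, out_type=str)   # list of subword pieces
print(pieces)
print(sp.decode(pieces))                 # joins pieces back into the sentence
```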

Benchmarks

testset                         BLEU   chr-F
Tatoeba-test.amh-eng.amh.eng    35.1   0.543
Tatoeba-test.ara-eng.ara.eng    37.1   0.547
Tatoeba-test.hau-eng.hau.eng    12.6   0.330
Tatoeba-test.heb-eng.heb.eng    43.3   0.598
Tatoeba-test.kab-eng.kab.eng     4.2   0.212
Tatoeba-test.mlt-eng.mlt.eng    45.2   0.618
Tatoeba-test.multi.eng          27.5   0.467
Tatoeba-test.rif-eng.rif.eng     2.2   0.136
Tatoeba-test.shy-eng.shy.eng     0.8   0.116
Tatoeba-test.som-eng.som.eng    23.6   0.472
Tatoeba-test.tir-eng.tir.eng    17.3   0.376