fiu-sem

opus-2021-02-18.zip

  • dataset: opus
  • model: transformer
  • source language(s): fin hun
  • target language(s): ara arq arz heb jpa tmr
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence initial language token is required in the form of >>id<< (id = valid target language ID)
  • valid language labels: >>heb<< >>ara<< >>arq<< >>arz<<
  • download: opus-2021-02-18.zip
  • test set translations: opus-2021-02-18.test.txt
  • test set scores: opus-2021-02-18.eval.txt

Benchmarks

| testset | BLEU | chr-F | #sent | #words | BP |
|---------|------|-------|-------|--------|----|
| Tatoeba-test.fin-ara | 8.6 | 0.436 | 7 | 30 | 0.931 |
| Tatoeba-test.fin-heb | 31.9 | 0.548 | 212 | 1354 | 1.000 |
| Tatoeba-test.hun-ara | 10.6 | 0.402 | 93 | 455 | 0.920 |
| Tatoeba-test.hun-arq | 9.7 | 0.131 | 1 | 5 | 1.000 |
| Tatoeba-test.hun-heb | 27.2 | 0.510 | 401 | 2212 | 0.999 |
| Tatoeba-test.hun-jpa | 4.8 | 0.000 | 2 | 8 | 0.867 |
| Tatoeba-test.hun-tmr | 0.7 | 0.000 | 5 | 17 | 1.000 |
| Tatoeba-test.multi-multi | 27.2 | 0.506 | 718 | 4070 | 0.999 |