eng-kaz

opus-2020-09-10.zip

  • dataset: opus
  • model: transformer-align
  • source language(s): eng
  • target language(s): kaz_Cyrl kaz_Latn
  • pre-processing: normalization + SentencePiece (spm4k,spm4k)
  • a sentence-initial language token is required in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-09-10.zip
  • test set translations: opus-2020-09-10.test.txt
  • test set scores: opus-2020-09-10.eval.txt
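
The language-token requirement above can be sketched as a small pre-processing helper. This is a minimal sketch, not part of the released model: the function name `add_language_token` and the constant `VALID_TARGET_IDS` are hypothetical, and the single space between the token and the sentence is an assumption. The two IDs are taken from the "target language(s)" entry.

```python
# Hypothetical helper: prepend the >>id<< token that the model expects.
# The IDs come from the "target language(s)" entry of this model card.
VALID_TARGET_IDS = ("kaz_Cyrl", "kaz_Latn")


def add_language_token(sentence: str, target_id: str) -> str:
    """Return the sentence prefixed with the sentence-initial language token."""
    if target_id not in VALID_TARGET_IDS:
        raise ValueError(
            f"unknown target language ID {target_id!r}; "
            f"expected one of {VALID_TARGET_IDS}"
        )
    # Assumption: one space separates the token from the source text.
    return f">>{target_id}<< {sentence.strip()}"


if __name__ == "__main__":
    print(add_language_token("Hello, world.", "kaz_Cyrl"))
    print(add_language_token("Hello, world.", "kaz_Latn"))
```

The prefixed string is what would then go through normalization and SentencePiece segmentation before translation.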

Benchmarks

testset               BLEU  chr-F
Tatoeba-test.eng.kaz  3.1   0.181

opusTCv20210807+bt-2021-09-05.zip

Benchmarks

testset                           BLEU  chr-F  #sent  #words  BP
newsdev2019-enkk.eng-kaz_Cyrl     20.3  0.530  2066   42561   0.931
newstest2019-enkk.eng-kaz_Cyrl    7.9   0.394  998    18810   0.943
Tatoeba-test-v2021-08-07.eng-kaz  20.7  0.492  403    2180    1.000
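
The BP column is presumably BLEU's brevity penalty, which is 1.0 when the system output is at least as long as the reference and drops below 1.0 when it is shorter. A minimal sketch of the standard formula follows; the function name `brevity_penalty` is hypothetical, and the lengths in the usage lines are illustrative values, not figures from the evaluation files.

```python
import math


def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty for corpus-level token counts."""
    if hyp_len <= 0:
        return 0.0  # empty output: the penalty collapses to zero
    if hyp_len >= ref_len:
        return 1.0  # output is not shorter than the reference
    return math.exp(1.0 - ref_len / hyp_len)


if __name__ == "__main__":
    # Illustrative lengths only.
    print(brevity_penalty(100, 100))
    print(round(brevity_penalty(90, 100), 3))
```

Read this way, a BP of 1.000 on the Tatoeba test set means the output was not shorter than the reference overall, while the values below 1.0 on the news sets indicate somewhat short output.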