eng-gmq

opus-2020-06-28.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): dan fao isl non_Latn swe
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-06-28.zip
  • test set translations: opus-2020-06-28.test.txt
  • test set scores: opus-2020-06-28.eval.txt
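Because the model covers several target languages, the required >>id<< token must be prepended to every input sentence before tokenization. A minimal sketch of that pre-processing step (pure Python; loading and running the model itself is out of scope here):

```python
def add_target_token(sentence: str, target_lang: str) -> str:
    """Prepend the >>id<< target-language token this multilingual
    model expects (id must be a valid target language ID)."""
    return f">>{target_lang}<< {sentence}"

# Route the same English sentence to Danish, Icelandic, and Swedish.
for lang in ("dan", "isl", "swe"):
    print(add_target_token("How are you today?", lang))
```

The prefixed strings are then passed through the normalization and SentencePiece steps listed above.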

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-dan.eng.dan 57.2 0.720
Tatoeba-test.eng-fao.eng.fao 8.2 0.314
Tatoeba-test.eng-isl.eng.isl 23.1 0.500
Tatoeba-test.eng.multi 52.0 0.681
Tatoeba-test.eng-non.eng.non 0.7 0.193
Tatoeba-test.eng-swe.eng.swe 57.4 0.713

opus-2020-07-06.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): dan fao isl nno nob nob_Hebr non_Latn swe
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-07-06.zip
  • test set translations: opus-2020-07-06.test.txt
  • test set scores: opus-2020-07-06.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-dan.eng.dan 57.0 0.719
Tatoeba-test.eng-fao.eng.fao 6.9 0.300
Tatoeba-test.eng-isl.eng.isl 22.6 0.500
Tatoeba-test.eng.multi 52.6 0.684
Tatoeba-test.eng-non.eng.non 1.9 0.189
Tatoeba-test.eng-nor.eng.nor 11.9 0.388
Tatoeba-test.eng-swe.eng.swe 57.4 0.714

opus-2020-07-26.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): dan fao isl nno nob nob_Hebr non_Latn swe
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-07-26.zip
  • test set translations: opus-2020-07-26.test.txt
  • test set scores: opus-2020-07-26.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-dan.eng.dan 57.0 0.719
Tatoeba-test.eng-fao.eng.fao 7.0 0.311
Tatoeba-test.eng-isl.eng.isl 23.3 0.500
Tatoeba-test.eng.multi 52.3 0.683
Tatoeba-test.eng-non.eng.non 0.7 0.196
Tatoeba-test.eng-nor.eng.nor 49.6 0.671
Tatoeba-test.eng-swe.eng.swe 56.9 0.711

opus2m-2020-08-01.zip

  • dataset: opus2m
  • model: transformer
  • source language(s): eng
  • target language(s): dan fao isl nno nob nob_Hebr non_Latn swe
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<< (id = a valid target language ID)
  • download: opus2m-2020-08-01.zip
  • test set translations: opus2m-2020-08-01.test.txt
  • test set scores: opus2m-2020-08-01.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-dan.eng.dan 57.7 0.724
Tatoeba-test.eng-fao.eng.fao 9.2 0.322
Tatoeba-test.eng-isl.eng.isl 23.8 0.506
Tatoeba-test.eng.multi 52.8 0.688
Tatoeba-test.eng-non.eng.non 0.7 0.196
Tatoeba-test.eng-nor.eng.nor 50.3 0.678
Tatoeba-test.eng-swe.eng.swe 57.8 0.717

opus1m+bt-2021-04-13.zip

  • dataset: opus1m+bt
  • model: transformer-align
  • source language(s): eng
  • target language(s): dan fao isl nno nob non swe
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<< (id = a valid target language ID)
  • valid language labels: >>dan<< >>fao<< >>isl<< >>jut<< >>nno<< >>nob<< >>non<< >>non_Latn<< >>nrn<< >>ovd<< >>qer<< >>rmg<< >>swe<<
  • download: opus1m+bt-2021-04-13.zip
  • test set translations: opus1m+bt-2021-04-13.test.txt
  • test set scores: opus1m+bt-2021-04-13.eval.txt
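Only the labels listed above are accepted as target-language tokens; anything else should be rejected before decoding. A small validation sketch (the label set is copied verbatim from the bullet above):

```python
# Valid target-language labels for this model, per its README entry.
VALID_LABELS = {
    "dan", "fao", "isl", "jut", "nno", "nob", "non", "non_Latn",
    "nrn", "ovd", "qer", "rmg", "swe",
}

def make_prefix(target_lang: str) -> str:
    """Return the >>id<< token for a valid target language, or raise."""
    if target_lang not in VALID_LABELS:
        raise ValueError(f"unknown target language label: {target_lang}")
    return f">>{target_lang}<<"

print(make_prefix("fao"))
```

Note that the label set is wider than the trained target languages (e.g. jut, ovd): such labels are accepted by the tokenizer but may translate poorly.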

Benchmarks

testset BLEU chr-F #sent #words BP
Tatoeba-test.eng-dan 57.2 0.720 10000 73191 0.980
Tatoeba-test.eng-fao 17.7 0.398 294 1933 0.957
Tatoeba-test.eng-isl 22.8 0.499 2500 18999 0.935
Tatoeba-test.eng-multi 52.9 0.687 10000 71671 0.967
Tatoeba-test.eng-nno 31.1 0.551 460 3428 0.989
Tatoeba-test.eng-nob 52.8 0.689 4539 36110 0.963
Tatoeba-test.eng-non 0.7 0.194 15 142 0.986
Tatoeba-test.eng-nor 51.0 0.677 5000 39543 0.966
Tatoeba-test.eng-swe 56.9 0.710 10000 65572 0.966
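The BP column in these tables is BLEU's brevity penalty: 1.0 when the system output is at least as long as the reference, and exp(1 - ref_len/hyp_len) when it is shorter, so values just below 1.0 (as here) indicate slightly short output. A quick sketch of the standard formula:

```python
import math

def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty: penalizes hypotheses
    shorter than the reference, never rewards longer ones."""
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)

print(brevity_penalty(100, 100))        # no penalty
print(round(brevity_penalty(90, 100), 4))  # output 10% short
```

The reported BLEU score is the n-gram precision term multiplied by this penalty, which is why a low BP drags the score down even when precision is good.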

opus1m+bt-tuned4eng2fao-2021-04-16.zip

Benchmarks

testset BLEU chr-F #sent #words BP
Tatoeba-test.eng-fao 18.7 0.406 294 1933 0.949
Tatoeba-test.eng-multi 44.1 0.607 10000 71671 0.971

opus4m+btTCv20210807-2021-09-30.zip

  • dataset: opus4m+btTCv20210807
  • model: transformer
  • source language(s): eng
  • target language(s): dan fao isl nno nob non swe
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<< (id = a valid target language ID)
  • valid language labels: >>dan<< >>fao<< >>isl<< >>jut<< >>nno<< >>nob<< >>non<< >>non_Latn<< >>nrn<< >>ovd<< >>qer<< >>rmg<< >>swe<<
  • download: opus4m+btTCv20210807-2021-09-30.zip
  • test set translations: opus4m+btTCv20210807-2021-09-30.test.txt
  • test set scores: opus4m+btTCv20210807-2021-09-30.eval.txt

Benchmarks

testset BLEU chr-F #sent #words BP
Tatoeba-test-v2021-08-07.eng-multi 52.2 0.683 10000 72532 0.982
Tatoeba-test-v2021-08-07.multi-multi 52.2 0.683 10000 72532 0.982

opus4m+btTCv20210807-2021-12-08.zip

  • dataset: opus4m+btTCv20210807
  • model: transformer-big
  • source language(s): eng
  • target language(s): dan fao isl nno nob nob_Zinh non_Latn swe
  • raw source language(s): eng
  • raw target language(s): dan fao isl nno nob non swe
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<< (id = a valid target language ID)
  • valid language labels:
  • download: opus4m+btTCv20210807-2021-12-08.zip
  • test set translations: opus4m+btTCv20210807-2021-12-08.test.txt
  • test set scores: opus4m+btTCv20210807-2021-12-08.eval.txt

Benchmarks

testset BLEU chr-F #sent #words BP
Tatoeba-test-v2021-08-07.eng-dan 57.9 0.7243 10795 79361 0.990
Tatoeba-test-v2021-08-07.eng-fao 14.4 0.3176 294 1933 0.902
Tatoeba-test-v2021-08-07.eng-isl 36.7 0.5729 2503 19023 0.968
Tatoeba-test-v2021-08-07.eng-multi 53.8 0.6916 10000 72532 0.978
Tatoeba-test-v2021-08-07.eng-non 0.7 0.1937 15 142 1.000
Tatoeba-test-v2021-08-07.eng-nor 52.8 0.6909 5000 39543 0.978
Tatoeba-test-v2021-08-07.eng-swe 56.8 0.7100 10362 68060 0.973

opusTCv20210807+bt_transformer-big_2022-03-13.zip

Benchmarks

testset BLEU chr-F #sent #words BP
Tatoeba-test-v2021-08-07.eng-dan 61.1 0.74740 10795 79361 0.992
Tatoeba-test-v2021-08-07.eng-fao 17.9 0.40067 294 1933 0.971
Tatoeba-test-v2021-08-07.eng-isl 39.4 0.59243 2503 19023 0.957
Tatoeba-test-v2021-08-07.eng-multi 58.5 0.72602 10000 71106 0.982
Tatoeba-test-v2021-08-07.eng-nno 39.8 0.60805 460 3428 0.997
Tatoeba-test-v2021-08-07.eng-nob 57.0 0.72105 4539 36110 0.975
Tatoeba-test-v2021-08-07.eng-non 0.7 0.18711 15 142 1.000
Tatoeba-test-v2021-08-07.eng-nor 55.6 0.71151 5000 39543 0.977
Tatoeba-test-v2021-08-07.eng-swe 60.4 0.73736 10362 68060 0.975

opusTCv20210807+bt_transformer-big_2022-03-17.zip

Benchmarks

testset BLEU chr-F #sent #words BP
Tatoeba-test-v2021-08-07.eng-dan 61.1 0.74740 10795 79361 0.992
Tatoeba-test-v2021-08-07.eng-fao 17.9 0.40067 294 1933 0.971
Tatoeba-test-v2021-08-07.eng-isl 39.4 0.59243 2503 19023 0.957
Tatoeba-test-v2021-08-07.eng-multi 58.8 0.72799 10000 71106 0.982
Tatoeba-test-v2021-08-07.eng-nno 39.8 0.60805 460 3428 0.997
Tatoeba-test-v2021-08-07.eng-nob 57.0 0.72105 4539 36110 0.975
Tatoeba-test-v2021-08-07.eng-non 0.7 0.18711 15 142 1.000
Tatoeba-test-v2021-08-07.eng-nor 55.6 0.71151 5000 39543 0.977
Tatoeba-test-v2021-08-07.eng-swe 60.4 0.73736 10362 68060 0.975