This project documents the results of our evaluation of German T5 models on German abstractive summarization: the models are fine-tuned on Swisstext and evaluated on MLSUM using ROUGE.

Parameters for the first run (a sketch of the corresponding training setup follows the list):
- train data: Swisstext
- test data: MLSUM
- GPUs: 4 (V100)
- batch size / GPU: 2
- batch size total: 8
- warmup_ratio: 0.3
- epochs: 10
- max_source_length: 800
- max_target_length: 96
- learning rate: 5e-5 (default)
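
The runs were presumably driven through the Hugging Face `transformers` sequence-to-sequence training stack; the snippet below is a minimal sketch of how the parameters above could map onto `Seq2SeqTrainingArguments` and a tokenization function, not the exact script used. The model name, the `text`/`summary` column names, and the output directory are placeholder assumptions.

```python
# Hedged sketch (not the exact training script): mapping the listed parameters
# onto a Hugging Face Seq2SeqTrainer setup. Model name, column names
# ("text"/"summary") and output_dir are placeholders.
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    Seq2SeqTrainingArguments,
)

model_name = "google/mt5-small"  # any model from the tables below
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

max_source_length = 800  # 512 in the second run
max_target_length = 96


def preprocess(batch):
    """Tokenize source texts and reference summaries with the lengths above."""
    model_inputs = tokenizer(
        batch["text"], max_length=max_source_length, truncation=True
    )
    labels = tokenizer(
        text_target=batch["summary"], max_length=max_target_length, truncation=True
    )
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs


training_args = Seq2SeqTrainingArguments(
    output_dir="german-t5-summarization",
    per_device_train_batch_size=2,  # 4 x V100 -> total batch size 8
    warmup_ratio=0.3,
    num_train_epochs=10,            # 8 in the second run
    learning_rate=5e-5,             # transformers default
    predict_with_generate=True,
    generation_max_length=max_target_length,
)

# The Swisstext train split and the MLSUM test split would then be tokenized
# with `preprocess` and passed to transformers.Seq2SeqTrainer together with
# transformers.DataCollatorForSeq2Seq(tokenizer, model=model).
```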
Higher values are better.
Model | rouge1 | rouge2 | rougeL | rougeLsum |
---|---|---|---|---|
google/mt5-small | 16.7323 | 3.5629 | 12.65 | 14.6898 |
philschmid/test-german-t5-prompted-germanquad | 15.7629 | 2.8154 | 11.898 | 13.9223 |
stefan-it/t5-base-secret package3 | 15.7427 | 2.9186 | 12.0224 | 13.8726 |
stefan-it/t5-base-secret epoch2-package2 | 15.4757 | 2.7629 | 11.978 | 13.5326 |
GermanT5/t5-base-german-3e | 14.5525 | 2.0007 | 11.1617 | 12.9124 |
GermanT5/t5-efficient-oscar-german-small-el32 (last checkpoint) | 16.6277 | 3.404 | 12.6183 | 14.5772 |
GermanT5/t5-efficient-oscar-german-small-el32 (second-to-last checkpoint) | 16.6886 | 3.4468 | 12.666 | 14.6423 |
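
The ROUGE columns follow the usual rouge1/rouge2/rougeL/rougeLsum keys. As a small illustration (assuming the `evaluate` library, not necessarily the exact tooling used here), scores can be computed and scaled to the 0–100 range used in these tables like this:

```python
# Hedged sketch: computing ROUGE with the `evaluate` library; the German
# example sentences are made up and only illustrate the metric keys.
import evaluate

rouge = evaluate.load("rouge")

predictions = ["Der Gemeinderat hat das Budget für das kommende Jahr verabschiedet."]
references = ["Das Budget für das kommende Jahr wurde vom Gemeinderat verabschiedet."]

scores = rouge.compute(
    predictions=predictions, references=references, use_stemmer=True
)
# Scale to 0-100 to match the scale used in the tables above.
print({key: round(value * 100, 4) for key, value in scores.items()})
```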
Parameters for the second run:
- train data: Swisstext
- test data: MLSUM
- GPUs: 4 (V100)
- batch size / GPU: 2
- batch size total: 8
- warmup_ratio: 0.3
- epochs: 8
- max_source_length: 512
- max_target_length: 96
- learning rate: 5e-5 (default)
Higher values are better.
Model | rouge1 | rouge2 | rougeL | rougeLsum |
---|---|---|---|---|
google/mt5-small (FP32) | 16.0354 | 3.2689 | 12.2063 | 14.1225 |
GermanT5/t5-efficient-oscar-german-small-el32 (FP32) | 16.2004 | 3.2372 | 12.3031 | 14.2256 |
GermanT5/t5-efficient-gc4-german-small-el32 (FP32) | 17.1507 | 3.8038 | 13.0836 | 15.1671 |
GermanT5/t5-efficient-gc4-german-base-nl36-old (FP16) | 9.3494 | 1.3531 | 7.4037 | 8.5473 |
GermanT5/t5-efficient-gc4-german-base-nl36 (DeepSpeed with ZeRO-3 example auto config) | 17.9501 | 3.9247 | 13.3758 | 15.6139 |
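
The last row was trained with DeepSpeed ZeRO-3 using the example "auto" configuration. The snippet below is a condensed sketch of what such a config roughly looks like, not the exact file used in this run; with the Hugging Face Trainer it can be passed via the `deepspeed` argument of `Seq2SeqTrainingArguments` (or as a JSON file), and the "auto" entries are filled in from the training arguments. The FP16/FP32 labels in the table correspond to the `fp16` training argument.

```python
# Hedged sketch of a DeepSpeed ZeRO-3 "auto" config, condensed from the style
# of the Hugging Face/DeepSpeed example configs; not the exact file used here.
ds_config = {
    "fp16": {"enabled": "auto"},  # follows the Trainer's fp16 setting
    "zero_optimization": {
        "stage": 3,
        "overlap_comm": True,
        "contiguous_gradients": True,
        "reduce_bucket_size": "auto",
        "stage3_prefetch_bucket_size": "auto",
        "stage3_param_persistence_threshold": "auto",
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "train_micro_batch_size_per_gpu": "auto",  # 2 per GPU in these runs
    "gradient_accumulation_steps": "auto",
    "train_batch_size": "auto",                # 8 across 4 V100s here
}

# With the Hugging Face Trainer, this dict (or a path to an equivalent
# ds_config.json) would be passed via the `deepspeed` argument of
# Seq2SeqTrainingArguments, together with fp16=True for the FP16 runs.
```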