Releases: DCGM/lm-evaluation-harness
v0.4 Preview Release
- Fixes bug in binary F1 computation.
- Fixes bug with double include in yaml inheritance.
- Clarified the exception raised when using language modeling tasks with smart truncation.
- Added unit tests.
Fixed issue with subjectivity task
- Unfortunately, the subjectivity task was not properly configured: the labels were assigned the other way around. This was fixed in commit a85cf.
- Re-running your experiments is not necessary. It is enough to flip the log-likelihoods (llhs) in the log files and recompute your metrics.
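
The flip-and-recompute step could be sketched as follows. This is only an illustration: the release notes do not describe the log file format, so the record layout here (a `label` field and an `llhs` list holding one log-likelihood per class) and the helper names `flip_llhs` and `accuracy` are assumptions, not the harness's actual schema or API. Adapt the field names to whatever your log files contain.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]  # hypothetical per-example log entry


def flip_llhs(records: List[Record]) -> List[Record]:
    """Return copies of the records with the two class log-likelihoods swapped."""
    return [{**r, "llhs": list(reversed(r["llhs"]))} for r in records]


def accuracy(records: List[Record]) -> float:
    """Fraction of records whose highest log-likelihood class equals the label."""
    correct = 0
    for r in records:
        llhs = r["llhs"]
        predicted = max(range(len(llhs)), key=lambda i: llhs[i])
        correct += int(predicted == r["label"])
    return correct / len(records)


# Made-up example records in the assumed format (not real results).
logged = [
    {"label": 0, "llhs": [-2.0, -0.5]},
    {"label": 1, "llhs": [-0.3, -1.2]},
    {"label": 1, "llhs": [-1.0, -2.0]},
    {"label": 0, "llhs": [-0.1, -3.0]},
]

acc_before = accuracy(logged)            # metric as originally logged
acc_after = accuracy(flip_llhs(logged))  # metric after swapping the llhs
```

Other metrics, such as F1, can be recomputed in the same way from the flipped predictions.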
v0.2
v0.1 Preview Release
This is the code we used for our first experiments.