Open-Source Machine Translation Quality Estimation in PyTorch.
Quality estimation (QE) is one of the missing pieces of machine translation: its goal is to evaluate a translation system’s quality without access to reference translations. We present OpenKiwi, a PyTorch-based open-source framework that implements the best QE systems from the WMT 2015-18 shared tasks, making it easy to experiment with these models under a single framework. Using OpenKiwi and a stacked combination of these models, we have achieved state-of-the-art results on word-level QE on the WMT 2018 English-German dataset.
- Framework for training QE models and using pre-trained models for evaluating MT.
- Supports both word-level and sentence-level quality estimation.
- Implementation of five QE systems in PyTorch: QUETCH [1], NuQE [2, 3], predictor-estimator [4, 5], APE-QE [3], and a stacked ensemble with a linear system [2, 3].
- Easy-to-use API. Import it as a package in other projects or run it from the command line.
- Provides scripts to run pre-trained QE models on data from the WMT 2018 campaign.
- Easy to track and reproduce experiments via YAML configuration files.
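To make the word-level versus sentence-level distinction concrete, here is a small self-contained sketch. It does not use OpenKiwi: the example sentence, the tag sequence, the score, and the helper name `split_tags` are invented for illustration, and the interleaved gap/word tag layout (2N+1 tags for N target tokens) is an assumption based on the WMT 2018 data format.

```python
def split_tags(tags):
    """Split an interleaved tag sequence into (gap_tags, word_tags).

    Assumed layout: gap, word, gap, word, ..., gap,
    i.e. 2N+1 tags for a translation with N tokens.
    """
    if len(tags) % 2 == 0:
        raise ValueError("expected an odd number of tags (2N+1)")
    gap_tags = tags[0::2]   # positions between (and around) tokens
    word_tags = tags[1::2]  # one tag per translated token
    return gap_tags, word_tags


# Hypothetical word-level example: a 3-token machine translation.
mt_tokens = ["das", "ist", "Haus"]
# The BAD gap tag marks a missing word between "ist" and "Haus".
tags = ["OK", "OK", "OK", "OK", "BAD", "OK", "OK"]

gap_tags, word_tags = split_tags(tags)

# Hypothetical sentence-level label: a single score for the whole sentence.
sentence_score = 0.25
```

Word-level QE predicts one label per token or gap, while sentence-level QE predicts a single number per translation.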
Results for the WMT18 Quality Estimation shared task, for word level and sentence level on the test set.
Model | En-De SMT MT | En-De SMT gaps | En-De SMT source | En-De SMT r | En-De SMT ρ | En-De NMT MT | En-De NMT gaps | En-De NMT source | En-De NMT r | En-De NMT ρ |
---|---|---|---|---|---|---|---|---|---|---|
OpenKiwi | 62.70 | 52.14 | 48.88 | 71.08 | 72.70 | 44.77 | 22.89 | 36.53 | 46.72 | 58.51 |
Wang2018 | 62.46 | 49.99 | -- | 73.97 | 75.43 | 43.61 | -- | -- | 50.12 | 60.49 |
UNQE | -- | -- | -- | 70.00 | 72.44 | -- | -- | -- | 51.29 | 60.52 |
deepQUEST | 42.98 | 28.24 | 33.97 | 48.72 | 50.97 | 30.31 | 11.93 | 28.59 | 38.08 | 48.00 |
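For readers unfamiliar with the metrics, the sketch below shows one way such scores can be computed in plain Python. It assumes that the word-level columns (MT, gaps, source) report F1-MULT, the product of the F1 scores for the OK and BAD classes as used in the WMT18 shared task, and that r and ρ are Pearson and Spearman correlations, with all values in the table scaled by 100. The function names are illustrative and are not part of OpenKiwi.

```python
import math


def f1_mult(gold, pred):
    """Product of the per-class F1 scores for the OK and BAD labels."""
    score = 1.0
    for label in ("OK", "BAD"):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        score *= 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return score


def pearson(x, y):
    """Pearson correlation coefficient r (inputs must not be constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy data, invented for illustration only.
gold_tags = ["OK", "OK", "BAD", "BAD"]
pred_tags = ["OK", "BAD", "BAD", "BAD"]
word_score = f1_mult(gold_tags, pred_tags)

gold_scores = [0.1, 0.4, 0.2]
pred_scores = [0.15, 0.9, 0.3]
r = pearson(gold_scores, pred_scores)
rho = spearman(gold_scores, pred_scores)
```

In practice a library such as SciPy would normally be used for the correlations; the hand-written versions here only keep the example free of dependencies.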
To install OpenKiwi as a package, run:

    pip install openkiwi

You can now import it inside your project:

    import kiwi

or run it from the command line:

    kiwi
Optionally, to take advantage of our MLflow integration, install MLflow in the same virtualenv as OpenKiwi:

    pip install mlflow
Detailed usage examples and instructions can be found in the Full Documentation.
We provide pre-trained models with the corresponding pre-processed datasets and configuration files. You can reproduce our numbers for the WMT 2018 word- and sentence-level tasks by following the reproduction instructions in the documentation.
We welcome contributions to improve OpenKiwi. Please refer to CONTRIBUTING.md for quick instructions, or to the contributing instructions in the documentation for more detail on how to set up your development environment.
OpenKiwi is Affero GPL licensed. You can see the details of this license in LICENSE.
If you use OpenKiwi, please cite the following report.
OpenKiwi: An Open Source Framework for Quality Estimation
    @misc{openkiwi,
      author = {Fábio Kepler and
                Jonay Trénous and
                Marcos Treviso and
                Miguel Vera and
                André F. T. Martins},
      title = {Open{K}iwi: An Open Source Framework for Quality Estimation},
      year = {2019},
      url = {https://arxiv.org/abs/1902.08646},
      eprint = {arXiv:1902.08646},
    }