From 9e6b79f04f4347574f0b5857f9956197dacd81f8 Mon Sep 17 00:00:00 2001
From: JS
Date: Fri, 23 Feb 2024 11:01:46 +0200
Subject: [PATCH] minor fixes

---
 docs/source/examples/sharing_schemes.md | 2 +-
 docs/source/prepare_data.md             | 2 +-
 docs/source/quickstart.md               | 6 +++++-
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/docs/source/examples/sharing_schemes.md b/docs/source/examples/sharing_schemes.md
index 70d3a361..9734e79c 100644
--- a/docs/source/examples/sharing_schemes.md
+++ b/docs/source/examples/sharing_schemes.md
@@ -15,7 +15,7 @@ For this tutorial, we will be utilizing the [UNPC](https://opus.nlpl.eu/UNPC/cor

 Before diving into the sharing schemes, we need to preprocess the data. You can download the processed data using the following command:
 ```bash
-wget
+wget https://mammoth-share.a3s.fi/unpc.tar
 ```

 Additionally, we require the corresponding vocabularies for the dataset. Download the vocabularies with the following command:
diff --git a/docs/source/prepare_data.md b/docs/source/prepare_data.md
index 5d6ba40b..8eaa1dfc 100644
--- a/docs/source/prepare_data.md
+++ b/docs/source/prepare_data.md
@@ -4,7 +4,7 @@

 ## UNPC
 [UNPC](https://opus.nlpl.eu/UNPC/corpus/version/UNPC) consists of manually translated UN documents from the last 25 years (1990 to 2014) for the six official UN languages, Arabic, Chinese, English, French, Russian, and Spanish. We preprocess the data. You can download the processed data by:
-```
+```bash
 wget https://mammoth-share.a3s.fi/unpc.tar
 ```
 Or you can use the scripts provided by the tarball to process the data yourself.
diff --git a/docs/source/quickstart.md b/docs/source/quickstart.md
index c9db37fb..8d5d6d85 100644
--- a/docs/source/quickstart.md
+++ b/docs/source/quickstart.md
@@ -174,8 +174,12 @@ Follow these configs to translate text with your trained model.
 `--src "$path_to_src_language/$lang_pair.$src_lang.sp"`
 - Define the path for saving the translated output: `--output "$out_path/$src_lang-$tgt_lang.hyp.sp"`
 - Adjust GPU and batch size settings based on your requirements: `--gpu 0 --shard_size 0 --batch_size 512`
+- We provide the model checkpoint trained using the encoder shared scheme described in [this tutorial](examples/sharing_schemes.md).
+  ```bash
+  wget https://mammoth-share.a3s.fi/encoder-shared-models.tar.gz
+  ```

 Congratulations! You've successfully translated text using your Mammoth model. Adjust the parameters as needed for your specific translation tasks.

 ### Further reading
-A complete example of training on the Europarl dataset is available at [MAMMOTH101](examples/train_mammoth_101.md), and a complete example for configuring different sharing schemes is available at [MAMMOTH sharing schemes](examples/sharing_schemes.md).
\ No newline at end of file