Commit

minor fixes
shaoxiongji committed Feb 23, 2024
1 parent 9057c65 commit 9e6b79f
Showing 3 changed files with 7 additions and 3 deletions.
2 changes: 1 addition & 1 deletion docs/source/examples/sharing_schemes.md
@@ -15,7 +15,7 @@ For this tutorial, we will be utilizing the [UNPC](https://opus.nlpl.eu/UNPC/cor

Before diving into the sharing schemes, we need to preprocess the data. You can download the processed data using the following command:
```bash
-wget
+wget https://mammoth-share.a3s.fi/unpc.tar
```

Additionally, we require the corresponding vocabularies for the dataset. Download the vocabularies with the following command:
2 changes: 1 addition & 1 deletion docs/source/prepare_data.md
@@ -4,7 +4,7 @@
## UNPC
[UNPC](https://opus.nlpl.eu/UNPC/corpus/version/UNPC) consists of manually translated UN documents from the last 25 years (1990 to 2014) for the six official UN languages, Arabic, Chinese, English, French, Russian, and Spanish.
We preprocess the data. You can download the processed data by:
```
```bash
wget https://mammoth-share.a3s.fi/unpc.tar
```
Or you can use the scripts provided by the tarball to process the data yourself.
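A minimal sketch for unpacking the downloaded tarball, assuming `unpc.tar` is an ordinary tar archive as its extension suggests. The `extract_archive` helper and the `data/unpc` target directory are illustrative names, not part of MAMMOTH, and the layout of the extracted files is not shown here.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of MAMMOTH): unpack an archive into a directory.
extract_archive() {
    local archive="$1"
    local dest="${2:-.}"   # default to the current directory

    # Fail early with a clear message if the download is missing.
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi

    mkdir -p "$dest"
    # When reading from a file, GNU tar and bsdtar detect compression
    # themselves, so the same call handles .tar and .tar.gz.
    tar -xf "$archive" -C "$dest"
}

# Example call after the wget above (directory name is an assumption):
# extract_archive unpc.tar data/unpc
```

Running `tar -tf unpc.tar` first lists the archive contents without extracting, which is a quick way to locate the processing scripts mentioned above.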
6 changes: 5 additions & 1 deletion docs/source/quickstart.md
@@ -174,8 +174,12 @@ Follow these configs to translate text with your trained model.
`--src "$path_to_src_language/$lang_pair.$src_lang.sp"`
- Define the path for saving the translated output: `--output "$out_path/$src_lang-$tgt_lang.hyp.sp"`
- Adjust GPU and batch size settings based on your requirements: `--gpu 0 --shard_size 0 --batch_size 512`
- We provide the model checkpoint trained using the encoder shared scheme described in [this tutorial](examples/sharing_schemes.md).
```bash
wget https://mammoth-share.a3s.fi/encoder-shared-models.tar.gz
```

Congratulations! You've successfully translated text using your Mammoth model. Adjust the parameters as needed for your specific translation tasks.
### Further reading
-A complete example of training on the Europarl dataset is available at [MAMMOTH101](examples/train_mammoth_101.md), and a complete example for configuring different sharing schemes is available at [MAMMOTH sharing schemes](examples/sharing_schemes.md)
+A complete example of training on the Europarl dataset is available at [MAMMOTH101](examples/train_mammoth_101.md), and a complete example for configuring different sharing schemes is available at [MAMMOTH sharing schemes](examples/sharing_schemes.md).
