Merge branch 'ctranslate-multigpu' into ibaldoall
ibaldonl committed Apr 11, 2024
2 parents 9223180 + 9dd2213 commit cae4a29
Showing 2 changed files with 7 additions and 2 deletions.
7 changes: 6 additions & 1 deletion ctranslate/README.md
@@ -1 +1,6 @@
-[CTranslate](https://opennmt.net/CTranslate2/guides/transformers.html#llama-2)
+[CTranslate](https://opennmt.net/CTranslate2/guides/transformers.html#llama-2)
+
+In case of multiple GPUs, if for example we have 4, we run
+```sh
+mpirun -np 4 python3 bench.py
+```
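
A note on the README change: the value passed to `mpirun -np` is meant to match the number of GPUs, so the command changes with the machine. The following is only a minimal sketch of how that command could be assembled for an arbitrary GPU count. `mpirun_command` is a hypothetical helper that is not part of this commit, and it assumes one MPI process per GPU, as in the 4-GPU example above.

```python
import shlex


def mpirun_command(num_gpus: int, script: str = "bench.py") -> list[str]:
    """Build the argv that launches one MPI process per GPU.

    Hypothetical helper for illustration; not part of this commit.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return ["mpirun", "-np", str(num_gpus), "python3", script]


if __name__ == "__main__":
    # Print the command for the 4-GPU case mentioned in the README.
    print(shlex.join(mpirun_command(4)))
```

The list returned by the helper can be passed directly to `subprocess.run` if launching from Python is preferred over typing the command by hand.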
2 changes: 1 addition & 1 deletion ctranslate/bench.py
@@ -6,7 +6,7 @@
from questions import questions
import pandas as pd

-generator = ctranslate2.Generator("llama-2-7b-ct2", device="cuda")
+generator = ctranslate2.Generator("llama-2-7b-ct2", device="cuda", tensor_parallel=True, flash_attention=True)
tokenizer = transformers.AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

def predict(prompt:str):