Commit
Few more Inference Endpoints fixes (#69)
* fix(TGI): correct clear request with a given batch id
* ci(tgi): create images when pushing on current branch
* fix(generator): raise error if prefill receives too many requests
* feat(tgi): add more prefill lengths

  Since bucketing does not work for now, we add more (small) prefill lengths. This will increase the warmup time, but it will also speed up generation.
* Revert "ci(tgi): create images when pushing on current branch"

  This reverts commit 26e1193.
* fix(test): multiple decode tests require max_batch_size to be > 1
* fix(test): expected result is different when model is compiled

  Compiled model results are not always very good. While this should be investigated further later on, the current solution is just to use the non-compiled version. This results in some tests generating different results, so expectations have been updated accordingly.
* chore: bump to version v0.1.2
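As a rough illustration of the two generator-related items above (rejecting an oversized prefill batch, and padding inputs to one of several warmed-up prefill lengths), here is a minimal sketch. It is not the code from this commit: `PREFILL_LENGTHS`, `MAX_BATCH_SIZE`, `pick_prefill_length` and `check_prefill_batch` are hypothetical names and values chosen for the example, and the real generator may be organized quite differently.

```python
import bisect

# Hypothetical values: the actual lengths and batch limit are not shown in this commit.
PREFILL_LENGTHS = [32, 64, 128, 256, 512, 1024]  # must stay sorted ascending
MAX_BATCH_SIZE = 4


def pick_prefill_length(num_tokens, lengths=PREFILL_LENGTHS):
    """Return the smallest warmed-up length that can hold num_tokens."""
    idx = bisect.bisect_left(lengths, num_tokens)
    if idx == len(lengths):
        raise ValueError(
            f"Input of {num_tokens} tokens exceeds the largest prefill length {lengths[-1]}"
        )
    return lengths[idx]


def check_prefill_batch(requests, max_batch_size=MAX_BATCH_SIZE):
    """Raise instead of silently accepting more requests than the batch can hold."""
    if len(requests) > max_batch_size:
        raise ValueError(
            f"Prefill received {len(requests)} requests, but max_batch_size is {max_batch_size}"
        )
```

With more (and smaller) entries in the list, a short prompt is padded to a closer length, which is the trade-off the message describes: each extra length is one more shape to warm up, in exchange for less wasted padding at generation time.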
1 parent fd29591, commit 7cce24c
Showing 6 changed files with 19 additions and 9 deletions.
2 changes: 1 addition & 1 deletion
text-generation-inference/server/text_generation_server/version.py
@@ -1,5 +1,5 @@
 from pkg_resources import parse_version
 
 
-__version__ = "0.1.1"
+__version__ = "0.1.2"
 VERSION = parse_version(__version__)