Flattened inf params #152
Conversation
Signed-off-by: gkumbhat <[email protected]>
This looks great! Some initial thoughts, but mostly just discussion points
top_p: Optional[float] = 0.0,
typical_p: Optional[float] = 0.0,
temperature: Optional[float] = 1.0,
repetition_penalty: Optional[float] = 0.0,
The same comments apply to repetition_penalty and top_p here.
truncation = True

if repetition_penalty == 0.0:
    repetition_penalty = 1.0
Ah, I see now - but why is this being overridden here rather than just using 1 as the default in .run?
Is top_p supposed to be handled outside of generate like this too?
This was to align with TGIS, per the comments in the proto: https://github.com/caikit/caikit-tgis-backend/blob/main/caikit_tgis_backend/generation.proto#L103
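To make the override above concrete, here is a minimal sketch of the sentinel convention, assuming (as the TGIS proto comments suggest) that 0.0 means "unset" and that HF transformers' generate() treats 1.0 as "no penalty":

```python
def resolve_repetition_penalty(repetition_penalty: float = 0.0) -> float:
    """Map the TGIS-style 0.0 sentinel ("unset") to 1.0, which
    HF transformers' generate() treats as "no penalty"."""
    return 1.0 if repetition_penalty == 0.0 else repetition_penalty
```

Keeping 0.0 as the public default preserves wire compatibility with TGIS, while the mapping ensures generate() never sees an invalid penalty of 0.0.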
typical_p=0.23,
temperature=0.77,
)
assert isinstance(pred, GeneratedTextResult)
It might be nice to add some kind of validation that traces the input args passed to generate, since these tests aren't actually verifying that greedy / sampling decoding is happening.
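One way to do that kind of tracing is to mock the underlying model and assert on the kwargs that reach generate(). This is only a sketch; FakeModule and its run() signature are stand-ins for the real module under test:

```python
from unittest.mock import MagicMock

class FakeModule:
    """Hypothetical stand-in for the module under test; the real module
    would forward these kwargs to its underlying generate() call."""

    def __init__(self, model):
        self.model = model

    def run(self, text, typical_p=0.0, temperature=1.0):
        return self.model.generate(text, typical_p=typical_p, temperature=temperature)

mock_model = MagicMock()
module = FakeModule(mock_model)
module.run("hello", typical_p=0.23, temperature=0.77)

# Verify the sampling args actually reached generate()
mock_model.generate.assert_called_once_with("hello", typical_p=0.23, temperature=0.77)
```

With this pattern the test fails if the module silently drops or rewrites a sampling parameter, which the isinstance assertion alone would not catch.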
error.type_check("<NLP84635843E>", int, allow_none=True, top_k=top_k)
error.type_check("<NLP55267523E>", float, allow_none=True, top_p=top_p)
error.type_check("<NLP13670202E>", float, allow_none=True, typical_p=typical_p)
error.type_check(
Type check for decoding_method, temperature, max_time, and exponential_decay_length_penalty? And a value check on decoding_method? Something like this?
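A plain-Python sketch of the checks being requested (the allowed decoding_method values and the exact error messages are assumptions; the real code would presumably use the caikit error handler's type_check/value_check pattern shown above rather than raw raises):

```python
from typing import Optional, Tuple

# Assumed allowed values, mirroring common GREEDY/SAMPLING decoding options
VALID_DECODING_METHODS = {"GREEDY", "SAMPLING"}

def validate_inference_params(
    decoding_method: str,
    temperature: Optional[float] = None,
    max_time: Optional[float] = None,
    exponential_decay_length_penalty: Optional[Tuple[int, float]] = None,
) -> None:
    # Type checks, analogous to the error.type_check calls above
    if not isinstance(decoding_method, str):
        raise TypeError("decoding_method must be a str")
    if temperature is not None and not isinstance(temperature, float):
        raise TypeError("temperature must be a float")
    if max_time is not None and not isinstance(max_time, float):
        raise TypeError("max_time must be a float")
    if exponential_decay_length_penalty is not None and not isinstance(
        exponential_decay_length_penalty, tuple
    ):
        raise TypeError("exponential_decay_length_penalty must be a tuple")
    # Value check on decoding_method
    if decoding_method not in VALID_DECODING_METHODS:
        raise ValueError(f"decoding_method must be one of {VALID_DECODING_METHODS}")
```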
Maximum amount of time in seconds that the query should take.
NOTE: this does not include network overhead.
Range: 0-120.0
exponential_decay_length_penalty: Tuple(int, float)
Type should also include ExponentialDecayLengthPenalty
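A sketch of accepting either form and normalizing to the tuple that generate() expects. The field names on the dataclass are assumptions standing in for the real caikit data model class:

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass
class ExponentialDecayLengthPenalty:
    """Hypothetical stand-in for the caikit data model class;
    field names here are assumptions for illustration."""
    start_index: int
    decay_factor: float

def normalize_decay_penalty(
    penalty: Union[Tuple[int, float], ExponentialDecayLengthPenalty]
) -> Tuple[int, float]:
    # Accept either the data model object or a raw (int, float) tuple
    if isinstance(penalty, ExponentialDecayLengthPenalty):
        return (penalty.start_index, penalty.decay_factor)
    return penalty
```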
Supports #155