Different Caikit server instances returns different protos #237
Comments
I have a simpler reproducer!
Again, the location of this field is the issue:
…roto' workaround for caikit/caikit-nlp#237 so that llm-load-test always runs with the right protos
I don't really know; I've only tested it in my OpenShift cluster with the … It should be easy to check with this reproducer.
OK, looks like the version of …
Thanks for the updates, I can't find any problematic difference now :) The only thing I noticed is that the Services still have a random order. Two examples:
I think this last bit will be solved by caikit/caikit#535, @evaline-ju could you confirm?
@aluu317 worked on caikit/caikit#535 - could you provide any insight on the above? I think I was tagged by mistake.
Oops, I got it wrong with the auto-completion, sorry Evaline!
@kpouget Ah, unfortunately, the PR caikit/caikit#535 will not fix the ordering of rpcs; it was meant to fix the ordering of the fields in your original bug.
Describe the bug
As part of my automated testing, I observed that some requests were returning very quickly.
Investigation showed that min_new_tokens: 0 was the reason for the quick return:
max_new_tokens: 25, min_new_tokens: 0
Request generated 5 tokens before EosToken
even though the query was sent with max_new_tokens == min_new_tokens.
Further investigation led to this reproducer:
1.log | 2.log
which highlights that the protos returned by two endpoints running the same image are different.
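A comparison like this can be made repeatable by diffing the two proto dumps programmatically. The sketch below is only an illustration and is not the reproducer attached to this issue: the message name, the two proto snippets, and their field numbers are invented for the example. Only the field names max_new_tokens and min_new_tokens come from this report.

```python
import difflib

# Hypothetical proto dumps from two endpoints; contents are illustrative only.
PROTO_A = """\
message ExampleRequest {
  string text = 1;
  int64 max_new_tokens = 2;
  int64 min_new_tokens = 3;
}
"""

PROTO_B = """\
message ExampleRequest {
  string text = 1;
  int64 min_new_tokens = 2;
  int64 max_new_tokens = 3;
}
"""


def proto_diff(a: str, b: str) -> list[str]:
    """Return unified-diff lines between two proto dumps (empty if identical)."""
    return list(
        difflib.unified_diff(
            a.splitlines(),
            b.splitlines(),
            fromfile="endpoint-1",
            tofile="endpoint-2",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Print the differences; no output means the two dumps are identical.
    for line in proto_diff(PROTO_A, PROTO_B):
        print(line)
```

An empty result means the two endpoints served identical protos. Any output pinpoints the fields whose position or numbering changed.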
Image is
quay.io/opendatahub/caikit-tgis-serving@sha256:794adc22d52cb3ac4b5aadfb286e8431cca829acdc4909719329cf8c4fabb4ec
Platform
Caikit packages in this image have this version:
Python 3.9
Sample Code
See above.
The invalid launch happens ~50% of the time, from what I observed.
Expected behavior
The prototypes are always the same.
Observed behavior
The prototypes do not have the same ordering.
No error printed anywhere.
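The absence of any error is consistent with how protobuf encodes messages: the wire format carries field numbers, not field names, so a client and a server that disagree on numbering can still exchange bytes that decode cleanly. The sketch below hand-rolls the varint encoding to show one plausible way a request could arrive with min_new_tokens: 0. The two numberings and the inserted_field name are assumptions made up for the illustration, not the actual Caikit protos.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, pos: int) -> tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode(message: dict[str, int], numbering: dict[str, int]) -> bytes:
    """Serialize integer fields using the given name -> field-number map."""
    out = bytearray()
    for name, value in message.items():
        out += encode_varint(numbering[name] << 3)  # wire type 0 (varint)
        out += encode_varint(value)
    return bytes(out)


def decode(data: bytes, numbering: dict[str, int]) -> dict[str, int]:
    """Parse varint fields; unknown numbers are skipped, missing fields stay 0."""
    by_number = {number: name for name, number in numbering.items()}
    decoded = {name: 0 for name in numbering}  # proto3 integer default
    pos = 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        value, pos = decode_varint(data, pos)
        name = by_number.get(tag >> 3)
        if name is not None:
            decoded[name] = value
    return decoded


# Hypothetical numberings: the server's proto has an extra field in the middle.
CLIENT = {"max_new_tokens": 2, "min_new_tokens": 3}
SERVER = {"max_new_tokens": 2, "inserted_field": 3, "min_new_tokens": 4}

request = {"max_new_tokens": 25, "min_new_tokens": 25}
print(decode(encode(request, CLIENT), SERVER))
```

Under these assumed numberings the server reads max_new_tokens as 25 but leaves min_new_tokens at its default of 0, because the client's value landed in a different field. No exception is raised at any point.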
Additional info
The location of this block (+ the field numbering) is the key difference between the "different versions" of the protos: