Upgrade text embedding from text-embedding-004 to text-embedding-preview-0815 #46
Comments
Calling the preview embedding:
payload (request.json):
response:
|
For comparison, here is the call of the
|
Since embedding-004 the API supports
According to the docs, the "reduction" is a simple truncation:
Note that a code example suggests reduction to 256. Note also that |
Since the app is in alpha, I will skip the data migration this time. Later, when we need it, the migration could mean a lot of embedding calls (with a long chat history), which may even need to be throttled so we don't hit any quotas. We can use batch inference to cut down the number of requests, but that has a limit as well (a maximum number of texts in the array). We cannot do this during app startup; it would rather live in the settings: once the user switches the embedding model, an alert could confirm the intent, and then a progress bar view would accompany the process. |
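A minimal sketch of what such a batched, throttled migration loop could look like, written in Python for illustration only (the app itself would do this in Dart). Everything named here is an assumption: `embed_batch` is a hypothetical wrapper around the batch embedding call, and `batch_size` and `min_interval_s` are placeholder values, not the real API limits or quotas.

```python
import time
from typing import Callable, Iterator, Sequence

Vector = list[float]


def chunks(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def migrate_embeddings(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], list[Vector]],  # hypothetical API wrapper
    batch_size: int = 100,        # assumed per-request limit; check the real one
    min_interval_s: float = 1.0,  # assumed pause between requests (throttle)
    on_progress: Callable[[int, int], None] = lambda done, total: None,
    sleep: Callable[[float], None] = time.sleep,
) -> list[Vector]:
    """Re-embed all texts in batches, pausing between requests."""
    vectors: list[Vector] = []
    for index, batch in enumerate(chunks(texts, batch_size)):
        if index > 0:
            sleep(min_interval_s)  # throttle every request after the first
        vectors.extend(embed_batch(batch))
        on_progress(len(vectors), len(texts))  # e.g. drive a progress bar
    return vectors
```

The `on_progress` callback is where a progress bar view could hook in, and injecting `sleep` keeps the loop testable without real waiting.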
I'll decide whether to take advantage of the dimensionality truncation. A reduction to 256 dimensions would cut the storage size to one third (768 / 3 = 256) and reduce the retrieval processing time as well. If we go with the truncation, we'd definitely benefit from reranking #39 |
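The storage arithmetic can be spelled out as below. This is only a back-of-the-envelope sketch in Python; it assumes each component is stored as a 32-bit float, which the thread does not state, and it ignores any per-row overhead of the actual store.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as 32-bit floats


def embedding_bytes(dimensions: int, count: int = 1) -> int:
    """Raw payload size of `count` embeddings with `dimensions` components each."""
    return dimensions * BYTES_PER_FLOAT32 * count


full = embedding_bytes(768)     # bytes per full-size vector
reduced = embedding_bytes(256)  # bytes per truncated vector
ratio = reduced / full          # fraction of the original storage that remains
```

For a brute-force similarity scan, the work per comparison also grows with the number of dimensions, so the same ratio is a rough guide for retrieval time under that assumption.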
Looks like the reduction works with any arbitrary number; I tried 64. I also realized that this is not available in the Gemini Dart API yet (google-gemini/generative-ai-dart#208); however, the workaround can be to perform the truncation ourselves until it is supported, since it's a simple truncation. |
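A client-side truncation along these lines could be sketched as follows (Python for illustration; the Dart version would be analogous). The function name and the `renormalize` option are assumptions of this sketch: the thread only says the reduction is a simple truncation, and whether the server rescales the shortened vector is not confirmed here. Rescaling to unit length is harmless for cosine similarity and makes a plain dot product behave like it.

```python
import math
from typing import Sequence


def truncate_embedding(
    vector: Sequence[float],
    dimensions: int,
    renormalize: bool = True,  # assumption: optional, not confirmed API behavior
) -> list[float]:
    """Keep the first `dimensions` components, optionally rescaled to unit length."""
    if not 0 < dimensions <= len(vector):
        raise ValueError("dimensions must be between 1 and len(vector)")
    head = list(vector[:dimensions])  # the truncation itself
    if not renormalize:
        return head
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero vector cannot be rescaled
    return [x / norm for x in head]
```

Usage would be something like `truncate_embedding(full_vector, 256)` applied to both stored vectors and query vectors, so that both sides of a comparison have the same length.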
I ended up opening a separate ticket for dimensionality reduction: #47 |
I prefer the multilingual model #48 over this upgrade |
At the latest Gemini Unplugged show, where I was the guest speaker (https://www.linkedin.com/events/7234841228205268993/comments/), CHANDRA drew my attention to a new embedding model that is in a preview state. He said "008"; I think he meant text-embedding-preview-0815. It seems that it also has a dimensionality of 768.
This change will require a data migration. The dimensionality is the same, so no schema change is needed.