feat: retrieve embeddings from database only when necessary #119

Open
wants to merge 1 commit into
base: main

fangyi-zhou

When performing a similarity search without maximal marginal relevance, the database query includes the embeddings by default, even though the retrieved embeddings are then discarded unused.

This can be very inefficient when retrieving a large number of documents, because of the communication overhead of transferring the embeddings.
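
As a rough illustration of the idea (not the code in this PR), the sketch below selects the embedding column only when the caller asks for it. Everything in it is an assumption made for the example: the `docs` table, the `similarity_search` helper, the `include_embedding` flag, and the `l2_distance` function are hypothetical names, and the stdlib `sqlite3` module stands in for the real database.

```python
import json
import math
import sqlite3


def l2_distance(a_json: str, b_json: str) -> float:
    """Euclidean distance between two JSON-encoded vectors (toy stand-in)."""
    a, b = json.loads(a_json), json.loads(b_json)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity_search(conn, query_vec, k, include_embedding=False):
    """Hypothetical helper: return the k nearest rows as dicts."""
    # Columns every caller needs; the embedding is added only on request,
    # e.g. when a re-ranking step such as MMR needs the raw vectors.
    columns = ["id", "document"]
    if include_embedding:
        columns.append("embedding")
    sql = (
        f"SELECT {', '.join(columns)}, "
        "l2_distance(embedding, ?) AS distance "
        "FROM docs ORDER BY distance LIMIT ?"
    )
    cur = conn.execute(sql, (json.dumps(query_vec), k))
    names = [d[0] for d in cur.description]
    return [dict(zip(names, row)) for row in cur.fetchall()]


# Toy in-memory database with three documents.
conn = sqlite3.connect(":memory:")
conn.create_function("l2_distance", 2, l2_distance)
conn.execute(
    "CREATE TABLE docs (id INTEGER PRIMARY KEY, document TEXT, embedding TEXT)"
)
conn.executemany(
    "INSERT INTO docs (document, embedding) VALUES (?, ?)",
    [
        ("a", json.dumps([0.0, 0.0])),
        ("b", json.dumps([1.0, 1.0])),
        ("c", json.dumps([5.0, 5.0])),
    ],
)

plain = similarity_search(conn, [0.9, 0.9], k=2)
with_vectors = similarity_search(conn, [0.9, 0.9], k=2, include_embedding=True)
```

In this sketch the distance is still computed inside the database, so the ranking is the same in both modes; only the columns sent back to the client differ. A check on which columns come back is also something a unit test could plausibly assert.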

@fangyi-zhou fangyi-zhou force-pushed the select-columns-to-query branch from 3faed2b to c1f5956 Compare September 18, 2024 22:55
@fangyi-zhou fangyi-zhou changed the title feat: retrieve embeddings only from database when necessary feat: retrieve embeddings from database only when necessary Sep 18, 2024
@fangyi-zhou fangyi-zhou force-pushed the select-columns-to-query branch from c1f5956 to 712a40c Compare September 19, 2024 09:51
@fangyi-zhou
Author

fangyi-zhou commented Sep 23, 2024

Hello, can I get a review of this PR? @eyurtsev

@eyurtsev
Collaborator

Looks reasonable. Could you add unit tests?

@fangyi-zhou
Author

I'm not sure how to add a unit test for this performance patch. Any ideas?
