No access to the similarity_search_by_vector function #131

Closed
xindexer opened this issue Mar 23, 2024 · 4 comments
Labels
bug Something isn't working documentation Improvements or additions to documentation

Comments

@xindexer

There's no way to call a near_vector search (the similarity_search_by_vector function). One of the earlier versions had a "by_text" boolean that was used to switch to a vector search. If that is put back in, you can access it again.

if self._by_text:
    return self._perform_search(query, k, **kwargs)
else:
    if self._embedding is None:
        raise ValueError(
            "_embedding cannot be None for similarity_search when "
            "_by_text=False"
        )
    embedding = self._embedding.embed_query(query)
    return self.similarity_search_by_vector(embedding, k, **kwargs)

Not the best solution in my opinion, but it is working.

@xindexer xindexer added the bug Something isn't working label Mar 23, 2024
@hkhairy

hkhairy commented Mar 24, 2024

You can, by passing a keyword argument to the search method. It's not as straightforward as it used to be, because it now relies on a hidden keyword argument called search_method.

So, if you call

weaviate_vstore.similarity_search(..., search_method = "near_vector")

That routes the call to the near_vector branch of the _perform_search method. If you open _perform_search in the source code, you'll find this:

if search_method == "hybrid":
    embedding = self._embedding.embed_query(query)
    result = collection.query.hybrid(
        query=query, vector=embedding, limit=k, **kwargs
    )
elif search_method == "near_vector":
    result = collection.query.near_vector(limit=k, **kwargs)

There's another problem though, in this block specifically

elif search_method == "near_vector":
    result = collection.query.near_vector(limit=k, **kwargs)

The embedding of the query isn't being passed. I've raised an issue and made a PR for that.
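For illustration, here is a minimal sketch of the kind of fix described above: embed the query and forward the vector in the near_vector branch. Everything in it is an assumption made for the example. `FakeEmbedding`, `FakeQuery`, and `perform_search` are hypothetical stand-ins so the snippet runs without a Weaviate instance, and the `near_vector=` keyword is assumed to match the Weaviate v4 Python client's query API. It is not the integration's actual code, and the PR mentioned above may look different.

```python
from typing import Any, Dict, List


class FakeEmbedding:
    """Hypothetical stand-in for an embeddings object exposing embed_query()."""

    def embed_query(self, text: str) -> List[float]:
        # Toy embedding (not a real model): deterministic numbers from the text.
        return [float(len(text)), 1.0]


class FakeQuery:
    """Hypothetical stand-in for collection.query; records what it receives."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    def near_vector(self, near_vector: List[float], limit: int, **kwargs: Any) -> str:
        self.calls.append({"near_vector": near_vector, "limit": limit, **kwargs})
        return "near_vector result"


def perform_search(
    query_api: FakeQuery,
    embedding: FakeEmbedding,
    query: str,
    k: int,
    search_method: str = "near_vector",
    **kwargs: Any,
) -> str:
    """Sketch of the near_vector branch with the query embedding passed along."""
    if search_method == "near_vector":
        # The missing step: turn the query text into a vector first...
        vector = embedding.embed_query(query)
        # ...and hand that vector to the near_vector call.
        return query_api.near_vector(near_vector=vector, limit=k, **kwargs)
    raise ValueError(f"unsupported search_method in this sketch: {search_method}")
```

Calling `perform_search(FakeQuery(), FakeEmbedding(), "hello", 5)` records a call that includes the embedded query, which is the piece missing from the branch quoted above.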

@xindexer
Author

Ok - I had tried to use the search_method but like you said it didn't pass the vector - I like your solution much better.

@hsm207 hsm207 added the documentation Improvements or additions to documentation label Mar 25, 2024
hsm207 added a commit that referenced this issue Mar 25, 2024
Remove `similarity_search_by_vector` since the same function can be done
by hybrid search

Fixes:

* #132
* #131

Signed-off-by: hsm207 <[email protected]>

---------

Signed-off-by: hsm207 <[email protected]>
@hsm207
Collaborator

hsm207 commented Mar 25, 2024

@xindexer in this new integration, calling similarity_search is the way to do any search. This method calls Weaviate's hybrid search under the hood. So, if you want to do vector search only, either of the following will work:

weaviate_vector_store.similarity_search(query="hello", k=5, alpha=1)

weaviate_vector_store.similarity_search(query=None, vector=[1, 2, 3], k=5, alpha=1)

@hsm207 hsm207 closed this as completed Mar 25, 2024
@hkhairy

hkhairy commented Mar 26, 2024

@xindexer
In case you don't know, hybrid search is a hybrid of 2 things:

  • text search (using an algorithm called BM25, or Best Match 25),
  • and vector search.

The hybrid search favors one search outcome over the other via a parameter called alpha. So, if alpha is 0.3, then the vector search will have a weight of 0.3, while the text search will have a weight of 0.7.

You can even control the weight at the level of individual properties. See the Weaviate docs.

That's why @hsm207's answer passes alpha=1 in his examples: it gives the vector search the full weight, so the text search contributes nothing.
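To make the role of alpha concrete, here is a small sketch that treats it as a plain weighted sum of two scores. This is a simplification for illustration only: the function name `blend_scores` is hypothetical, and Weaviate's real fusion step ranks or normalises the results of each search before combining them, so actual scores will not match these numbers.

```python
def blend_scores(vector_score: float, keyword_score: float, alpha: float) -> float:
    """Weighted sum: alpha goes to the vector score, (1 - alpha) to the keyword score.

    Illustrative simplification only, not Weaviate's actual fusion algorithm.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * vector_score + (1.0 - alpha) * keyword_score
```

With alpha=1 the result equals the vector score alone, with alpha=0 it equals the keyword (BM25) score alone, and alpha=0.3 weights them 0.3 and 0.7 as described above.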
