Examples show boolean vectors, what about string vectors? #27
Comments
Excellent point. It would help to have examples converting strings and documents to vectors using something like word2vec but more lightweight. Something like this but with a fully coded example:
I use this approach:
This is a very interesting method, and I would like to use it in my project. At the moment I am using n-grams to create boolean vectors; perhaps this approach works better and faster. I do have a few questions/observations:
Cheers
PS: I slightly changed your code to make more use of (Java) constants:
Hi all, I have the same question. Why don't we use random hash functions to hash strings into a u64 and treat the result as a bit vector (binary vector)? This is how it is done in practice: with a truly random hash function, identical strings are always hashed to the same binary vector. See the Rust version of SRP-LSH here: https://github.com/serega/gaoya Any idea how to modify it? Thanks, Jianshu
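A minimal sketch of the hashing idea above, under stated assumptions. Hashing a whole string to 64 bits only tells you whether two strings are identical, because similar strings get unrelated hashes. To keep some notion of similarity, the sketch swaps in a SimHash-style fingerprint: each token is hashed to 64 bits (FNV-1a here, chosen only for illustration) and each output bit is decided by majority vote over the tokens. The class and method names are hypothetical; this is neither this library's API nor gaoya's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of any library mentioned in this thread.
final class SimHashSketch {

    // 64-bit FNV-1a hash of a string's UTF-8 bytes.
    static long fnv1a64(String s) {
        long hash = 0xcbf29ce484222325L;          // FNV offset basis
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xff);
            hash *= 0x100000001b3L;               // FNV prime
        }
        return hash;
    }

    // SimHash-style fingerprint: bit i is set when more tokens have
    // bit i set in their hash than have it cleared.
    static long fingerprint(List<String> tokens) {
        int[] votes = new int[64];
        for (String token : tokens) {
            long h = fnv1a64(token);
            for (int i = 0; i < 64; i++) {
                votes[i] += ((h >>> i) & 1L) == 1L ? 1 : -1;
            }
        }
        long result = 0L;
        for (int i = 0; i < 64; i++) {
            if (votes[i] > 0) {
                result |= 1L << i;
            }
        }
        return result;
    }

    // Expands a 64-bit fingerprint into a boolean vector of length 64.
    static boolean[] toBooleanVector(long bits) {
        boolean[] vector = new boolean[64];
        for (int i = 0; i < 64; i++) {
            vector[i] = ((bits >>> i) & 1L) == 1L;
        }
        return vector;
    }

    // Number of differing bits between two fingerprints.
    static int hammingDistance(long a, long b) {
        return Long.bitCount(a ^ b);
    }

    public static void main(String[] args) {
        long a = fingerprint(Arrays.asList("the", "quick", "brown", "fox"));
        long b = fingerprint(Arrays.asList("the", "quick", "red", "fox"));
        System.out.println("Hamming distance: " + hammingDistance(a, b));
    }
}
```

A smaller Hamming distance suggests more shared tokens, but with only a handful of tokens per string the majority vote is noisy, so treat the distance as a rough signal.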
Hi,
I was wondering how to use this library to compare two different strings, each tokenized into a string vector.
The examples only show boolean vectors, which are already "post-transformation". As a newbie who wants to make good use of the library, I would find it very helpful to have the transformation step covered in the examples.
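One possible shape for that transformation step, as a rough sketch rather than the library's own method: map every token to a position in a fixed-size boolean vector using its hash (the "hashing trick"). The class name, method names, and the dimension of 1024 are all assumptions made for illustration; two different tokens can collide on the same position, which a larger dimension makes less likely.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of the library discussed here.
final class TokenVectorizer {

    // Sets one position per token; the position is the token's hash code
    // reduced to the range [0, dimensions).
    static boolean[] toBooleanVector(List<String> tokens, int dimensions) {
        boolean[] vector = new boolean[dimensions];
        for (String token : tokens) {
            int index = Math.floorMod(token.hashCode(), dimensions);
            vector[index] = true;
        }
        return vector;
    }

    // Exact Jaccard similarity of the sets of true positions, useful as a
    // reference value when checking what an LSH scheme estimates.
    static double jaccard(boolean[] a, boolean[] b) {
        int intersection = 0;
        int union = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] && b[i]) intersection++;
            if (a[i] || b[i]) union++;
        }
        return union == 0 ? 1.0 : (double) intersection / union;
    }

    public static void main(String[] args) {
        int dimensions = 1024; // assumed size; tune to the vocabulary
        boolean[] first = toBooleanVector(
                Arrays.asList("the", "quick", "brown", "fox"), dimensions);
        boolean[] second = toBooleanVector(
                Arrays.asList("the", "quick", "red", "fox"), dimensions);
        System.out.println("Jaccard similarity: " + jaccard(first, second));
    }
}
```

Both strings must be vectorized with the same dimension so that positions line up. The resulting boolean arrays have the same form as the vectors in the existing examples, so they could presumably be hashed the same way.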