Create a simple vector inspector tool #298
Comments
The tool could also report some aggregate stats, like per-dimension variance, or whether all/some dimensions have negative values, etc.
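For illustration, the aggregate stats suggested here could be computed roughly as follows. This is only a sketch, not the tool that was merged: it assumes the vectors are already loaded into a NumPy array of shape `(num_vectors, dim)`, and the function name `aggregate_stats` and its result keys are made up for this example.

```python
import numpy as np


def aggregate_stats(vecs):
    """Per-dimension summary for a (num_vectors, dim) array (hypothetical helper)."""
    # Work in float64 so the variance is not degraded by float32 accumulation.
    vecs = np.asarray(vecs, dtype=np.float64)
    # True for every dimension that contains at least one negative value.
    has_negative = (vecs < 0).any(axis=0)
    return {
        "variance": vecs.var(axis=0),  # population variance per dimension
        "min": vecs.min(axis=0),
        "max": vecs.max(axis=0),
        "dims_with_negatives": int(has_negative.sum()),
        "all_dims_have_negatives": bool(has_negative.all()),
    }
```

A dimension whose variance is close to zero, or a file where no dimension ever goes negative when the embeddings are expected to be signed, would be worth a closer look.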
Capturing a tiny tool I have been using for posterity:
Awesome! Let's start with that! I'll go merge it :) Thanks @msokolov
mikemccand added a commit that referenced this issue on Sep 16, 2024
Too often when trying to generate `.vec` files for benchmarking from Cohere I struggled with whether the written files were actually "correct".

E.g. early attempts were writing `float64` instead of `float32` and, horribly, if you run with a `float64` encoded `.vec` file nothing really "goes wrong", except you get weird/bad recall. Each `float64` is interpreted as two (strange) adjacent `float32`.

It'd be nice to have a tool that could just give a bit of transparency about a `.vec` file, e.g. if its size doesn't evenly divide by the dimensions, something is wrong. Or if there are NaNs, something is wrong. Or if the vectors are not normalized to unit sphere when you expected them to be, something is wrong.

Maybe the tool could also print out the actual float values for a few vectors and we might use our human eyes to look for any such "anomalies" ...
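To make the checks above concrete, here is a minimal sketch of what such an inspector might look like. It is not the tool that was eventually merged. It assumes a `.vec` file is a headerless, flat sequence of little-endian `float32` values and that the caller supplies the dimension. The name `inspect_vec`, the `show` and `tol` parameters, and the report keys are all hypothetical.

```python
import os

import numpy as np


def inspect_vec(path, dim, show=3, tol=1e-3):
    """Run basic sanity checks on a raw float32 vector file (assumed layout)."""
    size = os.path.getsize(path)
    bytes_per_vec = dim * 4  # 4 bytes per float32 component
    count = size // bytes_per_vec  # whole vectors that fit in the file

    # Read only the whole vectors; any trailing partial vector is ignored.
    vecs = np.fromfile(path, dtype="<f4", count=count * dim).reshape(count, dim)

    # Compute norms in float64 so garbage values do not overflow when squared.
    norms = np.linalg.norm(vecs.astype(np.float64), axis=1)
    unit_fraction = float(np.mean(np.abs(norms - 1.0) <= tol)) if count else 0.0

    return {
        "size_bytes": size,
        "evenly_divisible": size % bytes_per_vec == 0,
        "num_vectors": count,
        "nan_count": int(np.isnan(vecs).sum()),
        "inf_count": int(np.isinf(vecs).sum()),
        "unit_norm_fraction": unit_fraction,  # share of vectors with norm ~ 1
        "first_vectors": vecs[:show],  # for eyeballing the raw values
    }
```

Note that the size check alone cannot catch the `float64` mistake: a `float64` file is exactly twice as large as it should be, so its size is still a multiple of `dim * 4` and it simply appears to hold twice as many vectors. In this sketch it is the norm check, or looking at the printed values, that would expose it.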