The AI tab would benefit from having an embedding vector visualization. That is, whenever the user enables the AI extension and sets up a vector index, we could plot the resulting vectors in the UI using UMAP.
UMAP (Uniform Manifold Approximation and Projection) is a technique that projects high-dimensional data (for example, 2000-dimensional embedding vectors) down to 2 or 3 dimensions, i.e. points with 2 or 3 coordinates that we can plot. It preserves local structure, so the user can see which pieces of text the embedding model considers similar to each other.
A potential workflow such a visualization could enable:

1. The user creates a vector index.
2. They open the visualization and select a type to visualize.
3. They input a text query.
4. The query gets embedded via the API and projected onto the visualization, so the user can see which points in the index are closest to their query.
Alternatively, they could browse the visualization, filter it with EdgeQL expressions, and see which points cluster together. This information would help them adjust the content of the property the database indexes.
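The query lookup in the workflow above boils down to a nearest-neighbor search in embedding space. A minimal numpy sketch (the vectors here are toy stand-ins for the index contents and for the embedded query):

```python
import numpy as np

def top_k_closest(query_vec, index_vecs, k=5):
    """Return indices of the k index vectors closest to the query,
    ranked by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    m = index_vecs / np.linalg.norm(index_vecs, axis=1, keepdims=True)
    sims = m @ q                      # cosine similarity per record
    return np.argsort(-sims)[:k]      # highest similarity first

# Toy data: 100 indexed embeddings, plus a query that is a slightly
# perturbed copy of record 17, so record 17 should rank first.
rng = np.random.default_rng(0)
index_vecs = rng.normal(size=(100, 64))
query_vec = index_vecs[17] + 0.01 * rng.normal(size=64)

closest = top_k_closest(query_vec, index_vecs, k=5)
print(closest[0])  # 17
```

In the real UI the same ranking could drive highlighting: the top-k points get emphasized on the projected plot.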
There would have to be a cap of roughly 10,000 samples per visualization; otherwise computing the projection would take prohibitively long. Those samples would need to be picked uniformly at random across all of the records.
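The cap could be implemented as plain uniform sampling without replacement; a sketch (the 10,000 figure is the cap suggested above, and `sample_for_projection` is a hypothetical helper name):

```python
import numpy as np

MAX_SAMPLES = 10_000

def sample_for_projection(record_ids, cap=MAX_SAMPLES, seed=None):
    """Pick at most `cap` record ids uniformly at random, without
    replacement, so every record is equally likely to be visualized."""
    record_ids = np.asarray(record_ids)
    if len(record_ids) <= cap:
        return record_ids
    rng = np.random.default_rng(seed)
    return rng.choice(record_ids, size=cap, replace=False)

ids = np.arange(50_000)
sample = sample_for_projection(ids, seed=1)
print(len(sample))  # 10000
```

Sampling by id first, then fetching only the chosen rows' embedding vectors, keeps the data transferred to the UI bounded as well.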