enabler: guardrail inserting values to vector store #251

Open
kdziedzic68 opened this issue Dec 17, 2024 · 1 comment
Labels
feature New feature or request needs discussion Task is under discussion or need more clarification

Comments

@kdziedzic68
Collaborator

Feature description

The effect could be achieved in a few ways:

  1. store the full embedding model name in metadata (currently only image or text is stored) - would help identify any broken entries
  2. keep the embedder as an attribute of the vector store - seems a nice and elegant approach to me, but would require significant consideration of the system architecture
  3. develop some outlier detection model for the store - quite risky, may not be 100% accurate; on the other hand, it could help with managing store size, e.g. by serving as a noise detector
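A minimal sketch of how options 1 and 2 could fit together, assuming a simple in-memory store. Every name here (`GuardedVectorStore`, `VectorEntry`, `EmbedderMismatchError`, the `embedding_model` metadata key) is hypothetical and invented for illustration; none of it is the existing ragbits API.

```python
from dataclasses import dataclass, field


class EmbedderMismatchError(ValueError):
    """Raised when a vector was produced by a model the store does not expect."""


@dataclass
class VectorEntry:
    key: str
    vector: list[float]
    metadata: dict[str, str] = field(default_factory=dict)


@dataclass
class GuardedVectorStore:
    # Option 2 (hypothetical): the store is bound to one embedding model.
    embedding_model: str
    dimensions: int
    entries: list[VectorEntry] = field(default_factory=list)

    def store(self, entry: VectorEntry, embedded_with: str) -> None:
        # The dimension check alone is not enough, so the model name is checked too.
        if len(entry.vector) != self.dimensions:
            raise ValueError(
                f"expected {self.dimensions} dimensions, got {len(entry.vector)}"
            )
        if embedded_with != self.embedding_model:
            raise EmbedderMismatchError(
                f"store expects '{self.embedding_model}', got '{embedded_with}'"
            )
        # Option 1 (hypothetical key): record the full model name with the entry.
        entry.metadata["embedding_model"] = embedded_with
        self.entries.append(entry)

    def find_foreign_entries(self) -> list[VectorEntry]:
        # Entries whose recorded model differs from the store's model.
        return [
            e
            for e in self.entries
            if e.metadata.get("embedding_model") != self.embedding_model
        ]


store = GuardedVectorStore(embedding_model="model-a", dimensions=3)
store.store(VectorEntry("doc-1", [0.1, 0.2, 0.3]), embedded_with="model-a")

try:
    store.store(VectorEntry("doc-2", [0.4, 0.5, 0.6]), embedded_with="model-b")
    rejected = False
except EmbedderMismatchError:
    rejected = True
```

In this sketch the metadata field is what makes a later audit possible (`find_foreign_entries`), while the bound model name is what blocks the bad insert up front. Whether the caller passes the model name or the store owns the embedder outright is exactly the architectural question raised in option 2.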

Motivation

Right now we are able to insert any vectors into a given store. Let's assume that somebody has chosen a different model that has the same number of dimensions as the vectors already stored. In the current setup we would insert its vectors and break the store, with no possibility of identifying the broken entries.
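The failure mode can be illustrated with a toy example. This is only a sketch: the vectors are made-up numbers standing in for the output of two different models, and `dimension_only_check` is a hypothetical helper, not a ragbits function.

```python
def dimension_only_check(stored: list[list[float]], new_vector: list[float]) -> bool:
    """Return True when the new vector has the same length as every stored vector."""
    return all(len(v) == len(new_vector) for v in stored)


# Made-up values: pretend these came from "model-a".
stored_vectors = [[0.12, 0.80, 0.05], [0.10, 0.79, 0.07]]

# Made-up values: pretend this came from "model-b", which also emits 3 dimensions.
foreign_vector = [0.91, 0.02, 0.33]

# The shape matches, so a dimension-only guard lets the foreign vector through.
accepted = dimension_only_check(stored_vectors, foreign_vector)

# A vector with a different length is the only case this guard can catch.
wrong_shape_accepted = dimension_only_check(stored_vectors, [0.5, 0.5])
```

Once the foreign vector is stored, nothing in the vector itself says which model produced it, which is why the entries cannot be identified afterwards.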

Additional context

No response

@kdziedzic68 kdziedzic68 added the feature New feature or request label Dec 17, 2024
@kdziedzic68 kdziedzic68 moved this to Backlog in ragbits Dec 17, 2024
@mhordynski mhordynski added the needs discussion Task is under discussion or need more clarification label Dec 17, 2024
@mhordynski
Member

  2. keep embedder as the attribute of vector store - seems nice and elegant approach to me, but would require significant considerations of system architecture

and later implementing

  1. store full embedding model name in metadata (currently only image or text are stored) - would help identify eventual broken entries

seems like a good option, but we need to dig deeper to understand all the consequences.
