The purpose is to supply LLMs with up-to-date or private data through retrieval-augmented generation (RAG), since LLMs are trained on static and sometimes outdated datasets and may therefore return inaccurate information.
Goals:
Create text embeddings to perform similarity searches.
When a user makes a query, the system compares embeddings to find the most relevant texts, ensuring responses are more contextually relevant.
Allow users to plug in their own embedding models.
Provide an interface (abstract base classes) so methods stay consistent regardless of which embedding model is used (e.g., OpenAI, HuggingFace, MistralAI, WordLlama).
Cache embeddings to enable fast lookups.
Use vector stores, like NumPy arrays for in-memory storage or DuckDB for persistent storage, to hold embeddings and speed up similarity searches (see the similarity sketch after this list).
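For example, a query is answered by embedding it and ranking the stored embeddings by cosine similarity. A minimal NumPy sketch with made-up vectors:

import numpy as np

# Toy data: four stored embeddings and one query embedding (3-dimensional).
stored = np.array([[0.1, 0.9, 0.0],
                   [0.8, 0.1, 0.1],
                   [0.0, 0.2, 0.9],
                   [0.7, 0.2, 0.1]])
query = np.array([0.75, 0.15, 0.10])

# Cosine similarity between the query and every stored embedding.
scores = stored @ query / (np.linalg.norm(stored, axis=1) * np.linalg.norm(query))
top_k = np.argsort(scores)[::-1][:2]   # indices of the 2 most similar texts
print(top_k, scores[top_k])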
I propose the following interfaces with minimal methods, so as to keep them from being too rigid. I did not add delete to the interface because some of the stores, like NumPy, are not persistent; DuckDB will have it, though.
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class Embeddings(ABC):
    @abstractmethod
    def embed(self, texts: List[str]) -> List[List[float]]:
        """Generate embeddings for a list of texts."""
        pass


class VectorStore(ABC):
    def __init__(self, embedding_model: 'Embeddings'):
        self.embedding_model = embedding_model

    @abstractmethod
    def add(self, texts: List[str], metadata: Optional[List[Dict]] = None) -> List[int]:
        """
        Add texts and their metadata to the store.

        Returns:
            List[int]: A list of unique text IDs for the added texts.
        """
        pass

    @abstractmethod
    def query(self, text: str, top_k: int = 5) -> List[Dict]:
        """
        Query store for similar texts.

        Returns:
            List[Dict]: List of matching texts with metadata and similarity scores.
        """
        pass
I am planning to refactor the existing Embeddings class.
Then, I am planning to implement these:
OpenAI Embeddings
MistralAI Embeddings
HuggingFace Embeddings
WordLlama Embeddings
Example:
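A minimal sketch of what the OpenAI implementation could look like, assuming the openai v1 Python client (the class name and default model here are illustrative, not final):

from typing import List
from openai import OpenAI


class OpenAIEmbeddings(Embeddings):
    """Embeddings backed by the OpenAI embeddings endpoint."""

    def __init__(self, model: str = "text-embedding-3-small"):
        self.client = OpenAI()  # reads OPENAI_API_KEY from the environment
        self.model = model

    def embed(self, texts: List[str]) -> List[List[float]]:
        response = self.client.embeddings.create(model=self.model, input=texts)
        return [item.embedding for item in response.data]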
With these stores:
NumpyVectorStore (in-memory)
DuckDBVectorStore (persistent)
WordLlamaVectorStore (in-memory)
ChromaVectorStore (in-memory or persistent)
Example:
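A minimal sketch of what NumpyVectorStore could look like, using brute-force cosine similarity over a single array; the storage layout and scoring details here are assumptions, not a final design:

from typing import Dict, List, Optional
import numpy as np


class NumpyVectorStore(VectorStore):
    """In-memory vector store backed by a NumPy array."""

    def __init__(self, embedding_model: Embeddings):
        super().__init__(embedding_model)
        self.vectors = np.empty((0, 0), dtype=np.float32)
        self.texts: List[str] = []
        self.metadata: List[Dict] = []

    def add(self, texts: List[str], metadata: Optional[List[Dict]] = None) -> List[int]:
        new = np.asarray(self.embedding_model.embed(texts), dtype=np.float32)
        self.vectors = new if self.vectors.size == 0 else np.vstack([self.vectors, new])
        start = len(self.texts)
        self.texts.extend(texts)
        self.metadata.extend(metadata or [{} for _ in texts])
        return list(range(start, len(self.texts)))

    def query(self, text: str, top_k: int = 5) -> List[Dict]:
        q = np.asarray(self.embedding_model.embed([text])[0], dtype=np.float32)
        # Brute-force cosine similarity against every stored vector.
        scores = self.vectors @ q / (np.linalg.norm(self.vectors, axis=1) * np.linalg.norm(q))
        order = np.argsort(scores)[::-1][:top_k]
        return [{"text": self.texts[i], "metadata": self.metadata[i], "score": float(scores[i])}
                for i in order]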