how to save, load and update vectorstoreindex locally? #4188
-
It depends on which vector store backend you are using. FAISS, for example, lets you save an index to disk and merge two vector stores together. For other backends, check the documentation of your specific vector store to see whether something similar is supported; I don't have much experience with the others.
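To make the save / load / merge cycle concrete, here is a small library-agnostic sketch. `TinyIndex` and all of its methods are hypothetical names made up for illustration; this is not FAISS or LangChain code, and it uses brute-force search over plain lists. In LangChain's FAISS wrapper the corresponding operations are exposed as `save_local`, `load_local` and `merge_from`, but check the documentation for your installed version before relying on exact arguments.

```python
import json
import math
from pathlib import Path


class TinyIndex:
    """Toy in-memory vector index (hypothetical; not a FAISS or LangChain class)."""

    def __init__(self):
        self.texts = []    # text stored alongside each vector
        self.vectors = []  # one list of floats per text

    def add(self, text, vector):
        self.texts.append(text)
        self.vectors.append(list(vector))

    def save(self, path):
        # Persist texts and vectors together so they cannot drift apart.
        payload = {"texts": self.texts, "vectors": self.vectors}
        Path(path).write_text(json.dumps(payload), encoding="utf-8")

    @classmethod
    def load(cls, path):
        payload = json.loads(Path(path).read_text(encoding="utf-8"))
        index = cls()
        index.texts = payload["texts"]
        index.vectors = payload["vectors"]
        return index

    def merge_from(self, other):
        # Append the other index's entries; nothing is re-embedded.
        self.texts.extend(other.texts)
        self.vectors.extend(other.vectors)

    def search(self, query, k=1):
        # Brute-force nearest neighbours by Euclidean distance.
        ranked = sorted(
            zip(self.texts, self.vectors),
            key=lambda pair: math.dist(query, pair[1]),
        )
        return [text for text, _ in ranked[:k]]
```

The usual update flow is then: load the saved index, build a second small index from only the new documents, merge it into the first, and save again.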
-
For FAISS, I saw some documentation like this, but I have not tried it myself:
docsearch = FAISS.from_documents(docs, embeddings)
-
@IamExperimenting Hi, did you find a solution for updating the index with new PDF files? Thanks!
-
I want to add to this question, because it is still not clear to me.
Building the vector store took a very long time (I had 110,000 documents), and afterwards my retrieval worked. It stopped working after I tried to load the vector store from disk: collection.count() now reports only 200 documents, and similarity search returns nothing. Does anybody have a guess or idea what went wrong? There doesn't seem to be a tutorial (or documentation) that covers a vector store with more than one document.
-
This is probably very vector-store specific. It may be better to use the vector store's native methods instead of LangChain's .from_documents.
-
Now, as I increase the embeddings, I have run into this problem with FAISS.
Does anyone have an idea how to resolve this? I am considering Pinecone or another vector store. @catbears, is there any particular reason you decided on Chroma?
-
@tancs711 Of the local vector stores supported by LangChain, Chroma was first alphabetically. I also found a tutorial that worked 😄 Is FAISS easier to use?
-
Thanks for the feedback. I think my issue stems from this 'totally not a bug' feature, which I must have overlooked in the documentation: chroma-core/chroma#683. I will try FAISS as well.
-
Same problem for me using Chroma.
-
It worked for me with Chroma DB after a few corrections, and then saving to and loading from disk worked too.
I loaded our website - https://www.vaayushop.com/
all_splits = text_splitter.split_documents(data)
Now, next time, load from disk:
if os.path.exists(".//embeddings//"):
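The "load from disk if it already exists, otherwise build" pattern in this comment can be sketched in a store-agnostic way as below. `load_or_build` and its `build` argument are hypothetical names, and `pickle` stands in for whatever persistence your vector store actually provides (only unpickle files you created yourself). With Chroma, the load branch is typically just re-creating the store with the same `persist_directory`, so treat this as an illustration of the control flow rather than Chroma code.

```python
import os
import pickle


def load_or_build(path, build):
    """Load a cached object from `path`, or build it once and cache it."""
    if os.path.exists(path):
        # A previous run already paid the cost of building: reuse its result.
        with open(path, "rb") as fh:
            return pickle.load(fh)
    obj = build()  # the expensive step, e.g. embedding every document
    with open(path, "wb") as fh:
        pickle.dump(obj, fh)
    return obj
```

The expensive `build` callable runs only on the first call; later runs read the cached result from disk.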
-
Yes,
-
Hi team,
I'm creating an index using VectorstoreIndexCreator. Can anyone tell me how to save and load it locally? Re-creating the index on every run is time-consuming.
Also, how do I update the index when I get new PDF files? Do I need to rebuild from the beginning, or is there an option to update or merge?
@oddrationale do you have an answer for this?
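For the "new PDF files" part of the question, one common approach (not something the thread confirms) is to keep a small manifest of files that have already been indexed, so that only unseen files are embedded on the next run. The sketch below uses only the standard library; `find_new_files`, `mark_indexed` and the manifest file are hypothetical helpers, not LangChain APIs. The new files it returns would then be loaded, split, and added to the existing store, for example through the vector store's `add_documents` method in LangChain, before the store is saved again.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def _read_manifest(manifest_path):
    """Return the set of digests recorded so far (empty if no manifest yet)."""
    manifest_file = Path(manifest_path)
    if not manifest_file.exists():
        return set()
    return set(json.loads(manifest_file.read_text(encoding="utf-8")))


def find_new_files(folder, manifest_path, pattern="*.pdf"):
    """Return files in `folder` whose content is not yet in the manifest."""
    seen = _read_manifest(manifest_path)
    return [
        path
        for path in sorted(Path(folder).glob(pattern))
        if file_digest(path) not in seen
    ]


def mark_indexed(paths, manifest_path):
    """Record files as indexed; call this only after the store was saved."""
    seen = _read_manifest(manifest_path)
    seen.update(file_digest(p) for p in paths)
    Path(manifest_path).write_text(json.dumps(sorted(seen)), encoding="utf-8")
```

Hashing file contents rather than names means a renamed file is not re-embedded, while a file whose content changed is picked up as new.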