Learning Resources:
https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/
- Retrieval-Augmented Generation (RAG)
- Problems it tries to solve:
- When it comes to LLMs, there are two common problems: no sources (the training data is private or lacks certain domain knowledge) and outdated data. An LLM's parameters essentially represent the general patterns of how humans use words to form sentences.
- An LLM is a very generic model trained on broad human text, so when deeper knowledge about a specific field is needed, it tends to perform less well and can even make things up (hallucinate).
- How it works
- We have an external knowledge base stored in a vector db.
- When a user passes a prompt, it is first embedded into a numerical vector and sent to the vector db to find the most similar entries.
- The retrieved entries are transformed back into human-readable sentences and passed to the LLM along with the original prompt.
- Finally, the LLM combines the retrieved words and its own response to the query into a final answer it presents to the user, potentially citing sources the embedding model found (see the sketches after this list).
- LangChain, an open-source library, can be particularly useful in chaining together LLMs, embedding models and knowledge bases.
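
The retrieve-then-generate loop above is easy to see in code. Below is a minimal sketch in plain Python with NumPy; the `embed()` function is a hypothetical stand-in for a real embedding model (e.g. sentence-transformers or an embeddings API), and the documents are made-up examples.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical placeholder: deterministic random vector per text,
    # used only so the sketch runs end to end.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)

# 1. External knowledge base: documents stored alongside their vectors.
documents = [
    "RAG grounds LLM answers in an external knowledge base.",
    "Vector databases index embeddings for similarity search.",
    "LLMs can hallucinate when asked about niche domains.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def cosine_similarity(query_vec: np.ndarray, matrix: np.ndarray) -> np.ndarray:
    return (matrix @ query_vec) / (
        np.linalg.norm(matrix, axis=1) * np.linalg.norm(query_vec)
    )

def retrieve(query: str, k: int = 2) -> list[str]:
    # 2. Embed the user prompt and find the most similar entries.
    scores = cosine_similarity(embed(query), doc_vectors)
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

def build_prompt(query: str) -> str:
    # 3. Splice the retrieved passages into the prompt so the LLM can
    #    combine them with its own response (and cite them as sources).
    context = "\n".join(retrieve(query))
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("Why do LLMs make things up?"))
```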
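
LangChain wires these same pieces together with far less glue code. The sketch below uses the classic `RetrievalQA` chain with FAISS as the vector store; module paths and class names have moved between LangChain releases, so treat this as illustrative of the older API rather than a definitive recipe.

```python
from langchain.llms import OpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA

# Build the knowledge base: embed documents and index them with FAISS.
db = FAISS.from_texts(
    ["RAG grounds LLM answers in an external knowledge base."],
    OpenAIEmbeddings(),
)

# Chain the LLM, the embedding model, and the vector store together.
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",  # stuff retrieved docs directly into the prompt
    retriever=db.as_retriever(),
)
print(qa.run("What problem does RAG solve?"))
```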
- Advantages
- Up-to-date knowledge: it is much cheaper to refresh the external knowledge base than to retrain the whole LLM.
- Gives the LLM context, so it generates more accurate answers and makes things up less.