Replies: 2 comments 2 replies
-
🤖 Hi @XariZaru! I'm Dosu, a friendly bot here to assist while we're waiting for a human maintainer. I'm capable of helping you solve bugs, answering your questions, and guiding you to become a contributor. Let's work on your issue together! In LangChain, you can use the However, LangChain does not provide a built-in way to filter documents based on their metadata and date of effectiveness. You would need to implement this functionality yourself. You could do this by adding a filtering step in the Here is a simplified example of how you could implement this: def rank_fusion(
self,
query: str,
run_manager: CallbackManagerForRetrieverRun,
*,
config: Optional[RunnableConfig] = None,
) -> List[Document]:
# Get the results of all retrievers.
retriever_docs = [
retriever.invoke(
query,
patch_config(
config, callbacks=run_manager.get_child(tag=f"retriever_{i+1}")
),
)
for i, retriever in enumerate(self.retrievers)
]
# Filter documents based on metadata and date of effectiveness
for i in range(len(retriever_docs)):
retriever_docs[i] = [
doc for doc in retriever_docs[i]
if doc.metadata['date_of_effectiveness'] <= datetime.now()
and doc.metadata['other_required_metadata'] == 'required_value'
]
# Enforce that retrieved docs are Documents for each list in retriever_docs
for i in range(len(retriever_docs)):
retriever_docs[i] = [
Document(page_content=doc) if not isinstance(doc, Document) else doc
for doc in retriever_docs[i]
]
# apply rank fusion
fused_documents = self.weighted_reciprocal_rank(retriever_docs)
return fused_documents Please note that this is a simplified example and you would need to adjust it to fit your specific requirements. You can find more information about the As for parsing and extracting metadata from documents, such as the date of effectiveness from the filename, LangChain does not seem to have built-in support for this. You would need to implement this functionality yourself. I recommend using a regular expression or a similar method to extract the date of effectiveness from the filename, and then storing this information in the document's metadata. I hope this helps! If you have any further questions, please let me know. SourcesThis response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot. |
Beta Was this translation helpful? Give feedback.
-
Can additional insight from a contributor and things they may have done to get the latest document in a chain? I am not sure if @dosu-bot can answer this. I do have restrictions on the fact that I cannot edit any of the original functions in Langchain. For example, this is how the document data looks like:
|
Beta Was this translation helpful? Give feedback.
-
I know this is a tricky one but basically I have documents that are similar to each other. They are only current for specific dates. For example, one document may only be effective for March 2023. The other effective for April 2023.
They are not identical documents. There are changes to them. What is the best practice in Langchain to look at these documents based on metadata and return the latest out of the bunch? Here is an example.
I have a list of files:
They all contain similar type of information (but not identical). If a user query was something similar to:
"Grab me the latest rates for an auto loan."
It should find the latest rate documents and grab the information out of it.
What is the best way to do this in Langchain?
@dosu-bot
Beta Was this translation helpful? Give feedback.
All reactions