How to format streaming response with sources. #23751
Unanswered
levalencia
asked this question in Q&A

Replies: 1 comment
To reformat the streaming response so that it returns only the metadata for the sources and streams the answer without the JSON wrapper, you can modify the streaming logic to extract and yield only the relevant parts. Here is an example of how you can achieve this:

```python
from langchain_openai import AzureChatOpenAI
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

# SHD_* constants, setup_embeddings, setup_vector_store, ArgenxUserRetrieverPolicy2
# and ANSWER_PROMPT_POLICY are assumed to be defined elsewhere in the application.


async def policy_gpt_chain_lcel(prompt, username, inputs, outputs):
    async def llm_thread_gpt_lcel(prompt, username, inputs, outputs):
        llm = AzureChatOpenAI(
            azure_endpoint=SHD_AZURE_OPENAI_ENDPOINT,
            openai_api_version="2023-03-15-preview",
            deployment_name=SHD_OPENAI_DEPLOYMENT_NAME,
            openai_api_key=SHD_OPENAI_KEY,
            openai_api_type="Azure",
            model_name=SHD_OPENAI_GPT_MODEL_NAME,
            streaming=True,
            temperature=0,
        )
        embeddings = setup_embeddings()
        vector_store = setup_vector_store(embeddings)
        retriever = ArgenxUserRetrieverPolicy2(vectorstore=vector_store, username=username)

        # Rebuild the chat history from previous turns
        demo_ephemeral_chat_history = ChatMessageHistory()
        for i in range(len(inputs)):
            demo_ephemeral_chat_history.add_user_message(inputs[i])
            demo_ephemeral_chat_history.add_ai_message(outputs[i])

        contextualize_q_system_prompt = (
            "Given a chat history and the latest user question "
            "which might reference context in the chat history, "
            "formulate a standalone question which can be understood "
            "without the chat history. Do NOT answer the question, "
            "just reformulate it if needed and otherwise return it as is."
        )
        contextualize_q_prompt = ChatPromptTemplate.from_messages(
            [
                ("system", contextualize_q_system_prompt),
                MessagesPlaceholder("chat_history"),
                ("human", "{input}"),
            ]
        )
        history_aware_retriever = create_history_aware_retriever(llm, retriever, contextualize_q_prompt)

        qa_prompt = ChatPromptTemplate.from_messages(
            [
                ("system", ANSWER_PROMPT_POLICY),
                MessagesPlaceholder("chat_history"),
                ("human", "{input}"),
            ]
        )
        question_answer_chain = create_stuff_documents_chain(llm, qa_prompt) | StrOutputParser()
        rag_chain_from_docs = create_retrieval_chain(history_aware_retriever, question_answer_chain)

        conversational_rag_chain = RunnableWithMessageHistory(
            rag_chain_from_docs,
            lambda session_id: demo_ephemeral_chat_history,
            input_messages_key="input",
            history_messages_key="chat_history",
            output_messages_key="answer",
        )

        async for chunk in conversational_rag_chain.astream(
            {"input": prompt},
            config={"configurable": {"session_id": "unbound"}},
        ):
            if 'context' in chunk:
                # Extract and print only the metadata of the source documents
                metadata = [doc.metadata for doc in chunk['context']]
                print(f"Sources Metadata: {metadata}")
            if 'answer' in chunk:
                # Stream the answer text without the JSON wrapper
                print(chunk['answer'], end="", flush=True)
                yield chunk['answer'].encode()

    async for chunk in llm_thread_gpt_lcel(prompt, username, inputs, outputs):
        yield chunk
```

This code will print the metadata for the sources when it encounters the 'context' key and stream the answer directly when it encounters the 'answer' key, without the JSON format [1].
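One common way to consume this generator is to return it from a streaming HTTP endpoint. Below is a minimal sketch assuming FastAPI; the `ChatRequest` model and `/chat` route are illustrative names, not part of the original code:

```python
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    prompt: str
    username: str
    inputs: list[str] = []
    outputs: list[str] = []

@app.post("/chat")
async def chat(req: ChatRequest):
    # policy_gpt_chain_lcel is an async generator yielding encoded answer chunks,
    # so it can be passed directly to StreamingResponse.
    generator = policy_gpt_chain_lcel(req.prompt, req.username, req.inputs, req.outputs)
    return StreamingResponse(generator, media_type="text/plain")
```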
Description
The code works perfectly fine, and I noticed that it streams the sources as the first chunk in a specific format:

```
b"{'context': [Document(page_content='', metadata={'file_id': 'File_0000', 'source': 'File.pdf_someid_page=7', 'page_number': '7', 'document_title': 'File.pdf', 'document_owning_department': 'N/A', 'manual': True}),]}"
```

and then it streams the answer in chunks with the following format:

```
b"{'answer': ''}"
b"{'answer': '<b'}"
b"{'answer': '>'}"
```

I would like to reformat the sources to return only the metadata, not the entire chunk content, and for the answer I would like to stream only the chunk text without the JSON. Is that possible?
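For illustration, the filtering described above amounts to picking chunks by key; here is a minimal sketch, assuming a `rag_chain` built with `create_retrieval_chain` (the `stream_answer_with_sources` helper and its JSON framing of the sources are assumptions, not existing code):

```python
import json

async def stream_answer_with_sources(rag_chain, question: str):
    # rag_chain.astream() yields dict chunks such as {'context': [...]} and {'answer': '...'}
    async for chunk in rag_chain.astream({"input": question}):
        if "context" in chunk:
            # Keep only the metadata of the retrieved documents
            sources = [doc.metadata for doc in chunk["context"]]
            # Emit the sources once, as a single JSON line
            yield json.dumps({"sources": sources}, default=str).encode() + b"\n"
        if "answer" in chunk:
            # Emit the raw answer text with no JSON wrapper
            yield chunk["answer"].encode()
```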