How to format streaming response with sources. #23751
Unanswered
levalencia
asked this question in Q&A

Replies: 1 comment
To reformat the streaming response so that it returns only the metadata for the sources and streams the answer without the JSON wrapper, you can modify the streaming logic to extract and yield only the relevant parts. Here is an example of how you can achieve this:

```python
from langchain_openai import AzureChatOpenAI
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

# SHD_* constants, setup_embeddings, setup_vector_store, ArgenxUserRetrieverPolicy2
# and ANSWER_PROMPT_POLICY are assumed to be defined elsewhere in the application.


async def policy_gpt_chain_lcel(prompt, username, inputs, outputs):
    async def llm_thread_gpt_lcel(prompt, username, inputs, outputs):
        llm = AzureChatOpenAI(
            azure_endpoint=SHD_AZURE_OPENAI_ENDPOINT,
            openai_api_version="2023-03-15-preview",
            deployment_name=SHD_OPENAI_DEPLOYMENT_NAME,
            openai_api_key=SHD_OPENAI_KEY,
            openai_api_type="Azure",
            model_name=SHD_OPENAI_GPT_MODEL_NAME,
            streaming=True,
            temperature=0,
        )
        embeddings = setup_embeddings()
        vector_store = setup_vector_store(embeddings)
        retriever = ArgenxUserRetrieverPolicy2(vectorstore=vector_store, username=username)

        # Rebuild the chat history from previous turns
        demo_ephemeral_chat_history = ChatMessageHistory()
        for i in range(len(inputs)):
            demo_ephemeral_chat_history.add_user_message(inputs[i])
            demo_ephemeral_chat_history.add_ai_message(outputs[i])

        contextualize_q_system_prompt = (
            "Given a chat history and the latest user question "
            "which might reference context in the chat history, "
            "formulate a standalone question which can be understood "
            "without the chat history. Do NOT answer the question, "
            "just reformulate it if needed and otherwise return it as is."
        )
        contextualize_q_prompt = ChatPromptTemplate.from_messages(
            [
                ("system", contextualize_q_system_prompt),
                MessagesPlaceholder("chat_history"),
                ("human", "{input}"),
            ]
        )
        history_aware_retriever = create_history_aware_retriever(llm, retriever, contextualize_q_prompt)

        qa_prompt = ChatPromptTemplate.from_messages(
            [
                ("system", ANSWER_PROMPT_POLICY),
                MessagesPlaceholder("chat_history"),
                ("human", "{input}"),
            ]
        )
        question_answer_chain = create_stuff_documents_chain(llm, qa_prompt) | StrOutputParser()
        rag_chain_from_docs = create_retrieval_chain(history_aware_retriever, question_answer_chain)

        conversational_rag_chain = RunnableWithMessageHistory(
            rag_chain_from_docs,
            lambda session_id: demo_ephemeral_chat_history,
            input_messages_key="input",
            history_messages_key="chat_history",
            output_messages_key="answer",
        )

        async for chunk in conversational_rag_chain.astream(
            {"input": prompt},
            config={"configurable": {"session_id": "unbound"}},
        ):
            if 'context' in chunk:
                # Extract and print only the metadata of the source documents
                metadata = [doc.metadata for doc in chunk['context']]
                print(f"Sources Metadata: {metadata}")
            if 'answer' in chunk:
                # Stream the answer text without the JSON wrapper
                print(chunk['answer'], end="", flush=True)
                yield chunk['answer'].encode()

    async for chunk in llm_thread_gpt_lcel(prompt, username, inputs, outputs):
        yield chunk
```

This code will print the metadata for the sources when it encounters the 'context' key and stream the answer directly when it encounters the 'answer' key, without the JSON format [1].
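One common way to consume this generator is to return it from a streaming HTTP endpoint. Below is a minimal sketch assuming FastAPI; the `ChatRequest` model and `/chat` route are illustrative names, not part of the original code:

```python
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    prompt: str
    username: str
    inputs: list[str] = []
    outputs: list[str] = []

@app.post("/chat")
async def chat(req: ChatRequest):
    # policy_gpt_chain_lcel is an async generator yielding encoded answer chunks,
    # so it can be passed directly to StreamingResponse.
    generator = policy_gpt_chain_lcel(req.prompt, req.username, req.inputs, req.outputs)
    return StreamingResponse(generator, media_type="text/plain")
```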
Description
The code works perfectly fine, and I noticed that it streams the sources as the first chunk in a specific format:

```
b"{'context': [Document(page_content='', metadata={'file_id': 'File_0000', 'source': 'File.pdf_someid_page=7', 'page_number': '7', 'document_title': 'File.pdf', 'document_owning_department': 'N/A', 'manual': True}),]}"
```

and then it streams the answer in chunks with the following format:

```
b"{'answer': ''}"
b"{'answer': '<b'}"
b"{'answer': '>'}"
```

I would like to reformat the sources to return only the metadata, not the entire chunk content, and for the answer I would like to stream only the chunk text without the JSON. Is that possible?
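For illustration, the filtering described above amounts to picking chunks by key; here is a minimal sketch, assuming a `rag_chain` built with `create_retrieval_chain` (the `stream_answer_with_sources` helper and its JSON framing of the sources are assumptions, not existing code):

```python
import json

async def stream_answer_with_sources(rag_chain, question: str):
    # rag_chain.astream() yields dict chunks such as {'context': [...]} and {'answer': '...'}
    async for chunk in rag_chain.astream({"input": question}):
        if "context" in chunk:
            # Keep only the metadata of the retrieved documents
            sources = [doc.metadata for doc in chunk["context"]]
            # Emit the sources once, as a single JSON line
            yield json.dumps({"sources": sources}, default=str).encode() + b"\n"
        if "answer" in chunk:
            # Emit the raw answer text with no JSON wrapper
            yield chunk["answer"].encode()
```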