feat: extend PromptBuilder and deprecate DynamicPromptBuilder #7655
Conversation
We decided to go with approach B.
I've removed all breaking changes.
@bilgeyucel let me try to answer but I'll leave it to @tstadel to confirm.
Almost:
Ok I think I understand what's going on. Let me explain and you tell me if I'm correct, followed by some thoughts:
What I am worried about:
Please educate me here though; maybe I'm misunderstanding something.
Fortunately no :-)
Additionally, looking at the code, I also see some inconsistencies that we shouldn't have if we must have
Ok so:
No, we need `variables` so the PromptBuilder can accept data from other pipeline components. The other ones are provided by the user at run time.
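To make that split concrete, here is a minimal sketch: `header` is fed by another pipeline component through an input slot declared via `variables`, while `query` is supplied by the user at run time. The `HeaderProvider` component below is hypothetical and only exists for illustration:

```python
from haystack import Pipeline, component
from haystack.components.builders import PromptBuilder

# Hypothetical component that turns a topic into a prompt header.
@component
class HeaderProvider:
    @component.output_types(header=str)
    def run(self, topic: str):
        return {"header": f"You are answering questions about {topic}."}

template = """
{{ header }}
Question: {{ query }}
Answer:
"""

pipe = Pipeline()
pipe.add_component("header_provider", HeaderProvider())
# Declaring `variables` creates input slots that other components can feed.
pipe.add_component("prompt_builder", PromptBuilder(template=template, variables=["query", "header"]))
pipe.connect("header_provider.header", "prompt_builder.header")

# `header` arrives from the pipeline; `query` is provided by the user at run time.
result = pipe.run({
    "header_provider": {"topic": "ancient wonders"},
    "prompt_builder": {"query": "What does Rhodes Statue look like?"},
})
print(result["prompt_builder"]["prompt"])
```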
@TuanaCelik @bilgeyucel @vblagoje Let's walk through @bilgeyucel's example:

```python
from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever

# document_store is assumed to be an already populated InMemoryDocumentStore

template = """
Given the following information, answer the question.
Context:
{% for document in documents %}
{{ document.content }}
{% endfor %}
Question: {{query}}
Answer:
"""

basic_rag_pipeline = Pipeline()
basic_rag_pipeline.add_component("retriever", InMemoryBM25Retriever(document_store))
basic_rag_pipeline.add_component("prompt_builder", PromptBuilder(template=template))
basic_rag_pipeline.add_component("llm", OpenAIGenerator(model="gpt-3.5-turbo"))
basic_rag_pipeline.connect("retriever", "prompt_builder.documents")
basic_rag_pipeline.connect("prompt_builder", "llm")

query = "What does Rhodes Statue look like?"
response = basic_rag_pipeline.run({
    "retriever": {"query": query},
    "prompt_builder": {"query": query}
})
```

Here the input slots `documents` and `query` are inferred from `template`.
Now let's change the template at run time:

```python
fancy_template = """
This is a super fancy dynamic template:
Documents:
{% for document in documents %}
Document {{ document.id }}
{{ document.content }}
{% endfor %}
Question: {{query}}
Answer:
"""

query = "What does Rhodes Statue look like?"
response = basic_rag_pipeline.run({
    "retriever": {"query": query},
    "prompt_builder": {"query": query, "template": fancy_template}
})
```

This will work seamlessly as we use the same input slots: `documents` and `query`.
Now there are two more cases for dynamic templates.

Case A) A template that uses only some of the input slots:

```python
query_only_template = """
Question: {{query}}
Answer:
"""

query = "What does Rhodes Statue look like?"
response = basic_rag_pipeline.run({
    "retriever": {"query": query},
    "prompt_builder": {"query": query, "template": query_only_template}
})
```

This will also work seamlessly as all template variables (i.e. `query`) are covered by the existing input slots.

Case B) A template that introduces a new variable:

```python
even_fancier_template = """
{{ header }}
Given the following information, answer the question.
Context:
{% for document in documents %}
{{ document.content }}
{% endfor %}
Question: {{query}}
Answer:
"""

query = "What does Rhodes Statue look like?"
response = basic_rag_pipeline.run({
    "retriever": {"query": query},
    "prompt_builder": {"query": query, "template": even_fancier_template}
})
```

Note that the passed template uses three variables: `documents`, `query` and `header`.
The first two are covered by input slots, but the third (`header`) is not. There are two ways to provide it:

Case B1) Pass it via `template_variables` at run time:

```python
even_fancier_template = """
{{ header }}
Given the following information, answer the question.
Context:
{% for document in documents %}
{{ document.content }}
{% endfor %}
Question: {{query}}
Answer:
"""

query = "What does Rhodes Statue look like?"
header = "This is my header"
response = basic_rag_pipeline.run({
    "retriever": {"query": query},
    "prompt_builder": {"query": query, "template": even_fancier_template, "template_variables": {"header": header}}
})
```

Case B2) Declare `header` as an additional variable when creating the `PromptBuilder`:

```python
from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever

# document_store is assumed to be an already populated InMemoryDocumentStore
template = """
Given the following information, answer the question.
Context:
{% for document in documents %}
{{ document.content }}
{% endfor %}
Question: {{query}}
Answer:
"""
basic_rag_pipeline = Pipeline()
basic_rag_pipeline.add_component("retriever", InMemoryBM25Retriever(document_store))
basic_rag_pipeline.add_component("prompt_builder", prompt_builder = PromptBuilder(template=template, variables=["query", "documents", "header"]))
basic_rag_pipeline.add_component("llm", OpenAIGenerator(model="gpt-3.5-turbo"))
basic_rag_pipeline.connect("retriever", "prompt_builder.documents")
basic_rag_pipeline.connect("prompt_builder", "llm")
even_fancier_template = """
{{ header }}
Given the following information, answer the question.
Context:
{% for document in documents %}
{{ document.content }}
{% endfor %}
Question: {{query}}
Answer:
"""
query = "What does Rhodes Statue look like?"
headers = "This is my header"
response = basic_rag_pipeline.run({
"retriever": {"query": query},
"prompt_builder": {"query": query, "template": even_fancier_template, "header": header}
}) Note, that
Hence, we can pass |
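For completeness, a minimal standalone sketch (outside a pipeline) comparing the two cases; it assumes the `run` signature used throughout this thread, i.e. optional `template`/`template_variables` arguments plus the template variables as keyword arguments:

```python
from haystack import Document
from haystack.components.builders import PromptBuilder

even_fancier_template = """
{{ header }}
Context:
{% for document in documents %}
{{ document.content }}
{% endfor %}
Question: {{query}}
Answer:
"""

docs = [Document(content="The Colossus of Rhodes was a bronze statue.")]
query = "What does Rhodes Statue look like?"
header = "This is my header"

# Case B1: `header` supplied through `template_variables`.
b1 = PromptBuilder(template=even_fancier_template)
prompt_b1 = b1.run(documents=docs, query=query,
                   template_variables={"header": header})["prompt"]

# Case B2: `header` declared as a variable, so it becomes a regular input.
b2 = PromptBuilder(template=even_fancier_template,
                   variables=["query", "documents", "header"])
prompt_b2 = b2.run(documents=docs, query=query, header=header)["prompt"]

assert prompt_b1 == prompt_b2  # both approaches render the same prompt
```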
@vblagoje please don't forget that
@TuanaCelik
@vblagoje

Changing the template at runtime (Prompt Engineering)

```python
from haystack import Document

# `p` is a pipeline containing the PromptBuilder; `question` holds the user query.
documents = [
    Document(content="Joe lives in Berlin", meta={"name": "doc1"}),
    Document(content="Joe is a software engineer", meta={"name": "doc1"}),
]

new_template = """
You are a helpful assistant.
Given these documents, answer the question.
Documents:
{% for doc in documents %}
Document {{ loop.index }}:
Document name: {{ doc.meta['name'] }}
{{ doc.content }}
{% endfor %}
Question: {{ query }}
Answer:
"""

p.run({
    "prompt_builder": {
        "documents": documents,
        "query": question,
        "template": new_template,
    },
})
```

If you want to use different variables during prompt engineering than in the default template, you can do so by setting the `variables` init parameter accordingly.

Overwriting variables at runtime

In case you want to overwrite the values of variables, you can use `template_variables` during runtime:

```python
language_template = """
You are a helpful assistant.
Given these documents, answer the question.
Documents:
{% for doc in documents %}
Document {{ loop.index }}:
Document name: {{ doc.meta['name'] }}
{{ doc.content }}
{% endfor %}
Question: {{ query }}
Please provide your answer in {{ answer_language | default('English') }}
Answer:
"""
p.run({
    "prompt_builder": {
        "documents": documents,
        "query": question,
        "template": language_template,
        "template_variables": {"answer_language": "German"},
    },
})
```

Note that `answer_language` is not bound to any pipeline variable; if it were not set via `template_variables`, it would fall back to its default value 'English'.
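As a quick standalone illustration of that fallback, here is a minimal sketch with a trimmed-down version of the template above (the question text is made up):

```python
from haystack.components.builders import PromptBuilder

# A shortened template, assumed here for brevity.
language_template = """
Question: {{ query }}
Please provide your answer in {{ answer_language | default('English') }}
Answer:
"""

builder = PromptBuilder(template=language_template)

# Without `template_variables`, the Jinja `default` filter kicks in -> 'English'.
default_prompt = builder.run(query="Where does Joe live?")["prompt"]

# With `template_variables`, the supplied value wins -> 'German'.
german_prompt = builder.run(
    query="Where does Joe live?",
    template_variables={"answer_language": "German"},
)["prompt"]

assert "English" in default_prompt and "German" in german_prompt
```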
I also very much like this user-perspective-driven documentation rather than what I first suggested, and would even merge this straight into main. But let's proceed with what we all agree on. Shall we use the above user-perspective description in the class pydocs as well, @tstadel?
@vblagoje Yes, why not. I can update it.
@vblagoje pydocs have been updated.
Related Issues
Currently we cannot have both:
- a simple `PromptBuilder` interface that infers its inputs from the template, and
- the ability to exchange the template at query time, as `DynamicPromptBuilder` allows.

There are two options:
A) extend `DynamicPromptBuilder` and leave `PromptBuilder` as is
B) extend `PromptBuilder` and deprecate `DynamicPromptBuilder`
Edit 07.05.: We decided to go with B
This is Option B
See #7652 for Option A
Proposed Changes:
This extends `PromptBuilder` so that the prompt template can be changed at query time.
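At its core, the change means `run` also accepts a `template` override, as in this minimal sketch mirroring the examples discussed above:

```python
from haystack.components.builders import PromptBuilder

builder = PromptBuilder(template="Question: {{ query }}\nAnswer:")

# Default template, as before.
print(builder.run(query="What does Rhodes Statue look like?")["prompt"])

# Same builder, different prompt at query time via the `template` run parameter.
print(builder.run(
    query="What does Rhodes Statue look like?",
    template="Answer briefly.\nQuestion: {{ query }}\nAnswer:",
)["prompt"])
```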
How did you test it?
Notes for the reviewer
- There is a breaking change: the `required_variables` param has been changed to `optional_variables`, as most template variables are required anyway. We can undo that if necessary.
- `DynamicPromptBuilder` is being deprecated.
- The `Chat` counterpart to `PromptBuilder` is implemented in feat: add ChatPromptBuilder, deprecate DynamicChatPromptBuilder #7663.