Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add DynamicPromptBuilder to Haystack 2.x #6328

Merged
merged 16 commits into from
Nov 23, 2023

Conversation

vblagoje
Copy link
Member

@vblagoje vblagoje commented Nov 17, 2023

Why:

This pull request introduces the DynamicPromptBuilder component. This new prompt builder enables dynamic and parametrized prompting, significantly enhancing the flexibility and functionality of prompt generation within the haystack pipeline.

What:

The following key changes have been made in this PR:

  1. DynamicPromptBuilder Implementation: The main addition is DynamicPromptBuilder class in dynamic_prompt_builder.py. This class leverages Jinja2 templating to dynamically generate prompts based on either a list of ChatMessage instances or a string template. It can operate in chat or non-chat mode and supports injecting variables resolved at runtime.
  2. init.py Update: Modified to include DynamicPromptBuilder in the module's __all__ list, making it accessible as part of the builders module.
  3. Unit Tests: Added comprehensive tests in test_dynamic_prompt_builder.py to ensure the correct functioning of this new component under various scenarios.
  4. Release Notes: A release note has been added to document the introduction of this new feature.

How can it be used:

Developers can utilize DynamicPromptBuilder in the haystack pipeline to create rich prompting scenarios where input variables are injected from the pipeline runtime and the user's pipeline run invocation. For some common use case scenarios for both chat and non-chat scenarios, see this colab

How did you test it:

Testing was performed using unit tests located in test/preview/components/builders/test_dynamic_prompt_builder.py. These tests cover various scenarios, including initialization of the builder, processing simple templates, handling chat messages, and validation of template variables. Manual tests were completed with the colab above.

Notes for the reviewer:

  • Review of Jinja2 Usage: Special attention should be given to the integration with Jinja2.
  • Validation of Runtime Variables: Since this component heavily relies on runtime variables, it would be beneficial to review the variable handling and error messages to ensure they are clear and helpful.
  • Colab Experiments: Manually test the components using the provided Colab link mentioned earlier. Check how they perform and determine if they meet your requirements for various prompting scenarios.

@vblagoje vblagoje requested review from a team as code owners November 17, 2023 09:55
@vblagoje vblagoje requested review from silvanocerza and removed request for a team November 17, 2023 09:55
@github-actions github-actions bot added topic:tests 2.x Related to Haystack v2.0 type:documentation Improvements on the docs labels Nov 17, 2023
@dfokina
Copy link
Contributor

dfokina commented Nov 20, 2023

Hey @vblagoje , I added the component to API reference and updated some docstrings :)

Comment on lines 152 to 154
DynamicPromptBuilder in the pipeline, these variable names can be different. For example, if your component
connected to the DynamicPromptBuilder has an output named `documents`, the `expected_runtime_variables` should
contain `documents` as one of its values. The values associated with variables from the pipeline runtime are
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part sounds really strange to me. It seems to imply that the name of variables in the template must be identical to the output name of the senders components, canals is not supposed to work like that. 🤔

For example, if your component connected to the DynamicPromptBuilder has an output named documents, the expected_runtime_variables should contain documents as one of its values.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, hmm, how would you phrase it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like this maybe.

For example, if expected_runtime_variables contains documents your instance of DynamicPromptBuilder will expect an input called documents.

I would probably remove this part too, the name of input really doesn't depend on the connected components.

Depending on the components connected to the DynamicPromptBuilder in the pipeline, these variable names can be different.

Copy link
Contributor

@silvanocerza silvanocerza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's something funky going on I think.
I tested the component with this snippet but it raises a ValueError at line 245.
I would have expected this to work though. 🤔

from haystack.preview.components.builders.dynamic_prompt_builder import DynamicPromptBuilder
from haystack.preview import component, Pipeline
from typing import List
from haystack.preview.dataclasses import Document


@component
class DocumentProducer:
    @component.output_types(documents=List[Document])
    def run(self, doc_input: str):
        return {"documents": [Document(content=doc_input)]}


pipe = Pipeline()
producer = DocumentProducer()
builder = DynamicPromptBuilder(expected_runtime_variables=["d1"], chat_mode=False)
pipe.add_component("p1", producer)
pipe.add_component("builder", builder)
pipe.connect("p1.documents", "builder.d1")
pipe.run(data={"p1": {"doc_input": "hey hey"}, "builder": {"prompt_source": {"This is what I received: {{d1}}"}}})

@vblagoje
Copy link
Member Author

@silvanocerza You have one extra dictionary for prompt_source that is unnecessary and got the ValueError complaining about the type received. Just remove that one level of the dictionary, and it works.

@silvanocerza
Copy link
Contributor

silvanocerza commented Nov 21, 2023

@silvanocerza You have one extra dictionary for prompt_source that is unnecessary and got the ValueError complaining about the type received. Just remove that one level of the dictionary, and it works.

Right, it works now. 🤦‍♀️


"""

def __init__(self, expected_runtime_variables: Optional[List[str]] = None, chat_mode: Optional[bool] = True):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would change expected_runtime_variables to just runtime_variables. It seems a bit redundant to prefix it with expected_, what do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with both, IIRC correctly @masci and I talked about expected_variables or expected_runtime_variables I didn't write it down tbh.

@vblagoje
Copy link
Member Author

@silvanocerza a small favor 🙏 Have another look now and coordinate with Massi if needed regarding that variable name. Also please adjust the wording in the pydoc regarding runtime variables.

Copy link
Contributor

@silvanocerza silvanocerza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the changes we talked about, we can merge when CI is green. 👍

@coveralls
Copy link
Collaborator

Pull Request Test Coverage Report for Build 6968334557

Warning: This coverage report may be inaccurate.

We've detected an issue with your CI configuration that might affect the accuracy of this pull request's coverage report.
To ensure accuracy in future PRs, please see these guidelines.
A quick fix for this PR: rebase it; your next report should be accurate.

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 567 unchanged lines in 18 files lost coverage.
  • Overall coverage decreased (-0.4%) to 40.065%

Files with Coverage Reduction New Missed Lines %
utils/context_matching.py 1 95.7%
preview/components/builders/init.py 2 0.0%
preview/components/routers/init.py 2 0.0%
preview/document_stores/decorator.py 6 0.0%
nodes/prompt/prompt_model.py 8 81.82%
preview/components/generators/chat/openai.py 9 0.0%
utils/openai_utils.py 14 73.5%
preview/document_stores/protocols.py 15 0.0%
nodes/prompt/invocation_layer/open_ai.py 16 69.91%
nodes/prompt/invocation_layer/chatgpt.py 19 55.79%
Totals Coverage Status
Change from base Build 6902028938: -0.4%
Covered Lines: 10622
Relevant Lines: 26512

💛 - Coveralls

@silvanocerza silvanocerza merged commit e04a1f1 into main Nov 23, 2023
69 checks passed
@silvanocerza silvanocerza deleted the dynamic_prompt_builder branch November 23, 2023 10:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants