The Quick Start Ready Made Template Example doesn't work #8438

Closed
msanaulla opened this issue Sep 20, 2024 · 3 comments
Labels
2.x: Related to Haystack v2.0
type:bug: Something isn't working

Comments

@msanaulla

The ready-made template example given at https://haystack.deepset.ai/overview/quick-start doesn't work.

This is the code I tried:

import os
from dotenv import load_dotenv
from haystack import Pipeline, PredefinedPipeline
load_dotenv()

pipeline = Pipeline.from_template(PredefinedPipeline.CHAT_WITH_WEBSITE)
result = pipeline.run({
    "fetcher": {"urls": ["https://haystack.deepset.ai/overview/quick-start"]},
    "prompt": {"query": "Which components do I need for a RAG pipeline?"}}
)
print(result["llm"]["replies"][0])

I get the following error:

PS D:\projects\python-demo> & C:/Users/msanaulla/AppData/Local/Programs/Python/Python312/python.exe d:/projects/python-demo/haystack-demo.py
Traceback (most recent call last):
  File "C:\Users\msanaulla\AppData\Local\Programs\Python\Python312\Lib\site-packages\haystack\core\pipeline\base.py", line 186, in from_dict
    instance = component_from_dict(component_class, component_data, name, callbacks)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\msanaulla\AppData\Local\Programs\Python\Python312\Lib\site-packages\haystack\core\serialization.py", line 118, in component_from_dict
    return do_from_dict()
           ^^^^^^^^^^^^^^
  File "C:\Users\msanaulla\AppData\Local\Programs\Python\Python312\Lib\site-packages\haystack\core\serialization.py", line 113, in do_from_dict
    return cls.from_dict(data)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\msanaulla\AppData\Local\Programs\Python\Python312\Lib\site-packages\haystack\components\converters\html.py", line 67, in from_dict
    return default_from_dict(cls, data)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\msanaulla\AppData\Local\Programs\Python\Python312\Lib\site-packages\haystack\core\serialization.py", line 192, in default_from_dict
    return cls(**init_params)
           ^^^^^^^^^^^^^^^^^^
  File "C:\Users\msanaulla\AppData\Local\Programs\Python\Python312\Lib\site-packages\haystack\core\component\component.py", line 254, in __call__
    instance = super().__call__(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: HTMLToDocument.__init__() got an unexpected keyword argument 'extractor_type'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\msanaulla\AppData\Local\Programs\Python\Python312\Lib\site-packages\haystack\core\pipeline\base.py", line 849, in from_template
    return cls.loads(rendered)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\msanaulla\AppData\Local\Programs\Python\Python312\Lib\site-packages\haystack\core\pipeline\base.py", line 258, in loads
    return cls.from_dict(deserialized_data, callbacks)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\msanaulla\AppData\Local\Programs\Python\Python312\Lib\site-packages\haystack\core\pipeline\base.py", line 195, in from_dict
    raise DeserializationError(msg) from e
haystack.core.errors.DeserializationError: Couldn't deserialize component 'converter' of class 'HTMLToDocument' with the following data: {'init_parameters': {'extractor_type': 'DefaultExtractor'}, 'type': 'haystack.components.converters.html.HTMLToDocument'}. Possible reasons include malformed serialized data, mismatch between the serialized component and the loaded one (due to a breaking change, see https://github.com/deepset-ai/haystack/releases), etc.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "d:\projects\python-demo\haystack-demo.py", line 7, in <module>
    pipeline = Pipeline.from_template(PredefinedPipeline.CHAT_WITH_WEBSITE)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\msanaulla\AppData\Local\Programs\Python\Python312\Lib\site-packages\haystack\core\pipeline\base.py", line 853, in from_template
    raise PipelineUnmarshalError(msg)
haystack.core.errors.PipelineUnmarshalError: Error unmarshalling pipeline: Couldn't deserialize component 'converter' of class 'HTMLToDocument' with the following data: {'init_parameters': {'extractor_type': 'DefaultExtractor'}, 'type': 'haystack.components.converters.html.HTMLToDocument'}. Possible reasons include malformed serialized data, mismatch between the serialized component and the loaded one (due to a breaking change, see https://github.com/deepset-ai/haystack/releases), etc.
Source:
components:
  converter:
    init_parameters:
      extractor_type: DefaultExtractor
    type: haystack.components.converters.html.HTMLToDocument

  fetcher:
    init_parameters:
      raise_on_failure: true
      retry_attempts: 2
      timeout: 3
      user_agents:
      - haystack/LinkContentFetcher/2.0.0b8
    type: haystack.components.fetchers.link_content.LinkContentFetcher

  llm:
    init_parameters:
      api_base_url: null
      api_key:
        env_vars:
        - OPENAI_API_KEY
        strict: true
        type: env_var
      generation_kwargs: {}
      model: gpt-3.5-turbo
      streaming_callback: null
      system_prompt: null
    type: haystack.components.generators.openai.OpenAIGenerator

  prompt:
    init_parameters:
      template: |

        "According to the contents of this website:
        {% for document in documents %}
          {{document.content}}
        {% endfor %}
        Answer the given question: {{query}}
        Answer:
        "
    type: haystack.components.builders.prompt_builder.PromptBuilder

connections:
- receiver: converter.sources
  sender: fetcher.streams
- receiver: prompt.documents
  sender: converter.documents
- receiver: llm.prompt
  sender: prompt.prompt

metadata: {}

I am using the following Haystack version:

PS D:\projects\python-demo> pip show haystack-ai                                                                                    
Name: haystack-ai
Version: 2.5.1
Summary: LLM framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data.
Home-page: https://github.com/deepset-ai/haystack
Author:
Author-email: "deepset.ai" <[email protected]>
License:
Location: C:\Users\msanaulla\AppData\Local\Programs\Python\Python312\Lib\site-packages
Requires: haystack-experimental, jinja2, lazy-imports, more-itertools, networkx, numpy, openai, pandas, posthog, python-dateutil, pyyaml, requests, tenacity, tqdm, typing-extensions
Required-by: haystack-experimental
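
The `TypeError` in the traceback above is raised at `cls(**init_params)`: the serialized template carries an `extractor_type` key that the installed `HTMLToDocument` constructor does not accept. The following is a minimal sketch of that mechanism, not Haystack's own code. `ConverterStandIn` and `unexpected_init_params` are hypothetical names used only for illustration, and the stand-in's `encoding` parameter is an assumption, so the example runs without Haystack installed.

```python
import inspect


class ConverterStandIn:
    """Hypothetical stand-in for a component whose constructor lacks `extractor_type`."""

    def __init__(self, encoding: str = "utf-8"):
        self.encoding = encoding


def unexpected_init_params(cls, init_params):
    """Return the keys of init_params that cls's constructor would reject."""
    params = inspect.signature(cls).parameters
    # A **kwargs parameter accepts any keyword, so nothing can be unexpected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return [name for name in init_params if name not in params]


# Same init_parameters as in the serialized template from the error message.
serialized = {"extractor_type": "DefaultExtractor"}
print(unexpected_init_params(ConverterStandIn, serialized))  # ['extractor_type']

try:
    ConverterStandIn(**serialized)  # mirrors cls(**init_params) in the traceback
except TypeError as exc:
    print(f"TypeError: {exc}")
```

Running a check like this against the real component class before loading a template would show which serialized keys no longer match the installed version.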
@bilgeyucel
Contributor

Thanks for opening the issue, @msanaulla. I can reproduce the error in this Colab with Haystack 2.6.0 as well.

bilgeyucel transferred this issue from deepset-ai/haystack-home on Oct 4, 2024
bilgeyucel added the 2.x (Related to Haystack v2.0) and type:bug (Something isn't working) labels on Oct 4, 2024
@anakin87
Member

anakin87 commented Oct 4, 2024

The original error was fixed in #8401.

If you install trafilatura (specified in the Quick start), the error in the notebook disappears.

@bilgeyucel feel free to close the issue if it works.
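
Since the fix depends on which packages are installed, a quick environment check can help before re-running the Quick Start example. The sketch below uses only the standard library; `environment_report` is a hypothetical helper name, and it assumes the distributions are published as `haystack-ai` (as in the `pip show` output above) and `trafilatura`.

```python
from importlib import metadata


def environment_report(packages=("haystack-ai", "trafilatura")):
    """Map each distribution name to its installed version, or None if it is missing."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


for name, version in environment_report().items():
    # A missing trafilatura is the case described in the comment above.
    print(f"{name}: {version if version is not None else 'not installed'}")
```

If `trafilatura` is reported as not installed, installing it with pip and re-running the example is the step suggested in the comment above.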

@bilgeyucel
Contributor

Yes, that fixes the issue! Thank you @anakin87!
