Add LLM RAG example #89
Conversation
Check out this pull request on ReviewNB to see visual diffs & provide feedback on Jupyter Notebooks.
You didn't need to change ALL the licences, just the one for your project :-) Other than that, I've left some reviews.
llm-agents/src/requirements.txt
Outdated
langchain==0.0.263
openai==0.27.2
just for fun, can we try and use the latest versions to see what happens?
llm-agents/src/requirements.txt
Outdated
@@ -1,7 +1,5 @@
langchain==0.0.263
openai==0.27.2
slack-bolt==1.16.2
slack-sdk==3.20.0
zenml==0.43.0
we should upgrade to 0.55.0
llm-agents/src/steps/url_scraper.py
Outdated
# examples_readme_urls = get_nested_readme_urls(repo_url)
# docs_urls = get_all_pages(docs_url)
# website_urls = get_all_pages(website_url)
# all_urls = docs_urls + website_urls + [release_notes_url]
Remove
I wanted to keep this as an option for people to include a greater number of documents as data sources for the vector store.
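(For reference, a minimal sketch of what re-enabling those sources could look like — the helper names come from the commented-out lines above, but the import path and exact signatures are assumptions:)

```python
# Sketch only: widen the scraped URL set before building the vector store.
# Assumes get_nested_readme_urls / get_all_pages return lists of URL strings
# and live in the project's scraping utils module (the path is a guess).
from steps.url_scraping_utils import get_all_pages, get_nested_readme_urls


def collect_all_urls(
    repo_url: str, docs_url: str, website_url: str, release_notes_url: str
) -> list[str]:
    examples_readme_urls = get_nested_readme_urls(repo_url)
    docs_urls = get_all_pages(docs_url)
    website_urls = get_all_pages(website_url)
    all_urls = examples_readme_urls + docs_urls + website_urls + [release_notes_url]
    # Deduplicate while preserving order so the vector store isn't fed duplicates.
    return list(dict.fromkeys(all_urls))
```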
Why do we need this? All objects get pickled automatically.
AFAIR, the cloudpickle used in the default implementation didn't work for the AgentExecutor class, so I implemented this pickle materializer instead.
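(Roughly what that materializer does — a minimal sketch assuming ZenML's BaseMaterializer load/save interface from the 0.4x releases and that AgentExecutor round-trips through the stdlib pickle module; the class and file names are illustrative, not the exact ones in the PR:)

```python
import os
import pickle
from typing import Any, Type

from langchain.agents import AgentExecutor
from zenml.enums import ArtifactType
from zenml.io import fileio
from zenml.materializers.base_materializer import BaseMaterializer


class AgentExecutorMaterializer(BaseMaterializer):
    """Persist the AgentExecutor with plain pickle instead of the default cloudpickle."""

    ASSOCIATED_TYPES = (AgentExecutor,)
    ASSOCIATED_ARTIFACT_TYPE = ArtifactType.DATA

    def load(self, data_type: Type[Any]) -> AgentExecutor:
        # Read the pickled agent back from this artifact's directory.
        with fileio.open(os.path.join(self.uri, "agent.pkl"), "rb") as f:
            return pickle.load(f)

    def save(self, agent: AgentExecutor) -> None:
        # Write the agent into this artifact's directory in the artifact store.
        with fileio.open(os.path.join(self.uri, "agent.pkl"), "wb") as f:
            pickle.dump(agent, f)
```

The step producing the agent would then point at it, e.g. `@step(output_materializers=AgentExecutorMaterializer)` (again assuming the 0.4x step API).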
llm-agents/src/steps/url_scraper.py
Outdated
@step(enable_cache=True)
@step(enable_cache=False)
We should put all config in a YAML file, tbh. Btw, why do we disable the cache here?
Sometimes, if the step fails and the URLs are not loaded, the next pipeline run skips the step and goes into the next step with a None object, which leads to some obscure errors. It doesn't happen often, but it's an edge case the DSD guys ran into around the first bootcamp.
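(A sketch of that failure mode and the workaround, assuming the ZenML step decorator from the diff above; the explicit check on the scraped output is an illustrative extra guard, not something the PR adds:)

```python
from typing import List

from zenml import step  # on 0.43.x this may need to be `from zenml.steps import step`


# Always re-run scraping so a failed run can't leave a stale/None artifact in the cache.
@step(enable_cache=False)
def url_scraper(docs_url: str, website_url: str, release_notes_url: str) -> List[str]:
    all_urls = [docs_url, website_url, release_notes_url]
    # Failing loudly is another way to stop an empty/None result from being
    # cached and silently handed to the next step.
    if not all_urls or any(url is None for url in all_urls):
        raise RuntimeError("URL scraping produced no URLs; aborting instead of caching a bad artifact.")
    return all_urls
```

On the YAML point, newer ZenML versions can pull step settings such as enable_cache from a run configuration file, e.g. via `my_pipeline.with_options(config_path="config.yaml")`, though the exact mechanism depends on the version.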