MongoDB + new abstraction of vectordb #2942
@microsoft-github-policy-service agree |
@microsoft-github-policy-service agree company="MongoDB" |
nice addition, it would be nice with an example, and in my opinion it would also be nice with an "advanced example" because it appears to be possible to also use mongodb on cloud services for example cosmosdb on azure :-) |
Thanks for putting up this PR. Looks great and is helpful for us!
I've added some suggestions around implementation details, but overall it's a solid PR. Let me know what you think.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #2942 +/- ##
===========================================
- Coverage 33.99% 15.71% -18.29%
===========================================
Files 89 90 +1
Lines 9593 9719 +126
Branches 2054 2242 +188
===========================================
- Hits 3261 1527 -1734
- Misses 6057 8142 +2085
+ Partials 275 50 -225
Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry. |
Thank you very much for the PR, @ranfysvalle02 !
Could you provide an example of how to set up MongoDB locally? You could refer to https://github.com/microsoft/autogen/blob/main/notebook/agentchat_pgvector_RetrieveChat.ipynb
Could you also update the dependencies in setup.py?
To fix the code formatting issue:
|
Will be working on the feedback received from this thread as well as internally from MongoDB. I'll update the PR soon |
There are some issues with the tests --- working them out now. |
Hi @ranfysvalle02, the formatting issue still exists. Please let me know once it's ready for review. Thanks.
code has been updated ;)
It keeps showing "one change requested", but I'm not sure what this one is about @Hk669 -- Let me know if I covered it with the code update or if there is anything else I should change |
Looks good to me if the comments are addressed. Thanks @ranfysvalle02 for your contribution.
@ranfysvalle02 can you run pre-commit to fix the formatting? |
@thinkall can you please review the PR? |
GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
---|---|---|---|---|---|
10404662 | Triggered | Generic CLI Secret | c44c8bd | .github/workflows/dotnet-build.yml | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely. Learn here the best practices.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
" \"collection_name\": \"flaml_collection_two\",\n", | ||
" \"index_name\": \"flaml_index_two\",\n", | ||
" \"db_config\": {\n", | ||
" \"connection_string\": \"mongodb+srv://user:[email protected]/test\", # MongoDB Atlas connection string\n", |
Hi @ranfysvalle02, I still see ConfigurationError: The DNS query name does not exist: _mongodb._tcp.shared.demo.mongodb.net.
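For context on the error above: a mongodb+srv:// connection string is resolved through a DNS SRV lookup on _mongodb._tcp.<hostname>, so a placeholder hostname fails before any connection is attempted, while a plain mongodb:// string skips that lookup. A minimal sketch of how the query name is derived is below; the helper name `srv_query_name` and the `user:pass` credentials are hypothetical, and the hostname is simply the one from the error message.

```python
from urllib.parse import urlsplit


def srv_query_name(uri):
    """Return the DNS SRV name a driver would query for this URI.

    Returns None for plain mongodb:// URIs, which need no SRV lookup.
    Hypothetical helper for illustration; not part of pymongo or AutoGen.
    """
    parts = urlsplit(uri)
    if parts.scheme != "mongodb+srv":
        return None
    # urlsplit strips the credentials, so hostname is the bare cluster host.
    return f"_mongodb._tcp.{parts.hostname}"


if __name__ == "__main__":
    print(srv_query_name("mongodb+srv://user:pass@shared.demo.mongodb.net/test"))
    print(srv_query_name("mongodb://username:password@localhost/?directConnection=true"))
```

If the printed SRV name does not resolve in DNS, the notebook needs a real cluster hostname or a plain mongodb:// string pointing at a reachable server.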
I have updated the notebook -- I think this will do it :)
This mongodb://username:password@localhost/?directConnection=true worked for me. Then I see:
- Trying to create collection.
  2024-06-21 21:44:25,912 - autogen.agentchat.contrib.retrieve_user_proxy_agent - INFO - Found 2 chunks.
  VectorDB returns doc_ids: [[]]
  Which means retrieve_docs doesn't work as expected.
- ValueError: Collection flaml_collection_two already exists.
  Which means overwrite=True doesn't work as expected.
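To make the second point concrete, here is a rough sketch of the behavior a reviewer would likely expect from overwrite=True: drop the existing collection, then recreate it, and raise ValueError only when neither overwrite nor get_or_create is set. This is not the PR's implementation. `InMemoryDatabase` is a hypothetical stand-in that mimics the few pymongo Database methods used (list_collection_names, create_collection, drop_collection, item access), and the parameter names are assumed to follow the create_collection signature of the vectordb abstraction.

```python
class InMemoryDatabase:
    """Hypothetical stand-in for a pymongo Database, for illustration only."""

    def __init__(self):
        self._collections = {}

    def list_collection_names(self):
        return list(self._collections)

    def create_collection(self, name):
        self._collections[name] = []
        return self._collections[name]

    def drop_collection(self, name):
        self._collections.pop(name, None)

    def __getitem__(self, name):
        return self._collections[name]


def create_collection(db, name, overwrite=False, get_or_create=True):
    """Create a collection, honoring the overwrite and get_or_create flags."""
    exists = name in db.list_collection_names()
    if exists and overwrite:
        # overwrite=True: discard the old collection and start fresh.
        db.drop_collection(name)
        exists = False
    if not exists:
        return db.create_collection(name)
    if get_or_create:
        # Reuse the existing collection instead of failing.
        return db[name]
    raise ValueError(f"Collection {name} already exists.")


if __name__ == "__main__":
    db = InMemoryDatabase()
    create_collection(db, "flaml_collection_two").append({"doc": 1})
    fresh = create_collection(db, "flaml_collection_two", overwrite=True)
    print(len(fresh))  # the recreated collection holds no documents
```

With this ordering, the ValueError seen above can only occur when the caller passes overwrite=False and get_or_create=False.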
@ranfysvalle02 If you run |
I will be better about this one sorry :) |
Sorry guys!!!!! Here is the new Pull Request with a fresh commit history... Did a lot of learning here :) |
Why are these changes needed?
MongoDB Atlas Vector Search received the highest developer NPS in the Retool State of AI 2023 survey (https://www.mongodb.com/blog/post/atlas-vector-search-commands-highest-developer-nps-retool-state-ai-2023-survey), so it is worth adding MongoDB vector search as an option for AutoGen RAG.
You can get started with MongoDB vector search on a free-tier (M0) MongoDB Atlas cluster, which provides the full vector search functionality. https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/
But why is MongoDB such a standout? Well, there are a few key reasons.
As such, implementing MongoDB as a Retrieval Agent can unlock new potential in your AI applications, bringing the full power of vector storage to bear.
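As a rough illustration of what a retrieval call against Atlas Vector Search involves, the sketch below builds a $vectorSearch aggregation pipeline as plain Python data. The helper name `build_vector_search_pipeline`, the `embedding` field path, and the default of ten candidates per returned result are assumptions for illustration, not taken from this PR; the index name is the one used in the notebook.

```python
def build_vector_search_pipeline(query_vector, index_name, path="embedding",
                                 limit=5, num_candidates=None):
    """Build an aggregation pipeline for an Atlas $vectorSearch query.

    Hypothetical helper: the field path and candidate ratio are assumptions.
    """
    if num_candidates is None:
        # Consider more candidates than results returned, to improve recall.
        num_candidates = limit * 10
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        # Keep only the document id and the similarity score.
        {"$project": {"_id": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]


if __name__ == "__main__":
    pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3], "flaml_index_two")
    print(pipeline[0]["$vectorSearch"]["numCandidates"])
```

The resulting list could be passed to a pymongo collection's aggregate() method on a cluster where a vector search index with that name exists.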
Related issue number: 711
Closes #711