Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor langchain.retrievers.self_query #16115

Closed

Conversation

leo-gan
Copy link
Collaborator

@leo-gan leo-gan commented Jan 17, 2024

Issue: all classes in langchain.retrievers.self_query namespace belong to the community package.

Following issue:
We cannot just move these classes to community because of the dependency on the Visitor class and other classes from the langchain.chains.query_constructor namespace.

What is interesting is that langchain.chains.query_constructor artifacts are used ONLY in the langchain.retrievers.self_query artifacts.
It happens that langchain.chains.query_constructor artifacts do not depend on the langchain.chains. The langchain.chains.query_constructor is quite independent. Because of that we can move langchain.chains.query_constructor under langchain_core as the langchain_core.sql_constructor
The reason for renaming query_constructor into sql_constructor is simple. It IS the SQL constructor. Moreover, the query_.. is the ambiguous name in the LLM context because we use query in retrievers and many other places not as the SQL queries but as the queries in prompts. sql is unambiguous name here.

NOTE: The langchain.chains.query_constructor.base.load_query_constructor_chain is not moved! This is the only dependency to chains. It is not important because Chains will be deprecated soon.

Copy link

vercel bot commented Jan 17, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
langchain ✅ Ready (Inspect) Visit Preview 💬 Add feedback Feb 22, 2024 6:53pm

@leo-gan leo-gan marked this pull request as ready for review January 17, 2024 02:02
@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Jan 17, 2024
@leo-gan leo-gan requested a review from baskaryan January 17, 2024 02:02
@dosubot dosubot bot added 🤖:refactor A large refactor of a feature(s) or restructuring of many files 🔌: chroma Primarily related to ChromaDB integrations labels Jan 17, 2024
@efriis efriis added the partner label Jan 17, 2024
@efriis efriis self-assigned this Jan 17, 2024
@leo-gan leo-gan force-pushed the refactor-retrievers-self_query branch 2 times, most recently from 23d5299 to 235a9df Compare January 17, 2024 19:11
@leo-gan
Copy link
Collaborator Author

leo-gan commented Jan 17, 2024

@baskaryan Conflicts resolved. Any comments?

@leo-gan
Copy link
Collaborator Author

leo-gan commented Jan 17, 2024

@baskaryan Hey Bagatur, let me know if you have other plans for extracting community code from langchain.

@hwchase17 hwchase17 closed this Jan 30, 2024
@baskaryan baskaryan reopened this Jan 30, 2024
@baskaryan baskaryan closed this Jan 30, 2024
@baskaryan baskaryan reopened this Jan 30, 2024
@leo-gan leo-gan force-pushed the refactor-retrievers-self_query branch 3 times, most recently from 002095e to b54479e Compare February 8, 2024 18:39
@leo-gan
Copy link
Collaborator Author

leo-gan commented Feb 8, 2024

@efriis Merging conflicts resolved. Please, take a look.

@leo-gan leo-gan requested review from efriis and removed request for baskaryan February 8, 2024 20:08
@leo-gan
Copy link
Collaborator Author

leo-gan commented Feb 22, 2024

@baskaryan Please let me know if this PR makes sense. If it makes sense, then I resolve conflicts. I've resolved branch conflicts several times, but nobody has reviewed PR.

@leo-gan leo-gan force-pushed the refactor-retrievers-self_query branch from b28dea6 to 580cd70 Compare February 22, 2024 18:43
@eyurtsev eyurtsev assigned baskaryan and unassigned efriis Mar 7, 2024
@leo-gan leo-gan removed the request for review from efriis March 27, 2024 22:19
@leo-gan leo-gan requested a review from eyurtsev March 27, 2024 22:19
@@ -0,0 +1,70 @@
"""Logic for converting internal query language to a valid AstraDB query."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Apr 12, 2024
@@ -1,256 +1,29 @@
"""LLM Chain for turning a user text query into a structured query."""
from __future__ import annotations
from typing import Any, List, Optional, Sequence, Union
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should keep as much of the provider agnostic code in langchain and not move it, so that providers that move to partner packages are able to re-use it without importing from community

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eyurtsev Do you mean to keep the from __future__ import annotations line?

@eyurtsev
Copy link
Collaborator

OK I think this PR is blocked on release 0.2.0 that will break dependency on langchain-community, and will make it possible for langchain-community to import these base chains form langchain (rather than langchain-core)

The issue is that these are higher order chains and it's not clear that they should live in langchain-core.

@leo-gan
Copy link
Collaborator Author

leo-gan commented Apr 12, 2024

OK I think this PR is blocked on release 0.2.0 that will break dependency on langchain-community, and will make it possible for langchain-community to import these base chains form langchain (rather than langchain-core)

The issue is that these are higher order chains and it's not clear that they should live in langchain-core.

@eyurtsev Do you mean to wait with this PR till the langchain 0.2.0 release?

@leo-gan
Copy link
Collaborator Author

leo-gan commented Apr 12, 2024

Converted to Draft. It could be a part of 0.2.0 langchain.

@leo-gan leo-gan marked this pull request as draft April 12, 2024 16:55
@eyurtsev eyurtsev marked this pull request as ready for review April 18, 2024 21:21
@dosubot dosubot bot added Ɑ: retriever Related to retriever module 🔌: astradb Primarily related to AstraDB integrations 🔌: elasticsearch Primarily related to elastic/elasticsearch integrations 🔌: milvus Primarily related to Milvus vector store integration 🔌: mongo Primarily related to Mongo integrations 🔌: postgres Related to postgres integrations 🔌: qdrant Primarily related to Qdrant vector store integration 🔌: redis Primarily related to Redis integrations 🔌: supabase Primarily related to Supabase integrations 🔌: weaviate Primarily related to Weaviate vector store integration labels Apr 18, 2024
@eyurtsev
Copy link
Collaborator

Re-opened temporarily to make it easier to reference. I'm going to make a similar change, but slightly different name spaces

@cbornet
Copy link
Collaborator

cbornet commented Apr 18, 2024

This will need to be rebased carefully as there have been changes done to the self query retriever since the beginning of this PR.

@eyurtsev
Copy link
Collaborator

@cbornet yeah was mostly using for reference. @baskaryan doing it here: #20849


@leo-gan / @cbornet, context is that we're in the process of reversing the dependency between community and langchain

So langchain-community will start depending on langchain (rather than vice versa). So we'll be doing some acrobatics during the next week or so

@eyurtsev eyurtsev closed this Apr 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🔌: astradb Primarily related to AstraDB integrations 🔌: chroma Primarily related to ChromaDB integrations 🔌: elasticsearch Primarily related to elastic/elasticsearch integrations lgtm PR looks good. Use to confirm that a PR is ready for merging. 🔌: milvus Primarily related to Milvus vector store integration 🔌: mongo Primarily related to Mongo integrations partner 🔌: postgres Related to postgres integrations 🔌: qdrant Primarily related to Qdrant vector store integration 🔌: redis Primarily related to Redis integrations 🤖:refactor A large refactor of a feature(s) or restructuring of many files Ɑ: retriever Related to retriever module size:XXL This PR changes 1000+ lines, ignoring generated files. 🔌: supabase Primarily related to Supabase integrations 🔌: weaviate Primarily related to Weaviate vector store integration
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants