community: support Nomic embeddings #17138

Closed
wt3639 wants to merge 4 commits

Conversation

Contributor

@wt3639 wt3639 commented Feb 6, 2024

Description: Add support for loading nomic-ai/nomic-embed-text, the current SOTA long-context text embedder, in LangChain.
Dependencies: Sentence Transformers


@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. Ɑ: embeddings Related to text embedding models module 🤖:improvement Medium size change to existing code to handle new use-cases labels Feb 6, 2024
Contributor

@hwchase17 hwchase17 left a comment


does this need to be a separate class from existing hugging face embeddings?

Contributor Author

wt3639 commented Feb 7, 2024

> does this need to be a separate class from existing hugging face embeddings?

Actually, no. It is similar to the existing classes HuggingFaceBgeEmbeddings and HuggingFaceInstructEmbeddings. However, HuggingFaceBgeEmbeddings does not provide an embed instruction, and HuggingFaceInstructEmbeddings uses the INSTRUCTOR class, which does not support passing the trust_remote_code parameter through to sentence_transformers. The nomic model requires this parameter. So the options are either to modify one of these two classes or to create a new class.
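For readers following along, here is a minimal sketch of what the two requirements above (an embed instruction plus trust_remote_code) could look like together. It is not the code from this PR. The names `PrefixedEmbeddings` and `load_sentence_transformer_encoder` are hypothetical, and the default prefixes `search_document: ` and `search_query: ` are an assumption taken from the nomic model card rather than from this thread. The `trust_remote_code` argument to `SentenceTransformer` assumes sentence-transformers 2.3.0 or newer.

```python
from typing import Callable, List, Sequence

# An encoder maps a batch of strings to a batch of vectors.
Encoder = Callable[[Sequence[str]], List[List[float]]]


def load_sentence_transformer_encoder(model_name: str) -> Encoder:
    """Hypothetical helper: build an encoder backed by sentence_transformers.

    The import is lazy so the rest of this sketch runs without the package.
    """
    from sentence_transformers import SentenceTransformer

    # trust_remote_code=True lets the model repository supply custom
    # modelling code, which the comment above says the nomic model needs.
    model = SentenceTransformer(model_name, trust_remote_code=True)
    return lambda texts: model.encode(list(texts)).tolist()


class PrefixedEmbeddings:
    """Hypothetical wrapper that prepends an instruction to every input."""

    def __init__(
        self,
        encoder: Encoder,
        embed_instruction: str = "search_document: ",  # assumed default
        query_instruction: str = "search_query: ",  # assumed default
    ) -> None:
        self.encoder = encoder
        self.embed_instruction = embed_instruction
        self.query_instruction = query_instruction

    def embed_documents(self, texts: Sequence[str]) -> List[List[float]]:
        # Documents get the embed instruction as a plain string prefix.
        return self.encoder([self.embed_instruction + t for t in texts])

    def embed_query(self, text: str) -> List[float]:
        # A single query gets the query instruction; unwrap the batch of one.
        return self.encoder([self.query_instruction + text])[0]
```

With the real model this would be wired up as `PrefixedEmbeddings(load_sentence_transformer_encoder("nomic-ai/nomic-embed-text"))`, using the model id as written in the PR description. Keeping the encoder injectable means the prefix logic can be checked without downloading any weights.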

@hwchase17
Contributor

what would it look like to modify the instructor class?

@hwchase17 hwchase17 self-assigned this Feb 8, 2024
@wt3639 wt3639 closed this Feb 12, 2024