Text embedding generation is now supported for every available model.
This is a working v1: the text template used to build the class sentences is currently hardcoded in the dataset class.
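To illustrate what moving the template out of the dataset class could look like in a later version, here is a minimal sketch. The template string `"a photo of a {}."` and the names `CLASS_TEMPLATE` and `build_class_sentences` are hypothetical and not taken from this PR; the actual hardcoded template may differ.

```python
from typing import Iterable, List

# Hypothetical default template; the one hardcoded in the dataset class may differ.
CLASS_TEMPLATE = "a photo of a {}."


def build_class_sentences(
    class_names: Iterable[str], template: str = CLASS_TEMPLATE
) -> List[str]:
    """Fill the template once per class name to get one sentence per class."""
    return [template.format(name) for name in class_names]


if __name__ == "__main__":
    # Default template, then a caller-supplied one.
    print(build_class_sentences(["cat", "dog"]))
    print(build_class_sentences(["cat"], template="an image of {}"))
```

Passing the template as an argument would let each dataset choose its own wording without touching the dataset class.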
Also, I removed the `SiglipModel` class: it is a Hugging Face model, so its code was a 99% duplicate. To support the tokenizer required for text embeddings across all HF models, I used `AutoModel`, `AutoTokenizer` and `AutoProcessor` in the `ClosedCLIPModel` class, so we don't have to explicitly state the tokenizer that corresponds to each model type (the repo already contains several model variations).

I also standardised the tensor shapes returned by the methods of the model classes, and the points at which tensors are moved from GPU to CPU and vice versa.
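The `Auto*` approach described above can be sketched roughly as follows. This is an illustration, not the code in `ClosedCLIPModel`: the helper name `load_hf_components` is hypothetical, and it only shows how a single checkpoint name lets `transformers` resolve the matching model, tokenizer and processor classes.

```python
from transformers import AutoModel, AutoProcessor, AutoTokenizer


def load_hf_components(model_name: str):
    """Resolve model, tokenizer and processor from one checkpoint name.

    Hypothetical helper: the Auto* classes read the checkpoint's config and
    pick the concrete classes, so no per-model tokenizer mapping is needed.
    """
    model = AutoModel.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    processor = AutoProcessor.from_pretrained(model_name)
    return model, tokenizer, processor
```

Calling it with a checkpoint name such as `"openai/clip-vit-base-patch32"` downloads the weights on first use, and some checkpoints may need extra tokenizer dependencies installed.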