With vec2text we should be able to get a reasonably useful sentence out of the average embedding (centroid) of a cluster. This could serve as the cluster label directly, or as guidance for the summarization step that generates the label.
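To make the idea concrete, here is a minimal sketch of the centroid-then-invert step. It uses only the standard library and takes the inversion step as a callable, so nothing here depends on a particular vec2text version. The names `centroid`, `label_cluster`, and `invert` are hypothetical, and re-normalizing the mean to unit length is an assumption (it makes sense for models such as text-embedding-ada-002 that emit unit-norm vectors, but may not be right for every model).

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def centroid(embeddings: Sequence[Vector], normalize: bool = True) -> List[float]:
    """Average a cluster's embeddings component-wise.

    normalize=True rescales the mean to unit length (an assumption: useful
    when the embedding model itself produces unit-norm vectors).
    """
    if not embeddings:
        raise ValueError("cluster has no embeddings")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("embeddings must all have the same dimensionality")
    mean = [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]
    if normalize:
        norm = math.sqrt(sum(x * x for x in mean))
        if norm > 0:
            mean = [x / norm for x in mean]
    return mean


def label_cluster(embeddings: Sequence[Vector], invert: Callable[[Vector], str]) -> str:
    """Invert the cluster centroid to text.

    `invert` is a hypothetical wrapper around whatever embedding-to-text
    model is available, for example vec2text's inversion call.
    """
    return invert(centroid(embeddings)).strip()
```

In practice `invert` would wrap the vec2text model for the embedding model that produced the vectors. Whether the centroid of a loose cluster inverts to a meaningful sentence is an open question this sketch does not answer.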
https://github.com/jxmorris12/vec2text/
Pre-trained inversion models exist for some embedding models, such as OpenAI's text-embedding-ada-002 and perhaps others. Part of this issue might be training inversion models for the other embedding models in our supported list.
One could imagine a new API endpoint that takes in an embedding vector and outputs a sentence. We could also have an alternative summarize script that uses this instead of (or in conjunction with) summarizing. We currently have a description field per cluster that is not really being used; it could be populated with this output, or we could add another field.
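A rough sketch of what the endpoint logic and the description update could look like, kept framework-agnostic so it can be wired into whatever server we already run. Everything named here is hypothetical: the `{"embedding": [...]}` request shape, the `{"text": ...}` response shape, the `handle_invert_request` and `fill_description` helpers, and the dict representation of a cluster. `invert` again stands in for the actual embedding-to-text model.

```python
import json
from typing import Callable, Sequence, Tuple


def _is_number(x: object) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(x, (int, float)) and not isinstance(x, bool)


def handle_invert_request(
    body: str, invert: Callable[[Sequence[float]], str]
) -> Tuple[int, str]:
    """Hypothetical handler: JSON {"embedding": [...]} in, JSON {"text": "..."} out.

    Returns an (HTTP status code, JSON response body) pair.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return 400, json.dumps({"error": "body is not valid JSON"})
    vector = payload.get("embedding") if isinstance(payload, dict) else None
    if not isinstance(vector, list) or not vector or not all(_is_number(x) for x in vector):
        return 400, json.dumps({"error": "'embedding' must be a non-empty list of numbers"})
    return 200, json.dumps({"text": invert([float(x) for x in vector])})


def fill_description(cluster: dict, text: str, overwrite: bool = False) -> dict:
    """Return a copy of the cluster record with `description` populated.

    An existing non-empty description is kept unless overwrite=True.
    """
    updated = dict(cluster)
    if overwrite or not updated.get("description"):
        updated["description"] = text
    return updated
```

Keeping the inverted sentence in `description` leaves the existing label untouched, so the summarize script could still read it as extra context. A separate field would be the safer choice if descriptions ever get edited by hand.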