
Difference in output when running via Transformers.js and when hosting on Hugging Face #46

Open
jtmuller5 opened this issue Feb 10, 2024 · 1 comment

Comments

@jtmuller5

jtmuller5 commented Feb 10, 2024

I created an application that uses the UAE-large-V1 model inside Transformers.js and was able to embed sentences in a browser without issues. The model would return a single vector for a single input:

import { pipeline } from "@xenova/transformers";

const extractor = await pipeline("feature-extraction", "WhereIsAI/UAE-Large-V1", {
  quantized: true,
});

const result = await extractor(text, { pooling: "mean", normalize: true });

When I hosted the model on Hugging Face using their Inference Endpoints solution, it no longer works as expected. Instead of returning a single vector, it returns a variable number of 1024-dimensional vectors.

Sample input:

{
   "inputs":  "Where are you"
}

This returns a list of lists of lists of numbers.

Is there a way to make the hosted model return a single vector? And why does the model act differently based on where it's hosted?

@SeanLee97
Owner

That is strange. It should return a single vector, because you have specified mean pooling.

You could ask for help in the Transformers.js project, because I am unfamiliar with it. Sorry about that.
