
Add SantaCoder model to templates #90

Open
wants to merge 2 commits into
base: master


YuryYakhno

The SantaCoder model uses special tokens that are very similar to, but not the same as, those of the StarCoder model. The current settings contain a template only for StarCoder, so it seems logical to simply change "bigcode/starcoder" to "bigcode/santacoder" in the "Model ID or Endpoint" setting. That is not enough: SantaCoder's tokens start with "fim-", while StarCoder's tokens start with "fim_". The difference is easy to miss when skimming the settings. With the wrong FIM tokens, SantaCoder misbehaves: the "fim_..." tokens are parsed as plain text, and the model occasionally adds them to its output.

This issue was discussed on SantaCoder's model page. To prevent it in the future without changing SantaCoder's interface, I propose adding a separate template for SantaCoder with the proper special tokens.
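
For illustration, here is a minimal Python sketch of the difference between the two token sets. It is not the extension's code: `FimTemplate`, `FIM_TEMPLATES`, and `build_fim_prompt` are hypothetical names used only for this example, and the prompt layout (prefix, then suffix, then middle) is assumed from the models' documented fill-in-the-middle format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FimTemplate:
    """Special tokens used to build a fill-in-the-middle prompt."""
    prefix: str
    middle: str
    suffix: str


# Hypothetical lookup table, one template per model.
# StarCoder tokens use an underscore, SantaCoder tokens use a hyphen.
FIM_TEMPLATES = {
    "bigcode/starcoder": FimTemplate("<fim_prefix>", "<fim_middle>", "<fim_suffix>"),
    "bigcode/santacoder": FimTemplate("<fim-prefix>", "<fim-middle>", "<fim-suffix>"),
}


def build_fim_prompt(model_id: str, before: str, after: str) -> str:
    """Wrap the code before and after the cursor in the model's own FIM tokens."""
    t = FIM_TEMPLATES[model_id]
    return f"{t.prefix}{before}{t.suffix}{after}{t.middle}"


if __name__ == "__main__":
    before, after = "def add(a, b):\n    return ", "\n"
    for model_id in FIM_TEMPLATES:
        # repr() keeps the newlines visible so the two prompts are easy to compare.
        print(model_id, repr(build_fim_prompt(model_id, before, after)))
```

Switching only the model ID while keeping the StarCoder template would send "<fim_prefix>..." to SantaCoder, which is exactly the mismatch described above.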
