The SantaCoder model uses different (but very similar) special tokens from the StarCoder model. The current settings contain a template only for StarCoder, so it seems logical to just change "bigcode/starcoder" to "bigcode/santacoder" in the "Model ID or Endpoint" setting. That is not enough, though: SantaCoder's FIM tokens start with "fim-", while StarCoder's start with "fim_". The difference is easy to miss in a brief look over the settings. With the wrong FIM tokens, SantaCoder does not work properly: the "fim_..." tokens are parsed as plain text, and the model occasionally adds them to its output.
This issue was discussed on SantaCoder's model page. To prevent it in the future without changing SantaCoder's interface, I propose adding a separate template for SantaCoder with the proper special tokens.
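
To illustrate the difference, here is a minimal Python sketch of how a per-model FIM prompt could be built. It is not the extension's actual code: `FIM_TEMPLATES` and `build_fim_prompt` are hypothetical names used only for illustration, and the only details taken from the models are the token spellings (underscore for StarCoder, hyphen for SantaCoder).

```python
# Hypothetical per-model template table (illustration only, not the extension's settings format).
# StarCoder spells its FIM tokens with an underscore, SantaCoder with a hyphen.
FIM_TEMPLATES = {
    "bigcode/starcoder": {
        "prefix": "<fim_prefix>",
        "suffix": "<fim_suffix>",
        "middle": "<fim_middle>",
    },
    "bigcode/santacoder": {
        "prefix": "<fim-prefix>",
        "suffix": "<fim-suffix>",
        "middle": "<fim-middle>",
    },
}


def build_fim_prompt(model_id: str, before: str, after: str) -> str:
    """Wrap the code before and after the cursor in the model's own FIM tokens."""
    tokens = FIM_TEMPLATES[model_id]
    return f"{tokens['prefix']}{before}{tokens['suffix']}{after}{tokens['middle']}"


if __name__ == "__main__":
    before = "def add(a, b):\n    "
    after = "\n    return result"
    # Same surrounding code, different special tokens per model.
    print(build_fim_prompt("bigcode/starcoder", before, after))
    print(build_fim_prompt("bigcode/santacoder", before, after))
```

With a separate entry per model, switching the model ID also switches the tokens, so the "fim_" spelling never reaches SantaCoder.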