
Fine-tuning StarCoder or OctoCoder for IDE Integration: Instruction Tuning vs Base Model Training Approach #142

Open
JunHyungKang opened this issue Oct 4, 2023 · 1 comment

@JunHyungKang

When fine-tuning StarCoder or OctoCoder on a custom dataset for integration with an IDE, would it be more appropriate to process the data into a question & answer format, masking the custom-code portion as is done for instruction tuning, or to train it like a base model, concatenating entire code files with separator tokens and keeping the labels identical to the inputs for each sequence? Could you share any opinions or experiences regarding this?
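
For concreteness, a minimal sketch of the two label layouts being contrasted, assuming a Hugging Face tokenizer and the -100 ignore index used by PyTorch's cross-entropy loss (the example strings are hypothetical):

```python
# Sketch of the two label layouts in question. Example strings are
# hypothetical; assumes the Hugging Face transformers library and
# access to the bigcode/starcoder tokenizer.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder")

# (a) Instruction tuning: question & answer format, with the prompt
# tokens masked to -100 so the loss is computed only on the answer.
prompt = "Question: How do I read a file?\n\nAnswer: "
answer = "with open('f.txt') as f:\n    text = f.read()\n"
prompt_ids = tokenizer(prompt)["input_ids"]
answer_ids = tokenizer(answer)["input_ids"]
qa_example = {
    "input_ids": prompt_ids + answer_ids,
    "labels": [-100] * len(prompt_ids) + answer_ids,
}

# (b) Base-model style: raw code with labels identical to the inputs,
# so every token contributes to the loss.
code_ids = tokenizer("def read(path):\n    return open(path).read()\n")["input_ids"]
lm_example = {"input_ids": code_ids, "labels": list(code_ids)}
```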

@loubnabnl
Contributor

For code completion in the IDE (GitHub Copilot style), we recommend simply concatenating the code files, as we did for pre-training; for chat-like applications and instruction tuning, the instruction/answer format is more common.
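
A minimal sketch of that file-concatenation setup, assuming the Hugging Face transformers API and StarCoder's end-of-text token as the separator (the directory path and block size are illustrative):

```python
# Sketch: pack a custom codebase into fixed-length causal-LM examples,
# mirroring the pre-training setup recommended above for IDE completion.
# Directory path and block size are illustrative; assumes Hugging Face
# transformers and access to the bigcode/starcoder tokenizer.
from pathlib import Path

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder")

# Join every file with the end-of-text token, as in pre-training.
files = sorted(Path("my_codebase").rglob("*.py"))
corpus = tokenizer.eos_token.join(p.read_text() for p in files)

# Tokenize once, then slice into fixed-length blocks; for a plain
# causal-LM objective the labels are a copy of the inputs (the model
# shifts them internally when computing the loss).
block_size = 2048
ids = tokenizer(corpus)["input_ids"]
examples = [
    {"input_ids": ids[i : i + block_size], "labels": ids[i : i + block_size]}
    for i in range(0, len(ids) - block_size + 1, block_size)
]
```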
