When fine-tuning StarCoder or OctoCoder on a custom dataset for integration with an IDE, is it more appropriate to format the data as question/answer pairs (masking the custom code) for instruction tuning, or to train it like a base model, concatenating entire code files with separator tokens and keeping labels identical to the inputs? Could you share any opinions or experiences regarding this?
For code completion in the IDE (GitHub Copilot style), we recommend just concatenating the code files as we did for pre-training. For chat-like applications and instruction tuning, it's more common to use the instruction/answer format.
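The pre-training-style approach could be sketched roughly as follows. This is a minimal illustration, not the exact pipeline used for StarCoder: `pack_files` is a hypothetical helper, and it assumes files have already been tokenized and that the tokenizer provides an end-of-sequence id (e.g. StarCoder's `<|endoftext|>`). Note that labels simply mirror `input_ids`, as is standard for causal-LM fine-tuning.

```python
def pack_files(tokenized_files, eos_token_id, block_size):
    """Concatenate tokenized files with EOS separators, then chunk
    the stream into fixed-size training blocks."""
    stream = []
    for ids in tokenized_files:
        stream.extend(ids)
        stream.append(eos_token_id)  # separator between files
    # Split into fixed-size blocks; the ragged tail is dropped.
    blocks = [
        stream[i : i + block_size]
        for i in range(0, len(stream) - block_size + 1, block_size)
    ]
    # Labels identical to input_ids: loss is computed on every token.
    return [{"input_ids": b, "labels": list(b)} for b in blocks]


# Toy usage with fake token ids (eos = 0):
examples = pack_files([[1, 2, 3], [4, 5]], eos_token_id=0, block_size=4)
```

For instruction tuning, by contrast, the prompt tokens would typically have their label positions set to an ignore index (e.g. -100) so loss is only computed on the answer.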