
Feature: Fine-tune TinyLlama and Qwen2.5-coder models for Magistrala and Prism codebase #27

Open · drasko opened this issue Oct 9, 2024 · 4 comments
Labels: enhancement (New feature or request)

drasko (Contributor) commented Oct 9, 2024

Is your feature request related to a problem? Please describe.

No

Describe the feature you are requesting, as well as the possible use case(s) for it.

Just as LLMs can be fine-tuned on custom datasets, so can SLMs.

We want to fine-tune:

  • TinyLlama
  • Phi-3

And we want to fine-tune them on our custom Magistrala, Prism and Cocos repositories, so that we can enhance their intelligence for code generation for our purposes.

We want to compare:

  • Which is better to fine-tune (better documented, easier, faster, etc.)
  • Which shows better results after fine-tuning

Some references:

An analysis should be done on whether fine-tuning or RAG is the better fit for this purpose: https://medium.com/@bijit211987/when-to-apply-rag-vs-fine-tuning-90a34e7d6d25
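If we do go the fine-tuning route, the first step would be turning the repositories into a training set. Below is a minimal sketch of that data-preparation step; the local clone paths, the Go-only filter, and the instruction template are all assumptions for illustration, not a settled format:

```python
# Minimal data-prep sketch (assumptions: repositories cloned locally at the
# hypothetical paths below; output uses a common instruction-tuning JSONL
# layout with "instruction"/"output" fields).
import json
from pathlib import Path

REPOS = {
    "magistrala": Path("./magistrala"),  # hypothetical local clone
    "prism": Path("./prism"),            # hypothetical local clone
}

samples = []
for name, root in REPOS.items():
    # Assumes the codebases are primarily Go; adjust the glob as needed.
    for src in root.rglob("*.go"):
        code = src.read_text(errors="ignore")
        if len(code) < 200:  # skip trivial files
            continue
        samples.append({
            "instruction": f"Write {name} code for {src.relative_to(root)}",
            "output": code,
        })

with open("finetune_dataset.jsonl", "w") as f:
    for s in samples:
        f.write(json.dumps(s) + "\n")
```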

Indicate the importance of this feature to you.

Must-have

Anything else?

No response

@drasko drasko added the enhancement New feature or request label Oct 9, 2024
@drasko drasko changed the title Feature: Fine-tune Phi-3 and/or TinyLlama models for Magistrala and Prism codebase Feature: Fine-tune Phi-3 and TinyLlama models for Magistrala and Prism codebase Oct 9, 2024
drasko (Contributor, Author) commented Oct 9, 2024

Actually, https://github.com/QwenLM/Qwen2.5-Coder seems more promising, especially since https://ollama.com/library/qwen2.5-coder:1.5b runs fast.

We should examine how this one can be fine-tuned as well.
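As a quick smoke test, something like the following should work, assuming a local Ollama server with the model already pulled (`ollama pull qwen2.5-coder:1.5b`) and the `ollama` Python client installed:

```python
# Quick smoke test against a local Ollama instance.
import ollama

response = ollama.chat(
    model="qwen2.5-coder:1.5b",
    messages=[{"role": "user",
               "content": "Write a Go function that reverses a string."}],
)
print(response["message"]["content"])
```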

drasko (Contributor, Author) commented Oct 9, 2024

This should probably be done via SWIFT, as explained here; a rough sketch of an equivalent LoRA setup follows below.
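For reference, here is the same idea sketched with plain Hugging Face PEFT/LoRA rather than SWIFT itself (the model id, LoRA hyperparameters, and dataset file are assumptions carried over from the earlier comment, not the SWIFT pipeline):

```python
# Generic LoRA fine-tuning sketch with Hugging Face transformers + peft.
# Assumes "finetune_dataset.jsonl" from the data-prep sketch above.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments)

model_name = "Qwen/Qwen2.5-Coder-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             torch_dtype=torch.bfloat16)

# Attach low-rank adapters; only the adapter weights are trained.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM"))

def tokenize(example):
    # Concatenate prompt and completion; labels mirror the inputs for
    # causal-LM loss.
    toks = tokenizer(example["instruction"] + "\n" + example["output"],
                     truncation=True, max_length=1024)
    toks["labels"] = toks["input_ids"].copy()
    return toks

train = (load_dataset("json", data_files="finetune_dataset.jsonl")["train"]
         .map(tokenize, remove_columns=["instruction", "output"]))

Trainer(model=model,
        args=TrainingArguments(output_dir="qwen-coder-lora",
                               num_train_epochs=1,
                               per_device_train_batch_size=1,
                               logging_steps=10),
        train_dataset=train).train()
```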

@drasko drasko assigned JeffMboya and unassigned rodneyosodo Oct 9, 2024
drasko (Contributor, Author) commented Oct 9, 2024

qwen2.5-coder:1.5b should definitely be set as our default model for now; it has been tested and confirmed by @dborovcanin as well.

@JeffMboya and @rodneyosodo, please test performance (a rough timing sketch follows below) and, if it is satisfactory, send a PR to replace TinyLlama.
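One rough way to compare latency before opening that PR, assuming both models are pulled into a local Ollama instance (the prompt is arbitrary and timings will vary with hardware):

```python
# Rough latency comparison of the two candidate default models.
import time
import ollama

PROMPT = "Write a Go HTTP handler that returns JSON."

for model in ("tinyllama", "qwen2.5-coder:1.5b"):
    start = time.time()
    reply = ollama.chat(model=model,
                        messages=[{"role": "user", "content": PROMPT}])
    elapsed = time.time() - start
    print(f"{model}: {elapsed:.1f}s, "
          f"{len(reply['message']['content'])} chars")
```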

@JeffMboya JeffMboya changed the title Feature: Fine-tune Phi-3 and TinyLlama models for Magistrala and Prism codebase Feature: Fine-tune TinyLlama and Qwen2.5-coder models for Magistrala and Prism codebase Oct 14, 2024