Instruction tuning dataset for Danish #210
KennethEnevoldsen
started this conversation in
Missing pieces for Danish NLP
Replies: 1 comment
-
The repo for creating the dataset can be found here: https://github.com/kasperjunge/skolegpt-instruct-dataset. It will likely be moved to the GitHub organisation of Københavns Professionshøjskole.
-
Statement of Need
Datasets for instruction tuning are important for improving the quality of Danish LLMs.
Current status
As far as I know, @kasperjunge and the DanskGPT project are working on an instruction-tuning dataset for Danish by translating OpenOrca. Models trained on this dataset would be problematic to use in a commercial setting, because of the potential legal implications of the dataset having been generated using OpenAI models.