finetuning neuralchat-7b using intel(r) extension for transformers and workflow interface #907

Merged
merged 16 commits into from
Feb 23, 2024

Conversation

kta-intel
Collaborator

Description: This PR adds an example of fine-tuning neuralchat-7b on a medical QA dataset using the experimental Workflow Interface and Intel(R) Extension for Transformers.

Objectives:

  1. Demonstrate OpenFL support for fine-tuning LLMs in a federated learning workflow and provide an example that users can follow.
  2. Demonstrate OpenFL support for Intel(R) Extension for Transformers by fine-tuning the Intel neuralchat-7b model.

Changes:
(+) preprocess_dataset.py: preprocesses the MedQuAD dataset so that the model and workflow can ingest it
(+) Workflow_Interface_NeuralChat.ipynb: tutorial notebook
(+) requirements.txt
(mod) stream_redirect.py: resolves AttributeError: 'RedirectStdStream' object has no attribute 'flush', caused by Trainer
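
For readers unfamiliar with this class of error, here is a minimal sketch of the kind of fix the stream_redirect.py change describes. It is not OpenFL's actual code: the class name RedirectStdStreamSketch, its constructor arguments, and its attributes are hypothetical, chosen only to illustrate that an object standing in for stdout or stderr needs a flush() method when a caller treats it as a file-like object.

```python
import io


class RedirectStdStreamSketch:
    """Hypothetical stand-in for a stdout/stderr redirect object (not OpenFL's class)."""

    def __init__(self, buffer, destination):
        self._buffer = buffer            # in-memory copy of everything written
        self._destination = destination  # the stream being wrapped

    def write(self, message):
        # Record the message and pass it through to the wrapped stream.
        self._buffer.write(message)
        self._destination.write(message)

    def flush(self):
        # Without this method, any caller that invokes .flush() on the
        # redirect object raises AttributeError.
        self._destination.flush()


captured = io.StringIO()
sink = io.StringIO()  # stands in for the real sys.stdout in this sketch
stream = RedirectStdStreamSketch(captured, sink)
stream.write("step 1 done\n")
stream.flush()
```

Forwarding flush() to the wrapped stream keeps the redirect object behaving like an ordinary text stream; whether the actual change forwards the call or simply makes it a no-op is not stated in this description.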

@kta-intel kta-intel marked this pull request as draft January 8, 2024 17:25
@kta-intel kta-intel marked this pull request as ready for review January 10, 2024 19:12
@kta-intel kta-intel changed the title from "[WIP] finetuning neuralchat-7b using intel(r) extension for transformers and workflow interface" to "finetuning neuralchat-7b using intel(r) extension for transformers and workflow interface" Jan 10, 2024
@kta-intel kta-intel changed the title from "finetuning neuralchat-7b using intel(r) extension for transformers and workflow interface" to "[WIP] finetuning neuralchat-7b using intel(r) extension for transformers and workflow interface" Jan 10, 2024
@kta-intel kta-intel marked this pull request as draft January 10, 2024 19:13
@kta-intel kta-intel marked this pull request as ready for review January 19, 2024 20:37
@kta-intel kta-intel changed the title from "[WIP] finetuning neuralchat-7b using intel(r) extension for transformers and workflow interface" to "finetuning neuralchat-7b using intel(r) extension for transformers and workflow interface" Jan 25, 2024
Contributor

@psfoley psfoley left a comment

Thank you for the great contribution, @kta-intel! Approved

@psfoley psfoley merged commit 8e69760 into securefederatedai:develop Feb 23, 2024
23 of 26 checks passed
nammbash pushed a commit to nammbash/openfl that referenced this pull request Feb 27, 2024
…d workflow interface (securefederatedai#907)

* implementation of neuralchat-7b finetuning using itrex and openfl

Signed-off-by: kta-intel <[email protected]>

* enabling support for itrex-neuralchat with openfl workflow-interface

Signed-off-by: kta-intel <[email protected]>

* updated model and description

Signed-off-by: kta-intel <[email protected]>

* fix colab link and add citation

Signed-off-by: kta-intel <[email protected]>

* add readme and additional setup and preprocess steps

Signed-off-by: kta-intel <[email protected]>

* fix preprocess step

Signed-off-by: kta-intel <[email protected]>

* modify readme, fix preprocess_dataset.py, add setup steps in notebook

Signed-off-by: kta-intel <[email protected]>

* fix lint issues

Signed-off-by: kta-intel <[email protected]>

* remove whitespace in preprocess_data.py

Signed-off-by: kta-intel <[email protected]>

* removed some extra torch.saves that were being used for debugging

Signed-off-by: kta-intel <[email protected]>

* deleted new requirements.txt files and modified setup instructions to point toward original requirements.txt

Signed-off-by: kta-intel <[email protected]>

* fix typo in notebook

Signed-off-by: kta-intel <[email protected]>

* fix typo in notebook

Signed-off-by: kta-intel <[email protected]>

---------

Signed-off-by: kta-intel <[email protected]>
Signed-off-by: nammbash <[email protected]>
nammbash pushed a commit to nammbash/openfl that referenced this pull request Feb 27, 2024
…d workflow interface (securefederatedai#907)

nammbash pushed a commit to nammbash/openfl that referenced this pull request Feb 29, 2024
…d workflow interface (securefederatedai#907)

nammbash pushed a commit to nammbash/openfl that referenced this pull request Feb 29, 2024
…d workflow interface (securefederatedai#907)

manuelhsantana pushed a commit that referenced this pull request Jul 10, 2024
…d workflow interface (#907)

Signed-off-by: manuelhsantana <[email protected]>