diff --git a/use_case_examples/lora_finetuning/README.md b/use_case_examples/lora_finetuning/README.md
index cf16d2176..36ae88b3e 100644
--- a/use_case_examples/lora_finetuning/README.md
+++ b/use_case_examples/lora_finetuning/README.md
@@ -1,16 +1,20 @@
-# Privacy Preserving GPT2 LoRA
+# Privacy Preserving Language Models LoRA Fine-tuning
 
-This project demonstrates how to fine-tune GPT-2 using Low-Rank Adaptation (LoRA) weights with Fully Homomorphic Encryption (FHE). The goal is to train a specialized model in a privacy-preserving manner, with minimal memory requirements.
+This use case demonstrates how to fine-tune language models (GPT-2 and LLaMA) using Low-Rank Adaptation (LoRA) weights with Fully Homomorphic Encryption (FHE). The goal is to train specialized models in a privacy-preserving manner, with minimal memory requirements.
 
 ## Overview
 
-Fine-tuning large language models typically requires access to sensitive data, which can raise privacy concerns. By leveraging FHE, we can perform computations on encrypted data, ensuring that the data remains private throughout the training process. In this approach, the LoRA weights are only known to the user who owns the data and the memory hungry foundation model remains on the server.
+Fine-tuning large language models typically requires access to sensitive data, which can raise privacy concerns. By leveraging FHE, we can perform computations on encrypted foundation model weights, ensuring that the data remains private throughout the training process. The LoRA weights are kept in the clear on the client side.
 
+
 ## Key Features
 
-- **LoRA Fine-Tuning**: Fine-tune GPT-2 by adapting low-rank weights.
-- **Hybrid Model**: Combine traditional and encrypted computations for optimal performance.
-- **Low Memory Requirements**: Minimal client-side memory needed for LoRA weights.
+- **LoRA Fine-Tuning**: Fine-tune language models by adapting low-rank weights
+- **Hybrid Model**: Combine encrypted foundation model weights with clear LoRA weights for optimal performance
+- **Low Memory Requirements**: Minimal client-side memory needed for LoRA weights
+- **Multiple Approaches**:
+  - Custom training implementation for GPT-2
+  - Simplified API-based approach for LLaMA using the `LoraTrainer`
 
 ## Setup
 
@@ -26,9 +30,25 @@ pip install -r requirements.txt
 
 ## Usage
 
-### Prepare the Dataset
+### Available Notebooks
+
+The repository includes two example notebooks:
+
+1. **GPT2FineTuneHybrid.ipynb**:
+   - Uses a custom training implementation
+   - Fine-tunes GPT-2 on a small Q&A data-set about FHE
+   - Shows low-level control over the training process
+
+2. **LLamaFineTuning.ipynb**:
+   - Uses Concrete ML's `LoraTrainer` API for a simplified implementation
+   - Fine-tunes LLaMA on Concrete ML code examples
+   - Shows how to use the high-level API for encrypted fine-tuning
 
-Replace the data-set in the `data_finetune` directory to the one you want to use for fine-tuning.
+### Prepare the data-set
+
+Each notebook includes its own data-set:
+- GPT-2 uses a small Q&A data-set about FHE in `data_finetune/what_is_fhe.txt`
+- LLaMA uses Concrete ML code examples in `data_finetune/data-set.jsonl`
 
 ### Run the Fine-Tuning Script
 
@@ -47,14 +67,18 @@ In a deployment or production scenario, the model can be fine-tuned as follows:
 
 ## Results
 
-The fine-tuned model can generate specialized text based on the provided data-set while ensuring data privacy through FHE.
+### GPT-2 Results
 
 After fine-tuning, the model's weights are distributed between the client and server as follows:
 
 - Total weights removed from the server: 68.24%
 - LoRA weights kept on the client: 147,456 (approximately 0.12% of the original model's weights)
 
-Note that the embedding are not considered for now but contain a significant amount of weights (around 30%) for GPT2. They will be considered in a future version of Concrete ML.
+Note that the embeddings are not considered for now but contain a significant fraction of the weights (around 30%) for GPT-2. They will be considered in a future version of Concrete ML.
+
+### LLaMA Results
+
+TBD
 
 ## Conclusion
 
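The low-rank adaptation scheme the README changes above describe can be sketched numerically as follows. This is a minimal numpy illustration of the LoRA idea only; the layer size and rank are arbitrary assumptions for the example, not values taken from Concrete ML or the notebooks:

```python
import numpy as np

# Sketch of LoRA: instead of updating the full weight matrix W
# (d_out x d_in), train a low-rank update B @ A with rank
# r << min(d_in, d_out). W stays frozen (in the use case, on the
# server); only the small A and B factors (on the client) are trained.
rng = np.random.default_rng(0)
d_in, d_out, r = 768, 768, 8  # illustrative GPT-2-like layer, arbitrary rank

W = rng.standard_normal((d_out, d_in))     # frozen base weights
A = 0.01 * rng.standard_normal((r, d_in))  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection, zero-init

x = rng.standard_normal(d_in)
y = W @ x + B @ (A @ x)  # forward pass: base output plus low-rank correction

# With B initialized to zero, the adapted layer starts identical to the base.
assert np.allclose(y, W @ x)

# LoRA trains r * (d_in + d_out) parameters instead of d_out * d_in.
print(f"trainable fraction: {r * (d_in + d_out) / (d_out * d_in):.4f}")
```

Because the trainable fraction shrinks with the rank, only the small A and B factors need to live client-side, which is the effect behind the README's figure of roughly 0.12% of the model's weights kept on the client.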