-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathfine-tuning.py
43 lines (32 loc) · 1.85 KB
/
fine-tuning.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
## Prepare Your Dataset:
## Your dataset should be in JSONL format. Each line in the file represents one training example. For example:
{"prompt": "The capital of France is", "completion": " Paris"}
{"prompt": "The largest mammal is", "completion": " the blue whale"}
## Save this dataset to a file, e.g., training_data.jsonl.
## Write the Fine-Tuning Code: Below is an example code snippet for fine-tuning an OpenAI model.
import openai
from dotenv import load_dotenv
import os
# Load the .env file
load_dotenv()
def fine_tune_model(training_file_path):
openai.api_key = os.getenv('OPENAI_API_KEY')
try:
## Upload the file
file = openai.File.create(file=open(training_file_path), purpose="fine-tune")
file_id = file.id
## Start the fine-tuning process
fine_tune = openai.FineTune.create(training_file=file_id, model="gpt-3.5-turbo")
return fine_tune
except Exception as e:
return str(e)
## Example usage
training_file = "path/to/your/training_data.jsonl"
print(fine_tune_model(training_file))
## In this code, we define a function fine_tune_model that takes a file path to your training data.
## The function sets the API key, uploads the training data, and initiates the fine-tuning process.
## The function returns the fine-tuning job details or an error message if an exception occurs.
## Monitor the Fine-Tuning Process: After initiating fine-tuning, you can monitor the progress using the FineTune API.
## Once the fine-tuning is complete, you will get a new model that you can use for inference.
## Error Handling and Validation: The provided example includes basic error handling. Ensure to validate your training data and handle any potential issues.
## Remember to replace 'your-api-key' with your actual OpenAI API key and 'path/to/your/training_data.jsonl' with the path to your training data file.