Hi - below are two of my training loss curves from fine-tuning LLaMA-2-7b-hf on the Alpaca data. Does anyone have insights into these two questions? (Params are below.)
1. Why does the training loss drop sharply within the first 100 steps in both runs?
2. What could be causing the oscillations in the green curve, and how can I fix them?
-----------------------params for Blue line-----------------------
# model/data params
base_model: str = "meta-llama/Llama-2-7b-hf",
data_path: str = "alpaca_data.json",
output_dir: str = "", #./lora-alpaca/fine-tuned_outputs
# training hyperparams
batch_size: int = 128,
micro_batch_size: int = 8,
num_epochs: int = 10,
learning_rate: float = 3e-4,
cutoff_len: int = 512,
val_set_size: int = 2000,
# lora hyperparams
lora_r: int = 16,
lora_alpha: int = 16,
lora_dropout: float = 0.05,
lora_target_modules: List[str] = [
"q_proj",
"v_proj",
],
# llm hyperparams
train_on_inputs: bool = True, # if False, masks out inputs in loss
add_eos_token: bool = False,
group_by_length: bool = False, # faster, but produces an odd training loss curve
# wandb params
wandb_project: str = "",
wandb_run_name: str = "",
wandb_watch: str = "", # options: false | gradients | all
wandb_log_model: str = "", # options: false | true
resume_from_checkpoint: str = None, # either training checkpoint or final adapter
prompt_template_name: str = "alpaca", # The prompt template to use, will default to alpaca.
):
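For reference, here is a minimal sketch of how the blue-line hyperparameters above would typically be wired up with PEFT and transformers. This is an assumption about the usual alpaca-lora-style setup, not the exact internals of finetune.py; the module and argument names are the standard peft/transformers ones. Note that the effective batch size of 128 is reached via gradient accumulation over micro-batches of 8.

from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

batch_size = 128
micro_batch_size = 8
# effective batch of 128 = 8 per step x 16 accumulation steps
gradient_accumulation_steps = batch_size // micro_batch_size  # 16

lora_config = LoraConfig(
    r=16,                                  # lora_r
    lora_alpha=16,                         # lora_alpha
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # blue line: attention q/v only
    bias="none",
    task_type=TaskType.CAUSAL_LM,
)

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapter weights are trained

training_args = TrainingArguments(
    output_dir="./lora-alpaca/fine-tuned_outputs",  # path taken from the comment above
    per_device_train_batch_size=micro_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    num_train_epochs=10,
    learning_rate=3e-4,
)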
-----------------------params for Green line-----------------------
Same params as the blue line above, except run from the CLI with:
python finetune.py \
    --base_model='meta-llama/Llama-2-7b-hf' \
    --data_path='yahma/alpaca-cleaned' \
    --num_epochs=10 \
    --cutoff_len=512 \
    --group_by_length \
    --output_dir='./lora-alpaca-finetuned-outputs' \
    --lora_target_modules='[q_proj,k_proj,v_proj,o_proj]' \
    --lora_r=16 \
    --micro_batch_size=8
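On the LoRA side, the green run differs only in targeting all four attention projections instead of two, roughly doubling the number of trainable adapter parameters. A minimal sketch of the resulting config, assuming finetune.py parses the flag list above into a standard PEFT LoraConfig:

from peft import LoraConfig, TaskType

green_lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    # all four attention projections instead of just q_proj/v_proj
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type=TaskType.CAUSAL_LM,
)

The --group_by_length flag is presumably forwarded to TrainingArguments(group_by_length=True), which buckets samples of similar length for speed; the comment in the blue-line params already notes this is "faster, but produces an odd training loss curve".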