When I use DPO to train the 7B StarCoder model, an OOM error occurs. I am using 16 A100 GPUs with DeepSpeed ZeRO-3, TRL, and transformers. The OOM happens as soon as the code reaches `AutoModelForCausalLM.from_pretrained`, yet Qwen-Coder trained with the same setup does not hit this problem. Are there any special settings in StarCoder's model structure that make it unsuitable for DPO (Direct Preference Optimization)?
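For context, a rough back-of-the-envelope estimate (assuming fp32 checkpoints and that each rank materializes the full weights before ZeRO-3 partitions them): 2 models × 7e9 params × 4 bytes ≈ 56 GB per process, which already exceeds a 40 GB A100 and leaves little headroom on an 80 GB one.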
My code is:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, HfArgumentParser

# DPOTrainingArguments, MyDPOTrainer, and down_file are our own helpers.
parser = HfArgumentParser(DPOTrainingArguments)
args = parser.parse_args_into_dataclasses()[0]
down_file(args.model_path, args.pretrained_model)

# Policy model and frozen reference model are loaded from the same checkpoint.
model = AutoModelForCausalLM.from_pretrained(args.model_path)
model_ref = AutoModelForCausalLM.from_pretrained(args.model_path)
tokenizer = AutoTokenizer.from_pretrained(args.model_path)

dpo_trainer = MyDPOTrainer(
    model,
    model_ref,
    args=args,
    ....
)
```
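One workaround I am considering, in case the problem is simply that each rank materializes two full fp32 copies of the 7B model before DeepSpeed shards them: make the ZeRO-3 config visible to transformers before calling `from_pretrained` (so the checkpoint is loaded under `zero.Init` and each rank only holds its shard) and load in bf16. A minimal sketch, assuming a ZeRO-3 config file named `ds_zero3_config.json` (the import path of `HfDeepSpeedConfig` varies with the transformers version):

```python
import torch
from transformers import AutoModelForCausalLM
# In older transformers versions: from transformers.deepspeed import HfDeepSpeedConfig
from transformers.integrations import HfDeepSpeedConfig

model_path = "path/to/starcoder-7b"  # placeholder; args.model_path in our script

# Constructing HfDeepSpeedConfig *before* from_pretrained makes transformers
# load the checkpoint under DeepSpeed zero.Init, so each rank materializes
# only its ZeRO-3 shard instead of a full copy of the weights.
dschf = HfDeepSpeedConfig("ds_zero3_config.json")  # assumed config path; keep this reference alive

model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16)      # bf16 halves load-time memory vs. fp32
model_ref = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16)
```

I have not verified this fixes StarCoder specifically, so I would still like to know whether anything in its architecture interacts badly with DPO.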