Hello, I would like to ask about this step in the README: "The default batch size is 512. When GPU memory is insufficient, you can proceed with training by adjusting the value of --gradient_accumulation_steps." How do I do this specifically?
My GPU reported that its memory was full, so I looked at the code. There is this line in the train function:

`args.train_batch_size = args.train_batch_size // args.gradient_accumulation_steps`

So during training I used `python train.py --XX --XX --XX --gradient_accumulation_steps 3`, and then it ran.
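For anyone else hitting this, here is a minimal sketch of what that line implies, assuming a standard PyTorch-style training loop; the model, dataset, and dimensions below are placeholders for illustration, not this repo's actual code. Gradients are accumulated over `gradient_accumulation_steps` smaller micro-batches before each optimizer step, so per-step memory drops while the effective batch size stays close to the full 512 (integer division makes it 3 × 170 = 510 here):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

gradient_accumulation_steps = 3          # e.g. --gradient_accumulation_steps 3
full_batch_size = 512                    # the default batch size from the README
micro_batch_size = full_batch_size // gradient_accumulation_steps  # 170

# Synthetic data standing in for the real dataset (placeholder shapes).
dataset = TensorDataset(torch.randn(2048, 10), torch.randint(0, 2, (2048,)))
loader = DataLoader(dataset, batch_size=micro_batch_size)

model = torch.nn.Linear(10, 2)           # placeholder for the real model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = torch.nn.CrossEntropyLoss()

optimizer.zero_grad()
for step, (x, y) in enumerate(loader):
    # Scale the loss so the accumulated gradient matches one full-batch step.
    loss = loss_fn(model(x), y) / gradient_accumulation_steps
    loss.backward()                      # gradients accumulate across micro-batches
    if (step + 1) % gradient_accumulation_steps == 0:
        optimizer.step()                 # one optimizer step per 3 micro-batches
        optimizer.zero_grad()
```

Raising `--gradient_accumulation_steps` shrinks the micro-batch that has to fit in GPU memory, at the cost of more forward/backward passes per optimizer step.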