
[BUG] clip_grad_norm for zero_optimization mode is not working #6767

Open
chengmengli06 opened this issue Nov 20, 2024 · 3 comments
Labels: bug, training

@chengmengli06

set "gradient_clipping" in deepspeed does not work, look into the source code in deepspeed.runtime.engine.DeepSpeedEngine,in line 2101

    def _take_model_step(self, lr_kwargs, block_eigenvalue={}):
        if self.gradient_clipping() > 0.0:
            # This branch fires only when none of fp16/bf16/amp/ZeRO is enabled.
            if not (self.fp16_enabled() or self.bfloat16_enabled() or self.amp_enabled() or self.zero_optimization()):
                self.clip_fp32_gradients()
            elif self.amp_enabled():
                # AMP's recommended way of doing clipping
                # https://nvidia.github.io/apex/advanced.html#gradient-clipping
                master_params = amp.master_params(self.optimizer)
                clip_grad_norm_(parameters=master_params, max_norm=self.gradient_clipping(), mpu=self.mpu)
        self.optimizer.step()

Thus, when zero_optimization is enabled, neither branch is taken, and gradient clipping appears to do nothing at all!
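For reference, this is a minimal sketch of the configuration in question, assuming ZeRO stage 2 and fp16 (the model, batch size, and values here are illustrative placeholders, not taken from this issue):

    import torch
    import deepspeed

    # Tiny illustrative model; any torch.nn.Module works here.
    model = torch.nn.Linear(10, 10)

    ds_config = {
        "train_batch_size": 16,
        "gradient_clipping": 1.0,           # the setting this issue is about
        "zero_optimization": {"stage": 2},  # ZeRO stage 2
        "fp16": {"enabled": True},
    }

    engine, optimizer, _, _ = deepspeed.initialize(
        model=model,
        model_parameters=model.parameters(),
        config=ds_config,
    )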

@tjruwase
Contributor

@chengmengli06, this is an incorrect reading of the code. Gradient clipping is handled in the respective optimizer implementations (a conceptual sketch of what they do follows the list), such as:

  1. bf16 optimizer
  2. fp16 optimizer
  3. ZeRO optimizer
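All of these apply standard global-norm clipping inside their own step() logic, which is why nothing needs to happen in _take_model_step. Conceptually, the clipping they perform looks like this (a simplified sketch, not DeepSpeed's actual implementation):

    import torch

    def clip_by_global_norm(grads, max_norm, eps=1e-6):
        # Global L2 norm across all gradient tensors.
        total_norm = torch.norm(torch.stack([g.detach().norm(2) for g in grads]), 2)
        # If the global norm exceeds max_norm, scale every gradient down
        # by max_norm / total_norm so the combined norm equals max_norm.
        clip_coef = max_norm / (total_norm + eps)
        if clip_coef < 1.0:
            for g in grads:
                g.detach().mul_(clip_coef)
        return total_norm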

@chengmengli06
Author

I found it and verified that it does work under ZeRO stage 2 mode. Thanks!

@chengmengli06
Author

chengmengli06 commented Nov 21, 2024

@tjruwase, another question: how can I log the pre-clip and post-clip gradient norms to TensorBoard? Is there an interface to get the gradient norms before and after clipping?
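One possible approach, sketched under two assumptions: that the engine exposes get_global_grad_norm() (present in recent DeepSpeed versions) and that it returns the global norm measured before clipping. With global-norm clipping, the post-clip norm is then simply min(pre_clip_norm, max_norm). The engine and data_loader are assumed to come from the user's own training setup:

    from torch.utils.tensorboard import SummaryWriter

    writer = SummaryWriter("runs/grad_norms")
    max_norm = 1.0  # must match "gradient_clipping" in the DeepSpeed config

    for step, batch in enumerate(data_loader):
        loss = engine(batch)   # forward pass returning the loss
        engine.backward(loss)
        engine.step()

        pre_clip = engine.get_global_grad_norm()  # assumption: pre-clip norm
        if pre_clip is not None:
            writer.add_scalar("grad_norm/pre_clip", pre_clip, step)
            writer.add_scalar("grad_norm/post_clip", min(pre_clip, max_norm), step)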
