About NoamLR #7

Open
ShaneTian opened this issue Mar 6, 2022 · 0 comments

Comments

@ShaneTian

def get_lr(self):
last_epoch = max(1, self.last_epoch)
scale = self.warmup_steps ** 0.5 * min(last_epoch ** (-0.5), last_epoch * self.warmup_steps ** (-1.5))
return [base_lr * scale for base_lr in self.base_lrs]

The custom NoamLR produces the same LR for the first two steps, for example:

warmup_steps = 10
lr = 0.01
| step | before (LR at this step) | after (LR after the next scheduler step) |
| ---- | ------------------------ | ---------------------------------------- |
| 0    | 0.001                    | 0.001                                     |
| 1    | 0.001                    | 0.002                                     |
| 2    | 0.002                    | 0.003                                     |

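To see where those numbers come from, here is a minimal, self-contained sketch that reproduces them. Only get_lr is quoted from this repo; the wrapper class, the dummy model, and the optimizer setup are assumptions for illustration:

import torch
from torch.optim.lr_scheduler import _LRScheduler

class NoamLR(_LRScheduler):
    # Hypothetical minimal wrapper around the get_lr quoted above.
    def __init__(self, optimizer, warmup_steps, last_epoch=-1):
        self.warmup_steps = warmup_steps
        super().__init__(optimizer, last_epoch)

    def get_lr(self):
        last_epoch = max(1, self.last_epoch)
        scale = self.warmup_steps ** 0.5 * min(last_epoch ** (-0.5), last_epoch * self.warmup_steps ** (-1.5))
        return [base_lr * scale for base_lr in self.base_lrs]

optimizer = torch.optim.Adam(torch.nn.Linear(4, 4).parameters(), lr=0.01)
scheduler = NoamLR(optimizer, warmup_steps=10)
for step in range(4):
    # steps 0 and 1 both print ~0.001, then ~0.002, ~0.003
    print(step, scheduler.get_last_lr())
    optimizer.step()
    scheduler.step()
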
There are two ways to fix this:

  • last_epoch = self.last_epoch + 1, like ESPnet
    def get_lr(self): 
        last_epoch = self.last_epoch + 1
        scale = self.warmup_steps ** 0.5 * min(last_epoch ** (-0.5), last_epoch * self.warmup_steps ** (-1.5)) 
        return [base_lr * scale for base_lr in self.base_lrs] 
  • use LambdaLR directly
    noam_scale = lambda epoch: (warmup_steps ** 0.5) * min((epoch + 1) ** -0.5, (epoch + 1) * (warmup_steps ** -1.5))
    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=noam_scale)

Of course, the above two approaches are equivalent.
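A quick check of that claim (warmup_steps is just the example value from above):

warmup_steps = 10

def espnet_scale(epoch):
    # fix 1: shift last_epoch by one inside get_lr
    last_epoch = epoch + 1
    return warmup_steps ** 0.5 * min(last_epoch ** (-0.5), last_epoch * warmup_steps ** (-1.5))

# fix 2: the factor that LambdaLR multiplies into the base LR
noam_scale = lambda epoch: (warmup_steps ** 0.5) * min((epoch + 1) ** -0.5, (epoch + 1) * (warmup_steps ** -1.5))

# both produce identical scale factors at every step
assert all(abs(espnet_scale(e) - noam_scale(e)) < 1e-12 for e in range(100))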
