Prodigy (or DAdapt) calculates the learning rate itself. As I understand it, the lr argument is actually a ratio applied to that calculated lr. Does it still work that way if I use cosine_with_restarts? Will the ratio change over training in the same way the lr does when AdamW is used?
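To make the question concrete, here is a minimal sketch of what I would expect to happen, not a confirmed description of either library. It assumes the optimizer's step size is its own estimate times the lr argument, and that the scheduler rescales lr the same way it would for AdamW. The names `cosine_with_restarts_factor`, `effective_step_size`, `d_estimate` and `lr_ratio` are made up for illustration.

```python
import math


def cosine_with_restarts_factor(step, total_steps, num_cycles=1):
    """Multiplier in [0, 1]: a cosine decay that restarts num_cycles times."""
    progress = step / max(1, total_steps)
    if progress >= 1.0:
        return 0.0
    # The modulo sends the curve back to 1.0 at the start of each cycle.
    cycle_position = (num_cycles * progress) % 1.0
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_position)))


def effective_step_size(d_estimate, lr_ratio, step, total_steps, num_cycles=1):
    # Assumption: step size = (optimizer's own estimate) * (scheduled lr ratio).
    factor = cosine_with_restarts_factor(step, total_steps, num_cycles)
    return d_estimate * lr_ratio * factor


if __name__ == "__main__":
    # d_estimate is a placeholder value, not something measured from a run.
    for step in (0, 25, 50, 75, 100):
        print(step, effective_step_size(1e-4, 1.0, step, 100, num_cycles=2))
```

If that assumption holds, the ratio would follow the cosine curve and jump back up at every restart, just as the lr itself does with AdamW.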