Move function `make_optimizer_and_scheduler` to policy #401
Conversation
FYI I wanted to avoid this design (a method on the policy instantiating the optimizer), but I don't think we can ^^
```diff
- optimizer, _ = make_optimizer_and_scheduler(cfg, policy)
+ optimizer, _ = policy.make_optimizer_and_scheduler(cfg)
```
Could we add `optimizer, _ = policy.make_optimizer_and_scheduler(cfg)` to another place in our unit tests? It feels like this code logic should be tested for all policies, not just `act`. Thanks ;)
@Cadene Should we change `test_act_backbone_lr` to a more general function for all policies, like:
```python
@pytest.mark.parametrize(
    "env_name,policy_name",
    [
        ("pusht", "tdmpc"),
        ("pusht", "diffusion"),
        ("pusht", "vqbet"),
        ("aloha", "act"),
    ],
)
def test_policy_backbone_lr(env_name, policy_name):
    """
    Test that a policy can be instantiated with a different learning rate for the backbone.
    """
    cfg = init_hydra_config(
        DEFAULT_CONFIG_PATH,
        overrides=[
            f"env={env_name}",
            f"policy={policy_name}",
            f"device={DEVICE}",
            "training.lr_backbone=0.001",
            "training.lr=0.01",
        ],
    )
    ...
```
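For illustration, the elided test body could end with an assertion along these lines (a hypothetical sketch: the real test would get `optimizer` from `policy.make_optimizer_and_scheduler(cfg)`, while here a hand-built stand-in mimics the `param_groups` attribute of a `torch.optim` optimizer):

```python
# Hypothetical stand-in mimicking torch.optim's param_groups attribute.
# In the real test this would be the optimizer returned by the policy.
class FakeOptimizer:
    def __init__(self):
        self.param_groups = [
            {"params": [], "lr": 0.01},   # main parameters (training.lr)
            {"params": [], "lr": 0.001},  # backbone parameters (training.lr_backbone)
        ]

optimizer = FakeOptimizer()

# The overrides set training.lr=0.01 and training.lr_backbone=0.001,
# so the optimizer should expose exactly those two learning rates.
lrs = sorted(group["lr"] for group in optimizer.param_groups)
assert lrs == [0.001, 0.01]
```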
Thanks for drafting this up @michel-aractingi.
So, thinking about this more deeply, I don't think the policy should return the optimizer and scheduler objects. This is because someone should be able to freely decide what these should be. Eg: Want to use exponential decaying LR on ACT? Fine.
What I do think should be provided by the policy is a list of parameter groups for optimization (which the user may use or may ignore - but they are there as a suggestion and for convenience).
So, supposing you agree with me, I would consider:
- Changing the method name to `get_optimizer_param_groups`, and returning a list of param groups for an optimizer.
- Using a standard `training.lr_scheduler` parameter in all policies. It can be None.
- Using a standard `optimizer` parameter in all policies. It is required.
- Using Hydra's class instantiation tooling to handle creating the optimizer and scheduler objects: https://hydra.cc/docs/advanced/instantiate_objects/overview/
- And using the param groups from the policy to pass to the optimizer class on instantiation.

What we would achieve with this is one common interface for all policies for setting the optimizer and scheduler via the Hydra config.
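The proposal above could look roughly like this (a toy sketch: the class, the method name, and the plain-list "parameters" are illustrative stand-ins, not the actual lerobot API):

```python
# Toy sketch of the proposed interface. The policy only exposes parameter
# groups as a suggestion; the caller decides which optimizer to build
# (e.g. via Hydra's instantiate), or may ignore the groups entirely.

class ToyPolicy:
    """Stand-in for a policy whose backbone wants its own learning rate."""

    def __init__(self):
        # In a real policy these would be nn.Parameter objects.
        self.backbone_params = ["backbone.w1", "backbone.w2"]
        self.head_params = ["head.w"]

    def get_optimizer_param_groups(self, lr, lr_backbone):
        # Groups follow the torch.optim param-group convention: each dict
        # has a "params" key plus per-group overrides such as "lr".
        return [
            {"params": self.head_params, "lr": lr},
            {"params": self.backbone_params, "lr": lr_backbone},
        ]


policy = ToyPolicy()
groups = policy.get_optimizer_param_groups(lr=0.01, lr_backbone=0.001)
# A caller could now pass these to any optimizer, e.g.
# torch.optim.AdamW(groups), keeping the choice of optimizer in the config.
```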
What this does
Moves the function `make_optimizer_and_scheduler` from `train.py` to the policy class. That function becomes increasingly long as we add new policies. After this PR, the optimizer and scheduler of each policy can be created by calling `policy.make_optimizer_and_scheduler()`.
How it was tested
Tested by running a training command for the diffusion policy, ACT, and TDMPC to make sure there are no syntax issues. Also ran the unit tests in `test_policies`.
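The shape of the change can be illustrated with a toy example (not the real lerobot classes; the dict-based "optimizer" is a placeholder for a `torch.optim` object):

```python
# Toy illustration of moving optimizer construction from train.py into
# the policy class, as this PR does. Each policy owns its own recipe, so
# train.py no longer needs a growing if/elif over policy names.

class ToyDiffusionPolicy:
    def make_optimizer_and_scheduler(self, cfg):
        # Placeholder objects; a real policy would build torch.optim.Adam
        # and an LR scheduler here from its own hyperparameters.
        optimizer = {"type": "adam", "lr": cfg["lr"]}
        scheduler = None  # this toy policy uses no LR scheduler
        return optimizer, scheduler


# Call site in train.py after the PR: one line, same for every policy.
policy = ToyDiffusionPolicy()
optimizer, scheduler = policy.make_optimizer_and_scheduler({"lr": 1e-4})
```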