I really like your work and would like to ask for a clarification about your new observation on training a flat A3C agent without the meta-controller. In this case, are the sub-goals randomly generated every 'c' timesteps (instead of the meta-controller outputting the sub-goal)?
Thanks,
Dai
Yes, the sub-goals were randomly generated every c=100 time-steps. I also found that fixing the sub-goal to be just the first one also works for some seeds. This only works with the feature-control pseudo-reward, though.
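
For anyone reading along, here is a rough sketch of what "randomly generated every c=100 time-steps" could look like. It is not taken from the repository: the class name `RandomSubgoalSchedule`, the `num_subgoals` parameter, and the assumption that a sub-goal is a discrete index sampled uniformly are all hypothetical, chosen only to illustrate the resampling schedule.

```python
import random


class RandomSubgoalSchedule:
    """Hypothetical helper: holds one sub-goal and resamples it every c steps.

    Assumes sub-goals are discrete indices in [0, num_subgoals) drawn
    uniformly at random; the real agent may represent them differently.
    """

    def __init__(self, num_subgoals, c=100, seed=None):
        self.num_subgoals = num_subgoals
        self.c = c
        self.rng = random.Random(seed)
        self.t = 0
        self.current = None

    def step(self):
        # Draw a fresh sub-goal at t = 0, c, 2c, ...; otherwise keep the old one.
        if self.t % self.c == 0:
            self.current = self.rng.randrange(self.num_subgoals)
        self.t += 1
        return self.current


schedule = RandomSubgoalSchedule(num_subgoals=8, c=100, seed=0)
goals = [schedule.step() for _ in range(300)]
```

In this sketch the worker would call `schedule.step()` once per environment step and condition its policy and pseudo-reward on the returned sub-goal, in place of the meta-controller's output.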