Flat A3C agent #1

dai-dao · 2017-12-24T23:29:42Z

Hi,

I really like your work and would like to ask for a clarification about your new observation on training a flat A3C agent without the meta-controller. In this case, are the sub-goals randomly generated every 'c' time-steps, instead of being output by the meta-controller?

Thanks,
Dai

Nat-D · 2017-12-25T03:16:40Z

Hi Dai,

Yes, the sub-goals were randomly generated every c=100 time-steps. I also found that fixing the sub-goal to just the first one works for some seeds. This only works with the feature-control pseudo reward, though.

Best,
Nat
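
For readers following the thread, the schedule described in the reply above might be sketched roughly as follows. This is only an illustration, not code from the repository: the function name `subgoal_schedule`, the `num_subgoals` parameter, and uniform sampling are all assumptions, and "fixing the sub-goal to the first one" is interpreted here as keeping the first sampled sub-goal for the whole run.

```python
import random


def subgoal_schedule(num_steps, num_subgoals, c=100, fixed=False, seed=0):
    """Return the sub-goal index active at each time-step (hypothetical helper).

    Assumption: sub-goals are drawn uniformly from range(num_subgoals).
    """
    rng = random.Random(seed)
    current = rng.randrange(num_subgoals)  # first sub-goal
    schedule = []
    for t in range(num_steps):
        # Resample every c steps, unless the first sub-goal is kept fixed.
        if t > 0 and t % c == 0 and not fixed:
            current = rng.randrange(num_subgoals)
        schedule.append(current)
    return schedule


# Random sub-goal every 100 steps, with no meta-controller involved.
random_goals = subgoal_schedule(num_steps=350, num_subgoals=32)

# Variant where the first sub-goal is held for the entire run.
fixed_goals = subgoal_schedule(num_steps=350, num_subgoals=32, fixed=True)
```

In either variant the flat agent would simply receive the scheduled sub-goal as an extra input, in place of the meta-controller's output.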

nina124 · 2017-12-30T11:50:29Z

Does the "randomly generated meta-action" setup also work with the pixel-control pseudo reward?
