Parameters for Breakout #125
Comments
I used the default parameters from the code: gamma 0.99, lambda 1.0, learning rate 1e-4, gradient clip 40. I also can't reproduce the results for Breakout after 24h with 16 workers (3 independent training runs): However: #87 (comment)
The machine had 24 cores, so I ran it again for 14h with 8 workers (3 independent training runs): And A3C (16 workers) from https://arxiv.org/pdf/1602.01783.pdf (page 5) looks better:
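For anyone comparing settings, here is a minimal sketch of how gamma and lambda interact, assuming lambda is the generalized advantage estimation (GAE) parameter. The function names are illustrative and not copied from this repository. With lambda 1.0 the advantage reduces to the discounted return minus the value estimate, so that setting may behave much like the plain n-step estimate.

```python
def discount(values, factor):
    """Discounted cumulative sum: out[t] = values[t] + factor * out[t + 1]."""
    out = [0.0] * len(values)
    running = 0.0
    for t in reversed(range(len(values))):
        running = values[t] + factor * running
        out[t] = running
    return out


def gae_advantages(rewards, values, bootstrap_value, gamma=0.99, lam=1.0):
    """Generalized advantage estimates for one rollout (illustrative helper).

    rewards[t] and values[t] are per-step quantities; bootstrap_value is the
    value estimate for the state after the last step (0.0 if the episode ended).
    """
    extended = list(values) + [bootstrap_value]
    # One-step TD residuals: r_t + gamma * V(s_{t+1}) - V(s_t)
    deltas = [
        rewards[t] + gamma * extended[t + 1] - extended[t]
        for t in range(len(rewards))
    ]
    # Residuals are accumulated with factor gamma * lambda
    return discount(deltas, gamma * lam)


if __name__ == "__main__":
    rewards = [1.0, 0.0, 1.0]
    values = [0.5, 0.5, 0.5]
    print(gae_advantages(rewards, values, 0.0, gamma=0.99, lam=1.0))
```

Lowering lambda below 1.0 trades variance for bias in the advantage estimate, which is one knob to try alongside the learning rate.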
I realized something, in the above A3C paper:
Unfortunately, the paper does not seem to state what the learning rate and gradient norm clipping values were. They could be different from the default ones used in the code here.
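To make the clipping value concrete, here is a small pure-Python sketch of clipping by global norm, assuming "gradient clip 40" means the joint norm of all gradients is capped at 40. The helper below is hypothetical and only mirrors the arithmetic; it is not the code used in this repository.

```python
import math


def clip_by_global_norm(grads, clip_norm=40.0):
    """Rescale a list of gradient vectors so their joint L2 norm is at most clip_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    if global_norm <= clip_norm:
        # Already within the limit: gradients pass through unchanged
        return [list(grad) for grad in grads], global_norm
    scale = clip_norm / global_norm
    return [[g * scale for g in grad] for grad in grads], global_norm


if __name__ == "__main__":
    clipped, norm = clip_by_global_norm([[30.0, 40.0]], clip_norm=40.0)
    print(norm, clipped)
```

Logging the norm before clipping during training shows how often the threshold is actually hit, which helps judge whether 40 is too loose or too tight for a given learning rate.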
Has anyone found optimal hyperparameters?
I haven't (tried a bit), but this is helpful:
Here are the hyperparams for the original A3C work, but for Seems possible to find "working" ones for By the way, be aware that
Hello there!
I tried to implement my own version of A3C using TensorFlow (here), but did not get good results. I then switched to the same network architecture as this implementation (universe starter agent) to see whether it would change the results. Initially, I thought the default convolutional layers from TensorFlow (tensorflow.contrib.layers) were responsible, so I used the same convolution function used here, but to no avail. I have also checked the flow of my code against the universe starter agent and found them to be the same.
The environment that is giving me problems is Breakout. On Pong, for example, my code (with the current parameters) works very well. But on Breakout I can't get past a score of 40. I have already tried several settings (different network architectures, learning rates, frame skipping), still with no success. Has anyone tried this code on Breakout? What parameters did you use? Since I have limited computational power, it is hard for me to run many tests, which is why I am posting this question.
Thank you all!