Has anybody successfully trained an agent using this code? We can't get pinball (VideoPinball-v0) to do anything useful.
There seems to be a subtle bug in the calculation of the loss function. According to the Nature paper (see Algorithm 1), the target should use the maximum Q-value of the target network over all actions. However, in the dqn code, in the function doMinibatch (line 122),
it is
q_target_max = np.argmax(q_target, axis=1)
and thus the index of the largest Q-value, not the maximum itself. Shouldn't that be
q_target_max = np.amax(q_target, axis=1)
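To illustrate the difference, here is a minimal sketch with made-up numbers. The array contents, `rewards`, `terminal`, `gamma` and the helper `td_target` are assumptions for illustration only and are not taken from the repository; only `q_target` and the two NumPy calls come from the code discussed above.

```python
import numpy as np

# Hypothetical target-network outputs: 3 transitions, 4 actions each.
q_target = np.array([
    [0.1, 0.5, 0.2, 0.0],
    [1.0, 0.3, 0.7, 0.9],
    [0.0, 0.2, 0.4, 2.5],
])
rewards = np.array([0.0, 1.0, 0.0])          # made-up rewards
terminal = np.array([False, False, True])    # made-up episode-end flags
gamma = 0.99                                 # assumed discount factor

# argmax returns the INDEX of the best action per row ...
best_action = np.argmax(q_target, axis=1)
# ... while amax returns the Q-VALUE of the best action per row.
best_value = np.amax(q_target, axis=1)


def td_target(rewards, next_value, terminal, gamma):
    """y = r for terminal transitions, else y = r + gamma * next_value."""
    return rewards + gamma * next_value * (~terminal)


# Using the action index as if it were a Q-value gives a wrong target.
y_wrong = td_target(rewards, best_action, terminal, gamma)
# Using the maximum Q-value matches the target described in Algorithm 1.
y_right = td_target(rewards, best_value, terminal, gamma)

print("argmax:", best_action)
print("amax:  ", best_value)
print("target with argmax:", y_wrong)
print("target with amax:  ", y_right)
```

With argmax, the target depends on which action slot happens to be best rather than on how good that action is, so the loss would be pulling the network towards values that have no relation to the expected return.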
Cheers,
Oliver