Wrong Loss Function? #7

Open
oduerr opened this issue Nov 20, 2016 · 0 comments
oduerr commented Nov 20, 2016

Hello

Has anybody successfully trained an agent using this code? We can't get it to do anything useful on pinball (VideoPinball-v0).

There seems to be a subtle bug in the calculation of the loss function. According to the Nature paper (see Algorithm 1), the target should use the maximum Q-value of the target network over all actions. However, in the dqn code, in function doMinibatch (line 122), it is computed as

q_target_max = np.argmax(q_target, axis=1)

which returns the index of the maximum (the action), not the maximum Q-value itself. Shouldn't that be

q_target_max = np.amax(q_target, axis=1)
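
To illustrate the difference, here is a minimal sketch with made-up numbers. Only `q_target` and the two NumPy calls come from the code in question; `rewards`, `terminal`, `gamma` and the Q-values themselves are assumptions for illustration, and the target formula follows my reading of Algorithm 1 rather than the repository's actual implementation.

```python
import numpy as np

# Hypothetical target-network output: 3 next states, 4 actions each.
q_target = np.array([
    [0.1, 2.5, 0.3, 1.0],
    [4.0, 0.2, 0.1, 0.5],
    [0.0, 0.7, 0.9, 3.2],
])

# argmax gives the *index* of the best action per row ...
best_action = np.argmax(q_target, axis=1)   # [1, 0, 3]
# ... while amax gives the best action's *Q-value* per row.
best_value = np.amax(q_target, axis=1)      # [2.5, 4.0, 3.2]

# Assumed minibatch data (not taken from the repository).
gamma = 0.99
rewards = np.array([1.0, 0.0, -1.0])
terminal = np.array([False, False, True])

# Target as in Algorithm 1: y = r for terminal transitions,
# otherwise y = r + gamma * max_a' Q_target(s', a').
y = rewards + gamma * best_value * (~terminal)

# The same formula with argmax bootstraps from action indices instead.
y_wrong = rewards + gamma * best_action * (~terminal)
```

With argmax, the bootstrap term depends only on which action is best, not on how good it is, so the targets carry no information about the actual value estimates.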

Cheers,
Oliver
