Calculating Reward on Game End #38
Comments
Both players died here, during the same rollout. Then one player won in the last rollout. The rewards are aggregated only per rollout.
But a single player needs to die twice for the game to be over. They each died once according to the record, so the game should not be over. What I believe happened is that a player died a 2nd time and that this info was not captured in our reward aggregation.
It doesn't need to be captured, because it got rewarded for the win.
But doesn't that mean the bot that died a 2nd time never gets the negative reward for the 2nd death? Sure, it "loses" and gets the -5, but it won't necessarily make the connection that the 2nd death is the cause, as it doesn't see the 2nd death in the rewards. Or am I misunderstanding something about our algo?
The reward is a single scalar: the sum of all rewards. The agent doesn't know how it's composed.
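To illustrate the point, here is a minimal sketch of how named reward components collapse into one scalar. The function name and the component keys are hypothetical; the -3.0 and -5.0 values are only borrowed from the numbers mentioned in this thread, not taken from the actual reward code.

```python
# Hypothetical sketch: the learner only ever sees the summed scalar,
# not the individual components that went into it.
def scalar_reward(components):
    """Collapse a dict of named reward components into a single float."""
    return sum(components.values())


# Illustrative values only (death penalty and loss penalty from the thread).
final_step_with_death = {"death": -3.0, "loss": -5.0}
final_step_without_death = {"loss": -5.0}

with_death = scalar_reward(final_step_with_death)
without_death = scalar_reward(final_step_without_death)
```

The two scalars differ, so a missing death term still changes the training signal even though the agent never sees the breakdown.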
If you look at the stream of rewards below (entire Game #2), you will see that it ends in victory for Dire; however, you only see 1 death each from both agents. Also, based on `tower_hp`, it looks like the tower was not even close to dying, meaning the game ended because the Radiant agent died a 2nd time, but the -3.0 kill reward for Player 0 does not show up in the rewards a second time. This makes me believe that we don't capture the rewards between the last reward sync and the game end.
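One possible way to avoid losing those trailing rewards is to flush whatever has accumulated since the last sync when the game ends. The sketch below is an assumption about how that could look, not the project's actual code: `RewardAggregator`, `add_step`, `flush`, and `end_game` are hypothetical names, and the reward values are illustrative.

```python
from collections import defaultdict


class RewardAggregator:
    """Hypothetical helper that sums reward components once per rollout."""

    def __init__(self, rollout_size):
        self.rollout_size = rollout_size
        self.pending = []  # per-step reward dicts since the last sync
        self.synced = []   # one aggregated dict per completed sync

    def add_step(self, components):
        self.pending.append(components)
        if len(self.pending) == self.rollout_size:
            self.flush()

    def flush(self):
        """Aggregate and store everything collected since the last sync."""
        if not self.pending:
            return
        total = defaultdict(float)
        for step in self.pending:
            for name, value in step.items():
                total[name] += value
        self.synced.append(dict(total))
        self.pending = []

    def end_game(self, final_components=None):
        """Flush the partial rollout so rewards right before game end are kept."""
        if final_components:
            self.pending.append(final_components)
        self.flush()


# Illustrative game: the 2nd death happens after the last full rollout.
agg = RewardAggregator(rollout_size=3)
for step in [{"xp": 0.5}, {"death": -3.0}, {"xp": 0.25}, {"death": -3.0}]:
    agg.add_step(step)
agg.end_game({"loss": -5.0})

total_death = sum(r.get("death", 0.0) for r in agg.synced)
```

Without the flush in `end_game`, the second `death` entry would stay in `pending` and never reach the aggregated record, which matches the symptom described above.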