ActorNetwork - sample_normal method log_probs issue #59

zenineasa · 2023-02-24T22:09:23Z

The following line can produce NaN. If 'self.max_action' is large enough for 'action' to exceed 1 in magnitude, the argument of the logarithm, 1-action.pow(2)+self.reparam_noise, becomes negative, and the logarithm of a negative number evaluates to NaN.

log_probs -= T.log(1-action.pow(2)+self.reparam_noise)

Youtube-Code-Repository/ReinforcementLearning/PolicyGradient/SAC/networks.py, line 130 in a600647
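To make the failure concrete, here is a minimal scalar sketch in plain Python rather than PyTorch. It assumes 'action' is a tanh-squashed sample multiplied by 'max_action', and it uses an assumed value of 1e-6 for 'self.reparam_noise'; the helper names are hypothetical and not taken from the repository. It shows the argument of the logarithm going negative once the scaled action exceeds 1, and one possible workaround: compute the correction from the unscaled tanh output and apply the scaling afterwards.

```python
import math

REPARAM_NOISE = 1e-6  # assumed small constant standing in for self.reparam_noise


def log_argument(action: float) -> float:
    """Argument of the logarithm in the quoted line, for a scalar action."""
    return 1 - action ** 2 + REPARAM_NOISE


def scaled_action_and_correction(raw: float, max_action: float):
    """Hypothetical workaround: correct with the unscaled tanh, scale afterwards."""
    squashed = math.tanh(raw)  # always within [-1, 1]
    # The argument is at least REPARAM_NOISE, so the logarithm stays finite.
    correction = math.log(1 - squashed ** 2 + REPARAM_NOISE)
    return squashed * max_action, correction


# A squashed value of 0.9 scaled by max_action = 2.0 gives 1.8,
# which makes the argument of the logarithm negative.
bad_argument = log_argument(0.9 * 2.0)

# The workaround keeps the argument positive for any max_action.
action, correction = scaled_action_and_correction(raw=1.5, max_action=2.0)
```

In PyTorch, T.log applied to a negative float returns NaN, whereas math.log raises ValueError, which is why the sketch only inspects the sign of the argument. Scaling by 'max_action' also adds a constant log(max_action) term per action dimension to the change of variables; the sketch leaves that term out.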
