By default, the `RewardCalculator` class collects the current true state of the environment and passes it on to each reward calculator subclass.

cage-challenge-4/CybORG/Shared/RewardCalculator.py
Lines 39 to 45 in 313bf33

However, neither `BlueRewardMachine` nor `EmptyRewardCalculator` uses this state in its `calculate_reward` method. Additionally, collecting this state is very time-consuming, taking up 40-60% of the time spent stepping through the environment.

If both `RewardCalculator` subclasses skip this state collection, the performance of the environment improves dramatically. For example, stepping through 500 steps goes from 12s down to 4s.
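
One possible shape for the change is sketched below as a self-contained toy, not as CybORG's actual code. Every name in it (`FakeEnvironment`, `RewardCalculatorSketch`, `needs_true_state`, `step_reward`) is hypothetical and only illustrates a per-subclass opt-out, so that the expensive state collection runs only for calculators that read the state.

```python
class FakeEnvironment:
    """Hypothetical stand-in for the simulator; not CybORG's API."""

    def __init__(self):
        self.state_collections = 0

    def get_true_state(self):
        # Stands in for the expensive full-state collection described above.
        self.state_collections += 1
        return {"hosts": {}}


class RewardCalculatorSketch:
    """Toy base class; the real RewardCalculator may be structured differently."""

    # Hypothetical flag: subclasses that never read the state set this to False.
    needs_true_state = True

    def step_reward(self, env):
        # Collect the state only when the subclass says it needs it.
        state = env.get_true_state() if self.needs_true_state else None
        return self.calculate_reward(state)

    def calculate_reward(self, current_state):
        raise NotImplementedError


class StateFreeCalculator(RewardCalculatorSketch):
    """Plays the role of a calculator that ignores the state."""

    needs_true_state = False

    def calculate_reward(self, current_state):
        return 0.0


class StateUsingCalculator(RewardCalculatorSketch):
    """Plays the role of a calculator that does read the state."""

    def calculate_reward(self, current_state):
        return float(len(current_state["hosts"]))


def run_steps(calculator, steps=500):
    """Step a fresh fake environment and report how often state was collected."""
    env = FakeEnvironment()
    for _ in range(steps):
        calculator.step_reward(env)
    return env.state_collections
```

Keeping the flag on the base class and defaulting it to `True` means any calculator that does rely on the state would keep its current behaviour, while the two subclasses named above could opt out.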