The PettingZooParallelWrapper class generates invalid actions.
The method int_to_cyborg_action returns a dictionary that maps the number selected by the agent to a specific CybORG action. However, the dictionary is constructed only when the environment is initialised or reset. As a result, the actions it returns carry invalid parameters at run-time throughout each episode (e.g., the keys 'blocked' and 'dropped', which are picked up transparently when the actions are created). The issue seems to manifest specifically with the RetakeControl action.
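To illustrate the kind of staleness I suspect, here is a toy sketch. It does not use CybORG's real API: ToyEnv, ToyAction, CachingWrapper, make_action and int_to_action are all hypothetical names, and the assumption that an action snapshots 'blocked' and 'dropped' from the environment state at creation time is mine, not something I have confirmed in the CybORG source.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAction:
    """Hypothetical action: a name plus parameters captured at creation."""
    name: str
    params: dict = field(default_factory=dict)


class ToyEnv:
    """Hypothetical stand-in for the environment (not CybORG's API)."""

    def __init__(self):
        self.state = {"blocked": [], "dropped": 0}

    def make_action(self, name):
        # Assumption: the action copies the current state when it is built.
        return ToyAction(name, {"blocked": list(self.state["blocked"]),
                                "dropped": self.state["dropped"]})

    def step(self):
        # The environment state changes during the episode.
        self.state["blocked"].append("host_a")
        self.state["dropped"] += 1


class CachingWrapper:
    """Builds the int -> action dictionary only on init/reset."""

    def __init__(self, env):
        self.env = env
        self.reset()

    def reset(self):
        self.action_map = {0: self.env.make_action("RetakeControl")}

    def int_to_action(self, index):
        # Returns whatever was cached at the last reset.
        return self.action_map[index]


env = ToyEnv()
wrapper = CachingWrapper(env)
env.step()  # state moves on, the cached action does not

stale = wrapper.int_to_action(0)
fresh = env.make_action("RetakeControl")
stale_differs = stale != fresh
```

In this toy setup the cached action still holds the parameters from reset time, so it no longer compares equal to an action created against the current state.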
This issue was very hard to pin down, and I still don't entirely understand why the parameters are invalid. However, I compared actions created with and without the wrapper class and found that they are not equal, as demonstrated below:
I modified int_to_cyborg_action to return the action directly rather than look it up in a potentially outdated dictionary. This seems to have fixed the issue: WrappedCanary and Canary now achieve the same results in the evaluation. Please note that WrappedCanary implements more or less the same algorithm as Canary, so the substantial score discrepancy was caused by the issue in the wrapper.
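The shape of that change can be sketched roughly as follows. Again, this is a toy example and not the actual patch: ToyEnv, ToyAction, OnDemandWrapper, make_action and int_to_action are hypothetical names, and the state-snapshot behaviour is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAction:
    """Hypothetical action: a name plus parameters captured at creation."""
    name: str
    params: dict = field(default_factory=dict)


class ToyEnv:
    """Hypothetical stand-in for the environment (not CybORG's API)."""

    def __init__(self):
        self.state = {"blocked": [], "dropped": 0}

    def make_action(self, name):
        # Assumption: the action copies the current state when it is built.
        return ToyAction(name, {"blocked": list(self.state["blocked"]),
                                "dropped": self.state["dropped"]})

    def step(self):
        self.state["blocked"].append("host_a")
        self.state["dropped"] += 1


class OnDemandWrapper:
    """Stores only action names; builds the action when it is requested."""

    def __init__(self, env):
        self.env = env
        self.action_names = {0: "RetakeControl"}

    def int_to_action(self, index):
        # Construct the action now, so it sees the current state.
        return self.env.make_action(self.action_names[index])


env = ToyEnv()
wrapper = OnDemandWrapper(env)
env.step()

wrapped = wrapper.int_to_action(0)
direct = env.make_action("RetakeControl")
actions_match = wrapped == direct
```

Building the action on every call costs a little more than a dictionary lookup, but the wrapped and unwrapped actions stay equal throughout the episode in this sketch.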
The fixed class can be found here and I will also create a pull request shortly.
Cheers