AttributeError: 'dict' object has no attribute 'terminated' with multi agent env having gymnasium.spaces.Dict observation space #1219

Open
5 of 9 tasks
fortminors opened this issue Oct 7, 2024 · 0 comments


  • I have marked all applicable categories:
    • exception-raising bug
    • RL algorithm bug
    • documentation request (i.e. "X is missing from the documentation.")
    • new feature request
    • design request (i.e. "X should be changed to Y.")
  • I have visited the source website
  • I have searched through the issue tracker for duplicates
  • I have mentioned version numbers, operating system and environment, where applicable:
>>> import tianshou, gymnasium as gym, torch, numpy, sys
>>> print(tianshou.__version__, gym.__version__, torch.__version__, numpy.__version__, sys.version, sys.platform)
1.2.0-dev 0.28.1 2.1.1+cu121 1.24.4 3.11.10 (main, Oct  3 2024, 07:29:13) [GCC 11.2.0] linux

Here is a minimal reproducible example: a square 20x20 grid environment with 5 agents, each of which can take one action in {left, top, right, bottom, idle}. For each robot, the observation contains the displacement from its current position to its goal position in the grid and a 5x5 local observation of the cells around it.
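To make the intended observation layout concrete, here is a minimal sketch of how the two entries could be computed from agent positions. It is illustrative only: the helper names (`displacement`, `local_window`, `observe`) are hypothetical, plain Python lists are used so it runs without dependencies, and two details are assumptions rather than part of the example below: a window cell is 1 when another agent occupies it, and cells outside the grid are reported as 0.

```python
N = 20       # grid side length, as in the description
WINDOW = 5   # side length of the local observation window


def displacement(position, goal):
    """Return (d_row, d_col) from the current position to the goal."""
    return (goal[0] - position[0], goal[1] - position[1])


def local_window(position, others, n=N, window=WINDOW):
    """Return a flattened window x window patch centred on `position`.

    Assumption: a cell is 1 if another agent stands on it, else 0.
    Assumption: cells that fall outside the grid are reported as 0.
    """
    half = window // 2
    occupied = set(others)
    patch = []
    for d_row in range(-half, half + 1):
        for d_col in range(-half, half + 1):
            cell = (position[0] + d_row, position[1] + d_col)
            inside = 0 <= cell[0] < n and 0 <= cell[1] < n
            patch.append(1 if inside and cell in occupied else 0)
    return patch


def observe(positions, goals):
    """Build the dict observation for all agents at once."""
    return {
        "displacement": [displacement(p, g) for p, g in zip(positions, goals)],
        "grid": [
            local_window(p, positions[:i] + positions[i + 1:])
            for i, p in enumerate(positions)
        ],
    }


# Hypothetical positions and goals for the 5 agents.
positions = [(0, 0), (1, 1), (10, 10), (19, 19), (5, 7)]
goals = [(19, 19), (0, 0), (10, 12), (0, 19), (5, 7)]
obs = observe(positions, goals)
```

Each displacement component stays within [-(N - 1), N - 1] and each agent gets 25 window values, which matches the shapes declared in the `Dict` observation space of the example.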

import random

import numpy as np
from gymnasium import Env
from gymnasium import spaces

from tianshou.data import Batch, ReplayBuffer


class GridEnv(Env):
    def __init__(self):
        super(GridEnv, self).__init__()
        self.N = 20
        self.n_agents = 5

        self.action_space = spaces.MultiDiscrete([5] * self.n_agents)

        self.observation_space = spaces.Dict({
            "displacement": spaces.Box(low=-self.N + 1, high=self.N - 1, shape=(self.n_agents, 2), dtype=np.int32),
            "grid": spaces.Box(low=0, high=1, shape=(self.n_agents, 5 * 5), dtype=np.int32),
        })

    def reset(self, *, seed=None, options=None):
        obs_info = {}
        super().reset(seed=seed)
        random.seed(seed)
        return self.get_observations(), obs_info

    def step(self, actions):
        rewards = 0
        terminated = False
        truncated = False
        obs_info = {}
        return self.get_observations(), rewards, terminated, truncated, obs_info

    def get_observations(self) -> dict:
        displacements = np.random.randint(-self.N + 1, self.N - 1, (self.n_agents, 2), dtype=np.int32)
        grid_obs = np.random.randint(0, 2, (self.n_agents, 5 * 5), np.int32)
        return {"displacement": displacements, "grid": grid_obs}


env = GridEnv()

print(env.reset())

b = ReplayBuffer(size=3)
b.add(Batch(obs=env.reset(), act=0, rew=0, done=0))
print(b)

When I try to use it with Tianshou, I get the following output, which ends in an error:

({'displacement': array([[-11,   8],
       [-19,  15],
       [-15,   1],
       [ -4, -15],
       [ 11,  -9]], dtype=int32), 'grid': array([[0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
        1, 0, 0],
       [0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0,
        0, 0, 1],
       [1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1,
        0, 0, 0],
       [0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1,
        1, 0, 0],
       [0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0,
        1, 0, 0]], dtype=int32)}, {})
Traceback (most recent call last):
  File "repr.py", line 47, in <module>
    b.add(Batch(obs=env.reset(), act=0, rew=0, done=0))
  File "../tianshou/data/buffer/base.py", line 430, in add
    batch.__dict__["done"] = np.logical_or(batch.terminated, batch.truncated)
                                           ^^^^^^^^^^^^^^^^
  File "../tianshou/tianshou/data/batch.py", line 687, in __getattr__
    return getattr(self.__dict__, key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'terminated'
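
The last two frames suggest where the message comes from: `add` reads `batch.terminated` and `batch.truncated`, the batch was built with `done` only, and the attribute fallback then looks the key up on the underlying `dict`. The sketch below reproduces that mechanism with a hypothetical stand-in class (`MiniBatch` is not Tianshou's `Batch`, and `merge_done` only imitates the buffer line shown in the traceback), so it demonstrates the error path under those assumptions rather than the library's actual behavior.

```python
class MiniBatch:
    """Hypothetical stand-in for a batch container; not Tianshou's Batch."""

    def __init__(self, **fields):
        # Store every keyword argument as an instance attribute.
        self.__dict__.update(fields)

    def __getattr__(self, key):
        # Only reached when normal attribute lookup fails. Mirrors the
        # traceback line `return getattr(self.__dict__, key)`, which asks
        # the dict object itself for an attribute named `key`.
        return getattr(self.__dict__, key)


def merge_done(batch):
    """Imitate the buffer line from the traceback with plain booleans."""
    return bool(batch.terminated) or bool(batch.truncated)


# Built like the failing call: `done` is given, `terminated` is not.
try:
    merge_done(MiniBatch(obs={}, act=0, rew=0, done=0))
except AttributeError as exc:
    message = str(exc)

# Built with the two keys that the traceback shows being read.
ok = merge_done(
    MiniBatch(obs={}, act=0, rew=0, terminated=False, truncated=False)
)
```

If the real buffer behaves as the traceback indicates, passing `terminated` and `truncated` instead of `done` may avoid this particular error. Note also that `env.reset()` returns an `(observation, info)` tuple, so the observation probably needs to be unpacked before it is stored. Both points are assumptions drawn from the traceback and the example, not verified against the library.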

How would you suggest fixing this issue? Or is there a better way to train an algorithm with such a setup, perhaps with a different observation space? I feed each agent its local 5x5 grid so that it is aware of the other agents, because I don't want them to collide. I also want to train in a centralized manner. Thanks in advance!
