[BUG] Replay Buffer + Non-Flat Observations Fail #116

JesseSilverberg · 2024-09-11T21:16:22Z

Describe the bug

Trying to run an algorithm that uses a replay buffer with an environment that has non-flat observations yields a shape error:

Traceback (most recent call last):
  File "/home/REDACTED/projects/Stoix/stoix/systems/q_learning/ff_dqn.py", line 572, in hydra_entry_point
    eval_performance = run_experiment(cfg)
  File "/home/REDACTED/projects/Stoix/stoix/systems/q_learning/ff_dqn.py", line 443, in run_experiment
    learn, eval_q_network, learner_state = learner_setup(env, (key, q_net_key), config)
  File "/home/REDACTED/projects/Stoix/stoix/systems/q_learning/ff_dqn.py", line 414, in learner_setup
    env_states, timesteps, keys, buffer_states = warmup(
  File "/home/REDACTED/projects/Stoix/stoix/systems/q_learning/ff_dqn.py", line 84, in warmup
    buffer_states = buffer_add_fn(buffer_states, traj_batch)
  File "/home/REDACTED/projects/Stoix/venv/lib/python3.10/site-packages/flashbax/buffers/item_buffer.py", line 96, in add_fn
    return buffer.add(state, flattened_batch)
  File "/home/REDACTED/projects/Stoix/venv/lib/python3.10/site-packages/flashbax/buffers/trajectory_buffer.py", line 149, in add
    experience = jax.tree_util.tree_map(
  File "/home/REDACTED/projects/Stoix/venv/lib/python3.10/site-packages/flashbax/buffers/trajectory_buffer.py", line 150, in <lambda>
    lambda experience_field, batch_field: experience_field.at[:, indices].set(
  File "/home/REDACTED/projects/Stoix/venv/lib/python3.10/site-packages/jax/_src/numpy/array_methods.py", line 500, in set
    return scatter._scatter_update(self.array, self.index, values, lax.scatter,
  File "/home/REDACTED/projects/Stoix/venv/lib/python3.10/site-packages/jax/_src/ops/scatter.py", line 76, in _scatter_update
    return _scatter_impl(x, y, scatter_op, treedef, static_idx, dynamic_idx,
  File "/home/REDACTED/projects/Stoix/venv/lib/python3.10/site-packages/jax/_src/ops/scatter.py", line 111, in _scatter_impl
    y = jnp.broadcast_to(y, tuple(indexer.slice_shape))
  File "/home/REDACTED/projects/Stoix/venv/lib/python3.10/site-packages/jax/_src/numpy/lax_numpy.py", line 2252, in broadcast_to
    return util._broadcast_to(array, shape)
  File "/home/REDACTED/projects/Stoix/venv/lib/python3.10/site-packages/jax/_src/numpy/util.py", line 421, in _broadcast_to
    raise ValueError(f"Cannot broadcast to shape with fewer dimensions: {arr_shape=} {shape=}")
ValueError: Cannot broadcast to shape with fewer dimensions: arr_shape=(1, 16384, 10, 10, 7) shape=(1, 16384, 700)
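
For readers following along, the shape mismatch in the last line can be reproduced outside Stoix with a few lines of JAX. This is only a minimal sketch, not the flashbax code path: the array names and the small sizes (8 slots, 4 items) are made up for illustration, and only the 10x10x7 versus 700 shapes come from the traceback. It assumes the storage was allocated from a flattened observation while the incoming batch still holds an unflattened one.

```python
import jax.numpy as jnp

# Hypothetical small sizes; the traceback shows 16384 items per add.
buffer_field = jnp.zeros((1, 8, 700))      # storage sized for a flattened observation
batch_field = jnp.ones((1, 4, 10, 10, 7))  # incoming batch that was never flattened
indices = jnp.arange(4)

try:
    # Same indexed-update pattern as in the traceback: the 5-D batch
    # cannot be broadcast into the 3-D slice of shape (1, 4, 700).
    buffer_field.at[:, indices].set(batch_field)
    raised = False
except (ValueError, TypeError):
    raised = True

# Collapsing the trailing observation dimensions makes the write succeed.
flat_batch = batch_field.reshape(batch_field.shape[:2] + (-1,))
updated = buffer_field.at[:, indices].set(flat_batch)
```

The point of the sketch is that every field written to the buffer has to match the shape the buffer was initialised with, so any observation-like field that skips flattening will trigger this error.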

To Reproduce

python stoix/systems/q_learning/ff_dqn.py env=gymnax/freeway

Context (Environment)

This is on the latest version of Stoix (installed a few days ago).

Possible Solution

#115

EdanToledo · 2024-09-12T08:59:04Z

Hey, thanks so much for pointing this out. Let's move the discussion over to the PR, but I imagine it will be a quick one :)

JesseSilverberg added the bug label Sep 11, 2024

EdanToledo linked a pull request Sep 12, 2024 that will close this issue

fix: make FlattenObservationWrapper also flatten next_obs #115

Merged

EdanToledo closed this as completed in #115 Sep 12, 2024
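
The idea named in the PR title (flattening `next_obs` along with the observation) can be sketched roughly as follows. This is not the code from #115 or Stoix's actual `FlattenObservationWrapper`: the `TimeStep` type, the `extras["next_obs"]` key, and the `flatten_timestep` helper are hypothetical stand-ins used only to illustrate keeping both fields the same shape.

```python
from typing import Dict, NamedTuple

import jax.numpy as jnp


class TimeStep(NamedTuple):
    # Hypothetical stand-in for an environment timestep.
    observation: jnp.ndarray
    extras: Dict[str, jnp.ndarray]


def flatten_timestep(timestep: TimeStep) -> TimeStep:
    """Flatten the observation and, if present, extras['next_obs'] to (batch, -1)."""
    obs = timestep.observation
    batch = obs.shape[0]
    extras = dict(timestep.extras)  # copy so the input is left untouched
    if "next_obs" in extras:
        extras["next_obs"] = extras["next_obs"].reshape(batch, -1)
    return timestep._replace(observation=obs.reshape(batch, -1), extras=extras)


# Usage with the observation shape from the traceback (batch of 4 assumed).
ts = TimeStep(
    observation=jnp.zeros((4, 10, 10, 7)),
    extras={"next_obs": jnp.ones((4, 10, 10, 7))},
)
flat = flatten_timestep(ts)
```

With both fields flattened to 700 features, the transition stored in the replay buffer has a consistent shape for the current and next observation.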
