Reverb Replay Buffers + Vectorized/Batched Environments #300

wbrenton · 2023-05-25T05:16:46Z

I'm trying to use a Reverb replay buffer with a batched environment such as envpool, where the API returns a batch of experience whenever .reset or .step is called.

I'm guessing there must be a better way to insert that data into the buffer than to have a writer for each individual environment and iterate over the writers, with each one appending its own batch index of the experience.

The code below is clearly suboptimal and defeats the purpose of using a vectorized environment as opposed to many workers each executing a single environment.

num_envs = 100
envs = make_envs(num_envs)
# One writer per environment.
writers = [client.writer() for _ in range(num_envs)]
obs = envs.reset()
# obs.shape == (100, 3, 86, 86): 100 Atari observations

while True:
    next_obs, reward, done, info = envs.step(action)
    # next_obs.shape == (100, 3, 86, 86)
    for i, writer in enumerate(writers):
        writer.append({
            'obs': obs[i],
            .....
        })
    # Advance only after every writer has consumed the current observation.
    obs = next_obs

If there are any examples of working with batched environments and reverb in the codebase or if anyone could provide some direction, I'd greatly appreciate it.


ethanluoyc · 2023-06-20T13:18:14Z

I think creating multiple writers is currently the way to go, as Reverb does not provide a native way of doing a batched append. There were some discussions about supporting batched environments. See
google-deepmind/reverb#52
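
For illustration, here is a minimal sketch of the one-writer-per-environment pattern described above. It makes no claim about Reverb's actual API: `RecordingWriter` is a hypothetical stand-in that only records what is appended, so the example runs without a Reverb server, and `append_batched_step` is an illustrative helper name, not a Reverb function. The only assumption about a real writer is that it exposes an `append` method taking a dict, as in the snippet from the question.

```python
from typing import Any, Dict, List, Sequence


class RecordingWriter:
    """Hypothetical stand-in for a Reverb writer; it only stores appended steps."""

    def __init__(self) -> None:
        self.steps: List[Dict[str, Any]] = []

    def append(self, step: Dict[str, Any]) -> None:
        self.steps.append(step)


def append_batched_step(writers: Sequence[Any], batch: Dict[str, Sequence[Any]]) -> None:
    """Split a batched step along its leading axis and append row i to writer i."""
    # Every field must have exactly one row per environment.
    for key, values in batch.items():
        if len(values) != len(writers):
            raise ValueError(f"{key!r} has {len(values)} rows, expected {len(writers)}")
    for i, writer in enumerate(writers):
        writer.append({key: values[i] for key, values in batch.items()})


# Toy batch for three environments (plain lists; arrays index the same way).
num_envs = 3
writers = [RecordingWriter() for _ in range(num_envs)]
batch = {
    "obs": [[0.0], [1.0], [2.0]],
    "reward": [0.5, 1.5, 2.5],
    "done": [False, False, True],
}
append_batched_step(writers, batch)
```

Keeping the per-environment split in one helper means the training loop only ever handles the batched dict, and the helper can be swapped out if a native batched append becomes available.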

If you want to instantiate multiple writers, there are some recommended setups that allow you to do this concurrently; see
google-deepmind/reverb#78
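
One generic way to drive several writers concurrently is to shard them across a thread pool. The sketch below is an assumption on my part and not necessarily the setup recommended in the linked issue. `RecordingWriter` is again a hypothetical stand-in for a real writer, and `append_shard` / `append_concurrently` are illustrative names. Whether threads actually speed things up depends on whether the real writer's append releases the GIL, which this sketch does not verify.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List, Sequence


class RecordingWriter:
    """Hypothetical stand-in for a Reverb writer; it only stores appended steps."""

    def __init__(self) -> None:
        self.steps: List[Dict[str, Any]] = []

    def append(self, step: Dict[str, Any]) -> None:
        self.steps.append(step)


def append_shard(writers: Sequence[Any], batch: Dict[str, Sequence[Any]],
                 indices: Sequence[int]) -> None:
    """Append row i of the batch to writer i for every index in this shard."""
    for i in indices:
        writers[i].append({key: values[i] for key, values in batch.items()})


def append_concurrently(pool: ThreadPoolExecutor, writers: Sequence[Any],
                        batch: Dict[str, Sequence[Any]], num_shards: int) -> None:
    """Shard the writers across the pool; each writer belongs to exactly one shard."""
    n = len(writers)
    shards = [range(start, n, num_shards) for start in range(num_shards)]
    futures = [pool.submit(append_shard, writers, batch, shard) for shard in shards]
    for future in futures:
        future.result()  # wait for the shard and re-raise any worker exception


writers = [RecordingWriter() for _ in range(8)]
batch = {"obs": list(range(8)), "reward": [float(i) for i in range(8)]}
with ThreadPoolExecutor(max_workers=4) as pool:
    append_concurrently(pool, writers, batch, num_shards=4)
```

Because each writer index falls in exactly one shard and the call waits for all shards before returning, no writer is touched by two threads at the same time.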

If you want to use multiple workers, I think the recommended workflow is to use the Launchpad library and the distributed experiment. You should be able to find some examples of how to do that.
