I know this may not be an issue per se, but I wish to get an answer from gurus in Python and PyTorch. (I am not a CS major, but I love Deep RL.)
In homework 2's PDF, you wrote:
> A serious bottleneck in the learning, for more complex environments, is the sample collection time. In infrastructure/rl_trainer.py, we only collect trajectories in a single thread, but this process can be fully parallelized across threads to get a useful speedup. Implement the parallelization and report on the difference in training time.
I have tried using multiprocessing in `cs285.infrastructure.utils.sample_trajectories` and modified `cs285.infrastructure.utils.sample_trajectory` as follows:
```python
def sample_trajectory(mp_timesteps_this_batch, min_timesteps_per_batch, mp_paths, enough_event,
                      env, policy, max_path_length, render=False, render_mode=('rgb_array')):
    while True:
        # initialize env for the beginning of a new rollout
        ob = env.reset()  # HINT: should be the output of resetting the env

        ...  # rollout loop elided

        path_to_append = Path(obs, image_obs, acs, rewards, next_obs, terminals)
        len_path_to_append = get_pathlength(path_to_append)
        with mp_timesteps_this_batch.get_lock():
            if mp_timesteps_this_batch.value >= min_timesteps_per_batch:
                enough_event.set()
            elif mp_timesteps_this_batch.value + len_path_to_append >= min_timesteps_per_batch:
                mp_paths.append(path_to_append)
                mp_timesteps_this_batch.value += len_path_to_append
                enough_event.set()
            else:
                mp_paths.append(path_to_append)
                mp_timesteps_this_batch.value += len_path_to_append
```
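For context, the driver side in `sample_trajectories` spawns the workers roughly like this (a minimal sketch of my setup; the `n_workers` argument and the use of a `Manager` list for `mp_paths` are my own additions, not from the starter code):

```python
import multiprocessing as mp

def sample_trajectories(env, policy, min_timesteps_per_batch, max_path_length, n_workers=4):
    # shared timestep counter, an "enough data" flag, and a shared list of completed paths
    mp_timesteps_this_batch = mp.Value('i', 0)
    enough_event = mp.Event()
    manager = mp.Manager()
    mp_paths = manager.list()

    workers = [
        mp.Process(
            target=sample_trajectory,
            args=(mp_timesteps_this_batch, min_timesteps_per_batch,
                  mp_paths, enough_event, env, policy, max_path_length),
        )
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()

    # block until the workers have jointly collected enough timesteps
    enough_event.wait()
    for w in workers:
        w.terminate()  # the workers loop forever, so stop them once we have enough
        w.join()

    return list(mp_paths), mp_timesteps_this_batch.value
```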
However, because `sample_trajectory` uses `policy.get_action`, which involves CUDA operations, many errors are thrown, such as: `CUDA error: an illegal memory access was encountered`.
It seems to run fine when everything stays on the CPU.
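From what I have read, this may be because CUDA cannot be re-initialized in a process created with fork (the default start method on Linux), so forked workers touch the parent's already-initialized CUDA context and crash. Would switching to the spawn start method, e.g. via `torch.multiprocessing`, be the right direction? A minimal, untested sketch of what I mean (the `worker` function here is just a placeholder, not the homework code):

```python
import torch
import torch.multiprocessing as mp  # API-compatible wrapper around multiprocessing

def worker(rank):
    # each spawned child initializes its own CUDA context from scratch
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    x = torch.randn(3, device=device)
    print(f"worker {rank}: sum = {x.sum().item():.3f}")

if __name__ == '__main__':
    # 'spawn' starts fresh interpreters instead of forking, so children
    # do not inherit the parent's CUDA state
    mp.set_start_method('spawn')
    procs = [mp.Process(target=worker, args=(i,)) for i in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```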
So, can anyone give me some thoughts on how to implement a parallel version of trajectory collection?
Thanks a lot!