Pendulumnovel_ddpg_lstm sampled hidden states in _update() seem not fully utilized #149
zhangmingcheng28 asked this question in Q&A (unanswered)
Hi,
Thank you for the great work in assembling the various DRL learning frameworks. May I ask: in the file "torch_gymnasium_pendulumnovel_ddpg_lstm.py", inside the model's compute() function, during the _update() phase of DDPG, after the sampled hidden_states have been transposed to (D * num_layers, N, L, Hout), there is a binary variable called "sequence_index" that decides which sequence to use among the 20 sequences of the sampled hidden_states. Since "sequence_index" is a binary variable, does that mean the other 18 sequences are never used?
In addition, I am also curious why hidden_states[:, :, 1, :] can be used for the target networks while hidden_states[:, :, 0, :] is used for the action network. Why not use hidden_states[:, :, 2, :] or hidden_states[:, :, 3, :], or any other sequence?
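For reference, here is a minimal sketch of how I understand the shape handling (the tensor sizes and the role check are made-up placeholders for illustration, not the actual code from the example file):

```python
import torch

# Hypothetical sizes, just to illustrate the shapes described above
D_num_layers, N, L, Hout = 1, 64, 20, 32
hidden_states = torch.randn(D_num_layers, N, L, Hout)  # (D * num_layers, N, L, Hout)

# "sequence_index" only ever takes the value 0 or 1:
#   0 -> slice used for the action (policy) network
#   1 -> slice used for the target networks (the part I am asking about)
role = "target_policy"  # placeholder role name for this sketch
sequence_index = 1 if role.startswith("target") else 0

# Only one slice along the L dimension is kept; positions 2..19 never appear
selected = hidden_states[:, :, sequence_index, :]  # (D * num_layers, N, Hout)
print(selected.shape)  # torch.Size([1, 64, 32])
```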
Thank you,
Mingcheng