
Issue #55

Open · Zaibali9999 opened this issue Jan 14, 2023 · 4 comments

Comments

@Zaibali9999

```
(array([-0.02680779, 0.00466264, -0.02511859, -0.04842809], dtype=float32), {})
Traceback (most recent call last):
  File "main.py", line 31, in <module>
    action, prob, val = agent.choose_action(observation)
  File "D:\AI\PPO\agent.py", line 41, in choose_action
    state = tf.convert_to_tensor([observation],dtype=tf.float32)
  File "C:\Users\Buster\.conda\envs\PPO\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\Buster\.conda\envs\PPO\lib\site-packages\tensorflow\python\framework\constant_op.py", line 102, in convert_to_eager_tensor
    return ops.EagerTensor(value, ctx.device_name, dtype)
ValueError: Can't convert non-rectangular Python sequence to Tensor.
```
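
The tuple printed above the traceback points at the cause: on newer gym/gymnasium releases, env.reset() returns (observation, info) rather than a bare observation array, so [observation] becomes a ragged sequence. A minimal sketch of the mismatch and the unpacking that avoids it, assuming gymnasium and CartPole-v1 (the environment is not named in the thread, but the 4-element observation suggests it):

```python
# Minimal sketch, assuming gymnasium (or gym >= 0.26) and CartPole-v1.
import gymnasium as gym
import tensorflow as tf

env = gym.make("CartPole-v1")

obs = env.reset()   # new API: returns (observation, info), not just the observation
print(obs)          # e.g. (array([-0.0268..., ...], dtype=float32), {})

# Passing the whole tuple reproduces the error from the traceback:
#   tf.convert_to_tensor([obs], dtype=tf.float32)
#   -> ValueError: Can't convert non-rectangular Python sequence to Tensor.

# Unpacking the tuple first converts cleanly:
observation, info = env.reset()
state = tf.convert_to_tensor([observation], dtype=tf.float32)
print(state.shape)  # (1, 4)
```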

@Zaibali9999 (Author)

```
(array([ 0.0047165 , -0.04676152, -0.03735694, -0.0472385 ], dtype=float32), {})
D:\AI\PPO\torch\ppo_torch.py:137: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\utils\tensor_new.cpp:233.)
  state = T.tensor([observation], dtype=T.float).to(self.actor.device)
Traceback (most recent call last):
  File "D:\AI\PPO\torch\main.py", line 31, in <module>
    action, prob, val = agent.choose_action(observation)
  File "D:\AI\PPO\torch\ppo_torch.py", line 137, in choose_action
    state = T.tensor([observation], dtype=T.float).to(self.actor.device)
ValueError: expected sequence of length 4 at dim 2 (got 0)
```

@Zaibali9999 (Author)

Same issue with the torch agent.

@rafayaamirgull

Hi, I'm getting the same error:

```
state = T.tensor([observation], dtype=T.float).to(self.actor.device)
Traceback (most recent call last):
  File "/home/rafay/RL/ReinforcementLearning/PolicyGradient/PPO/torch/main.py", line 33, in <module>
    action, prob, val = agent.choose_action(observation)
  File "/home/rafay/RL/ReinforcementLearning/PolicyGradient/PPO/torch/ppo_torch.py", line 142, in choose_action
    state = T.tensor([observation], dtype=T.float).to(self.actor.device)
ValueError: expected sequence of length 4 at dim 2 (got 0)
```

@Zaibali9999 If you have solved the issue, can you please help me out?
@philtabor Please advise.

Thanks

@tuan124816 commented Oct 7, 2024

I'm having the same issue. I'm looking into the output type and the library itself, since there might be a difference between versions; I will update my status if I find something new.
Update: I found that I need to change env.reset() to env.reset()[0], since it outputs a tuple and we need to access the NumPy array inside it:

```
(method) def reset(
    *,
    seed: int | None = None,
    options: dict[str, Any] | None = None
) -> tuple[Any, dict[str, Any]]
```
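
A minimal sketch of that fix applied to the PyTorch agent's usage from the tracebacks above, assuming gymnasium and CartPole-v1 (the Agent class itself is unchanged; only the code calling env.reset() needs the unpacking):

```python
# Minimal sketch, assuming gymnasium (or gym >= 0.26) and CartPole-v1.
import gymnasium as gym
import numpy as np
import torch as T

env = gym.make("CartPole-v1")

# env.reset() returns (observation, info); unpack it (or take env.reset()[0])
# before handing the observation to agent.choose_action().
observation, info = env.reset()
print(type(observation), observation.shape)   # <class 'numpy.ndarray'> (4,)

# The conversion that failed inside choose_action() now works; wrapping with
# np.array() first also silences the "list of numpy.ndarrays" UserWarning.
state = T.tensor(np.array([observation]), dtype=T.float)
print(state.shape)                            # torch.Size([1, 4])
```

The same unpacking is needed wherever env.reset() is called in main.py, e.g. at the start of every episode.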
