Hi,
I am currently trying to reproduce the results from the paper for pybullet-envs.
To create the dataset I am using your pretrained TD3 model './models/../best_model.zip'.
However, I get the following error.
File "autoregressive_pybullet.py", line 558, in <module>
mean_reward, std_reward, observations = evaluate_policy(model, env, n_eval_episodes=args.n_eval_episodes)
ValueError: not enough values to unpack (expected 3, got 2)
evaluate_policy is a function from stable_baselines3.common.evaluation and only returns (mean_reward, std_reward) but no observations.
I therefore modified it such that it also returns a stacked list of observations (n_episodes x episode_lengths).
Is this the correct way to do this?
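For what it's worth, an alternative to patching the library is the `callback` argument that `stable_baselines3.common.evaluation.evaluate_policy` already accepts: it is called with `(locals_, globals_)` after every environment step, so observations can be collected from the outside and grouped by episode. Below is a stdlib-only sketch of that pattern; the evaluation loop is a toy stand-in for `evaluate_policy` (and the local-variable names `observations` / `done` exposed to the callback depend on the SB3 version), so treat it as an illustration, not the exact API.

```python
# Sketch: collect per-episode observations via a step callback instead of
# modifying evaluate_policy. toy_evaluate_policy below is a stand-in for
# stable_baselines3's evaluation loop, which calls callback(locals(),
# globals()) after each step.

import random

episode_obs = []      # list of episodes, each a list of observations
current_episode = []  # observations of the episode currently running

def obs_callback(locals_, globals_):
    # The callback sees the evaluation loop's local variables; here we
    # assume 'observations' holds the latest obs and 'done' flags the
    # episode end (names vary across SB3 versions).
    current_episode.append(locals_["observations"])
    if locals_["done"]:
        episode_obs.append(list(current_episode))
        current_episode.clear()

def toy_evaluate_policy(n_eval_episodes, episode_length, callback):
    """Stand-in evaluation loop; a real run would step a gym env."""
    for _ in range(n_eval_episodes):
        for t in range(episode_length):
            observations = random.random()      # placeholder observation
            done = (t == episode_length - 1)    # placeholder termination
            callback(locals(), globals())

toy_evaluate_policy(n_eval_episodes=3, episode_length=5, callback=obs_callback)
# episode_obs is now a stacked list of shape (n_episodes x episode_length)
```

This avoids diverging from the upstream `evaluate_policy`, while producing the same `(n_episodes x episode_lengths)` stacked list described above.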
However, when I then train the predictor and test the anomaly detection, the AUC scores are much lower than reported in the paper: avg AUC 0.39, max AUC 0.57.
Since not all hyperparameters are specified in the paper, I was not sure which values to use exactly.
These are the respective commands I used:
python autoregressive_pybullet.py --test_policy --env_name HalfCheetahBulletEnv-v0 --n_eval_episodes 1_000
python autoregressive_pybullet.py --train_predictive_model --env_name HalfCheetahBulletEnv-v0 --is_recurrent_v2 --iterations 10_000
python autoregressive_pybullet.py --anomaly_detection --anomaly_injection 20 --horizons 1 --sampling_sizes 8 --n_eval_episodes 10 --is_recurrent_v2 --env_name HalfCheetahBulletEnv-v0 --case 1