Implement an evaluator that can unroll a model into an RL environment and get metrics #3
Comments
Waiting for a template to be posted by either @AdamJelley or @trevormcinroe so I can get started.
Hi @AntreasAntoniou, here's a simple eval function that you can use as a template:

    @torch.no_grad()
    def eval_actor(
        env: gym.Env, actor: Actor, device: str, n_episodes: int, seed: int
    ) -> np.ndarray:
        env.seed(seed)
        actor.eval()
        episode_rewards = []
        for _ in range(n_episodes):
            state, done = env.reset(), False
            episode_reward = 0.0
            while not done:
                action = actor.act(state, device)
                state, reward, done, _ = env.step(action)
                episode_reward += reward
            episode_rewards.append(episode_reward)

        actor.train()
        return np.array(episode_rewards)

It takes in an env and an actor (a network that maps state->action). Hopefully pretty straightforward (not much has changed since 2019 here!). The interesting part is probably device usage. The env is normally on the CPU and expects a
Small nitpick @AdamJelley @AntreasAntoniou -- it might be good to pass

    for _ in range(n_episodes):
        state, done = env.reset(seed=np.random.choice(seeds)), False

AFAIK, all envs can take a seed in

    for _ in range(n_episodes):
        env.seed(np.random.choice(seeds))
        state, done = env.reset(), False
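
The per-episode seeding suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: `NewStyleEnv`, `OldStyleEnv`, `seed_and_reset`, and `eval_with_seeds` are hypothetical names that do not come from the project, the toy environments stand in for a real `gym.Env`, and `step` is assumed to return the 4-tuple used in the template. The helper checks whether `reset` accepts a `seed` argument and otherwise falls back to `env.seed(...)`.

```python
import inspect
import random
from typing import Any, Callable, List, Optional, Sequence, Tuple


class NewStyleEnv:
    """Hypothetical toy env whose reset() accepts a seed."""

    def __init__(self, horizon: int = 5) -> None:
        self.horizon = horizon
        self._rng = random.Random()
        self._t = 0

    def reset(self, seed: Optional[int] = None) -> float:
        if seed is not None:
            self._rng.seed(seed)
        self._t = 0
        return self._rng.random()

    def step(self, action: Any) -> Tuple[float, float, bool, dict]:
        self._t += 1
        state = self._rng.random()
        # Toy dynamics: the reward is simply the new state; the action is ignored.
        return state, state, self._t >= self.horizon, {}


class OldStyleEnv(NewStyleEnv):
    """Hypothetical toy env that is seeded through a separate seed() method."""

    def seed(self, seed: int) -> None:
        self._rng.seed(seed)

    def reset(self) -> float:  # no seed argument here
        self._t = 0
        return self._rng.random()


def seed_and_reset(env: Any, seed: int) -> Any:
    """Seed via reset(seed=...) when supported, otherwise via env.seed()."""
    if "seed" in inspect.signature(env.reset).parameters:
        return env.reset(seed=seed)
    env.seed(seed)
    return env.reset()


def eval_with_seeds(
    env: Any,
    policy: Callable[[Any], Any],
    n_episodes: int,
    seeds: Sequence[int],
    rng: random.Random,
) -> List[float]:
    """Run n_episodes, drawing a seed from `seeds` before each reset."""
    episode_rewards = []
    for _ in range(n_episodes):
        state, done = seed_and_reset(env, rng.choice(seeds)), False
        episode_reward = 0.0
        while not done:
            state, reward, done, _ = env.step(policy(state))
            episode_reward += reward
        episode_rewards.append(episode_reward)
    return episode_rewards
```

Passing an explicit `random.Random` instance, rather than relying on global random state, keeps the choice of seeds itself reproducible. Note that the signature check would miss wrappers that forward `**kwargs` to `reset`, so a real implementation may need a different test.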
Write an evaluator that receives an environment, a model, a seed, and any other required settings, then unrolls the model in the environment and collects rewards and other metrics.
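
As one possible reading of "collect rewards" and "get metrics", the per-episode rewards returned by an evaluation loop could be reduced to a few summary statistics. The sketch below is an assumption, not the project's API: the function name `summarize_returns` and the metric keys are hypothetical, and the choice of mean, population standard deviation, minimum, and maximum is just a common default.

```python
import statistics
from typing import Dict, Sequence


def summarize_returns(episode_rewards: Sequence[float]) -> Dict[str, float]:
    """Reduce per-episode returns to a small dictionary of summary metrics."""
    if len(episode_rewards) == 0:
        raise ValueError("episode_rewards must contain at least one episode")
    rewards = [float(r) for r in episode_rewards]
    return {
        "n_episodes": float(len(rewards)),
        "mean_return": statistics.fmean(rewards),
        # Population standard deviation, so a single episode yields 0.0.
        "std_return": statistics.pstdev(rewards),
        "min_return": min(rewards),
        "max_return": max(rewards),
    }
```

Because the function only iterates over its input, it should also accept the NumPy array returned by an `eval_actor`-style function.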