Fix bugs in greedy exploration
Summary:
Greedy exploration had two bugs:
- the available-actions mask was ignored when sampling an exploratory action
- the sampled exploratory action was not moved to the device of the exploit action.

This Diff fixes those problems.

Reviewed By: jb3618columbia

Differential Revision: D54887996

fbshipit-source-id: b93958936c0d47b54cef6fa0966954b1c5c71c98
rodrigodesalvobraz authored and facebook-github-bot committed Mar 14, 2024
1 parent 7d6785e commit 5ba11ff
Showing 1 changed file with 5 additions and 2 deletions.
@@ -46,5 +46,8 @@ def act(
         if not isinstance(action_space, DiscreteActionSpace):
             raise TypeError("action space must be discrete")
         if random.random() < self.epsilon:
-            return torch.randint(action_space.n, (1,))
-        return exploit_action
+            return action_space.sample(action_availability_mask).to(
+                exploit_action.device
+            )
+        else:
+            return exploit_action
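
The behavior this change aims for can be illustrated with a small standalone sketch. It is not the repository's code: `sample_available_action` is a hypothetical stand-in for `action_space.sample(action_availability_mask)`, and the convention that `True` in the mask means "available" is an assumption made here for illustration. The sketch shows the two points from the summary: the exploratory action is drawn only from available actions, and it is moved to the device of the exploit action before being returned.

```python
import random

import torch


def sample_available_action(mask: torch.Tensor) -> torch.Tensor:
    """Uniformly sample one action index among the entries where mask is True.

    Hypothetical helper standing in for action_space.sample(mask);
    "True means available" is an assumption of this sketch.
    """
    # Indices of the available actions, as a 1-D tensor.
    available = torch.nonzero(mask.bool(), as_tuple=False).flatten()
    if available.numel() == 0:
        raise ValueError("no available actions to sample from")
    # Pick a random position within the list of available indices.
    choice = torch.randint(available.numel(), (1,), device=available.device)
    return available[choice]  # shape (1,)


def epsilon_greedy(
    exploit_action: torch.Tensor, mask: torch.Tensor, epsilon: float
) -> torch.Tensor:
    """With probability epsilon explore; otherwise return the exploit action."""
    if random.random() < epsilon:
        # Respect the mask, then match the exploit action's device.
        return sample_available_action(mask).to(exploit_action.device)
    else:
        return exploit_action
```

With `epsilon=1.0` the function always explores, so for a mask such as `[False, True, False, True]` the returned index is always 1 or 3. By contrast, an unmasked `torch.randint(n, (1,))` could return any index in `[0, n)` and creates its result on the default device.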
