Base classes for Off Policy Agents #169

Closed
Sharad24 opened this issue Jun 12, 2020 · 6 comments

@Sharad24
Member

Sharad24 commented Jun 12, 2020

This should make the code much more comprehensible, especially given the number of arguments we have, and at the same time resolve a lot of maintainability issues.

@sampreet-arthi
Member

There are also a lot of code duplication issues that show up in pylint. Maybe work on those too.

@Sharad24
Member Author

Yup, that's the goal. Another thing to do would be to properly decide which parameters should be kept in or removed from the agents, and which should be added to or removed from the trainers.

@sampreet-arthi
Member

Since we already have the On Policy Agents, I'm renaming this to Off Policy. I'll raise a PR for this soon.

@sampreet-arthi sampreet-arthi changed the title Base classes for agents Base classes for Off Policy Agents Jul 17, 2020
@sampreet-arthi sampreet-arthi self-assigned this Jul 18, 2020
@sampreet-arthi
Member

I'm thinking of refactoring each of the individual off policy algorithms first so that the code is neater, more uniform and shorter.

@sampreet-arthi
Member

sampreet-arthi commented Jul 31, 2020

To-do:

  • Refactor DDPG
  • Refactor TD3
  • Refactor SAC
  • Finalise BaseAgent and Base OffPolicyAgent classes
  • Refactor Trainer and OffPolicyTrainer (the Trainer classes are also really long; it would be a good idea to shorten them first, and then separate them into multiple files if they're still too big)
  • Add CUDA support for all of them
  • Add support for Prioritized Experience Replay for all Off Policy algos
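
For illustration only, here is a rough sketch of how the BaseAgent and Base OffPolicyAgent split from the list above might look. Everything in it is an assumption rather than the project's actual code: the class bodies, the method names (`select_action`, `update_params`, `store_transition`, `sample_batch`), the constructor arguments and the toy `ReplayBuffer` are all hypothetical, and it uses only the standard library so it runs on its own. The idea is that shared arguments live in the base class, buffer handling lives in the off-policy base class, and an algorithm such as DDPG, TD3 or SAC would only fill in the two abstract methods.

```python
import random
from abc import ABC, abstractmethod
from collections import deque


class ReplayBuffer:
    """Hypothetical uniform replay buffer (not the project's real class)."""

    def __init__(self, capacity):
        # deque with maxlen drops the oldest transition once full
        self.memory = deque(maxlen=capacity)

    def push(self, transition):
        self.memory.append(transition)

    def sample(self, batch_size):
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


class BaseAgent(ABC):
    """Holds arguments assumed to be common to every agent."""

    def __init__(self, env_name, gamma=0.99, batch_size=64):
        self.env_name = env_name
        self.gamma = gamma
        self.batch_size = batch_size

    @abstractmethod
    def select_action(self, state):
        """Return an action for the given state."""

    @abstractmethod
    def update_params(self):
        """Run one learning update."""


class OffPolicyAgent(BaseAgent):
    """Adds the replay buffer logic that off-policy agents would share."""

    def __init__(self, env_name, replay_size=1000, **kwargs):
        super().__init__(env_name, **kwargs)
        self.replay_buffer = ReplayBuffer(replay_size)

    def store_transition(self, transition):
        self.replay_buffer.push(transition)

    def sample_batch(self):
        # Return None until enough transitions have been collected
        if len(self.replay_buffer) < self.batch_size:
            return None
        return self.replay_buffer.sample(self.batch_size)


class RandomAgent(OffPolicyAgent):
    """Toy subclass standing in for a real algorithm."""

    def select_action(self, state):
        return random.choice([0, 1])

    def update_params(self):
        # A real agent would compute losses here; this only reports
        # how many transitions were sampled.
        batch = self.sample_batch()
        return 0 if batch is None else len(batch)
```

With this layout a subclass never touches the buffer directly, which is one way the duplicated code flagged by pylint could be pulled into a single place.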

@Sharad24
Member Author

Now tracking this in separate issues: #263, #162 and #264.

2 participants