RL-playground

This repository contains some of my experiments with reinforcement learning algorithms, based on the OpenAI Gym toolkit.

Overview

Packages:

openai/envs - the OpenAI Gym-compatible environments used for evaluation
openai/agents - the learning agents

Environments:

NArmedBanditEnv - N-armed bandit (stationary, nonstationary)
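To make the stationary/nonstationary distinction concrete, here is a minimal standalone sketch of an N-armed bandit. It is not the repository's NArmedBanditEnv: the class name `SketchBandit`, the Gaussian reward model, and the random-walk drift (`walk_std`) are all assumptions chosen for illustration, and it deliberately does not depend on Gym.

```python
import random


class SketchBandit:
    """Hypothetical N-armed bandit; not the repository's NArmedBanditEnv."""

    def __init__(self, num_arms=10, stationary=True, walk_std=0.01, seed=None):
        self.rng = random.Random(seed)
        self.stationary = stationary
        self.walk_std = walk_std  # assumed drift size for the nonstationary case
        # True mean reward of each arm, drawn once from N(0, 1).
        self.means = [self.rng.gauss(0.0, 1.0) for _ in range(num_arms)]

    def step(self, action):
        # Reward is the chosen arm's current mean plus unit-variance noise.
        reward = self.rng.gauss(self.means[action], 1.0)
        if not self.stationary:
            # Nonstationary variant: every arm's mean takes a small
            # random-walk step after each pull.
            self.means = [m + self.rng.gauss(0.0, self.walk_std)
                          for m in self.means]
        return reward
```

In the stationary case the arm means never change, so a plain running average of rewards converges to them; in the nonstationary case the means drift, so old rewards gradually become misleading.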

Learning agents:

SampleAverageActionValueAgent - a learning agent based on the sample-average action-value selection algorithm, for both stationary and nonstationary environments
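The sample-average idea can be sketched as follows: each action's value estimate is the running mean of the rewards received for it, updated incrementally. This is an illustrative sketch only, not the repository's SampleAverageActionValueAgent. The class name `SketchSampleAverageAgent`, the epsilon-greedy selection rule, and the optional constant `step_size` for nonstationary problems are assumptions; how the actual agent selects actions or handles nonstationarity is not stated here.

```python
import random


class SketchSampleAverageAgent:
    """Hypothetical epsilon-greedy agent; not the repository's implementation."""

    def __init__(self, num_actions, epsilon=0.1, step_size=None, seed=None):
        self.rng = random.Random(seed)
        self.epsilon = epsilon
        # step_size=None means a true sample average (1/n); a constant value
        # weights recent rewards more, a common choice when rewards drift.
        self.step_size = step_size
        self.q = [0.0] * num_actions     # action-value estimates
        self.counts = [0] * num_actions  # number of times each action was taken

    def select_action(self):
        # Explore with probability epsilon, otherwise act greedily.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        return max(range(len(self.q)), key=lambda a: self.q[a])

    def update(self, action, reward):
        self.counts[action] += 1
        if self.step_size is not None:
            alpha = self.step_size
        else:
            alpha = 1.0 / self.counts[action]
        # Incremental update: Q <- Q + alpha * (R - Q).
        self.q[action] += alpha * (reward - self.q[action])
```

With `step_size=None` the estimate for an action equals the arithmetic mean of its observed rewards, without storing the reward history.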

Usage

import gym

from openai.agents.sampleaverage import SampleAverageActionValueAgent

def main():
    # load the environment
    env = gym.make('10ArmedBanditStationary-v0')

    # set up the agent and the run parameters
    agent = SampleAverageActionValueAgent(num_actions=10)
    episode_count = 1
    max_steps = 100
    reward = 0
    done = False

    # range (not xrange) runs on both Python 2 and Python 3
    for i in range(episode_count):
        ob = env.reset()

        for j in range(max_steps):
            # the agent receives the previous reward and picks the next action
            action = agent.evaluate(reward, done)
            ob, reward, done, _ = env.step(action)
            if done:
                break


if __name__ == '__main__':
    main()

yaricom/RL-playground