Solving an LTL-specified problem through Self-Play
Suppose we have a grid world and an agent within it. Suppose also that we are given a set of LTL formulas specifying the rules that the agent should abide by within this grid world. For example, assume we have the following grid world:
A -> Location of the agent
E -> Empty cell
O -> Obstacles
G -> Location of the goal
| E | O | G |
|---|---|---|
| E | E | E |
| A | O | E |
Also assume we have the following simple LTL specification: eventually G, and always not O (meaning the agent A should reach G without ever crossing an obstacle O):
[]~O /\ <>G
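To make the specification concrete, here is a minimal sketch (in Python) of how one could check whether a finite trajectory over this grid satisfies "eventually G, and always not O". The grid encoding, the `satisfies_spec` helper, and the (row, col) coordinate convention are illustrative assumptions, not part of any fixed setup.

```python
# The grid from the example above, encoded as a list of rows.
GRID = [
    ["E", "O", "G"],
    ["E", "E", "E"],
    ["A", "O", "E"],   # the agent starts in the bottom-left cell
]

def satisfies_spec(trajectory):
    """Check a finite trajectory of (row, col) cells against the spec:
    never enter an obstacle O, and eventually reach the goal G."""
    reached_goal = False
    for row, col in trajectory:
        cell = GRID[row][col]
        if cell == "O":        # violates "always not O"
            return False
        if cell == "G":        # witnesses "eventually G"
            reached_goal = True
    return reached_goal

# One trajectory that satisfies the spec: move up, then right along the
# middle row, then up into the goal.
path = [(2, 0), (1, 0), (1, 1), (1, 2), (0, 2)]
print(satisfies_spec(path))    # True
```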
Now, we aim to create policies that abide by this specification and lead to trajectories that satisfy it. We are using an AlphaGo Zero approach to this problem, meaning that we have a policy+value network that, at each time step, outputs a policy (over movements) and a value (the estimated probability of satisfying the specs). We then run multiple Monte-Carlo rollouts from the current position to improve that policy, and to make a move we sample from the improved policy.
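As a rough illustration of this decision loop, the sketch below pairs a small policy+value network with a simplified rollout-based move selection. The tiny network architecture, the `simulate(state, action)` helper (assumed to run one rollout and report whether the spec was satisfied), and the visit-count policy improvement are all illustrative stand-ins for the full MCTS machinery, not the exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

ACTIONS = ["up", "down", "left", "right"]

class PolicyValueNet(nn.Module):
    """Maps a flattened grid observation to a movement policy and a value
    estimating the probability of eventually satisfying the LTL spec."""
    def __init__(self, n_cells, n_actions=len(ACTIONS)):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(n_cells, 64), nn.ReLU())
        self.policy_head = nn.Linear(64, n_actions)
        self.value_head = nn.Linear(64, 1)

    def forward(self, state):
        h = self.body(state)
        policy = F.softmax(self.policy_head(h), dim=-1)   # movement distribution
        value = torch.sigmoid(self.value_head(h))         # P(spec satisfied)
        return policy, value

def choose_move(net, state, simulate, num_rollouts=50):
    """Improve the network's policy with Monte-Carlo rollouts from the
    current state, then sample the next move from the improved policy."""
    with torch.no_grad():
        prior, _ = net(state)
    visits = torch.zeros(len(ACTIONS))
    for _ in range(num_rollouts):
        action = torch.multinomial(prior, 1).item()   # explore guided by the prior
        if simulate(state, action):                   # rollout satisfied the spec?
            visits[action] += 1
    # Smooth so the distribution stays valid even if no rollout succeeded.
    improved = (visits + 1e-3) / (visits.sum() + 1e-3 * len(ACTIONS))
    return ACTIONS[torch.multinomial(improved, 1).item()]
```

Here `state` is assumed to be a flattened, one-dimensional tensor encoding of the grid; in the full AlphaGo Zero scheme the rollout statistics would come from a proper tree search with PUCT selection rather than this flat rollout loop.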