Skip to content

Latest commit

 

History

History
69 lines (51 loc) · 3.2 KB

README.md

File metadata and controls

69 lines (51 loc) · 3.2 KB

Figurer: Sometimes you need a little help with your figuring

Figurer helps with messy, real world, planning problems. It can help you drive a car or fly a quadcopter by choosing the best sequence of actions. But figurer can't do these things for you.

You will still need to thoroughly define the problem and suggest possible solutions. Then Figurer uses your predictions about cause-and-effect, your suggestions for likely next moves, and your judgement about the value of each possible result. Figurer tries a variety of possibilities, looks ahead 10 steps, and finds the right combination of your suggestions that will best satisfy your goals.

Figurer was designed for use with self-driving cars, yet is applicable to any planning problem for which the physics and desired outcome are well understood.

Installation and Usage

See language-specific README files for instructions on installation and usage.

Roadmap

  • Documentation
    • Simple example code for README
    • Explain policy, predict, and value functions
    • Explain state and actuation
    • Move functions for internal use only to separate namespace
  • API Changes
    • Support randomness in environment (random policy)
    • Support adversarial behavior (maximize vs minimize policy)
    • Refine previous solution with new starting state
  • Speed Optimization (See References to learn about progressive widening, UCT, and KR-UCT.)
    • Focus exploration on most promising paths (UCT)
    • Balance between adding new nodes and refining old nodes (progressive widening)
    • Interpolate policy and value between nodes (KR-UCT)
    • Re-use nodes to share data between branches (new idea)
    • Multi-threading

References

Monte Carlo Tree Search – beginners guide

This guide explains Monte Carlo tree search, the basis for Figurer's algorithm. Note that the original Monte Carlo tree search, as described in this guide, only works for discrete problems. Understanding the original algorithm is still helpful as background before learning about extensions, such as progressive widening, to handle continuous problems.

Monte Carlo Tree Search in Continuous Action Spaces with Execution Uncertainty

This paper describes progressive widening as a continuous extension of Monte Carlo tree search, as well as KR-UCT as a way of optimizing for speed. This paperis very helpful for understanding how Figurer works, including upcoming speed optimizations.

A0C: Alpha Zero in Continuous Action Space

Deep reinforcement learning combines a planner like Figurer with neural network-based policy and value functions. This paper discusses such applications and is relevant to anyone who wants to use Figurer for deep reinforcement learning.

License

Copyright © 2018-2019 Eric Lavigne

Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.