# Policy Evaluation in Grid World

This project implements iterative policy evaluation in a Grid World environment, following the techniques of Chapter 4 of *Reinforcement Learning: An Introduction* by Sutton and Barto. Under an equiprobable random policy, the value function is approximated iteratively over a fixed number of sweeps, and the maximum delta (|v − v′|) at each sweep is plotted to visualize convergence.
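The update loop can be sketched as follows. This is an illustrative sketch rather than the project's code: the reward of −1 per step, the single terminal corner state, and wall-clamped moves are assumptions made for the example.

```python
import numpy as np

def evaluate_policy(n_rows, n_cols, steps, gamma=1.0):
    """Iterative policy evaluation for an equiprobable random policy on a
    toy grid world (illustrative sketch). Returns the value function and
    the max delta |v - v'| recorded at each sweep."""
    V = np.zeros((n_rows, n_cols))
    deltas = []
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    terminal = (n_rows - 1, n_cols - 1)         # assumed terminal corner
    for _ in range(steps):
        V_new = np.zeros_like(V)
        for r in range(n_rows):
            for c in range(n_cols):
                if (r, c) == terminal:
                    continue  # terminal state stays at value 0
                total = 0.0
                for dr, dc in moves:
                    # moves off the grid leave the agent in place
                    nr = min(max(r + dr, 0), n_rows - 1)
                    nc = min(max(c + dc, 0), n_cols - 1)
                    # each action has probability 0.25; reward is -1 per step
                    total += 0.25 * (-1.0 + gamma * V[nr, nc])
                V_new[r, c] = total
        deltas.append(float(np.max(np.abs(V_new - V))))
        V = V_new
    return V, deltas
```

Plotting `deltas` against the sweep index gives the convergence curve described above.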

*(Figures: value function; iterative policy evaluation.)*

After evaluation is complete, the policy is adjusted based on the computed value function and tested in the Gymnasium environment.

*(Figure: final policy.)*
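One common way to adjust the policy is a greedy one-step lookahead over the learned values. This is a minimal sketch under that assumption; `greedy_policy` and its action encoding are illustrative, not the project's API.

```python
import numpy as np

def greedy_policy(V):
    """Derive a greedy policy from a value function V on a grid where
    moves off the grid leave the agent in place (illustrative sketch)."""
    n_rows, n_cols = V.shape
    moves = {'up': (-1, 0), 'down': (1, 0), 'left': (0, -1), 'right': (0, 1)}
    policy = {}
    for r in range(n_rows):
        for c in range(n_cols):
            best_action, best_value = None, float('-inf')
            for name, (dr, dc) in moves.items():
                nr = min(max(r + dr, 0), n_rows - 1)
                nc = min(max(c + dc, 0), n_cols - 1)
                # pick the action leading to the highest-valued neighbor
                if V[nr, nc] > best_value:
                    best_action, best_value = name, V[nr, nc]
            policy[(r, c)] = best_action
    return policy
```

Ties are broken here by dictionary order; a real implementation might break them randomly or keep all maximizing actions.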

## Installation

1. Clone the repository and navigate to the project directory.
2. Install dependencies:

   ```shell
   pip install -r requirements.txt
   ```

## Usage

Run the main script with default settings:

```shell
python3 main.py
```

This runs policy evaluation on the large grid and generates visualizations of the value function and of the policy's execution.

*(Figures: large-map execution; max-delta plot for the large map.)*

## Example with a Smaller Map

To run the policy evaluation on a smaller grid, use the `--map` option:

```shell
python3 main.py --map='small'
```

*(Figures: small-map execution; max-delta plot for the small map.)*

## Extended CLI Options

Customize the evaluation with additional command-line options:

```shell
python3 main.py --map='small' --video_folder='./videos' --steps=1000 --episodes=1
```

- `--map`: choose from the available map sizes (`small`, `large`).
- `--video_folder`: folder in which to save video recordings of the policy execution.
- `--steps`: number of steps for iterative policy evaluation.
- `--episodes`: number of episodes to run.
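These options could be wired up with `argparse` along the following lines. This is a sketch: the flag names come from the list above, but the defaults shown are assumptions, not the project's actual defaults.

```python
import argparse

def parse_args(argv=None):
    """Parse the CLI options listed above (defaults are illustrative)."""
    parser = argparse.ArgumentParser(
        description="Iterative policy evaluation in Grid World")
    parser.add_argument('--map', choices=['small', 'large'], default='large',
                        help='grid size to evaluate')
    parser.add_argument('--video_folder', default='./videos',
                        help='folder for video recordings of the policy execution')
    parser.add_argument('--steps', type=int, default=1000,
                        help='number of steps for iterative policy evaluation')
    parser.add_argument('--episodes', type=int, default=1,
                        help='number of episodes to run')
    return parser.parse_args(argv)
```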

## License

This project is licensed under the MIT License.