Adds some extra notes on RL training #1

LukeWood · 2020-08-09T17:53:46Z

I've recently been trying to take on RL again (a bit of redemption after my
glaring defeat by Box2D car racing two years ago!). Along the way I
came across some new analogies that were extremely useful to me in
building a mental model of how some of these techniques work. I thought
they might be useful to some of your students.

This commit:

- adds an analogy for the purpose of the target network
- emphasizes the _reason_ experience replay works
- adds a section on advanced RL techniques used to overcome sparse
  reward functions

LukeWood requested a review from eclarson August 9, 2020 17:54
