From 759bd628e1f46be34f5c33022ba5345ffabf1815 Mon Sep 17 00:00:00 2001
From: Devin Schumacher
Date: Mon, 27 Nov 2023 05:45:27 +0000
Subject: [PATCH] GITBOOK-171: change request with no subject merged in GitBook

---
 chapters/policy-gradients.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/chapters/policy-gradients.md b/chapters/policy-gradients.md
index d2091fb..db595f3 100644
--- a/chapters/policy-gradients.md
+++ b/chapters/policy-gradients.md
@@ -2,7 +2,7 @@
 
 Policy Gradients (PG) is an optimization algorithm used in artificial intelligence and machine learning, specifically in the field of reinforcement learning. This algorithm operates by directly optimizing the policy the agent is using, without the need for a value function. The agent's policy is typically parameterized by a neural network, which is trained to maximize expected return.
 
-{% embed url="https://youtu.be/9zTxddZRRac?si=nhFUSg6YFbIiMW_h" %}
+{% embed url="https://youtu.be/-BUFm1sH6Mk?si=sO6X99jk6wW34Glm" %}
 
 ## Policy Gradients: Introduction
 