average critic loss in twin critic

Summary: Average the critic loss in the twin critic instead of taking the sum.

Reviewed By: BerenLuthien

Differential Revision: D59823732

fbshipit-source-id: 5fe68de1aab8e26d68ecc89364d2f83a1f0c2639
facebookresearch · Jul 16, 2024 · d1b22eb
1 parent 13e5ffc
Showing 1 changed file with 1 addition and 0 deletions.
diff --git a/pearl/utils/functional_utils/learning/critic_utils.py b/pearl/utils/functional_utils/learning/critic_utils.py
@@ -194,4 +194,5 @@ def twin_critic_action_value_loss(
     loss = criterion(
         q_1.reshape_as(expected_target_batch), expected_target_batch.detach()
     ) + criterion(q_2.reshape_as(expected_target_batch), expected_target_batch.detach())
+    loss = loss / 2.0
     return loss, q_1, q_2
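
The effect of the added line can be illustrated with a small sketch. This is not Pearl's implementation: it uses plain Python floats instead of tensors, and the helper names `mse` and `twin_critic_loss` and the `average` flag are hypothetical, introduced only for this example. `mse` stands in for the `criterion` in the diff, assuming a mean-squared-error criterion. Dividing by 2.0 turns the sum of the two critic losses into their mean, so the loss value (and hence the gradient scale) is half that of the summed version.

```python
from typing import Sequence


def mse(predictions: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared error over a batch (hypothetical stand-in for `criterion`)."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def twin_critic_loss(
    q_1: Sequence[float],
    q_2: Sequence[float],
    targets: Sequence[float],
    average: bool = True,
) -> float:
    """Combine the losses of two critics against the same targets.

    With average=False this mirrors the behavior before the commit (sum);
    with average=True it mirrors the behavior after it (sum divided by 2.0).
    """
    loss = mse(q_1, targets) + mse(q_2, targets)
    if average:
        loss = loss / 2.0
    return loss


# Illustrative values only, not taken from the repository.
targets = [1.0, 2.0, 3.0]
q_1 = [1.5, 2.0, 2.5]
q_2 = [0.0, 2.0, 4.0]

summed = twin_critic_loss(q_1, q_2, targets, average=False)
averaged = twin_critic_loss(q_1, q_2, targets, average=True)
print(summed, averaged)
```

The averaged value is exactly half the summed value, which keeps the twin-critic loss on the same scale as a single-critic loss computed with the same criterion.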