Update vector autoreset documentation. (#1318)

pseudo-rnd-thoughts · web-flow · commit e2599f05a4af · 2025-02-25T17:12:08.000Z
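
Illustration (not part of the patch): the docstring paragraph removed from `VectorEnv` below recommended tracking `has_autoreset = np.logical_or(terminated, truncated)` when filling a replay buffer. The sketch below shows one way that idea could look for recorded rollouts. It assumes `next-step` autoreset, i.e. that the step call following an episode end returns the reset observation, so that transition should not be stored. The helper name `valid_transition_mask` and the `(num_steps, num_envs)` array layout are hypothetical and are not Gymnasium API.

```python
import numpy as np


def valid_transition_mask(terminated: np.ndarray, truncated: np.ndarray) -> np.ndarray:
    """Mark which recorded steps are real transitions under next-step autoreset.

    Both inputs are boolean arrays of shape (num_steps, num_envs), holding the
    flags returned by each call to ``envs.step``. Hypothetical helper, assuming
    that the step after an episode end only resets the sub-environment.
    """
    # Same expression the old docstring suggested, applied to a whole rollout.
    has_autoreset = np.logical_or(terminated, truncated)

    # The first recorded step is assumed to follow a fresh reset, so it is valid.
    mask = np.ones_like(has_autoreset, dtype=bool)

    # Step t is a reset step (not a real transition) if the episode ended at t - 1.
    mask[1:] = ~has_autoreset[:-1]
    return mask


# Two sub-environments, four recorded steps.
terminated = np.array([[False, False], [True, False], [False, False], [False, False]])
truncated = np.array([[False, False], [False, False], [False, True], [False, False]])
mask = valid_transition_mask(terminated, truncated)
```

Rows of `mask` that are `False` mark steps whose observation comes from an autoreset. Other autoreset modes would need different handling, so checking `metadata["autoreset_mode"]` before applying a mask like this seems prudent.
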
diff --git a/gymnasium/vector/async_vector_env.py b/gymnasium/vector/async_vector_env.py
@@ -121,7 +121,7 @@ def __init__(
                 'different' defines that there can be multiple observation spaces with different parameters though requires the same shape and dtype,
                 warning, may raise unexpected errors. Passing a ``Tuple[Space, Space]`` object allows defining a custom ``single_observation_space`` and
                 ``observation_space``, warning, may raise unexpected errors.
-            autoreset_mode: The Autoreset Mode used, see todo for more details.
+            autoreset_mode: The autoreset mode to use. See https://farama.org/Vector-Autoreset-Mode for more information.
 
         Warnings:
             worker is an advanced mode option. It provides a high degree of flexibility and a high chance
diff --git a/gymnasium/vector/sync_vector_env.py b/gymnasium/vector/sync_vector_env.py
@@ -75,7 +75,7 @@ def __init__(
             observation_mode: Defines how environment observation spaces should be batched. 'same' defines that there should be ``n`` copies of identical spaces.
                 'different' defines that there can be multiple observation spaces with the same length but different high/low values batched together. Passing a ``Space`` object
                 allows the user to set some custom observation space mode not covered by 'same' or 'different.'
-            autoreset_mode: The Autoreset Mode used, see todo for more details.
+            autoreset_mode: The autoreset mode to use. See https://farama.org/Vector-Autoreset-Mode for more information.
 
         Raises:
             RuntimeError: If the observation space of some sub-environment does not match observation_space
diff --git a/gymnasium/vector/vector_env.py b/gymnasium/vector/vector_env.py
@@ -54,6 +54,13 @@ class VectorEnv(Generic[ObsType, ActType, ArrayType]):
     vector environments that contains several unique arguments for modifying environment qualities, number of environment,
     vectorizer type, vectorizer arguments.
 
+    To avoid having to wait for all sub-environments to terminate before resetting, implementations can autoreset
+    sub-environments on episode end (`terminated or truncated is True`). This is crucial for correctly implementing training
+    algorithms with vector environments. By default, Gymnasium's implementation uses `next-step` autoreset, with the
+    :class:`AutoresetMode` enum defining the available options. The mode used by a vector environment should be available in `metadata["autoreset_mode"]`.
+    Warning: some vector implementations or training algorithms only support particular autoreset modes.
+    For more information, read https://farama.org/Vector-Autoreset-Mode.
+
     Note:
         The info parameter of :meth:`reset` and :meth:`step` was originally implemented before v0.25 as a list
         of dictionary for each sub-environment. However, this was modified in v0.25+ to be a dictionary with a NumPy
@@ -102,12 +109,6 @@ class VectorEnv(Generic[ObsType, ActType, ArrayType]):
         {}
         >>> envs.close()
 
-    To avoid having to wait for all sub-environments to terminated before resetting, implementations will autoreset
-    sub-environments on episode end (`terminated or truncated is True`). As a result, when adding observations
-    to a replay buffer, this requires knowing when an observation (and info) for each sub-environment are the first
-    observation from an autoreset. We recommend using an additional variable to store this information such as
-    ``has_autoreset = np.logical_or(terminated, truncated)``.
-
     The Vector Environments have the additional attributes for users to understand the implementation
 
     - :attr:`num_envs` - The number of sub-environment in the vector environment