OpenAI Gym Environments for the StarCraft II PySC2 environment.
After cloning the repository, you can use the environments in one of two ways:
- Add the directory where you cloned the repo to your
PYTHON_PATH
- Install the package in development mode using pip:
pip install -e .
If you use the first option, you need to manually make sure the dependencies are installed.
The second option will install the package into your pip
environment
as a link to the directory, so it will reflect the changes when
you git pull
or make any changes to the code.
You need the following minimum code to run any environment:
Import gym and this package:
import gym
import sc2gym.envs
Import and initialize absl.flags: (this is due to pysc2
dependency)
import sys
from absl import flags
FLAGS = flags.FLAGS
FLAGS(sys.argv)
Create and initialize the specific environment as indicated in the next section.
The full StarCraft II game environment. Initialize as follows:
env = gym.make('SC2Game-v0')
env.settings['map_name'] = '<desired map name>'
Versions:
SC2Game-v0
: The full game with complete access to action and observation space.
-
The action space for this environment doesn't require the call to
functionCall
likepysc2
does. You just need to call it with an array of action and arguments. For example:_SELECT_ARMY = actions.FUNCTIONS.select_army.id _SELECT_ALL = [0] env.step([_SELECT_ARMY, _SELECT_ALL])
It will check the first element in the array (the action) and make sure it's available before trying to pass it along to the
pysc2
environment. -
This environment doesn't specify the
observation_space
andaction_space
members like traditionalgym
environments. Instead, it provides access to theobservation_spec
andaction_spec
objects from thepysc2
environment.
The MoveToBeacon mini game. Initialize as follows:
env = gym.make('SC2MoveToBeacon-v0')
Versions:
- 'SC2MoveToBeacon-v0': The observation is a
[1, 64, 64]
numpy array that represents theobs.observation['screen'][_PLAYER_RELATIVE]
plane from thepysc2
observation. The action is a number between 0 and 4095 (64x64-1), which is a 1-d representation of the screen coordinates to move the marine towards. The environment pre-selects the marine at the start of each episode. - 'SC2MoveToBeacon-v1': The observation is a
[1, 64, 64]
numpy array that represents theobs.observation['screen'][_PLAYER_RELATIVE]
plane from thepysc2
observation. The action is an array of two numbers, each between 0 and 63, representing the screen coordinates to move the marine towards. The environment pre-selects the marine at the start of each episode.
The CollectMineralShards mini game. Initialize as follows:
env = gym.make('SC2CollectMineralShards-v0')
Versions:
- 'SC2CollectMineralShards-v0': The observation is a
[1, 64, 64]
numpy array that represents theobs.observation['screen'][_PLAYER_RELATIVE]
plane from thepysc2
observation. The action is a number between 0 and 4095 (64x64-1), which is a 1-d representation of the screen coordinates to move the marines towards. The environment pre-selects both marines at the start of each episode. - 'SC2CollectMineralShards-v1': The observation is a
[1, 64, 64]
numpy array that represents theobs.observation['screen'][_PLAYER_RELATIVE]
plane from thepysc2
observation. The action is an array of two numbers, each between 0 and 63, representing the screen coordinates to move the marines towards. The environment pre-selects the marine at the start of each episode. - 'SC2CollectMineralShards-v2': The observation is a
[2, 64, 64]
numpy array that represents theobs.observation['screen'][_PLAYER_RELATIVE]
and theobs.observation['screen'][_SELECTED]
planes from thepysc2
observation. The action is an array of three numbers. The first number is between 0 and 2, representing which control group to move. The remaining two numbers are integers between 0 and 63, representing the screen coordinates to move the marines in this control group towards. The environment automatically creates the following control groups at the start of each episode. Each episode starts with control group 3 pre-selected.- Group 1 [index 0]: Both marines
- Group 2 [index 1]: Marine 1
- Group 3 [index 2]: Marine 2
- Per the Gym environment specifications, the reset function returns an observation, and the step function returns a tuple (observation, reward, done, info), where info is an empty dictionary and the observation is the observation object from the pysc2 environment. The reward is the same as observation.reward, and done is equal true if observation.step_type is LAST.
- In addition to
step()
andreset()
, the environments define asave_replay()
method, that takes a single parameter,replay_dir
, which is the name of the replay directory to save to inside theStarCraft II/Replays/
folder. - All the environments have
action_spec
andobservation_spec
properties, in addition to theaction_space
andobservation_space
properties defined for the mini game environments. - All the environments have the following additional properties:
episode
: The current episode numbernum_step
: The total number of steps takenepisode_reward
: The total reward received this episodetotal_reward
: The total reward received for all episodes
- The examples folder contains examples of using the various environments.
Copyright 2017 Islam Elnabarawy
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.