Mixture of Experts MCTS (MoE MCTS) (#216)

* - added game phase detection file
  - adjusted initial Dockerfile
  - minor changes to convert_pgn_to_planes.ipynb and pgn_to_planes_converter.py
* - changed openspiel git
* - changed openspiel git
* fixed phase ids
* added dataset creation option for specific phases
* param changes in train_cnn.ipynb
* - fixed plys_to_end list to only include values for moves that really have been used
  - added counter for games without positions for current phase
* - changes to train_cnn to make it compatible
  - added analyse_game_phases.py to analyse game phase distribution and other information
  - minor changes
* mcts phase integration working, some improvements missing
* - added phase_to_nets map to make sure the right net is used for each phase
  - board->get_phase now expects the total amount of phases as an argument
  - phaseCountMap is now immediately filled
* - added game phase vector to created datasets
  - added sample weighting to losses pytorch training files
  - load_pgn_dataset() now returns a dict
  - added file for easily generating normalized cutechess-cli commands
* minor fixes for weighted training
* - fixes and improvements to prs.py from cutechess-cli
  - added file to generate plots based on cutechess results
* - changes for continuing training from tar file (pytorch)
* - added python file for training (exported notebook)
* - added python file for executing cutechess shell commands
* - added the option to specify additional eval sets (unweighted) to pass through the trainer agent
  - you can now pass a phase to load_pgn_dataset to load a non default dataset
* - minor changes
* - minor changes for debugging
* - bugfix in train_cnn.py for additional dataloaders
* - bugfix in to correctly determine train iterations
  - added printing total positions in dataset when loading
* - minor changes in prs.py
* - minor changes for chess 960
* - reverted mode and version back to 2 and 3
* fixed bug when executing isready multiple times consecutively while setting networkLoaded back to false
* alternative bugfix attempt for linux
* - temporary fix for chess960 wrong training representation
  - adjusted cutechess run file to support 960 matches
* - changes to incorporate 960 dataset analysis
  - new and adjusted graphs in game_phase_detector.py (should be put into a separate file)
  - new plots in create_cutechess_plots.py
* chess960 input representation fix (c++ engine files still unadjusted and assuming a wrong input representation)
* - added plot generating notebooks to git (/etc folder)
  - moved game phase analysis code from game_phase_detector.py to own file (analyse_game_phase_definition.py)
  - minor changes in train_cnn.py
  - adjusted .gitignore
* - added support for naive movecount phases
* - minor path fix in dataset_loader.py
* undone temporary fix for broken chess960 input representation
* - added support for phases by movecount in c++ code (currently always assumes phases by movecount)
  - set default value for UCI_Chess960 back to false
  - minor fixes
* - minor plotting adjustments
  - added colorblind palette
* - adjusted run_cutechess_experiments.py to be able to do experiments against stockfish
* - added documentation
* - minor assertion change in train_cnn.py
* - cleaned code and removed sections that are not needed anymore
* - changed underscore naming to camelCase naming in several cases
* - added UCI option Game_Phase_Definition with options "lichess" and "movecount" and corresponding searchsettings enum GamePhaseDefinition
* - added searchSettings to RawNetAgent to access selected gamePhaseDefinition
* - aligned train_cnn.ipynb with code inside train_cnn.py
* - cleaned cell outputs of main notebooks
* - further notebook output cleanings
* - removed files unnecessary for pull request and reverted several files back to initial state of fork
* - reverted .gitignore and Dockerfile to older state
* - .gitignore update to different previous state
* Update crazyara.cpp
  Fix compile error
* Update board.cpp
  Fix error: control reaches end of non-void function
* Add GamePhase get_phase to states
* Add GamePhase OpenSpielState::get_phase()
* Update get_data_loader() to load dict instead

---------

Co-authored-by: Felix Helfenstein <[email protected]>
QueensGambit · May 3, 2024 · ce12a9c
1 parent 9e927c3
Showing 48 changed files with 1,101 additions and 459 deletions.
diff --git a/DeepCrazyhouse/configs/main_config.py b/DeepCrazyhouse/configs/main_config.py
@@ -11,7 +11,12 @@
 
 # define the default dir where the training data in plane representation is located
 # e.g. for supervised learning default_dir = "/data/planes/"
-default_dir = "/data/RL/export/"
+default_dir = "/data/kingbase2019_lite_pgn_months/"
+#default_dir = "C:/workspace/Python/CrazyAra/data/kingbase2019_lite_pgn_months/"
+#default_dir = "C:/workspace/Python/CrazyAra/data/chess960_pgns/"
+phase = None  # current phase to use, set to None to treat everything as a single phase
+# type of phase definition, either "lichess" or "movecountX" with X determining the number of phases
+phase_definition = "movecount3"
 
 if default_dir[-1] != "/":
     default_dir = default_dir + "/"
@@ -30,29 +35,38 @@
     # The test directory includes games from the month:             2017-05
     # The mate_in_one directory includes games from the month:      lichess_db_standard_rated_2015-08.pgn
 
+    "phase": phase,
+    "phase_definition": phase_definition,
+    "default_dir": default_dir,
+
     # The pgn directories contain all files which are converted to plane representation
-    "pgn_train_dir": "/home/demo_user/datasets/lichess/Crazyhouse/pgn/train/",
-    "pgn_val_dir": "/home/demo_user/datasets/lichess/Crazyhouse/pgn/val/",
-    "pgn_test_dir": "/home/demo_user/datasets/lichess/Crazyhouse/pgn/test/",
-    "pgn_mate_in_one_dir": "/home/demo_user/datasets/lichess/Crazyhouse/pgn/mate_in_one/",
+    "pgn_train_dir": default_dir + "pgn/train/",
+    "pgn_val_dir": default_dir + "pgn/val/",
+    "pgn_test_dir": default_dir + "pgn/test/",
+    "pgn_mate_in_one_dir": default_dir + "pgn/mate_in_one/",
     # The plane directories contain the plane representation of the converted board state
     #  (.zip files which have been compressed by  the python zarr library)
-    "planes_train_dir": default_dir + "train/",
-    "planes_val_dir": default_dir + "val/",
-    "planes_test_dir": default_dir + "test/",
-    "planes_mate_in_one_dir": default_dir + "mate_in_one/",
+
+    "planes_train_dir": default_dir + f"planes/{phase_definition}/phase{phase}/train/",
+    "planes_val_dir": default_dir + f"planes/{phase_definition}/phase{phase}/val/",
+    "planes_test_dir": default_dir + f"planes/{phase_definition}/phase{phase}/test/",
+    "planes_mate_in_one_dir": default_dir + f"planes/{phase_definition}/phase{phase}/mate_in_one/",
 
     # The rec directory contains the plane representation which are used in the training loop of the network
     # use the notebook create_rec_dataset to generate the .rec files:
     # (Unfortunately when trying to start training with the big dataset a memory overflow occurred.
     # therefore the old working solution was used to train the latest model by loading the dataset via batch files)
     #  "train.idx", "val.idx", "test.idx", "mate_in_one.idx", "train.rec", "val.rec", "test.rec", "mate_in_one.rec"
-    "rec_dir": "/home/demo_user/datasets/lichess/Crazyhouse/rec/",
+
+    "rec_dir": default_dir + "rec/",
     # The architecture dir contains the architecture definition of the network in mxnet .symbol format
     # These directories are used for inference
-    "model_architecture_dir": "/home/demo_user/models/Crazyhouse/symbol/",
+    #"model_architecture_dir": "/home/demo_user/models/Crazyhouse/symbol/",
+    "model_architecture_dir": "/DeepCrazyhouse/models/Classic/symbol/",
+
     # the weight directory contains the weights of the network in mxnet .params format
-    "model_weights_dir": "/home/demo_user/models/Crazyhouse/params/",
+    #"model_weights_dir": "/home/demo_user/models/Crazyhouse/params/",
+    "model_weights_dir": "/DeepCrazyhouse/models/Classic/params/",
 
     # layer name of the value output layer (e.g. value_tanh0 for legacy crazyhouse networks and value_out for newer
     # networks)
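
The plane directories in this hunk are now composed from `default_dir`, `phase_definition` and `phase`. A minimal sketch of that composition is given here; the helper name `planes_dir` is hypothetical and does not exist in the commit. It also illustrates that with `phase = None` the f-string renders the literal directory name `phaseNone`.

```python
def planes_dir(default_dir, phase_definition, phase, dataset_type):
    """Hypothetical helper mirroring the f-string paths used in main_config."""
    # main_config appends a trailing slash to default_dir if it is missing
    if default_dir[-1] != "/":
        default_dir = default_dir + "/"
    return default_dir + f"planes/{phase_definition}/phase{phase}/{dataset_type}/"


# Single-phase setup (phase = None) and a phase-specific dataset
single_phase_dir = planes_dir("/data/kingbase2019_lite_pgn_months", "movecount3", None, "train")
phase_one_dir = planes_dir("/data/kingbase2019_lite_pgn_months/", "movecount3", 1, "val")
print(single_phase_dir)
print(phase_one_dir)
```

Datasets created for the single-phase case would therefore be expected under a `phaseNone/` folder, assuming the conversion step writes to the same configured paths.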

diff --git a/DeepCrazyhouse/configs/train_config.py b/DeepCrazyhouse/configs/train_config.py
@@ -79,12 +79,8 @@ class TrainConfig:
                            " pytorch training loop.)"
     k_steps_initial: int = 0
 
-    info_symbol_file: str = "symbol_file is the neural network architecture file to continue training with (deprecated)" \
-                            "(e.g. 'model_init-symbol.json', model-1.19246-0.603-symbol.json')"
-    symbol_file: str = ''
-    info_params_file: str = "params_file is the neural network weight file to continue training with (deprecated)" \
-                            "(e.g. 'model_init-0000.params' # model-1.19246-0.603-0223.params')"
-    params_file: str = ''
+    info_tar_file: str = "tar_file is the neural network weight file to continue training with"
+    tar_file: str = ''
 
     info_optimizer_name: str = "optimizer_name is the optimizer that is used in the training loop to update the weights." \
                                "(e.g. nag, sgd, adam, adamw)"
@@ -214,4 +210,5 @@ class TrainObjects:
     momentum_schedule = None
     metrics = None
     variant_metrics = None
+    phase_weights = {0: 1., 1: 1., 2: 1.}
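
The commit message mentions sample weighting in the PyTorch losses, and `phase_weights` maps each phase id to a weight. The training code itself is not part of this excerpt, so the following is only a NumPy sketch of how such a map could be combined with a `phase_vector`. The function name `weighted_mean_loss` and the normalisation by the sum of weights are assumptions, not the commit's implementation.

```python
import numpy as np


def weighted_mean_loss(per_sample_loss, phase_vector, phase_weights):
    """Average per-sample losses, weighting each sample by the weight of its phase.

    Assumption: the result is normalised by the sum of the weights.
    """
    weights = np.array([phase_weights[int(p)] for p in phase_vector], dtype=np.float64)
    losses = np.asarray(per_sample_loss, dtype=np.float64)
    return float(np.sum(weights * losses) / np.sum(weights))


losses = [1.0, 2.0, 3.0]
phases = [0, 1, 2]
uniform = weighted_mean_loss(losses, phases, {0: 1., 1: 1., 2: 1.})
opening_heavy = weighted_mean_loss(losses, phases, {0: 2., 1: 1., 2: 1.})
print(uniform, opening_heavy)
```

With the default weights of 1.0 for every phase, this reduces to the ordinary mean.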
 
diff --git a/DeepCrazyhouse/src/domain/util.py b/DeepCrazyhouse/src/domain/util.py
@@ -165,7 +165,7 @@ def get_numpy_arrays(pgn_dataset):
     Loads the content of the dataset file into numpy arrays
 
     :param pgn_dataset: dataset file handle
-    :return: numpy-arrays:
+    :return: pgn_dataset_arrays_dict: dict of {specific dataset part: numpy-array} with the following keys
             start_indices - defines the index where each game starts
             x - the board representation for all games
             y_value - the game outcome (-1,0,1) for each board position
@@ -174,6 +174,7 @@ def get_numpy_arrays(pgn_dataset):
              This can be used to apply discounting
             y_best_move_q - Q-value for the position of the selected move
              (this information is only available for generated data during selfplay)
+            phase_vector - array of the game phase of each position
     """
     # Get the data
     start_indices = np.array(pgn_dataset["start_indices"])
@@ -184,14 +185,22 @@ def get_numpy_arrays(pgn_dataset):
     except Exception:
         y_policy = np.array(pgn_dataset["y_policy"])
 
-    possible_entries = ["plys_to_end", "y_best_move_q"]
-    entries = [None] * 2
+    possible_entries = ["plys_to_end", "y_best_move_q", "phase_vector"]
+    entries = [None] * 3
     for idx, entry in enumerate(possible_entries):
         try:
             entries[idx] = np.array(pgn_dataset[entry])
         except KeyError:
             pass
-    return start_indices, x, y_value, y_policy, entries[0], entries[1]
+
+    pgn_dataset_arrays_dict = {"start_indices": start_indices,
+                               "x": x,
+                               "y_value": y_value,
+                               "y_policy": y_policy,
+                               "plys_to_end": entries[0],
+                               "y_best_move_q": entries[1],
+                               "phase_vector": entries[2]}
+    return pgn_dataset_arrays_dict
 
 
 def get_x_y_and_indices(dataset):
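
Since `get_numpy_arrays()` now returns a dict, callers access entries by key, and the three optional entries are `None` when the dataset file does not contain them. The sketch below uses a toy dict with the same keys instead of a real zarr dataset; the helper `missing_entries` is hypothetical.

```python
import numpy as np


def missing_entries(arrays_dict):
    """Return the keys whose value is None, i.e. entries absent from the dataset file."""
    return [key for key, value in arrays_dict.items() if value is None]


# Toy stand-in for the dict returned by get_numpy_arrays() (shapes are illustrative)
toy = {
    "start_indices": np.array([0, 3]),
    "x": np.zeros((5, 1, 8, 8), dtype=np.float32),
    "y_value": np.array([1, 1, 1, -1, -1]),
    "y_policy": np.array([4, 8, 15, 16, 23]),
    "plys_to_end": None,
    "y_best_move_q": None,
    "phase_vector": np.array([0, 0, 1, 1, 2]),
}

absent = missing_entries(toy)
# Guard optional entries before use, e.g. count positions per phase
if toy["phase_vector"] is not None:
    positions_per_phase = np.bincount(toy["phase_vector"])
print(absent, positions_per_phase)
```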

diff --git a/DeepCrazyhouse/src/preprocessing/dataset_loader.py b/DeepCrazyhouse/src/preprocessing/dataset_loader.py
@@ -19,19 +19,23 @@ def _load_dataset_file(dataset_filepath):
     """
     Loads a single dataset file given by its path
     :param dataset_filepath: path where the file is located
-    :return:starting_idx: [int] - List of indices where ech game starts
+    :return: pgn_dataset_arrays_dict: dict of {specific dataset part: numpy-array} with the following keys
+            start_indices: [int] - List of indices where each game starts
             x: nd.array - Numpy array which contains the game positions
             y_value: nd.array - Numpy array which describes the winner for each board position
             y_policy: nd.array - Numpy array which describes the policy distribution for each board state
                                  (in case of a pgn dataset the move is one hot encoded)
             plys_to_end - array of how many plys to the end of the game for each position.
              This can be used to apply discounting
+            y_best_move_q - Q-value for the position of the selected move
+             (this information is only available for generated data during selfplay)
+            phase_vector - array of the game phase of each position
     """
     return get_numpy_arrays(zarr.group(store=zarr.ZipStore(dataset_filepath, mode="r")))
 
 
 def load_pgn_dataset(
-    dataset_type="train", part_id=0, verbose=True, normalize=False, q_value_ratio=0,
+    dataset_type="train", part_id=0, verbose=True, normalize=False, q_value_ratio=0, phase=None
 ):
     """
     Loads one part of the pgn dataset in form of planes / multidimensional numpy array.
@@ -43,25 +47,26 @@ def load_pgn_dataset(
     :param normalize: True if the inputs shall be normalized to 0-1
     ! Note this is only supported for hist-length=1 at the moment
     :param q_value_ratio: Ratio for mixing the value return with the corresponding q-value
     For a ratio of 0 no q-value information will be used. Value must be in [0, 1]
+    :param phase: if specified use planes dataset of this phase. If None, the phase specified in main_config is used
-    :return: numpy-arrays:
+    :return: pgn_dataset_arrays_dict: dict of {specific dataset part: numpy-array} with the following keys
             start_indices - defines the index where each game starts
             x - the board representation for all games
             y_value - the game outcome (-1,0,1) for each board position
             y_policy - the movement policy for the next_move played
             plys_to_end - array of how many plys to the end of the game for each position.
              This can be used to apply discounting
             pgn_dataset - the dataset file handle (you can use .tree() to show the file structure)
+            phase_vector - array of the game phase of each position
     """
 
-    if dataset_type == "train":
-        zarr_filepaths = glob.glob(main_config["planes_train_dir"] + "**/*.zip")
-    elif dataset_type == "val":
-        zarr_filepaths = glob.glob(main_config["planes_val_dir"] + "**/*.zip")
-    elif dataset_type == "test":
-        zarr_filepaths = glob.glob(main_config["planes_test_dir"] + "**/*.zip")
-    elif dataset_type == "mate_in_one":
-        zarr_filepaths = glob.glob(main_config["planes_mate_in_one_dir"] + "**/*.zip")
+    if dataset_type in ["train", "val", "test", "mate_in_one"]:
+        if phase is None:
+            zarr_filepaths = glob.glob(main_config[f"planes_{dataset_type}_dir"] + "**/*.zip")
+        else:
+            zarr_filepaths = glob.glob(main_config["default_dir"] +
+                                       f"planes/{main_config['phase_definition']}/phase{phase}/{dataset_type}/" +
+                                       "**/*.zip")
     else:
         raise Exception(
             'Invalid dataset type "%s" given. It must be either "train", "val", "test" or "mate_in_one"' % dataset_type
@@ -78,7 +83,15 @@ def load_pgn_dataset(
         logging.debug("")
 
     pgn_dataset = zarr.group(store=zarr.ZipStore(pgn_datasets[part_id], mode="r"))
-    start_indices, x, y_value, y_policy, plys_to_end, y_best_move_q = get_numpy_arrays(pgn_dataset)  # Get the data
+    # Get the data
+    pgn_dataset_arrays_dict = get_numpy_arrays(pgn_dataset)
+    start_indices = pgn_dataset_arrays_dict["start_indices"]
+    x = pgn_dataset_arrays_dict["x"]
+    y_value = pgn_dataset_arrays_dict["y_value"]
+    y_policy = pgn_dataset_arrays_dict["y_policy"]
+    plys_to_end = pgn_dataset_arrays_dict["plys_to_end"]
+    y_best_move_q = pgn_dataset_arrays_dict["y_best_move_q"]
+    phase_vector = pgn_dataset_arrays_dict["phase_vector"]
 
     if verbose:
         logging.info("STATISTICS:")
@@ -87,7 +100,7 @@ def load_pgn_dataset(
                 print(member, list(pgn_dataset["statistics"][member]))
         except KeyError:
             logging.warning("no statistics found")
-
+        print("total_positions", f"[{len(y_value)}]")
         logging.info("PARAMETERS:")
         try:
             for member in pgn_dataset["parameters"]:
@@ -105,7 +118,15 @@ def load_pgn_dataset(
         y_policy = y_policy.astype(np.float32)
         # apply rescaling using a predefined scaling constant (this makes use of vectorized operations)
         x *= MATRIX_NORMALIZER
-    return start_indices, x, y_value, y_policy, plys_to_end, pgn_dataset
+
+    pgn_dataset_arrays_dict = {"start_indices": start_indices,
+                               "x": x,
+                               "y_value": y_value,
+                               "y_policy": y_policy,
+                               "plys_to_end": plys_to_end,
+                               "pgn_dataset": pgn_dataset,
+                               "phase_vector": phase_vector}
+    return pgn_dataset_arrays_dict
 
 
 def load_xiangqi_dataset(dataset_type="train", part_id=0, verbose=True, normalize=False):

diff --git a/DeepCrazyhouse/src/preprocessing/game_phase_detector.py b/DeepCrazyhouse/src/preprocessing/game_phase_detector.py
@@ -0,0 +1,160 @@
+"""
+@file: game_phase_detector.py
+Created on 08.06.2023
+@project: CrazyAra
+@author: HelpstoneX
+
+Analyses a given board state defined by a python-chess object and outputs the game phase according to a given definition
+"""
+
+
+import chess
+import chess.pgn
+import numpy as np
+import matplotlib.pyplot as plt
+import io
+from DeepCrazyhouse.configs.main_config import main_config
+import os
+import re
+
+
+def get_majors_and_minors_count(board):
+    """
+    Returns the number of major and minor pieces (not including king) currently present on the board (either color)
+
+    :param board:  python-chess board object
+    :return: pieces_left - integer representing how many pieces are left
+    """
+    pieces_left = bin(board.queens | board.rooks | board.knights | board.bishops).count("1")
+    return pieces_left
+
+
+def is_backrank_sparse(board, max_pieces_allowed=3):
+    """
+    Determines whether the backrank of either player is sparse
+    where sparseness is defined by the amount of pieces on the first (for white) or last (for black) rank
+
+    :param board:  python-chess board object
+    :param max_pieces_allowed: integer representing the maximum pieces (including the king) allowed on the backrank
+                               for it to be considered sparse
+    :return: backrank_sparseness - boolean representing whether either backrank is currently sparse
+    """
+    white_backrank_sparse = bin(board.occupied_co[chess.WHITE] & chess.BB_RANK_1).count("1") <= max_pieces_allowed
+    black_backrank_sparse = bin(board.occupied_co[chess.BLACK] & chess.BB_RANK_8).count("1") <= max_pieces_allowed
+    return white_backrank_sparse or black_backrank_sparse
+
+
+def score(num_white_pieces_in_region, num_black_pieces_in_region, rank):
+    """
+    Calculates the mixedness contribution of a particular 2x2 square/region
+
+    :param num_white_pieces_in_region: integer representing the amount of white pieces in the current 2x2 region
+    :param num_black_pieces_in_region: integer representing the amount of black pieces in the current 2x2 region
+    :param rank: rank of the current 2x2 region
+    :return: mixedness_score - integer representing the mixedness score of the current 2x2 square
+    """
+    score_map = {
+        (0, 0): 0,
+        (1, 0): 1 + (8 - rank),
+        (2, 0): 2 + (rank - 2) if rank > 2 else 0,
+        (3, 0): 3 + (rank - 1) if rank > 1 else 0,
+        (4, 0): 3 + (rank - 1) if rank > 1 else 0,
+        (0, 1): 1 + rank,
+        (1, 1): 5 + abs(3 - rank),
+        (2, 1): 4 + rank,
+        (3, 1): 5 + rank,
+        (0, 2): 2 + (6 - rank) if rank < 6 else 0,
+        (1, 2): 4 + (6 - rank),
+        (2, 2): 7,
+        (0, 3): 3 + (7 - rank) if rank < 7 else 0,
+        (1, 3): 5 + (6 - rank),
+        (0, 4): 3 + (7 - rank) if rank < 7 else 0
+    }
+    return score_map.get((num_white_pieces_in_region, num_black_pieces_in_region), 0)
+
+
+def get_mixedness(board):
+    """
+    Calculates the mixedness of a position based on the lichess definition of mixedness,
+    which is roughly speaking the amount of intertwining of black and white pieces in all 2x2 squares of the board
+    more info: https://github.com/lichess-org/scalachess/blob/master/src/main/scala/Divider.scala
+
+    :param board: python-chess board object
+    :return: mixedness_score - integer representing the current mixedness score of the position
+                               (according to the lichess definition)
+    """
+    mix = 0
+
+    for rank_idx in range(7):  # use ranks 1 to 7 (indices 0 to 6)
+        for file_idx in range(7):  # use files A to G (indices 0 to 6)
+            num_white_pieces_in_region = 0
+            num_black_pieces_in_region = 0
+            for dx in [0, 1]:
+                for dy in [0, 1]:
+                    square = chess.square(file_idx+dx, rank_idx+dy)
+                    if board.piece_at(square):
+                        if board.piece_at(square).color == chess.WHITE:
+                            num_white_pieces_in_region += 1
+                        else:
+                            num_black_pieces_in_region += 1
+            mix += score(num_white_pieces_in_region, num_black_pieces_in_region, rank_idx + 1)
+
+    return mix
+
+
+def get_game_phase(board, phase_definition="lichess", average_movecount_per_game=42.85):
+    """
+    Determines the game phase based on the current board state and the given phase definition type
+
+    :param board: python-chess board object
+    :param phase_definition: determines, which phase definition type to use,
+                             either "lichess"
+                             or "movecountX" where X describes the amount of phases
+                             (separated by equidistant move count buckets)
+    :param average_movecount_per_game: specifies the average movecount per game
+                                       (used to determine phase borders when using phases by movecount)
+    :return: str - str representation of the phase (for lichess definition) or empty str
+             num_majors_and_minors - the amount of major and minor pieces left (for lichess phase EDA purposes)
+             backrank_sparse - whether the backrank of either player is sparse (for lichess phase EDA purposes)
+             mixedness_score - current mixedness score of the position (for lichess phase EDA purposes)
+             phase - integer from 0 to num_phases-1 representing the phase the current position belongs to
+    """
+
+    if phase_definition == "lichess":
+        # returns the game phase based on the lichess definition implemented in:
+        # https://github.com/lichess-org/scalachess/blob/master/src/main/scala/Divider.scala
+
+        num_majors_and_minors = get_majors_and_minors_count(board)
+        backrank_sparse = is_backrank_sparse(board)
+        mixedness_score = get_mixedness(board)
+
+        if num_majors_and_minors <= 6:
+            return "endgame", num_majors_and_minors, backrank_sparse, mixedness_score, 2
+        elif num_majors_and_minors <= 10 or backrank_sparse or (mixedness_score > 150):
+            return "midgame", num_majors_and_minors, backrank_sparse, mixedness_score, 1
+        else:
+            return "opening", num_majors_and_minors, backrank_sparse, mixedness_score, 0
+
+    # matches "movecount" directly followed by a number
+    pattern_match_result = re.match(r"\bmovecount(\d+)", phase_definition)
+
+    if pattern_match_result:  # if it is a valid match
+        # use number at the end of the string to determine the number of phases to be used
+        num_phases = int(pattern_match_result.group(1))
+        phase_length = round(average_movecount_per_game/num_phases)
+
+        # board.fullmove_number describes the move number of the next move that happens in the game,
+        # e.g., after 8 half moves board.fullmove_number is 5
+        # so we use board.fullmove_number -1 to get the current full moves played
+        moves_completed = board.fullmove_number - 1
+        phase = int(moves_completed/phase_length)  # determine phase by rounding down to the next integer
+        phase = min(phase, num_phases-1)  # ensure that all higher results are attributed to the last phase
+        return "", 0, 0, 0, phase
+
+    else:
+        raise ValueError("Phase definition not supported or wrongly formatted. Should be 'movecountX' or 'lichess'")
+
+
+if __name__ == "__main__":
+    print(get_game_phase(chess.Board("q6k/P1P5/3p2Q1/5p1p/3N4/3b3P/5KP1/R3R3 w - - 1 36"), "movecount4"))
+    print("done")
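
The movecount branch of `get_game_phase()` can be traced with plain arithmetic. The sketch below reimplements only that branch without python-chess, taking the full-move number directly; the function name `movecount_phase` is hypothetical, and unlike the original it raises on an unsupported definition.

```python
import re


def movecount_phase(fullmove_number, phase_definition, average_movecount_per_game=42.85):
    """Bucket a position into a phase by the number of completed full moves."""
    match = re.match(r"\bmovecount(\d+)", phase_definition)
    if match is None:
        raise ValueError("expected a definition of the form 'movecountX'")
    num_phases = int(match.group(1))
    phase_length = round(average_movecount_per_game / num_phases)
    moves_completed = fullmove_number - 1
    # integer division rounds down; later moves are clamped to the last phase
    return min(moves_completed // phase_length, num_phases - 1)


# FEN from the __main__ block has full-move number 36:
# phase_length = round(42.85 / 4) = 11, moves_completed = 35, 35 // 11 = 3
print(movecount_phase(36, "movecount4"))
# With "movecount3": phase_length = round(42.85 / 3) = 14
print([movecount_phase(n, "movecount3") for n in (1, 14, 15, 29, 100)])
```

For the example position in the `__main__` block this gives phase 3, the last of four phases.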