Releases: QueensGambit/CrazyAra
CrazyAra, ClassicAra, MultiAra 0.9.5.post0
This release provides binaries for CrazyAra, ClassicAra and MultiAra with the new OpenVino CPU backend.
The OpenVino backend features a new UCI option `Threads_NN_Inference`
which defines how many threads to use for neural network inference. This removes the need to set the environment variable `OMP_NUM_THREADS`
(#35).
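For example, the number of inference threads can now be set over UCI before the first `isready` (the value 4 here is only illustrative):

```
setoption name Threads_NN_Inference value 4
isready
readyok
```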
Current limitations of the OpenVino backend
- No Int8 inference support enabled yet.
- Only tested on regular CPU and not with Intel GPUs or the Intel Neural Compute Stick.
Installation instructions
The binary packages include the required inference libraries for each platform.
The latest ClassicAra model is included within each release package.
However, the models for CrazyAra and MultiAra should be downloaded separately and unzipped (see release 0.9.5):
- CrazyAra-rl-model-os-96.zip
- MultiAra-rl-models.zip (improved MultiAra models using reinforcement learning (RL))
- MultiAra-sl-models.zip (initial MultiAra models using supervised learning)
Next, move the model files into the `model/<engine-name>/<variant>` folder.
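Assuming the default directory layout, the unzipped model files end up in folders like the following (illustrative; the exact variant folders depend on which zips you downloaded):

```
model/
├── ClassicAra/chess/        <- bundled with the release
├── CrazyAra/crazyhouse/     <- from CrazyAra-rl-model-os-96.zip
└── MultiAra/atomic/         <- one folder per variant from the MultiAra zips
```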
Regression test
ClassicAra
The new OpenVino backend is about 100 - 150 nps faster on CPU and much easier to install than the MXNetMKL backend.
[TimeControl "7+0.1"]
Score of ClassicAra - 0.9.5.post0 OpenVino vs ClassicAra 0.9.5 - MXNetMKL: 82 - 17 - 55 [0.711]
Elo difference: 156.4 +/- 45.9, LOS: 100.0 %, DrawRatio: 35.7 %
154 of 1000 games finished.
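The reported score and Elo difference follow from the game counts via the usual logistic Elo model; a small sketch reproducing the numbers above:

```python
import math

def score_and_elo(wins, losses, draws):
    """Score from the first engine's perspective and the implied Elo difference."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Logistic Elo model: score = 1 / (1 + 10^(-elo/400))
    elo = -400.0 * math.log10(1.0 / score - 1.0)
    return score, elo

score, elo = score_and_elo(82, 17, 55)
print(round(score, 3), round(elo, 1))  # → 0.711 156.4
```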
Inference libraries
The following inference libraries are used in each package:
CrazyAra_ClassicAra_MultiAra_0.9.5.post0_Linux_OpenVino.zip
- OpenVino 2021.4 LTS
CrazyAra_ClassicAra_MultiAra_0.9.5.post0_Mac_OpenVino.zip
- OpenVino 2021.4 LTS
CrazyAra_ClassicAra_MultiAra_0.9.5.post0_Win_OpenVino.zip
- OpenVino 2021.4 LTS
CrazyAra, ClassicAra, MultiAra 0.9.5
MultiAra
This is the first release which features MultiAra, a version of Ara which supports all chess variants available on lichess.org.
- antichess
- atomic
- chess960
- crazyhouse
- king-of-the-hill
- horde
- racing kings
- three-check
The neural network models have been initialised with lichess.org game data from August 2013 until July 2020.
All variants (except chess960) have been improved using reinforcement learning.
Details can be found in the master's thesis by Maximilian Alexander Gehrke which, among other things, covers an ablation study between training from zero knowledge and supervised initialisation, as well as a preliminary strength comparison against Fairy-Stockfish:
- Assessing Popular Chess Variants Using Deep Reinforcement Learning, pdf
ClassicAra
The model for ClassicAra uses a new input representation (#134) and a new WDLP value output head (#123).
The old model and input representation for chess is still supported, as the input representation version is now inferred from the model name.
Installation instructions
The binary packages include the required inference libraries for each platform.
However, the models should be downloaded separately and unzipped.
- CrazyAra-rl-model-os-96.zip
- ClassicAra-sl-model-wdlp-rise3.3-input3.0.zip
- MultiAra-rl-models.zip (improved MultiAra models using reinforcement learning (RL))
- MultiAra-sl-models.zip (initial MultiAra models using supervised learning)
Next, move the model files into the `model/<engine-name>/<variant>` folder.
For the TensorRT releases, it is recommended to generate the trt-files for your GPU once from the command-line:
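A minimal command-line session (engine name and prompt are illustrative) looks like this; the first `isready` triggers the trt-file generation and may take several minutes:

```
$ ./ClassicAra
uci
...
uciok
isready
readyok
```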
Bug Fixes
In 0.9.5, a critical time-management bug has been fixed (#136) which resulted in occasional time losses in time controls with increments.
Previously it was assumed that the increment was applied to the time buffer before and not after the move.
Great thanks to AdminX and Werner Schüle for reporting it.
Summary
- FEN, PGN and arena adjustments for Chess960 (#129)
- RL Logging: Found tablebases and Configs (#130)
- Revert "RL Logging: Found tablebases and Configs" (#131)
- RL Logging: Found tablebases and Configs 2.0 (#132)
- Speed Optimization: get_state_planes() (#133)
- Chess input representation v3.0 (#134)
- Time management refactoring & bugfix (#136)
- Mirror policy (#137)
- RL: Save intermediate model files and delete them after training (#138)
- GitHub Actions / CI (#139)
- Fix channel ordering for flip_board() == False (#140)
- Version Backward Compability (#141)
- 3rdParty Blaze Dependency (#142)
- CI for Variants (#143)
- Variant input changes (#145)
- Add a Open_Spiel Stratego implementation as well adapting MCTSAgent by defining different types of it (#146)
- TensorRT 8.0 Compability (#147)
- Input Representation Version Parsing (#148)
- Single Move Evaluation (#149)
- Include binary name in Model_Directory and miscellaneous (#150)
Inference libraries
The following inference libraries are used in each package:
CrazyAra_ClassicAra_MultiAra_0.9.5_Linux_TensorRT.zip
- CUDA 11.3
- cuDNN 8.2.1
- TensorRT-8.0.1
CrazyAra_ClassicAra_MultiAra_0.9.5_Linux_MKL.zip
- MXNet 1.8.0
- Intel oneAPI MKL 2021.2.0
CrazyAra_ClassicAra_MultiAra_0.9.5_Win_TensorRT.zip
- CUDA 11.3
- cuDNN 8.2.1
- TensorRT-8.0.1
CrazyAra_ClassicAra_MultiAra_0.9.5_Win_MKL.zip
- MXNet 1.8.0
- Intel oneAPI MKL 2021.2.0
Known issues ⚠️
- The first start-up of the engine may take 1-15 minutes to generate the trt-engine files.
- It is recommended to first start the engine from the command line and issue the `isready` command. More information can be found here. All later loading times should be < 10s.
- The process of generating trt-files is done twice, i.e. for batch-size 1 and batch-size 16.
- The available batch sizes are limited to the provided onnx models. Currently available batch sizes are: 1, 8, 16, 64.
- Using INT8-based inference is still not recommended for either GPU or CPU for chess. On CEGT it was reported that ClassicAra performed worse on Pascal Nvidia GPUs despite higher nodes per second.
Regression test
ClassicAra
tc=0/1:0+0.25
-openings file="UHO_V3_6mvs_+090_+099.pgn"
ClassicAra-0.9.5-rise_3.3-input_3.0-wdlp vs ClassicAra-0.9.4-rise_3.3-input_1.0:
202 - 125 - 273 (WLD)
Elo difference: 44.83 +/- 20.54
CrazyAra
tc=0/1:0+0.25
-openings file="crazyhouse_mix_cp_130.epd"
CrazyAra-0.9.5 vs CrazyAra-0.9.4
119 - 112 - 369 (WLD)
Elo difference: 4.05 +/- 17.24
Updates
- fixed problem when loading new ClassicAra model with MXNet API (0f3d60f)
- CrazyAra_ClassicAra_MultiAra_0.9.5.post0_Mac_MKL.zip
- CrazyAra_ClassicAra_MultiAra_0.9.5.post0_Linux_MKL.zip
- 2021-11-01: Added bsizes 32, 64, 128 to ClassicAra-sl-model-wdlp-rise3.3-input3.0.zip
ClassicAra 0.9.4
This version was submitted to TCEC S21, League 2.
- Shared pointer atomic operation (#124)
- Should fix crash as seen in TCEC S21, L3, game 37.
- Updated Timemanagement Constants (#126)
- Retrieve neural network indices in TensorrtAPI (#127)
- WDLP value head (#123)
- Chess input representation v2.8 (#125)
- However, ClassicAra will still use the old network Rise 3.3 in TCEC S21, League 2.
CrazyAra & ClassicAra 0.9.3
Installation instructions
The binaries use the same libraries as in release 0.9.0.
Download the corresponding release from 0.9.0 and replace the binaries `CrazyAra` and `ClassicAra`.
For chess, it is recommended to delete all files in `model/chess` and to download the newer Rise 3.3 model.
Then move all contents from Rise 3.3 into `model/chess` and, in case of the TensorRT back-end, first generate the trt-files.
This can be done from the command line by issuing `isready`.
TC 3s + 0.1s
Score of ClassicAra 0.9.1 - Risev3.3 vs ClassicAra 0.9.1 - Risev2: 81 - 15 - 64 [0.706]
Elo difference: 152.4 +/- 42.8, LOS: 100.0 %, DrawRatio: 40.0 %
160 of 1000 games finished.
Known limitations
- The int8 weights for CPU are still only supported on Linux
Changelog compared to 0.9.2.post1
- tablebase hits are now treated as terminal nodes until a tablebase position has been reached (#108, #116)
- refactored the shared pointer graph (#114)
- reduced the number of terminal simulations during one mini-batch (#117)
Regression test
TC 7s + 0.1s
Score of CrazyAra 0.9.3 - Release vs CrazyAra 0.9.0 - Release: 99 - 97 - 28 [0.504]
Elo difference: 3.1 +/- 42.7, LOS: 55.7 %, DrawRatio: 12.5 %
224 of 1000 games finished.
ClassicAra 0.9.2.post1
ClassicAra 0.9.2.post0
Notes
This version was submitted to TCEC S21, Qualification League testing.
UCI specific
- Replaces the fixed timeout constant `TIME_OUT_IS_READY_MS` by the UCI option `Timeout_MS` (default: 13000, value for TCEC: 60000).
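For example, the TCEC setting from above would be applied over UCI as:

```
setoption name Timeout_MS value 60000
```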
ClassicAra 0.9.2
This version has been submitted to the Top Chess Engine Championship (TCEC) Season 21, where ClassicAra 0.9.2 will start in the Qualification League.
ClassicAra 0.9.2 uses a RISEv3.3 model which was trained on the Kingbase2019lite dataset, as in release 0.9.0.
RISEv3.3 is an improvement of our proposed RISEv2 architecture.
The new ClassicAra logo can be found here.
The engine.json configuration file and update.sh shell script can be used to replicate the testing environment on a multi-GPU Linux operating system.
Binaries may be added later.
Notes
Bugfixes
- fixed problem when reaching more than 16 million nodes. Thanks to @catask for reporting it (#39).
- fixed problem where engine might lose on time. Thanks to @brianprichardson for reporting it (#74).
- fixed problem that syzygy tablebases were not loaded. Thanks to Torsten for reporting it (#79).
- the neural network can now be reloaded dynamically if the value for `Model_Directory`, `Threads` or `UCI_Variant` was changed after the first `isready` command (#95).
UCI specific
- added new UCI option `Centi_Q_Veto_Delta` which allows choosing a low-visit move if its Q-value exceeds a given threshold (#86).
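As a rough illustration (not CrazyAra's actual implementation; the tuple layout and the default of 40 centi-Q are assumptions), the veto could look like:

```python
def select_move(children, centi_q_veto_delta=40):
    """children: list of (move, visits, q_value) tuples with q_value in [-1, 1].

    Normally the most-visited child is played, but a low-visit move may be
    chosen instead if its Q-value exceeds the best move's Q by the delta.
    """
    delta = centi_q_veto_delta / 100.0
    move, visits, q = max(children, key=lambda c: c[1])   # most-visited child
    vetoes = [c for c in children if c[2] >= q + delta]   # clearly better Q
    if vetoes:
        return max(vetoes, key=lambda c: c[2])[0]
    return move

print(select_move([("e2e4", 900, 0.10), ("d2d4", 80, 0.55)]))  # → d2d4
```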
Memory management
- the memory management is now handled using shared pointers and is done fully in parallel to the search (#97).
Back-end
- a Fairy-Stockfish back-end and supervised training for Xiangqi is now supported. Thanks to @xsr7qsr for the contribution (#93).
Refactoring & documentation
- improved and refactored supervised learning setup. Thanks to @maxalexger for the contribution (#92, #63, #64).
- refactored reinforcement learning loop and improved documentation. Thanks to @maxalexger for the contribution (#77, #91).
Miscellaneous
- the log file is now working in append mode and also logs stderr information. Thanks to @brianprichardson for the suggestion (#78).
- deactivate virtual loss when showing evaluation (#82).
- downscaled the value to centi-pawn conversion for classical chess (b3da4a9).
- added timer which replies `readyok` after 13s to avoid running into a time-out (#98).
CrazyAra & ClassicAra 0.9.0
Notes
This release replaces MCTS with MCGS as the new default and is the first release with standard chess support.
The provided model for chess has been trained supervised on the free KingBase Lite 2019 data set without reinforcement learning so far.
The displayed nps is now considerably lower, because now only the actual nodes in the tree are counted.
Terminal visits and transposition visits do not count as nodes anymore.
By using MCGS, the required memory allocation has been reduced for a given movetime.
New Variant
- Standard chess: A neural network and binary is provided for standard chess.
- The executable is called ClassicAra.
Environment Backend
The CrazyAra project now supports a general back-end which allows a better integration to new environments.
An environment is used for move-generation and terminal condition checks.
The following environments have been integrated so far:
Bugfixes
- fixed crash in release 0.8.4 when too many transpositions occurred.
- fixed bug that single moves were not done instantly if no tree was reused (#56).
- fixed bug decreasing nodes after 16 million nodes (#39).
- workaround for bug in XBoard / Winboard. UCI-Variant is now repeated if there is only a single variant available (#23).
UCI specific
- MultiPV is now supported and can be used for analysis e.g. using LiGround.
- UCI options can now be changed, also after the isready command.
- Current limitation: Changing the loaded neural network dynamically after a neural network has already been loaded is not (yet) possible.
- UCI Options are now ordered in alphabetical order.
- Updated, removed, or renamed certain UCI options (#71).
root command (debug)
- For the Root Command: Moves are now displayed in SAN notation instead of UCI notation for better readability.
- UCI moves are now ordered by the current best move and not using the fixed policy ordering.
Miscellaneous
- The loaded neural network input and output shapes are now always displayed when loading.
- An input shape and output shape check is made for the loaded neural network architecture.
INT8 CPU Back-end
- INT8 weights are now available for the MXNet MKL CPU back-end, resulting in a 1.5 - 3x speed-up.
⚠️ The INT8 weights give strange results for the Mac binaries. The Mac binaries use float32 weights by default.
⚠️ Unfortunately the MXNet CPP-Package is broken for Windows at the moment (mxnet#20131). Therefore, loading the int8 weights will result in a crash and float32 weights are used by default.
Known issues ⚠️
- The first start-up of the engine may take 1-15 minutes to generate the trt-engine files.
- It is recommended to first start the engine from the command line and issue the `isready` command. More information can be found here. All later loading times should be < 10s.
- The process of generating trt-files is done twice, i.e. for batch-size 1 and batch-size 16.
- The available batch sizes are limited to the provided onnx models. Currently available batch sizes are: 1, 8, 16, 64.
Regression Test (Crazyhouse)
OS: Ubuntu 18.04
GPU: RTX 2070 OC
Model: Model-OS-96
Opening suite: crazyhouse_mix_cp_130.epd
- TC: 10s + 0.1s
Score of CrazyAra 0.9.0 - Release vs CrazyAra 0.8.0 - Release: 208 - 159 - 58 [0.558]
Elo difference: 40.2 +/- 30.9, LOS: 99.5 %, DrawRatio: 13.6 %
425 of 1000 games finished.
- TC: 3min + 2s
Score of CrazyAra 0.9.0 - Release vs CrazyAra 0.8.0 - Release: 29 - 19 - 14 [0.581]
Elo difference: 56.5 +/- 78.1, LOS: 92.6 %, DrawRatio: 22.6 %
62 of 1000 games finished.
Inference libraries
The following inference libraries are used in each package:
CrazyAra_ClassicAra_0.9.0_Linux_TensorRT.zip
- CUDA 11.2
- cuDNN 8.1.1.33
- TensorRT-7.2.3.4
CrazyAra_ClassicAra_0.9.0_Linux_MKL.zip
- MXNet 1.8.0
- Intel oneAPI MKL 2021.2.0
CrazyAra_ClassicAra_0.9.0_Win_TensorRT.zip
- CUDA 11.2
- cuDNN 8.1.1.33
- TensorRT-7.2.3.4
CrazyAra_ClassicAra_0.9.0_Win_MKL.zip
- MXNet-20190919
- Intel MKL 20190502
CrazyAra_ClassicAra_0.9.0_Mac_MKL_post1.zip
- MXNet 1.8.0
- Intel oneAPI MKL 2021.2.0
The release files include the required dll / so-files for convenience. If you already have them installed on your system, you can delete them.
Information on how to use multiple GPUs can be found in #33 and #76.
Updates
- 2021-04-15: CrazyAra_ClassicAra_0.9.0_Mac_MKL_post1.zip
  - added libc++abi.1.dylib
  - added @executable_path/. to libmxnet.so to avoid post-installation actions
- 2021-05-03: added risev2_kingbaselite2019.zip, which is the model trained on the kingbase2019lite data set, as a separate download
- 2021-05-03: added preact_resnet_se_kingbaselite2019.zip, which is a model trained on the kingbase2019lite data set, as a separate download
- 2021-05-06: added risev3.3_kingbaselite2019.zip, which is a model trained on the kingbase2019lite data set, as a separate download
CrazyAra & ClassicAra 0.8.4
Notes
This release is provided to allow convenient reproducibility of the evaluations presented in our paper Improving AlphaZero Using Monte-Carlo Graph Search, preprint.
It is built using commit CrazyAra#500da21e0bd9152657adbbc6118f3ebbc660e449.
More details are described here:
The binaries can use the same model and TensorRT library dependencies (Linux & Windows) as in release 0.9.0.
The ClassicAra and CrazyAra 0.8.4 binaries expect the `Model_Directory` to be `model/`, while release 0.9.0 expects it to be `model/chess` and `model/crazyhouse` by default instead (#75).
There are three ways to fix this problem.
- You manually set the `Model_Directory` before calling `isready`: `setoption name Model_Directory value model/chess`
- You move the files from `model/chess` into `model/`.
- If you have already generated the trt-files, you should be able to edit the UCI option `Model_Directory` via the GUI directly.
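Moving the files from `model/chess` into `model/` can also be scripted; a minimal sketch (the file name is a stand-in, and the demo creates the folders itself so it is runnable anywhere):

```python
import shutil
from pathlib import Path

src, dst = Path("model/chess"), Path("model")
src.mkdir(parents=True, exist_ok=True)    # demo setup; normally already exists
(src / "model.onnx").touch()              # stand-in for the real model files
for f in list(src.iterdir()):             # move everything one level up
    shutil.move(str(f), str(dst / f.name))
print((dst / "model.onnx").exists())      # → True
```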
Updates
2021-04-10: Added CrazyAra_pre_pull_47, ClassicAra_pre_pull_47, CrazyAra_pre_pull_47.exe, ClassicAra_pre_pull47.exe using commit CrazyAra#82e821e8721aa635e1415e718027b2cbe19356a0. This commit is a state before integrating MCGS and uses transposition tables to copy neural network evaluations from pre-existing nodes.
CrazyAra 0.8.0
Notes
This release provides new features and addresses the main issues from release 0.7.0.
It is expected to be the last release which only supports crazyhouse.
Model-OS-96, the final model after ~ 2.37 million self-play games from release 0.7.0, is included in all release packages.
TensorRT backend
CrazyAra's default GPU backend is now a native TensorRT backend without MXNet library dependency.
The networks are loaded in the ONNX format.
Multi-GPU support
CrazyAra now supports multiple GPUs for the same MCTS search.
The first GPU is defined by the GPU index `First_Device_ID` and the last GPU index by `Last_Device_ID`.
The `Threads` parameter defines the number of threads to use for each GPU.
Low precision inference
CrazyAra now allows float16 and int8 inference besides float32. The active precision can be set by the UCI option `Precision`. Float16 is the new default precision and greatly accelerates search for NVIDIA RTX cards (~2.2x speed increase).
Int8 precision is still experimental and increases speed for older GPUs by about 2.5x at the cost of losing precision. Before creating the TensorRT engine files, the network will be automatically calibrated.
Higher nps is primarily beneficial for low time control, e.g. TC: 30s + 0.1s:
Score of CrazyAra-0.8.0-INT8-GTX1080ti vs CrazyAra-0.8.0-FP16-GTX1080ti: 72 - 17 - 11 [0.775]
Elo difference: 214.8 +/- 77.1, LOS: 100.0 %, DrawRatio: 11.0 %
100 of 100 games finished.
Start-up time
The first start-up time of CrazyAra can take several minutes to generate the trt hardware specific engine files.
All later start-ups should be fast, e.g. < 3s. The TensorRT files are stored in the `Model_Directory`.
Memory consumption
Reduced memory consumption of the search tree by 90%.
Example: Generation of 1.5 million nodes:
CrazyAra 0.7.0 memory consumption:
- 12.8 GiB total memory:
- 1.5 GiB constant memory (MXNet GPU backend)
- 11.3 GiB tree memory
CrazyAra 0.8.0 memory consumption:
- 2.7 GiB total memory:
- 1.1 GiB constant memory (TensorRT backend)
- 1.6 GiB tree memory
Time manager
The time management was improved to stop searches early if large parts of the tree were reused and is extended for positions with a falling evaluation.
MCTS solver
CrazyAra can now calculate forced mates and tries to choose the shortest line to mate or longest line for getting mated.
The algorithm is an extended version of:
- "Exact-Win Strategy for Overcoming AlphaZero" by Chen et al., https://www.researchgate.net/publication/331216459_Exact-Win_Strategy_for_Overcoming_AlphaZero
- "Monte-Carlo Tree Search Solver" by Winands et al., https://www.researchgate.net/publication/220962507_Monte-Carlo_Tree_Search_Solver
Improved UCI interface
- The logging of the evaluation information is now performed asynchronously.
- The commands `go infinite`, `stop` and `quit` are now supported.
- Every second the current evaluation information is displayed, which can be stopped at any point via `stop`.
- The `root` command now displays the eval information in a more structured way.
- `selDepth` information is now displayed to show the maximum reached depth in the current search.
- Terminal node visits are now counted separately and not counted as nodes anymore.
Random root exploration
A new UCI option `Random_Playout` was added which randomly explores direct child nodes of the root for 5% of all visits (default: on). This technique helps to detect lines which may look unattractive at first sight.
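A schematic of the idea (hypothetical sketch, not the engine's code; CrazyAra applies this inside MCTS node selection rather than on plain dictionaries):

```python
import random

def pick_root_child(children, explore_frac=0.05, rng=random):
    """With probability explore_frac pick a random root child, otherwise
    follow the normal selection policy (highest score, as a stand-in)."""
    if rng.random() < explore_frac:
        return rng.choice(children)          # random exploration visit
    return max(children, key=lambda c: c["score"])

random.seed(0)                               # first draw is ~0.844 > 0.05 → normal path
print(pick_root_child([{"score": 1.0}, {"score": 2.0}])["score"])  # → 2.0
```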
Tablebase support
CrazyAra now supports Syzygy tablebases which may later be used for chess, anti-chess and atomic.
Asynchronous freeing of memory
The memory of the previous search tree is now asynchronously deallocated which can save up to 1-5s for long time control settings.
Full OS support
The C++ version of CrazyAra now runs on all major operating systems: Linux (GPU+CPU), Windows (GPU+CPU) and Mac-OS (CPU only).
Known issues
- The first start-up time of the engine may take some minutes to generate the trt-engine files.
- The available batch sizes are limited to the provided onnx models. Current available batch-sizes are:
- 1
- 8
- 16
- 64
Regression test
Conductor: Matuiss2
TC: 3min +2s
OS: Windows
GPU: GTX 1070ti
Model: Model-OS-96
Opening suite: zh-equal-long-81.pgn (with 3 openings removed)
Score of CrazyAra 0.8.0 vs CrazyAra 0.7.0: 72 - 21 - 7 [0.755]
Elo difference: +195.51 +/- 77.45
Inference libraries
The following inference libraries are used in each package:
CrazyAra_0.8.0_Linux_TensorRT.zip
- CUDA 10.2
- cuDNN 7.6.5
- TensorRT-7.0.0.11
CrazyAra_0.8.0_Linux_MKL.zip
- Intel MKL 20190502
CrazyAra_0.8.0_Win_TensorRT.zip
- CUDA 10.2
- cuDNN 7.6.5
- TensorRT-7.0.0.11
CrazyAra_0.8.0_Win_MKL.zip
- Intel MKL 20190502
CrazyAra_0.8.0_Mac_CPU.zip
- Apple Accelerate