Training data generation
Training data generation code is available for two engines:
- Fairy-Stockfish (all variants): https://github.com/ianfab/variant-nnue-tools
- YaneuraOu (for Shogi): https://github.com/ianfab/YaneuraOu/tree/fairy_bin (includes changes to adapt for Fairy-Stockfish training data format)
Except for Shogi, the Fairy-Stockfish-based training data generator should be used.
Training data generator: https://github.com/ianfab/variant-nnue-tools/releases
Enter these commands in training_data_generator, one line at a time:
uci
setoption name Use NNUE value false
setoption name Threads value 8
setoption name Hash value 2048
setoption name UCI_Variant value your_variant
isready
generate_training_data depth 2 count 10000000 random_multi_pv 4 random_multi_pv_diff 100 random_move_count 8 random_move_max_ply 20 write_min_ply 5 eval_limit 10000 set_recommended_uci_options data_format bin output_file_name your_variant.bin
quit
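The command sequence above can also be assembled and sent to the generator from a script. The following is a minimal sketch, not part of the tools themselves: the helper names `build_commands` and `run_generator` are hypothetical, the binary path is whatever you downloaded from the releases page, and it assumes the generator reads its commands from standard input one after another.

```python
import subprocess


def build_commands(variant, threads=8, hash_mb=2048, depth=2, count=10_000_000):
    """Return the command lines from the listing above for one variant."""
    generate = (
        f"generate_training_data depth {depth} count {count} "
        "random_multi_pv 4 random_multi_pv_diff 100 "
        "random_move_count 8 random_move_max_ply 20 write_min_ply 5 "
        "eval_limit 10000 set_recommended_uci_options data_format bin "
        f"output_file_name {variant}.bin"
    )
    return [
        "uci",
        "setoption name Use NNUE value false",
        f"setoption name Threads value {threads}",
        f"setoption name Hash value {hash_mb}",
        f"setoption name UCI_Variant value {variant}",
        "isready",
        generate,
        "quit",
    ]


def run_generator(binary, variant):
    """Feed the commands to the generator on stdin.

    Assumption: `binary` is the path to the downloaded generator executable
    and it processes piped commands sequentially until `quit`.
    """
    script = "\n".join(build_commands(variant)) + "\n"
    return subprocess.run([binary], input=script, text=True, check=True)
```

For example, `run_generator("./training_data_generator", "xiangqi")` would send the same lines as typing them by hand, with the output written to xiangqi.bin.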
Please change your_variant to the variant you want to train, e.g., xiangqi.
If you want to use an existing NNUE network for training data generation, you need to change Use NNUE to pure and set the EvalFile, e.g., something like
setoption name Use NNUE value pure
setoption name EvalFile value somevariant-1234567890ab.nnue
If the list at https://github.com/fairy-stockfish/fairy-stockfish.github.io/blob/main/nnue.markdown#current-best-nnue-networks already contains a current best NNUE network for your variant, it is recommended to download that one and use it for training data generation.
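When scripting the generator, the choice between the classical evaluation and an existing network comes down to which option lines are sent. A small sketch, assuming a hypothetical helper named `nnue_option_lines`; the option names and values are the ones shown above.

```python
def nnue_option_lines(eval_file=None):
    """Return the setoption lines that select the evaluation for generation.

    Without eval_file, NNUE is disabled as in the basic example.
    With eval_file, Use NNUE is set to pure and EvalFile points at the network.
    """
    if eval_file is None:
        return ["setoption name Use NNUE value false"]
    return [
        "setoption name Use NNUE value pure",
        f"setoption name EvalFile value {eval_file}",
    ]
```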
- Since only the bin format is supported, you need to specify data_format bin.
- The count and depth of the training data are the main factors influencing the strength of the resulting NNUE net. Usually at least 100M positions should be used to get decent results. A higher depth should generally be better, but it also takes much longer to generate. Depths 4-5 usually already give quite good results.
- For variants with a low branching factor like losers/antichess, it is recommended to increase random_multi_pv_diff in order to increase the variety of positions.
- You can lower/increase eval_diff_limit (default: 500) to be more/less restrictive in the definition of quiet positions, since this defines the filter threshold for the (absolute) difference between qsearch and static evaluation.
- You can specify an opening book by adding the book argument, e.g., book startingpositions.epd. This file should contain one FEN/EPD per line, e.g., for Janggi:
rnba1abnr/4k4/1c5c1/p1p1p1p1p/9/9/P1P1P1P1P/1C5C1/4K4/RNBA1ABNR w - - 0 1
rbna1abnr/4k4/1c5c1/p1p1p1p1p/9/9/P1P1P1P1P/1C5C1/4K4/RNBA1ABNR w - - 0 1
...
- If you want to train variants with particularly many pieces on the board (like >50) or in the pockets (>=32), you should compile the training data generator with largeboards=yes. Also see the data format for technical details on the limits.
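The quiet-position filter controlled by eval_diff_limit can be pictured as a simple threshold test. This is only an illustration of the rule as described in the list above, not the generator's actual code. The function name `is_quiet` is hypothetical, and whether the real comparison is inclusive at the limit is an assumption.

```python
def is_quiet(static_eval, qsearch_eval, eval_diff_limit=500):
    """Return True if the position counts as quiet under the described rule.

    A position passes when the absolute difference between the qsearch
    value and the static evaluation does not exceed eval_diff_limit.
    Assumption: the boundary case (difference == limit) is accepted.
    """
    return abs(qsearch_eval - static_eval) <= eval_diff_limit
```

With the default of 500, a position with a static evaluation of 100 and a qsearch value of 700 would be filtered out. Raising the limit to 1000 would keep it, which is what "less restrictive" means here.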
If you want to use an old HalfKP NNUE network to start generating training data, you can use the old generator code at https://github.com/ianfab/variant-nnue. However, since the training data format has changed in the meantime, its output only works with older versions of the trainer. The latest compatible version should be https://github.com/ianfab/variant-nnue-pytorch/tree/91c302941acb131fbabb441dd6ced992ec04dfcb. The syntax of the training data generation command is also slightly different. An example is:
gensfen depth 2 loop 100000000 random_multi_pv 4 random_multi_pv_diff 100 random_move_count 8 random_move_maxply 20 write_minply 5 write_maxply 200 eval_limit 10000 set_recommended_uci_options sfen_format bin output_file_name extinction.bin
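Comparing this example with the generate_training_data example for the current generator suggests that the differences are mostly renamed keywords. The sketch below converts a current-style command into the old style using only the renames visible in those two examples. The helper `to_old_syntax` is hypothetical, the mapping is inferred rather than taken from the old generator's documentation, and it does not add arguments such as write_maxply that appear only in the old example.

```python
# Keyword renames inferred from the two example commands (an assumption,
# not an authoritative list of differences between the generators).
RENAMES = {
    "generate_training_data": "gensfen",
    "count": "loop",
    "random_move_max_ply": "random_move_maxply",
    "write_min_ply": "write_minply",
    "data_format": "sfen_format",
}


def to_old_syntax(command):
    """Rename known keywords token by token and leave everything else as is.

    Caveat: a value that happens to equal a keyword (for example an output
    file literally named "count") would be renamed as well.
    """
    return " ".join(RENAMES.get(token, token) for token in command.split())
```

Check the converted command against the old generator before starting a long run.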
To generate Shogi training data compatible with this trainer, you need to use the customized YaneuraOu training data generator from https://github.com/ianfab/YaneuraOu/tree/fairy_bin. Its syntax is slightly different from that of the Fairy-Stockfish data generator, as the example below shows.
usi
setoption name Threads value 8
setoption name USI_Hash value 2048
isready
gensfen loop 20000000 depth 1 write_minply 6 random_multi_pv_diff 200 random_multi_pv 4 random_move_count 8 eval_limit 10000 output_file_name shogi.bin
quit