2021-04-25

Joking FTW, Seriously

It all started a couple of months ago. First the Stockfish 13 release announcement, and shortly afterwards the Lc0 v0.27.0 one, contained identical language stating that both “teams will join forces to demonstrate our commitment to open source chess engines and training tools, and open data.” While the intention is still there and we stand behind this statement, we haven’t yet managed to set up anything more formal in this direction.

With this in mind, when I was looking for an April Fools’ joke for this year, the idea of using LeelaChessZero training data to train a NNUE net seemed very enticing.

The first concept for the joke was to rename cfish and derive a net from the data of the infamous training series that remains unnamed in polite conversation¹. In the end this was abandoned in favor of recent T60 training data, which are in the new V6 training data format. (We also no longer have the training data of that net, but I only remembered that a lot later.)

The next iteration of the joke came after considering the TCEC submission guidelines:

1. NNUE code can be used and considered as if it was a library (even if it is not literally one).
2. Custom modifications to the basic NNUE code are strongly encouraged, it should be considered rather like a starting point.
3. All NNUE training data should be generated by the unique engine’s own search and/or eval code.

This made me think that including cfish in lc0 can lead to an engine that one can argue satisfies those guidelines. Number 2 is a wish, so we can ignore it. Number 3 can be satisfied if selfplay mode is retained, as the training data could have been generated “by the unique engine’s own search and/or eval code.” For number 1 it can be argued that fish is the “NNUE code” for lc0². This led directly to the lcfish release.

To train the NNUE net for this, a few days’ worth of T60 training data were converted to the NNUE training format using a special version of the rescorer³. After fumbling around for a few days with the CPU (nodchip) trainer, I settled on a training protocol that seemed to produce decent results. The training started from an empty net with the following command:

learn targetdir trainingdata epochs 3000 batchsize 1000000 reduction_gameply 12 smart_fen_skipping use_draw_in_training 1 use_draw_in_validation 1 lr 1 lambda 1 eval_limit 32000 nn_batch_size 1000 newbob_decay 1 skip_duplicated_positions_in_training 0 eval_save_interval 250000000 loss_output_interval 1000000 validation_set_file_name validationdata\val.binpack validation_count 10000

I had to restart 2-3 times after a few steps as it was clear the random initialization wasn’t working. I then repeated the process with a 0.1 learning rate starting from the final net generated, and once more with 0.01 learning rate. Training at each learning rate took a bit more than 7 hours, and was done over a weekend. I’m not claiming this is a good training procedure as I have no real experience with NNUE training and several of the values were just guesses. I expect people who know what they are doing can generate a better net from the same training data, and would be very happy to see the results.
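The three passes described above differ only in the learning rate, so the command lines can be generated from one option list. The following is a minimal Python sketch under that assumption: the option names and values are copied from the command in this post, while `BASE_OPTIONS`, `learn_command` and `stage_commands` are names made up for illustration. It only builds the strings; it does not run the trainer, and it does not show how each pass loads the net produced by the previous one, since that step is not spelled out here.

```python
# Options copied from the `learn` command above, in the same order.
# A value of None marks a flag that takes no argument.
BASE_OPTIONS = [
    ("targetdir", "trainingdata"),
    ("epochs", "3000"),
    ("batchsize", "1000000"),
    ("reduction_gameply", "12"),
    ("smart_fen_skipping", None),
    ("use_draw_in_training", "1"),
    ("use_draw_in_validation", "1"),
    ("lr", "1"),
    ("lambda", "1"),
    ("eval_limit", "32000"),
    ("nn_batch_size", "1000"),
    ("newbob_decay", "1"),
    ("skip_duplicated_positions_in_training", "0"),
    ("eval_save_interval", "250000000"),
    ("loss_output_interval", "1000000"),
    ("validation_set_file_name", r"validationdata\val.binpack"),
    ("validation_count", "10000"),
]


def learn_command(overrides=None):
    """Return a `learn` command line with selected option values replaced."""
    overrides = overrides or {}
    unknown = set(overrides) - {name for name, _ in BASE_OPTIONS}
    if unknown:
        raise KeyError(f"not in the base command: {sorted(unknown)}")
    parts = ["learn"]
    for name, value in BASE_OPTIONS:
        value = overrides.get(name, value)
        parts.append(name)
        if value is not None:
            parts.append(str(value))
    return " ".join(parts)


# One pass per learning rate, in the order used in the post.
LEARNING_RATES = ["1", "0.1", "0.01"]
stage_commands = [learn_command({"lr": lr}) for lr in LEARNING_RATES]

if __name__ == "__main__":
    for command in stage_commands:
        print(command)
```

Keeping the options in an ordered list rather than a dict literal makes it easy to check the first generated line against the command shown above.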

To summarize:

The first positive outcome from this joke was the addition of code to the rescorer to output NNUE plain format (for details see below).
Moreover, the lcfish training data were contributed to the Stockfish training data collection.
We also have the single-line change that allows building lc0 on Windows with gcc (though not with all backends).
Finally, some of the training data generated crashed the Stockfish-NNUE trainer, which led to the discovery and fix of a 3-year-old bug in lc0.

Not bad for an April Fools’ joke!

PS. The rescorer is a tool we use to post-process LeelaChessZero training data. The main use is to consult endgame databases and set the correct outcome and remaining moves for each position, when they can be determined, and propagate this information to earlier moves. Since the rescorer has all the required functionality to read and process the training data, adding the ability to generate NNUE training data was relatively easy. This conversion capability has often been requested, and is very much in line with our objectives, as we want our data to be used to the largest extent possible.

To generate NNUE training data with the rescorer, the --nnue-plain-file=output_file.plain option is used. Note that the data are appended to the specified .plain file if it already exists, and that the input files are deleted unless --delete-files=false is also specified. So a typical command would be

./rescorer rescore --input=/path/to/training/data --syzygy-paths=/path/to/syzygy/files --nnue-plain-file=output_file.plain

(the rescorer requires syzygy files to be available).

The current training data (V6 format) store both the played move and the best move, which for training games is very often different, as well as the respective evaluations. This gives us the option to save either the best move evaluation (default) or the played move one (--nnue-best-score=false). Similarly, the generated file can contain the played move for each position (default) or the best move (--nnue-best-move=true), but note that changing this will typically write a .plain file that doesn’t compress very well when converted to a .binpack one⁴.
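The options described above can be collected into one small helper that assembles the rescorer command line. This is a sketch only: the flags are the ones named in this post, while the function `rescorer_command` and its parameter names are hypothetical and not part of the rescorer. It returns the argument list without running anything.

```python
import shlex


def rescorer_command(input_dir, syzygy_paths, plain_file,
                     best_score=True, best_move=False, keep_inputs=False,
                     binary="./rescorer"):
    """Build the argument list for converting training data to a .plain file.

    The defaults mirror the defaults described in the post: best move
    evaluation, played move, and input files deleted after processing.
    """
    args = [
        binary,
        "rescore",
        f"--input={input_dir}",
        f"--syzygy-paths={syzygy_paths}",
        f"--nnue-plain-file={plain_file}",
    ]
    if not best_score:
        # Save the evaluation of the played move instead of the best move.
        args.append("--nnue-best-score=false")
    if best_move:
        # Store the best move instead of the played move.
        args.append("--nnue-best-move=true")
    if keep_inputs:
        # Without this flag the input files are deleted.
        args.append("--delete-files=false")
    return args


if __name__ == "__main__":
    command = rescorer_command(
        "/path/to/training/data",
        "/path/to/syzygy/files",
        "output_file.plain",
        keep_inputs=True,
    )
    print(shlex.join(command))
```

Passing `keep_inputs=True` while experimenting seems prudent, since the default behaviour removes the input files.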

This functionality is now merged to the main rescorer repository.

PPS. During testing of the new rescorer options, the training procedure was refined, giving a somewhat stronger net. Specifically, the last step (lr 0.01) was modified to use lambda 0.5. Since the training data contain several temperature blunders, reducing the use of the first 50 plies seemed to be advantageous (but this was not exhaustively tested). Thus the last step for the updated procedure is:

learn targetdir trainingdata epochs 3000 batchsize 1000000 reduction_gameply 50 smart_fen_skipping use_draw_in_training 1 use_draw_in_validation 1 lr 0.01 lambda 0.5 eval_limit 32000 nn_batch_size 1000 newbob_decay 1 skip_duplicated_positions_in_training 0 eval_save_interval 250000000 loss_output_interval 1000000 validation_set_file_name validationdata\val.binpack validation_count 10000
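Because these commands are long, it is easy to miss what actually changed. The Python sketch below parses two `learn` command lines and reports the options whose values differ. It assumes the original last step was the first command of this post with lr 0.01, as described in the text; `parse_learn` and `changed_options` are illustrative helper names, not part of any trainer.

```python
# Options that appear in these commands without a value.
VALUELESS = {"smart_fen_skipping"}

HEAD = ("learn targetdir trainingdata epochs 3000 batchsize 1000000 "
        "reduction_gameply {ply} smart_fen_skipping use_draw_in_training 1 "
        "use_draw_in_validation 1 lr 0.01 lambda {lam} ")
TAIL = (r"eval_limit 32000 nn_batch_size 1000 newbob_decay 1 "
        r"skip_duplicated_positions_in_training 0 "
        r"eval_save_interval 250000000 loss_output_interval 1000000 "
        r"validation_set_file_name validationdata\val.binpack "
        r"validation_count 10000")

# Assumed original last step (first command with lr 0.01) and the updated one.
ORIGINAL_LAST_STEP = HEAD.format(ply="12", lam="1") + TAIL
UPDATED_LAST_STEP = HEAD.format(ply="50", lam="0.5") + TAIL


def parse_learn(command):
    """Split a `learn` command line into a dict of option name -> value."""
    tokens = command.split()
    if not tokens or tokens[0] != "learn":
        raise ValueError("expected a command starting with 'learn'")
    options = {}
    i = 1
    while i < len(tokens):
        name = tokens[i]
        if name in VALUELESS:
            options[name] = None
            i += 1
        else:
            if i + 1 >= len(tokens):
                raise ValueError(f"option {name!r} has no value")
            options[name] = tokens[i + 1]
            i += 2
    return options


def changed_options(old, new):
    """Return {option: (old_value, new_value)} for options that differ."""
    a, b = parse_learn(old), parse_learn(new)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    for name, (old, new) in changed_options(ORIGINAL_LAST_STEP,
                                            UPDATED_LAST_STEP).items():
        print(f"{name}: {old} -> {new}")
```

Under that assumption, the only differences are `lambda` (1 to 0.5) and `reduction_gameply` (12 to 50).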

The updated net has been uploaded to the lcfish repository.

¹ ~~T20~~
² Admittedly not very convincing.
³ We use the rescorer for post-processing LeelaChessZero training data.
⁴ The .binpack format depends on the moves giving the next position in a game, to avoid storing the position.

Posted by: borg
