Books

Polyglot opening books

The Polyglot book format is the most widespread format in use by chess engines. The format contains 4 unused bytes, which we are going to use as shown in the table below. Score contains the Stockfish 10 evaluation at the search depth stored in Depth.

Field    Size
Score    2 bytes
Depth    1 byte
Learn    1 byte

In ProDeo 2.91 (not available yet) a book search will look like:

Book : books\prodeo.bin
Positions : 267.342

Move    Weight    Score    Depth    Learn
d2d4    51.60%     0.06    20       14
e2e4    41.96%     0.19    20       34
c2c4     4.13%     0.05    20       60
g1f3     2.31%    -0.01    20        0

Table-1
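As a rough sketch of how the 4 extra bytes could be read next to the standard fields: a Polyglot entry is 16 bytes, big-endian, with an 8-byte key, a 2-byte move and a 2-byte weight, followed by the 4 bytes discussed above. The order score, depth, learn inside those 4 bytes, the signedness of each field and the centipawn unit of the score are assumptions based on the byte table, not confirmed details of ProDeo. The names ENTRY, decode_move and parse_entry are only illustrative.

```python
import struct

# key (uint64), move (uint16) and weight (uint16) are the standard Polyglot fields.
# ASSUMPTION: the remaining 4 bytes hold score (int16, centipawns),
# depth (uint8) and learn (int8), in that order.
ENTRY = struct.Struct(">QHHhBb")  # 16 bytes, big-endian, no padding


def decode_move(move):
    """Turn the 16-bit Polyglot move into coordinate notation."""
    files = "abcdefgh"
    to_file, to_row = move & 7, (move >> 3) & 7
    from_file, from_row = (move >> 6) & 7, (move >> 9) & 7
    promo_index = (move >> 12) & 7
    promo = "nbrq"[promo_index - 1] if 1 <= promo_index <= 4 else ""
    return f"{files[from_file]}{from_row + 1}{files[to_file]}{to_row + 1}{promo}"


def parse_entry(raw):
    """Unpack one 16-byte book entry into a dictionary."""
    key, move, weight, score, depth, learn = ENTRY.unpack(raw)
    return {
        "key": key,
        "move": decode_move(move),
        "weight": weight,
        "score": score / 100.0,  # assumed centipawns, shown in pawns
        "depth": depth,
        "learn": learn,
    }


# Hand-built entry for illustration only (d2d4, score 0.06, depth 20, learn 14).
raw = ENTRY.pack(0x463B96181691FC9C, 731, 100, 6, 20, 14)
print(parse_entry(raw))
```

Note that Polyglot stores castling as "king takes rook" (for instance e1h1), so a real reader would need to translate those moves; the sketch leaves that out.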

Since Polyglot opening books are made from PGN collections, which may contain inaccuracies (and even blunders), the extra information can serve as an additional filter for picking the best move from the available alternatives.

 

For instance, playing a book match relying on the highest Stockfish score, thus playing e2e4 (0.19), already gave a 24 Elo improvement.

________________________________________________________________________________________________

 

Besides the ProDeo book we analyzed other books as well; those that are in the public domain can be downloaded.

Book              Positions   Depth  Saturation  Reference
ProDeo.bin        267.342     20     100%
Fruit.bin         31.467      20     88.0%       Fabien Letouzey
Perfect2017.bin   6.900       20     88.6%       SedatChess
Performance.bin   92.954      20     95.6%       Marc Lacrosse
Varied.bin        92.229      20     95.7%       Marc Lacrosse
Cerebellum_Light  7.859.767   16     15.7%       Stefan Zipproth

Poly 1.1

First of all, download and install the POLY utility and copy the Polyglot book(s) you want to analyze into the BOOKS folder. Run Prepare Analysis from the menu, then Analyze and finally Import Analysis. That's it. View the result with Poly Statistics.

 

In detail:

 

Prepare Analysis - It's an unfortunate fact that, while you can store a position as a 64-bit hash key, you cannot retrieve the position from that key. The utility is therefore clueless about the positions in the book unless you have the original PGN from which the book was made.

 

To identify most of the positions in a Polyglot book we created an EPD database with 25 million (opening) positions. The share of the book it identifies is expressed as the Saturation percentage, see above.

 

For example, choose "varied.bin" as the opening book and then "10.epd" as the reference database for the first 10 moves. This creates "varied.epd" to be analyzed, catching roughly 65% of the moves in the book.

 

If you are not satisfied with the 65% result, download 20.epd and 30.epd separately and install them in the EPD folder. Using 20.epd already produces a 95% match.
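Because a hash key cannot be turned back into a position, the matching has to work the other way around: hash every position of the reference database and check whether that key occurs in the book. A minimal sketch of the resulting percentage is given below. It assumes the keys of the reference positions have already been computed with the Polyglot hashing scheme (not shown here) and that saturation is counted over distinct keys; the POLY utility may well count per move instead. The function name saturation is hypothetical.

```python
def saturation(book_keys, reference_keys):
    """Percentage of distinct book keys that also occur in the reference set."""
    book = set(book_keys)
    if not book:
        return 0.0
    matched = book & set(reference_keys)
    return 100.0 * len(matched) / len(book)


# Toy data: 4 distinct book keys, 3 of them present in the reference database.
book_keys = [0x01, 0x02, 0x03, 0x04, 0x02]
reference_keys = {0x01, 0x02, 0x04, 0x99}
print(f"{saturation(book_keys, reference_keys):.1f}%")  # 75.0%
```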

 

Analyze - Analyzes the (just created) EPD collection with Stockfish, splitting the analysis over a user-defined number of threads. Press "E" to change the Stockfish engine, "T" to change the number of threads and "L" to change the time control level: a value < 100 analyzes to that depth, a value >= 100 defines a fixed search time in milliseconds. When the analysis is finished a file NEW.EPD is created, which is the basis for the last step.
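The rule for the level value can be sketched as a mapping onto the two standard UCI search commands, "go depth" and "go movetime". That POLY drives Stockfish in exactly this way is an assumption; the function name go_command is only illustrative.

```python
def go_command(level):
    """Map the "L" setting to a UCI search command.

    Values below 100 are treated as a search depth, values of 100 and
    above as a fixed search time in milliseconds.
    """
    if level <= 0:
        raise ValueError("level must be positive")
    if level < 100:
        return f"go depth {level}"
    return f"go movetime {level}"


print(go_command(20))    # go depth 20
print(go_command(1000))  # go movetime 1000
```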

 

Import Analysis - Import the Stockfish 10 results (NEW.EPD) into the Polyglot book.

20.epd    12 million positions    Approx 85 Mb
30.epd    25 million positions    Approx 190 Mb

The term Saturation needs an explanation. It means the percentage of positions in a book that were analyzed. Polyglot opening books are made from PGN collections, and since the origins of the above books are not known, not all of their positions can be identified. The exception is ProDeo, whose source is known, hence all of its positions are analyzed.

______________________________________________________________________________________________

 

How to analyze a Polyglot opening book?

 

_______________________________________________________________________________________________

The learning byte

 

Update Learning - This function updates the learning info (byte) in a Polyglot book from a PGN (for instance an engine-engine match). The learning byte may range from -125 to +125. The basic idea is simple: you don't want to play a book move with a value of -75 (meaning the move lost 75 games more than it won), whereas a move with a value of +50 (meaning it won 50 games more than it lost) is a wise choice.
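A minimal sketch of such a counter is shown below. It assumes that a draw leaves the value unchanged and that the counter stops at the limits instead of wrapping around; neither detail is confirmed for the POLY utility. The name update_learn is hypothetical.

```python
LEARN_MIN, LEARN_MAX = -125, 125


def update_learn(learn, result):
    """Return the new learn value after one game.

    result is +1 for a win, -1 for a loss and 0 for a draw, seen from
    the side that played the book move. The value is clamped to the
    -125 | +125 range (an assumption).
    """
    return max(LEARN_MIN, min(LEARN_MAX, learn + result))


learn = 0
for result in [1, 1, -1, 0, 1]:  # three wins, one loss, one draw
    learn = update_learn(learn, result)
print(learn)  # 2
```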

 

The experienced reader will immediately realize the drawback of the system: one byte is much too small to handle large engine-engine matches with tens of thousands of games. In an ideal system the learning would work via a corresponding W|L|D file with 32-bit counters. That may be something for the future, but for the moment it is outside the scope of the project, which sticks to the original Polyglot format.

 

Nevertheless, if you keep it small there is an immediate gain. After 3 (self-play) book matches of 2000 games each at 40m/60s the score rose to 57.0%, representing a gain of 49 Elo. We used a simple (first wild guess) formula: using the values shown in Table-1 above, we multiply the learn value by 4 and add it to the weight percentage, and the move with the highest total is played.
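Applied to the rows of Table-1, the formula works out as in the sketch below. The learn values in Table-1 may be illustrative rather than the ones from the matches, and the names table_1 and combined are ours.

```python
# (move, weight in percent, learn) as listed in Table-1
table_1 = [
    ("d2d4", 51.60, 14),
    ("e2e4", 41.96, 34),
    ("c2c4", 4.13, 60),
    ("g1f3", 2.31, 0),
]


def combined(weight_pct, learn, factor=4):
    """Weight percentage plus the learn value multiplied by 4."""
    return weight_pct + factor * learn


for move, weight_pct, learn in table_1:
    print(f"{move}  {combined(weight_pct, learn):7.2f}")

best = max(table_1, key=lambda row: combined(row[1], row[2]))
print("book move:", best[0])  # c2c4
```

With these numbers c2c4 (4.13 + 4 * 60 = 244.13) comes out ahead of e2e4 (177.96) and d2d4 (107.60), which shows how strongly the factor of 4 lets the learn value dominate the original weight.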

 

Table-2

Round  Games  Result  ELO    LOS    Remark
1      2.000  50.5%   - - -  73.8%  Initial round with a clean learning file.
2      2.000  54.2%   +29    100%   Learning updated with the PGN of round-1.
3      2.000  57.0%   +49    100%   Learning updated with the PGN of round-2.

It made no sense to play any further, because many of the learn counters had already hit the -125 | +125 limit.
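The Elo figures in Table-2 agree with the usual logistic relation between score percentage and rating difference, as the short check below shows. The function name elo_from_score is ours, and we assume this is the formula that was used.

```python
import math


def elo_from_score(score_pct):
    """Rating difference implied by a match score given in percent."""
    p = score_pct / 100.0
    return 400.0 * math.log10(p / (1.0 - p))


for score_pct in (54.2, 57.0):
    print(f"{score_pct}% -> {elo_from_score(score_pct):+.0f} Elo")
```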

 

_______________________________________________________________________________________________

 

UPDATE

version 1.1

 

Added 2 essential functions which allow the user to make immediate use of the new features without programmer support.

 

Weights on score - Sets the weight of the move with the highest Stockfish score to 100%. The book will only play those 100% moves. Using this function basically emulates the results in Table-1 above.

 

Weights on learn - Sets the weight of the move with the highest learn value to 100%. The book will only play those 100% moves. Using this function basically emulates the results in Table-2 above.
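Both functions can be pictured as one small routine that rewrites the weights of all moves in a position. The sketch below assumes the other moves drop to 0% and that the first move wins a tie; how the POLY utility handles ties, or how it stores the percentages in the 2-byte weight field, is not known to us. The name set_weights is hypothetical and the data is taken from Table-1.

```python
def set_weights(moves, field):
    """Give the move with the highest value of `field` a weight of 100 (%), all others 0."""
    best = max(moves, key=lambda m: m[field])
    return [dict(m, weight=100.0 if m is best else 0.0) for m in moves]


moves = [
    {"move": "d2d4", "weight": 51.60, "score": 0.06, "learn": 14},
    {"move": "e2e4", "weight": 41.96, "score": 0.19, "learn": 34},
    {"move": "c2c4", "weight": 4.13, "score": 0.05, "learn": 60},
    {"move": "g1f3", "weight": 2.31, "score": -0.01, "learn": 0},
]

on_score = set_weights(moves, "score")  # "Weights on score"
on_learn = set_weights(moves, "learn")  # "Weights on learn"
print([m["move"] for m in on_score if m["weight"] == 100.0])  # ['e2e4']
print([m["move"] for m in on_learn if m["weight"] == 100.0])  # ['c2c4']
```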

____________________________________________________________________________________

 

Programmer Stuff

 

Making use of the new features requires changes in the Polyglot search code, but only a few. Using the source code example of Michel Van den Bergh (here and here), the changes are marked with "@ed" here.

 

Remark

Instead of updating the learning AFTER the match, a better result might be possible by updating the learn value DURING the match, for instance when it's clear your engine is winning or losing: add one to learn in case of a win, subtract one in case of a loss.