Books

                   Polyglot opening books

The Polyglot book format is the most widespread opening book format in use by chess engines. Each book entry contains 4 unused bytes, which we are going to use as shown in the table below. Score contains the Stockfish 10 evaluation at the search depth stored in Depth.

Field    Size
Score    2 bytes
Depth    1 byte
Learn    1 byte


With ProDeo 2.91 a book search will look like:

Book : books\prodeo.bin
Positions : 267.342

Move    Weight    Score    Depth    Learn
d2d4    51.60%     0.06       20       14
e2e4    41.96%     0.19       20       34
c2c4     4.13%     0.05       20       60
g1f3     2.31%    -0.01       20        0

Table-1
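The entry layout described above can be sketched as follows. The 16-byte big-endian entry (key, move, weight, 4-byte learn field) and the move bit layout are standard Polyglot. The split of the learn field into a signed 16-bit score in centipawns, an unsigned depth byte and a signed learn byte, in that order, is an assumption taken from the field table, not from the POLY source.

```python
import struct

# Standard Polyglot entry: 8-byte key, 2-byte move, 2-byte weight, 4-byte learn
# field, all big-endian. ASSUMPTION: the 4 learn bytes hold, in table order,
# a signed 16-bit score (centipawns), an unsigned depth and a signed learn byte.
ENTRY = struct.Struct(">QHHhBb")

def decode_move(word):
    """Decode a Polyglot move word into coordinate notation such as 'd2d4'."""
    files = "abcdefgh"
    to_file, to_row = word & 7, (word >> 3) & 7
    from_file, from_row = (word >> 6) & 7, (word >> 9) & 7
    promo = (word >> 12) & 7
    suffix = "nbrq"[promo - 1] if 1 <= promo <= 4 else ""
    return f"{files[from_file]}{from_row + 1}{files[to_file]}{to_row + 1}{suffix}"

def decode_entry(raw):
    """Unpack one 16-byte book entry into a dictionary."""
    key, move, weight, score, depth, learn = ENTRY.unpack(raw)
    return {"key": key, "move": decode_move(move), "weight": weight,
            "score": score / 100.0, "depth": depth, "learn": learn}

# Hand-built sample entry (placeholder key 0x1234) mirroring the d2d4 row.
sample = ENTRY.pack(0x1234, 731, 100, 6, 20, 14)
print(decode_entry(sample))
```

The weight stored in the entry is a raw 16-bit number; the percentages shown in Table-1 are relative to the sum of the weights of all moves in the position.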

Since Polyglot opening books are made from PGN collections which may contain inaccuracies (and even blunders) the extra information can serve as an extra filter to play the best move from the alternatives available.


For instance, playing a book match relying on the highest Stockfish score, thus playing e2e4 (0.19), already gave a 24 elo improvement.

________________________________________________________________________________________________


Besides the ProDeo book we analyzed other books as well; those that are in the public domain can be downloaded.

Book                Positions    Depth    Saturation    Reference
ProDeo.bin            267.342       20    100%
-                      31.467       20    88.0%
-                       6.900       20    88.6%
-                      92.954       20    95.6%
-                      92.229       20    95.7%
Cerebellum_Light    7.859.767       16    15.7%

Poly 1.2

First of all download and install the POLY utility and copy the Polyglot book(s) you want to analyze into the BOOKS folder. Run Prepare Analysis from the menu, then Analyze, and finally Import Analysis. That's it. View the result with Poly Statistics.


In detail:


Prepare Analysis - It's an unfortunate fact that while you can store a position as a 64-bit hash key, you cannot recover the position from the key, so the utility is clueless about the positions in the book unless you have the original PGN from which the book was made.


To identify most of the positions in a Polyglot book we created an EPD database with 25 million (opening) positions. The share of book positions it catches is expressed as the Saturation percentage, see above.


For example, choose "varied.bin" as opening book and then "10.epd" as reference database for the first 10 moves and it will create "varied.epd" to be analyzed, catching ± 65% of the moves in the book.


If you are not satisfied with the 65% result, download 20.epd and 30.epd separately and install them in the EPD folder. Using 20.epd will already produce a 95% match.


Analyze - Analyze the (just created) EPD collection with Stockfish, splitting the analysis over a user-defined number of threads. Press "E" to change the Stockfish engine, "T" to change the number of threads and "L" to change the time control level: a value < 100 analyzes to that depth, a value >= 100 defines a fixed search time in milliseconds. After the analysis is finished a file NEW.EPD is created, which is the base for the last step.


Import Analysis - Import the Stockfish 10 results (NEW.EPD) into the Polyglot book.
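The level rule of the Analyze step can be illustrated with a small sketch. `go depth` and `go movetime` are standard UCI commands; that POLY drives Stockfish with exactly these commands is an assumption, and `go_command` is a hypothetical helper name.

```python
def go_command(level):
    """Map the level setting to a UCI 'go' command.

    A level below 100 is treated as a search depth, a level of 100 or more
    as a fixed search time in milliseconds (rule taken from the text above).
    """
    if level < 100:
        return f"go depth {level}"
    return f"go movetime {level}"

print(go_command(20))     # depth-based analysis
print(go_command(60000))  # 60 seconds per position
```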

20.epd    12 million positions    Approx 85 Mb
30.epd    25 million positions    Approx 190 Mb


_______________________________________________________________________________________________

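The term Saturation needs an explanation. It means the percentage of positions in a book that were analyzed. Polyglot opening books are made from PGN collections, and the origins of the above books are not known, so only the positions found in our reference EPD database could be analyzed. The exception is ProDeo, whose source is known, hence all of its positions are analyzed.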

______________________________________________________________________________________________


How to analyze a Polyglot opening book?

The learning byte


Update Learning - This function updates the learning info (byte) in a Polyglot book from a PGN (for instance an ENG-ENG match). The learning byte may fluctuate between -125 and +125. The basic thought is simple: you don't want to play a book move with a value of -75 (meaning the move lost 75 games more than it won), whereas a move with a value of +50 (meaning it won 50 games more than it lost) is wise to play.


The experienced reader will immediately realize the drawback of the system: one byte is much too small to handle large eng-eng matches with tens of thousands of games. In an ideal system the learning would work via a corresponding W|L|D file with 32-bit counters; maybe something for the future, but for the moment outside the scope of the project, which sticks to the original Polyglot format.
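A minimal sketch of such a counter, assuming one point is added per win and subtracted per loss (which matches the "won 50 games more than it lost" description). `update_learn` is a hypothetical name, not a POLY function. It also shows why the byte saturates in long matches.

```python
LEARN_MIN, LEARN_MAX = -125, 125

def update_learn(learn, result):
    """Return the new learn value after one game.

    result is +1 for a win, 0 for a draw and -1 for a loss of the side
    that played the book move. The value is clamped to -125 .. +125.
    """
    return max(LEARN_MIN, min(LEARN_MAX, learn + result))

# After 125 net wins the counter is stuck: further wins are no longer recorded.
value = 0
for _ in range(2000):
    value = update_learn(value, +1)
print(value)
```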


Nevertheless, if you keep it small there will be immediate gain. After 3 (self-play) book matches of 2000 games each at 40m/60s we reached a score of 57.0%, representing a gain of 49 elo. We used a simple (first wild guess) formula: using the values from the table above, we multiply the learn value by 4 and add it to the weight percentage; the move with the highest value is played.
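The "first wild guess" formula can be written down in a few lines. The sketch below uses the Table-1 rows purely as sample input; `pick_move` is a hypothetical name and the tuple layout is an assumption made for illustration.

```python
def pick_move(entries):
    """Pick the move with the highest value of weight% + 4 * learn.

    entries is a list of (move, weight_percent, learn) tuples.
    """
    return max(entries, key=lambda e: e[1] + 4 * e[2])[0]

table_1 = [
    ("d2d4", 51.60, 14),  # 51.60 + 4*14 = 107.60
    ("e2e4", 41.96, 34),  # 41.96 + 4*34 = 177.96
    ("c2c4", 4.13, 60),   #  4.13 + 4*60 = 244.13
    ("g1f3", 2.31, 0),    #  2.31 + 4*0  =   2.31
]
print(pick_move(table_1))
```

With these sample values the learn term dominates the weight, so c2c4 is chosen although its book weight is only 4.13%.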


Table-2

Round    Games    Result    ELO      LOS      Remark
1        2.000    50.5%     - - -    73.8%    Initial round with a clean learning file.
2        2.000    54.2%     +29      100%     Learning updated with the PGN of round-1.
3        2.000    57.0%     +49      100%     Learning updated with the PGN of round-2.

It made no sense to play further rounds because many of the learn counters had already hit the -125 | +125 limit.
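The relation between the match result and the ELO column can be checked with the usual logistic rating formula. That the figures in Table-2 were produced with exactly this formula is an assumption, but it reproduces +29 for 54.2% and +49 for 57.0%.

```python
import math

def elo_from_score(p):
    """Elo difference implied by a score fraction p (0 < p < 1)."""
    return -400.0 * math.log10(1.0 / p - 1.0)

for result in (0.542, 0.570):
    print(f"{result:.1%} -> {elo_from_score(result):+.0f} elo")
```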


_______________________________________________________________________________________________


UPDATE

version 1.2


Added 4 essential functions which allow the user to immediately make use of the new features without programmer support.


Weights on score

1. Option 'B' (best move) sets the weight of the move with the highest Stockfish score to 100%. The book will only play those 100% moves. Using this function basically emulates the results in table-1 above.

2. Option 'V' (variation) calculates new weights based on the Stockfish scores.


Weights on learn

1. Option 'B' (best move) sets the weight of the move with the highest learn value to 100%. The book will only play those 100% moves. Using this function basically emulates the results in table-2 above.

2. Option 'V' (variation) calculates new weights based on the learn values.


Import Score and Depth from PGN

Import Score & Depth from an annotated (thus with Score and Depth) ENG-ENG match.


Import Score and Depth from EPD

. Required tags: 'bm' (move) and 'ce' (score).

. Preferable (although not obliged) tag: 'acd' (depth)


The advantage of the two last options is that you can add the analysis (score & depth) of any engine to Polyglot books.
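For illustration, the tags named above can be pulled out of an EPD record roughly like this. `bm`, `ce` (centipawn evaluation) and `acd` are standard EPD opcodes; `parse_epd` is a hypothetical helper, and splitting on ";" is a simplification that would break on quoted operands containing a semicolon.

```python
def parse_epd(line):
    """Return (position, best_move, score_cp, depth) or None if a required tag is missing.

    depth is None when the optional 'acd' tag is absent.
    """
    parts = line.strip().split(None, 4)
    if len(parts) < 4:
        raise ValueError("an EPD record needs four position fields")
    position = " ".join(parts[:4])
    ops = {}
    if len(parts) == 5:
        for op in parts[4].split(";"):
            op = op.strip()
            if op:
                name, _, value = op.partition(" ")
                ops[name] = value.strip()
    if "bm" not in ops or "ce" not in ops:
        return None  # required tags missing, record cannot be imported
    depth = int(ops["acd"]) if "acd" in ops else None
    return position, ops["bm"], int(ops["ce"]), depth

record = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - bm d4; ce 6; acd 20;"
print(parse_epd(record))
```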


Check the new Polyglot weights with ProDeo 2.9 (book.txt) or with Scid.


______________________________________________________________________________________________


UPDATE

version 1.3


Added 3 powerful new features with one goal: add elo to your Polyglot opening book. Annotated engine-engine games (with score & depth) are used to create a Polyglot opening book with score, depth and an elo indication of the engine that played the move; the newly created book is then merged into an existing Polyglot opening book, adding elo to that book. With these new tools we created a stronger ProDeo book. The features are best explained by showing how the newly created ProDeo book (or any other Polyglot book) is organized.

Table 1

Book : books\prodeo.bin
Positions : 267.342

Move    Weight    Score    Depth    Learn
d2d4    51.60%     0.03       26      115
e2e4    41.96%     0.06       28      121
c2c4     4.13%    -0.02       29      124
g1f3     2.31%    -0.03       28      121

Table 2

Book : books\prodeo.bin
Positions : 5.762.655

Move    Weight    Score    Depth    Learn
c1f4    27.03%     0.19       20       64
c1g5    18.92%     0.07       18       70
e2e4    16.22%     0.19       30       93
d1d3    13.51%     0.16       19       68
f1e1    10.81%     0.15       28       82
h2h3     8.11%     0.21       20       81
b2b3     5.41%     0.12       27       78

First we have analyzed the ProDeo book with this tool using Stockfish 10 at 60 seconds per move, see table 1.


Secondly with Make Polyglot Book we created an opening book from 1 million CCRL 40/40 games, 25 moves deep and only from games with a minimum elo rating of 3000 of both engines.


Then with Merge Book we combined the 2 books.


Table 2 shows an example where the original ProDeo book is out of book, yet plenty of playable moves are found thanks to the CCRL games. The position arises after -


1.c4 Nf6 2.Nc3 g6 3.g3 Bg7 4.Bg2 O-O 5.d4 d6 6.Nf3 Nc6 7.O-O a6


r1bq1rk1/1pp1ppbp/p1np1np1/8/2PP4/2N2NP1/PP2PPBP/R1BQ1RK1 w - -


Now we can see how the learn byte is organized when books are merged -


A value between 100 and 125 comes from the original ProDeo book and is sorted on depth (table 1).


A value below 100 comes from the added book and is based on elo using the formula 2500 + (learn * 10). Thus the move c1f4 (table 2) was played by an engine of 3140 (2500+640), and the move e2e4 is based on an engine of 3430 (2500+930).

With this extra information a user can better tune a book with the Weights on score and Weights on learn functions, and a programmer can make better choices depending on the strength of his engine.
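A programmer could interpret the learn byte of a merged book along these lines. The ranges and the elo formula come from the text above; treating 0 as "no elo known" is an assumption based on the remark that books made from a PGN without elo tags get a learn value of zero, and `describe_learn` is a hypothetical name.

```python
def describe_learn(learn):
    """Return (origin, elo) for the learn byte of a merged book entry."""
    if 100 <= learn <= 125:
        return ("original", None)   # original book, value sorted on depth
    if 1 <= learn <= 99:
        return ("added", 2500 + learn * 10)
    if learn == 0:
        return ("added", None)      # ASSUMPTION: 0 means no elo information
    return ("unknown", None)        # values outside the documented ranges

print(describe_learn(64))   # c1f4 in table 2
print(describe_learn(93))   # e2e4 in table 2
print(describe_learn(115))  # d2d4 in table 1
```

An engine rated well below the elo found in the learn byte could, for instance, prefer such moves over lower-rated alternatives.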


The 3 new functions in a nutshell -


1. Pre-fill learn byte - calculates the learn value of the original book (100-125).

2. Make Polyglot Book - from a PGN create a new book called polyglot.bin with the learn (elo) values below 100.

3. Merge Books - merge the original book and polyglot.bin, the result is called book.bin.


Closing remarks

1. The freeware books (see above) can be processed with the 3 steps above, using the pre-installed ccrl-3400.pgn as an example.


2. Using a PGN with filled elo rating tags is important for the programmer; without them it's impossible to calculate the elo rating, and the learn value (as in table 2) of all positions will be zero. For the user it has no effect. If a PGN has no filled elo rating tags, select "0" when prompted for the elo rating in Make Polyglot Book.


3. Make sure the PGN collection you use contains only engine-engine games played by engines at least 100 elo stronger than your own engine. For the new ProDeo book we chose a minimum elo of 3000, roughly 250 higher than ProDeo's rating, to be on the safe side.


4. When using Merge Books, always select the original book first and polyglot.bin second, because merge doesn't recalculate the weights of the first book, although in principle it should. We are not sure whether that's a bug or a feature; the latter certainly has its merits, as long as you are aware of it.


Technical

5. With the new Make Polyglot Book feature opening books can be easily made bigger than 256Mb which may cause problems (even a crash) with some of the functions of the POLY util. The cure is to start POLY via P.BAT which allocates 1.3Gb of memory instead of the default of 256Mb.


6. Polyglot books made larger than 2Gb will crash due to the limitation of the 32 bit code. That's still 131.072.000 positions!


______________________________________________________________________________________________


New features version 1.3b

Press F12 for overview


[ F1 ] - Delete positions from an EPD whose score is too bad or too good; the default margin is -1.50 / +1.50.


[ F2 ] - Quickly count the records in an EPD.


[ F3 ] - Quickly count the games in a PGN.


[ F4 ] - Convert an analyzed EPD to PGN for the use of making Polyglot books with score and depth.

            Tags like "bm" | "ce" | "acd" must be present else the function makes no sense.

            After the conversion create a book from the PGN with the new Polyglot 1.5.

            Example: polyglot.exe make-book -pgn file.pgn -bin file.bin -max-ply 100 -min-game 1


[ F11 ] - Help, moves you to this page.

______________________________________________________________________________________________

Downloads

Poly 1.3b     12 Mb
ProDeo.bin    52.8 Mb

____________________________________________________________________________________


POLY 1.4

What's new

What's improved

Poly statistics - extended information, including a fictional rating of the quality of the book, see example.


Weights on score has been greatly improved. The "Best" and "Vary" options have been replaced with a flexible margin in centipawns, see picture. A margin of 0.00 represents "Best", while greater margins exclude only the moves that fall outside the margin below the highest Stockfish score. The advised margin is 0.10. Use "...." to set a margin of your choice: if you prefer a margin of 0.05 (5 centipawns), press "enter", type "5" and press "enter" again.
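The margin rule can be sketched as a simple filter. Scores and margin are given in centipawns to avoid rounding issues; the Table-1 scores of the first release serve as sample input. How POLY redistributes the weights over the remaining moves is not described here, so the sketch only shows which moves survive, and `moves_within_margin` is a hypothetical name.

```python
def moves_within_margin(scores_cp, margin_cp=10):
    """Keep the moves whose score is within margin_cp of the best score.

    scores_cp maps a move to its score in centipawns. A margin of 0
    keeps only the best move(s), which corresponds to the old "Best" option.
    """
    best = max(scores_cp.values())
    return [move for move, score in scores_cp.items() if best - score <= margin_cp]

scores = {"d2d4": 6, "e2e4": 19, "c2c4": 5, "g1f3": -1}
print(moves_within_margin(scores, 0))   # only the best move
print(moves_within_margin(scores, 15))  # a wider margin lets more moves through
```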


[ F4 ] - Workaround for a Polyglot en passant (EP) bug when using FEN strings.

            Convert an analyzed EPD to PGN for the use of making Polyglot books with score and depth.

            Tags like "bm" | "ce" | "acd" must be present else the function makes no sense.

            After the conversion create a book from the PGN with the new Polyglot 1.5.

            Example: polyglot.exe make-book -pgn file.pgn -bin file.bin -max-ply 100 -min-game 1


This function is extremely useful to port the results of SOMU Analyze EPD, Analyze EPD++ or Analyze EPD+++ to PGN ready to be converted to a Polyglot opening book.

[ F5 ] - Check for suspect moves in an opening book. Using the default settings (margin -0.50 | depth=5) it will check the first 5 moves out of book for both engines and if the 4th move out of book has a score <= -0.50 it is reported, see the example of the ProDeo book versus the Cerebellum light book. Double openings are skipped.

[ F6 ] - An alternative for Make Poly Book from the main menu. It's faster and better. Steps:

1. First it will check the reliability of the PGN using pgn-extract.

2. You are prompted to make a selection which engine to use in the PGN, if you leave it blank all engines are used.

3. Define the Book Depth.

4. Assign a fictional Elo that represents the quality of the book you are going to make.

5. From here on everything goes automatic, the parsing, the sorting, the removal of doubles while keeping the highest depths and finally the book making.


The "green" was lacking (impossible) in the Make Poly Book from the main menu hence this new approach.
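The "removal of doubles while keeping the highest depths" from step 5 can be pictured as below. The record layout (key, move, score, depth) and the name `remove_doubles` are assumptions made for this sketch; the real POLY implementation may differ.

```python
def remove_doubles(records):
    """Keep one record per (key, move), the one analyzed at the highest depth.

    records is an iterable of (key, move, score_cp, depth) tuples.
    The result is sorted on key and move, as a book file would be.
    """
    best = {}
    for rec in records:
        ident = (rec[0], rec[1])
        if ident not in best or rec[3] > best[ident][3]:
            best[ident] = rec
    return sorted(best.values())

parsed = [
    (1, "e2e4", 19, 20),
    (1, "e2e4", 25, 30),  # same position and move, deeper analysis wins
    (1, "d2d4", 6, 20),
    (2, "g1f3", 0, 18),
]
print(remove_doubles(parsed))
```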

Poly 1.4

12 Mb

_______________________________________________________________________________________________


POLY 1.5

What's new

July 2019


The next step in the evolution of the Polyglot book is to add a WDL (won | draw | loss) statistic to book moves offering the following advantages:


1. Make the WDL visible, see the example made with the latest ProDeo;


2. Tuning the weight using the WDL instead of the Stockfish evaluation score.


For the purpose of demonstration we extracted a quality PGN of ±200.000 human-human games from MillionBase 2.9 both players with a minimum elo rating of 2500. From the PGN (called elo2500.pgn) we created a Polyglot book called elo2500.bin.


[F8] - Creates a W|D|L statistic from the PGN.


[F9] - Import (connect) the W|D|L statistic to the Polyglot book creating the file elo2500.WDL.


[F10] - Recalculate the Polyglot weights using the W|D|L statistics.


You are prompted to input the WDL parameter which makes it possible to tune a Polyglot book. The WDL parameter works like a harmonica, a high value (like 300 or higher) is like stretching the standard top moves to even higher weights while lowering or even zero-ing lesser weight moves. A low value (say 25) does the opposite, it moves lesser weight moves up often close to the weight of the best move.


Said in other words, a low value makes a book very random, a high value narrows the book. The following example makes the effect visible and concentrates on what happens to the weights of the big four: 1. e4 | 1. d4 | 1. Nf3 | 1. c4


Results - We tested the elo2500.bin book against 2 other books to find out the best WDL parameter setting.

Elo2500 book versus ProDeo book

WDL         Games    Perc     Elo
Not used    5000     50.7%    +5
0.25        5000     45.4%    -33
0.50        5000     47.6%    -16
0.75        5000     49.9%    -1
1.00        5000     51.3%    +9
1.50        5000     52.7%    +18
2.00        5000     54.3%    +30
2.50        5000     55.1%    +35
3.00        5000     54.2%    +29

Elo2500 book versus Performance book

WDL         Games    Perc     Elo
Not used    5000     50.5%    +3
0.25        5000     42.4%    -53
0.50        5000     46.7%    -23
0.75        5000     48.7%    -8
1.00        5000     50.4%    +2
1.50        5000     52.8%    +19
2.00        5000     54.1%    +28
2.50        5000     54.2%    +29
3.00        5000     54.5%    +31

Poly 1.5

123 Mb

As one can see the optimal WDL parameter for this book is around 2.50


Remarks

. The elo2500 book (.bin and .wdl) is included in the download so you can experiment with [F10] yourself.

. Also included is elo2500.epd which you can try on other Polyglot books with [F9].

. To view the WDL statistics use the latest ProDeo.


Programmers only

If you want to make use of the WDL statistic: for speed reasons, the information for a book entry is stored in the corresponding *.WDL file at the same offset as the entry in the *.BIN file. fseek to that offset in the WDL file and then fread 4 x 32-bit values (won | draw | loss | percentage).
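In Python the same lookup could look like the sketch below. It relies on a WDL record (4 x 32 bit) being as large as a 16-byte Polyglot entry, which is what makes identical offsets possible. The little-endian, unsigned encoding and the unit of the percentage field are assumptions; check them against a real .WDL file. `read_wdl` is a hypothetical name.

```python
import io
import struct

ENTRY_SIZE = 16              # size of one Polyglot entry in the .BIN file
WDL = struct.Struct("<4I")   # ASSUMPTION: four little-endian unsigned 32-bit values

def read_wdl(wdl_file, entry_index):
    """Read (won, draw, loss, percentage) for the book entry with this index."""
    wdl_file.seek(entry_index * ENTRY_SIZE)   # same offset as in the .BIN file
    raw = wdl_file.read(WDL.size)
    if len(raw) != WDL.size:
        raise EOFError("no WDL record at this offset")
    return WDL.unpack(raw)

# In-memory stand-in for a .WDL file with two made-up records.
fake_wdl = io.BytesIO(WDL.pack(1, 2, 3, 50) + WDL.pack(10, 5, 2, 76))
print(read_wdl(fake_wdl, 1))
```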


_________________________________________________


3. Insight in your personal opening repertoire, what opening to play or avoid.


When you are a club player and have maintained your games in PGN, the new WDL feature lets you study your opening repertoire. Let's take the games of Garry Kasparov as an example. From kasparov.pgn (included in the download) we create an opening book called kasparov.bin (also included in the download).


Then with [F8] (Player = kasparov) and [F9] we create the kasparov.wdl file (also included in the download) and then we are ready to browse through Garry's favorite openings with the latest ProDeo.


Example-1 shows that after 1.e4 Kasparov's best result is with 1...c5 and other moves score very poorly. On the other hand, after 1.e4 e5 Kasparov with the white pieces has an incredibly high score, see example-2.


________________________________________________________________________________________________


Polyglot 64-bit

June 2019


Polyglot (as far as I can tell) always has been 32-bit which limited the working space during --make-book to 1Gb. Compiling Polyglot as a 64-bit program takes this obstacle away. I could make a Polyglot book of 1.5Gb in one run while Polyglot workspace increased to 5.5Gb of memory.


Not only is this more comfortable it also has the advantage you don't need --merge-book any longer which enables you to produce better books because --merge-book has the (nasty) habit not to recalculate the weights.

Polyglot 1.5

64-bit

Caution - Now that you can make books as big as your memory allows, note that a Polyglot book larger than 2Gb can't be used by engines that access the book with traditional 32-bit instructions; some of the Polyglot code needs an update first.

___________________________________________________________________________________


Programmer Stuff


Making use of the new features requires changes in the Polyglot search code, but only a few. Using the source code example of Michel Van den Bergh here and here the changes are marked with "@ed" here.


Remarks

1. Instead of updating the learning AFTER the match a better result might be possible updating the learn value DURING the match, for instance when it's clear your engine is winning or losing adding one to learn in case of a win, subtract one in case of a loss.


2. To make version 1.5 possible the Polyglot source code needed changes. The source code can be downloaded separately; changes are marked with "@ed".


_______________________________________________________________________________________________


Credits

Thanks to the Stockfish team for the use of Stockfish 8, 9 and 10, to Fabien Letouzey for Polyglot and to David J. Barnes for Pgn-Extract.