Earlier attempts already indicated that there is ELO in big data, perhaps more than we thought at first sight. In the first YAT edition ProDeo 2.0 scored a 53 ELO improvement with only one change: a computer-generated extended opening book made from CCRL and CEGT games.
In YAT-2 we included (in the last 3 rounds) an *.EBF book generated from the Dann Corbit and Les Fernandez analysis project. It gave such good results that we became curious and decided to replay the first 5 rounds. The result: a 102 ELO improvement.
The YAT edition of ProDeo 2.1 comes ready to use in the Arena 3.0 GUI with the extended books installed:
There is ELO in big data.
ProDeo first looks in its regular opening book. If nothing is found there it consults the 111 million position dc.ebf book, and finally the 40 million position ccrlcegt.cht book. For YAT-2 (replayed) the 240 games gave the following statistics.
Moves played from regular book: 2625 / 240 = 10.9 moves
Moves played from dc.ebf book: 749 / 240 = 3.1 moves
Moves played from ccrlcegt.cht book: 89 / 240 = 0.4 moves
Overall book moves: 2625 + 749 + 89 = 3463, and 3463 / 240 = 14.4 moves
Extra moves gained by EBF & CHT: 749 + 89 = 838, and 838 / 240 = 3.5 moves
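The per-game averages above can be rechecked with a few lines of Python. This is only a sketch of the arithmetic. The counts are the ones reported for the 240 replayed games, while the names `book_moves` and `per_game` are made up for the illustration.

```python
# Book-move counts reported for the 240 replayed YAT-2 games.
GAMES = 240
book_moves = {"regular": 2625, "dc.ebf": 749, "ccrlcegt.cht": 89}

def per_game(count, games=GAMES):
    """Average number of book moves per game, rounded to one decimal."""
    return round(count / games, 1)

total = sum(book_moves.values())                            # all book moves
extra = book_moves["dc.ebf"] + book_moves["ccrlcegt.cht"]   # EBF + CHT only
averages = {name: per_game(n) for name, n in book_moves.items()}

for name, avg in averages.items():
    print(f"{name}: {avg} moves per game")
print(f"overall: {total} moves, {per_game(total)} per game")
print(f"extra from EBF and CHT: {extra} moves, {per_game(extra)} per game")
```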
If you are not interested in using the big data, don't download: the download size is 1.0 GB (1.6 GB unzipped) and the engine is the unchanged ProDeo 2.0 chess engine. The latter is deliberate. An improved engine would contaminate the measurement of the ELO gain there is in opening books, book learning, position learning and opponent learning. There is still potential I would like to explore, and that requires an unchanged engine.
CHT and EBF books are controlled by parameters found in the personal/prodeo.eng configuration file.
[EBF File = books\dc.ebf] * database location 111 million positions
[EBF Depth = 80] * consult EBF database till 40 moves
[EBF Priority = EBF] * EBF or CHT (search which database first)
[CHT File = books\ccrlcegt.cht] * database location 40 million positions
[CHT Depth = 60] * consult CHT database till 30 moves
[CHT Minimum = 7] * engine classification (0-12) 7 = ELO >= 2900
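For anyone who wants to read these settings from a script, here is a minimal sketch that parses lines in the `[Key = Value] * comment` layout shown above. It is not ProDeo's own parser. The regular expression and the function name `parse_settings` are assumptions made for the example, and only the six lines quoted here are handled.

```python
import re

# The six parameter lines quoted above, copied as sample input.
SAMPLE = r"""
[EBF File = books\dc.ebf] * database location 111 million positions
[EBF Depth = 80] * consult EBF database till 40 moves
[EBF Priority = EBF] * EBF or CHT (search which database first)
[CHT File = books\ccrlcegt.cht] * database location 40 million positions
[CHT Depth = 60] * consult CHT database till 30 moves
[CHT Minimum = 7] * engine classification (0-12) 7 = ELO >= 2900
"""

# "[key = value]" at the start of a line; whatever follows "]" is ignored.
LINE = re.compile(r"^\[\s*([^=\]]+?)\s*=\s*([^\]]*?)\s*\]")

def parse_settings(text):
    """Return a dict of settings; purely numeric values become ints."""
    settings = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            key, value = match.group(1), match.group(2)
            settings[key] = int(value) if value.isdigit() else value
    return settings

settings = parse_settings(SAMPLE)
print(settings)
```

Judging by the comments (80 for 40 moves, 60 for 30 moves), the depth values appear to be counted in half-moves, so `settings["EBF Depth"] // 2` gives the depth in full moves.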
The [EBF Priority = ] parameter decides which book is consulted first. For the moment the EBF book performs better, so the EBF setting is used: when a move is found in the EBF book it is played, and if nothing is found the CHT book is consulted. The setting CHT reverses the search order.
The [CHT Minimum = ] parameter controls the minimum (CCRL) elo rating. Each position stored in the database is classified by the elo rating of the engine that played the move, using a value varying from 0-12. For an engine like ProDeo, which is rated around 2700, the minimum elo is set to 2900 (represented by 7) as a safe rating above which moves from other programs can be used as book moves. The following conversion table helps to set this parameter to a value that suits your program.
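The lookup order described above (regular book first, then the two big books in the order given by the priority setting) can be pictured roughly as follows. This is a sketch and not ProDeo's code. Plain dictionaries stand in for the book files, and the position keys, the moves and the function name `pick_book_move` are hypothetical.

```python
def pick_book_move(position, regular, ebf, cht, priority="EBF"):
    """Return (move, source) from the first book that knows the position."""
    big_books = [("EBF", ebf), ("CHT", cht)]
    if priority == "CHT":
        big_books.reverse()          # CHT setting: consult the CHT book first
    for name, book in [("regular", regular)] + big_books:
        move = book.get(position)
        if move is not None:
            return move, name
    return None, None                # out of book: the engine has to search

# Toy books keyed by made-up position names.
regular = {"start": "e2e4"}
ebf = {"p1": "g1f3", "p2": "d2d4"}
cht = {"p2": "c2c4", "p3": "b1c3"}

print(pick_book_move("start", regular, ebf, cht))
print(pick_book_move("p2", regular, ebf, cht))
print(pick_book_move("p2", regular, ebf, cht, priority="CHT"))
```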
[CHT Minimum = ?]
As can be seen from the source code, when multiple moves are available the CHT search always puts the move with the highest elo rating on top.
As the table shows, it's important to consider carefully what number to use. If you have a 2900 elo rated engine and you use 11 as classification, that might turn out to be unwise because it might also play moves from 2700 rated engines as book moves. It makes more sense to use 3 or 4, maybe even 5, which guarantees moves from programs rated at least 3000 elo.
For top engines it will be much harder (although not impossible) to profit from the CCRL and CEGT data. For average and lower rated engines there is a lot to gain. Funnily enough, the weaker your engine, the more there is to gain.
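To illustrate how such a threshold could filter candidate moves, here is a small sketch. It rests on an assumption inferred from the examples in the text (7 stands for 2900 and up, 3 to 5 for 3000 and up, 11 lets 2700 rated moves through): a lower classification number means a stronger engine, and the setting is the weakest class still accepted. The function name `best_cht_move` and the sample data are hypothetical, and the real selection logic lives in the ProDeo source.

```python
def best_cht_move(candidates, cht_minimum):
    """Pick a book move from (move, rating_class) pairs.

    Assumption: a lower rating_class means a stronger engine, and
    cht_minimum is the weakest class that is still accepted.
    """
    allowed = [c for c in candidates if c[1] <= cht_minimum]
    if not allowed:
        return None                  # nothing trustworthy enough in the book
    # Strongest engine on top, as described for the CHT search.
    return min(allowed, key=lambda c: c[1])[0]

# Made-up candidates for a single position.
candidates = [("e2e4", 9), ("d2d4", 6), ("c2c4", 3)]
print(best_cht_move(candidates, 7))
print(best_cht_move([("e2e4", 9)], 7))
```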