Creating datasets yourself
MEA was not created for SIMEX but for other purposes, such as OKE or creating opening books, and for that reason it requires a special EPD tag. SOMU 1.5a will do that job for you; see the [F9] and [F10] options marked "new" on that page.
In a nutshell:
[F9] - creates a suitable SIMEX EPD from a PGN.
[F10] - converts an EPD that contains the "bm" tag for use in SIMEX.
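For readers unfamiliar with the format, here is a minimal sketch of what a standard EPD line with a "bm" (best move) tag looks like and how its operations can be read. It does not reproduce what SOMU does and does not show the special tag MEA needs, which is not described here; the function name parse_epd and the sample line are made up for illustration.

```python
def parse_epd(line):
    """Split one EPD line into its position part and its operations.

    Standard EPD has four position fields (pieces, side to move,
    castling rights, en passant square) followed by operations that
    are terminated by semicolons, e.g. 'bm e4;'.
    Simplification: a quoted operand that itself contains ';' would
    be split incorrectly by this sketch.
    """
    fields = line.strip().split(None, 4)
    if len(fields) < 4:
        raise ValueError("an EPD line needs at least four position fields")
    position = " ".join(fields[:4])
    operations = {}
    if len(fields) == 5:
        for op in fields[4].split(";"):
            op = op.strip()
            if not op:
                continue
            name, _, operand = op.partition(" ")
            operations[name] = operand.strip()
    return position, operations


# Hypothetical sample line: the start position with e4 marked as best move.
sample = 'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - bm e4; id "demo 1";'
position, operations = parse_epd(sample)
print(position)
print(operations.get("bm"))
```

An EPD line without a "bm" entry in the returned operations would not qualify as input for [F10].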
Differences between SIM03 and SIMEX
1. SIM03 sends the whole game history to the engine while SIMEX uses EPD. This might cause differences.
2. The time control is fundamentally different. SIM03 is in control: it sends a stop command to the engine when time is up. MEA leaves it to the engine and to how its programmer has implemented the fixed move time. Unfortunately, not every engine implements this accurately.
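The two differences can be sketched in terms of engine commands. This assumes both tools talk to the engine over the UCI protocol; the exact command sequences that SIM03 and MEA send are not given here, so the functions below (history_commands, epd_commands, epd_to_fen) are an illustration only.

```python
def history_commands(moves):
    """History style: the position is the start position plus all moves
    played so far, and the tester itself ends the search.
    Assumption: the tester waits for the move time between 'go infinite'
    and 'stop'; the waiting itself is not modelled here."""
    return ["position startpos moves " + " ".join(moves),
            "go infinite",
            "stop"]


def epd_to_fen(epd_position):
    """An EPD position has four fields; a FEN needs six. Append dummy
    halfmove and fullmove counters, since EPD carries no move history."""
    return epd_position + " 0 1"


def epd_commands(epd_position, movetime_ms):
    """EPD style: the position is sent without history, and the engine
    is trusted to stop by itself after the fixed move time."""
    return ["position fen " + epd_to_fen(epd_position),
            "go movetime " + str(movetime_ms)]


for cmd in history_commands(["e2e4", "e7e5"]):
    print(cmd)
for cmd in epd_commands("rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq -", 100):
    print(cmd)
```

In the first style the tester decides when the search ends; in the second the engine decides, which is where inaccurate move-time handling shows up.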
An extreme example is Rybka1. With SIM03 it needs 17 minutes to finish the 8238 positions at 100ms (already 3½ minutes too much!), but with SIMEX it takes a notable 1 hour and 2 minutes. You can check the end of the log file to verify that the time an engine has used is sane. For Rybka1 we got:
Time allocation : BAD!! spending more time
ActualTime > ExpectedTime + MarginTime
ExpectedTime : 823.8s
ActualTime : 3669.8s
However, Rybka1 is a big exception; the other engines we tested stay within reasonable margins. Still, reason (2) explains why the SIMEX similarity percentages are somewhat higher than with SIM03.
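The check reported in the log can be reproduced by hand. A minimal sketch, assuming ExpectedTime is simply the number of positions times the move time (which matches 8238 x 100ms = 823.8s in the log above). The size of MarginTime is not stated here, so the 10% used below is purely a placeholder, and the function name is made up.

```python
def check_time_allocation(positions, movetime_ms, actual_s, margin_fraction=0.10):
    """Return (expected_s, margin_s, ok) for a fixed-move-time run.

    margin_fraction is a hypothetical value, not the margin the tool
    actually uses.
    """
    expected_s = positions * movetime_ms / 1000.0
    margin_s = expected_s * margin_fraction
    ok = actual_s <= expected_s + margin_s
    return expected_s, margin_s, ok


# Figures taken from the Rybka1 log above.
expected_s, margin_s, ok = check_time_allocation(8238, 100, 3669.8)
print("ExpectedTime : %.1fs" % expected_s)
print("ActualTime   : %.1fs" % 3669.8)
print("Time allocation :", "ok" if ok else "BAD!! spending more time")
```

With the Rybka1 figures the actual time is more than four times the expected time, so the check fails for any reasonable margin.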
1. For SIMEX parameters that manipulate the data for better results, see the README file.
2. Add comments to HTML reports by storing them in legend.txt (see the example).
3. Make a dendrogram from *.data files for visualization as a *.png (see the example). While creating an HTML report, SIMEX also creates a CSV file (readable in Excel) called dendrogram.csv, which can be used by the dendrogram tool of Ferdinand Mosca. Just double-click dendrogram.bat if you want such a picture.
Syntax: dendrogram --input dendrogram.csv --output sim.png