Testing Team Topalov's Claims of Cheating at Chess

Still under construction, but with data fit to examine.

Silvio Danailov's Oct. 4 letter accuses Vladimir Kramnik of cheating specifically with Fritz 9. The ChessBase posting has become the version of record, e.g. for the Wikipedia article on the match, because the letter seems never to have been entered into the official match records, as evidenced here. ChessBase corrects two typos as shown there, but I further believe that the "(86% of matches)" refers to the fraction 24/28 = 6/7 = .857... as relevant to Game 3, rather than being a typo carried over from the Game 2 entry, which says "(87% of matches)."
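The arithmetic behind that reading can be checked with a few lines of Python. This is only a sketch: the helper name `percent_rounded` and the round-half-up convention are assumptions made for illustration, since the letter does not say how its percentages were rounded.

```python
from fractions import Fraction


def percent_rounded(matches: int, moves: int) -> int:
    """Match percentage as a whole number, rounded half up.

    Exact rational arithmetic avoids floating-point surprises.
    The rounding convention is an assumption, not taken from the letter.
    """
    exact = Fraction(matches * 100, moves)
    # int() truncates, which equals floor for the positive values used here.
    return int(exact + Fraction(1, 2))


# 24 matches out of 28 moves reduces to 6/7.
game3_fraction = Fraction(24, 28)
game3_percent = percent_rounded(24, 28)
```

Under these assumptions, `game3_fraction` reduces to 6/7 and `game3_percent` can be compared directly with the figure quoted in the letter.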

Of course the total lack of data, logs, reproducibility assertions, or any statement of methodology from Team Topalov should be castigated from all directions, but the accusations continue to reverberate seemingly beyond any reach of science. This page is an effort to apply science, after many hours of meticulous observation of Fritz 9 (with default parameters as recommended by ChessBase here and plenty of hash, usually 700MB) recorded in logs below. There is enough here for me to give the preliminary conclusion that there is no determinative statistical evidence of cheating by Kramnik. This does not prove the absence of cheating. It does mean both that these runs find the apparently high match rates explainable, and that I expect further runs to yield the same conclusion with high statistical probability. Otherwise this work should be viewed not as "taking the A and B samples" itself, but rather as the start of developing this kind of test. Sadly, recent events in India and as discussed by USCF officials here make the need to research this kind of test (absent the timing data considered vital to catch cheaters in speed chess online) all the more pressing.

Note: While compiling my data, I was under the impression that a label such as "12/12" marked the end of the 12-ply search, but it actually marks the beginning, and so corresponds to "11->" from Bob Hyatt's Crafty. I've demonstrated this point in the Rybka forums here, comparing two versions of Crafty 19.19. Although the logs thus err, this does not affect the data, and the "11-12-13-14" description on this page has been correct since November 3rd (2006).

Methodology

Original explanation of the multi-line mode methodology used for these tests. See discussion in the updated general methodology section, especially here.

Data

Long test file of Kramnik-Topalov Game 1, Kramnik's moves only, with results and conclusions in this file.

Long test file of Topalov-Kramnik Game 2, Kramnik's moves only, with results and conclusions in this file. Although the 85% match rate (75% when weighted 1-2-4-8 for matches at 11-12-13-14 ply) seems impressive at first sight, 25 of Kramnik's 46 moves at issue were basically forced (not counting one blunder on a forced move!). From the other 21, the match rate seems highly random.
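As a rough illustration of the 1-2-4-8 weighting, here is a minimal Python sketch. It is not the procedure used to produce the files above. The normalization (dividing by the total weight of 15), the rule that a raw match means agreement with the engine's first choice at any tabulated depth, and the names `Move`, `weighted_score`, and `match_rates` are all assumptions made for the example, and the sample moves are hypothetical rather than taken from any game.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

# Weights 1-2-4-8 for matches at the 11-, 12-, 13-, and 14-ply searches.
WEIGHTS = {11: 1, 12: 2, 13: 4, 14: 8}
TOTAL_WEIGHT = sum(WEIGHTS.values())  # 15


@dataclass(frozen=True)
class Move:
    """One played move (hypothetical record layout)."""
    matched_depths: FrozenSet[int]  # depths where it was the engine's first choice
    forced: bool = False            # judged "basically forced" by the analyst


def weighted_score(move: Move) -> float:
    """Share of the total weight earned by a single move (assumed normalization)."""
    return sum(WEIGHTS[d] for d in move.matched_depths) / TOTAL_WEIGHT


def match_rates(moves: List[Move], skip_forced: bool = False) -> Tuple[float, float]:
    """Return (raw_rate, weighted_rate), optionally ignoring forced moves."""
    pool = [m for m in moves if not (skip_forced and m.forced)]
    if not pool:
        raise ValueError("no moves to score")
    # Assumption: a raw match is agreement at any of the tabulated depths.
    raw = sum(1 for m in pool if m.matched_depths) / len(pool)
    weighted = sum(weighted_score(m) for m in pool) / len(pool)
    return raw, weighted


# Hypothetical sample, not data from the match.
SAMPLE = [
    Move(frozenset({11, 12, 13, 14}), forced=True),
    Move(frozenset({14})),
    Move(frozenset()),
    Move(frozenset({11, 12})),
]
```

Calling `match_rates(SAMPLE, skip_forced=True)` restricts the tally to the non-forced moves, which is the kind of comparison made above for the 21 moves that were not forced.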

Jason Buczyna's Rybka test of Game 2, with results and conclusions in this file. It seems very striking that 30 of the last 32 moves by Kramnik are matches to Rybka, but all but 11 were quite forced, and 5 of those 11 involve ties, plus 2 inferior moves by Kramnik.

Comparison run on E. Sukhovsky-V. Mihalevsky, 1-0, Ashdod International, Sept. 10-17, 2006, related by Irina Krush in her USCF blog. The run shows 66% matches to Fritz 9 and 54% in non-forcing situations, but maybe a higher fidelity score because White had many more nearly-equal choices in the non-forcing situations that matched.

Long test file of Kramnik-Topalov Game 3, tabulating Fritz 9 evaluations at the close of the 11, 12, 13, and 14-ply searches, and giving evals at 16-18 ply for key moves. Stats and Summary Conclusions from this run. This is the only game for which my run produced a markedly lower match rate than Danailov's claim for this game. Indeed, based on my observations of variances while operating Fritz 9 in different ways, I estimate a less-than-2% chance that a given run will produce Danailov's reported match rate (less than 0.5%, or 1-in-200, if my correction of the typos on this game in his letter is right). Moreover, Topalov scores a much higher raw match rate to Fritz 9 than Kramnik in this game! But under 1-2-4-8 weighting of matches at 11-12-13-14 ply, both players are in the 45%-50% range, with Topalov 4% lower. (No results on this page use my ideas for a fidelity metric yet.)

Long test file of Topalov-Kramnik Game 4, stats and results in this file from this run.

Long test file of Topalov-Kramnik Game 6, stats and results in this file from this run.

Earlier Game 6 test file at 14 ply.

If you have research-relevant ideas which may shed more light, by all means please e-mail them to me! My home page has full contact info, plus I am listed.