Cheating Accusations at Corus 2007, and Before

Primary Sources and Discussion Forums (data is below)

Sueddeutsche Zeitung article (in German) by M. Breutigam detailing suspicious behavior by Silvio Danailov during Topalov's games against van Wely in round 2 and Karjakin in round 3.

ChessBase translation of Breutigam's article.

Jan 27 ChessNinja thread, "Foul Play In Chess"

Feb 3 ChessNinja thread, "Recrimination du Jour". Has links to other relevant articles (I'm trying not to clutter this page)...

New 2/9/07: Kommersant article on a possible official investigation into cheating claims at several tournaments. Video from Corus 2006 by an amateur Dutch filmmaker. Kommersant article's link to the video (no longer works?).

Main Methodology

Since the allegations are clearly about confederate cheating, since current engines in single-line mode seem to reach high search depths faster than their forebears(?), and since quad-core machines are now common, I have extended the testing window up to the 17- or 18-ply round. These tests were run on single-core Intel and AMD machines; tests on multi-core machines will come later.

It is still necessary to get a readout of (at least) 10 choices per move at some high search depth, both to assess the significance of a match or non-match and to feed the similarity metrics under development. It may be important to have multi-line verdicts for every search depth from 11 ply upward in order to support a notion of the "swing" or "criticality" of a move, along the lines of "complexity" in the June 2006 Guid-Bratko study (PDF of paper). My Elista testing used 10-line mode only for the 11-, 12-, 13-, and 14-ply rounds. However, here we mainly try a reasonable "shortcut" methodology:

Steps (b) and (c) leave ample time to be away from the machine. Until this kind of procedure can be scripted, that is a considerable factor for busy people, since step (c) especially may run over an hour depending on your hardware. In cases where all but some number k < 10 of lines are catastrophic, perhaps putting the engine into slow "mate-find" mode, you can reduce the number of lines in step (c) from 10 to k. Or if the move played does not show in the top 10, you can raise the number of lines until it shows.
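The line-count adjustment just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for the data below: the function names, the -500 centipawn cutoff for "catastrophic", and the increment of 5 lines are all hypothetical choices, and the engine lines are assumed to have been copied out of the engine's multi-line readout by hand.

```python
from typing import List, Optional, Tuple

# One engine line: (move, evaluation in centipawns for the side to move).
# The list is assumed to be sorted best-first, as in a multi-line readout.
Line = Tuple[str, int]


def rank_of_played(lines: List[Line], played: str) -> Optional[int]:
    """Return the 1-based rank of the played move, or None if it is not listed."""
    for rank, (move, _cp) in enumerate(lines, start=1):
        if move == played:
            return rank
    return None


def non_catastrophic(lines: List[Line], cutoff_cp: int = -500) -> int:
    """Count lines scoring above the cutoff (the cutoff is an assumed value)."""
    return sum(1 for _move, cp in lines if cp > cutoff_cp)


def next_line_count(lines: List[Line], played: str, requested: int = 10,
                    cutoff_cp: int = -500, step: int = 5) -> int:
    """Suggest how many lines to request on the next run of this position."""
    rank = rank_of_played(lines, played)
    if rank is None:
        # Played move not shown: raise the number of lines and try again.
        return requested + step
    # Otherwise keep the k non-catastrophic lines, but never so few
    # that the played move itself would drop off the list.
    k = non_catastrophic(lines, cutoff_cp)
    return min(requested, max(rank, k))


# Made-up readout for illustration only.
example = [("e2e4", 30), ("d2d4", 25), ("g1f3", -700)]
print(next_line_count(example, "d2d4"))  # played move is listed at rank 2
print(next_line_count(example, "a2a3"))  # played move is not listed
```

The one design choice worth noting is the `max(rank, k)` guard: if the move actually played is itself among the catastrophic lines, cutting down to k would hide it, which defeats the purpose of the readout.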

It is also interesting to ask whether doing only single-line mode (to 17 or 18 ply) at every move, then restarting from the beginning and doing every move in 10-line mode, will give notably different results. The shortcut version of this involves clipping the multi-line analysis only at 17/17 (or 18/18 or 19/19 if you can; 16/16 if you can't wait). This "two-stage serial shortcut" method is in fact the one used in Buczyna's data.

Data and Reports

Note on Rd. 2 by KWR in the Jan 27 thread. (Now I wish I'd done this more formally, and I am doing so with Fritz 9; the full test on one engine takes hours, however! NEW 2/14: see note below.)

Long test file by Jason Buczyna of Junior 10 (and other engines) on the 2nd-round game Topalov-van Wely, from where theory was left at move 17. Tabulated results in this file. This expanded results file shows that Fritz 10 matches all but 2 moves left unmatched by Junior 10; tests on Fritz 10 alone are forthcoming...

Rybka 2.2n2 test file (long) by KWR of the 3rd-round game, with results in this file (not yet summarized).

Log of Fritz 9 on moves 28--36 by KWR, with results in this file. Note that moves 33, 35, and 36 are significantly inferior according to Fritz 9, though they were much closer for Rybka 2.2n2.

New 2/7/07: Based on the above, I gave my initial summary, and maintained it until 2/7: "The results of these runs do not confirm the accusations in the article. Of course they do not prove the absence of cheating---and possibly an engine not among those tested here was used---but in my opinion these results must be considered a counter to the article." However, the following run by Jason Buczyna on Fritz 10 makes a totally striking contrast to his results on Junior 10...

Long test file by Jason Buczyna of Fritz 10 (contrasted with his Rybka 2.2; my "2.2n2" is said to be significantly different) on the 2nd-round game Topalov-van Wely, from where theory was left at move 17. Tabulated results in this file. This expanded results file shows that Fritz 10 matches all but 2 moves. The 15/17 raw match rate will also score high in our metrics because more than half of the matches were in close situations. JB has begun a test of Game 3 with Fritz 10, to compare to my runs which use Fritz 9 and the contemporary Rybka 2.2n2 (note that the slightly-older Rybka 2.2 in his expanded file matches far less often than Fritz 10).
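For readers who want to tabulate their own runs, here is a minimal Python sketch of the two quantities mentioned above: the raw match rate, and how many of the matches came in close situations. The formal metrics are still under development, so this is not them. The names are hypothetical, "close" is taken (as an assumption) to mean that the engine's first and second choices differ by at most 30 centipawns, and the four sample records are made up rather than taken from any of the test files.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MoveResult:
    move_number: int
    matched: bool  # played move equals the engine's first choice
    gap_cp: int    # gap between engine's 1st and 2nd choices, in centipawns


def summarize(results: List[MoveResult], close_gap_cp: int = 30) -> Dict[str, float]:
    """Tally raw matches and matches in close positions (threshold is assumed)."""
    total = len(results)
    matches = [r for r in results if r.matched]
    close = [r for r in matches if r.gap_cp <= close_gap_cp]
    return {
        "total": total,
        "matches": len(matches),
        "raw_rate": len(matches) / total if total else 0.0,
        "close_matches": len(close),
    }


# Made-up records for illustration only.
sample = [
    MoveResult(18, True, 10),
    MoveResult(19, True, 120),
    MoveResult(20, False, 15),
    MoveResult(21, True, 25),
]
print(summarize(sample))
```

A match in a close position says more than a match where the best move is forced or obvious, which is why the two counts are kept separate rather than folded into one rate.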

Long Fritz 9 test file by KWR of the same Rd. 2 game, with results and comments in this file and tabulated results. This largely confirms my informal observations in my initial Jan 27 comments at ChessNinja, and contrasts with the Fritz 10 results, as compared here.

Long test file of Round 3, Karjakin-Topalov, by Jason Buczyna. Moves and comments, and tabulated results. Results from this game are highly inconclusive, even excluding moves before move 20, and moves 23 and 27-28 (or 26-27?) when Breutigam reports Danailov as having been "interrupted". Moreover, the moves "shortly before the time control...had become hectic"; all 4 of moves 37--40 are matches, but were seemingly unsignalled.

Guessing at what the formal statistical methods will show, this seems a higher match rate than what one could have called an exoneration, but lower than what one would (together with Round 2) have called a "smoking gun". The run also highlights methodological difficulties caused by differing verdicts between "single-line" and "multi-line" modes.
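One simple way to see how much the two modes differ on a given game is to list the moves where their first choices disagree. The Python sketch below assumes the first choice at each move number has already been copied out of the single-line and multi-line logs into two dictionaries; the function name and the sample entries are hypothetical and do not come from the runs above.

```python
from typing import Dict, List


def mode_disagreements(single: Dict[int, str], multi: Dict[int, str]) -> List[int]:
    """Return move numbers, in order, where the two modes' first choices differ.

    Only move numbers present in both dictionaries are compared.
    """
    shared = single.keys() & multi.keys()
    return sorted(n for n in shared if single[n] != multi[n])


# Made-up first choices for illustration only.
single_line = {17: "Nf3", 18: "Bd3", 19: "O-O"}
multi_line = {17: "Nf3", 18: "Qe2", 19: "O-O"}
print(mode_disagreements(single_line, multi_line))
```

Moves flagged this way are the ones where a "match" or "non-match" verdict depends on which mode is taken as the reference, so they deserve a second look before being counted either way.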

Conclusions, Opinions, and Further Discussion

This topic, and further allegations by Nigel Short (as reported 1/30/07 in this DNA India article and 2/1/07 in this Leonard Barden column) and others, continue to reverberate. I have tried to furnish concrete suggestions and analysis at the following places (search for "KWRegan" for my comments):

If you have research-relevant ideas that may shed more light, by all means please e-mail them to me! My home page has full contact info, plus I am listed.