File "KTlog6.txt", by Kenneth W. Regan on 10/4/06.
Game 6 Test with Fritz 9, 2GHz "Pentium-M Yonah" laptop with 2GB RAM, several Firefox and Internet Explorer windows open, plus SSH Secure Shell and Notepad. I have 10 lines showing in Fritz 9.

[Event "WCh"]
[Site "Elista RUS"]
[Date "2006.10.02"]
[Round "6"]
[White "Topalov,V"]
[Black "Kramnik,V"]
[Result "1/2-1/2"]
[WhiteElo "2813"]
[BlackElo "2743"]
[EventDate "2006.09.23"]
[ECO "D17"]

1. d4 d5 2. c4 c6 3. Nf3 Nf6 4. Nc3 dxc4 5. a4 Bf5 6. Ne5 e6 7. f3 c5 8. e4 Bg6 9. Be3 cxd4 10. Qxd4 Qxd4 11. Bxd4 Nfd7 12. Nxd7 Nxd7

13. Bxc4 a6
12-ply: a6 0.32, ...Rc8 0.35 just changed to 13...Rc8 +0.31, and the first time I ran Fritz 13...Rc8 was always second, but probably the display is too quick to be meaningful until the end of the 13-ply search.
13/38
13...a6 0.33
13...Rc8 0.45
13...e5 0.59 (would not consider)
13...h5 0.62 (nah).
6th choice 13...h6 0.64 is the only other move I'd consider, aside from 8th choice, 13...Rg8 (0.75). 1287kN/s.
14/38 13...a6 now +0.41, 13...Rc8 now +0.48, 13...h6 now +0.60.
MATCH, delta2 = 0.07

14. Ke2 Rg8
Right here I realized the importance of logging the actual time, so I stepped ahead 13...a6 14 Ke2 (real quick, two hits of "->" key) at: 8.59.10?pm.
13/13 9:00:04pm
14...Nb8 +0.08
14...Rc8 +0.26
14...h5 +0.39
5th choice 14...Rg8 (played) is +0.44. This may still be K's preparation.
14/14 shown at 9:01:27 (my seconds typing is off by 3-5 sec.)
Here I also settled on the methodology of using the evals between "13/13" and "14/14", since about 2-3 minutes elapse to then.
14/38 at 9:02.25 done with 5 lines, 14...Nb8 +0.18, 14...Rc8 +0.29, now 14...Rg8 climbs to 3rd at +0.40. Steady at 1287kN/s, hmmm...
NON-MATCH, delta = -0.22

15. Rhd1 Rc8
9.04.15pm start
9:05.35 13/13 (was 12/35)
15...Bd6 +0.41 (but who would dare play it!?)
15...Rc8 +0.48
15...Bb4 +0.48
15...h6,Rd8 next
15...Be7 +0.56
9:08-something, 14/14
15...Bd6 0.39, /now/ 15...Rc8 flips to 1st with +0.34, about 9:09.45.
(5 minutes in---and my family called me to a window about 9:07pm).
15...Bb4 +0.49
15...h6 +0.59.
Since ...Bd6 *looks* suspect, I'll call this a MATCH even though it was a step beyond 14/14, with delta2 = 0.15 (rather than 0.05 for Bd6, I completely discard that move!) (geez, there is some subjectivity right away, and I already varied from my between 13/13 and 14/14 policy...though in the direction of "MATCH".)

16. b3 Bc5
start 9.12.00 exactly
12/34 4 lines at 9:13.05pm.
16...Bc5 +0.24
16...Bd6 +0.29
16...Bb4 +0.30
16...Nb8 +0.33
13/13 @9:13.50
16...Bc5 +0.23
16...Bd6 +0.23 gee that move again.
16...Nb8 +0.24 a very reasonable move
16...Bb4 +0.34 ditto
16...h5 +0.40 like Korchnoi played in Baguio in his comeback.
16...Be7 +0.44
10th line done at 9:15.57pm.
MATCH, but with delta2 = 0.01 only (again discounting ...Bd6)
Basis---yes use the lines as they appear between 13/13 and 14/14.

17. a5 Ke7
9.18.00 start (when I stepped to 17. a5.)
13/13 at 9.18.55 now blacked lines give
17...Nb8 +0.19
17...Bxd4 +0.26
17...Bd6 +0.30
17...Ke7 +0.44
17...Bb4 +0.46
14/14 at 9:20.12 Since a wide choice of reasonable moves here, look one more ply...
...no change in evals except 17...Ke7 now +0.43.
NON-MATCH, delta = -0.25

18. Na4 Bb4
9:22.00 11/34
12/12 at 9:22.23
13/13 9:23.00 exactly, now blacked-in lines:
18...Bxd4 +0.35
18...Bb4 +0.39
18...Bd6 +0.51
18...alternatives lose quickly, so ignore.
14/14 at 9:24.19. another ply
18...Bxd4 +0.36
18...Bb4 +0.39
Well, if I were playing "Active Chess", I'd be happy that the non-exchanging move suffered only a 0.04 or 0.03 dropoff, so I'd play it. But by Danailov's rules, this is a NON-MATCH, delta = -0.04.
step 2 more at 9:27.00 (I'm going on the minute now)

19. Nb6 Nxb6
12/12 at 9:27.29
13/13 at 9:28.04
19...Nxb6 +0.44
19...Rcd8 +0.61
19...Rc7 +0.74
alternatives lose, so MATCH, delta2 = 0.17 (odd, I'd have expected it to be higher) (meaning that 19...Nxb6 seems obviously the move to me)
14/14 at 9:29.45. 15 sec.
later, 19...Nxb6 went to +0.43.

20. Bxb6 f6
step at 9:31.00
11/30, 11/31, 12/27, 13/13 at (1033kN/s now) 9:32.00 blacked lines
20...Bc5 +0.43
20...Rc6 +0.45
20...f5 +0.62
20...h6 +0.63
20...f6
14/14 at 9:33.02, 20...f6 was 10th at some point, now 5th +0.71
20...Rge8 +0.72
20...h5 +0.73
going 1 more ply, 1249kN/s,
20...Bc5 +0.37
20...Rc6 +0.43
20...f5 +0.65
20...h6 +0.66
20...f6 +0.76 now 20...Rge8, Rh8, h5 get ahead of it
15/15 at 9:35.33.
NON-MATCH, emphatic with delta = -0.28 (-0.46 at 15 ply!)
I remember remarking on the server chat that I didn't like ...f6.

21. Rd3 Rc6
stepped at 9:37.00
12/12 at 9:37:20.
13/13 at 9:37.39 blacked lines now
21...Rc6 +0.66
21...Bd6 +0.81
21...Be8 +1.07
21...Rgf8 +1.13 OK getting silly now.
14/14 at 9:38.41 go 1 more ply
21...Rc6 +0.59
21...Bd6 +0.79
21...Be8 +1.00
MATCH, delta2 = 0.20 (but we regarded this as the obvious required followup in the server chat)
stepping again at 9:42.00 (some seconds to breathe...)

22. h4 Rgc8
11/27, 11/28, 12/12 at 9:42.23
13/13 at 9:42.43
22...Rgc8 +0.56
22...Rd6 +0.57
22...Rh8 +0.59
22...Bf7 +0.54 just shot to 1st at 9:43.23
22...h5 +0.61
14/14 at 9:43.51 1193kN/s
Another case where "Hamming Metric" is not robust. This was a "MATCH" until 22...Bf7 shot to 1st, though only by a .02 margin. By my controlled rules (lines between 13/13 and 14/14) this is a NON-MATCH, delta = -0.02
15/15 about 9:45.30, going to 16/16 on this one...
15/38 1310kN/s blacked lines coming in, 22...Bf7 and 22...Rgc8 TIED at +0.47.
stepping at 9:48.00 now
Just before that, 22...h5 shot to 1st at +0.46 (I think)

23. g4 Bc5
13/13 at 9:48.42
23...Bc5 +0.23 (the move I wanted)
23...Bf7 +0.48
23...e5 +0.50
23...Be8 +0.57
14/14 at 9:49:32, Fritz is too fast!
23...Bc5 +0.21
23...Bf7 +0.46
23...e5 +0.50
23...Be8 +0.55 pretty steady
MATCH, delta2 = +0.25 (higher delta2 means less-likely cheating?)
15/15 at 9:51.10
stepping at 9:52.00---time for quick potty break? :-) No, but 23...Bc5 improved to +0.09 just before I stepped.

24.
Rad1 Bxb6
12/12 at 9:52.27?
14/14 already at 9:53.07
24...Bxb6 -0.07 (edge to Black!)
24...Bd6 +0.77
24...Be8 +0.96
MATCH, but high delta2 = +0.84 means this is meaningless, forced move.

25. Rd7+ Kf8
Forced move---ignore, i.e. MATCH but delta2 = +1.98.
stepping at 9:56.00

26. axb6 Rxb6
Ditto, 13/13 at 9:56.25
26...Rxb6 -0.03
26...Rb8 +0.84
other moves ignore.
MATCH, delta2 = +0.87
stepping at 9:58.00

27. R1d6 Rxd6
13/13 at 9:58.20?
14/14 at 9:58.46
27...Rxd6 -0.02
27...Be8 +0.80
27...Rb4 now +0.75.
Another forced move, so Fritz was too fast for me to catch 13/13 lines.
MATCH, delta2 = +0.82
At 10pm exact, the last interesting decision...

28. Rxd6 Rc6
13/13 at 10:00.34
28...e5 -0.01
28...Bf7 +0.18
28...a5 +0.35?
14/14 at 10.01.10, ...Rc6 wasn't in the top 10!
28...e5 +0.01
28...Bf7 +0.18
28...a5 +0.35
28...Rc7 +0.50
28...Rc6 is line 12 at +0.82.
It would make a *nice* instructive note to explain the practical significance of 28...Rc6 a-la what Nigel Short said to me on the server chat---how even though Fritz jumps in eval (I saw only a +0.60 or so eval while kibitzing, hmm...), the positional factors ultimately favor Black in this B-ending, so Black only has to calculate that WK can't penetrate. Which is probably easier than evaluating White's 7th-rank and invasion ideas in the R-ending...
Going full 16/16 ply: (15/59, are you really having some lines with 59 ply here???)
28...e5 +0.03
28...Bf7 +0.14
28...a5 +0.41
28...Be8 +0.55
28...Rc7 +0.55
16/16 at 10.06.40 28...Rc6 now 10th at 0.66, like the eval I saw while kibitzing.
NON-MATCH delta = -0.63

29. Rxc6 bxc6
Forced. MATCH delta2 = +5 or so, so ignore
step at 10.09

30. b4 e5
14/14 already at 10:09.45 about.
30...Ke7 +0.23
30...e5 +0.23
30...Bf7 +0.28
30...f5, etc.
15/15 at 10.10.30, or was that 16/16, now showing 16/44 1162kN/s.
both 30...Ke7 and 30...e5 now at +0.20.
TIE delta = delta2 = 0.00.

31.
Bxa6 1/2-1/2

Scorecard: Moves 13--30, 18 moves total
11.5 MATCH (counting TIE as 1/2)
6.5 NON-MATCH
Danailov claimed 14 matches.

Here's a more-itemized breakdown, ordering the moves by "delta" and "delta2", with non-matches first.

NON-MATCHES:
28. -0.63
20. -0.28
17. -0.25
14. -0.22
18. -0.04
22. -0.02

30. 0.00 TIE

MATCHES:
16. 0.01
13. 0.07
15. 0.15
19. 0.17
21. 0.20
23. 0.25
27. 0.82
24. 0.84
26. 0.87
25. 1.98
29. 5.00 or so.

I see where Danailov gets his "14"---the deltas of -0.04 and -0.02 might have shown up as +es on his run of Fritz. But I think also 5 of the matches have to be thrown out, as they were 0.82 or more better than the next-best move, and by a chess master would be regarded as "forced" or at least "clearly expected". This leaves really 6 matches, 6 non-matches, and 1 tie among the remaining moves.

*NOW*, what's needed is some notion of how many matches here should be according to chance. Some illustrations:

() If there were only TWO moves to give equal consideration in these 13 cases, then the results would be exactly the expected # of matches, i.e. half.

() Now as it happens, sometimes there were 3 or 4 or 5 (or more?) moves to which a master would give full consideration. If it were 3 equal choices each time, then expectation would be 4.33 matches. But the actual 6.5 is only about 1.3 standard deviations above that in the binomial distribution with p = 1/3, q = 2/3 (standard deviation about 1.70 for 13 trials), well short of significance. I.e. we really need a larger sample size, and...

() ...you can't assume the moves have equal weight. Rather, you have to use their "deltas" to the
   () top move and
   () played move
as an objective---or at least calculational---measure of "how equal" the consideration would be.

() Another scientific problem is that "delta/delta2" as presumed evidence for/against cheating is NOT monotone! I.e. higher negative and higher positive are both evidence of non-cheating. So the indicator function would have to be non-linear, with a peak in the middle.
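The equal-choice illustrations above can be checked numerically. This is a minimal Python sketch, assuming (as in the illustrations) 13 independent non-forced moves with every candidate equally likely; the function and variable names are illustrative and not part of the original log.

```python
import math


def binomial_summary(n, p, observed):
    """Mean, standard deviation and z-score of `observed` under Binomial(n, p)."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean, sd, (observed - mean) / sd


N_MOVES = 13    # assumption: the 13 moves left after discarding the 5 forced ones
OBSERVED = 6.5  # 6 matches plus the tie counted as 1/2

for choices in (2, 3):
    mean, sd, z = binomial_summary(N_MOVES, 1 / choices, OBSERVED)
    print(f"{choices} equal choices: expect {mean:.2f} matches, "
          f"sd {sd:.2f}, observed is {z:+.2f} sd from the mean")
```

With two equal choices the observed count sits exactly on the mean; with three it is a little over one standard deviation above it, which is nowhere near significant for a sample this small.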
There are *really difficult* modeling issues here:

(+) One could argue that the peak is at 0.00. By analogy with a standard ESP experiment where the subject has to say whether a given card is red or black, choosing the right color consistently is an anomaly. But in this case, there is a "right color". In the case of two or more moves tied with a 0.00 difference, consistently choosing "the computer's move" would be the same kind of anomaly. But, here "the computer's move" is UNDEFINED!

(+) One could argue that the peak is a difference of 0.01, since then "the computer's move" is defined. But would a chessplayer seeing a small difference really care? And there is variation between ply-depth values, *even depending on what Fritz might have in its hashtable if you stepped BACKWARDS to the current position*---as you might do if you investigated a line and then went back. This seems to me to render a difference less than 0.10 as negligible, really undefined.

(+) Chessically, I would guess the peak is about 0.20 to 0.25. I.e. consistently playing a move that is 1/4-of a pawn better than the next-best move would be the midpoint evidence of cheating. Once you get to a half-pawn difference, the downslope that the move played was indicated starts to take over.

In the range 0.15 to 0.30, Kramnik played the top move 4 times and didn't 3 times. Too small a sample size---looks pretty random.

Interestingly, the "close moves" were all in sequence, 13--23, except for Move 30---which shouldn't count as it kinda didn't matter...

Going back to my logging of to-me-reasonable alternatives:
16. 0.01 10
13. 0.07 2 (or 3 if ...Rg8 TN could come then, but this was prep IMHO)
15. 0.15 2 or 3
19. 0.17 3 (really 1 IMHO, you don't leave a beast on b6)
21. 0.20 3 (but on the server we felt 1)
23. 0.25 4 (but 23...Bc5 was consequential, no?)
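As a cross-check, the tallies quoted so far (the 11.5 to 6.5 scorecard, the 5 matches discarded as forced, and the 4-versus-3 split in the 0.15 to 0.30 band) can be re-derived mechanically from the itemized margins. A minimal Python sketch: the dictionary and function names are illustrative only, and the "5.00 or so" logged for move 29 is entered as 5.00.

```python
# Signed margins copied from the itemized breakdown. Negative: the played move
# trailed Fritz's first choice (delta). Positive: it led the runner-up (delta2).
# Move 29 is logged as "5.00 or so"; 5.00 is used here as a stand-in.
MARGINS = {
    28: -0.63, 20: -0.28, 17: -0.25, 14: -0.22, 18: -0.04, 22: -0.02,
    30: 0.00,
    16: 0.01, 13: 0.07, 15: 0.15, 19: 0.17, 21: 0.20, 23: 0.25,
    27: 0.82, 24: 0.84, 26: 0.87, 25: 1.98, 29: 5.00,
}
FORCED_THRESHOLD = 0.82  # margins this large are treated as forced moves


def scorecard(margins):
    """Return (match, non_match) totals, counting a tie as half of each."""
    values = list(margins.values())
    ties = sum(1 for d in values if d == 0)
    matches = sum(1 for d in values if d > 0)
    non_matches = sum(1 for d in values if d < 0)
    return matches + ties / 2, non_matches + ties / 2


def forced_moves(margins, threshold=FORCED_THRESHOLD):
    """Move numbers whose positive margin reaches the forced-move threshold."""
    return sorted(m for m, d in margins.items() if d >= threshold)


def band_split(margins, low, high):
    """Count (matches, non_matches) whose absolute margin lies in [low, high]."""
    in_band = [d for d in margins.values() if low <= abs(d) <= high]
    return sum(d > 0 for d in in_band), sum(d < 0 for d in in_band)


print(scorecard(MARGINS))               # (11.5, 6.5)
print(forced_moves(MARGINS))            # [24, 25, 26, 27, 29]
print(band_split(MARGINS, 0.15, 0.30))  # (4, 3)
```

Keeping the margins in one table makes it easy to try other band limits or forced-move thresholds without recounting by hand.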
So the only case with a choice among more than 2 prominent alternatives as judged ("objectively") by closeness in score and (subjectively) by "chess sense" is move 16. But even here, the previous move 15...Rc8 seems to intend 16...Bc5 as *the* followup---and it came within 0.02 of not being a MATCH anyway!

For each non-match move, we can count the # of moves ranked ahead of it, plus 1 for the move itself.
20. -0.28 5
17. -0.25 4
14. -0.22 5 (but prepared?)
18. -0.04 2
22. -0.02 2
Not much to conclude from this. There's also the "BRAVE MOVE" 28...Rc6! (-0.63 worse than top move), but it is explained readily in practical, grandmasterly terms, so no "brownie points" for playing it---?

OK, it took me 2 hours and 20 minutes to generate the whole test and this detailed report. This is certainly feasible for others to do, with the longer games. More testing is needed to make statistically significant conclusions...but there's one more factor to evaluate: the "GIGO" factor.

For a conclusion, IMHO the MOST TELLING stat is my analysis of the choices available at the 6 moves that give a MATCH with delta2 < 0.82. In real-time chess terms they were more restricted than even the Fritz scores seem to indicate. So, based on the claim for this game alone, I am NOT IMPRESSED. I.e., high "Garbage-In..." factor.

---Dr. Kenneth W. Regan, Buffalo (Amherst) New York, 10/4/2006.