Measuring Fidelity to a Computer Agent

How closely do a person's actions agree with recommendations made by one or more advisors? Which advisor(s) are preferred, or is the person marching to his/her own drummer? Which agreements with particular recommendations are the most significant?

In chess, the advisor is a computer chess program, and not marching to one's own drummer in a competitive game is cheating. This has regrettably become a real problem even at the highest levels of our beautiful game. The recommendations are the program's evaluations of the possible moves, each stating which side would be ahead and by how many hundredths of a Pawn.

The main statistical principle, which these pages show the chess world has misunderstood, is that a move given a clear standout evaluation by a program is much more likely to be found by a strong human player. A match to any engine on such a move is therefore much less statistically significant than a match on a move given slight but sure preference over many close alternatives. NEW, 2/23/09: The mention of Rybka in GM Mamedyarov's protest letter is evidently this kind of misunderstanding, as demonstrated here (data viewable here).
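The contrast can be made concrete with a small sketch. It assumes, purely for illustration, that a player's chance of choosing each move follows a softmax of the engine evaluations in hundredths of a Pawn. The function names, the `scale_cp` parameter, and the sample evaluations are hypothetical; they are not the model or data used on these pages.

```python
import math


def move_probabilities(evals_cp, scale_cp=50.0):
    """Map evaluations (hundredths of a Pawn) to move probabilities.

    ASSUMPTION: the softmax form and the scale_cp value are illustrative
    stand-ins for a real prior over a non-colluding player's choices.
    """
    best = max(evals_cp)
    # Subtract the best evaluation first so the exponentials stay bounded.
    weights = [math.exp((e - best) / scale_cp) for e in evals_cp]
    total = sum(weights)
    return [w / total for w in weights]


def match_surprisal_bits(evals_cp, scale_cp=50.0):
    """Bits of surprise when the player chooses the engine's top move."""
    probs = move_probabilities(evals_cp, scale_cp)
    return -math.log2(max(probs))


# Hypothetical positions: one clear standout move, one set of close alternatives.
standout = [150, -40, -60, -90, -120]
close_call = [30, 25, 22, 20, 18]

print(match_surprisal_bits(standout))
print(match_surprisal_bits(close_call))
```

Under these assumed numbers the match on the standout move carries well under one bit of surprise, while the match among close alternatives carries over two bits, so the second match is the more significant one.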

The main scientific challenge is how to translate evaluations into prior probabilities that recommendations would be followed by (non-colluding!) players, or whether and how this issue can be skirted. Estimating "priors" is a fundamental problem in Bayesian statistics and scientific inference, but the chess case lacks "repeatability" of experiment and some numerical properties that promote easier success in other applications. Even simpler problems, such as the best choice of a distributional distance measure of (dis)agreement, of which (classical) fidelity is just one, are subjects of current debate in the professional literature.
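For reference, one common convention defines the classical fidelity of two probability distributions p and q over the same set of moves as the sum of sqrt(p_i * q_i), also known as the Bhattacharyya coefficient (some authors square it). It equals 1 when the distributions coincide and 0 when they never overlap. A minimal sketch follows; the function name and the two example distributions are illustrative assumptions, not data from these pages.

```python
import math


def classical_fidelity(p, q):
    """Sum of sqrt(p_i * q_i) over outcomes (unsquared convention)."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same outcomes")
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


# Hypothetical example: predicted move probabilities versus the
# observed frequencies of a player's choices over the same three moves.
predicted = [0.6, 0.3, 0.1]
observed = [0.5, 0.3, 0.2]

print(classical_fidelity(predicted, observed))
```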

Controversies and Tests

These pages provide what still (4/16/07) seems to be the only public and scientifically presented quantitative testing of cheating allegations that have rocked the chess world since summer 2006. Primary source links are given for allegations and their coverage, in these major cases tested so far:



Main Public-Service Implications

This site does both theory and experiment, and the experimental methodology must currently be both painstaking and flexible in order to be realistic for the alleged activities being modeled. Current status (4/16/07) is that the theory is in early stages, but the data on this site already seem to speak for themselves when gathered carefully and exhibited fully.

Kenneth W. Regan is an Associate Professor with tenure in the Department of Computer Science and Engineering, University at Buffalo (SUNY). He works in Computational Complexity Theory and other fields of Information Theory and (Pure) Mathematics that are relevant to this work. He also holds the title of International Master from the World Chess Federation (FIDE), and according to this list can claim to be the highest chess-rated active professional in these fields. GM Jonathan Mestel is in Applied Math :-). Regan's non-confidential assistants are named on relevant pages.