CSE712 Warmup Ideas to Program
These are not in any particular order. Most can be done entirely without
resorting to my own Perl scripts; the few that need them, or could conveniently
use them, are marked by a mention of AIFdata.pm or Position.pm, the
latter being more involved.
Addendum: I was just sent this link by someone in our department:
Graph the frequency of playing capture moves against the rating of the players,
from 1500 to 2750.
Combine that with the frequency with which the engine recommends a capture.
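A minimal sketch of the capture tally, assuming the played moves have already been pulled out of the files as (rating, SAN move) pairs. The function name `capture_rate_by_rating` and the 100-point bucket width are illustrative choices, not part of the course scripts. It relies only on the standard notation fact that captures contain an `x`.

```python
from collections import defaultdict

def capture_rate_by_rating(records, bucket=100):
    """records: iterable of (rating, san_move) pairs (assumed input format).
    Returns {bucket_floor: fraction of moves in that bucket that are captures}."""
    totals = defaultdict(int)
    captures = defaultdict(int)
    for rating, san in records:
        floor = (rating // bucket) * bucket
        totals[floor] += 1
        if "x" in san:  # standard notation marks captures with 'x', e.g. Nxe5, exd5
            captures[floor] += 1
    return {b: captures[b] / totals[b] for b in sorted(totals)}

sample = [(1510, "e4"), (1580, "exd5"), (2710, "Nf3"), (2740, "Qxd8+")]
rates = capture_rate_by_rating(sample)
```

The same function can be run on the engine's first-listed moves to get the comparison series for the graph.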
Compare the frequency of players playing moves with K, Q, R, B, N, or pawn (pawn moves
do not use a letter P), versus how often moves with each piece are recommended by the engine.
Do the last item for various rating levels and see if there is any
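The piece tally can be sketched as follows, assuming moves are available as SAN strings. The helper names `piece_of` and `piece_frequencies` are hypothetical, and treating castling as a king move is my assumption.

```python
from collections import Counter

def piece_of(san):
    """Return 'K', 'Q', 'R', 'B', 'N', or 'P' for a SAN move string."""
    if san.startswith("O-O"):  # assumption: count castling as a king move
        return "K"
    # Pawn moves carry no piece letter; file letters such as 'b' are lowercase.
    return san[0] if san[0] in "KQRBN" else "P"

def piece_frequencies(moves):
    """Fraction of moves made with each piece type."""
    counts = Counter(piece_of(m) for m in moves)
    total = sum(counts.values())
    return {p: counts[p] / total for p in "KQRBNP"}

freqs = piece_frequencies(["e4", "Nf3", "O-O", "bxc3"])
```

Running it once on the played moves and once on the engine's recommended moves, per rating level, gives the two distributions to compare.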
Compare the frequency of playing moves that are forward, sideways (for
Rook or Queen), and retreating for each rating level, and versus
how often the engine recommends such moves. Hypothesis: good retreating moves
are "harder to see", especially for lower-rated players. (This item is harder because
determining whether a move is forward or backward may require using
the Position.pm module to get "Long Algebraic Notation" for the moves,
but this can easily grow into a main project and even a paper when combined with
the methods in the Biswas-Regan 2013 paper I gave out.)
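Once a move is in long algebraic form, classifying its direction is a comparison of ranks. This sketch assumes the move string contains the from-square and the to-square in that order (for example `Ng1-f3` or `g1f3`); the function name `move_direction` is illustrative, and it does not use Position.pm.

```python
import re

def move_direction(lan, white_to_move):
    """Classify a long-algebraic move as 'forward', 'sideways', or 'retreat'."""
    squares = re.findall(r"[a-h][1-8]", lan)
    if len(squares) < 2:
        return "sideways"  # assumption: castling (no squares written) stays on its rank
    rank_change = int(squares[1][1]) - int(squares[0][1])
    if not white_to_move:
        rank_change = -rank_change  # Black advances toward rank 1
    if rank_change > 0:
        return "forward"
    if rank_change < 0:
        return "retreat"
    return "sideways"
```

For example, `move_direction("Qd4-d1", True)` is a retreat, while `move_direction("Ng8-f6", False)` is a forward move for Black.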
Gather a histogram of how many positions in the files have X number of legal moves.
Graph the frequency of having fewer than 10 legal moves against rating---this is
the one that was illustrated as a starting example in the seminar.
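Both tallies are short once the legal-move counts are in a list. This is a sketch under the assumption that the counts have already been extracted, one integer per position; the function names are hypothetical.

```python
from collections import Counter

def legal_move_histogram(counts):
    """Map each number of legal moves X to how many positions have it."""
    return dict(sorted(Counter(counts).items()))

def fraction_below(counts, limit=10):
    """Fraction of positions with fewer than `limit` legal moves."""
    return sum(c < limit for c in counts) / len(counts)

hist = legal_move_histogram([3, 20, 20, 35, 9])
low = fraction_below([3, 20, 20, 35, 9])
```

Calling `fraction_below` separately on the positions from each rating level gives the points for the graph.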
Graph the frequency of playing moves that give check versus rating---this is maybe
subsumed by the previous one.
Hypothesis: lower-rated players give check more often. (Determining a check
could use the Position.pm module, but the game notation in the [GameID] block
has + signs on checking moves that can be counted instead, and the "fewer than 10
legal moves" count may have much the same effect.)
Gather a histogram of how often the second-listed move is X worse than the first-listed one, where X falls into intervals 0--9, 10--19, 20--29, and so on (or in pawn units, 0.00--0.09 pawns, then 0.10--0.19, 0.20--0.29, and so on) (uses AIFdata.pm).
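The binning step can be sketched like this, assuming the evaluations of the first- and second-listed moves are already available as integer centipawns from the same point of view. How they are read from the files is left out, and `gap_histogram` is an illustrative name.

```python
from collections import Counter

def gap_histogram(pairs, width=10):
    """pairs: iterable of (first_eval, second_eval) in centipawns.
    Returns {bin_floor: count}, where bin 0 covers 0--9, bin 10 covers 10--19, etc."""
    hist = Counter()
    for first, second in pairs:
        gap = max(0, first - second)  # assumption: clamp any negative gap to 0
        hist[(gap // width) * width] += 1
    return dict(sorted(hist.items()))

bins = gap_histogram([(35, 30), (35, 20), (100, 81), (0, 0)])
```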
Gather a histogram of MM% at each numbered turn going 9, 10, 11, ... up to, say, 60. Or maybe better, group the tallies into blocks of 4, that is, turns 9--12, 13--16, 17--20, ..., 57--60.
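The grouping into blocks of 4 turns can be sketched as below. It assumes each record is a (turn number, matched) pair, where matched says whether the played move equals the engine's first-listed move; the record format and the name `mm_by_block` are assumptions.

```python
from collections import defaultdict

def mm_by_block(records, first=9, last=60, size=4):
    """Return {(start_turn, end_turn): match percentage} for blocks of `size` turns."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for turn, matched in records:
        if first <= turn <= last:
            start = first + size * ((turn - first) // size)
            totals[start] += 1
            hits[start] += bool(matched)
    return {(s, s + size - 1): 100.0 * hits[s] / totals[s] for s in sorted(totals)}

blocks = mm_by_block([(9, True), (10, False), (12, True), (13, True), (8, True), (61, False)])
```

Setting `size=1` gives the per-turn histogram instead.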
Gather a histogram of how often there is a change in best move at depth d (uses AIFdata.pm), for d = 1 to 20. (Can also be done without
AIFdata.pm by looking at the "ChangePVs" section of each
move record in the AIF file.)
Check whether the second digits of the 3-digit evaluation numbers obey Benford's Law (for which see my blog article
---since Benford's Law is a major statistical fraud-catching device this could really grow into a presentation.
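For the second digit, Benford's Law gives the expected probability of digit d as the sum over first digits k = 1..9 of log10(1 + 1/(10k + d)). The sketch below computes those expectations and a chi-square statistic against observed values. It assumes the evaluations are integers (centipawns) and takes "second digit" to mean the second significant digit; the function names are illustrative.

```python
import math
from collections import Counter

def benford_second_digit_probs():
    """Expected probability of each second digit 0..9 under Benford's Law."""
    return [sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))
            for d in range(10)]

def second_digit(value):
    """Second digit of an integer such as 123 or -47; None if it has only one digit."""
    digits = str(abs(int(value)))
    return int(digits[1]) if len(digits) >= 2 else None

def chi_square_vs_benford(values):
    """Chi-square statistic of observed second digits against Benford's Law."""
    digits = [d for d in map(second_digit, values) if d is not None]
    if not digits:
        raise ValueError("no values with at least two digits")
    observed = Counter(digits)
    expected = benford_second_digit_probs()
    n = len(digits)
    return sum((observed[d] - n * expected[d]) ** 2 / (n * expected[d])
               for d in range(10))

probs = benford_second_digit_probs()
```

The expected second-digit distribution is much flatter than the first-digit one (roughly 12% for 0 down to 8.5% for 9), so a large sample is needed before a deviation means anything.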
Does "Zipf's Law" hold in any form? It would say that the Nth-best move is played about 1/(2N) of the time.
Graph the frequency of moving the same piece twice, versus rating and versus the
number of times the engine recommends it.
Graph the number of times a player matches the engine 10 times in a row,
Same with moves that do not drop in value after the move is played. Note that this can
be done with the Eval and NextEval lines,
not needing AIFdata.pm to read entries from the matrix.
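Counting streaks works the same way for engine matches and for moves that hold their value. This is a sketch assuming one True/False flag per move of a single player, in game order. The name `count_runs` is hypothetical, and so is the assumption that Eval and NextEval are given from the same player's point of view.

```python
from itertools import groupby

def count_runs(flags, min_len=10):
    """Count maximal runs of consecutive True flags with length >= min_len."""
    return sum(1 for key, grp in groupby(map(bool, flags))
               if key and sum(1 for _ in grp) >= min_len)

def held_value_flags(eval_pairs):
    """eval_pairs: (Eval, NextEval) per move; True when the value did not drop."""
    return [next_eval >= cur_eval for cur_eval, next_eval in eval_pairs]

flags = [True] * 10 + [False] + [True] * 9 + [False] + [True] * 12
streaks = count_runs(flags)
```

Note that a run of 12 counts once here, not as three overlapping runs of 10; counting overlapping windows is an equally defensible definition.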
Do lower-rated players play with more disconnected pawns? This needs some way
to convert a FEN code into an 8x8 grid---the Position.pm module has such a
method where you could just grab the code textually and maybe convert it to Python.
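A minimal Python sketch of the FEN-to-grid conversion, written from the FEN format itself rather than translated from Position.pm. The `pawn_islands` helper is one possible reading of "disconnected pawns" (groups of adjacent files holding pawns of one color), which is my assumption.

```python
def fen_to_grid(fen):
    """Convert the board field of a FEN string to an 8x8 list of lists.
    Row 0 is rank 8; empty squares are '.'."""
    board = fen.split()[0]
    grid = []
    for rank in board.split("/"):
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend("." * int(ch))  # a digit means that many empty squares
            else:
                row.append(ch)
        grid.append(row)
    return grid

def pawn_islands(grid, pawn="P"):
    """Number of groups of adjacent files containing at least one `pawn`."""
    has_pawn = [any(row[f] == pawn for row in grid) for f in range(8)]
    islands, prev = 0, False
    for cur in has_pawn:
        if cur and not prev:
            islands += 1
        prev = cur
    return islands

start = fen_to_grid("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")
```

Use `pawn="P"` for White and `pawn="p"` for Black; more islands means a more broken-up pawn structure.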
Do lower-rated players have games with higher or lower
NodeCount-for-depth? As I derived on the board at the start (on Fri. 3/4),
to get a common measure based on depth 20,
take the NodeCount N at the final depth D and compute
C = N^(20/D).
This becomes a measure of the complexity of the position---at least
for the computer to analyze.
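One way to read the formula: if the node count grows roughly like b^D for some per-depth factor b, then b = N^(1/D) and the depth-20 equivalent is b^20 = N^(20/D). A small sketch follows, with function names that are illustrative only. Since C can become very large, the log10 version may be easier to graph.

```python
import math

def normalized_complexity(node_count, depth, target_depth=20):
    """C = N^(target_depth / D): node count rescaled to a common depth."""
    return node_count ** (target_depth / depth)

def log10_complexity(node_count, depth, target_depth=20):
    """log10 of C, which avoids huge numbers when D is well below 20."""
    return (target_depth / depth) * math.log10(node_count)

c = normalized_complexity(1024, 10)
```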
Graph the frequency of consecutive blunders against rating, say using
a drop in value of played move of 150 (that is, 1.50 pawns) or more as the
threshold of "blunder". Per Mike Wehar's query,
cases of 3 or more consecutive blunders may be
hints of a mistake in the recorded gamescore.
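Finding consecutive blunders can be sketched as below, assuming a list of per-move drops in centipawns in game order. Whether "consecutive" means one player's successive moves or successive plies depends on how that list is built. `blunder_runs` is an illustrative name.

```python
def blunder_runs(drops, threshold=150):
    """Return the lengths of maximal runs of moves whose drop is >= threshold."""
    runs, current = [], 0
    for drop in drops:
        if drop >= threshold:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs

runs = blunder_runs([0, 200, 160, 10, 150, 150, 300, 5, 400])
suspicious = [r for r in runs if r >= 3]  # possible errors in the recorded gamescore
```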
For something completely different, hunt for and find games in which the average
drop in value of a played move is high---say over 30 centipawns. Well, the
MRAIF.pl file already tabulates this for each game in the resulting
.sc3 or .r3 (etc.) report files, but you can write a much
shorter script to do this just via the Eval and NextEval entries.
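A short sketch of that shorter script's core, assuming the Eval and NextEval entries have already been read into (Eval, NextEval) pairs per game, in centipawns and from the moving player's point of view. The parsing step, the dictionary layout, and the function names are assumptions.

```python
def average_drop(eval_pairs):
    """Mean of Eval - NextEval over the moves of one game, in centipawns."""
    return sum(cur - nxt for cur, nxt in eval_pairs) / len(eval_pairs)

def high_drop_games(games, threshold=30):
    """games: {game_id: [(Eval, NextEval), ...]}. Returns ids whose average drop exceeds threshold."""
    return [gid for gid, pairs in games.items() if average_drop(pairs) > threshold]

games = {
    "g1": [(50, 20), (10, 10), (0, -90)],  # drops of 30, 0, 90
    "g2": [(10, 5)],
}
flagged = high_drop_games(games)
```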
It might be interesting to see if any of these stats vary with rating. The material
here depends only on the moves of the game, not on the engine's analysis of how
good and bad certain moves are. And ah! it links to
which overlaps some of the ideas I thought up above---and hints that we may get some
interesting positive results from them after all.