Identifying Cheating in Small-N Situations

The New York Times has an article about the research of University of Buffalo’s Kenneth Regan to identify cheating in chess by using software. The article is okay, but the research is interesting and led me to a few thoughts:

* Chess is a difficult activity in which to detect cheating. Besides the fact that there is no purely experimental way to approach the issue, much like there is no purely experimental way to approach most social science issues, chess is played under a small-N and highly-malleable path. The former is because of the small number of moves per game and the small number of games per match and small number of matches per tournament and the small number of tournaments during any period/life of a given chess player. The latter factor is because a given chess player’s performance is determined by the her opponent’s performance.

* Using chess software to identify the sets of best moves strikes me as problematic, as the software–at least in the non-brute force category–makes use of historical data to suggest best moves. Furthermore, chess players use chess software to train. Each of these two connections form a feedback loop and creates methodological issues regarding the non-independence between the player being assessed and the software being used to do the assessment.

* The article lacks serious discussion of how Regan uses the software and I did not read his publications on the matter, but his caveat emptor page provides a quick and somewhat lay-person’s account of what he is doing how. The short version is that individual moves are compared against what a computer would recommend, and flags are raised when moves within the set of best moves but not the best move itself are used. The more flags are raised as a percentage of the overall moves, the more likely it is that the player is cheating. This is an oversimplification, but makes the point.

While making it clear that this is cool and awesome, and Regan deserves more respect than I can provide given my knowledge and capabilities, I wonder how this approach compares against a historically-grounded pattern-based analysis. By historically-grounded, I mean that the player’s past moves are weighted in the assessment. To be clear, the entire universe of past chess-move data is needed given the small-n character of the entire situation, but weighting a player’s moves provides a way to compare move selection in a given situation (e.g., game, set, match, period in life) against that same player’s moves in the past. If a player moves significantly different than how she moved in the past, then this may be a sign of cheating (it could also be a sign of other things, such as throwing the opponent for a loop).

Second, and connected to the last part of the previous point, by including a pattern-based analysis, comparisons of patterns of move can be compared to past play and overall patterns. An unusual move that is part of a pattern may trigger a flag in Regan’s approach, but may be entirely consistent with the player’s knowledge, training, experience, and approach (i.e., not cheating). Thus, pattern-based assessments–while making a small-n problem worse–adds a degree of robustness that Regan’s model may not handle (again, I am almost entirely ignorant with how he handles these issues).

* On the surface, this research is highly specialized and too mathematically consumed to be of much direct use. But I found it a useful starting point to think about cheating and pattern-identification and/or coordination-identification exercises. For example, how do you know when two groups (e.g., terrorist groups, hacking groups, advocacy groups, etc.) are coordinating on a given action? Given the small number of terrorist attacks (or cyber attacks or advertisements), particularly by any given terrorist group (this is a methodological issue many people don’t like to talk about), identifying responsibility and whether there was coordination with another terrorist group is difficult. Regan’s approach, or variants of it, could certainly help.

As an end note, I like D. T. Max’s New Yorker article The Prince’s Gambit as a basic pleasant overview of chess software and its limitations.

Jason R. Koepke

The Digital Flagship Location

Identifying Cheating in Small-N Situations