Suppose I have a completed game. I don't have the players' Elo ratings. My goal is to evaluate a player's performance in the game based solely on his moves. Can this be done automatically using a chess program?
The result can be his approximate Elo rating, or just some value indicating his strength or error rate.
If it helps, a database of the player's games can be given. Again, with no Elo ratings.
My motivation is simple. I play chess over the internet and would like to automatically track my progress, based on the games themselves, not on my rating on the sites. I'm at an upper-beginner level.
A simple solution is to annotate the game using any computer engine and track the number of ?!, ? and ?? marks. However, it's not very accurate, and I'd like to get more ideas :)
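To make the mark-counting idea concrete, here is a minimal Python sketch. It assumes you have already exported one engine evaluation per position, in centipawns from White's point of view. The function name `annotate` and the thresholds (50, 100 and 300 centipawns) are my own assumptions for illustration; real engines and GUIs use different cut-offs.

```python
from collections import Counter

# Assumed thresholds in centipawns, checked from worst to mildest.
# These are illustrative values, not a standard.
THRESHOLDS = [(300, "??"), (100, "?"), (50, "?!")]

def annotate(evals, player="white"):
    """Count ?!, ? and ?? marks for one player.

    evals[0] is the evaluation of the starting position and evals[i] is
    the evaluation after ply i, in centipawns from White's point of view.
    """
    counts = Counter()
    sign = 1 if player == "white" else -1
    first_ply = 1 if player == "white" else 2
    for ply in range(first_ply, len(evals), 2):
        # How much the evaluation dropped from the mover's perspective.
        loss = sign * (evals[ply - 1] - evals[ply])
        for limit, mark in THRESHOLDS:
            if loss >= limit:
                counts[mark] += 1
                break
    return counts

evals = [20, 10, 15, -100, -90, -95, 300]
print(annotate(evals, "white"))
print(annotate(evals, "black"))
```

Tracking these counts per game over a few months gives a rough trend, with the caveat already mentioned: it mostly measures tactical slips.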
Site ratings at slow time controls can be quite reliable on servers where strong players congregate (ICC and FICS, to name a few), as the ratings very closely reflect your true playing strength once you've played enough games. For standardized rating systems such as USCF and FIDE (Elo), you will notice that the different rating classes tend to correspond to the types of mistakes those players are still making. NM Dan Heisman's The Improving Chess Thinker does an excellent job of discussing the types of errors players make across the rating classes.
Have you tried the many self-test books out there? Igor Khmelnitsky's Chess Rating Exam and Danny Kopec's Test, Evaluate and Improve Your Chess are excellent books that allow you to track your progress by seeing how you perform against graded test positions.
Your compare-my-moves-with-an-engine approach is another way to do this, but once again, the ??/? marks really only indicate tactical errors, not the strategic, positional, or even behavioral or time-management mistakes you might be making.
That's why playing slow time-control OTB/online games against equal-to-stronger opposition and getting them reviewed + critiqued by stronger players is an efficient way to improve. Your mistakes in every category (tactics, knowledge, thought process, time management etc.) get highlighted and you can simply measure progress in terms of the mistakes you've stopped making.
One fun variant you can try with an engine at home: why not extend your engine-evaluation method to visually observe a player's quality/performance via evaluation graphs? In other words, take the engine's evaluation score after each move and plot the scores over the course of the game (some free software, such as SCID, does this for you).
For example: Two rank beginners would have a game that looks like:
Notice how jagged these are. Both sides make many terrible mistakes (look at the slopes of the spikes!), and notice how often they fail to exploit the other side's terrible mistakes.
The spikes are always fun to look at:
Two intermediate (USCF 1400-1600) players might have games that look like:
It does look jagged, but notice how the scale of the y-axis (engine evaluation) is much smaller, indicating that these players are more seasoned and play higher-quality chess than the novices.
For a final comparison, a 1911 Grandmaster game would look like this:
No comments necessary here :) These guys really don't make many mistakes, do they?
If you could devise your own heuristic for mapping the slopes + scale of an evaluation graph to player skill/performance, perhaps this is one way to go? :)
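As a starting point for such a heuristic, here is a rough Python sketch that boils an evaluation graph down to two numbers per game: the "scale" (the largest absolute evaluation reached) and each side's average loss per move (the downward "slopes"). The function name `graph_stats` and the 1000-centipawn cap are assumptions of mine, chosen so that mate scores do not swamp everything else. Turning these numbers into a rating estimate would still need calibration against games from players of known strength.

```python
def graph_stats(evals, cap=1000):
    """Summarise an evaluation graph.

    evals[0] is the starting position and evals[i] the position after
    ply i, in centipawns from White's point of view. Values are clipped
    to [-cap, cap]; the cap is an assumed value, not a standard.
    """
    clipped = [max(-cap, min(cap, e)) for e in evals]
    losses = {"white": [], "black": []}
    for ply in range(1, len(clipped)):
        delta = clipped[ply] - clipped[ply - 1]
        if ply % 2 == 1:
            # White moved: a falling evaluation is White's loss.
            losses["white"].append(max(0, -delta))
        else:
            # Black moved: a rising evaluation is Black's loss.
            losses["black"].append(max(0, delta))

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return {
        "scale": max(abs(e) for e in clipped),
        "avg_loss_white": mean(losses["white"]),
        "avg_loss_black": mean(losses["black"]),
    }

print(graph_stats([0, -50, 50, 50, 250]))
```

On the graphs above, the beginners' game would show a large scale and large average losses for both sides, while the grandmaster game would show small values for both.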