Some data mining on the relation between Rookie 2.0's evaluation and the game result, expressed as game outcome (won, drawn, lost) or as rating delta. The data was scraped from the final evaluation of each move played by blik/Rookie in all logged rated games on FICS and ICC.

Name                Last modified      Size
eval-vs-rating.gif  05-Sep-2009 13:36  7.2K
eval-vs-result.gif  05-Sep-2009 13:29  9.3K
run                 05-Sep-2009 14:13  2.3K
run.out             05-Sep-2009 13:30  25K
In the first plot, the won and lost lines cross each other at an eval of approximately -1.0 pawns. The peak in drawn games is also around that point. This can be explained by the opponent pool: most opponents are weaker than Rookie 2.0, by the equivalent of about 1 pawn on average, so Rookie has to be roughly a pawn down before its winning and losing chances are equal.
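The kind of tabulation behind such a plot can be sketched as follows. This is only an illustration, not the script used here: the function name `outcome_fractions`, the bin width, and the sample data are all made up, and the actual format of the scraped data is not shown on this page.

```python
from collections import Counter, defaultdict

def outcome_fractions(samples, bin_width=0.5):
    """Bin (eval_in_pawns, result) pairs and return the share of each outcome per bin."""
    counts = defaultdict(Counter)
    for eval_pawns, result in samples:
        # Floor division picks the lower edge of the bin, also for negative evals.
        bin_start = (eval_pawns // bin_width) * bin_width
        counts[bin_start][result] += 1

    fractions = {}
    for bin_start in sorted(counts):
        total = sum(counts[bin_start].values())
        fractions[bin_start] = {
            outcome: counts[bin_start][outcome] / total
            for outcome in ("won", "drawn", "lost")
        }
    return fractions

# Hypothetical samples: (evaluation from Rookie's point of view, final game result).
samples = [
    (-1.2, "lost"), (-1.1, "won"), (-1.3, "drawn"), (-1.4, "drawn"),
    (0.3, "won"), (0.4, "won"), (0.1, "lost"), (0.2, "won"),
]

fractions = outcome_fractions(samples)
for bin_start, shares in fractions.items():
    print(bin_start, shares)
```

In this toy data the won and lost shares are equal in the bin starting at -1.5, which is the kind of crossing point the plot shows near -1.0.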
Plotting the evaluation vs. rating adjustment shows the crossing at approximately 0. This suggests that the rating system captures the strength difference quite accurately: it rewards weaker players for drawing against Rookie 2.0, and it penalizes Rookie more for a loss than it rewards it for a win.
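The asymmetry between wins and losses can be illustrated with the textbook Elo update. This is a sketch under assumptions: the ratings (2400 vs. 2200) and the K-factor of 32 are invented for the example, and the rating formulas actually used by FICS and ICC may differ from plain Elo.

```python
def expected_score(rating, opponent_rating):
    """Textbook Elo expected score for the player rated `rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))

def rating_delta(rating, opponent_rating, score, k=32):
    """Rating change for a game result: score is 1.0 (win), 0.5 (draw), or 0.0 (loss)."""
    return k * (score - expected_score(rating, opponent_rating))

# Hypothetical ratings: a stronger engine against a weaker opponent.
engine, opponent = 2400, 2200

win = rating_delta(engine, opponent, 1.0)
draw = rating_delta(engine, opponent, 0.5)
loss = rating_delta(engine, opponent, 0.0)

print(f"win: {win:+.2f}  draw: {draw:+.2f}  loss: {loss:+.2f}")
```

With these numbers the stronger player gains fewer points for a win than it loses for a loss, and even a draw costs it points, which means the weaker opponent gains from drawing.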
This data represents 14,660,393 evaluations taken from 534,097 games.