## Thursday, January 29, 2015

### Is it possible to emperically infer the value of each chess piece based on an analysis of it's participation in many winning/losing/drawing games?

Classical chess theory recommends relative values for each chess piece - 1 for pawns, 3 for knights and bishops etc. Unfortunately however, these values are based on the opinion of experts (chess grandmasters). One way to infer "true" values for pieces rather than values based on "expert opinion" is to statistically study and "average" the contribution of each piece to win/loss over many games. A practical way to do this is to train a machine learning model with board positions as features and the result of the game that the position came from (win/loss/draw) as target. We can then query the model for the importance of each piece. What follows is an attempt to do this using python sklearn.

### Data

The raw data for this effort comes from scraping data from here, (processed and mirrored here). A total of 84302 games found in this set were used for the analysis.

### Preprocessing

The following processing steps were performed for each game (processed data ready for use here):

1. Each game was parsed to get a sequence of board positions, one for each move

2. For each board position, crafty (the chess engine) was used to calculate a score

3. The final, fully processed, dataframe looks like this:

```import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np

sep=' ',
compression='gzip')
```
r n b q k b.1 n.1 r.1 p p.1 ... K B.1 N.1 R.1 turn count w_win b_win draw score
0 0 0 0 0 0 0 0 0 0 0 ... 38 0 0 0 0 1 0 0 1 0.02
1 0 0 0 0 0 0 0 0 0 0 ... 12 0 0 0 0 1 1 0 0 24.24
2 0 0 0 0 0 0 0 0 0 0 ... 37 0 0 0 0 1 0 1 0 -12.68
3 0 0 0 0 0 0 0 0 0 0 ... 37 0 0 0 1 1 0 1 0 -12.84
4 0 0 0 0 0 0 0 0 0 0 ... 30 50 0 0 1 1 0 0 1 1.57

5 rows — 38 columns

```chess_df.shape
```
```(4674871, 38)
```

The dataframe format

Columns:

1. Column names are chess pieces Lower case for black pieces (e.g., n = black knight), upper case for white (R = white rook) When there is more than one piece, each piece is numbered (e.g., p.6 is the 7th black pawn, B.1 is the second white bishop)

2. 'turn' = who's turn is it to move (0=white, 1=black)

3. 'count' = number of times this board position was encountered in the dataset

4. 'w_win' = number of times white won when this board was encountered

5. 'b_win' = number of times black won when this board was encountered

6. 'draw' = number of draws when this board was encountered 7. 'score' = the score crafty assigned to this board

Rows:

1. Each row represents one board position along with the extra data noted above

Cells:

1. The cell value is the number of the chess board square on which that piece is found. The squares are numbered from the top left (a8 = 0) to the bottom right (h1 = 63) as a linear array (see figure below)

For example

``chess_df.loc[0, 'K']  # returns 38.0``

i.e., in board number 0, the white king ('K') is on square 38 (g4)

The board numbering is shown in the figure below

```# Code for the chess board modified from
# http://stackoverflow.com/questions/10194482/custom-matplotlib-plot-chess-board-like-table-with-colored-cells

def checkerboard_table(data, fmt='{:.2f}', bkg_colors=['grey', 'white']):
fig, ax = plt.subplots()
ax.set_axis_off()
tb = Table(ax, bbox=[0,0,1,1])

nrows, ncols = data.shape
width, height = 1.0/ncols, 1.0/nrows

for (i,j), val in np.ndenumerate(data):
# Index either the first or second item of bkg_colors based on
# a checker board pattern
idx = [j % 2, (j + 1) % 2][i % 2]
color = bkg_colors[idx]

loc='center', facecolor=color)

# Row Labels...
for i, label in enumerate(data.index):
tb.add_cell(i, -1, width, height, text=(8-label), loc='right',
edgecolor='none', facecolor='none')
# Column Labels...
for j, label in enumerate(data.columns):
tb.add_cell(-1, j, width, height/2, text=label, loc='center',
edgecolor='none', facecolor='none')
return fig

data = pd.DataFrame(np.arange(0,64).reshape(8,8),
columns=['a','b','b','d','e','f','g','h'])
checkerboard_table(data)
plt.show()
```

### Training a decision tree regression model on board score

We will first train a decision tree model. To start with, we will train it on the board evaluation 'score' provided by crafty, rather than win/loss.

```from sklearn.tree import DecisionTreeRegressor

features = chess_df.columns[:33]   # the 32 pieces and 'turn'
target = 'score'      # the board 'score' provided by crafty
model1 = DecisionTreeRegressor(max_depth=70)
model1.fit(chess_df[features], chess_df[target])
model1.score(chess_df[features], chess_df[target])
```
```0.9999812089822121
```

The model fits the data well; actually overfits the data (R^2 of 0.999). For our purposes though, overfitting is not a concern. We only want to infer variable importance from this data. If generalization is necessary, we could tune this model, guided by cross-validation.

Now, we retrieve feature importance, i.e., the importance of the various pieces, by accessing the 'feature_importances_' attribute of the model (scaled to the lowest value piece):

```pd.DataFrame(zip(features, model1.feature_importances_/model1.feature_importances_[8]))
```
0 1
0 r 1.980954
1 n 2.786199
2 b 2.234911
3 q 3.823624
4 k 1.388084
5 b.1 1.796774
6 n.1 1.742865
7 r.1 3.522390
8 p 1.000000
9 p.1 1.406889
10 p.2 1.038084
11 p.3 1.377806
12 p.4 1.297784
13 p.5 0.858814
14 p.6 0.454960
15 p.7 0.104892
16 P 1.811296
17 P.1 1.142458
18 P.2 1.101726
19 P.3 0.951071
20 P.4 0.873358
21 P.5 0.692127
22 P.6 0.359597
23 P.7 0.059979
24 R 2.625228
25 N 2.832997
26 B 2.432010
27 Q 7.343026
28 K 1.257644
29 B.1 1.066833
30 N.1 1.279959
31 R.1 1.811157
32 turn 1.937530

The average scores for the pawns is less than the pieces, which is less than the queen.

The average importance of the black pieces is less then the white pieces.

Overall, the actual values are in line with what classical chess theory predicts.

### Training a decision tree classification model on 'win/loss'

Note that the previous model was trained on board evaluation scores provided by crafty, a chess engine which uses piece weights in calculating the board evaluation score. It has not trained on win/loss (the '"truth"). The results obtained are probably a reflection of this fact. So, this time, we shall train the model on the ground truth - using "white wins" (column 'w_win' in our dataframe).

We will first process the w_win column by converting it to "white wins as a fraction of total", then binarizing the result - if a board was won by white over 50% of the time, label it as a win for white. Then, we will train a decision tree classifier.

```chess_df['white_win_binary'] = chess_df['w_win']/chess_df['count'] > 0.5

from sklearn.tree import DecisionTreeClassifier

features = chess_df.columns[:33]   # the 32 pieces and 'turn'
target = 'white_win_binary'
model2 = DecisionTreeClassifier(max_depth=70)
model2.fit(chess_df[features], chess_df[target])
model2.score(chess_df[features], chess_df[target])
```
```0.99999764699389571
```
```predicted_classes = model2.predict(chess_df[features])
pd.crosstab(chess_df[target], predicted_classes, rownames=['White win'], colnames=['Predicted'])
```
Predicted False True
White win
False 2989035 2
True 9 1685825

This model overfits too. The 'feature_importance_' values for the pieces, however, are completely different.

```pd.DataFrame(zip(features, model2.feature_importances_/np.mean(model2.feature_importances_[17:23])))
```
0 1
0 r 1.183964
1 n 1.443030
2 b 1.517433
3 q 1.198660
4 k 1.164395
5 b.1 0.812256
6 n.1 0.662847
7 r.1 0.982471
8 p 1.241384
9 p.1 1.256570
10 p.2 1.254353
11 p.3 1.320920
12 p.4 1.191160
13 p.5 1.091353
14 p.6 0.727144
15 p.7 0.218774
16 P 1.882447
17 P.1 1.601374
18 P.2 1.346524
19 P.3 1.148129
20 P.4 0.929727
21 P.5 0.627370
22 P.6 0.346875
23 P.7 0.117512
24 R 1.511778
25 N 1.339097
26 B 1.590355
27 Q 1.255038
28 K 1.109622
29 B.1 0.780831
30 N.1 0.654167
31 R.1 0.729700
32 turn 0.069270

These are values aggregated over many different chess boards. Many of the boards in the dataset are only seen once in the dataset. Not all pieces are present in all the boards.

```# Fraction of chess boards in which each of the pieces are present
for i in features:
print i, "\t:\t",float((chess_df[i] > 0).sum())/float(chess_df.shape[0])
```
```r  : 0.527911251455
n  : 0.675296708722
b  : 0.731863831109
q  : 0.638411626759
k  : 0.997306449739
b.1  : 0.324361891483
n.1  : 0.269364865897
r.1  : 0.613436606058
p  : 0.991219650767
p.1  : 0.9688089789
p.2  : 0.925571422185
p.3  : 0.853068459001
p.4  : 0.739480297959
p.5  : 0.571836527682
p.6  : 0.322804415352
p.7  : 0.0843967673119
P  : 0.992084273555
P.1  : 0.971450763026
P.2  : 0.928900711913
P.3  : 0.858552888411
P.4  : 0.74619000182
P.5  : 0.576605215417
P.6  : 0.322984313364
P.7  : 0.0817641813004
R  : 0.874121446346
N  : 0.663003535285
B  : 0.736831026995
Q  : 0.644438531031
K  : 0.939267842899
B.1  : 0.337640546659
N.1  : 0.257087307864
R.1  : 0.48930633594
turn  : 0.499698708264

```

### Conclusions

Chess engines (crafty) assign values for the various pieces in line with classical chess theory. However, when peices are evaluated based on win/loss, the valuations are dramatically different from classical chess theory. One explaination for this is that piece values are dynamic and depend on other factors, such as the stage of the game, tactical opportunities, etc. It would be interesting to anayze this dataset further, stratifying it by the number of pieces, location of some pieces, different piece combinations (e.g., Q vs. r & r.1) etc.

#### 1 comment:

1. This comment has been removed by a blog administrator.