Atomic Piece Values, Again

Mon 31 May 2021 by Steven E. Pav

In a previous blog post I used logistic regression to estimate the values of pieces in Atomic chess. In that study I computed material differences between the two players using a snapshot 8 plies before the end of the match. (A "ply" is a move by a single player.) That choice of snapshot was arbitrary, but it is typically late enough in the match so there is some material difference to measure, and also near enough to the end to estimate the "power" of each piece to bring victory. However, this valuation is rather late in the game, and is probably not representative of the average value of the pieces. That is, a knight advantage early in the game could be parlayed into a queen advantage later, which could then prove decisive.

To fix that issue, I will re-perform that analysis on other snapshots. Recall that I am working from 9 million rated Atomic games that I downloaded from Lichess. For each match I selected a pseudo-random ply, chosen uniformly from those after the second and before the last ply of the game. (There is no material difference before the third ply.) I also selected pseudo-random snapshots in the first third, the second third, and the last third of each match. I computed the difference in material as well as differences in passed pawn counts for each snapshot. You can download v2 of the data, and the code.
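The snapshot selection can be sketched as follows. This is a minimal illustration, not the code used for the study: the function name `sample_snapshot_plies` is hypothetical, and the way the thirds are cut (an even integer split of plies 3 through the second-to-last) is my assumption.

```python
import random

def sample_snapshot_plies(n_plies, rng):
    """Pick snapshot plies for one game with n_plies plies (1-indexed).

    Assumption: "after the second and before the last ply" means
    plies 3 .. n_plies - 1 inclusive, and the thirds split that
    range as evenly as integer arithmetic allows.
    """
    lo, hi = 3, n_plies - 1
    count = hi - lo + 1
    if count < 3:
        raise ValueError("game too short to split into thirds")
    cut1 = lo + count // 3          # first ply of the second third
    cut2 = lo + (2 * count) // 3    # first ply of the last third
    # random.Random.randint includes both endpoints.
    return {
        "random": rng.randint(lo, hi),
        "first third": rng.randint(lo, cut1 - 1),
        "second third": rng.randint(cut1, cut2 - 1),
        "last third": rng.randint(cut2, hi),
    }

rng = random.Random(2021)  # fixed seed so the draw is reproducible
print(sample_snapshot_plies(40, rng))
```

Seeding the generator makes the pseudo-random snapshots repeatable, which matters if the material differences are recomputed later.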

Recall that I am using logistic regression to estimate coefficients in the model

$$ \operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\left[\Delta e + c_P \Delta P + c_K \Delta K + c_B \Delta B + c_R \Delta R + c_Q \Delta Q \right], $$

where $\Delta e$ is the difference in Elo, and $\Delta P, \Delta K, \Delta B, \Delta R, \Delta Q$ are the differences in pawn, knight, bishop, rook and queen counts. Here $p$ is the probability that White wins the match. By putting the weird constant $\operatorname{log}(10)/400$ in front of the expression, the constants $c_P, c_K$ etc. are denominated in Elo equivalent units. Similarly, I fit a model with terms for passed pawn counts in various ranks.
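To see why the constant puts the coefficients in Elo units, here is a small sketch of the link function. The names `ELO_SCALE`, `white_win_probability` and `to_elo_units` are mine, introduced only for illustration.

```python
import math

# The constant in front of the bracketed sum in the model.
ELO_SCALE = math.log(10) / 400.0

def white_win_probability(elo_equivalent_advantage):
    """Invert the logit; same as 1 / (1 + 10 ** (-advantage / 400))."""
    return 1.0 / (1.0 + math.exp(-ELO_SCALE * elo_equivalent_advantage))

def to_elo_units(raw_coefficient):
    """Convert a coefficient fit on the plain log-odds scale to Elo units."""
    return raw_coefficient / ELO_SCALE

# A 400 point Elo-equivalent advantage corresponds to 10:1 odds.
p = white_win_probability(400.0)
print(p / (1.0 - p))
```

In other words, an ordinary logistic regression can be fit on the raw differences, and the fitted coefficients divided by $\operatorname{log}(10)/400$ afterwards.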

As previously, I subselect to games where each player has already recorded at least 50 games in the database, where both players have pre-game Elo at least 1500, and games which are at least 10 ply in length.

Here is a table of the estimated coefficients for the four regressions, with coefficients and standard errors in Elo equivalents, as well as Wald statistics. The p-values all underflow to zero. The intercept term, which is not written out in the equation above, can be interpreted as White's tempo advantage.

snapshot	term	Estimate	Std.Error	Statistic
random	Elo	0.957	0.001	665.0
random	White Tempo	53.136	0.281	188.9
random	Pawn	31.301	0.329	95.0
random	Knight	47.058	0.471	99.9
random	Bishop	57.368	0.484	118.4
random	Rook	105.767	0.718	147.4
random	Queen	244.058	0.890	274.2
first third	Elo	1.006	0.001	710.2
first third	White Tempo	62.620	0.270	232.1
first third	Pawn	58.935	0.697	84.6
first third	Knight	83.380	0.878	95.0
first third	Bishop	76.827	0.986	77.9
first third	Rook	80.067	1.754	45.7
first third	Queen	148.910	1.768	84.2
second third	Elo	0.955	0.001	666.7
second third	White Tempo	50.868	0.283	179.5
second third	Pawn	38.063	0.322	118.3
second third	Knight	57.589	0.449	128.2
second third	Bishop	59.246	0.462	128.1
second third	Rook	107.305	0.742	144.7
second third	Queen	210.759	0.917	229.8
last third	Elo	0.915	0.001	619.6
last third	White Tempo	45.537	0.292	155.9
last third	Pawn	26.008	0.264	98.5
last third	Knight	32.499	0.397	81.8
last third	Bishop	55.957	0.396	141.2
last third	Rook	109.550	0.545	201.2
last third	Queen	283.669	0.716	396.1
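As a worked example of reading the table, the sketch below plugs the random-snapshot estimates into the model to get a win probability for White. The function names are hypothetical, and I treat the intercept ("White Tempo") and the fitted Elo coefficient as entering the sum in the obvious way, which is my reading of the table rather than something spelled out in the text.

```python
# Random-snapshot estimates copied from the table, in Elo equivalents.
RANDOM_SNAPSHOT = {
    "Elo": 0.957,
    "White Tempo": 53.136,
    "Pawn": 31.301,
    "Knight": 47.058,
    "Bishop": 57.368,
    "Rook": 105.767,
    "Queen": 244.058,
}

def elo_equivalent(delta_elo, material_diff, coef=RANDOM_SNAPSHOT):
    """Total advantage for White in Elo-equivalent units.

    material_diff maps piece name to (White count - Black count).
    """
    total = coef["White Tempo"] + coef["Elo"] * delta_elo
    for piece, diff in material_diff.items():
        total += coef[piece] * diff
    return total

def white_win_probability(delta_elo, material_diff, coef=RANDOM_SNAPSHOT):
    advantage = elo_equivalent(delta_elo, material_diff, coef)
    return 1.0 / (1.0 + 10.0 ** (-advantage / 400.0))

# Equal ratings, White is up a queen but down a rook.
print(elo_equivalent(0.0, {"Queen": 1, "Rook": -1}))
print(white_win_probability(0.0, {"Queen": 1, "Rook": -1}))
```

Swapping in the coefficients for another snapshot only requires passing a different `coef` dictionary.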

The data are hard to digest in table form, so below I plot the coefficients for the four different snapshots, with standard error bars. We see that a queen is generally worth around 150-280 Elo points, a rook around 100 (though somewhat less early in the match), a bishop around 60 (more early in the match), a knight 30-80, and a pawn 25-60. As one progresses along the match (from first to second to last third), the queen and rook gain value, while the bishop, knight and pawn lose value.

plot of chunk plot_est_one

The top axis is denominated in 'pawn' units, where I eyeballed a pawn as worth around 30 Elo, but this is so variable over the different match snapshots that it is hard to quote a consistent valuation scheme in pawn units. This is in contrast with the previous blog post, where we suggested a 1:2.5:4:8:22 valuation for pawn, knight, bishop, rook, queen; that scheme is only appropriate for the very end of the match (and it can be hard to tell you are at the end of the match while playing). Below I denominate piece values relative to the estimated pawn values. (I drop the error bars because I am too lazy to code up the delta method.) For the random snapshot, the piece values in pawn units are as follows:

Piece	Pawn value
Knight	1.5
Bishop	1.8
Rook	3.4
Queen	7.8
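The pawn-unit values are just ratios of the random-snapshot estimates. For the curious, the sketch below also shows what a delta-method standard error for such a ratio would look like. It is only a rough illustration: the function `pawn_units` is hypothetical, and the covariance between the two coefficient estimates defaults to zero, which is an assumption and almost certainly not true of the fitted model.

```python
import math

# Random-snapshot estimates and standard errors, in Elo equivalents.
RANDOM_SNAPSHOT = {
    "Pawn": (31.301, 0.329),
    "Knight": (47.058, 0.471),
    "Bishop": (57.368, 0.484),
    "Rook": (105.767, 0.718),
    "Queen": (244.058, 0.890),
}

def pawn_units(piece, table=RANDOM_SNAPSHOT, cov=0.0):
    """Ratio of a piece value to the pawn value, with a delta-method s.e.

    Var(a/b) ~ (a/b)^2 * [Var(a)/a^2 + Var(b)/b^2 - 2 Cov(a,b)/(a b)].
    cov is the covariance of the two estimates; zero is an ASSUMPTION.
    """
    est, se = table[piece]
    pawn_est, pawn_se = table["Pawn"]
    ratio = est / pawn_est
    rel_var = ((se / est) ** 2 + (pawn_se / pawn_est) ** 2
               - 2.0 * cov / (est * pawn_est))
    return ratio, abs(ratio) * math.sqrt(rel_var)

for piece in ("Knight", "Bishop", "Rook", "Queen"):
    ratio, se = pawn_units(piece)
    print(f"{piece}: {ratio:.1f} pawns (approx. s.e. {se:.2f})")
```

A proper version would pull the full covariance matrix of the coefficient estimates from the fitted regression instead of assuming independence.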

plot of chunk plot_est_one_b

As in the previous blog post, I ran the regressions again with filters for minimum Elo. The reasoning is that better players will exhibit higher quality play, rather than average play. Here are plots by minimum Elo, snapshot and piece. We see that better players are better able to capitalize on bishops and perhaps knights, but otherwise the valuations are largely consistent across player ability.

plot of chunk plot_est_two

Passed pawns

As in the previous blog post, I computed the difference in counts of passed pawns. I classified passed pawns as belonging to ranks 2, 3 or 4, to rank 5, to rank 6, or to rank 7; any pawn on rank 7 is automatically a passed pawn. When computing the material difference from White's point of view, the ranks are mirror image for Black in the obvious way. Again, I fit a model of the form

$$ \operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\left[\Delta e + c_{234} \Delta PP_{234} + c_{5} \Delta PP_{5} + c_{6} \Delta PP_{6} + c_{7} \Delta PP_{7} \right]. $$
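The passed pawn counts that feed this model can be sketched as below. This is an illustration under stated assumptions, not the code used for the study: I use the usual definition of a passed pawn (no opposing pawn ahead of it on the same or an adjacent file), pawns are given as `(file, rank)` pairs with files 0-7 and ranks 1-8 from White's point of view, and all function names are hypothetical.

```python
def passed_pawn_buckets(own, opp):
    """Count passed pawns by rank bucket for the side moving up the board.

    own, opp: sets of (file, rank) with ranks seen from this side's view.
    """
    counts = {"234": 0, "5": 0, "6": 0, "7": 0}
    for f, r in own:
        # Blocked if any opposing pawn is ahead on the same or adjacent file.
        blocked = any(abs(of - f) <= 1 and orank > r for of, orank in opp)
        if not blocked:
            counts["234" if r <= 4 else str(r)] += 1
    return counts

def mirror(pawns):
    """Flip ranks so that Black's pawns are seen as moving up the board."""
    return {(f, 9 - r) for f, r in pawns}

def passed_pawn_difference(white, black):
    """White's passed pawn counts minus Black's, per rank bucket."""
    w = passed_pawn_buckets(white, black)
    b = passed_pawn_buckets(mirror(black), mirror(white))
    return {k: w[k] - b[k] for k in w}

# White pawns on a7 and d4; Black pawns on d6 and h7.
white = {(0, 7), (3, 4)}
black = {(3, 6), (7, 7)}
print(passed_pawn_difference(white, black))
```

In this position White's a7 pawn is passed, the d-pawns block each other, and Black's h7 pawn is passed on what is, after mirroring, its second rank.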

The table below gives the estimated regression coefficients and standard errors denominated in Elo, along with the Wald statistics; the plot of the coefficients follows it.

It may seem odd that high rank passed pawns can sometimes have negative value. For example, a pawn on the seventh rank in the first third of a match appears to be worth around -15 Elo, although that estimate is within about 1.4 standard errors of zero. The reason for this apparent contradiction is that the valuation is conditional on taking a snapshot in the first third of a match. A pawn on the 7th rank usually leads quickly to a victory; the snapshots we observe in the first third are those cases where it did not. For this analysis it probably makes more sense to look at the random snapshot to get an "average" value.

snapshot	term	Estimate	Std.Error	Statistic
random	Elo	1.01	0.001	715.29
random	White Tempo	64.87	0.261	248.86
random	P.P. Rank234	59.01	0.955	61.81
random	P.P. Rank5	39.98	1.415	28.25
random	P.P. Rank6	27.96	1.316	21.24
random	P.P. Rank7	69.49	1.408	49.34
first third	Elo	1.02	0.001	719.17
first third	White Tempo	66.06	0.260	254.11
first third	P.P. Rank234	147.94	5.674	26.07
first third	P.P. Rank5	68.32	7.697	8.88
first third	P.P. Rank6	-8.71	7.618	-1.14
first third	P.P. Rank7	-15.68	11.592	-1.35
second third	Elo	1.01	0.001	715.28
second third	White Tempo	64.50	0.261	247.28
second third	P.P. Rank234	106.15	1.174	90.42
second third	P.P. Rank5	73.42	1.692	43.39
second third	P.P. Rank6	46.71	1.655	28.22
second third	P.P. Rank7	51.19	1.955	26.19
last third	Elo	1.01	0.001	711.41
last third	White Tempo	63.95	0.261	244.61
last third	P.P. Rank234	44.47	0.651	68.36
last third	P.P. Rank5	33.59	0.972	34.55
last third	P.P. Rank6	25.85	0.888	29.10
last third	P.P. Rank7	77.31	0.926	83.50

plot of chunk plot_ppest_one

Spline Regressions

It is hard to interpret the coefficients because the value of pieces appears to depend on the phase of play. To remedy this, I interacted the material difference with some spline terms. That is, I compute some spline functions of the ply at which the snapshot is taken, then multiply those by the material differences. I then estimate the coefficients of the interaction terms, and combine them with the spline functions. This gives material value as a function of the ply, which I plot below. I did this two different ways: once computing splines over the raw ply, and once using the ply divided by the total ply. In the latter formulation you can view the valuation as a function of percent progress in the match. The problem with this formulation is that you typically do not know how far along in the match you are. On the other hand, because matches progress at different speeds, basing value on the raw ply also seems flawed.
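The mechanics of the interaction can be sketched in a few lines. To keep the example dependency-free I use a piecewise-linear "hat" basis (degree-1 B-splines) as a stand-in for whatever spline basis was actually used; the knots, coefficients and function names below are all hypothetical.

```python
def hat_basis(x, knots):
    """Piecewise-linear basis at x: one weight per knot, summing to 1."""
    x = min(max(x, knots[0]), knots[-1])  # clamp outside the knot range
    weights = [0.0] * len(knots)
    for j in range(len(knots) - 1):
        left, right = knots[j], knots[j + 1]
        if left <= x <= right:
            t = (x - left) / (right - left)
            weights[j] = 1.0 - t
            weights[j + 1] = t
            break
    return weights

def interaction_columns(ply, material_diff, knots):
    """Regression columns for one piece: basis times material difference."""
    return [b * material_diff for b in hat_basis(ply, knots)]

def piece_value_at(ply, coefficients, knots):
    """Combine fitted interaction coefficients with the basis at a ply."""
    return sum(c * b for c, b in zip(coefficients, hat_basis(ply, knots)))

knots = [0.0, 20.0, 40.0, 80.0]                # hypothetical knots, in ply
queen_coefs = [100.0, 150.0, 200.0, 250.0]     # hypothetical fitted values

# Design-matrix entries for a snapshot at ply 10 with White up one queen.
print(interaction_columns(10.0, 1, knots))
# Implied queen value as a function of ply.
print([piece_value_at(p, queen_coefs, knots) for p in (0, 10, 20, 30, 60)])
```

For the percentage formulation the same code applies with `ply` replaced by ply divided by total ply and the knots placed on the unit interval.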

For the raw ply plot we plot in a single facet. For the percentage regression, this results in a visually unreadable plot, so we use separate facets for the different pieces. For the raw ply regression we see near equal values of the bishop and knight through most of the match; rooks increase in value after the 20th ply; queens are valuable from early in the match, and increase in value as the match progresses. This phenomenon is intuitive, as the queen is less likely to be captured later in the match when there are fewer pieces on the board.

plot of chunk plot_spline_one

plot of chunk plot_spline_two

Below I express those estimates relative to the estimated value of a pawn. Again we lose the standard error bars. For the raw ply regression, we see a bulge at around 25 ply where pawns have very low value, and queens peak. A different pattern emerges in the percent ply regressions, where queens increase in value steadily over the course of the match.

plot of chunk plot_spline_rel_one

Future work

The analysis here indicates we need a better measure of match progress, one which can be computed in real time, but which matches the tempo of the particular match. It would seem that something like total material on the board would be a good measure. This is intuitive, as crowded positions are dangerous in Atomic and can quickly lead to large changes in the material difference. I also want to try modeling these matches with survival analysis.
