Gilgamath (https://www.gilgamath.com/), Mon, 31 May 2021 21:46:54 -0700

Atomic Piece Values, Again
https://www.gilgamath.com/atomic-three.html

<p>In a <a href="atomic-two">previous blog post</a> I used logistic regression to estimate
the values of pieces in Atomic chess. In that study I computed material
differences between the two players using a snapshot 8 plies before the end
of the match. (A "ply" is a move by a single player.) That choice of snapshot
was arbitrary, but it is typically late enough in the match that there is some
material difference to measure, and near enough to the end to estimate the "power"
of each piece to bring victory.
However, this valuation is rather late in the game, and is probably not
representative of the average value of the pieces.
For example, a knight advantage early in the game could be parlayed into a queen
advantage later, which could then prove decisive.</p>
<!-- PELICAN_END_SUMMARY -->
<p>To address that issue, I will repeat the analysis at other snapshots.
Recall that I am working from 9 million rated Atomic games that I downloaded from Lichess.
For each game I selected a pseudo-random ply, uniformly, after the second and
before the last ply. (There is no material difference before the third ply.)
I also selected pseudo-random snapshots in
the first third, the second third, and the last third of each match. I
computed the difference in material, as well as differences in passed pawn counts,
for each snapshot.
You can download <a href="https://drive.google.com/file/d/1RPMJZ4MvCf1VqkZUNpqcR85isK0cgDdU/view?usp=sharing">v2 of the data</a>,
and the <a href="https://github.com/shabbychef/lichess-data">code</a>.</p>
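The per-game snapshot sampling can be sketched as follows; the function and field names here are hypothetical (my actual processing lives in the linked code):

```python
import random

# Hypothetical sketch of the per-game sampling described above.
PIECES = ["pawns", "knights", "bishops", "rooks", "queens"]

def sample_snapshot_ply(n_plies, rng=random):
    """Pick a snapshot ply uniformly after the second and before the last ply."""
    return rng.randrange(3, n_plies)

def material_difference(white_counts, black_counts):
    """Per-piece material difference (White minus Black) at a snapshot."""
    return {p: white_counts[p] - black_counts[p] for p in PIECES}
```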
<p>Recall that I am using logistic regression to estimate coefficients in the
model
</p>
<div class="math">$$
\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\left[\Delta e +
c_P \Delta P +
c_K \Delta K +
c_B \Delta B +
c_R \Delta R +
c_Q \Delta Q
\right],
$$</div>
<p>
where <span class="math">\(\Delta e\)</span> is the difference in Elo, and
<span class="math">\(\Delta P, \Delta K, \Delta B, \Delta R, \Delta Q\)</span> are the …</p>

Steven E. Pav (Mon, 31 May 2021 21:46:54 -0700). Tags: analysis, data, chess, atomic

Atomic Piece Values
https://www.gilgamath.com/atomic-two.html

<p>Most chess playing computer programs use forward search over the tree of possible moves.
Because such a search cannot examine every branch to the termination of the game,
leaf nodes in the tree are usually given a "static" evaluation combining
a number of scoring rules.
These typically include a term for the material balance of the position.<br>
In traditional chess the pieces are usually assigned scores of 1 point for pawns,
around 3 points for knights and bishops, 5 for rooks, and 9 for queens.
Human players often use this heuristic when considering exchanges.</p>
<p>I recently started playing a chess variant called <a href="https://en.wikipedia.org/wiki/Atomic_chess">Atomic chess</a>.
In Atomic, when a piece captures another, both are removed from the board,
along with all non-pawn pieces in the up to eight adjacent squares.
The idea is that a capture causes an 'explosion'.
Lichess plays a delightful explosion noise when this happens.</p>
<p>The traditional scoring heuristic is apparently based on mobility of the pieces.
While movement of pieces is the same in the Atomic variant,
I suspect that traditional scoring is not well calibrated for Atomic:
A piece can capture only once in Atomic;
a piece can remove multiple pieces from the board in one capture;
pieces have value as protective 'chaff';
Kings cannot capture pieces, so solo mates are possible;
pawns on the seventh rank can trap high-value pieces by threatening promotion;
there are numerous fools' mates involving knights, <em>etc.</em>
Can we create a scoring heuristic calibrated for Atomic?</p>
<!-- PELICAN_END_SUMMARY -->
<p>The problem would seem intractable from first principles,
because in Atomic a piece's value can be quite different from its average mobility.
Instead, perhaps we can infer a kind of average value for pieces.
In a <a href="atomic-one">previous blog post</a> I performed a quick analysis of Atomic
openings on a database of around 9 million games played on Lichess …</p>

Steven E. Pav (Mon, 10 May 2021 21:53:59 -0700). Tags: analysis, data, chess, atomic

Atomic Openings
https://www.gilgamath.com/atomic-one.html

<p>I've started playing a variant of chess called
<a href="https://en.wikipedia.org/wiki/Atomic_chess">Atomic</a>.
The pieces move like traditional chess, and start in the same position.
In this variant, however, when a piece takes another piece, <em>both</em> are
removed from the board, as well as any non-pawn pieces on the (up to eight)
adjacent squares. As a consequence of this one change, the game can
end if your King is 'blown up' by your opponent's capture. As another
consequence, Kings cannot capture, and may occupy adjacent squares.</p>
<p>For example, from the following position White's Knight can
capture the pawn at either d7 or f7; the resulting explosion removes the
Black King and ends the game.</p>
<p><img src="https://www.gilgamath.com/figure/atomic_one_blowup-1.png" title="plot of chunk blowup" alt="plot of chunk blowup" width="700px" height="560px" /></p>
<!-- PELICAN_END_SUMMARY -->
<p>I looked around for some resources on Atomic chess, but I have never
had much luck with traditional chess studies.
Instead I decided to learn about Atomic statistically.</p>
<p>As it happens, Lichess (which is truly a great site)
<a href="https://database.lichess.org/#variant_games">publishes their game data</a>
which includes over 9 million Atomic games played.
I wrote some <a href="https://github.com/shabbychef/lichess-data">code</a>
that will download and parse this data, turning it into a CSV file.
You can <a href="https://drive.google.com/file/d/1YqOFKZlCQoWJBUT9B3HI7q0i6DHvT9or/view?usp=sharing">download</a>
v1 of this file, but Lichess is the ultimate copyright holder.</p>
<h2>First steps</h2>
<p>The games in the dataset end in one of three conditions: Normal (checkmate or
what passes for it in Atomic), Time forfeit, and Abandoned (game terminated
before it began). The last category is very rare, and I omit those games from my processing.
The majority of games end in the Normal way, as tabulated here:</p>
<table>
<thead>
<tr>
<th align="left">termination</th>
<th align="right">n</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Normal</td>
<td align="right">8426052</td>
</tr>
<tr>
<td align="left">Time forfeit</td>
<td align="right">1257295</td>
</tr>
</tbody>
</table>
<p>The game data includes Elo scores for players, as computed prior to the game.
As a first check, I wanted to see if Elo is properly calibrated.
To do this, I computed the empirical win rate of White over Black,
grouped by bins of the difference in their Elo …</p>

Steven E. Pav (Fri, 07 May 2021 21:51:52 -0700). Tags: analysis, data, chess, atomic

Nonparametric Market Timing
https://www.gilgamath.com/nonparametric_market_timing.html

<p>Market timing a single instrument with a single feature</p>

Steven (Sat, 04 Jan 2020 21:07:06 -0800). Tags: quant-finance, R, analysis, statistics, Markowitz, portfolio, tactical-asset-allocation

ohenery!
https://www.gilgamath.com/ohenery.html

<p>ohenery package to CRAN</p>

Steven (Wed, 25 Sep 2019 21:32:52 -0700). Tags: R, package

Discrete State Market Timing
https://www.gilgamath.com/market-timing.html

<p>Market timing with a discrete feature</p>

Steven (Sun, 30 Jun 2019 10:22:58 -0700). Tags: quant-finance, R, analysis, statistics, Markowitz, portfolio, tactical-asset-allocation

Conditional Portfolios with Feature Flattening
https://www.gilgamath.com/portfolio-flattening.html

<h2>Conditional Portfolios</h2>
<p>When I first started working at a quant fund I tried to read
about portfolio theory. (Beyond, you know, "<em>Hedge Funds for Dummies</em>.")
I learned about various objectives and portfolio constraints,
including the Markowitz portfolio, which felt very natural.
Markowitz solves the mean-variance optimization problem, as
well as the Sharpe maximization problem, namely
</p>
<div class="math">$$
\operatorname{argmax}_w \frac{w^{\top}\mu}{\sqrt{w^{\top} \Sigma w}}.
$$</div>
<p>
This is solved, up to scaling, by the Markowitz portfolio <span class="math">\(\Sigma^{-1}\mu\)</span>.</p>
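As a toy illustration (the numbers are made up, not from any of my analyses), the portfolio is just a linear solve:

```python
import numpy as np

# Toy illustration: the Sharpe-optimal direction is Sigma^{-1} mu, up to scaling.
mu = np.array([0.05, 0.02])        # made-up expected returns
Sigma = np.array([[0.04, 0.01],    # made-up covariance
                  [0.01, 0.09]])

def markowitz_weights(mu, Sigma):
    """Return the (unscaled) Markowitz portfolio Sigma^{-1} mu."""
    return np.linalg.solve(Sigma, mu)

w = markowitz_weights(mu, Sigma)
```

Any positive rescaling of this achieves the same Sharpe; the overall scale is usually set separately by a risk budget.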
<p>When I first read about the theory behind Markowitz, I
did not read anything about where <span class="math">\(\mu\)</span> and <span class="math">\(\Sigma\)</span> come from.
I assumed the authors I was reading were talking about the
vanilla sample estimates of the mean and covariance,
though the theory does not require this.</p>
<p>There are some problems with the Markowitz portfolio.
For us, as a small quant fund, the most pressing issue
was that holding the Markowitz portfolio based on the
historical mean and covariance was not a good look.
You don't get paid "2 and twenty" for computing some
long term averages.</p>
<!-- PELICAN_END_SUMMARY -->
<p>Rather than holding an <em>unconditional</em> portfolio,
we sought to construct a <em>conditional</em> one,
conditional on some "features".
(I now believe this topic falls under the rubric of "Tactical Asset
Allocation".)
We stumbled on two simple methods for adapting
Markowitz theory to accept conditioning information:
Conditional Markowitz, and "Flattening".</p>
<h2>Conditional Markowitz</h2>
<p>Suppose you observe an <span class="math">\(l\)</span>-vector of features, <span class="math">\(f_i\)</span>, prior
to the time you have to allocate into <span class="math">\(p\)</span> assets to enjoy
returns <span class="math">\(x_i\)</span>. Assume that the expected returns are linear in the features,
but the covariance is a long-term average. That is,
</p>
<div class="math">$$
E\left[x_i \left|f_i\right.\right] = B f_i,\quad\mbox{Var}\left(x_i \left|f_i\right.\right) = \Sigma.
$$</div>
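Under these assumptions the Markowitz machinery carries over with \(\mu\) replaced by \(B f_i\). A toy sketch with simulated data, estimating \(B\) by least squares and \(\Sigma\) from the residuals (illustrative only, not the actual procedure we used):

```python
import numpy as np

rng = np.random.default_rng(42)
n, p, l = 500, 3, 2

# Simulated history: features F and returns X with E[x|f] = B f.
F = rng.standard_normal((n, l))
B_true = rng.standard_normal((p, l))
X = F @ B_true.T + 0.1 * rng.standard_normal((n, p))

# Estimate B by least squares, and Sigma from the regression residuals.
B_hat, *_ = np.linalg.lstsq(F, X, rcond=None)   # shape (l, p); B_hat.T estimates B
resid = X - F @ B_hat
Sigma_hat = np.cov(resid, rowvar=False)

def conditional_markowitz(f):
    """Unscaled conditional Markowitz portfolio Sigma^{-1} B f."""
    return np.linalg.solve(Sigma_hat, B_hat.T @ f)

w = conditional_markowitz(np.array([1.0, -0.5]))
```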
<p>Note that Markowitz theory never really said how to estimate
mean …</p>

Steven E. Pav (Wed, 19 Jun 2019 21:04:21 -0700). Tags: quant-finance, analysis, statistics, Markowitz, portfolio, tactical-asset-allocation

No Parity like a Risk Parity.
https://www.gilgamath.com/risk-parity.html

<h2>Portfolio Selection and Exchangeability</h2>
<p>Consider the problem of <em>portfolio selection</em>, where you observe
some historical data on <span class="math">\(p\)</span> assets, say <span class="math">\(n\)</span> days worth in an <span class="math">\(n\times p\)</span>
matrix, <span class="math">\(X\)</span>, and then are required to construct a (dollarwise)
portfolio <span class="math">\(w\)</span>.
You can view this task as a function <span class="math">\(w\left(X\right)\)</span>.
There are a few different kinds of <span class="math">\(w\)</span> function: Markowitz,
equal dollar, Minimum Variance, Equal Risk Contribution ('Risk Parity'),
and so on.</p>
<p>How are we to choose among these competing approaches?
Their supporters can point to theoretical underpinnings,
but these often seem a bit shaky even from a distance.
Usually evidence is provided in the form of backtests
on the historical returns of some universe of assets.
It can be hard to generalize from a single history,
and these backtests rarely offer theoretical justification
for the differential performance of the methods.</p>
<!-- PELICAN_END_SUMMARY -->
<p>One way to consider these different methods of portfolio
construction is via the lens of <em>exchangeability</em>.
Roughly speaking, how does the function <span class="math">\(w\left(X\right)\)</span> react
under certain systematic changes in <span class="math">\(X\)</span> that "shouldn't" matter.
For example, suppose that the ticker changed on
one stock in your universe. Suppose you order the columns of
<span class="math">\(X\)</span> alphabetically, so now you must reorder your <span class="math">\(X\)</span>.
Assuming no new data has been observed, shouldn't
<span class="math">\(w\left(X\right)\)</span> simply reorder its output in the same way?</p>
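This reordering check is easy to state in code. A sketch, using a minimum-variance rule as the example \(w\) (the helper names are made up):

```python
import numpy as np

def min_var_weights(X):
    """Example w: the minimum-variance portfolio from an n x p returns history."""
    Sigma = np.cov(X, rowvar=False)
    w = np.linalg.solve(Sigma, np.ones(X.shape[1]))
    return w / w.sum()

def is_order_equivariant(w_fn, X, perm):
    """Does reordering the columns of X simply reorder the output of w?"""
    return np.allclose(w_fn(X[:, perm]), w_fn(X)[perm])

rng = np.random.default_rng(1)
X = rng.standard_normal((250, 4))
perm = np.array([2, 0, 3, 1])
ok = is_order_equivariant(min_var_weights, X, perm)
```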
<p>Put another way, suppose a method <span class="math">\(w\)</span> systematically
overweights the first element of the universe
(this seems more like a bug than a feature),
and you observe backtests over the 2000's on
U.S. equities where <code>AAPL</code> happened to be the
first stock in the universe. Your <span class="math">\(w\)</span> might
seem to outperform other methods for no good reason.</p>
<p>Equivariance to order is a kind of exchangeability condition.
The 'right' kind of <span class="math">\(w\)</span> is 'order …</p>

Steven E. Pav (Sun, 09 Jun 2019 22:53:04 -0700). Tags: quant-finance, analysis, statistics, Markowitz, portfolio, R

fromo 0.2.0
https://www.gilgamath.com/fromo-two.html

<p>I recently pushed version 0.2.0 of my <code>fromo</code> package to
<a href="https://cran.r-project.org/package=fromo">CRAN</a>.
This package implements (relatively) fast, numerically robust
computation of moments via <code>Rcpp</code>.
<!-- PELICAN_END_SUMMARY --></p>
<p>The big changes in this release are:</p>
<ul>
<li>Support for weighted moment estimation.</li>
<li>Computation of running moments over windows defined
by time (or some other increasing index), rather
than vector index.</li>
<li>Some modest improvements in speed for the 'dangerous'
use cases (no checking for <code>NA</code>, no weights, <em>etc.</em>)</li>
</ul>
<p>The time-based running moments are supported via the <code>t_running_*</code> operations,
and we support means, standard deviation, skew, kurtosis, centered and
standardized moments and cumulants, z-score, Sharpe, and t-stat. The
idea is that your observations are associated with some increasing
index, which you can think of as the observation time, and you wish
to compute moments over a fixed time window. To bloat the API, the
times from which you 'look back' can optionally be something other
than the time indices of the input, so the input and output size
can be different.</p>
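fromo itself is an R package, but the time-windowed idea can be sketched in Python with a naive O(n · window) loop (fromo maintains the moments incrementally instead):

```python
import math

def t_running_sd(values, times, window):
    """For each time t_i, the sd of values whose times lie in (t_i - window, t_i]."""
    out = []
    for t in times:
        xs = [v for v, s in zip(values, times) if t - window < s <= t]
        n = len(xs)
        if n < 2:
            out.append(float("nan"))
        else:
            m = sum(xs) / n
            out.append(math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1)))
    return out
```

Here the look-back times are just the input times; as noted above, the real API also lets the two differ, so the input and output sizes need not match.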
<p>Some example uses might be:</p>
<ul>
<li>Compute the volatility of an asset's returns over the previous 6 months,
on every trade day.</li>
<li>Compute the total monthly sales of a company at month ends.</li>
</ul>
<p>Because the API also allows you to use weights as implicit time deltas, you can
also do weird and inadvisable things like compute the Sharpe of an asset
over the last 1 million shares traded.</p>
<p>Speed improvements come from my random walk through C++ design idioms.
I also implemented a 'swap' procedure for the running standard deviation
which incorporates a Welford's-method addition and removal into a single
step. I do not believe that Welford's method is the fastest algorithm
for a summarizing moment computation: a two-pass solution that computes
the mean first, then the centered moments, is probably faster. However,
for the …</p>

Steven E. Pav (Sun, 13 Jan 2019 10:23:39 -0800). Tags: R, package

Twelve Dimensional Chess is Stupid
https://www.gilgamath.com/twelve_dimensional_chess.html

<p>Chess and the Curse of Dimensionality</p>

Steven (Tue, 16 Oct 2018 22:24:30 -0700). Tags: analysis, chess