Gilgamath (https://www.gilgamath.com/), Mon, 31 May 2021 21:46:54 -0700

Atomic Piece Values, Again
https://www.gilgamath.com/atomic-three.html

<p>In a <a href="atomic-two">previous blog post</a> I used logistic regression to estimate
the values of pieces in Atomic chess. In that study I computed material
differences between the two players using a snapshot 8 plies before the end
of the match. (A "ply" is a move by a single player.) That choice of snapshot
was arbitrary, but it is typically late enough in the match that there is some
material difference to measure, and near enough to the end to estimate the "power"
of each piece to bring victory.
However, this valuation is rather late in the game, and is probably not
representative of the average value of the pieces.
For example, a knight advantage early in the game could be parlayed into a queen
advantage later, which could then prove decisive.</p>
<!-- PELICAN_END_SUMMARY -->
<p>To address that issue, I will repeat the analysis at other snapshots.
Recall that I am working from 9 million rated Atomic games that I downloaded from Lichess.
For each game I selected a pseudo-random ply, uniformly, after the second and
before the last ply. (There is no material difference before the third ply.)
I also selected pseudo-random snapshots in
the first third, the second third, and the last third of each match. I
computed the difference in material, as well as differences in passed pawn counts,
for each snapshot.
You can download <a href="https://drive.google.com/file/d/1RPMJZ4MvCf1VqkZUNpqcR85isK0cgDdU/view?usp=sharing">v2 of the data</a>,
and the <a href="https://github.com/shabbychef/lichess-data">code</a>.</p>
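The per-game snapshot sampling can be sketched as follows; the function and field names here are hypothetical (my actual processing lives in the linked code):

```python
import random

# Hypothetical sketch of the per-game sampling described above.
PIECES = ["pawns", "knights", "bishops", "rooks", "queens"]

def sample_snapshot_ply(n_plies, rng=random):
    """Pick a snapshot ply uniformly after the second and before the last ply."""
    return rng.randrange(3, n_plies)

def material_difference(white_counts, black_counts):
    """Per-piece material difference (White minus Black) at a snapshot."""
    return {p: white_counts[p] - black_counts[p] for p in PIECES}
```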
<p>Recall that I am using logistic regression to estimate coefficients in the
model
</p>
<div class="math">$$
\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\left[\Delta e +
c_P \Delta P +
c_K \Delta K +
c_B \Delta B +
c_R \Delta R +
c_Q \Delta Q
\right],
$$</div>
<p>
where <span class="math">\(\Delta e\)</span> is the difference in Elo, and
<span class="math">\(\Delta P, \Delta K, \Delta B, \Delta R, \Delta Q\)</span> are the …</p>

Steven E. Pav (Mon, 31 May 2021 21:46:54 -0700). Tags: analysis, data, chess, atomic

Atomic Piece Values
https://www.gilgamath.com/atomic-two.html

<p>Most chess playing computer programs use forward search over the tree of possible moves.
Because such a search cannot examine every branch to the termination of the game,
leaf nodes in the tree are usually given a "static" evaluation combining
a number of scoring rules.
These typically include a term for the material balance of the position.<br>
In traditional chess the pieces are usually assigned scores of 1 point for pawns,
around 3 points for knights and bishops, 5 for rooks, and 9 for queens.
Human players often use this heuristic when considering exchanges.</p>
<p>I recently started playing a chess variant called <a href="https://en.wikipedia.org/wiki/Atomic_chess">Atomic chess</a>.
In Atomic, when a piece captures another, both are removed from the board,
along with all non-pawn pieces in the up to eight adjacent squares.
The idea is that a capture causes an 'explosion'.
Lichess plays a delightful explosion noise when this happens.</p>
<p>The traditional scoring heuristic is apparently based on mobility of the pieces.
While movement of pieces is the same in the Atomic variant,
I suspect that traditional scoring is not well calibrated for Atomic:
A piece can capture only once in Atomic;
a piece can remove multiple pieces from the board in one capture;
pieces have value as protective 'chaff';
Kings cannot capture pieces, so solo mates are possible;
pawns on the seventh rank can trap high-value pieces by threatening promotion;
there are numerous fools' mates involving knights, <em>etc.</em>
Can we create a scoring heuristic calibrated for Atomic?</p>
<!-- PELICAN_END_SUMMARY -->
<p>The problem would seem intractable from first principles,
because in Atomic a piece's value can be quite different from its average mobility.
Instead, perhaps we can infer a kind of average value for pieces.
In a <a href="atomic-one">previous blog post</a> I performed a quick analysis of Atomic
openings on a database of around 9 million games played on Lichess …</p>

Steven E. Pav (Mon, 10 May 2021 21:53:59 -0700). Tags: analysis, data, chess, atomic

Atomic Openings
https://www.gilgamath.com/atomic-one.html

<p>I've started playing a variant of chess called
<a href="https://en.wikipedia.org/wiki/Atomic_chess">Atomic</a>.
The pieces move like traditional chess, and start in the same position.
In this variant, however, when a piece takes another piece, <em>both</em> are
removed from the board, as well as any non-pawn pieces on the (up to eight)
adjacent squares. As a consequence of this one change, the game can
end if your King is 'blown up' by your opponent's capture. As another
consequence, Kings cannot capture, and may occupy adjacent squares.</p>
<p>For example, from the following position White's Knight can
capture the pawn at either d7 or f7; the resulting explosion removes the
Black King and ends the game.</p>
<p><img src="https://www.gilgamath.com/figure/atomic_one_blowup-1.png" title="plot of chunk blowup" alt="plot of chunk blowup" width="700px" height="560px" /></p>
<!-- PELICAN_END_SUMMARY -->
<p>I looked around for some resources on Atomic chess, but I have never
had much luck with traditional chess studies.
Instead I decided to learn about Atomic statistically.</p>
<p>As it happens, Lichess (which is truly a great site)
<a href="https://database.lichess.org/#variant_games">publishes their game data</a>
which includes over 9 million Atomic games played.
I wrote some <a href="https://github.com/shabbychef/lichess-data">code</a>
that will download and parse this data, turning it into a CSV file.
You can <a href="https://drive.google.com/file/d/1YqOFKZlCQoWJBUT9B3HI7q0i6DHvT9or/view?usp=sharing">download</a>
v1 of this file, but Lichess is the ultimate copyright holder.</p>
<h2>First steps</h2>
<p>The games in the dataset end in one of three conditions: Normal (checkmate or
what passes for it in Atomic), Time forfeit, and Abandoned (game terminated
before it began). The last category is very rare, and I omit those games from my processing.
The majority of games end in the Normal way, as tabulated here:</p>
<table>
<thead>
<tr>
<th align="left">termination</th>
<th align="right">n</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Normal</td>
<td align="right">8426052</td>
</tr>
<tr>
<td align="left">Time forfeit</td>
<td align="right">1257295</td>
</tr>
</tbody>
</table>
<p>The game data includes Elo scores for players, as computed prior to the game.
As a first check, I wanted to see if Elo is properly calibrated.
To do this, I computed the empirical win rate of White over Black,
grouped by bins of the difference in their Elo …</p>

Steven E. Pav (Fri, 07 May 2021 21:51:52 -0700). Tags: analysis, data, chess, atomic

Nonparametric Market Timing
https://www.gilgamath.com/nonparametric_market_timing.html

<p>Market timing a single instrument with a single feature</p>

Steven (Sat, 04 Jan 2020 21:07:06 -0800). Tags: quant-finance, R, analysis, statistics, Markowitz, portfolio, tactical-asset-allocation

ohenery!
https://www.gilgamath.com/ohenery.html

<p>ohenery package to CRAN</p>

Steven (Wed, 25 Sep 2019 21:32:52 -0700). Tags: R, package

Discrete State Market Timing
https://www.gilgamath.com/market-timing.html

<p>Market timing with a discrete feature</p>

Steven (Sun, 30 Jun 2019 10:22:58 -0700). Tags: quant-finance, R, analysis, statistics, Markowitz, portfolio, tactical-asset-allocation

Conditional Portfolios with Feature Flattening
https://www.gilgamath.com/portfolio-flattening.html

<h2>Conditional Portfolios</h2>
<p>When I first started working at a quant fund I tried to read
about portfolio theory. (Beyond, you know, "<em>Hedge Funds for Dummies</em>.")
I learned about various objectives and portfolio constraints,
including the Markowitz portfolio, which felt very natural.
Markowitz solves the mean-variance optimization problem, as
well as the Sharpe maximization problem, namely
</p>
<div class="math">$$
\operatorname{argmax}_w \frac{w^{\top}\mu}{\sqrt{w^{\top} \Sigma w}}.
$$</div>
<p>
This is solved, up to scaling, by the Markowitz portfolio <span class="math">\(\Sigma^{-1}\mu\)</span>.</p>
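As a toy illustration (the numbers are made up, not from any of my analyses), the portfolio is just a linear solve:

```python
import numpy as np

# Toy illustration: the Sharpe-optimal direction is Sigma^{-1} mu, up to scaling.
mu = np.array([0.05, 0.02])        # made-up expected returns
Sigma = np.array([[0.04, 0.01],    # made-up covariance
                  [0.01, 0.09]])

def markowitz_weights(mu, Sigma):
    """Return the (unscaled) Markowitz portfolio Sigma^{-1} mu."""
    return np.linalg.solve(Sigma, mu)

w = markowitz_weights(mu, Sigma)
```

Any positive rescaling of this achieves the same Sharpe; the overall scale is usually set separately by a risk budget.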
<p>When I first read about the theory behind Markowitz, I
did not read anything about where <span class="math">\(\mu\)</span> and <span class="math">\(\Sigma\)</span> come from.
I assumed the authors I was reading were talking about the
vanilla sample estimates of the mean and covariance,
though the theory does not require this.</p>
<p>There are some problems with the Markowitz portfolio.
For us, as a small quant fund, the most pressing issue
was that holding the Markowitz portfolio based on the
historical mean and covariance was not a good look.
You don't get paid "2 and twenty" for computing some
long term averages.</p>
<!-- PELICAN_END_SUMMARY -->
<p>Rather than holding an <em>unconditional</em> portfolio,
we sought to construct a <em>conditional</em> one,
conditional on some "features".
(I now believe this topic falls under the rubric of "Tactical Asset
Allocation".)
We stumbled on two simple methods for adapting
Markowitz theory to accept conditioning information:
Conditional Markowitz, and "Flattening".</p>
<h2>Conditional Markowitz</h2>
<p>Suppose you observe an <span class="math">\(l\)</span>-vector of features, <span class="math">\(f_i\)</span>, prior
to the time you have to allocate into <span class="math">\(p\)</span> assets to enjoy
returns <span class="math">\(x_i\)</span>. Assume that the expected returns are linear in the features,
but the covariance is a long-term average. That is,
</p>
<div class="math">$$
E\left[x_i \left|f_i\right.\right] = B f_i,\quad\mbox{Var}\left(x_i \left|f_i\right.\right) = \Sigma.
$$</div>
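Under these assumptions the Markowitz machinery carries over with \(\mu\) replaced by \(B f_i\). A toy sketch with simulated data, estimating \(B\) by least squares and \(\Sigma\) from the residuals (illustrative only, not the actual procedure we used):

```python
import numpy as np

rng = np.random.default_rng(42)
n, p, l = 500, 3, 2

# Simulated history: features F and returns X with E[x|f] = B f.
F = rng.standard_normal((n, l))
B_true = rng.standard_normal((p, l))
X = F @ B_true.T + 0.1 * rng.standard_normal((n, p))

# Estimate B by least squares, and Sigma from the regression residuals.
B_hat, *_ = np.linalg.lstsq(F, X, rcond=None)   # shape (l, p); B_hat.T estimates B
resid = X - F @ B_hat
Sigma_hat = np.cov(resid, rowvar=False)

def conditional_markowitz(f):
    """Unscaled conditional Markowitz portfolio Sigma^{-1} B f."""
    return np.linalg.solve(Sigma_hat, B_hat.T @ f)

w = conditional_markowitz(np.array([1.0, -0.5]))
```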
<p>Note that Markowitz theory never really said how to estimate
mean …</p>

Steven E. Pav (Wed, 19 Jun 2019 21:04:21 -0700). Tags: quant-finance, analysis, statistics, Markowitz, portfolio, tactical-asset-allocation

No Parity like a Risk Parity.
https://www.gilgamath.com/risk-parity.html

<h2>Portfolio Selection and Exchangeability</h2>
<p>Consider the problem of <em>portfolio selection</em>, where you observe
some historical data on <span class="math">\(p\)</span> assets, say <span class="math">\(n\)</span> days worth in an <span class="math">\(n\times p\)</span>
matrix, <span class="math">\(X\)</span>, and then are required to construct a (dollarwise)
portfolio <span class="math">\(w\)</span>.
You can view this task as a function <span class="math">\(w\left(X\right)\)</span>.
There are a few different kinds of <span class="math">\(w\)</span> function: Markowitz,
equal dollar, Minimum Variance, Equal Risk Contribution ('Risk Parity'),
and so on.</p>
<p>How are we to choose among these competing approaches?
Their supporters can point to theoretical underpinnings,
but these often seem a bit shaky even from a distance.
Usually evidence is provided in the form of backtests
on the historical returns of some universe of assets.
It can be hard to generalize from a single history,
and these backtests rarely offer theoretical justification
for the differential performance of the methods.</p>
<!-- PELICAN_END_SUMMARY -->
<p>One way to consider these different methods of portfolio
construction is via the lens of <em>exchangeability</em>.
Roughly speaking, how does the function <span class="math">\(w\left(X\right)\)</span> react
under certain systematic changes in <span class="math">\(X\)</span> that "shouldn't" matter.
For example, suppose that the ticker changed on
one stock in your universe. Suppose you order the columns of
<span class="math">\(X\)</span> alphabetically, so now you must reorder your <span class="math">\(X\)</span>.
Assuming no new data has been observed, shouldn't
<span class="math">\(w\left(X\right)\)</span> simply reorder its output in the same way?</p>
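This reordering check is easy to state in code. A sketch, using a minimum-variance rule as the example \(w\) (the helper names are made up):

```python
import numpy as np

def min_var_weights(X):
    """Example w: the minimum-variance portfolio from an n x p returns history."""
    Sigma = np.cov(X, rowvar=False)
    w = np.linalg.solve(Sigma, np.ones(X.shape[1]))
    return w / w.sum()

def is_order_equivariant(w_fn, X, perm):
    """Does reordering the columns of X simply reorder the output of w?"""
    return np.allclose(w_fn(X[:, perm]), w_fn(X)[perm])

rng = np.random.default_rng(1)
X = rng.standard_normal((250, 4))
perm = np.array([2, 0, 3, 1])
ok = is_order_equivariant(min_var_weights, X, perm)
```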
<p>Put another way, suppose a method <span class="math">\(w\)</span> systematically
overweights the first element of the universe
(this seems more like a bug than a feature),
and you observe backtests over the 2000's on
U.S. equities where <code>AAPL</code> happened to be the
first stock in the universe. Your <span class="math">\(w\)</span> might
seem to outperform other methods for no good reason.</p>
<p>Equivariance to order is a kind of exchangeability condition.
The 'right' kind of <span class="math">\(w\)</span> is 'order …</p>

Steven E. Pav (Sun, 09 Jun 2019 22:53:04 -0700). Tags: quant-finance, analysis, statistics, Markowitz, portfolio, R

fromo 0.2.0
https://www.gilgamath.com/fromo-two.html

<p>I recently pushed version 0.2.0 of my <code>fromo</code> package to
<a href="https://cran.r-project.org/package=fromo">CRAN</a>.
This package implements (relatively) fast, numerically robust
computation of moments via <code>Rcpp</code>.
<!-- PELICAN_END_SUMMARY --></p>
<p>The big changes in this release are:</p>
<ul>
<li>Support for weighted moment estimation.</li>
<li>Computation of running moments over windows defined
by time (or some other increasing index), rather
than vector index.</li>
<li>Some modest improvements in speed for the 'dangerous'
use cases (no checking for <code>NA</code>, no weights, <em>etc.</em>)</li>
</ul>
<p>The time-based running moments are supported via the <code>t_running_*</code> operations,
and we support means, standard deviation, skew, kurtosis, centered and
standardized moments and cumulants, z-score, Sharpe, and t-stat. The
idea is that your observations are associated with some increasing
index, which you can think of as the observation time, and you wish
to compute moments over a fixed time window. To bloat the API, the
times from which you 'look back' can optionally be something other
than the time indices of the input, so the input and output size
can be different.</p>
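fromo itself is an R package, but the time-windowed idea can be sketched in Python with a naive O(n · window) loop (fromo maintains the moments incrementally instead):

```python
import math

def t_running_sd(values, times, window):
    """For each time t_i, the sd of values whose times lie in (t_i - window, t_i]."""
    out = []
    for t in times:
        xs = [v for v, s in zip(values, times) if t - window < s <= t]
        n = len(xs)
        if n < 2:
            out.append(float("nan"))
        else:
            m = sum(xs) / n
            out.append(math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1)))
    return out
```

Here the look-back times are just the input times; as noted above, the real API also lets the two differ, so the input and output sizes need not match.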
<p>Some example uses might be:</p>
<ul>
<li>Compute the volatility of an asset's returns over the previous 6 months,
on every trade day.</li>
<li>Compute the total monthly sales of a company at month ends.</li>
</ul>
<p>Because the API also allows you to use weights as implicit time deltas, you can
also do weird and inadvisable things like compute the Sharpe of an asset
over the last 1 million shares traded.</p>
<p>Speed improvements come from my random walk through C++ design idioms.
I also implemented a 'swap' procedure for the running standard deviation
which incorporates a Welford's-method addition and removal into a single
step. I do not believe that Welford's method is the fastest algorithm
for a summarizing moment computation: a two-pass solution that computes
the mean first, then the centered moments, is probably faster. However,
for the …</p>

Steven E. Pav (Sun, 13 Jan 2019 10:23:39 -0800). Tags: R, package

Twelve Dimensional Chess is Stupid
https://www.gilgamath.com/twelve_dimensional_chess.html

<p>Chess and the Curse of Dimensionality</p>

Steven (Tue, 16 Oct 2018 22:24:30 -0700). Tags: analysis, chess