Gilgamath (https://www.gilgamath.com/) — "Antichess Elo Problems", by Steven, 2021-09-19 (tag:www.gilgamath.com,2021-09-19:/antichess-two.html)<p>Antichess Elo ratings on Lichess are miscalibrated.</p><p>In a <a href="antichess-one">previous blog post</a> I looked at opening moves and piece values in Antichess based on games data downloaded from Lichess. One peculiarity I noted there was that the Elo values (well, they are Glicko-2) on Lichess are <em>miscalibrated</em> in the sense that they exaggerate the probability of a win. Usually an Elo difference of 400 points is supposed to translate to 10-to-1 odds of the higher-rated player winning. However, I found that the odds were somewhat lower, closer to 9-to-1. While this seems like a minor point, it also means that the highest-rated players effectively have overinflated Elo scores (or at least compared to the <em>hoi polloi</em>, as only differences in Elo are meaningful).</p> <!-- PELICAN_END_SUMMARY --> <p>In this blog post, I will examine the miscalibration of Antichess Elo. As in previous blog posts, I view Elo through the lens of logistic regression. Let <span class="math">$$p$$</span> be the probability that White wins an Antichess match, and let <span class="math">$$\Delta e$$</span> be the difference in Antichess Elo (White's minus Black's). Then the Elos are properly calibrated if </p> <div class="math">$$\left(\frac{p}{1-p}\right) = 10^\frac{\Delta e}{400}.$$</div> <p> That is, the odds scale by a factor of 10 for each 400-point difference in Elo. The overall level of Elo is arbitrary, though I have schemes for fixing that.</p> <p>Taking the natural log of both sides, we have </p> <div class="math">$$\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\Delta e.$$</div> <p> This is a statistician's logistic regression. And since we are performing logistic regression anyway, we can add terms. 
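As an aside, the nominal calibration relation is easy to sketch numerically. This is a minimal Python illustration of my own, not code from the original analysis:

```python
def white_win_prob(delta_e: float) -> float:
    """Nominal probability that White wins given the Elo difference
    delta_e (White's minus Black's), under odds = 10 ** (delta_e / 400)."""
    return 1.0 / (1.0 + 10.0 ** (-delta_e / 400.0))

# A 400-point edge should give 10-to-1 odds, i.e. p = 10/11.
p = white_win_prob(400)
print(round(p, 4))            # 0.9091
print(round(p / (1 - p), 2))  # 10.0
```

The question of miscalibration is whether observed win frequencies actually follow this curve.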
Because Antichess has been shown to be a winning game for White, it would seem that there should be some boost to White's odds that is independent of the Elo difference. So we change the equation to </p> <div class="math">$$\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\Delta e + b,$$</div> <p> for some unknown <span class="math">$$b$$</span>, which represents White's tempo advantage. If instead we write this as </p> <div class="math">$$\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\Delta e + \frac{\operatorname{log}(10)}{400}c,$$</div> <p> then the constant <span class="math">$$c$$</span> is in 'units' of Elo.</p> <p>Given data on pre-game Elos and outcomes, I will use logistic regression to estimate the constants <span class="math">$$c_1$$</span> and <span class="math">$$c_2$$</span> in the equation </p> <div class="math">$$\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\left(c_1 \Delta e + c_2\right),$$</div> <p> where <span class="math">$$\Delta e$$</span> is the difference in measured Elos prior to the game. If <span class="math">$$c_1=1$$</span>, or is reasonably near it, then the Elos are properly calibrated. If <span class="math">$$c_1 &lt; 1$$</span>, then the Elos have too much spread; if <span class="math">$$c_1 &gt; 1$$</span>, they have too little spread.</p> <p>I had a number of theories on why the data might appear miscalibrated, including:</p> <ul> <li>This is only an effect in tight time-control matches.</li> <li>The miscalibration is only for low-Elo players.</li> <li>The miscalibration is due to a bad implementation of Elo which has since been corrected.</li> </ul> <p>To examine questions like these, I break out the regressions by groups. 
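To make the estimation concrete, here is a small simulation sketch of my own (not the post's actual analysis code) showing how <span class="math">$$c_1$$</span> and <span class="math">$$c_2$$</span> can be recovered from game outcomes by logistic regression; the true values and sample size are made up:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

SCALE = np.log(10) / 400  # the factor turning Elo units into natural log odds

# simulate games with a known miscalibration c1 and tempo bonus c2 (in Elo)
rng = np.random.default_rng(42)
n = 200_000
delta_e = rng.normal(0.0, 300.0, size=n)  # pre-game Elo differences
c1_true, c2_true = 0.93, 16.0
p_white = 1.0 / (1.0 + np.exp(-SCALE * (c1_true * delta_e + c2_true)))
white_won = rng.random(n) < p_white

# regress the outcome on SCALE * delta_e: the slope estimates c1 directly,
# and the intercept divided by SCALE estimates c2 in Elo units
X = (SCALE * delta_e).reshape(-1, 1)
fit = LogisticRegression(C=1e9, max_iter=1000).fit(X, white_won)
c1_hat = float(fit.coef_[0, 0])
c2_hat = float(fit.intercept_[0]) / SCALE
print(c1_hat, c2_hat)  # roughly 0.93 and 16
```

The large `C` makes the fit effectively unpenalized, matching a plain maximum-likelihood logistic regression.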
That is, to examine <em>e.g.</em> time controls, I classify the matches into four groups, then perform the regression with <span class="math">$$c_1, c_2$$</span> for the "reference class", and then deltas to the <span class="math">$$c_1, c_2$$</span> for the other classes. The reference class for time controls might be games played at 3+ minutes, while the other classes are, say, games with under 15 seconds per side, or 30-60 second games. This gives a way to see how far off <span class="math">$$c_1$$</span> is for the different classes. As it turns out, there is not much difference across these dimensions, and I find poor calibration is not due to any of these putative explanations.</p> <h2>Results</h2> <p>As before, I pulled <a href="https://database.lichess.org/#variant_games">rated games data</a> from Lichess using <a href="https://github.com/shabbychef/lichess-data">code</a> I wrote to download and parse the data, turning it into a CSV file. My analysis here is based on <a href="https://drive.google.com/file/d/15lTi5PAtnkeGIvuSP2vT_nSi1u06oHg_/view?usp=sharing">v1 of this file</a>, but please remember that Lichess is the ultimate copyright holder.</p> <p>As in my previous analysis, I restrict attention to cases where both players already have at least 50 games in the database, to avoid burn-in issues. Except for the study on time controls, I will only look at matches played at 2+ minutes per side. I will generally restrict attention to matches between players with at least 1500 Elo pre-game.</p> <p>First, the regressions for time controls. I classify matches by their initial time per side: 15 seconds or less; 30 to 60 seconds; 90 to 120 seconds; or 180 up to 600 seconds. The longest time controls are the reference class. In the table below, the "estimate" column gives the estimated value of <span class="math">$$c_1$$</span> (unitless) or <span class="math">$$c_2$$</span> (in Elo units). 
The "std.error" is the standard error, and the statistic is a Wald statistic. The p-values are all exceedingly small. White's Tempo advantage (<span class="math">$$c_2$$</span>) is equal to around 15-20 Elo points. Note that the "Elo" here refers to the pre-game Elo difference and corresponds to <span class="math">$$c_1$$</span> for the reference class. We see that it falls rather short of the value 1. Terms like "Elo:time_control&lt;=15" are the deltas to that reference class value for pre-game Elo. Thus we see that for the ultrashort time control matches, the <span class="math">$$c_1$$</span> is around 0.931 plus around 0.0171 for a total value of around 0.9481.</p> <table> <thead> <tr> <th align="left">term</th> <th align="right">estimate</th> <th align="right">std.error</th> <th align="right">statistic</th> <th align="right">p.value</th> </tr> </thead> <tbody> <tr> <td align="left">Tempo</td> <td align="right">15.90000</td> <td align="right">0.2300</td> <td align="right">70.026856</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Elo</td> <td align="right">0.93100</td> <td align="right">0.0013</td> <td align="right">690.521161</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Tempo:time_control&lt;=15</td> <td align="right">1.82000</td> <td align="right">0.6000</td> <td align="right">3.059462</td> <td align="right">0.002217</td> </tr> <tr> <td align="left">Tempo:time_control30-60</td> <td align="right">1.59000</td> <td align="right">0.3800</td> <td align="right">4.153711</td> <td align="right">0.000033</td> </tr> <tr> <td align="left">Tempo:time_control90-120</td> <td align="right">0.90900</td> <td align="right">0.4200</td> <td align="right">2.172289</td> <td align="right">0.029834</td> </tr> <tr> <td align="left">Elo:time_control&lt;=15</td> <td align="right">0.01710</td> <td align="right">0.0035</td> <td align="right">4.825546</td> <td align="right">0.000001</td> </tr> <tr> <td align="left">Elo:time_control30-60</td> <td 
align="right">0.02830</td> <td align="right">0.0023</td> <td align="right">12.462710</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Elo:time_control90-120</td> <td align="right">-0.00184</td> <td align="right">0.0024</td> <td align="right">-0.753743</td> <td align="right">0.451003</td> </tr> </tbody> </table> <p>I thought that perhaps the issue was due to matches where there is a large difference in pre-game Elo. Perhaps a low-skill player can get lucky, and thus throw off the probabilities. I perform the same regression as above, grouping matches by the absolute difference in Elo between the players. A difference of 0 to 100 Elo is taken as the reference class. However, we see that Elo is still miscalibrated in this case. The effect is slightly worse when there is a 600+ point difference in pre-game Elo, but still, the original hypothesis is not valid.</p> <table> <thead> <tr> <th align="left">term</th> <th align="right">estimate</th> <th align="right">std.error</th> <th align="right">statistic</th> <th align="right">p.value</th> </tr> </thead> <tbody> <tr> <td align="left">Tempo</td> <td align="right">16.1000</td> <td align="right">0.2200</td> <td align="right">71.720779</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Elo</td> <td align="right">0.9230</td> <td align="right">0.0040</td> <td align="right">229.052970</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Tempo:delta_elo(100,200]</td> <td align="right">0.3540</td> <td align="right">0.3600</td> <td align="right">0.977188</td> <td align="right">0.328476</td> </tr> <tr> <td align="left">Tempo:delta_elo(200,400]</td> <td align="right">1.8100</td> <td align="right">0.4200</td> <td align="right">4.347764</td> <td align="right">0.000014</td> </tr> <tr> <td align="left">Tempo:delta_elo(400,600]</td> <td align="right">1.2000</td> <td align="right">1.1000</td> <td align="right">1.137562</td> <td align="right">0.255304</td> </tr> <tr> <td align="left">Tempo:delta_elo(600,Inf]</td> 
<td align="right">10.0000</td> <td align="right">4.1000</td> <td align="right">2.422381</td> <td align="right">0.015419</td> </tr> <tr> <td align="left">Elo:delta_elo(100,200]</td> <td align="right">0.0162</td> <td align="right">0.0045</td> <td align="right">3.623650</td> <td align="right">0.000290</td> </tr> <tr> <td align="left">Elo:delta_elo(200,400]</td> <td align="right">0.0218</td> <td align="right">0.0042</td> <td align="right">5.164092</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Elo:delta_elo(400,600]</td> <td align="right">0.0129</td> <td align="right">0.0046</td> <td align="right">2.811665</td> <td align="right">0.004929</td> </tr> <tr> <td align="left">Elo:delta_elo(600,Inf]</td> <td align="right">-0.0250</td> <td align="right">0.0074</td> <td align="right">-3.363104</td> <td align="right">0.000771</td> </tr> </tbody> </table> <p>Perhaps the <em>average</em> Elo can explain the effect: maybe luck plays a greater role among lower skilled players. I group matches by the average pre-game Elo of the players and run the regressions again. Here 2000+ is the reference class. Looking at the coefficients below we see that we still have miscalibration. 
In fact, the effect is more muted for low-skill players, whose <span class="math">$$c_1$$</span> is closer to the nominal value of 1.</p> <table> <thead> <tr> <th align="left">term</th> <th align="right">estimate</th> <th align="right">std.error</th> <th align="right">statistic</th> <th align="right">p.value</th> </tr> </thead> <tbody> <tr> <td align="left">Tempo</td> <td align="right">23.0000</td> <td align="right">0.2800</td> <td align="right">82.3523</td> <td align="right">0</td> </tr> <tr> <td align="left">Elo</td> <td align="right">0.9100</td> <td align="right">0.0016</td> <td align="right">556.0643</td> <td align="right">0</td> </tr> <tr> <td align="left">Tempo:avg_elo(1500,1750]</td> <td align="right">-16.3000</td> <td align="right">0.4300</td> <td align="right">-37.5902</td> <td align="right">0</td> </tr> <tr> <td align="left">Tempo:avg_elo(1750,2000]</td> <td align="right">-5.9000</td> <td align="right">0.3600</td> <td align="right">-16.4339</td> <td align="right">0</td> </tr> <tr> <td align="left">Elo:avg_elo(1500,1750]</td> <td align="right">0.0355</td> <td align="right">0.0028</td> <td align="right">12.5513</td> <td align="right">0</td> </tr> <tr> <td align="left">Elo:avg_elo(1750,2000]</td> <td align="right">0.0460</td> <td align="right">0.0021</td> <td align="right">22.2081</td> <td align="right">0</td> </tr> </tbody> </table> <p>Maybe this is a problem that has already been addressed by Lichess, some bug that affected how Elo (Glicko-2, really) was being computed, and is no longer an issue. I classify games by the year they were played, with 2021 as the reference class. Indeed, we now see that the value of Elo is near 1, while it was much lower in 2014 and 2015. 
So my leading theory is that something in the computation was previously off, but has perhaps been fixed?</p> <table> <thead> <tr> <th align="left">term</th> <th align="right">estimate</th> <th align="right">std.error</th> <th align="right">statistic</th> <th align="right">p.value</th> </tr> </thead> <tbody> <tr> <td align="left">Tempo</td> <td align="right">1.72e+01</td> <td align="right">0.4000</td> <td align="right">42.480949</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Elo</td> <td align="right">9.68e-01</td> <td align="right">0.0026</td> <td align="right">372.815587</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Tempo:play_year2014</td> <td align="right">-1.68e+01</td> <td align="right">6.3000</td> <td align="right">-2.658131</td> <td align="right">0.007858</td> </tr> <tr> <td align="left">Tempo:play_year2015</td> <td align="right">-5.95e+00</td> <td align="right">0.7800</td> <td align="right">-7.625736</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Tempo:play_year2016</td> <td align="right">-4.94e+00</td> <td align="right">0.6100</td> <td align="right">-8.099793</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Tempo:play_year2017</td> <td align="right">3.09e-02</td> <td align="right">0.5800</td> <td align="right">0.053356</td> <td align="right">0.957449</td> </tr> <tr> <td align="left">Tempo:play_year2018</td> <td align="right">7.33e-01</td> <td align="right">0.5700</td> <td align="right">1.284990</td> <td align="right">0.198796</td> </tr> <tr> <td align="left">Tempo:play_year2019</td> <td align="right">1.67e+00</td> <td align="right">0.5700</td> <td align="right">2.958669</td> <td align="right">0.003090</td> </tr> <tr> <td align="left">Tempo:play_year2020</td> <td align="right">9.34e-04</td> <td align="right">0.5100</td> <td align="right">0.001816</td> <td align="right">0.998551</td> </tr> <tr> <td align="left">Elo:play_year2014</td> <td align="right">-9.07e-02</td> <td 
align="right">0.0400</td> <td align="right">-2.294870</td> <td align="right">0.021741</td> </tr> <tr> <td align="left">Elo:play_year2015</td> <td align="right">-1.03e-01</td> <td align="right">0.0046</td> <td align="right">-22.212239</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Elo:play_year2016</td> <td align="right">-8.32e-02</td> <td align="right">0.0036</td> <td align="right">-23.427809</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Elo:play_year2017</td> <td align="right">-3.15e-02</td> <td align="right">0.0035</td> <td align="right">-8.991584</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Elo:play_year2018</td> <td align="right">-3.76e-02</td> <td align="right">0.0035</td> <td align="right">-10.798132</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Elo:play_year2019</td> <td align="right">-2.48e-02</td> <td align="right">0.0035</td> <td align="right">-7.147420</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Elo:play_year2020</td> <td align="right">1.52e-02</td> <td align="right">0.0033</td> <td align="right">4.620739</td> <td align="right">0.000004</td> </tr> </tbody> </table> <p>Going back to the original regression, if we now restrict our attention to matches in 2020 and later, we see that Elo seems well calibrated at longer time controls, and is perhaps even more so at shorter time controls.</p> <table> <thead> <tr> <th align="left">term</th> <th align="right">estimate</th> <th align="right">std.error</th> <th align="right">statistic</th> <th align="right">p.value</th> </tr> </thead> <tbody> <tr> <td align="left">Tempo</td> <td align="right">16.7000</td> <td align="right">0.3700</td> <td align="right">45.46794</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Elo</td> <td align="right">0.9700</td> <td align="right">0.0023</td> <td align="right">418.37293</td> <td align="right">0.000000</td> </tr> <tr> <td align="left">Tempo:time_control&lt;=15</td> <td 
align="right">-0.0320</td> <td align="right">1.2000</td> <td align="right">-0.02778</td> <td align="right">0.977838</td> </tr> <tr> <td align="left">Tempo:time_control30-60</td> <td align="right">0.6470</td> <td align="right">0.5900</td> <td align="right">1.10287</td> <td align="right">0.270085</td> </tr> <tr> <td align="left">Tempo:time_control90-120</td> <td align="right">1.6400</td> <td align="right">0.6900</td> <td align="right">2.38268</td> <td align="right">0.017187</td> </tr> <tr> <td align="left">Elo:time_control&lt;=15</td> <td align="right">0.0316</td> <td align="right">0.0076</td> <td align="right">4.17463</td> <td align="right">0.000030</td> </tr> <tr> <td align="left">Elo:time_control30-60</td> <td align="right">0.0118</td> <td align="right">0.0037</td> <td align="right">3.17373</td> <td align="right">0.001505</td> </tr> <tr> <td align="left">Elo:time_control90-120</td> <td align="right">0.0133</td> <td align="right">0.0044</td> <td align="right">3.05286</td> <td align="right">0.002267</td> </tr> </tbody> </table> <p>One implication of this is that my study of piece values and opening values should be re-run with an adjustment for pre-game Elo from prior to 2020. I don't think this will have a huge effect on the outcomes, however.</p>
"Antichess Piece Values", by Steven E. Pav, 2021-09-10 (tag:www.gilgamath.com,2021-09-10:/antichess-one.html)<p>In a <a href="atomic-three">previous blog post</a> I used logistic regression on games played data to estimate the piece value of pieces in Atomic chess. Since then I have been playing less Atomic and more <a href="https://en.wikipedia.org/wiki/Losing_chess">Antichess</a>. In Antichess, you win by losing all your pieces. To facilitate this, capturing is compulsory when possible; when multiple captures are possible you may select among them. There is no castling, and a pawn may promote to a king, but otherwise it is like traditional chess. (For more on antichess, I highly recommend Andrejić's book, <a href="http://perpetualcheck.com/antichess/"><em>The Ultimate Guide to Antichess</em></a>.)</p> <p>A king is a relatively powerful piece in Antichess: it can not as easily be turned into a "loose cannon", yet it can move in any direction. In general you want to keep your king on the board and remove your opponent's king. In that spirit, I wanted to estimate piece values in Antichess. I will use logistic regression for the analysis, as I did in my analysis of <a href="atomic-one">atomic chess</a>.</p> <!-- PELICAN_END_SUMMARY --> <p>For the analysis I pulled <a href="https://database.lichess.org/#variant_games">rated games data</a> from Lichess. I wrote some <a href="https://github.com/shabbychef/lichess-data">code</a> that will download and parse this data, turning it into a CSV file. I am sharing <a href="https://drive.google.com/file/d/15lTi5PAtnkeGIvuSP2vT_nSi1u06oHg_/view?usp=sharing">v1 of this file</a>, but please remember that Lichess is the ultimate copyright holder.</p> <p>The games in the dataset end in one of three conditions: Normal, Time forfeit, and Abandoned (game terminated before it began). 
The last category is very rare, and I omit these from my processing. The majority of games end in the Normal way, and I will consider only those. Also, games are played at various time controls, and players can make suboptimal moves when pressed for time, so I will restrict to games played with at least two minutes per side.</p> <p>The game data includes Elo scores (well, Glicko scores, but I will call them Elo) for players, as computed prior to the game. A piece in the hands of a more skilled player will have higher value, and I want to control for this. To avoid cases where the player is new to the system and their Elo has not yet converged to a good estimate, I select only matches where each player has already recorded at least 50 games in the database when the game is played. I also select for Normal terminations, since these are unambiguous victories. The remaining set contains 'only' 5654755 games.</p> <p>Checking on the data, here is a plot of empirical log odds, versus Elo difference, in bins 25 points wide. This is in natural log odds space, so we expect the line to have slope of <span class="math">$$\operatorname{log}(10) / 400$$</span> which is around 0.006.<br> While the empirical data is nearly linear in Elo difference, large differences in Elo overestimate the certainty of the victory for the stronger player. That is, the Elo is not well calibrated and exaggerates differences in ability.</p> <p><img src="https://www.gilgamath.com/figure/antichess_one_calibration-1.png" title="plot of chunk calibration" alt="plot of chunk calibration" width="1000px" height="800px" /></p> <p>In competitive traditional chess, draws are quite common. I was curious if a similar effect is seen in Antichess. Here is the ratio of odds of a tie, as a function of the average Elo of the two players. 
The log odds are perhaps increasing in average Elo, but then plateau or decrease at higher levels of play.</p> <p><img src="https://www.gilgamath.com/figure/antichess_one_tie_prob-1.png" title="plot of chunk tie_prob" alt="plot of chunk tie_prob" width="1000px" height="800px" /></p> <h2>Openings</h2> <p>Let us briefly consider openings. It is actually known that Antichess is a winning game for White (!) with opening <em>e2e3</em>. However, the <a href="http://magma.maths.usyd.edu.au/~watkins/LOSING_CHESS/ICGA2016.pdf">"proof"</a> relies on massive search, and no person could possibly remember all the lines. Do we find that players mostly play this opening? And when they do, does it boost their chances of winning?</p> <p>To answer that, here are bar plots of the most frequent opening moves by White, grouped by White's Elo.<br> I am not filtering on the outcome in these plots, only on Elo, number of games, and time controls. We see that indeed <em>e2e3</em> is the most commonly played opening, and is played more by higher-Elo players. The second most popular opening, <em>g2g3</em>, is favored more by lower-Elo players.</p> <p><img src="https://www.gilgamath.com/figure/antichess_one_openings_plot-1.png" title="plot of chunk openings_plot" alt="plot of chunk openings_plot" width="1000px" height="800px" /></p> <p>Recall that Elo is defined so that the odds of victory from White's perspective are 10 raised to the Elo difference (White's minus Black's) divided by 400. We expanded this in previous blog posts to include other terms, for example, playing a certain opening move, or having certain pieces still on the board.</p> <p>Here I perform a logistic regression analysis on our data under the basic model </p> <div class="math">$$\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\Delta e + \frac{\operatorname{log}(10)}{400}c,$$</div> <p> where <span class="math">$$c$$</span> is in 'units' of Elo. 
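Converting a raw logistic-regression coefficient into these 'Elo units' is just division by the magic number. A quick sketch of my own, using the tempo intercept of 0.101 reported in the table that follows:

```python
import math

ELO_SCALE = math.log(10) / 400  # natural log odds per Elo point, ~0.00576

def to_elo_units(coef: float) -> float:
    """Convert a raw logistic-regression coefficient (in natural
    log odds) into its Elo-equivalent value."""
    return coef / ELO_SCALE

# an intercept of 0.101 in log odds is about 17.5 Elo,
# which the table reports rounded to 18
print(round(to_elo_units(0.101), 1))  # 17.5
```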
Here is a table of the regression coefficients from that logistic regression, along with standard errors, Wald statistics, and p-values. I also convert the coefficient estimates into Elo-equivalent units. White's tempo advantage is equivalent to around 18 extra Elo points. Note that each point of pre-game Elo difference is equivalent to only 0.93 Elo points. This is consistent with the plot above, where the pre-game Elo difference exaggerates the probability of winning.</p> <table> <thead> <tr> <th align="left">term</th> <th align="right">estimate</th> <th align="right">std.error</th> <th align="right">statistic</th> <th align="right">p.value</th> <th align="right">Elo equivalent</th> </tr> </thead> <tbody> <tr> <td align="left">(Intercept)</td> <td align="right">0.101</td> <td align="right">0.001</td> <td align="right">84.5</td> <td align="right">0</td> <td align="right">18.00</td> </tr> <tr> <td align="left">Elo</td> <td align="right">0.005</td> <td align="right">0.000</td> <td align="right">719.5</td> <td align="right">0</td> <td align="right">0.93</td> </tr> </tbody> </table> <h2>Logistic analysis of opening moves</h2> <p>Of course, this is only if White does not squander their tempo. Some opening moves are likely to result in a higher boost to White than this average value, and some lower. I will now fit a model of the form </p> <div class="math">$$\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\left[\Delta e + c_{e2e3} 1_{e2e3} + c_{g2g3} 1_{g2g3} + \ldots \right],$$</div> <p> where <span class="math">$$1_{e2e3}$$</span> is a 0/1 variable for whether White played <em>e2e3</em> as their first move, and <span class="math">$$c_{e2e3}$$</span> is the coefficient which we will fit by logistic regression. By rescaling by the magic number <span class="math">$$\operatorname{log}(10) / 400$$</span>, the coefficients <span class="math">$$c$$</span> are denominated in 'Elo units'. 
Here is a table from that fit:</p> <table> <thead> <tr> <th align="left">term</th> <th align="right">estimate</th> <th align="right">std.error</th> <th align="right">statistic</th> <th align="right">p.value</th> <th align="right">Elo equivalent</th> </tr> </thead> <tbody> <tr> <td align="left">Elo</td> <td align="right">0.005</td> <td align="right">0.000</td> <td align="right">711.13</td> <td align="right">0</td> <td align="right">0.92</td> </tr> <tr> <td align="left">c2c4</td> <td align="right">0.161</td> <td align="right">0.005</td> <td align="right">31.50</td> <td align="right">0</td> <td align="right">28.00</td> </tr> <tr> <td align="left">g2g4</td> <td align="right">0.159</td> <td align="right">0.006</td> <td align="right">26.11</td> <td align="right">0</td> <td align="right">28.00</td> </tr> <tr> <td align="left">g1h3</td> <td align="right">0.137</td> <td align="right">0.005</td> <td align="right">25.93</td> <td align="right">0</td> <td align="right">24.00</td> </tr> <tr> <td align="left">e2e3</td> <td align="right">0.132</td> <td align="right">0.002</td> <td align="right">73.41</td> <td align="right">0</td> <td align="right">23.00</td> </tr> <tr> <td align="left">g2g3</td> <td align="right">0.099</td> <td align="right">0.003</td> <td align="right">37.57</td> <td align="right">0</td> <td align="right">17.00</td> </tr> <tr> <td align="left">b2b3</td> <td align="right">0.020</td> <td align="right">0.004</td> <td align="right">5.38</td> <td align="right">0</td> <td align="right">3.40</td> </tr> <tr> <td align="left">Other</td> <td align="right">-0.018</td> <td align="right">0.004</td> <td align="right">-4.68</td> <td align="right">0</td> <td align="right">-3.10</td> </tr> </tbody> </table> <p>Oddly the provably winning move <em>e2e3</em> does not show as large an effect size as some of the other moves, although it does display greater statistical significance. 
This is likely because it is played far more often than the other moves, so we have more evidence that it gives a non-zero boost to winning chances.</p> <h2>Second moves</h2> <p>The coefficients found here are for <em>average</em> play against these opening moves. Here I perform an analysis of Black's replies to White's opening moves. I filter for White's move, then perform a logistic regression, only considering the most frequent replies. Here is a table of the regression coefficients for three of White's best opening moves. Because the equations are still from White's point of view, Black would choose to play the move with the <em>lowest</em> Elo equivalent. Below are the coefficient estimates of Black's replies, again grouping statistically non-distinguishable coefficient estimates.</p> <table> <thead> <tr> <th align="left">move1</th> <th align="left">move2</th> <th align="right">estimate</th> <th align="right">std.error</th> <th align="right">statistic</th> <th align="right">p.value</th> <th align="right">Elo equivalent</th> </tr> </thead> <tbody> <tr> <td align="left">c2c4</td> <td align="left">Elo</td> <td align="right">0.005</td> <td align="right">0.000</td> <td align="right">163.221</td> <td align="right">0.000</td> <td align="right">0.90</td> </tr> <tr> <td align="left">c2c4</td> <td align="left">e7e6</td> <td align="right">-0.063</td> <td align="right">0.025</td> <td align="right">-2.489</td> <td align="right">0.013</td> <td align="right">-11.00</td> </tr> <tr> <td align="left">c2c4</td> <td align="left">g7g6</td> <td align="right">0.013</td> <td align="right">0.017</td> <td align="right">0.757</td> <td align="right">0.449</td> <td align="right">2.20</td> </tr> <tr> <td align="left">c2c4</td> <td align="left">c7c6</td> <td align="right">0.066</td> <td align="right">0.014</td> <td align="right">4.739</td> <td align="right">0.000</td> <td align="right">11.00</td> </tr> <tr> <td align="left">c2c4</td> <td align="left">Other</td> <td align="right">0.121</td> 
<td align="right">0.017</td> <td align="right">7.210</td> <td align="right">0.000</td> <td align="right">21.00</td> </tr> <tr> <td align="left">c2c4</td> <td align="left">b7b5</td> <td align="right">0.162</td> <td align="right">0.007</td> <td align="right">22.357</td> <td align="right">0.000</td> <td align="right">28.00</td> </tr> <tr> <td align="left">c2c4</td> <td align="left">d7d5</td> <td align="right">0.446</td> <td align="right">0.014</td> <td align="right">32.091</td> <td align="right">0.000</td> <td align="right">78.00</td> </tr> <tr> <td align="left">e2e3</td> <td align="left">Elo</td> <td align="right">0.005</td> <td align="right">0.000</td> <td align="right">478.357</td> <td align="right">0.000</td> <td align="right">0.93</td> </tr> <tr> <td align="left">e2e3</td> <td align="left">e7e6</td> <td align="right">0.095</td> <td align="right">0.007</td> <td align="right">13.920</td> <td align="right">0.000</td> <td align="right">16.00</td> </tr> <tr> <td align="left">e2e3</td> <td align="left">b7b5</td> <td align="right">0.099</td> <td align="right">0.002</td> <td align="right">48.491</td> <td align="right">0.000</td> <td align="right">17.00</td> </tr> <tr> <td align="left">e2e3</td> <td align="left">c7c5</td> <td align="right">0.252</td> <td align="right">0.007</td> <td align="right">36.701</td> <td align="right">0.000</td> <td align="right">44.00</td> </tr> <tr> <td align="left">e2e3</td> <td align="left">b7b6</td> <td align="right">0.281</td> <td align="right">0.012</td> <td align="right">22.719</td> <td align="right">0.000</td> <td align="right">49.00</td> </tr> <tr> <td align="left">e2e3</td> <td align="left">Other</td> <td align="right">0.417</td> <td align="right">0.009</td> <td align="right">46.487</td> <td align="right">0.000</td> <td align="right">72.00</td> </tr> <tr> <td align="left">e2e3</td> <td align="left">a7a6</td> <td align="right">0.442</td> <td align="right">0.015</td> <td align="right">28.780</td> <td align="right">0.000</td> <td 
align="right">77.00</td> </tr> <tr> <td align="left">g2g4</td> <td align="left">Elo</td> <td align="right">0.005</td> <td align="right">0.000</td> <td align="right">129.990</td> <td align="right">0.000</td> <td align="right">0.87</td> </tr> <tr> <td align="left">g2g4</td> <td align="left">b7b6</td> <td align="right">-0.035</td> <td align="right">0.018</td> <td align="right">-1.966</td> <td align="right">0.049</td> <td align="right">-6.00</td> </tr> <tr> <td align="left">g2g4</td> <td align="left">g7g6</td> <td align="right">-0.014</td> <td align="right">0.013</td> <td align="right">-1.078</td> <td align="right">0.281</td> <td align="right">-2.30</td> </tr> <tr> <td align="left">g2g4</td> <td align="left">c7c5</td> <td align="right">0.056</td> <td align="right">0.035</td> <td align="right">1.591</td> <td align="right">0.112</td> <td align="right">9.60</td> </tr> <tr> <td align="left">g2g4</td> <td align="left">Other</td> <td align="right">0.225</td> <td align="right">0.019</td> <td align="right">12.147</td> <td align="right">0.000</td> <td align="right">39.00</td> </tr> <tr> <td align="left">g2g4</td> <td align="left">f7f5</td> <td align="right">0.240</td> <td align="right">0.009</td> <td align="right">25.263</td> <td align="right">0.000</td> <td align="right">42.00</td> </tr> <tr> <td align="left">g2g4</td> <td align="left">h7h5</td> <td align="right">0.437</td> <td align="right">0.020</td> <td align="right">22.023</td> <td align="right">0.000</td> <td align="right">76.00</td> </tr> </tbody> </table> <p>We see that Black's best reply to <em>e2e3</em> is actually <em>e7e6</em>, but <em>b7b5</em>, the start to the "suicide defense", is also very good.</p> <h2>Piece value</h2> <p>As I did for <a href="atomic-three">Atomic chess</a>, I also computed a difference in piece counts at random points over the lifetime of each game. 
As before, I use logistic regression to estimate the coefficients in the model </p> <div class="math">$$\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\left[\Delta e + c_P \Delta P + c_N \Delta N + c_B \Delta B + c_R \Delta R + c_Q \Delta Q + c_K \Delta K \right],$$</div> <p> where <span class="math">$$\Delta e$$</span> is the difference in Elo, and <span class="math">$$\Delta P, \Delta N, \Delta B, \Delta R, \Delta Q, \Delta K$$</span> are the differences in pawn, knight, bishop, rook, queen and king counts. Here <span class="math">$$p$$</span> is the probability that White wins the match. Because this is Antichess, one expects that having pieces is associated with losing, so I expect to see all the constants <span class="math">$$c$$</span> negative. However, the pieces whose constants are least negative are the more valuable pieces in Antichess.</p> <p>As previously, I subselect to games where each player has already recorded at least 50 games in the database, where both players have pre-game Elo at least 1500, and games which are at least 10 ply in length. As in my <a href="atomic-three">analysis of atomic</a> I have computed the differences in piece counts at some pseudo-random snapshot in each game. Thus the values below give a kind of 'average value' of the pieces over an antichess game. </p> <p>Here is a table of the estimated coefficients for the regression, with coefficients and standard errors in Elo equivalents, as well as Wald statistics. The intercept term can be interpreted as White's tempo advantage. We see that, indeed, the King is the least worst piece to have, then a knight, pawn, queen, bishop and rook. Note that the pre-game Elo score still shows an estimated value rather less than 1.
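</p> <p>The Elo-equivalent units in these tables are nothing mysterious: a raw logistic-regression coefficient (in natural-log odds) is just rescaled by 400 divided by log 10, about 173.7. A minimal sketch of the conversion in Python (illustrative only, not the code behind this analysis):</p>

```python
import math

# Conversion factor from a natural-log-odds coefficient to Elo units.
K = 400.0 / math.log(10)   # approximately 173.72

def elo_equivalent(coef):
    """Rescale a raw logistic regression coefficient into Elo units."""
    return K * coef

def odds_boost(elo):
    """Multiplicative boost to White's win odds from an effect worth `elo` Elo."""
    return 10.0 ** (elo / 400.0)

# White's tempo advantage of roughly 17 Elo (see the table below) boosts
# White's win odds by about 10 percent.
boost = odds_boost(17.0)
```

<p>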
There are issues with the Lichess Antichess Elo which should be examined further.</p> <table> <thead> <tr> <th align="left">term</th> <th align="right">Estimate</th> <th align="right">Std.Error</th> <th align="right">Statistic</th> </tr> </thead> <tbody> <tr> <td align="left">Elo</td> <td align="right">0.856</td> <td align="right">0.001</td> <td align="right">715.4</td> </tr> <tr> <td align="left">White Tempo</td> <td align="right">16.989</td> <td align="right">0.211</td> <td align="right">80.6</td> </tr> <tr> <td align="left">Pawn</td> <td align="right">-47.034</td> <td align="right">0.138</td> <td align="right">-339.7</td> </tr> <tr> <td align="left">Knight</td> <td align="right">-44.921</td> <td align="right">0.264</td> <td align="right">-170.2</td> </tr> <tr> <td align="left">Bishop</td> <td align="right">-69.601</td> <td align="right">0.292</td> <td align="right">-238.6</td> </tr> <tr> <td align="left">Rook</td> <td align="right">-76.817</td> <td align="right">0.296</td> <td align="right">-259.6</td> </tr> <tr> <td align="left">Queen</td> <td align="right">-63.006</td> <td align="right">0.423</td> <td align="right">-148.8</td> </tr> <tr> <td align="left">King</td> <td align="right">-9.317</td> <td align="right">0.391</td> <td align="right">-23.8</td> </tr> </tbody> </table> <script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#@#')) { var align = "center", indent = "0em", linebreak = "false"; if (false) { align = (screen.width < 768) ? "left" : align; indent = (screen.width < 768) ? "0em" : indent; linebreak = (screen.width < 768) ? 'true' : linebreak; } var mathjaxscript = document.createElement('script'); mathjaxscript.id = 'mathjaxscript_pelican_#%@#@#'; mathjaxscript.type = 'text/javascript'; mathjaxscript.src = 'https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML'; mathjaxscript[(window.opera ? 
"innerHTML" : "text")] = "MathJax.Hub.Config({" + " config: ['MMLorHTML.js']," + " TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' } }," + " jax: ['input/TeX','input/MathML','output/HTML-CSS']," + " extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," + " displayAlign: '"+ align +"'," + " displayIndent: '"+ indent +"'," + " showMathMenu: true," + " messageStyle: 'normal'," + " tex2jax: { " + " inlineMath: [ ['\\\$$','\\\$$'] ], " + " displayMath: [ ['$$','$$'] ]," + " processEscapes: true," + " preview: 'TeX'," + " }, " + " 'HTML-CSS': { " + " styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! important'} }," + " linebreaks: { automatic: "+ linebreak +", width: '90% container' }," + " }, " + "}); " + "if ('default' !== 'default') {" + "MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "}"; (document.body || document.getElementsByTagName('head')).appendChild(mathjaxscript); } </script>Atomic Piece Values, Again2021-05-31T21:46:54-07:002021-05-31T21:46:54-07:00Steven E. 
Pavtag:www.gilgamath.com,2021-05-31:/atomic-three.html<p>In a <a href="atomic-two">previous blog post</a> I used logistic regression to estimate the values of pieces in Atomic chess. In that study I computed material differences between the two players using a snapshot 8 plies before the end of the match. (A "ply" is a move by a single player.) That choice of snapshot was arbitrary, but it is typically late enough in the match so there is some material difference to measure, and also near enough to the end to estimate the "power" of each piece to bring victory. However, this valuation is rather late in the game, and is probably not representative of the average value of the pieces. That is, a knight advantage early in the game could be parlayed into a queen advantage later, which could then prove decisive. </p> <!-- PELICAN_END_SUMMARY --> <p>To fix that issue, I will re-perform that analysis on other snapshots. Recall that I am working from 9 million rated Atomic games that I downloaded from Lichess. For each match I selected a pseudo-random ply after the second and before the last ply of each game, uniformly. (There is no material difference before the third ply.) I also selected pseudo-random snapshots in the first third, the second third, and the last third of each match.
I compute the difference in material as well as differences in passed pawn counts for each snapshot. You can download <a href="https://drive.google.com/file/d/1RPMJZ4MvCf1VqkZUNpqcR85isK0cgDdU/view?usp=sharing">v2 of the data</a>, and the <a href="https://github.com/shabbychef/lichess-data">code</a>.</p> <p>Recall that I am using logistic regression to estimate coefficients in the model </p> <div class="math">$$\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\left[\Delta e + c_P \Delta P + c_K \Delta K + c_B \Delta B + c_R \Delta R + c_Q \Delta Q \right],$$</div> <p> where <span class="math">$$\Delta e$$</span> is the difference in Elo, and <span class="math">$$\Delta P, \Delta K, \Delta B, \Delta R, \Delta Q$$</span> are the differences in pawn, knight, bishop, rook and queen counts. Here <span class="math">$$p$$</span> is the probability that White wins the match. By putting the weird constant <span class="math">$$\operatorname{log}(10)/400$$</span> in front of the expression, the constants <span class="math">$$c_P, c_K$$</span> <em>etc.</em> are denominated in Elo equivalent units. Similarly, I fit a model with terms for passed pawn counts in various ranks.</p> <p>As previously, I subselect to games where each player has already recorded at least 50 games in the database, where both players have pre-game Elo at least 1500, and games which are at least 10 ply in length.</p> <p>Here is a table of the estimated coefficients for the four regressions, with coefficients and standard errors in Elo equivalents, as well as Wald statistics. The p-values all underflow to zero. 
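</p> <p>Since every term is denominated in Elo, a predicted win probability is just the sum of the Elo-equivalent effects pushed through the logistic curve. A small illustrative sketch in Python (not the code used for the analysis; the example numbers are the 'random' snapshot estimates from the table below):</p>

```python
def white_win_prob(delta_elo, tempo_elo, material_elo):
    """P(White wins) when log10-odds = (sum of Elo-denominated effects) / 400."""
    total = delta_elo + tempo_elo + material_elo
    return 1.0 / (1.0 + 10.0 ** (-total / 400.0))

# Evenly rated players, White's tempo advantage of ~53 Elo, but White
# down one pawn (~ -31 Elo): White is still a slight favorite.
p = white_win_prob(0.0, 53.136, -31.301)
```

<p>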
The intercept term can be interpreted as White's tempo advantage.</p> <table> <thead> <tr> <th align="left">snapshot</th> <th align="left">term</th> <th align="right">Estimate</th> <th align="right">Std.Error</th> <th align="right">Statistic</th> </tr> </thead> <tbody> <tr> <td align="left">random</td> <td align="left">Elo</td> <td align="right">0.957</td> <td align="right">0.001</td> <td align="right">665.0</td> </tr> <tr> <td align="left">random</td> <td align="left">White Tempo</td> <td align="right">53.136</td> <td align="right">0.281</td> <td align="right">188.9</td> </tr> <tr> <td align="left">random</td> <td align="left">Pawn</td> <td align="right">31.301</td> <td align="right">0.329</td> <td align="right">95.0</td> </tr> <tr> <td align="left">random</td> <td align="left">Knight</td> <td align="right">47.058</td> <td align="right">0.471</td> <td align="right">99.9</td> </tr> <tr> <td align="left">random</td> <td align="left">Bishop</td> <td align="right">57.368</td> <td align="right">0.484</td> <td align="right">118.4</td> </tr> <tr> <td align="left">random</td> <td align="left">Rook</td> <td align="right">105.767</td> <td align="right">0.718</td> <td align="right">147.4</td> </tr> <tr> <td align="left">random</td> <td align="left">Queen</td> <td align="right">244.058</td> <td align="right">0.890</td> <td align="right">274.2</td> </tr> <tr> <td align="left">first third</td> <td align="left">Elo</td> <td align="right">1.006</td> <td align="right">0.001</td> <td align="right">710.2</td> </tr> <tr> <td align="left">first third</td> <td align="left">White Tempo</td> <td align="right">62.620</td> <td align="right">0.270</td> <td align="right">232.1</td> </tr> <tr> <td align="left">first third</td> <td align="left">Pawn</td> <td align="right">58.935</td> <td align="right">0.697</td> <td align="right">84.6</td> </tr> <tr> <td align="left">first third</td> <td align="left">Knight</td> <td align="right">83.380</td> <td align="right">0.878</td> <td 
align="right">95.0</td> </tr> <tr> <td align="left">first third</td> <td align="left">Bishop</td> <td align="right">76.827</td> <td align="right">0.986</td> <td align="right">77.9</td> </tr> <tr> <td align="left">first third</td> <td align="left">Rook</td> <td align="right">80.067</td> <td align="right">1.754</td> <td align="right">45.7</td> </tr> <tr> <td align="left">first third</td> <td align="left">Queen</td> <td align="right">148.910</td> <td align="right">1.768</td> <td align="right">84.2</td> </tr> <tr> <td align="left">second third</td> <td align="left">Elo</td> <td align="right">0.955</td> <td align="right">0.001</td> <td align="right">666.7</td> </tr> <tr> <td align="left">second third</td> <td align="left">White Tempo</td> <td align="right">50.868</td> <td align="right">0.283</td> <td align="right">179.5</td> </tr> <tr> <td align="left">second third</td> <td align="left">Pawn</td> <td align="right">38.063</td> <td align="right">0.322</td> <td align="right">118.3</td> </tr> <tr> <td align="left">second third</td> <td align="left">Knight</td> <td align="right">57.589</td> <td align="right">0.449</td> <td align="right">128.2</td> </tr> <tr> <td align="left">second third</td> <td align="left">Bishop</td> <td align="right">59.246</td> <td align="right">0.462</td> <td align="right">128.1</td> </tr> <tr> <td align="left">second third</td> <td align="left">Rook</td> <td align="right">107.305</td> <td align="right">0.742</td> <td align="right">144.7</td> </tr> <tr> <td align="left">second third</td> <td align="left">Queen</td> <td align="right">210.759</td> <td align="right">0.917</td> <td align="right">229.8</td> </tr> <tr> <td align="left">last third</td> <td align="left">Elo</td> <td align="right">0.915</td> <td align="right">0.001</td> <td align="right">619.6</td> </tr> <tr> <td align="left">last third</td> <td align="left">White Tempo</td> <td align="right">45.537</td> <td align="right">0.292</td> <td align="right">155.9</td> </tr> <tr> <td align="left">last 
third</td> <td align="left">Pawn</td> <td align="right">26.008</td> <td align="right">0.264</td> <td align="right">98.5</td> </tr> <tr> <td align="left">last third</td> <td align="left">Knight</td> <td align="right">32.499</td> <td align="right">0.397</td> <td align="right">81.8</td> </tr> <tr> <td align="left">last third</td> <td align="left">Bishop</td> <td align="right">55.957</td> <td align="right">0.396</td> <td align="right">141.2</td> </tr> <tr> <td align="left">last third</td> <td align="left">Rook</td> <td align="right">109.550</td> <td align="right">0.545</td> <td align="right">201.2</td> </tr> <tr> <td align="left">last third</td> <td align="left">Queen</td> <td align="right">283.669</td> <td align="right">0.716</td> <td align="right">396.1</td> </tr> </tbody> </table> <p>The data are hard to digest in table form, so below I plot the coefficients for the four different snapshots, with standard error bars. We see that a queen is generally worth around 150-250 Elo points, a rook around 100 (though somewhat less early in the match), a bishop around 60 (more early in the match), a knight 30-80, and a pawn 15-60. As one progresses along the match (from first to second to last third), the queen and rook gain value, while the bishop, knight and pawn lose value.</p> <p><img src="https://www.gilgamath.com/figure/atomic_three_plot_est_one-1.png" title="plot of chunk plot_est_one" alt="plot of chunk plot_est_one" width="1000px" height="800px" /></p> <p>The top axis is denominated in 'pawn' units, where I eyeballed a pawn as worth around 30 Elo, but this is so variable over the different match snapshots it is hard to quote a consistent valuation scheme in pawn units. This is in contrast with the previous blog post where we suggested a 1:2.5:4:8:22 valuation for pawn, knight, bishop, rook, queen; that scheme is only appropriate for the very end of the match (and it can be hard to tell you are at the end of the match while playing). 
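</p> <p>The pawn-denominated values in the next table are simply each Elo estimate divided by the pawn's Elo estimate; a quick check in Python using the 'random' snapshot coefficients from the table above (illustrative only, not the analysis code):</p>

```python
# Elo-equivalent estimates from the 'random' snapshot regression.
elo_value = {"Pawn": 31.301, "Knight": 47.058, "Bishop": 57.368,
             "Rook": 105.767, "Queen": 244.058}

# Denominate each piece in pawns by dividing by the pawn's Elo value.
in_pawns = {piece: round(v / elo_value["Pawn"], 1)
            for piece, v in elo_value.items()}
```

<p>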
Below I denominate piece values relative to the estimated pawn value. (I drop the error bars because I am too lazy to code up the delta method.) For the random snapshot, the piece values in pawn units are as below.</p> <table> <thead> <tr> <th align="left">Piece</th> <th align="right">Value in pawns</th> </tr> </thead> <tbody> <tr> <td align="left">Knight</td> <td align="right">1.5</td> </tr> <tr> <td align="left">Bishop</td> <td align="right">1.8</td> </tr> <tr> <td align="left">Rook</td> <td align="right">3.4</td> </tr> <tr> <td align="left">Queen</td> <td align="right">7.8</td> </tr> </tbody> </table> <p><img src="https://www.gilgamath.com/figure/atomic_three_plot_est_one_b-1.png" title="plot of chunk plot_est_one_b" alt="plot of chunk plot_est_one_b" width="1000px" height="800px" /></p> <p>As in the <a href="atomic-two">previous blog post</a>, I ran the regressions again with filters for minimum Elo. The reasoning is that better players will exhibit higher quality play, rather than average play. Here are plots by minimum Elo, snapshot and piece. We see that better players are better able to capitalize on bishops and perhaps knights, but otherwise the valuations are largely consistent across player ability.</p> <p><img src="https://www.gilgamath.com/figure/atomic_three_plot_est_two-1.png" title="plot of chunk plot_est_two" alt="plot of chunk plot_est_two" width="1000px" height="800px" /></p> <h2>Passed pawns</h2> <p>As in the <a href="atomic-two">previous blog post</a>, I computed the difference in counts of passed pawns. I classified passed pawns as belonging to ranks 2, 3 or 4, to rank 5, to rank 6, or to rank 7; any pawn on rank 7 is automatically a passed pawn. When computing the material difference from White's point of view, the ranks are mirrored for Black in the obvious way.
Again, I fit a model of the form </p> <div class="math">$$\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\left[\Delta e + c_{234} \Delta PP_{234} + c_{5} \Delta PP_{5} + c_{6} \Delta PP_{6} + c_{7} \Delta PP_{7} \right].$$</div> <p>Here are the estimated regression coefficients and standard errors denominated in Elo, along with the Wald statistics.<br> Below I plot the coefficients. It should seem odd to you that high rank passed pawns can sometimes have <em>negative</em> value. For example, a pawn on the seventh rank in the first third of a match appears to be worth around -15 Elo. The reason for this apparent contradiction is that this valuation is conditional on taking a snapshot in the first third of a match, but in most situations if you have a pawn on the 7th rank, it can often quickly lead to a victory. We are, however, looking at those cases where it does not. For this analysis it probably makes more sense to look at the random snapshot to get an "average" value.</p> <table> <thead> <tr> <th align="left">snapshot</th> <th align="left">term</th> <th align="right">Estimate</th> <th align="right">Std.Error</th> <th align="right">Statistic</th> </tr> </thead> <tbody> <tr> <td align="left">random</td> <td align="left">Elo</td> <td align="right">1.01</td> <td align="right">0.001</td> <td align="right">715.29</td> </tr> <tr> <td align="left">random</td> <td align="left">White Tempo</td> <td align="right">64.87</td> <td align="right">0.261</td> <td align="right">248.86</td> </tr> <tr> <td align="left">random</td> <td align="left">P.P. Rank234</td> <td align="right">59.01</td> <td align="right">0.955</td> <td align="right">61.81</td> </tr> <tr> <td align="left">random</td> <td align="left">P.P. Rank5</td> <td align="right">39.98</td> <td align="right">1.415</td> <td align="right">28.25</td> </tr> <tr> <td align="left">random</td> <td align="left">P.P. 
Rank6</td> <td align="right">27.96</td> <td align="right">1.316</td> <td align="right">21.24</td> </tr> <tr> <td align="left">random</td> <td align="left">P.P. Rank7</td> <td align="right">69.49</td> <td align="right">1.408</td> <td align="right">49.34</td> </tr> <tr> <td align="left">first third</td> <td align="left">Elo</td> <td align="right">1.02</td> <td align="right">0.001</td> <td align="right">719.17</td> </tr> <tr> <td align="left">first third</td> <td align="left">White Tempo</td> <td align="right">66.06</td> <td align="right">0.260</td> <td align="right">254.11</td> </tr> <tr> <td align="left">first third</td> <td align="left">P.P. Rank234</td> <td align="right">147.94</td> <td align="right">5.674</td> <td align="right">26.07</td> </tr> <tr> <td align="left">first third</td> <td align="left">P.P. Rank5</td> <td align="right">68.32</td> <td align="right">7.697</td> <td align="right">8.88</td> </tr> <tr> <td align="left">first third</td> <td align="left">P.P. Rank6</td> <td align="right">-8.71</td> <td align="right">7.618</td> <td align="right">-1.14</td> </tr> <tr> <td align="left">first third</td> <td align="left">P.P. Rank7</td> <td align="right">-15.68</td> <td align="right">11.592</td> <td align="right">-1.35</td> </tr> <tr> <td align="left">second third</td> <td align="left">Elo</td> <td align="right">1.01</td> <td align="right">0.001</td> <td align="right">715.28</td> </tr> <tr> <td align="left">second third</td> <td align="left">White Tempo</td> <td align="right">64.50</td> <td align="right">0.261</td> <td align="right">247.28</td> </tr> <tr> <td align="left">second third</td> <td align="left">P.P. Rank234</td> <td align="right">106.15</td> <td align="right">1.174</td> <td align="right">90.42</td> </tr> <tr> <td align="left">second third</td> <td align="left">P.P. Rank5</td> <td align="right">73.42</td> <td align="right">1.692</td> <td align="right">43.39</td> </tr> <tr> <td align="left">second third</td> <td align="left">P.P. 
Rank6</td> <td align="right">46.71</td> <td align="right">1.655</td> <td align="right">28.22</td> </tr> <tr> <td align="left">second third</td> <td align="left">P.P. Rank7</td> <td align="right">51.19</td> <td align="right">1.955</td> <td align="right">26.19</td> </tr> <tr> <td align="left">last third</td> <td align="left">Elo</td> <td align="right">1.01</td> <td align="right">0.001</td> <td align="right">711.41</td> </tr> <tr> <td align="left">last third</td> <td align="left">White Tempo</td> <td align="right">63.95</td> <td align="right">0.261</td> <td align="right">244.61</td> </tr> <tr> <td align="left">last third</td> <td align="left">P.P. Rank234</td> <td align="right">44.47</td> <td align="right">0.651</td> <td align="right">68.36</td> </tr> <tr> <td align="left">last third</td> <td align="left">P.P. Rank5</td> <td align="right">33.59</td> <td align="right">0.972</td> <td align="right">34.55</td> </tr> <tr> <td align="left">last third</td> <td align="left">P.P. Rank6</td> <td align="right">25.85</td> <td align="right">0.888</td> <td align="right">29.10</td> </tr> <tr> <td align="left">last third</td> <td align="left">P.P. Rank7</td> <td align="right">77.31</td> <td align="right">0.926</td> <td align="right">83.50</td> </tr> </tbody> </table> <p><img src="https://www.gilgamath.com/figure/atomic_three_plot_ppest_one-1.png" title="plot of chunk plot_ppest_one" alt="plot of chunk plot_ppest_one" width="1000px" height="800px" /></p> <h2>Spline Regressions</h2> <p>It is hard to interpret the coefficients because the value of pieces appears to depend on the phase of play. To remedy this, I interacted the material difference with some spline terms. That is, I compute some spline functions of the ply at which the snapshot is taken, then multiply those by the material differences. I then estimate the coefficients of the interaction terms, and combine them with the spline functions. This gives material value as a function of the ply, which I plot below.
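</p> <p>Concretely, the interaction features look like the following sketch (Python; a hand-rolled truncated-linear basis with made-up knots stands in for the actual spline basis used in the fit):</p>

```python
def spline_basis(ply, knots=(10, 20, 40)):
    """A simple truncated-linear spline basis in the ply: an intercept,
    the ply itself, and hinge terms max(0, ply - knot).
    The knots here are illustrative placeholders."""
    return [1.0, float(ply)] + [max(0.0, ply - k) for k in knots]

def interaction_features(ply, material_diffs):
    """Multiply each material difference (pawns, knights, ...) by each
    basis function; fitting coefficients to these products and recombining
    them with the basis gives piece value as a function of the ply."""
    basis = spline_basis(ply)
    return [b * d for d in material_diffs for b in basis]

# Snapshot at ply 25, White up one pawn and down one bishop, say:
feats = interaction_features(25, [1, 0, -1])
```

<p>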
I did this two different ways: once computing splines over the raw ply, and then using the ply divided by the total ply. In the latter formulation you can view the valuation as percent progress in the match. The problem with this formulation is that you typically do not know how far along in the match you are. On the other hand, because matches progress at different speeds, basing value on the raw ply also seems flawed.</p> <p>For the raw ply regression, we plot all pieces in a single facet. For the percentage regression, this results in a visually unreadable plot, so we use separate facets for the different pieces. For the raw ply regression we see near equal values of the bishop and knight through most of the match; rooks increase in value after the 20th ply; queens are valuable from early in the match, and increase in value as the match progresses. This phenomenon is intuitive, as the queen is less likely to be captured later in the match when there are fewer pieces on the board.</p> <p><img src="https://www.gilgamath.com/figure/atomic_three_plot_spline_one-1.png" title="plot of chunk plot_spline_one" alt="plot of chunk plot_spline_one" width="1000px" height="800px" /></p> <p><img src="https://www.gilgamath.com/figure/atomic_three_plot_spline_two-1.png" title="plot of chunk plot_spline_two" alt="plot of chunk plot_spline_two" width="1000px" height="800px" /></p> <p>Below I express those estimates relative to the estimated value of a pawn. Again we lose the standard error bars. For the raw ply regression, we see a bulge at around 25 ply where pawns have very low value, and queens peak.
A different pattern emerges in the percent ply regressions, where queens increase in value steadily over the course of the match.</p> <p><img src="https://www.gilgamath.com/figure/atomic_three_plot_spline_rel_one-1.png" title="plot of chunk plot_spline_rel_one" alt="plot of chunk plot_spline_rel_one" width="1000px" height="800px" /> <img src="https://www.gilgamath.com/figure/atomic_three_plot_spline_rel_two-1.png" title="plot of chunk plot_spline_rel_two" alt="plot of chunk plot_spline_rel_two" width="1000px" height="800px" /></p> <h2>Future work</h2> <p>The analysis here indicates we need a better measure of match progress, one which can be computed in real time, but which matches the tempo of the particular match. It would seem that something like total material on the board would be a good measure. This is intuitive, as crowded positions are dangerous in Atomic and can quickly lead to large changes in the material difference. I also want to perform an analysis using survival analysis.</p> <script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#@#')) { var align = "center", indent = "0em", linebreak = "false"; if (false) { align = (screen.width < 768) ? "left" : align; indent = (screen.width < 768) ? "0em" : indent; linebreak = (screen.width < 768) ? 'true' : linebreak; } var mathjaxscript = document.createElement('script'); mathjaxscript.id = 'mathjaxscript_pelican_#%@#@#'; mathjaxscript.type = 'text/javascript'; mathjaxscript.src = 'https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML'; mathjaxscript[(window.opera ? 
"innerHTML" : "text")] = "MathJax.Hub.Config({" + " config: ['MMLorHTML.js']," + " TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' } }," + " jax: ['input/TeX','input/MathML','output/HTML-CSS']," + " extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," + " displayAlign: '"+ align +"'," + " displayIndent: '"+ indent +"'," + " showMathMenu: true," + " messageStyle: 'normal'," + " tex2jax: { " + " inlineMath: [ ['\\\$$','\\\$$'] ], " + " displayMath: [ ['$$','$$'] ]," + " processEscapes: true," + " preview: 'TeX'," + " }, " + " 'HTML-CSS': { " + " styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! important'} }," + " linebreaks: { automatic: "+ linebreak +", width: '90% container' }," + " }, " + "}); " + "if ('default' !== 'default') {" + "MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "}"; (document.body || document.getElementsByTagName('head')).appendChild(mathjaxscript); } </script>Atomic Piece Values2021-05-10T21:53:59-07:002021-05-10T21:53:59-07:00Steven E. Pavtag:www.gilgamath.com,2021-05-10:/atomic-two.html<p>Most chess playing computer programs use forward search over the tree of possible moves. 
Because such a search cannot examine every branch to the termination of the game, leaf nodes in the tree are usually given a "static" evaluation formed by combining a number of scoring rules. These typically include a term for the material balance of the position.<br> In traditional chess the pieces are usually assigned scores of 1 point for pawns, around 3 points for knights and bishops, 5 for rooks, and 9 for queens. Human players often use this heuristic when considering exchanges.</p> <p>I recently started playing a chess variant called <a href="https://en.wikipedia.org/wiki/Atomic_chess">Atomic chess</a>. In Atomic, when a piece captures another, both are removed from the board, along with all non-pawn pieces in the up to eight adjacent squares. The idea is that a capture causes an 'explosion'. Lichess plays a delightful explosion noise when this happens.</p> <p>The traditional scoring heuristic is apparently based on the mobility of the pieces. While pieces move the same way in the Atomic variant, I suspect that traditional scoring is not well calibrated for Atomic: a piece can capture only once in Atomic; a piece can remove multiple pieces from the board in one capture; pieces have value as protective 'chaff'; Kings cannot capture pieces, so solo mates are possible; pawns on the seventh rank can trap high-value pieces by threatening promotion; there are numerous fools' mates involving knights, <em>etc.</em> Can we create a scoring heuristic calibrated for Atomic?</p> <!-- PELICAN_END_SUMMARY --> <p>The problem would seem intractable from first principles, because piece value in Atomic is so different from average piece mobility. Instead, perhaps we can infer a kind of average value for pieces. In a <a href="atomic-one">previous blog post</a> I performed a quick analysis of Atomic openings on a database of around 9 million games played on Lichess. I used logistic regression to generalize Elo scores. 
Here I will pursue the same approach.</p> <p>Suppose you took a snapshot of a game at some point, then computed the difference of White's pawn count minus Black's pawn count, White's knight count minus Black's, and so on. Let these be called <span class="math">$$\Delta P, \Delta K, \Delta B, \Delta R, \Delta Q.$$</span> Let <span class="math">$$p$$</span> be the probability that White wins the game. Let <span class="math">$$\Delta e$$</span> be White's Elo minus Black's. I will estimate a model of the form </p> <div class="math">$$\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\left[\Delta e + c_P \Delta P + c_K \Delta K + c_B \Delta B + c_R \Delta R + c_Q \Delta Q \right].$$</div> <p> By putting the weird constant <span class="math">$$\operatorname{log}(10)/400$$</span> in front of the expression, the constants <span class="math">$$c_P, c_K$$</span> <em>etc.</em> are denominated in Elo equivalent units.</p> <p>On reflection, I probably should have selected a random point uniformly over the lifetime of each match in my database to compute the material difference. Instead I selected four points to snapshot the material difference: just prior to the last move, two 'ply' prior to the match end, as well as four ply and eight ply. (In computer chess a 'ply' is a piece move by one player, while 'move' apparently refers to two ply.) This choice will have consequences, and I will have to consider the random snapshot approach if I ever write this up for real.</p> <h2>Regressions</h2> <p>I wrote some <a href="https://github.com/shabbychef/lichess-data">code</a> that will download and parse Lichess' public data, turning it into a CSV file. You can <a href="https://drive.google.com/file/d/1YqOFKZlCQoWJBUT9B3HI7q0i6DHvT9or/view?usp=sharing">download</a> v1 of this file, but Lichess is the ultimate copyright holder. Here I will consider games which end by the 'Normal' condition (checkmate or what passes for it in Atomic). 
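</p> <p>As a concrete sketch, the model above can be read as a win-probability calculator. The function below is illustrative only: the coefficient values are rounded Elo-equivalents in the spirit of the estimates tabulated later in this post (pawn 15, knight 30, bishop 60, rook 100, queen 300, White tempo 50), not the fitted values themselves.</p>

```python
import math

# Scaling constant that denominates coefficients in Elo-equivalent units.
LOG10_OVER_400 = math.log(10.0) / 400.0

def white_win_prob(delta_elo, d_pawn=0, d_knight=0, d_bishop=0,
                   d_rook=0, d_queen=0,
                   coefs=(15.0, 30.0, 60.0, 100.0, 300.0), tempo=50.0):
    """Predicted probability of a White win under the logistic model.

    Material differences are White's count minus Black's; `coefs` and
    `tempo` are rounded, illustrative Elo-equivalent values.
    """
    c_p, c_k, c_b, c_r, c_q = coefs
    elo_equiv = (delta_elo + tempo
                 + c_p * d_pawn + c_k * d_knight + c_b * d_bishop
                 + c_r * d_rook + c_q * d_queen)
    log_odds = LOG10_OVER_400 * elo_equiv
    return 1.0 / (1.0 + math.exp(-log_odds))
```

<p>For evenly matched players, being up a rook then enters the model like a roughly 100 point Elo edge on top of White's tempo advantage.</p> <p>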
I subselect to games where each player has already recorded at least 50 games in the database. I also only take games where both players have pre-game Elo at least 1500. I also subselect to games which are at least 10 ply, which excludes many fool's mates.</p> <p>Here is a table of the estimated coefficients, translated into Elo equivalents, as well as standard errors, Wald statistics and p-values (which all underflow to zero). The intercept term can be interpreted as White's tempo advantage, as outlined in the previous blog post.</p> <table> <thead> <tr> <th align="left">ply prior</th> <th align="left">term</th> <th align="right">Estimate</th> <th align="right">Elo equiv.</th> <th align="right">Std.Error</th> <th align="right">Statistic</th> <th align="right">P.value</th> </tr> </thead> <tbody> <tr> <td align="left">1 ply prior</td> <td align="left">Elo</td> <td align="right">0.005</td> <td align="right">0.919</td> <td align="right">0.000</td> <td align="right">604.1</td> <td align="right">0</td> </tr> <tr> <td align="left">1 ply prior</td> <td align="left">White Tempo</td> <td align="right">0.283</td> <td align="right">49.220</td> <td align="right">0.002</td> <td align="right">163.4</td> <td align="right">0</td> </tr> <tr> <td align="left">1 ply prior</td> <td align="left">Pawn</td> <td align="right">0.054</td> <td align="right">9.447</td> <td align="right">0.002</td> <td align="right">36.1</td> <td align="right">0</td> </tr> <tr> <td align="left">1 ply prior</td> <td align="left">Knight</td> <td align="right">0.161</td> <td align="right">27.889</td> <td align="right">0.002</td> <td align="right">69.0</td> <td align="right">0</td> </tr> <tr> <td align="left">1 ply prior</td> <td align="left">Bishop</td> <td align="right">0.329</td> <td align="right">57.106</td> <td align="right">0.002</td> <td align="right">143.8</td> <td align="right">0</td> </tr> <tr> <td align="left">1 ply prior</td> <td align="left">Rook</td> <td align="right">0.581</td> <td 
align="right">100.961</td> <td align="right">0.003</td> <td align="right">198.5</td> <td align="right">0</td> </tr> <tr> <td align="left">1 ply prior</td> <td align="left">Queen</td> <td align="right">1.792</td> <td align="right">311.275</td> <td align="right">0.004</td> <td align="right">490.7</td> <td align="right">0</td> </tr> <tr> <td align="left">2 ply prior</td> <td align="left">Elo</td> <td align="right">0.005</td> <td align="right">0.910</td> <td align="right">0.000</td> <td align="right">595.8</td> <td align="right">0</td> </tr> <tr> <td align="left">2 ply prior</td> <td align="left">White Tempo</td> <td align="right">0.277</td> <td align="right">48.169</td> <td align="right">0.002</td> <td align="right">158.9</td> <td align="right">0</td> </tr> <tr> <td align="left">2 ply prior</td> <td align="left">Pawn</td> <td align="right">0.074</td> <td align="right">12.882</td> <td align="right">0.002</td> <td align="right">48.2</td> <td align="right">0</td> </tr> <tr> <td align="left">2 ply prior</td> <td align="left">Knight</td> <td align="right">0.191</td> <td align="right">33.185</td> <td align="right">0.002</td> <td align="right">80.9</td> <td align="right">0</td> </tr> <tr> <td align="left">2 ply prior</td> <td align="left">Bishop</td> <td align="right">0.346</td> <td align="right">60.043</td> <td align="right">0.002</td> <td align="right">147.9</td> <td align="right">0</td> </tr> <tr> <td align="left">2 ply prior</td> <td align="left">Rook</td> <td align="right">0.611</td> <td align="right">106.072</td> <td align="right">0.003</td> <td align="right">202.7</td> <td align="right">0</td> </tr> <tr> <td align="left">2 ply prior</td> <td align="left">Queen</td> <td align="right">1.935</td> <td align="right">336.070</td> <td align="right">0.004</td> <td align="right">497.7</td> <td align="right">0</td> </tr> <tr> <td align="left">4 ply prior</td> <td align="left">Elo</td> <td align="right">0.005</td> <td align="right">0.906</td> <td align="right">0.000</td> <td 
align="right">598.7</td> <td align="right">0</td> </tr> <tr> <td align="left">4 ply prior</td> <td align="left">White Tempo</td> <td align="right">0.271</td> <td align="right">47.075</td> <td align="right">0.002</td> <td align="right">157.0</td> <td align="right">0</td> </tr> <tr> <td align="left">4 ply prior</td> <td align="left">Pawn</td> <td align="right">0.095</td> <td align="right">16.581</td> <td align="right">0.002</td> <td align="right">61.2</td> <td align="right">0</td> </tr> <tr> <td align="left">4 ply prior</td> <td align="left">Knight</td> <td align="right">0.196</td> <td align="right">34.109</td> <td align="right">0.002</td> <td align="right">82.3</td> <td align="right">0</td> </tr> <tr> <td align="left">4 ply prior</td> <td align="left">Bishop</td> <td align="right">0.345</td> <td align="right">59.930</td> <td align="right">0.002</td> <td align="right">145.1</td> <td align="right">0</td> </tr> <tr> <td align="left">4 ply prior</td> <td align="left">Rook</td> <td align="right">0.670</td> <td align="right">116.425</td> <td align="right">0.003</td> <td align="right">215.3</td> <td align="right">0</td> </tr> <tr> <td align="left">4 ply prior</td> <td align="left">Queen</td> <td align="right">1.979</td> <td align="right">343.745</td> <td align="right">0.004</td> <td align="right">469.7</td> <td align="right">0</td> </tr> <tr> <td align="left">8 ply prior</td> <td align="left">Elo</td> <td align="right">0.005</td> <td align="right">0.920</td> <td align="right">0.000</td> <td align="right">620.2</td> <td align="right">0</td> </tr> <tr> <td align="left">8 ply prior</td> <td align="left">White Tempo</td> <td align="right">0.287</td> <td align="right">49.879</td> <td align="right">0.002</td> <td align="right">171.5</td> <td align="right">0</td> </tr> <tr> <td align="left">8 ply prior</td> <td align="left">Pawn</td> <td align="right">0.109</td> <td align="right">18.981</td> <td align="right">0.002</td> <td align="right">67.4</td> <td align="right">0</td> </tr> 
<tr> <td align="left">8 ply prior</td> <td align="left">Knight</td> <td align="right">0.318</td> <td align="right">55.225</td> <td align="right">0.003</td> <td align="right">126.3</td> <td align="right">0</td> </tr> <tr> <td align="left">8 ply prior</td> <td align="left">Bishop</td> <td align="right">0.371</td> <td align="right">64.415</td> <td align="right">0.002</td> <td align="right">148.5</td> <td align="right">0</td> </tr> <tr> <td align="left">8 ply prior</td> <td align="left">Rook</td> <td align="right">0.770</td> <td align="right">133.799</td> <td align="right">0.003</td> <td align="right">228.4</td> <td align="right">0</td> </tr> <tr> <td align="left">8 ply prior</td> <td align="left">Queen</td> <td align="right">1.915</td> <td align="right">332.664</td> <td align="right">0.005</td> <td align="right">403.1</td> <td align="right">0</td> </tr> </tbody> </table> <p>Below I plot these for the four different snapshots, with standard error bars. We see that a queen is generally worth around 300 Elo points (!), a rook around 100, a bishop around 60, a knight 30, and a pawn 15. As one gets closer to the end of the game (smaller ply prior), the pieces are generally worth less, which suggests there is typically some sacrifice of pieces for the winning side just prior to the last move.</p> <p>The top axis is denominated in 'pawn' units, where I eyeballed a pawn as worth around 15 Elo. From this it seems that a knight is worth about 2.5 pawns, a bishop 4, a rook 8, and a queen 22. Note that knights appear to have higher value earlier in the game.</p> <p><img src="https://www.gilgamath.com/figure/atomic_two_plot_est_one-1.png" title="plot of chunk plot_est_one" alt="plot of chunk plot_est_one" width="1000px" height="800px" /></p> <p>One problem I faced in the opening move analysis is that the regressions capture <em>average</em> play, not optimal play. That is, because of human error, tight time controls, and so on, there are many mistakes in the database. 
One way to try to control for that is to restrict to higher Elo players. This will add noise to the regressions because of the smaller sample size, but can perhaps give an indication of the near-optimal value of pieces. So I re-ran the regressions, filtering to games where both players had Elo greater than 1700, greater than 1900, and greater than 2100. I plot the regression coefficients below, with a different facet for each piece. We see that more skilled players are better able to take advantage of a material difference, and thus the pieces are worth more in the higher Elo matches. And perhaps bishops are more dangerous in the hands of a skilled player than an average player, worth perhaps 5 or 6 pawns. But otherwise the value of pieces under skilled play is close to that under average play, and certainly different from the piece values used in traditional chess.</p> <p><img src="https://www.gilgamath.com/figure/atomic_two_plot_est_two-1.png" title="plot of chunk plot_est_two" alt="plot of chunk plot_est_two" width="1000px" height="800px" /></p> <h2>Passed pawns</h2> <p>My data processing also computes the imbalance of passed pawns. A passed pawn is one which can be neither blocked nor captured by an enemy pawn. In the figure below, the White pawn on <em>d5</em> is a passed pawn. The Black pawn on <em>b5</em> might be a passed pawn, depending on whether it can be taken <em>en passant</em> in the next move by White's pawn on <em>a5</em>.</p> <p><img src="https://www.gilgamath.com/figure/atomic_two_passed_pawn_fig-1.png" title="plot of chunk passed_pawn_fig" alt="plot of chunk passed_pawn_fig" width="700px" height="560px" /></p> <p>My processing does not take into account the <em>en passant</em> condition, as it was too tricky to implement and unlikely to have much effect. I also group passed pawns as belonging to ranks 2, 3 or 4, to rank 5, to rank 6, or to rank 7. Any pawn on rank 7 is automatically a passed pawn. 
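</p> <p>The passed pawn test itself can be sketched in a few lines. This is a hypothetical coordinate representation (pawns as integer file and rank pairs, files 1 through 8 with <em>a</em> as 1, White moving toward rank 8), not my actual processing code, and like my processing it ignores <em>en passant</em>.</p>

```python
def is_passed_white_pawn(pawn, black_pawns):
    """True if a White pawn has no enemy pawn ahead of it on its own
    file (which could block it) or on an adjacent file (which could
    capture it). Pawns are (file, rank) pairs, ranks 1..8."""
    f, r = pawn
    return not any(bf in (f - 1, f, f + 1) and br > r
                   for (bf, br) in black_pawns)

def passed_rank_group(rank):
    """Bucket a White passed pawn into the rank groups used here:
    '234', '5', '6', or '7'."""
    return "234" if rank <= 4 else str(rank)
```

<p>With White's pawn on <em>d5</em>, for example, a Black pawn on <em>e6</em> means it is not passed, while a Black pawn on <em>g6</em> leaves it passed.</p> <p>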
When computing the material difference from White's point of view, the ranks are mirrored for Black in the obvious way. I fit a model of the form </p> <div class="math">$$\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\left[\Delta e + c_{234} \Delta PP_{234} + c_{5} \Delta PP_{5} + c_{6} \Delta PP_{6} + c_{7} \Delta PP_{7} \right].$$</div> <p>Here are the estimated regression coefficients, also denominated in Elo, along with the supporting statistics. Below I plot the coefficients. Not surprisingly, passed pawns are more valuable in the earlier snapshots, when there are more moves remaining until the end of the game. This is because a passed pawn by itself is less of a threat to the king than whatever major piece it might promote into. The negative values for snapshots close to the end of the game are a function of our selection mechanism. On average I expect passed pawns to have a net positive value.</p> <table> <thead> <tr> <th align="left">ply prior</th> <th align="left">term</th> <th align="right">Estimate</th> <th align="right">Elo equiv.</th> <th align="right">Std.Error</th> <th align="right">Statistic</th> <th align="right">P.value</th> </tr> </thead> <tbody> <tr> <td align="left">1 ply prior</td> <td align="left">Elo</td> <td align="right">0.006</td> <td align="right">1.02</td> <td align="right">0.000</td> <td align="right">716.49</td> <td align="right">0</td> </tr> <tr> <td align="left">1 ply prior</td> <td align="left">White Tempo</td> <td align="right">0.384</td> <td align="right">66.68</td> <td align="right">0.002</td> <td align="right">254.71</td> <td align="right">0</td> </tr> <tr> <td align="left">1 ply prior</td> <td align="left">P.P. Rank234</td> <td align="right">0.112</td> <td align="right">19.49</td> <td align="right">0.003</td> <td align="right">33.76</td> <td align="right">0</td> </tr> <tr> <td align="left">1 ply prior</td> <td align="left">P.P. 
Rank5</td> <td align="right">-0.211</td> <td align="right">-36.73</td> <td align="right">0.006</td> <td align="right">-38.03</td> <td align="right">0</td> </tr> <tr> <td align="left">1 ply prior</td> <td align="left">P.P. Rank6</td> <td align="right">-0.427</td> <td align="right">-74.20</td> <td align="right">0.005</td> <td align="right">-86.12</td> <td align="right">0</td> </tr> <tr> <td align="left">1 ply prior</td> <td align="left">P.P. Rank7</td> <td align="right">-0.033</td> <td align="right">-5.77</td> <td align="right">0.005</td> <td align="right">-7.27</td> <td align="right">0</td> </tr> <tr> <td align="left">2 ply prior</td> <td align="left">Elo</td> <td align="right">0.006</td> <td align="right">1.02</td> <td align="right">0.000</td> <td align="right">716.65</td> <td align="right">0</td> </tr> <tr> <td align="left">2 ply prior</td> <td align="left">White Tempo</td> <td align="right">0.384</td> <td align="right">66.77</td> <td align="right">0.002</td> <td align="right">255.08</td> <td align="right">0</td> </tr> <tr> <td align="left">2 ply prior</td> <td align="left">P.P. Rank234</td> <td align="right">0.092</td> <td align="right">15.95</td> <td align="right">0.003</td> <td align="right">27.20</td> <td align="right">0</td> </tr> <tr> <td align="left">2 ply prior</td> <td align="left">P.P. Rank5</td> <td align="right">-0.261</td> <td align="right">-45.27</td> <td align="right">0.006</td> <td align="right">-46.76</td> <td align="right">0</td> </tr> <tr> <td align="left">2 ply prior</td> <td align="left">P.P. Rank6</td> <td align="right">-0.407</td> <td align="right">-70.67</td> <td align="right">0.005</td> <td align="right">-82.20</td> <td align="right">0</td> </tr> <tr> <td align="left">2 ply prior</td> <td align="left">P.P. 
Rank7</td> <td align="right">0.068</td> <td align="right">11.78</td> <td align="right">0.005</td> <td align="right">14.73</td> <td align="right">0</td> </tr> <tr> <td align="left">4 ply prior</td> <td align="left">Elo</td> <td align="right">0.006</td> <td align="right">1.01</td> <td align="right">0.000</td> <td align="right">715.26</td> <td align="right">0</td> </tr> <tr> <td align="left">4 ply prior</td> <td align="left">White Tempo</td> <td align="right">0.379</td> <td align="right">65.92</td> <td align="right">0.002</td> <td align="right">252.11</td> <td align="right">0</td> </tr> <tr> <td align="left">4 ply prior</td> <td align="left">P.P. Rank234</td> <td align="right">0.104</td> <td align="right">18.11</td> <td align="right">0.003</td> <td align="right">29.93</td> <td align="right">0</td> </tr> <tr> <td align="left">4 ply prior</td> <td align="left">P.P. Rank5</td> <td align="right">-0.188</td> <td align="right">-32.70</td> <td align="right">0.006</td> <td align="right">-34.01</td> <td align="right">0</td> </tr> <tr> <td align="left">4 ply prior</td> <td align="left">P.P. Rank6</td> <td align="right">-0.160</td> <td align="right">-27.87</td> <td align="right">0.005</td> <td align="right">-33.40</td> <td align="right">0</td> </tr> <tr> <td align="left">4 ply prior</td> <td align="left">P.P. Rank7</td> <td align="right">0.292</td> <td align="right">50.75</td> <td align="right">0.005</td> <td align="right">62.64</td> <td align="right">0</td> </tr> <tr> <td align="left">8 ply prior</td> <td align="left">Elo</td> <td align="right">0.006</td> <td align="right">1.01</td> <td align="right">0.000</td> <td align="right">711.25</td> <td align="right">0</td> </tr> <tr> <td align="left">8 ply prior</td> <td align="left">White Tempo</td> <td align="right">0.371</td> <td align="right">64.45</td> <td align="right">0.002</td> <td align="right">246.41</td> <td align="right">0</td> </tr> <tr> <td align="left">8 ply prior</td> <td align="left">P.P. 
Rank234</td> <td align="right">0.158</td> <td align="right">27.47</td> <td align="right">0.004</td> <td align="right">41.53</td> <td align="right">0</td> </tr> <tr> <td align="left">8 ply prior</td> <td align="left">P.P. Rank5</td> <td align="right">0.106</td> <td align="right">18.43</td> <td align="right">0.006</td> <td align="right">19.17</td> <td align="right">0</td> </tr> <tr> <td align="left">8 ply prior</td> <td align="left">P.P. Rank6</td> <td align="right">0.215</td> <td align="right">37.42</td> <td align="right">0.005</td> <td align="right">43.94</td> <td align="right">0</td> </tr> <tr> <td align="left">8 ply prior</td> <td align="left">P.P. Rank7</td> <td align="right">0.560</td> <td align="right">97.25</td> <td align="right">0.005</td> <td align="right">109.48</td> <td align="right">0</td> </tr> </tbody> </table> <p><img src="https://www.gilgamath.com/figure/atomic_two_plot_ppest_one-1.png" title="plot of chunk plot_ppest_one" alt="plot of chunk plot_ppest_one" width="1000px" height="800px" /></p> <p>I perform the regressions on data filtered for minimum player Elo, as above. Below I plot the coefficients for a 1500, 1700, 1900 and 2100 minimum Elo limit. We see that passed pawns have somewhat lower value for games between skilled players, as presumably they can handle the threat better than average players.</p> <p><img src="https://www.gilgamath.com/figure/atomic_two_plot_ppest_two-1.png" title="plot of chunk plot_ppest_two" alt="plot of chunk plot_ppest_two" width="1000px" height="800px" /></p> <p>I also performed 'kitchen sink' regressions with piece count differences and passed pawn count differences. Filtering by minimum Elo, the coefficients are plotted below. 
These do not change the regression results we see above by much, but one should recognize that pawn promotion means that passed pawn count and major piece count are not uncorrelated, and the regressions should probably be performed like this, with all terms included.</p> <p><img src="https://www.gilgamath.com/figure/atomic_two_plot_ksest_two-1.png" title="plot of chunk plot_ksest_two" alt="plot of chunk plot_ksest_two" width="1000px" height="800px" /></p>Atomic Openings2021-05-07T21:51:52-07:002021-05-07T21:51:52-07:00Steven E. Pavtag:www.gilgamath.com,2021-05-07:/atomic-one.html<p>I've started playing a variant of chess called <a href="https://en.wikipedia.org/wiki/Atomic_chess">Atomic</a>. The pieces move like traditional chess, and start in the same position. In this variant, however, when a piece takes another piece, <em>both</em> are removed from the board, as well as any non-pawn pieces on the (up to eight) adjacent squares. As a consequence of this one change, the game can end if your King is 'blown up' by your opponent's capture. 
As another consequence, Kings cannot capture, and may occupy adjacent squares.</p> <p>For example, from the following position White's Knight can blow up the pawns at either d7 or f7, blowing up the Black King and ending the game.</p> <p><img src="https://www.gilgamath.com/figure/atomic_one_blowup-1.png" title="plot of chunk blowup" alt="plot of chunk blowup" width="700px" height="560px" /></p> <!-- PELICAN_END_SUMMARY --> <p>I looked around for some resources on Atomic chess, but have never had luck with traditional chess studies. Instead I decided to learn about Atomic statistically.</p> <p>As it happens, Lichess (which is truly a great site) <a href="https://database.lichess.org/#variant_games">publishes their game data</a> which includes over 9 million Atomic games played. I wrote some <a href="https://github.com/shabbychef/lichess-data">code</a> that will download and parse this data, turning it into a CSV file. You can <a href="https://drive.google.com/file/d/1YqOFKZlCQoWJBUT9B3HI7q0i6DHvT9or/view?usp=sharing">download</a> v1 of this file, but Lichess is the ultimate copyright holder.</p> <h2>First steps</h2> <p>The games in the dataset end in one of three conditions: Normal (checkmate or what passes for it in Atomic), Time forfeit, and Abandoned (game terminated before it began). The last category is very rare, and I omit these from my processing. 
The majority of games end in the Normal way, as tabulated here:</p> <table> <thead> <tr> <th align="left">termination</th> <th align="right">n</th> </tr> </thead> <tbody> <tr> <td align="left">Normal</td> <td align="right">8426052</td> </tr> <tr> <td align="left">Time forfeit</td> <td align="right">1257295</td> </tr> </tbody> </table> <p>The game data includes Elo scores for players, as computed prior to the game. As a first check, I wanted to see if Elo is properly calibrated. To do this, I compute the empirical win rate of White over Black, grouped by bins of the difference in their Elo (again expressed from White's point of view). To avoid cases where the player is new to the system and their Elo has not yet converged to a good estimate, I select only matches where each player has already recorded at least 50 games in the database when the game is played. I also select for Normal terminations, since these are unambiguous victories. The remaining set contains 'only' 4692631 games.</p> <p>Here is a plot of empirical log odds, versus Elo difference, in bins 25 points wide. This is in natural log odds space, so we expect the line to have slope of <span class="math">$$\operatorname{log}(10) / 400$$</span> which is around 0.006. While the empirical data is nearly linear in Elo difference, it rides somewhat above the theoretical line. I believe this is due to a tempo advantage to White. If you want a boost, play White.</p> <p><img src="https://www.gilgamath.com/figure/atomic_one_calibration-1.png" title="plot of chunk calibration" alt="plot of chunk calibration" width="1000px" height="800px" /></p> <p>In competitive traditional chess, draws are quite common. In fact, the majority of championship games end in a draw. In a <a href="elo-ties">previous blog post</a> I looked at calibrating the <em>absolute</em> Elo to the probability of a draw. This would prevent drift over time, meaning one could compare players from different generations. 
I was curious if a similar effect is seen in Atomic. Here are the log odds of a tie, as a function of the average Elo of the two players. The log odds are mostly linear in average Elo, except for a curious dip around 2200. Perhaps many of these games end in a Time forfeit, which I exclude here.</p> <p><img src="https://www.gilgamath.com/figure/atomic_one_tie_prob-1.png" title="plot of chunk tie_prob" alt="plot of chunk tie_prob" width="1000px" height="800px" /></p> <h2>Openings</h2> <p>My goal is to find good openings. However, there is a confounding variable of player ability. That is, a certain opening might look good only because good players tend to play it. (Perhaps the defining characteristic of good players is that they play good moves!) In traditional chess, a grandmaster could defeat me from <em>any</em> starting position, but in competition amongst themselves they most certainly avoid certain openings. So I wish to control for player ability, defined as the difference in Elo prior to the game.</p> <p>First let's look at the distribution of White's opening move. Here are bar plots of the most frequent moves, grouped by White's Elo.<br> Here I filter only on White's Elo and number of games in the database, and consider all games, whether they terminate normally or by time forfeit. We see that there are indeed differences in choice of opening: higher Elo players play <em>g1f3</em> and <em>g1h3</em> more often, and <em>e2e4</em> and <em>e2e3</em> less often, than lower Elo players. That said, these appear to be the four most common openings across all players.</p> <p><img src="https://www.gilgamath.com/figure/atomic_one_openings_plot-1.png" title="plot of chunk openings_plot" alt="plot of chunk openings_plot" width="1000px" height="800px" /></p> <p>The Elo score is defined so that the odds of victory from White's perspective are 10 raised to the Elo difference (White's minus Black's) divided by 400. The factor of 400 is there so that the units are palatable to humans.
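That defining relation is easy to keep straight as code; a minimal Python sketch (the helper name is mine, not from the post):

```python
def p_white_win(delta_elo: float) -> float:
    """Win probability implied by an Elo difference (White minus Black), ties ignored."""
    odds = 10.0 ** (delta_elo / 400.0)   # odds of a White win
    return odds / (1.0 + odds)

# a 400-point edge is, by definition, 10-to-1 odds
print(round(p_white_win(400.0), 3))      # 0.909
```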
So if <span class="math">$$p$$</span> is the probability that White will win a game, we should have </p> <div class="math">$$\left(\frac{p}{1-p}\right) = 10^\frac{\Delta e}{400}.$$</div> <p>A statistician would view this as a logistic regression. Taking the natural log of both sides we have </p> <div class="math">$$\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\Delta e.$$</div> <p> To the right hand side we are going to add terms. First as a warm-up, we estimate White's tempo advantage by fitting to the specification </p> <div class="math">$$\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\Delta e + b,$$</div> <p> for some unknown <span class="math">$$b$$</span>. If instead we write this as </p> <div class="math">$$\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\Delta e + \frac{\operatorname{log}(10)}{400}c,$$</div> <p> then the constant <span class="math">$$c$$</span> is in 'units' of Elo.</p> <p>I perform this regression on our data, removing Ties, players with few games and low Elo. Here is a table of the regression coefficients from the logistic regression, along with standard errors, the Wald statistics and p-values. I also convert the coefficient estimate into Elo equivalent units. 
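To make the 'Elo equivalent' conversion concrete, here is a hedged Python sketch (numpy only, standing in for the R logistic regression) that fits the intercept model by Newton-Raphson on simulated games with a built-in tempo advantage of 74 Elo — the value the table reports — and recovers it by dividing the intercept by $\operatorname{log}(10)/400$:

```python
import numpy as np

rng = np.random.default_rng(1)
K = np.log(10) / 400                 # converts Elo units to natural-log odds

# simulate games whose true tempo advantage is worth 74 Elo (an assumed value)
d_elo = rng.normal(0.0, 200.0, 100_000)
eta = K * (d_elo + 74.0)
y = (rng.random(d_elo.size) < 1 / (1 + np.exp(-eta))).astype(float)

# logistic regression by Newton-Raphson: logit p = b + k * d_elo
X = np.column_stack([np.ones_like(d_elo), d_elo])
beta = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    W = p * (1 - p)                  # IRLS weights
    beta += np.linalg.solve(X.T @ (X * W[:, None]), X.T @ (y - p))

elo_equiv = beta[0] / K              # intercept re-expressed in Elo units (~74)
slope_ratio = beta[1] / K            # ~1, since the simulated Elos are calibrated
```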
White's advantage is equivalent to around 74 extra Elo points!</p> <table> <thead> <tr> <th align="left">term</th> <th align="right">estimate</th> <th align="right">std.error</th> <th align="right">statistic</th> <th align="right">p.value</th> <th align="right">Elo equivalent</th> </tr> </thead> <tbody> <tr> <td align="left">(Intercept)</td> <td align="right">0.425</td> <td align="right">0.002</td> <td align="right">270</td> <td align="right">0</td> <td align="right">74</td> </tr> <tr> <td align="left">Elo</td> <td align="right">0.006</td> <td align="right">0.000</td> <td align="right">675</td> <td align="right">0</td> <td align="right">1</td> </tr> </tbody> </table> <h2>Logistic analysis of opening moves</h2> <p>Of course, this is only if White does not squander their tempo. Some opening moves are likely to result in a higher boost to White than this average value, and some lower. I will now fit a model of the form </p> <div class="math">$$\operatorname{log}\left(\frac{p}{1-p}\right) = \frac{\operatorname{log}(10)}{400}\left[\Delta e + c_{g1f3} 1_{g1f3} + c_{e2e3} 1_{e2e3} + c_{g1h3} 1_{g1h3} + \ldots \right],$$</div> <p> where <span class="math">$$1_{g1f3}$$</span> is a 0/1 variable for whether White played <em>g1f3</em> as their first move, and <span class="math">$$c_{g1f3}$$</span> is the coefficient which we will fit by logistic regression. By rescaling by the magic number <span class="math">$$\operatorname{log}(10) / 400$$</span>, the coefficients <span class="math">$$c$$</span> are denominated in 'Elo units'. 
Here is a table from that fit:</p> <table> <thead> <tr> <th align="left">term</th> <th align="right">estimate</th> <th align="right">std.error</th> <th align="right">statistic</th> <th align="right">p.value</th> <th align="right">Elo equivalent</th> </tr> </thead> <tbody> <tr> <td align="left">Elo</td> <td align="right">0.006</td> <td align="right">0.000</td> <td align="right">671.6</td> <td align="right">0</td> <td align="right">1</td> </tr> <tr> <td align="left">g1h3</td> <td align="right">0.518</td> <td align="right">0.004</td> <td align="right">116.0</td> <td align="right">0</td> <td align="right">90</td> </tr> <tr> <td align="left">g1f3</td> <td align="right">0.458</td> <td align="right">0.002</td> <td align="right">214.2</td> <td align="right">0</td> <td align="right">80</td> </tr> <tr> <td align="left">e2e3</td> <td align="right">0.361</td> <td align="right">0.004</td> <td align="right">100.9</td> <td align="right">0</td> <td align="right">63</td> </tr> <tr> <td align="left">e2e4</td> <td align="right">0.355</td> <td align="right">0.007</td> <td align="right">54.4</td> <td align="right">0</td> <td align="right">62</td> </tr> <tr> <td align="left">d2d4</td> <td align="right">0.352</td> <td align="right">0.009</td> <td align="right">40.3</td> <td align="right">0</td> <td align="right">61</td> </tr> <tr> <td align="left">b1c3</td> <td align="right">0.346</td> <td align="right">0.008</td> <td align="right">45.5</td> <td align="right">0</td> <td align="right">60</td> </tr> <tr> <td align="left">Other</td> <td align="right">0.092</td> <td align="right">0.010</td> <td align="right">8.8</td> <td align="right">0</td> <td align="right">16</td> </tr> </tbody> </table> <p>I also perform a multiple comparisons procedure on the opening move coefficients using Tukey's procedure to put them into equivalent groups. Here is a plot of the estimated coefficients, with error bars, colored by group. Again the coefficients are in Elo units. 
We see that <em>g1h3</em> has a coefficient of around 90, while <em>g1f3</em> is around 80. The four other openings considered are nearly equivalent, and all other moves are far inferior.</p> <p><img src="https://www.gilgamath.com/figure/atomic_one_reg_two_plot-1.png" title="plot of chunk reg_two_plot" alt="plot of chunk reg_two_plot" width="1000px" height="800px" /></p> <h2>Second moves</h2> <p>This seems odd, however, because <em>g1f3</em> is played much more often than <em>g1h3</em>, even by high Elo players. Moreover, just as White's apparent tempo advantage is the average value over a bunch of possible moves, some giving better results and some worse, so too is the boost from playing <em>g1h3</em>. Perhaps Black has a good refutation of this move, but it is not well known, and the effect we see here is an average over all of Black's replies.</p> <p>So now I perform an analysis of Black's replies, conditional on White's opening. I filter for White's move, then perform a logistic regression, only considering the most frequent replies. Here is a table of the regression coefficients for the three best looking opening moves for White. Because the equations are still from White's point of view, Black would choose to play the move with the <em>lowest</em> Elo equivalent. 
Below we plot the coefficient estimates of Black's replies, again grouping statistically non-distinguishable coefficient estimates.</p> <table> <thead> <tr> <th align="left">move1</th> <th align="left">move2</th> <th align="right">estimate</th> <th align="right">std.error</th> <th align="right">statistic</th> <th align="right">p.value</th> <th align="right">Elo equivalent</th> </tr> </thead> <tbody> <tr> <td align="left">e2e3</td> <td align="left">Elo</td> <td align="right">0.006</td> <td align="right">0.000</td> <td align="right">286.54</td> <td align="right">0</td> <td align="right">1.0</td> </tr> <tr> <td align="left">e2e3</td> <td align="left">d7d6</td> <td align="right">0.311</td> <td align="right">0.021</td> <td align="right">14.55</td> <td align="right">0</td> <td align="right">54.0</td> </tr> <tr> <td align="left">e2e3</td> <td align="left">e7e6</td> <td align="right">0.317</td> <td align="right">0.004</td> <td align="right">74.83</td> <td align="right">0</td> <td align="right">55.0</td> </tr> <tr> <td align="left">e2e3</td> <td align="left">d7d5</td> <td align="right">0.334</td> <td align="right">0.022</td> <td align="right">15.01</td> <td align="right">0</td> <td align="right">58.0</td> </tr> <tr> <td align="left">e2e3</td> <td align="left">g8f6</td> <td align="right">0.432</td> <td align="right">0.011</td> <td align="right">40.12</td> <td align="right">0</td> <td align="right">75.0</td> </tr> <tr> <td align="left">e2e3</td> <td align="left">e7e5</td> <td align="right">0.592</td> <td align="right">0.022</td> <td align="right">26.55</td> <td align="right">0</td> <td align="right">100.0</td> </tr> <tr> <td align="left">e2e3</td> <td align="left">Other</td> <td align="right">0.763</td> <td align="right">0.016</td> <td align="right">48.37</td> <td align="right">0</td> <td align="right">130.0</td> </tr> <tr> <td align="left">g1f3</td> <td align="left">Elo</td> <td align="right">0.006</td> <td align="right">0.000</td> <td align="right">490.63</td> <td 
align="right">0</td> <td align="right">1.0</td> </tr> <tr> <td align="left">g1f3</td> <td align="left">e7e5</td> <td align="right">0.135</td> <td align="right">0.007</td> <td align="right">19.25</td> <td align="right">0</td> <td align="right">24.0</td> </tr> <tr> <td align="left">g1f3</td> <td align="left">d7d6</td> <td align="right">0.145</td> <td align="right">0.017</td> <td align="right">8.46</td> <td align="right">0</td> <td align="right">25.0</td> </tr> <tr> <td align="left">g1f3</td> <td align="left">f7f6</td> <td align="right">0.483</td> <td align="right">0.002</td> <td align="right">211.38</td> <td align="right">0</td> <td align="right">84.0</td> </tr> <tr> <td align="left">g1f3</td> <td align="left">g8f6</td> <td align="right">0.834</td> <td align="right">0.048</td> <td align="right">17.50</td> <td align="right">0</td> <td align="right">140.0</td> </tr> <tr> <td align="left">g1f3</td> <td align="left">e7e6</td> <td align="right">1.458</td> <td align="right">0.047</td> <td align="right">30.81</td> <td align="right">0</td> <td align="right">250.0</td> </tr> <tr> <td align="left">g1f3</td> <td align="left">Other</td> <td align="right">2.069</td> <td align="right">0.042</td> <td align="right">48.84</td> <td align="right">0</td> <td align="right">360.0</td> </tr> <tr> <td align="left">g1h3</td> <td align="left">Elo</td> <td align="right">0.006</td> <td align="right">0.000</td> <td align="right">231.93</td> <td align="right">0</td> <td align="right">1.1</td> </tr> <tr> <td align="left">g1h3</td> <td align="left">e7e6</td> <td align="right">0.092</td> <td align="right">0.022</td> <td align="right">4.23</td> <td align="right">0</td> <td align="right">16.0</td> </tr> <tr> <td align="left">g1h3</td> <td align="left">e7e5</td> <td align="right">0.155</td> <td align="right">0.019</td> <td align="right">8.21</td> <td align="right">0</td> <td align="right">27.0</td> </tr> <tr> <td align="left">g1h3</td> <td align="left">f7f6</td> <td align="right">0.503</td> <td 
align="right">0.010</td> <td align="right">52.54</td> <td align="right">0</td> <td align="right">87.0</td> </tr> <tr> <td align="left">g1h3</td> <td align="left">h7h6</td> <td align="right">0.536</td> <td align="right">0.006</td> <td align="right">93.12</td> <td align="right">0</td> <td align="right">93.0</td> </tr> <tr> <td align="left">g1h3</td> <td align="left">Other</td> <td align="right">0.554</td> <td align="right">0.024</td> <td align="right">22.75</td> <td align="right">0</td> <td align="right">96.0</td> </tr> <tr> <td align="left">g1h3</td> <td align="left">g7g5</td> <td align="right">1.837</td> <td align="right">0.037</td> <td align="right">49.63</td> <td align="right">0</td> <td align="right">320.0</td> </tr> </tbody> </table> <p><img src="https://www.gilgamath.com/figure/atomic_one_reg_three_plot-1.png" title="plot of chunk reg_three_plot" alt="plot of chunk reg_three_plot" width="1000px" height="800px" /></p> <p>We note that the apparent best reply to <em>g1f3</em> is <em>e7e5</em>, which yields an advantage to White of 24 Elo points; the best reply to <em>e2e3</em> is <em>d7d6</em>, which is worth 54 Elo points to White; and the best reply to <em>g1h3</em> is <em>e7e6</em>, which yields 16 Elo points. While our analysis of White's openings indicated that <em>g1h3</em> is the best opening, this appears to be due to suboptimal replies from Black. To maximize the worst case over Black's replies, White should apparently play <em>e2e3</em>.</p> <p>I would take this analysis with a grain of salt. The response <em>d7d6</em> to White's <em>g1f3</em> looks very weak, with White taking Black's Queen and leaving Black's King very exposed. This suggests something fishy is going on: there is potential for big differences between optimal and average play in the responses. </p> <p>Of course, there is also a horizon problem here. Just as the coefficient fits average over suboptimal play in Black's reply, our two-level analysis ignores White's second move.
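The maximin logic of choosing White's opening can be spelled out in a couple of lines; a Python sketch using the worst-case numbers read off the tables above:

```python
# worst-case boost to White (i.e. against Black's best reply), in
# Elo-equivalent units, taken from the regression tables above
worst_case = {"g1f3": 24, "e2e3": 54, "g1h3": 16}

# maximin: pick the opening whose worst case is best
best = max(worst_case, key=worst_case.get)
print(best, worst_case[best])   # e2e3 54
```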
In a future blog post I hope to perform a proper minimax analysis of openings, truncating the analysis when the data no longer supports different coefficient fits among the various possible moves, then minimaxing the values up. I also plan on performing a material value analysis, which is quite simple from the data available here. My early exploration suggests that the usual 1:3:3:5:9 point value system of Pawn, Knight, Bishop, Rook, Queen in traditional chess is not well calibrated to Atomic chess. Finally, I hope to catalog the many fool's mates and their frequency.</p>Nonparametric Market Timing2020-01-04T21:07:06-08:002020-01-04T21:07:06-08:00Steventag:www.gilgamath.com,2020-01-04:/nonparametric_market_timing.html<p>Market timing a single instrument with a single feature</p><p>In a <a href="market_timing">previous blog post</a>, I looked at "market timing" for discrete states. There are a number of ways that result can be generalized. Here we consider a non-parametric view. Suppose you observe some scalar feature <span class="math">$$f_t$$</span> prior to the time required to invest and capture scalar returns <span class="math">$$x_{t+1}$$</span>.
Let <span class="math">$$\mu\left(f\right)$$</span> and <span class="math">$$\alpha_2\left(f\right)$$</span> be the first and second moments of returns conditional on the feature: </p> <div class="math">$$E\left[\left.x_{t+1}\right|f_t\right] = \mu\left(f_t\right),\quad E\left[\left.x^2_{t+1}\right|f_t\right] = \alpha_2\left(f_t\right).$$</div> <p> Suppose that <span class="math">$$f_t$$</span> is random with density <span class="math">$$g\left(f\right)$$</span>.</p> <p>Suppose that in response to observing <span class="math">$$f_t$$</span> you allocate <span class="math">$$w\left(f_t\right)$$</span> proportion of your wealth long in the asset. The first and second moments of the returns of this strategy are </p> <div class="math">$$\int \mu\left(x\right) w\left(x\right) g\left(x\right)dx,\quad\mbox{and } \int \alpha_2\left(x\right) w^2\left(x\right) g\left(x\right)dx.$$</div> <!-- PELICAN_END_SUMMARY --> <p>Now we seek the strategy <span class="math">$$w\left(x\right)$$</span> that maximizes the signal-noise ratio, which is the ratio of the expected return to the standard deviation of returns. We can transform this metric to the ratio of expected return to the square root of the second moment, by way of the monotonic 'tas' function (taking the tangent of the arcsine of the ratio of expected return to root second moment recovers the signal-noise ratio). Now note that to maximize this ratio we can, without loss of generality, prespecify the denominator to equal some value. This works because the ratio is homogeneous of order zero: we can rescale <span class="math">$$w\left(x\right)$$</span> by an arbitrary positive constant and get the same objective. This yields the problem </p> <div class="math">$$\max_{w\left(x\right)} \int \mu\left(x\right) w\left(x\right) g\left(x\right)dx,\quad\mbox{subject to }\, \int \alpha_2\left(x\right) w^2\left(x\right) g\left(x\right)dx = 1.$$</div> <p>This, it turns out, is a trivial problem in the calculus of variations.
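For concreteness, the first-order (Lagrange) condition behind that claim can be written out; a short sketch in my own notation, matching the problem above:

```latex
% Lagrangian for: maximize \int \mu w g \,dx subject to \int \alpha_2 w^2 g \,dx = 1.
% Since no derivative w'(x) appears, stationarity holds pointwise in x:
\mu(x)\,g(x) - 2\lambda\, \alpha_2(x)\, w(x)\, g(x) = 0
\quad\Longrightarrow\quad
w(x) = \frac{\mu(x)}{2\lambda\,\alpha_2(x)} = \frac{c\,\mu(x)}{\alpha_2(x)},
\qquad c = \frac{1}{2\lambda}.
```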
Trivial in the sense that the integrals do not involve the derivative <span class="math">$$w'\left(x\right)$$</span>, and so the solution has a simple form which looks just like the finite-dimensional Lagrange multiplier solution. After some simplification, the optimal solution is found to be </p> <div class="math">$$w\left(x\right) = \frac{c \mu\left(x\right)}{\alpha_2\left(x\right)}.$$</div> <p> Note this is fully consistent with what we saw for the case where <span class="math">$$f_t$$</span> took one of a finite set of discrete states in our <a href="market_timing">earlier blog post</a>. However, this doesn't quite look like Markowitz, because the denominator has the second moment function, and not the variance function. We will see this actually matters.</p> <h2>Exponential Heteroskedasticity</h2> <p>As an example, consider the case where <span class="math">$$f_t$$</span> takes an exponential distribution with parameter <span class="math">$$\lambda=1$$</span>. Moreover, assume the mean is constant, but the variance is proportional to the feature: </p> <div class="math">$$E\left[\left.x_{t+1}\right|f_t\right] = \mu,\quad Var\left(\left.x_{t+1}\right|f_t\right) = f_t \sigma^2.$$</div> <p> The optimal allocation is </p> <div class="math">$$w\left(f_t\right) = \frac{c \mu}{\sigma^2\left(f_t + \zeta^2\right)} = \frac{c'}{f_t + \zeta^2},$$</div> <p> where <span class="math">$$\zeta=\mu/\sigma$$</span>. We note that because <span class="math">$$E\left[f_t\right]=1$$</span> and the expected return is constant with respect to <span class="math">$$f_t$$</span>, the signal-noise ratio of the buy-and-hold strategy is simply <span class="math">$$\zeta$$</span>. The SNR of the optimal timing strategy can be quite a bit higher.</p> <p>To compute that SNR, first let </p> <div class="math">$$q=\zeta^2 \exp{\left(\zeta^2\right)}\int_{\zeta^2}^{\infty} \frac{\exp{\left(-x\right)}}{x}dx.$$</div> <p> (This integral is called the "exponential integral".)
Then the SNR of the timing strategy is </p> <div class="math">$$\operatorname{sign}\left(c\right)\sqrt{\frac{q}{1-q}}.$$</div> <p> Here we confirm this empirically. We spawn a bunch of <span class="math">$$f_t$$</span> and <span class="math">$$x_{t+1}$$</span> under the model, then compute the returns of the buy-and-hold strategy, the optimal strategy, and the Markowitz equivalent which holds proportional to mean divided by variance:</p> <div class="highlight"><pre><span></span>mu <span class="o">&lt;-</span> <span class="m">0.1</span> sg <span class="o">&lt;-</span> <span class="m">1</span> zetasq <span class="o">&lt;-</span> <span class="p">(</span>mu<span class="o">/</span>sg<span class="p">)</span><span class="o">^</span><span class="m">2</span> <span class="kp">set.seed</span><span class="p">(</span><span class="m">1234</span><span class="p">)</span> feat <span class="o">&lt;-</span> rexp<span class="p">(</span><span class="m">1e6</span><span class="p">,</span>rate<span class="o">=</span><span class="m">1</span><span class="p">)</span> rets <span class="o">&lt;-</span> rnorm<span class="p">(</span><span class="kp">length</span><span class="p">(</span>feat<span class="p">),</span>mean<span class="o">=</span><span class="kp">rep</span><span class="p">(</span>mu<span class="p">,</span><span class="kp">length</span><span class="p">(</span>feat<span class="p">)),</span>sd<span class="o">=</span>sg<span class="o">*</span><span class="kp">sqrt</span><span class="p">(</span>feat<span class="p">))</span> <span class="c1"># optimal allocation; </span> ww <span class="o">&lt;-</span> <span class="m">1</span> <span class="o">/</span> <span class="p">(</span>feat<span class="o">+</span>zetasq<span class="p">)</span> <span class="c1"># markowitz allocation;</span> mw <span class="o">&lt;-</span> <span class="m">1</span> <span class="o">/</span> feat <span class="kn">library</span><span class="p">(</span>SharpeR<span class="p">)</span> buyhold <span class="o">&lt;-</span> 
<span class="p">(</span>as.sr<span class="p">(</span>rets<span class="p">)</span><span class="o">$</span>sr<span class="p">)</span> optimal <span class="o">&lt;-</span> <span class="p">(</span>as.sr<span class="p">(</span>rets<span class="o">*</span>ww<span class="p">)</span><span class="o">$</span>sr<span class="p">)</span> markwtz <span class="o">&lt;-</span> <span class="p">(</span>as.sr<span class="p">(</span>rets<span class="o">*</span>mw<span class="p">)</span><span class="o">$</span>sr<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>expint<span class="p">)</span> qfunc <span class="o">&lt;-</span> <span class="kr">function</span><span class="p">(</span>zetsq<span class="p">)</span> <span class="p">{</span> <span class="kn">require</span><span class="p">(</span>expint<span class="p">,</span>quietly<span class="o">=</span><span class="kc">TRUE</span><span class="p">)</span> zetsq <span class="o">*</span> <span class="kp">exp</span><span class="p">(</span>zetsq<span class="p">)</span> <span class="o">*</span> expint<span class="p">(</span>zetsq<span class="p">)</span> <span class="p">}</span> psnrfunc <span class="o">&lt;-</span> <span class="kr">function</span><span class="p">(</span>zetsq<span class="p">)</span> <span class="p">{</span> qqq <span class="o">&lt;-</span> qfunc<span class="p">(</span>zetsq<span class="p">)</span> <span class="kp">sqrt</span><span class="p">(</span>qqq <span class="o">/</span> <span class="p">(</span><span class="m">1</span><span class="o">-</span>qqq<span class="p">))</span> <span class="p">}</span> theoretical <span class="o">&lt;-</span> psnrfunc<span class="p">(</span>zetasq<span class="p">)</span> </pre></div> <p>The empirical SNR of the buy-and-hold strategy is 0.1, which is very close to the theoretical value of 0.1. We compute the SNR of the optimal strategy to be 0.207, which is very close to the theoretical value we compute as 0.206 using the exponential integral above. 
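The 0.206 figure can also be double-checked without the expint package; a Python sketch (numpy only, not the post's R code) that evaluates the exponential integral by the trapezoid rule:

```python
import numpy as np

zeta = 0.1                       # mu / sigma: the buy-and-hold SNR
a = zeta ** 2

# E1(a) = integral from a to infinity of exp(-x)/x dx, by trapezoid rule
# on a log-spaced grid (the tail beyond x = 60 is below 1e-27)
x = np.logspace(np.log10(a), np.log10(60.0), 200_001)
f = np.exp(-x) / x
E1 = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x))

q = a * np.exp(a) * E1
snr = np.sqrt(q / (1 - q))
print(round(snr, 3))             # 0.206, matching the simulation
```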
The signal-noise ratio of the Markowitz strategy, however, is a measly 0.0152. </p> <p>We note that for this setup, it is simple to find the optimal <span class="math">$$k$$</span> degree polynomial <span class="math">$$w\left(f_t\right)$$</span>, and confirm they have lower SNR than what we observe here. We leave that as an exercise in our book.</p> <h2>Timing the Market</h2> <p>Here we use this technique on returns of the Market, as defined in the Fama-French factors. We take the 12 month rolling volatility of the Market returns, delayed by a month, as our feature. First we plot the market returns and squared market returns as a function of our feature. We see essentially a flat <span class="math">$$\mu$$</span> but an increasing <span class="math">$$\alpha_2$$</span>.</p> <div class="highlight"><pre><span></span><span class="kp">suppressMessages</span><span class="p">({</span> <span class="kn">library</span><span class="p">(</span>fromo<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>dplyr<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>tidyr<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>magrittr<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span> <span class="p">})</span> <span class="kr">if</span> <span class="p">(</span><span class="o">!</span><span class="kn">require</span><span class="p">(</span>aqfb.data<span class="p">)</span> <span class="o">&amp;&amp;</span> <span class="kn">require</span><span class="p">(</span>devtools<span class="p">))</span> <span class="p">{</span> devtools<span class="o">::</span>install_github<span class="p">(</span><span class="s">&#39;shabbychef/aqfb_data&#39;</span><span class="p">)</span> <span class="p">}</span> <span class="kn">library</span><span class="p">(</span>aqfb.data<span class="p">)</span> data<span class="p">(</span>mff4<span class="p">)</span> df <span 
class="o">&lt;-</span> <span class="kt">data.frame</span><span class="p">(</span>mkt<span class="o">=</span>mff4<span class="o">$</span>Mkt<span class="p">)</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>vol12<span class="o">=</span><span class="kp">as.numeric</span><span class="p">(</span>fromo<span class="o">::</span>running_sd<span class="p">(</span>Mkt<span class="p">,</span><span class="m">12</span><span class="p">,</span>min_df<span class="o">=</span><span class="m">12L</span><span class="p">)))</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>feature<span class="o">=</span>dplyr<span class="o">::</span>lag<span class="p">(</span>vol12<span class="p">,</span><span class="m">2</span><span class="p">))</span> <span class="o">%&gt;%</span> dplyr<span class="o">::</span>filter<span class="p">(</span><span class="o">!</span><span class="kp">is.na</span><span class="p">(</span>feature<span class="p">))</span> <span class="c1"># check on the first moments</span> ph <span class="o">&lt;-</span> df <span class="o">%&gt;%</span> rename<span class="p">(</span>mu<span class="o">=</span>Mkt<span class="p">)</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>alpha_2<span class="o">=</span>mu<span class="o">^</span><span class="m">2</span><span class="p">)</span> <span class="o">%&gt;%</span> tidyr<span class="o">::</span>gather<span class="p">(</span>key<span class="o">=</span>series<span class="p">,</span>value<span class="o">=</span>value<span class="p">,</span>mu<span class="p">,</span>alpha_2<span class="p">)</span> <span class="o">%&gt;%</span> ggplot<span class="p">(</span>aes<span class="p">(</span>feature<span class="p">,</span>value<span class="p">))</span> <span class="o">+</span> geom_point<span class="p">()</span> <span class="o">+</span> stat_smooth<span class="p">()</span> <span class="o">+</span> facet_grid<span class="p">(</span>series<span class="o">~</span><span class="m">.</span><span 
class="p">,</span>scales<span class="o">=</span><span class="s">&#39;free&#39;</span><span class="p">)</span> <span class="o">+</span> labs<span class="p">(</span>x<span class="o">=</span><span class="s">&#39;12 month vol, lagged one month&#39;</span><span class="p">,</span> y<span class="o">=</span><span class="s">&#39;mean or second moment&#39;</span><span class="p">,</span> title<span class="o">=</span><span class="s">&#39;Returns of the Market&#39;</span><span class="p">)</span> <span class="kp">print</span><span class="p">(</span>ph<span class="p">)</span> </pre></div> <p><img src="https://www.gilgamath.com/figure/nonparametric_market_timing_check_market-1.png" title="plot of chunk check_market" alt="plot of chunk check_market" width="1000px" height="500px" /></p> <p>We now perform a GAM fit on the first and second moments of the Market returns. I have to use a trick to force the second moment estimate to be positive. I plot the optimal allocation versus the feature below. Note that it vaguely resembles the optimal allocation from the exponential heteroskedasticity toy example above. One could also estimate the SNR one would achieve in this case, but that ignores the effects of any estimation error. 
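As a cross-check of the general recipe, here is a hedged Python sketch (binned moments standing in for the GAM smooth, and run on the toy exponential-heteroskedasticity model rather than the Market data) showing that the estimated $\hat{\mu}(f)/\hat{\alpha}_2(f)$ allocation tracks the known optimum:

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sg = 0.1, 1.0
zetasq = (mu / sg) ** 2

# toy exponential-heteroskedasticity model from the previous section
f = rng.exponential(1.0, 400_000)
r = rng.normal(mu, sg * np.sqrt(f))

# binned moments stand in for the GAM fits of mu(f) and alpha_2(f)
edges = np.quantile(f, np.linspace(0, 1, 21))
idx = np.clip(np.digitize(f, edges[1:-1]), 0, 19)
w_hat, w_true = [], []
for i in range(20):
    m = idx == i
    w_hat.append(r[m].mean() / (r[m] ** 2).mean())        # estimated mu/alpha_2
    w_true.append(mu / (sg**2 * (f[m].mean() + zetasq)))  # known optimum, up to scale

corr = float(np.corrcoef(w_hat, w_true)[0, 1])   # close to 1
```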
Moreover, the multi-period SNR that we compute here might be considered a very long term average, something that might not be terribly noticeable on a short time scale.</p> <div class="highlight"><pre><span></span><span class="c1"># do two fits</span> <span class="kp">suppressMessages</span><span class="p">({</span> <span class="kn">library</span><span class="p">(</span>mgcv<span class="p">)</span> <span class="p">})</span> spn <span class="o">&lt;-</span> <span class="m">0.9</span> mufunc <span class="o">&lt;-</span> mgcv<span class="o">::</span>gam<span class="p">(</span>Mkt <span class="o">~</span> feature<span class="p">,</span>data<span class="o">=</span>df<span class="p">,</span>family<span class="o">=</span>gaussian<span class="p">())</span> a2func <span class="o">&lt;-</span> mgcv<span class="o">::</span>gam<span class="p">(</span><span class="kp">I</span><span class="p">(</span><span class="kp">log</span><span class="p">(</span><span class="kp">pmax</span><span class="p">(</span><span class="m">1e-6</span><span class="p">,</span>Mkt<span class="o">^</span><span class="m">2</span><span class="p">)))</span> <span class="o">~</span> feature<span class="p">,</span>data<span class="o">=</span>df<span class="p">,</span>family<span class="o">=</span>gaussian<span class="p">())</span> alloc <span class="o">&lt;-</span> tibble<span class="p">(</span>feature<span class="o">=</span><span class="kp">seq</span><span class="p">(</span><span class="kp">min</span><span class="p">(</span>df<span class="o">$</span>feature<span class="p">)</span><span class="o">*</span><span class="m">1.05</span><span class="p">,</span><span class="kp">max</span><span class="p">(</span>df<span class="o">$</span>feature<span class="p">)</span><span class="o">*</span><span class="m">0.95</span><span class="p">,</span>length.out<span class="o">=</span><span class="m">501</span><span class="p">))</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>wts<span 
class="o">=</span>predict<span class="p">(</span>mufunc<span class="p">,</span><span class="m">.</span><span class="p">)</span> <span class="o">/</span> <span class="kp">exp</span><span class="p">(</span>predict<span class="p">(</span>a2func<span class="p">,</span><span class="m">.</span><span class="p">)))</span> <span class="c1"># if you wanted to estimate the SNR of this allocation:</span> df2 <span class="o">&lt;-</span> df <span class="o">%&gt;%</span> mutate<span class="p">(</span>wts<span class="o">=</span>predict<span class="p">(</span>mufunc<span class="p">,</span><span class="m">.</span><span class="p">)</span> <span class="o">/</span> <span class="kp">exp</span><span class="p">(</span>predict<span class="p">(</span>a2func<span class="p">,</span><span class="m">.</span><span class="p">)))</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>ret<span class="o">=</span>Mkt <span class="o">*</span> wts<span class="p">)</span> zetfoo <span class="o">&lt;-</span> as.sr<span class="p">(</span>df2<span class="o">$</span>ret<span class="p">,</span>ope<span class="o">=</span><span class="m">12</span><span class="p">)</span> ph <span class="o">&lt;-</span> alloc <span class="o">%&gt;%</span> ggplot<span class="p">(</span>aes<span class="p">(</span>feature<span class="p">,</span>wts<span class="p">))</span> <span class="o">+</span> geom_line<span class="p">()</span> <span class="o">+</span> labs<span class="p">(</span>x<span class="o">=</span><span class="s">&#39;12 month vol, lagged one month&#39;</span><span class="p">,</span> y<span class="o">=</span><span class="s">&#39;optimal allocation, up to scaling&#39;</span><span class="p">,</span> title<span class="o">=</span><span class="s">&#39;Timing the Market&#39;</span><span class="p">)</span> <span class="kp">print</span><span class="p">(</span>ph<span class="p">)</span> </pre></div> <p><img src="https://www.gilgamath.com/figure/nonparametric_market_timing_optimal_allocation-1.png" title="plot of 
chunk optimal_allocation" alt="plot of chunk optimal_allocation" width="1000px" height="500px" /></p> <h2>Checking on leverage</h2> <p>One odd way to use this nonparametric market timing trick in quantitative trading (though do not take this as investing advice!) is as a kind of check on the leverage of a strategy that levers itself. That is, suppose you have some kind of quantitative strategy that does not always use all the capital allocated to it. Let <span class="math">$$f_t$$</span> be the proportion of wealth that the strategy 'decides' to allocate. Of course this is observable prior to the investment decision. Then estimate, nonparametrically, the first and second moment of the returns of the strategy <em>on full leverage</em> from historical returns. Then compute the optimal leverage as a function of the allocated leverage, and plot one against the other: they should fall on a straight line! If they do not fall on a straight line, the strategy is not making optimal decisions regarding leverage (modulo estimation error).</p> <script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#$@#')) { var align = "center", indent = "0em", linebreak = "false"; if (false) { align = (screen.width < 768) ? "left" : align; indent = (screen.width < 768) ? "0em" : indent; linebreak = (screen.width < 768) ? 'true' : linebreak; } var mathjaxscript = document.createElement('script'); mathjaxscript.id = 'mathjaxscript_pelican_#%@#@#'; mathjaxscript.type = 'text/javascript'; mathjaxscript.src = 'https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML'; mathjaxscript[(window.opera ? 
"innerHTML" : "text")] = "MathJax.Hub.Config({" + " config: ['MMLorHTML.js']," + " TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' } }," + " jax: ['input/TeX','input/MathML','output/HTML-CSS']," + " extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," + " displayAlign: '"+ align +"'," + " displayIndent: '"+ indent +"'," + " showMathMenu: true," + " messageStyle: 'normal'," + " tex2jax: { " + " inlineMath: [ ['\\\$$','\\\$$'] ], " + " displayMath: [ ['$$','$$'] ]," + " processEscapes: true," + " preview: 'TeX'," + " }, " + " 'HTML-CSS': { " + " styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! important'} }," + " linebreaks: { automatic: "+ linebreak +", width: '90% container' }," + " }, " + "}); " + "if ('default' !== 'default') {" + "MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "}"; (document.body || document.getElementsByTagName('head')).appendChild(mathjaxscript); } </script>ohenery!2019-09-25T21:32:52-07:002019-09-25T21:32:52-07:00Steventag:www.gilgamath.com,2019-09-25:/ohenery.html<p>ohenery package to CRAN</p><p>I just pushed the first version of my <a href="http://github.com/shabbychef/ohenery"><code>ohenery</code></a> 
package to <a href="https://cran.r-project.org/package=ohenery">CRAN</a>. The package supports estimation of softmax regression for ordinal outcomes under the Harville and Henery models. Unlike the usual multinomial representation for ordinal outcomes, softmax regression is useful for 'ragged' cases. Contrast:</p> <ul> <li>observed independent variables on participants in multiple races, with the outcomes recorded, and different participants in each race, perhaps different numbers of participants in each race. </li> <li>observed independent variables on independent trials where for each trial there is a single outcome taking values from some ordered set.</li> </ul> <p>Multinomial ordinal regression is for the latter case, while softmax is for the former. It generalizes logistic regression. I had first stumbled on the idea when <a href="best-picture-data">working in the film industry</a>, but called it a <a href="taste-preferences">'Bradley-Terry model'</a> out of ignorance.</p> <!-- PELICAN_END_SUMMARY --> <p>The basic setup is as follows: suppose you observe independent variables <span class="math">$$x_i$$</span> for a participant in a race. Let <span class="math">$$\eta_i = x_i^{\top}\beta$$</span> for some coefficients <span class="math">$$\beta$$</span>. Then let </p> <div class="math">$$\pi_i = \frac{\exp{\eta_i}}{\sum_j \exp{\eta_j}},$$</div> <p> where we sum over all <span class="math">$$j$$</span> in the same race. Under the softmax regression model, the probability that participant <span class="math">$$i$$</span> takes first place is <span class="math">$$\pi_i$$</span>.</p> <p>This formulation is sufficient when you only observe the winner of a multi-participant race, like say the Best Picture winner of the Oscars. However, in some cases you observe the rank of several or all participants. 
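</p> <p>As a quick sketch of the winner probability above, here is a bit of R with hypothetical linear predictors <span class="math">$$\eta_i$$</span> (the values are made up for illustration; the second computation anticipates the proportional 'recycling' of probabilities described below):</p> <div class="highlight"><pre><span></span># hypothetical linear predictors for a three-entrant race
eta &lt;- c(0.5, 1.2, -0.3)
# softmax win probabilities; these sum to one
pi_win &lt;- exp(eta) / sum(exp(eta))
# conditional on entrant 2 taking first place, recycle over the remaining entrants
pi_second &lt;- exp(eta[-2]) / sum(exp(eta[-2]))</pre></div> <p>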
For example, in Olympic events, one observes Gold, Silver and Bronze finishes.</p> <p>Note that it is generally recommended that you <em>not</em> discard continuous information to dichotomize your variables in this way. However, in some cases one only observes the ordinal outcomes. In this case softmax regression can be used.</p> <p>In the case where ranked outcomes are observed beyond the winner, we wish to 'recycle' softmax probabilities. Under the Harville model, the probabilities are recycled proportionally. An example will illustrate: condition on the outcome that participant 11 took first place. Then for <span class="math">$$i \ne 11$$</span>, compute </p> <div class="math">$$\pi_i = \frac{\exp{\eta_i}}{\sum_{j\ne 11} \exp{\eta_j}}.$$</div> <p> Under the Harville model, the probability that the <span class="math">$$i$$</span>th participant took <em>second</em> place is <span class="math">$$\pi_i$$</span>, conditional on the event that 11 took first.</p> <p>The Henery model slightly generalizes the Harville model. Here we imagine some <span class="math">$$\gamma_2, \gamma_3, \gamma_4$$</span> and so on such that the above computation becomes </p> <div class="math">$$\pi_i = \frac{\exp{\gamma_2 \eta_i}}{\sum_{j\ne 11} \exp{\gamma_2 \eta_j}}.$$</div> <p> Then conditional on 11 taking first, and participant 5 taking second, compute </p> <div class="math">$$\pi_i = \frac{\exp{\gamma_3 \eta_i}}{\sum_{j\ne 11, j\ne 5} \exp{\gamma_3 \eta_j}}$$</div> <p> as the probability that participant <span class="math">$$i$$</span> takes third place, and so on. Obviously the Harville model is a Henery model with all <span class="math">$$\gamma_i=1$$</span>.</p> <p>I wasn't sure how to deal with ties in the code. On the one hand, ties are legitimate possible outcomes in some cases. On the other, they are convenient to introduce as some unobserved 'runner up' status. 
For example, create an 'Aluminum Medal' outcome for Olympians who take neither Gold, Silver, nor Bronze; in this case many participants tie for the fourth place medal. However, we should not expect the regression to try to fit some order on those participants. The solution was to introduce weights to the estimation. Set the weights to zero for outcomes which are fake ties, and set them to one otherwise.</p> <p>The package uses <code>Rcpp</code> to compute a likelihood (and gradient), then <code>maxLik</code> does the estimation and inference. The rest of the work was me tearing my hair out trying to decipher <code>model.frame</code> and its friends.</p> <h2>Olympic Diving</h2> <p>The package is bundled with a dataset of 100 years of Olympic Men's Platform Diving Records, sourced from Randi Griffin's excellent <a href="https://www.kaggle.com/heesoo37/120-years-of-olympic-history-athletes-and-results">dataset on kaggle</a>.</p> <p>Here we convert the medal records into finishing places of 1, 2, 3 and 4 (no medal), add weights for the fitting, make a factor variable for age, and factor the NOC (country) of the athlete. Because Platform Diving is a subjective competition, based on scores from judges, we investigate whether there is a 'home field advantage' by creating a Boolean variable indicating whether the athlete is representing the host nation.</p> <p>We then fit a Henery model to the data. Note that the gamma terms come out very close to one, indicating the Harville model would be sufficient. The home field advantage does not appear real in this analysis. 
(<em>Note:</em> in the first draft of this blog post, using the first version of the package, the home field effect appeared significant due to coding error.)</p> <div class="highlight"><pre><span></span><span class="c1"># this should be ohenery 0.1.1</span> <span class="kn">library</span><span class="p">(</span>ohenery<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>dplyr<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>forcats<span class="p">)</span> data<span class="p">(</span>diving<span class="p">)</span> fitdat <span class="o">&lt;-</span> diving <span class="o">%&gt;%</span> mutate<span class="p">(</span>Finish<span class="o">=</span>case_when<span class="p">(</span><span class="kp">grepl</span><span class="p">(</span><span class="s">&#39;Gold&#39;</span><span class="p">,</span>Medal<span class="p">)</span> <span class="o">~</span> <span class="m">1</span><span class="p">,</span> <span class="c1"># make outcomes</span> <span class="kp">grepl</span><span class="p">(</span><span class="s">&#39;Silver&#39;</span><span class="p">,</span>Medal<span class="p">)</span> <span class="o">~</span> <span class="m">2</span><span class="p">,</span> <span class="kp">grepl</span><span class="p">(</span><span class="s">&#39;Bronze&#39;</span><span class="p">,</span>Medal<span class="p">)</span> <span class="o">~</span> <span class="m">3</span><span class="p">,</span> <span class="kc">TRUE</span> <span class="o">~</span> <span class="m">4</span><span class="p">))</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>weight<span class="o">=</span><span class="kp">ifelse</span><span class="p">(</span>Finish <span class="o">&lt;=</span> <span class="m">3</span><span class="p">,</span><span class="m">1</span><span class="p">,</span><span class="m">0</span><span class="p">))</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>cut_age<span class="o">=</span><span class="kp">cut</span><span 
class="p">(</span>coalesce<span class="p">(</span>Age<span class="p">,</span><span class="m">22.0</span><span class="p">),</span><span class="kt">c</span><span class="p">(</span><span class="m">12</span><span class="p">,</span><span class="m">19.5</span><span class="p">,</span><span class="m">21.5</span><span class="p">,</span><span class="m">22.5</span><span class="p">,</span><span class="m">25.5</span><span class="p">,</span><span class="m">99</span><span class="p">),</span>include.lowest<span class="o">=</span><span class="kc">TRUE</span><span class="p">))</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>country<span class="o">=</span>forcats<span class="o">::</span>fct_relevel<span class="p">(</span>forcats<span class="o">::</span>fct_lump<span class="p">(</span><span class="kp">factor</span><span class="p">(</span>NOC<span class="p">),</span>n<span class="o">=</span><span class="m">5</span><span class="p">),</span><span class="s">&#39;Other&#39;</span><span class="p">))</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>home_advantage<span class="o">=</span>NOC<span class="o">==</span>HOST_NOC<span class="p">)</span> hensm<span class="p">(</span>Finish <span class="o">~</span> cut_age <span class="o">+</span> country <span class="o">+</span> home_advantage<span class="p">,</span>data<span class="o">=</span>fitdat<span class="p">,</span>weights<span class="o">=</span>weight<span class="p">,</span>group<span class="o">=</span>EventId<span class="p">,</span>ngamma<span class="o">=</span><span class="m">3</span><span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span>-------------------------------------------- Maximum Likelihood estimation BFGS maximization, 43 iterations Return code 0: successful convergence Log-Likelihood: -214.01 12 free parameters Estimates: Estimate Std. 
error t value Pr(&gt; t)
cut_age(19.5,21.5]   0.0303   0.4185  0.07  0.94227
cut_age(21.5,22.5]  -0.7276   0.5249 -1.39  0.16565
cut_age(22.5,25.5]   0.0950   0.3790  0.25  0.80199
cut_age(25.5,99]    -0.1838   0.4111 -0.45  0.65474
countryGBR          -0.6729   0.8039 -0.84  0.40258
countryGER           1.0776   0.4960  2.17  0.02981 *
countryMEX           0.7159   0.4744  1.51  0.13126
countrySWE           0.6207   0.5530  1.12  0.26172
countryUSA           2.3201   0.4579  5.07  4.1e-07 ***
home_advantageTRUE   0.5791   0.4112  1.41  0.15904
gamma2               1.0054   0.2853  3.52  0.00042 ***
gamma3               0.9674   0.2963  3.26  0.00109 **
---
Signif. codes: 0 &#39;***&#39; 0.001 &#39;**&#39; 0.01 &#39;*&#39; 0.05 &#39;.&#39; 0.1 &#39; &#39; 1
--------------------------------------------
</pre></div> Discrete State Market Timing2019-06-30T10:22:58-07:002019-06-30T10:22:58-07:00Steventag:www.gilgamath.com,2019-06-30:/market-timing.html<p>Market timing with a discrete feature</p><p>In a <a href="portfolio-flattening">previous blog post</a> I talked 
about two methods for dealing with conditioning information in portfolio construction. Here I apply them both to the problem of <em>market timing with a discrete feature</em>. Suppose that you have a single asset which you can trade long or short. You observe some 'feature', <span class="math">$$f_i$$</span> prior to the time required to make an investment decision to capture the returns <span class="math">$$x_i$$</span>. In this blog post we consider the case where <span class="math">$$f_i$$</span> takes one of <span class="math">$$J$$</span> known discrete values. (note: this subsumes the case where one observes a finite number of different discrete features, since you can combine them into one discrete feature.)</p> <!-- PELICAN_END_SUMMARY --> <p>The features are not something we can manipulate, and usually we consider them random (<em>e.g.</em> are we in a Bear or Bull market? are interest rates high or low? <em>etc.</em>), or if not quite random, at least uncontrollable (<em>e.g.</em> what month is it? did the FOMC just announce? <em>etc.</em>) </p> <p>Denote the states by <span class="math">$$z_j$$</span>, and then assume that, conditional on <span class="math">$$f_i=z_j$$</span> the expected value and variance of <span class="math">$$x_i$$</span> are, respectively, <span class="math">$$m_j$$</span> and <span class="math">$$s_j^2$$</span>. Let the probability that <span class="math">$$f_i=z_j$$</span> be <span class="math">$$\pi_j$$</span>. Suppose that conditional on observing <span class="math">$$f_i=z_j$$</span> you decide to hold <span class="math">$$w_j$$</span> of the asset long or short, depending on the sign of <span class="math">$$w_j$$</span>. 
The (long term) expected return of your strategy is </p> <div class="math">$$\mu = \sum_{1\le j \le J} \pi_j w_j m_j,$$</div> <p> and the (long term) variance of your returns is </p> <div class="math">$$\sigma^2 = \sum_{1\le j \le J} \pi_j w_j^2 \left(s_j^2 + m_j^2\right) - \mu^2.$$</div> <p>Note that you can directly work with these equations. For example, it is relatively easy to show that you can maximize your signal-noise ratio, <span class="math">$$\mu/\sigma$$</span>, by taking </p> <div class="math">$$w_j = c \frac{m_j}{s_j^2 + m_j^2},$$</div> <p> for some constant <span class="math">$$c$$</span> chosen to achieve some long term volatility target. However, some of the analysis we might like to perform (are the weights different for different <span class="math">$$j$$</span>? should we do this at all? <em>etc.</em>) is hard here because we have to start from scratch.</p> <h2>Flatten it!</h2> <p>This is a textbook case for "flattening", whereby we turn a conditional portfolio problem into an unconditional one. Let <span class="math">$$y_{i,j} = \chi_{f_i = z_j} x_i$$</span> be the product of the indicator for being in the <span class="math">$$j$$</span>th state and the returns <span class="math">$$x_i$$</span>. Let <span class="math">$$w$$</span> be the <span class="math">$$J$$</span>-vector of your portfolio weights <span class="math">$$w_j$$</span>. The return of your strategy on the <span class="math">$$i$$</span>th period is <span class="math">$$y_{i,\cdot} w$$</span>. 
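</p> <p>A minimal sketch of this flattening in R, using simulated returns and a hypothetical two-state feature (the data and names here are invented for illustration):</p> <div class="highlight"><pre><span></span># toy returns and a two-state feature
set.seed(101)
x &lt;- rnorm(1000, mean=0.5, sd=4)
f &lt;- sample(c(&#39;bear&#39;,&#39;bull&#39;), 1000, replace=TRUE)
# flatten: product of the state indicators and the returns
Y &lt;- cbind(bear=(f==&#39;bear&#39;) * x, bull=(f==&#39;bull&#39;) * x)
# naive Markowitz weights on the flattened returns, up to scaling
w &lt;- solve(cov(Y), colMeans(Y))</pre></div> <p>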
Letting <span class="math">$$Y$$</span> be the matrix whose <span class="math">$$i,j$$</span>th element is <span class="math">$$y_{i,j}$$</span>, you can perform naive Markowitz on the sample <span class="math">$$Y$$</span> to estimate <span class="math">$$w$$</span>.</p> <p>But now you can easily perform inference: to see whether there is any "there there", you can compute the squared sample Sharpe ratio of the sample Markowitz portfolio, then essentially use Hotelling's <span class="math">$$T^2$$</span> test. More interesting, however, is whether there are any <em>additional</em> gains to be had from market timing beyond the buy-and-hold strategy. This can be couched as the following portfolio optimization problem: </p> <div class="math">$$\max_{w: g^{\top} \Sigma w = 0} \frac{w^{\top}\mu}{\sqrt{w^{\top}\Sigma w}},$$</div> <p> where <span class="math">$$g$$</span> is some portfolio which we would like our portfolio to have no correlation to. Here the elements of the vector <span class="math">$$\mu$$</span> are <span class="math">$$\pi_j m_j$$</span>, and the covariance <span class="math">$$\Sigma$$</span> is <span class="math">$$\operatorname{diag}\left(d\right) - \mu\mu^{\top},$$</span> where <span class="math">$$d_j = \pi_j \left(s_j^2 + m_j^2\right).$$</span> To test whether market timing beats buy-and-hold, take <span class="math">$$g$$</span> to be the vector of all ones, and then test the signal-noise ratio of the resultant portfolio. (<em>n.b.</em> This test is agnostic as to whether buy-and-hold long is better than buy-and-hold short!) That test is actually a "spanning test", and can be performed by using the delta method, as I outlined in section 4.2 of my paper on the <a href="https://arxiv.org/abs/1312.0557">distribution of the Markowitz portfolio</a>.</p> <h2>Conditional Markowitz</h2> <p>In the conditional Markowitz procedure we force the <span class="math">$$s_j^2$$</span> to be equal, while allowing the <span class="math">$$m_j$$</span> to vary. 
To test this we construct a <span class="math">$$J$$</span>-vector <span class="math">$$f_i$$</span> of the indicator functions, then perform a linear regression of <span class="math">$$x_i$$</span> against <span class="math">$$f_i$$</span>. Pooling the residuals of this in-sample fit, we then compute the estimate of <span class="math">$$s_{\cdot}^2$$</span>. Note that the conditional Markowitz portfolio now has <span class="math">$$w_j$$</span> simply proportional to (our estimate of) <span class="math">$$m_j$$</span>, since the variance is fixed.</p> <p>To test for the presence of an effect one uses an MGLH test, like the Hotelling-Lawley trace. Now, however, the test for market timing ability beyond buy-and-hold is not via a spanning test. The spanning test outlined in section 4.5 of <a href="https://arxiv.org/abs/1312.0557">the asymptotic Markowitz paper</a> only tests against other static portfolios on the assets, but in this case there is only a single asset, the market; to have zero correlation to the buy-and-hold portfolio one would have to hold zero dollars of the market. To test ability beyond buy-and-hold, one should use a regression test for equality of the regression betas, in this case equivalent to testing equality of all the <span class="math">$$m_j$$</span>. That is, an 'ANOVA'.</p> <h2>Let's try it</h2> <p>Here I demonstrate the idea with some toy data. The 'market' in this case consists of the monthly simple returns of the Market portfolio, taken from the Fama French data. I have added the risk-free rate back to the market returns as they were published by <a href="https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html">Ken French</a>, since we may hold the Market long or short.</p> <p>For features I compute the 6 month rolling mean return and the 12 month volatility of Market returns. The mean computation is a bit odd, since these are simple returns, not geometric returns, and so they do not telescope. 
I lag both of these computations by two months, then align them with Market. The two month lag is equivalent to lagging the feature one month minus an epsilon, and is 'causal' in the sense that one could observe the features prior to making a trade decision.</p> <p>I then binarize both these variables, comparing the mean to <span class="math">$$1\%$$</span> per month to define the market as 'bear' or 'bull', and comparing the volatility to <span class="math">$$4\%$$</span> per square root month to define the environment as 'high vol' or 'low vol'. The product of these two gives us a feature with four states. The odd cutoffs were chosen to give approximately equal <span class="math">$$\pi_j$$</span>. Here I load the data and compute the feature. </p> <div class="highlight"><pre><span></span><span class="c1"># devtools::install_github(&#39;shabbychef/aqfb_data&#39;)</span> <span class="kn">library</span><span class="p">(</span>aqfb.data<span class="p">)</span> data<span class="p">(</span>mff4<span class="p">)</span> <span class="kp">suppressMessages</span><span class="p">({</span> <span class="kn">library</span><span class="p">(</span>fromo<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>dplyr<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>tidyr<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>magrittr<span class="p">)</span> <span class="p">})</span> df <span class="o">&lt;-</span> <span class="kt">data.frame</span><span class="p">(</span>Mkt<span class="o">=</span>mff4<span class="o">$</span>Mkt<span class="p">)</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>mean06<span class="o">=</span><span class="kp">as.numeric</span><span class="p">(</span>fromo<span class="o">::</span>running_mean<span class="p">(</span>Mkt<span class="p">,</span><span class="m">6</span><span class="p">,</span>min_df<span class="o">=</span><span class="m">6L</span><span 
class="p">)),</span> vol12<span class="o">=</span><span class="kp">as.numeric</span><span class="p">(</span>fromo<span class="o">::</span>running_sd<span class="p">(</span>Mkt<span class="p">,</span><span class="m">12</span><span class="p">,</span>min_df<span class="o">=</span><span class="m">12L</span><span class="p">)))</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>vola<span class="o">=</span><span class="kp">ifelse</span><span class="p">(</span>dplyr<span class="o">::</span>lag<span class="p">(</span>vol12<span class="p">,</span><span class="m">2</span><span class="p">)</span> <span class="o">&gt;=</span> <span class="m">4</span><span class="p">,</span><span class="s">&#39;hivol&#39;</span><span class="p">,</span><span class="s">&#39;lovol&#39;</span><span class="p">),</span> bear<span class="o">=</span><span class="kp">ifelse</span><span class="p">(</span>dplyr<span class="o">::</span>lag<span class="p">(</span>mean06<span class="p">,</span><span class="m">2</span><span class="p">)</span> <span class="o">&gt;=</span> <span class="m">1</span><span class="p">,</span><span class="s">&#39;bull&#39;</span><span class="p">,</span><span class="s">&#39;bear&#39;</span><span class="p">))</span> <span class="o">%&gt;%</span> dplyr<span class="o">::</span>filter<span class="p">(</span><span class="o">!</span><span class="kp">is.na</span><span class="p">(</span>vola<span class="p">),</span><span class="o">!</span><span class="kp">is.na</span><span class="p">(</span>bear<span class="p">))</span> <span class="o">%&gt;%</span> tidyr<span class="o">::</span>unite<span class="p">(</span>feature<span class="p">,</span>bear<span class="p">,</span>vola<span class="p">,</span>remove<span class="o">=</span><span class="kc">FALSE</span><span class="p">)</span> </pre></div> <p>Here are plots of the distribution of Market returns in each of the four states of the feature. On the top are the high volatility states; bear and bull are denoted by different colors. 
The violin plots show the distribution, while jittered points give some indication of the location of outliers.</p> <div class="highlight"><pre><span></span><span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span> <span class="kp">set.seed</span><span class="p">(</span><span class="m">1234</span><span class="p">)</span> ph <span class="o">&lt;-</span> df <span class="o">%&gt;%</span> ggplot<span class="p">(</span>aes<span class="p">(</span>x<span class="o">=</span>feature<span class="p">,</span>y<span class="o">=</span>Mkt<span class="p">,</span>color<span class="o">=</span>bear<span class="p">))</span> <span class="o">+</span> geom_violin<span class="p">()</span> <span class="o">+</span> geom_jitter<span class="p">(</span>alpha<span class="o">=</span><span class="m">0.4</span><span class="p">,</span>width<span class="o">=</span><span class="m">0.3</span><span class="p">,</span>height<span class="o">=</span><span class="m">0</span><span class="p">)</span> <span class="o">+</span> geom_hline<span class="p">(</span>yintercept<span class="o">=</span><span class="m">0</span><span class="p">,</span>linetype<span class="o">=</span><span class="m">2</span><span class="p">,</span>alpha<span class="o">=</span><span class="m">0.5</span><span class="p">)</span> <span class="o">+</span> coord_flip<span class="p">()</span> <span class="o">+</span> facet_grid<span class="p">(</span>vola<span class="o">~</span><span class="m">.</span><span class="p">,</span>space<span class="o">=</span><span class="s">&#39;free&#39;</span><span class="p">,</span>scales<span class="o">=</span><span class="s">&#39;free&#39;</span><span class="p">)</span> <span class="o">+</span> labs<span class="p">(</span>y<span class="o">=</span><span class="s">&#39;monthly market returns (pct)&#39;</span><span class="p">,</span> x<span class="o">=</span><span class="s">&#39;feature&#39;</span><span class="p">,</span> color<span class="o">=</span><span class="s">&#39;bear/bull 
market?&#39;</span><span class="p">,</span> title<span class="o">=</span><span class="s">&#39;market returns for different states&#39;</span><span class="p">)</span> <span class="kp">print</span><span class="p">(</span>ph<span class="p">)</span> </pre></div> <p><img src="https://www.gilgamath.com/figure/market_timing_violins-1.png" title="plot of chunk violins" alt="plot of chunk violins" width="1000px" height="500px" /></p> <p>We clearly see higher volatility in the <code>hivol</code> case, but it is hard to get a sense of how the mean differs in the four cases. Here I tabulate the mean and standard deviation of returns for each of the four states, and then compute the quasi Markowitz portfolio defined as <span class="math">$$m_j/ \left(s_j^2 + m_j^2\right)$$</span>. There is some momentum effect with higher Markowitz weights in bull markets, and a low-vol effect due to autocorrelated heteroskedasticity.</p> <div class="highlight"><pre><span></span><span class="kn">library</span><span class="p">(</span>knitr<span class="p">)</span> df <span class="o">%&gt;%</span> group_by<span class="p">(</span>feature<span class="p">)</span> <span class="o">%&gt;%</span> summarize<span class="p">(</span>muv<span class="o">=</span><span class="kp">mean</span><span class="p">(</span>Mkt<span class="p">),</span>sdv<span class="o">=</span>sd<span class="p">(</span>Mkt<span class="p">),</span>count<span class="o">=</span>n<span class="p">())</span> <span class="o">%&gt;%</span> ungroup<span class="p">()</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span><span class="sb">`quasi markowitz`</span><span class="o">=</span>muv <span class="o">/</span> <span class="p">(</span>sdv<span class="o">^</span><span class="m">2</span> <span class="o">+</span> muv<span class="o">^</span><span class="m">2</span><span class="p">))</span> <span class="o">%&gt;%</span> rename<span class="p">(</span><span class="sb">`mean ret`</span><span class="o">=</span>muv<span class="p">,</span><span class="sb">`sd ret`</span><span class="o">=</span>sdv<span class="p">)</span> <span class="o">%&gt;%</span> knitr<span class="o">::</span>kable<span class="p">()</span> </pre></div> <table> <thead> <tr> <th align="left">feature</th> <th align="right">mean ret</th> <th align="right">sd ret</th> <th align="right">count</th> <th align="right">quasi markowitz</th> </tr> </thead> <tbody> <tr> <td align="left">bear_hivol</td> <td align="right">0.772418</td> <td align="right">7.90544</td> <td align="right">273</td> <td align="right">0.012243</td> </tr> <tr> <td align="left">bear_lovol</td> <td align="right">0.695631</td> <td align="right">4.09344</td> <td align="right">222</td> <td align="right">0.040350</td> </tr> <tr> <td align="left">bull_hivol</td> <td align="right">1.115115</td> <td align="right">5.03884</td> <td align="right">262</td> <td align="right">0.041869</td> </tr> <tr> <td align="left">bull_lovol</td> <td align="right">1.014484</td> <td align="right">3.41283</td> <td align="right">310</td> <td align="right">0.080028</td> </tr> </tbody> </table> <p>Now let's perform inference. Luckily I already coded all of the tests we will need here in <code>SharpeR</code>. For flattening, take the product of the Market returns and the dummy 0/1 variables for the feature. 
I then feed them to <code>as.sropt</code>, which computes and displays: the Sharpe ratio of the Markowitz portfolio; the "Sharpe Ratio Information Criterion" of <a href="https://arxiv.org/abs/1602.06186">Paulsen and Sohl</a>, which is unbiased for the out-of-sample performance; the 95 percent confidence bounds on the optimal Signal-Noise ratio; the Hotelling <span class="math">$$T^2$$</span> and associated <span class="math">$$p$$</span>-value.</p> <div class="highlight"><pre><span></span><span class="kp">suppressMessages</span><span class="p">({</span> <span class="kn">library</span><span class="p">(</span>SharpeR<span class="p">)</span> <span class="kn">library</span><span class="p">(</span>fastDummies<span class="p">)</span> <span class="p">})</span> mktsr <span class="o">&lt;-</span> as.sr<span class="p">(</span>df<span class="o">$</span>Mkt<span class="p">,</span>ope<span class="o">=</span><span class="m">12</span><span class="p">)</span> Y <span class="o">&lt;-</span> df <span class="o">%&gt;%</span> dummy_columns<span class="p">(</span>select_columns<span class="o">=</span><span class="s">&#39;feature&#39;</span><span class="p">)</span> <span class="o">%&gt;%</span> mutate<span class="p">(</span>y_bull_hivol<span class="o">=</span>Mkt <span class="o">*</span> feature_bull_hivol<span class="p">,</span> y_bear_hivol<span class="o">=</span>Mkt <span class="o">*</span> feature_bear_hivol<span class="p">,</span> y_bull_lovol<span class="o">=</span>Mkt <span class="o">*</span> feature_bull_lovol<span class="p">,</span> y_bear_lovol<span class="o">=</span>Mkt <span class="o">*</span> feature_bear_lovol<span class="p">)</span> <span class="o">%&gt;%</span> select<span class="p">(</span>matches<span class="p">(</span><span class="s">&#39;^y_(bull|bear)_(hi|lo)vol$&#39;</span><span class="p">))</span> sstar <span class="o">&lt;-</span> as.sropt<span class="p">(</span>Y<span class="p">,</span>ope<span class="o">=</span><span class="m">12</span><span class="p">)</span> 
<span class="kp">print</span><span class="p">(</span>sstar<span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span> SR/sqrt(yr) SRIC/sqrt(yr) 2.5 % 97.5 % T^2 value Pr(&gt;T^2) Sharpe 0.738 0.692 0.499 0.927 48.4 1.3e-09 *** --- Signif. codes: 0 &#39;***&#39; 0.001 &#39;**&#39; 0.01 &#39;*&#39; 0.05 &#39;.&#39; 0.1 &#39; &#39; 1 </pre></div> <p>We compute the Sharpe ratio of the sample Markowitz portfolio to be <span class="math">$$0.738022 \mbox{yr}^{-1/2}$$</span>. Compare this to the Sharpe ratio of the long market portfolio, which we compute to be around <span class="math">$$0.585551 \mbox{yr}^{-1/2}$$</span>. We now perform the spanning test. This is via the <code>as.del_sropt</code> function, where we feed in portfolios to hedge against. We display the in-sample Sharpe statistic, confidence intervals on the population quantity, and the <span class="math">$$F$$</span> statistic and <span class="math">$$p$$</span> value.</p> <div class="highlight"><pre><span></span>spansr <span class="o">&lt;-</span> as.del_sropt<span class="p">(</span>Y<span class="p">,</span>G<span class="o">=</span><span class="kt">matrix</span><span class="p">(</span><span class="kp">rep</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="m">4</span><span class="p">),</span>nrow<span class="o">=</span><span class="m">1</span><span class="p">),</span>ope<span class="o">=</span><span class="m">12</span><span class="p">)</span> <span class="kp">print</span><span class="p">(</span>spansr<span class="p">)</span> </pre></div> <div class="highlight"><pre><span></span> SR/sqrt(yr) 2.5 % 97.5 % F value Pr(&gt;F) Sharpe 0.449 0.2 0.639 5.8 0.00062 *** --- Signif. codes: 0 &#39;***&#39; 0.001 &#39;**&#39; 0.01 &#39;*&#39; 0.05 &#39;.&#39; 0.1 &#39; &#39; 1 </pre></div> <p>We estimate the Sharpe of the hedged portfolio to be <span class="math">$$0.449229 \mbox{yr}^{-1/2}$$</span>. It is worth pointing out the subadditivity of SNR here. 
If you have two uncorrelated assets, the Signal-Noise ratio of the optimal portfolio on the assets is the square root of the sum of the squared SNRs of the assets. Generalizing to <span class="math">$$k$$</span> independent assets, the optimal SNR is the length of the vector whose elements are the SNRs of the assets. In this case we observe </p> <div class="math">$$\sqrt{0.449229^2 + 0.585551^2} \approx 0.738022,$$</div> <p> which was the Sharpe of the unhedged timing portfolio. The gains beyond buy-and-hold seem modest indeed; one would require very patient investors to prove out this strategy in real trading.</p> <p><code>SharpeR</code> does not compute the portfolio weights. So here I use <code>MarkowitzR</code> to compute and display the weights of the unhedged Markowitz portfolio and the Markowitz portfolio hedged against buy-and-hold. The first should have weights proportional to the quasi Markowitz weights shown above.</p> <div class="highlight"><pre><span></span><span class="kn">library</span><span class="p">(</span>MarkowitzR<span class="p">)</span> bare <span class="o">&lt;-</span> mp_vcov<span class="p">(</span>Y<span class="p">)</span> kable<span class="p">(</span>bare<span class="o">$</span>W<span class="p">,</span>caption<span class="o">=</span><span class="s">&#39;unhedged portfolio&#39;</span><span class="p">)</span> </pre></div> <p>Table: unhedged portfolio</p> <table> <thead> <tr> <th align="left"></th> <th align="right">Intercept</th> </tr> </thead> <tbody> <tr> <td align="left">y_bull_hivol</td> <td align="right">0.043931</td> </tr> <tr> <td align="left">y_bear_hivol</td> <td align="right">0.012845</td> </tr> <tr> <td align="left">y_bull_lovol</td> <td align="right">0.083913</td> </tr> <tr> <td align="left">y_bear_lovol</td> <td align="right">0.042368</td> </tr> </tbody> </table> <div class="highlight"><pre><span></span>vsbh <span class="o">&lt;-</span> mp_vcov<span class="p">(</span>Y<span class="p">,</span>Gmat<span class="o">=</span><span 
class="kt">matrix</span><span class="p">(</span><span class="kp">rep</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="m">4</span><span class="p">),</span>nrow<span class="o">=</span><span class="m">1</span><span class="p">))</span> kable<span class="p">(</span>vsbh<span class="o">$</span>W<span class="p">,</span>caption<span class="o">=</span><span class="s">&#39;hedged portfolio&#39;</span><span class="p">)</span> </pre></div> <p>Table: hedged portfolio</p> <table> <thead> <tr> <th align="left"></th> <th align="right">Intercept</th> </tr> </thead> <tbody> <tr> <td align="left">y_bull_hivol</td> <td align="right">0.012535</td> </tr> <tr> <td align="left">y_bear_hivol</td> <td align="right">-0.018551</td> </tr> <tr> <td align="left">y_bull_lovol</td> <td align="right">0.052517</td> </tr> <tr> <td align="left">y_bear_lovol</td> <td align="right">0.010972</td> </tr> </tbody> </table> <p>The hedged portfolio has negative or near-zero weights in bear markets, and generally smaller holdings in high volatility environments, as expected. Note that we have achieved zero correlation to buy-and-hold apparently without having zero-mean weights. In reality our portfolio weights have zero volatility-weighted mean.</p> <p>All of this analysis was via the "flattening trick". I realize I do not have good tools in place to perform the spanning test in the conditional Markowitz formulation. Of course, R has tools for the ANOVA test, but they will not report the effect size in units like the Sharpe, so it is hard to interpret economic significance. However, I can easily compute the conditional Markowitz portfolio weights, which I tabulate below. Note that the assumption of equal volatility makes the portfolio weights proportional to the estimated mean returns.</p> <div class="highlight"><pre><span></span><span class="c1"># conditional markowitz. 
</span> featfit <span class="o">&lt;-</span> mp_vcov<span class="p">(</span>X<span class="o">=</span><span class="kp">as.matrix</span><span class="p">(</span>df<span class="o">$</span>Mkt<span class="p">),</span> feat<span class="o">=</span>df <span class="o">%&gt;%</span> dummy_columns<span class="p">(</span>select_columns<span class="o">=</span><span class="s">&#39;feature&#39;</span><span class="p">)</span> <span class="o">%&gt;%</span> select<span class="p">(</span>matches<span class="p">(</span><span class="s">&#39;^feature_(bull|bear)_(hi|lo)vol$&#39;</span><span class="p">))</span> <span class="o">%&gt;%</span> <span class="kp">as.matrix</span><span class="p">(),</span> fit.intercept<span class="o">=</span><span class="kc">FALSE</span><span class="p">)</span> kable<span class="p">(</span><span class="kp">t</span><span class="p">(</span>featfit<span class="o">$</span>W<span class="p">),</span>caption<span class="o">=</span><span class="s">&#39;conditional Markowitz unhedged portfolio&#39;</span><span class="p">)</span> </pre></div> <p>Table: conditional Markowitz unhedged portfolio</p> <table> <thead> <tr> <th align="left"></th> <th align="right">as.matrix(df$Mkt)1</th> </tr> </thead> <tbody> <tr> <td align="left">feature_bear_hivol</td> <td align="right">0.026648</td> </tr> <tr> <td align="left">feature_bear_lovol</td> <td align="right">0.023999</td> </tr> <tr> <td align="left">feature_bull_hivol</td> <td align="right">0.038471</td> </tr> <tr> <td align="left">feature_bull_lovol</td> <td align="right">0.034999</td> </tr> </tbody> </table> <h3>Caveats</h3> <p>I feel it is worthwhile to point out this is a toy analysis: the data go back to the late 1920's, which was a far different trading environment; we ignore any trading frictions and assume you can freely short or lever the Market; the feature is highly autocorrelated so investors are unlikely to see the long-term benefit of this timing portfolio, <em>etc.</em> In any case, don't take investing advice from a 
blogpost.</p> <h2>Further work</h2> <p>There is a non-parametric analogue of the flattening trick used here that applies to the case of market timing with a single continuous feature, which I hope to present in a future blog post.</p> <script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#@#')) { var align = "center", indent = "0em", linebreak = "false"; if (false) { align = (screen.width < 768) ? "left" : align; indent = (screen.width < 768) ? "0em" : indent; linebreak = (screen.width < 768) ? 'true' : linebreak; } var mathjaxscript = document.createElement('script'); mathjaxscript.id = 'mathjaxscript_pelican_#%@#@#'; mathjaxscript.type = 'text/javascript'; mathjaxscript.src = 'https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML'; mathjaxscript[(window.opera ? "innerHTML" : "text")] = "MathJax.Hub.Config({" + " config: ['MMLorHTML.js']," + " TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' } }," + " jax: ['input/TeX','input/MathML','output/HTML-CSS']," + " extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," + " displayAlign: '"+ align +"'," + " displayIndent: '"+ indent +"'," + " showMathMenu: true," + " messageStyle: 'normal'," + " tex2jax: { " + " inlineMath: [ ['\\\$$','\\\$$'] ], " + " displayMath: [ ['$$','$$'] ]," + " processEscapes: true," + " preview: 'TeX'," + " }, " + " 'HTML-CSS': { " + " styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! 
important'} }," + " linebreaks: { automatic: "+ linebreak +", width: '90% container' }," + " }, " + "}); " + "if ('default' !== 'default') {" + "MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "}"; (document.body || document.getElementsByTagName('head')).appendChild(mathjaxscript); } </script>Conditional Portfolios with Feature Flattening2019-06-19T21:04:21-07:002019-06-19T21:04:21-07:00Steven E. Pavtag:www.gilgamath.com,2019-06-19:/portfolio-flattening.html<h2>Conditional Portfolios</h2> <p>When I first started working at a quant fund I tried to read about portfolio theory. (Beyond, you know, "<em>Hedge Funds for Dummies</em>.") I learned about various objectives and portfolio constraints, including the Markowitz portfolio, which felt very natural. Markowitz solves the mean-variance optimization problem, as well as the Sharpe maximization problem, namely </p> <div class="math">$$\operatorname{argmax}_w \frac{w^{\top}\mu}{\sqrt{w^{\top} \Sigma w}}.$$</div> <p> This is solved, up to scaling, by the Markowitz portfolio <span class="math">$$\Sigma^{-1}\mu$$</span>.</p> <p>When I first read about the theory behind Markowitz, I did not read anything about where <span class="math">$$\mu$$</span> and <span class="math">$$\Sigma$$</span> come from. 
I assumed the authors I was reading were talking about the vanilla sample estimates of the mean and covariance, though the theory does not require this.</p> <p>There are some problems with the Markowitz portfolio. For us, as a small quant fund, the most pressing issue was that holding the Markowitz portfolio based on the historical mean and covariance was not a good look. You don't get paid "2 and twenty" for computing some long term averages.</p> <!-- PELICAN_END_SUMMARY --> <p>Rather than holding an <em>unconditional</em> portfolio, we sought to construct a <em>conditional</em> one, conditional on some "features". (I now believe this topic falls under the rubric of "Tactical Asset Allocation".) We stumbled on two simple methods for adapting Markowitz theory to accept conditioning information: Conditional Markowitz, and "Flattening".</p> <h2>Conditional Markowitz</h2> <p>Suppose you observe some <span class="math">$$l$$</span> vector of features, <span class="math">$$f_i$$</span> prior to the time you have to allocate into <span class="math">$$p$$</span> assets to enjoy returns <span class="math">$$x_i$$</span>. Assume that the returns are linear in the features, but the covariance is a long term average. That is </p> <div class="math">$$E\left[x_i \left|f_i\right.\right] = B f_i,\quad\mbox{Var}\left(x_i \left|f_i\right.\right) = \Sigma.$$</div> <p>Note that Markowitz theory never really said how to estimate mean …</p><h2>Conditional Portfolios</h2> <p>When I first started working at a quant fund I tried to read about portfolio theory. (Beyond, you know, "<em>Hedge Funds for Dummies</em>.") I learned about various objectives and portfolio constraints, including the Markowitz portfolio, which felt very natural. 
Markowitz solves the mean-variance optimization problem, as well as the Sharpe maximization problem, namely </p> <div class="math">$$\operatorname{argmax}_w \frac{w^{\top}\mu}{\sqrt{w^{\top} \Sigma w}}.$$</div> <p> This is solved, up to scaling, by the Markowitz portfolio <span class="math">$$\Sigma^{-1}\mu$$</span>.</p> <p>When I first read about the theory behind Markowitz, I did not read anything about where <span class="math">$$\mu$$</span> and <span class="math">$$\Sigma$$</span> come from. I assumed the authors I was reading were talking about the vanilla sample estimates of the mean and covariance, though the theory does not require this.</p> <p>There are some problems with the Markowitz portfolio. For us, as a small quant fund, the most pressing issue was that holding the Markowitz portfolio based on the historical mean and covariance was not a good look. You don't get paid "2 and twenty" for computing some long term averages.</p> <!-- PELICAN_END_SUMMARY --> <p>Rather than holding an <em>unconditional</em> portfolio, we sought to construct a <em>conditional</em> one, conditional on some "features". (I now believe this topic falls under the rubric of "Tactical Asset Allocation".) We stumbled on two simple methods for adapting Markowitz theory to accept conditioning information: Conditional Markowitz, and "Flattening".</p> <h2>Conditional Markowitz</h2> <p>Suppose you observe some <span class="math">$$l$$</span> vector of features, <span class="math">$$f_i$$</span> prior to the time you have to allocate into <span class="math">$$p$$</span> assets to enjoy returns <span class="math">$$x_i$$</span>. Assume that the returns are linear in the features, but the covariance is a long term average. 
That is </p> <div class="math">$$E\left[x_i \left|f_i\right.\right] = B f_i,\quad\mbox{Var}\left(x_i \left|f_i\right.\right) = \Sigma.$$</div> <p>Note that Markowitz theory never really said how to estimate mean returns, and thus the conditional expectation here can be used directly in the Markowitz portfolio definition. Thus the conditional Markowitz portfolio, conditional on observing <span class="math">$$f_i$$</span> is simply <span class="math">$$\Sigma^{-1} B f_i$$</span>. Another way of viewing this is to estimate the "Markowitz coefficient", <span class="math">$$W=\Sigma^{-1} B$$</span> and just multiply this by <span class="math">$$f_i$$</span> when it is observed.</p> <p>I have written about inference on the <a href="https://arxiv.org/abs/1312.0557">conditional Markowitz</a> portfolio: via the MGLH tests one can test essentially whether <span class="math">$$W$$</span> is all zeros, or test the total effect size. However, the conditional Markowitz procedure is, like the unconditional procedure, subject to the <a href="https://arxiv.org/abs/1409.5936">Cramer Rao portfolio bounds</a> in the 'obvious' way: increasing the number of fit coefficients faster than the signal-noise ratio can cause degraded out-of-sample performance.</p> <h2>The Flattening Trick</h2> <p>The other approach for adding conditional information is slicker. When I first reinvented it, I called it the "flattening trick". I assumed it was well established in the folklore of the quant community, but I have only found one reference to it, a <a href="https://www.researchgate.net/publication/5184999_Dynamic_Portfolio_Selection_by_Augmenting_the_Asset_Space">paper by Brandt and Santa Clara</a>, where they refer to it as "augmenting the asset space". </p> <p>The idea is as follows: in the conditional Markowitz procedure we ended with a matrix <span class="math">$$W$$</span> such that, conditional on <span class="math">$$f_i$$</span> we would hold portfolio <span class="math">$$W f_i$$</span>. 
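As a minimal sketch of the conditional Markowitz procedure on synthetic data (the dimensions, seed, and all variable names here are invented for illustration, not taken from any analysis above), one can estimate the coefficient matrix by least squares, the covariance from residuals, and then form the Markowitz coefficient:

```r
# Sketch of conditional Markowitz on synthetic data (illustrative only):
# estimate B by least squares, Sigma from the residuals, then W = Sigma^{-1} B.
set.seed(42)
n <- 500; p <- 3; l <- 2
Fmat  <- matrix(rnorm(n * l), nrow=n)                  # observed features
Btrue <- matrix(c(0.02, 0.00, 0.01, 0.03, -0.01, 0.00), nrow=p)
X <- Fmat %*% t(Btrue) + matrix(rnorm(n * p, sd=0.1), nrow=n)  # asset returns
Bhat  <- t(coef(lm(X ~ Fmat - 1)))                     # p x l coefficient matrix
Sigma <- cov(X - Fmat %*% t(Bhat))                     # long term residual covariance
W <- solve(Sigma, Bhat)                                # the Markowitz coefficient
w_i <- W %*% c(1, -0.5)                                # portfolio on observing some f_i
```

On observing a new feature vector, the portfolio is just a matrix-vector multiply, which is the sense in which the whole procedure reduces to estimating <span class="math">$$W$$</span>.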
Why not just start with the assumption that you seek a portfolio that is linear in <span class="math">$$f_i$$</span> and optimize the <span class="math">$$W$$</span>? Note that the returns you experience by holding <span class="math">$$W f_i$$</span> are exactly </p> <div class="math">$$x_i^{\top} W f_i = \operatorname{trace}\left(x_i^{\top} W f_i\right) = \operatorname{trace}\left(f_i x_i^{\top} W\right) = \operatorname{vec}^{\top}\left(x_i f_i^{\top}\right) \operatorname{vec}\left(W\right),$$</div> <p> where <span class="math">$$\operatorname{vec}$$</span> is the vectorization operator that takes a matrix to a vector columnwise. I called this "flattening," but maybe it's more like "unravelling".</p> <p>Now note that the optimization problem you are trying to solve is to find the vector <span class="math">$$\operatorname{vec}\left(W\right)$$</span>, with pseudo-returns of <span class="math">$$y_i = \operatorname{vec}\left(x_i f_i^{\top}\right)$$</span>.<br> You can simply construct these pseudo returns <span class="math">$$y_i$$</span> from your historical data, and feed them into an unconditional portfolio process. You can use unconditional Markowitz for this, or any other unconditional procedure. Then take the results of the unconditional process and unflatten them back to <span class="math">$$W$$</span>.</p> <p>Note that even when you use unconditional Markowitz on the flattened problem, you will not regain the <span class="math">$$W$$</span> from conditional Markowitz. The reason is that we are essentially allowing the covariance of returns to vary with our features as well, which was not possible in conditional Markowitz. In practice we often found that the flattening trick had slightly worse out-of-sample performance than conditional Markowitz when used on the same data, which we broadly attributed to overfitting. 
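The flattening identity is easy to check numerically; here is a minimal R sketch (the dimensions and seed are arbitrary assumptions). With the columnwise vec, the pseudo-return is built from the outer product of returns with features, so that it aligns elementwise with <span class="math">$$\operatorname{vec}\left(W\right)$$</span>:

```r
# Numerical check of the flattening identity: x' W f equals the inner
# product of the flattened outer product with vec(W) (columnwise vec).
set.seed(101)
p <- 4; l <- 3
x <- rnorm(p)                       # one period of asset returns
f <- rnorm(l)                       # the features observed beforehand
W <- matrix(rnorm(p * l), nrow=p)   # a candidate Markowitz coefficient
lhs <- drop(t(x) %*% W %*% f)       # return from holding portfolio W f
rhs <- sum(as.vector(x %o% f) * as.vector(W))   # flattened version
stopifnot(isTRUE(all.equal(lhs, rhs)))
```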
In conditional Markowitz we would estimate the <span class="math">$$p \times l$$</span> matrix <span class="math">$$B$$</span> and the <span class="math">$$p \times p$$</span> matrix <span class="math">$$\Sigma$$</span>, to arrive at the <span class="math">$$p \times l$$</span> matrix <span class="math">$$W$$</span>. In flattening plus unconditional Markowitz you estimate a <span class="math">$$pl$$</span> vector of means, and the <span class="math">$$pl \times pl$$</span> matrix of covariance to arrive at the <span class="math">$$p \times l$$</span> matrix <span class="math">$$W$$</span>.</p> <p>To mitigate the overfitting, it is fairly easy to add sparsity to the flattening trick. If you wish to force an element of <span class="math">$$W$$</span> to be zero, because you think a certain feature should have no bearing on your holdings of a certain asset, you can just elide it from the flattening pseudo returns. Moreover, if you feel that a certain feature should only have, say, a positive influence on your holdings of a particular asset, you can directly impose that positivity constraint in the pseudo portfolio optimization problem. Because you are solving directly for elements of <span class="math">$$W$$</span>, this is much easier than in conditional Markowitz where <span class="math">$$W$$</span> is the product of two matrices.</p> <p>Flattening is a neat trick. You should consider it the next time you're allocating assets tactically.</p> <script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#@#')) { var align = "center", indent = "0em", linebreak = "false"; if (false) { align = (screen.width < 768) ? "left" : align; indent = (screen.width < 768) ? "0em" : indent; linebreak = (screen.width < 768) ? 
'true' : linebreak; } var mathjaxscript = document.createElement('script'); mathjaxscript.id = 'mathjaxscript_pelican_#%@#@#'; mathjaxscript.type = 'text/javascript'; mathjaxscript.src = 'https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML'; mathjaxscript[(window.opera ? "innerHTML" : "text")] = "MathJax.Hub.Config({" + " config: ['MMLorHTML.js']," + " TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' } }," + " jax: ['input/TeX','input/MathML','output/HTML-CSS']," + " extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," + " displayAlign: '"+ align +"'," + " displayIndent: '"+ indent +"'," + " showMathMenu: true," + " messageStyle: 'normal'," + " tex2jax: { " + " inlineMath: [ ['\\\$$','\\\$$'] ], " + " displayMath: [ ['$$','$$'] ]," + " processEscapes: true," + " preview: 'TeX'," + " }, " + " 'HTML-CSS': { " + " styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! 
important'} }," + " linebreaks: { automatic: "+ linebreak +", width: '90% container' }," + " }, " + "}); " + "if ('default' !== 'default') {" + "MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "}"; (document.body || document.getElementsByTagName('head')).appendChild(mathjaxscript); } </script>No Parity like a Risk Parity.2019-06-09T22:53:04-07:002019-06-09T22:53:04-07:00Steven E. Pavtag:www.gilgamath.com,2019-06-09:/risk-parity.html<h2>Portfolio Selection and Exchangeability</h2> <p>Consider the problem of <em>portfolio selection</em>, where you observe some historical data on <span class="math">$$p$$</span> assets, say <span class="math">$$n$$</span> days worth in an <span class="math">$$n\times p$$</span> matrix, <span class="math">$$X$$</span>, and then are required to construct a (dollarwise) portfolio <span class="math">$$w$$</span>. You can view this task as a function <span class="math">$$w\left(X\right)$$</span>. There are a few different kinds of <span class="math">$$w$$</span> function: Markowitz, equal dollar, Minimum Variance, Equal Risk Contribution ('Risk Parity'), and so on.</p> <p>How are we to choose among these competing approaches? 
Their supporters can point to theoretical underpinnings, but these often seem a bit shaky even from a distance. Usually evidence is provided in the form of backtests on the historical returns of some universe of assets. It can be hard to generalize from a single history, and these backtests rarely offer theoretical justification for the differential performance in methods.</p> <!-- PELICAN_END_SUMMARY --> <p>One way to consider these different methods of portfolio construction is via the lens of <em>exchangeability</em>. Roughly speaking, how does the function <span class="math">$$w\left(X\right)$$</span> react under certain systematic changes in <span class="math">$$X$$</span> that "shouldn't" matter. For example, suppose that the ticker changed on one stock in your universe. Suppose you order the columns of <span class="math">$$X$$</span> alphabetically, so now you must reorder your <span class="math">$$X$$</span>. Assuming no new data has been observed, shouldn't <span class="math">$$w\left(X\right)$$</span> simply reorder its output in the same way?</p> <p>Put another way, suppose a method <span class="math">$$w$$</span> systematically overweights the first element of the universe (This seems more like a bug than a feature), and you observe backtests over the 2000's on U.S. equities where <code>AAPL</code> happened to be the first stock in the universe. Your <span class="math">$$w$$</span> might seem to outperform other methods for no good reason.</p> <p>Equivariance to order is a kind of exchangeability condition. 
The 'right' kind of <span class="math">$$w$$</span> is 'order …</p><h2>Portfolio Selection and Exchangeability</h2> <p>Consider the problem of <em>portfolio selection</em>, where you observe some historical data on <span class="math">$$p$$</span> assets, say <span class="math">$$n$$</span> days worth in an <span class="math">$$n\times p$$</span> matrix, <span class="math">$$X$$</span>, and then are required to construct a (dollarwise) portfolio <span class="math">$$w$$</span>. You can view this task as a function <span class="math">$$w\left(X\right)$$</span>. There are a few different kinds of <span class="math">$$w$$</span> function: Markowitz, equal dollar, Minimum Variance, Equal Risk Contribution ('Risk Parity'), and so on.</p> <p>How are we to choose among these competing approaches? Their supporters can point to theoretical underpinnings, but these often seem a bit shaky even from a distance. Usually evidence is provided in the form of backtests on the historical returns of some universe of assets. It can be hard to generalize from a single history, and these backtests rarely offer theoretical justification for the differential performance in methods.</p> <!-- PELICAN_END_SUMMARY --> <p>One way to consider these different methods of portfolio construction is via the lens of <em>exchangeability</em>. Roughly speaking, how does the function <span class="math">$$w\left(X\right)$$</span> react under certain systematic changes in <span class="math">$$X$$</span> that "shouldn't" matter. For example, suppose that the ticker changed on one stock in your universe. Suppose you order the columns of <span class="math">$$X$$</span> alphabetically, so now you must reorder your <span class="math">$$X$$</span>. 
Assuming no new data has been observed, shouldn't <span class="math">$$w\left(X\right)$$</span> simply reorder its output in the same way?</p> <p>Put another way, suppose a method <span class="math">$$w$$</span> systematically overweights the first element of the universe (This seems more like a bug than a feature), and you observe backtests over the 2000's on U.S. equities where <code>AAPL</code> happened to be the first stock in the universe. Your <span class="math">$$w$$</span> might seem to outperform other methods for no good reason.</p> <p>Equivariance to order is a kind of exchangeability condition. The 'right' kind of <span class="math">$$w$$</span> is 'order exchangeable'. Other examples come from considering rotations or basketization. Suppose that today your universe consists of stocks A and B, which you can hold long or short, but tomorrow you can only buy basket C which is equal dollars long in A and B, and basket D which is equal dollars long A and short B. Tomorrow you can achieve the same holdings that you wanted today, but by holding the baskets. Your portfolio function should be exchangeable with respect to this transformation, suggesting you hold the same equivalent position.</p> <p>In math, let <span class="math">$$Q$$</span> be an invertible <span class="math">$$p\times p$$</span> matrix. We will consider what should happen if returns are transformed by <span class="math">$$Q^{\top}$$</span>. Exchangeability holds when </p> <div class="math">$$w\left(X Q\right) = Q^{-1}w\left(X\right).$$</div> <p> If this holds for all invertible <span class="math">$$Q$$</span> then the <span class="math">$$w$$</span> satisfies the exchangeability condition. Some <span class="math">$$w$$</span> might maintain the above relationship for some kinds of <span class="math">$$Q$$</span>, leading to weaker forms of exchangeability. 
Here we name them with the class of <span class="math">$$Q$$</span>:</p> <ul> <li>A <span class="math">$$w$$</span> satisfies the 'order exchangeability' property if it is exchangeable for all permutation matrices <span class="math">$$Q$$</span>;</li> <li>'leverage exchangeability' if it is exchangeable for all diagonal <span class="math">$$Q$$</span>;</li> <li>'rotational exchangeability' if it is exchangeable for all orthogonal <span class="math">$$Q$$</span>.</li> </ul> <p>Leverage exchangeability is illustrated by considering what would happen if each asset was replaced by, say, a 2x or 3x levered version of the same asset.</p> <p>One consequence of exchangeability is that "only returns matter". That is, if we exchange returns <span class="math">$$x\mapsto Q^{\top}x$$</span>, and portfolio <span class="math">$$w\mapsto Q^{-1}w$$</span>, then the returns achieved by that portfolio map to <span class="math">$$x^{\top}w \mapsto x^{\top}Q Q^{-1}w = x^{\top}w$$</span>. The returns you achieve are the same under the transformation. This dependence of <span class="math">$$w$$</span> on returns was a key assumption in my <a href="https://arxiv.org/abs/1409.5936">work on portfolio quality bounds</a>.</p> <p>It should be recognized that "only returns matter" is questionable in practical portfolio construction, since the real world often imposes constraints (long only, max concentration), exhibits different costs and frictions for different assets, and contains other oddities like tax implications, and so on.</p> <p>Constraints in particular complicate the general definition of exchangeability because in the transformation by <span class="math">$$Q$$</span> the original constraints should also be translated. In some cases, say where the constraint is an upper bound on risk, the constraint definition is identical under the transformation. 
However, the image of a long-only constraint under a general linear transformation by <span class="math">$$Q$$</span> will not, in general, still be a long-only constraint. In a long-only world, we can perhaps only expect order- or (positive) leverage exchangeability, and not the general form.</p> <p>Setting aside the issues with constraints, it is still useful, I think, to consider the <em>objectives</em> of portfolio construction techniques with respect to exchangeability, inasmuch as they have them.</p> <p>For example, the "one over N" (or "equal dollar", "Talmudic", <em>etc.</em>) rule clearly does not satisfy general exchangeability, nor leverage exchangeability. The Equal Risk Contribution portfolio, which we will describe below, also fails exchangeability. The Markowitz Portfolio, however, does satisfy exchangeability: </p> <div class="math">$$\Sigma^{-1}\mu \mapsto Q^{-1}\Sigma^{-1}Q^{-\top}Q^{\top}\mu = Q^{-1}\Sigma^{-1}\mu,$$</div> <p> as needed.</p> <p>In fact, "equal dollar" seems not so much an objective as a constraint on the portfolio allocation. There is no objective beyond perhaps "make it seem like we are doing something with client money." The same complaint will apply to ERC. To be fair, you <em>can</em> express Markowitz, Mean Variance, ERC (and, I believe, equal dollar) as similar <a href="https://www.grahamcapital.com/Equal%20Risk%20Contribution%20April%202019.pdf">optimization problems with risk constraints</a>. However, the objectives do look a lot like make-work.</p> <h2>Equal Risk Contribution</h2> <p>The set-up for the Equal Risk Contribution portfolio, or Risk Parity, is as follows: define the risk of a portfolio <span class="math">$$w$$</span> as the standard deviation of returns, <span class="math">$$r = \sqrt{w^{\top}\Sigma w}$$</span>. This function is homogeneous of order 1, meaning that if you positively rescale your whole portfolio by <span class="math">$$k$$</span>, the risk scales by <span class="math">$$k$$</span>. 
That is, if you map <span class="math">$$w \mapsto k w$$</span> then <span class="math">$$\sqrt{w^{\top}\Sigma w} \mapsto k \sqrt{w^{\top}\Sigma w}$$</span> for positive <span class="math">$$k$$</span>.</p> <p>Using Euler's homogeneous function theorem, we can express the risk as </p> <div class="math">$$r = w^{\top} \nabla_{w}r = w^{\top} \frac{\Sigma w}{\sqrt{w^{\top}\Sigma w}}.$$</div> <p> The theory behind Risk Parity then says that, because of this equation, the vector <span class="math">$$w \odot \frac{\Sigma w}{\sqrt{w^{\top}\Sigma w}}$$</span> is the "risk in each asset," where <span class="math">$$\odot$$</span> denotes the Hadamard (elementwise) product. This is very tempting because the sum of the elements of this vector is exactly <span class="math">$$r$$</span> by Euler's theorem. The Equal Risk Contribution portfolio is the one such that each element of <span class="math">$$w \odot \frac{\Sigma w}{\sqrt{w^{\top}\Sigma w}}$$</span> is the same. It has "equal risk in each asset".</p> <p>However, I can see no principled reason to view this vector as the risk in each asset. By definition, it is the marginal contribution to risk from each asset due to a proportional change in holdings. That is, it is equal to <span class="math">$$\nabla_{\log(w)}r$$</span>, and expresses how risk would change under a small proportional change in each weight of your portfolio. However, it is clearly not the risk in each asset because it can contain negative elements! If you hold an asset that diversifies (<em>i.e.</em>, has negative correlation with) your existing holdings, then increasing your allocation to it can decrease total risk. The fact that the elements of this vector sum to the total risk is also not convincing: one could just as easily say that each asset has <span class="math">$$r / p$$</span> risk in it, and capture the same property.</p> <p>As mentioned above, the risk contribution vector does not satisfy an exchangeability condition. 
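</p>

<p>The negative-elements point is easy to see numerically. Here is a minimal Python/numpy sketch (the numbers are illustrative; <code>risk_contrib</code> mirrors the vector <span class="math">$$w \odot \frac{\Sigma w}{\sqrt{w^{\top}\Sigma w}}$$</span> above):</p>

```python
import numpy as np

def risk_contrib(w, Sigma):
    """The so-called risk contribution vector: w * (Sigma w) / sqrt(w' Sigma w)."""
    Sw = Sigma @ w
    return w * Sw / np.sqrt(w @ Sw)

# two assets with strong negative correlation, both held long
Sigma = np.array([[1.0, -0.9],
                  [-0.9, 1.0]])
w = np.array([1.0, 0.2])

rc = risk_contrib(w, Sigma)
print(rc)                                            # the second element is negative
print(np.isclose(rc.sum(), np.sqrt(w @ Sigma @ w)))  # True: elements sum to total risk
```

<p>The diversifying asset's "contribution" is negative even though it is held long, yet the elements still sum to the total risk, exactly as Euler's theorem promises.</p>

<p>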
Taking <span class="math">$$x\mapsto Q^{\top}x$$</span> and mapping the portfolio as exchangeability would demand, <span class="math">$$w\mapsto Q^{-1}w$$</span>, the risk is unchanged, <span class="math">$$r \mapsto r$$</span>, but </p> <div class="math">$$w \odot \frac{\Sigma w}{r} \mapsto Q^{-1} w \odot \frac{Q^{\top}\Sigma w}{r}.$$</div> <p> That is, if <span class="math">$$w$$</span> was the ERC portfolio, then <span class="math">$$Q^{-1}w$$</span> is generally not the ERC portfolio in the transformed space.</p> <p>You can confirm this in code, which I have lifted from the <code>riskParityPortfolio</code> <a href="https://cran.r-project.org/web/packages/riskParityPortfolio/vignettes/RiskParityPortfolio.html">vignette</a>. The ERC is not exchangeable for general <span class="math">$$Q$$</span> or orthogonal <span class="math">$$Q$$</span>, but it is for diagonal <span class="math">$$Q$$</span>. We check each case here:</p> <div class="highlight"><pre>suppressMessages({
  library(riskParityPortfolio)
  library(tibble)
})
risk &lt;- function(w, Sigma) {
  sqrt(as.numeric(w %*% Sigma %*% w))
}
riskcon &lt;- function(w, Sigma) {
  Sw &lt;- Sigma %*% w
  as.numeric(w * Sw / sqrt(as.numeric(w %*% Sw)))
}
# from the excellent vignette:
# generate synthetic data
set.seed(42)
N &lt;- 5
V &lt;- matrix(rnorm(N * (N + 50)), ncol = N)
Sigma &lt;- cov(V)
portfolio &lt;- riskParityPortfolio(Sigma = Sigma)
# check general exchangeability
Q &lt;- rWishart(1, 50, Sigma = diag(N))
dim(Q) &lt;- c(N, N)
knitr::kable(tibble(start   = riskcon(portfolio$w, Sigma),
                    Q_trans = riskcon(solve(Q, portfolio$w), t(Q) %*% Sigma %*% Q)))
</pre></div> <table> <thead> <tr> <th align="right">start</th> <th align="right">Q_trans</th> </tr> </thead> <tbody> <tr> <td align="right">0.076</td> <td align="right">0.076</td> </tr> <tr> <td align="right">0.076</td> <td align="right">0.078</td> </tr> <tr> <td align="right">0.076</td> <td align="right">0.071</td> </tr> <tr> <td align="right">0.076</td> <td align="right">0.077</td> </tr> <tr> <td align="right">0.076</td> <td align="right">0.078</td> </tr> </tbody> </table> <div class="highlight"><pre># check orthogonal exchangeability
set.seed(123)
B &lt;- rWishart(1, 50, Sigma = diag(N))
dim(B) &lt;- c(N, N)
Q &lt;- eigen(B)$vectors
knitr::kable(tibble(start   = riskcon(portfolio$w, Sigma),
                    Q_trans = riskcon(solve(Q, portfolio$w), t(Q) %*% Sigma %*% Q)))
</pre></div> <table> <thead> <tr> <th align="right">start</th> <th align="right">Q_trans</th> </tr> </thead> <tbody> <tr> <td align="right">0.076</td> <td align="right">0.047</td> </tr> <tr> <td align="right">0.076</td> <td align="right">0.021</td> </tr> <tr> <td align="right">0.076</td> <td align="right">0.226</td> </tr> <tr> <td align="right">0.076</td> <td align="right">0.083</td> </tr> <tr> <td align="right">0.076</td> <td align="right">0.003</td> </tr> </tbody> </table> <div class="highlight"><pre># check leverage exchangeability
set.seed(17)
Q &lt;- diag(runif(N, min = 0.5, max = 2.0))
knitr::kable(tibble(start   = riskcon(portfolio$w, Sigma),
                    Q_trans = riskcon(solve(Q, portfolio$w), t(Q) %*% Sigma %*% Q)))
</pre></div> <table> <thead> <tr> <th align="right">start</th> <th align="right">Q_trans</th> </tr> </thead> <tbody> <tr> <td align="right">0.076</td> <td align="right">0.076</td> </tr> <tr> <td align="right">0.076</td> <td align="right">0.076</td> </tr> <tr> <td align="right">0.076</td> <td align="right">0.076</td> </tr> <tr> <td align="right">0.076</td> <td align="right">0.076</td> </tr> <tr> <td align="right">0.076</td> <td align="right">0.076</td> </tr> </tbody> </table> <h2>The Symmetric Square Root</h2> <p>One of the reasons I wanted to write this post was to draw attention to the symmetric square root, which we typically do not use for portfolio construction, but which is useful for risk decomposition. We can express the risk of a portfolio as </p> <div class="math">$$r = \| \Sigma^{1/2} w \|_2,$$</div> <p> where <span class="math">$$\Sigma^{1/2}$$</span> is any matrix square root of <span class="math">$$\Sigma$$</span>. Then the elements of <span class="math">$$\Sigma^{1/2} w$$</span> would seem to decompose the risk of your portfolio, in a squared-error sense. That is, the elements of <span class="math">$$\Sigma^{1/2} w$$</span>, <em>when squared</em>, sum to the risk squared. That vector may contain negative elements, but this does not affect the squared sum. We can just square the elements of <span class="math">$$\Sigma^{1/2} w$$</span> and claim we have "decomposed risk". Whether this is a useful decomposition, or has any real meaning, is debatable. We can, however, check whether it is an exchangeable function.</p> <p>If you use the Cholesky square root, this risk decomposition does not satisfy order exchangeability! This clearly seems like a bad way to express risk. If, however, you use the symmetric square root, then the decomposition is exchangeable with respect to reordering, relevering, and even rotation, though perhaps not to general transformation by <span class="math">$$Q$$</span>. Under an orthogonal <span class="math">$$Q$$</span> we have <span class="math">$$\Sigma^{1/2} \mapsto Q^{\top}\Sigma^{1/2}Q$$</span>, and so if <span class="math">$$w\mapsto Q^{-1}w$$</span>, then <span class="math">$$\Sigma^{1/2} w \mapsto Q^{\top}\Sigma^{1/2}w$$</span>.</p> <p>Again it is not clear this is a meaningful decomposition of risk. 
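</p>

<p>The reordering claim is easy to check numerically. In this minimal Python/numpy sketch (illustrative, with a random covariance), permuting the assets merely permutes the symmetric-root decomposition, while the Cholesky-root decomposition gets scrambled:</p>

```python
import numpy as np

def sym_sqrt(Sigma):
    """Symmetric positive-definite square root via an eigendecomposition."""
    vals, vecs = np.linalg.eigh(Sigma)
    return (vecs * np.sqrt(vals)) @ vecs.T

rng = np.random.default_rng(7)
V = rng.normal(size=(60, 3))
Sigma = np.cov(V, rowvar=False)
w = rng.normal(size=3)

# a permutation matrix P: returns map x -> P'x and the portfolio maps w -> P^{-1}w
P = np.eye(3)[:, [2, 0, 1]]
Sigma_t = P.T @ Sigma @ P
w_t = np.linalg.solve(P, w)

# symmetric root: the decomposition under the new ordering is the old one, permuted
before = sym_sqrt(Sigma) @ w
after = sym_sqrt(Sigma_t) @ w_t
print(np.allclose(np.sort(before), np.sort(after)))  # True

# Cholesky root (Sigma = L L', decomposition L'w): typically differs under reordering
chol_before = np.linalg.cholesky(Sigma).T @ w
chol_after = np.linalg.cholesky(Sigma_t).T @ w_t
print(np.sort(chol_before**2), np.sort(chol_after**2))
```

<p>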
Whether it is or not, I am not aware of this definition being used to construct an ERC portfolio, though I suspect it is only a matter of time.</p> <script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#@#')) { var align = "center", indent = "0em", linebreak = "false"; if (false) { align = (screen.width < 768) ? "left" : align; indent = (screen.width < 768) ? "0em" : indent; linebreak = (screen.width < 768) ? 'true' : linebreak; } var mathjaxscript = document.createElement('script'); mathjaxscript.id = 'mathjaxscript_pelican_#%@#\$@#'; mathjaxscript.type = 'text/javascript'; mathjaxscript.src = 'https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML'; mathjaxscript[(window.opera ? "innerHTML" : "text")] = "MathJax.Hub.Config({" + " config: ['MMLorHTML.js']," + " TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' } }," + " jax: ['input/TeX','input/MathML','output/HTML-CSS']," + " extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," + " displayAlign: '"+ align +"'," + " displayIndent: '"+ indent +"'," + " showMathMenu: true," + " messageStyle: 'normal'," + " tex2jax: { " + " inlineMath: [ ['\\\$$','\\\$$'] ], " + " displayMath: [ ['$$','$$'] ]," + " processEscapes: true," + " preview: 'TeX'," + " }, " + " 'HTML-CSS': { " + " styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! 
important'} }," + " linebreaks: { automatic: "+ linebreak +", width: '90% container' }," + " }, " + "}); " + "if ('default' !== 'default') {" + "MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "}"; (document.body || document.getElementsByTagName('head')).appendChild(mathjaxscript); } </script>