<!-- Gilgamath: R in Finance 2018, by Steven, 2018-06-01, tag:www.gilgamath.com,2018-06-01:/rfin2018.html -->
<p>Review of the R in Finance 2018 conference.</p><!-- image: rfin2018_cover.png image_alt: Review of R in Finance 2018. -->
<p>2018 marks the tenth year of R in Finance. Once again, here is my biased and incomplete take on
the proceedings. </p>
<!-- PELICAN_END_SUMMARY -->
<h3>Day One, Morning Lightning Round</h3>
<ol>
<li>
<p>Yu Li started the conference with a lightning talk on whether click and visit data on
Morningstar's website was predictive of fund flows, using ordinary linear regression,
when the usual dependent variables (beta, momentum, <em>etc.</em>) are taken into account. The
early results looked inconclusive. (<em>n.b.</em>: if visits and clicks are dependent on market
movement, there will be complicated interactions with momentum that would have to be
controlled.) Another takeaway is that your actions on the internet are a potential
gold mine (well, data mine) for <em>someone</em>.</p>
</li>
<li>
<p>Daniel McKellar used a graph theory view of companies to compare
geographical effects to sector and industry effects. The target metric is something called
'modularity'. (My fear is that this metric is defined in a trivially gameable way, but this
could be due to my general paranoia.) Turns out that country clustering gives higher modularity
than sector or industry clustering (yay), but then clustering by country <em>and</em> sector gives
lower modularity than just by country (uh oh). There followed a linear regression model
of the correlation matrix entries to check for clustering effects. While people are trained
to digest linear regression models (and perhaps this is the way to go with upper management),
I hope there are more advanced techniques for covariance cluster analysis.</p>
</li>
<li>
<p>Jonathan Regenstein presented a shiny page that performs Fama French decomposition on a
portfolio that the user enters. He bemoaned the weird choice of distribution channel for
Ken French's data (zipped CSV with header junk). I have packaged a few of the
datasets into a <a href="https://github.com/shabbychef/aqfb_data">data package for my book</a>.
But I have been thinking there should be a canonical data package that contains all the FF data
(it only updates once a year, and could be made programmatic).</p>
</li>
</ol>
<h2>Day One, Morning Talks</h2>
<p>Kris Boudt gave a talk on a new kind of regularized covariance estimator. The idea is
to combine the row subselection of MCD with a kind of shrinkage to deal with the 'fat data'
problem (more columns than rows). The motto appears to be 'shrink when needed.'
He presented the 'C-step' theorem which underpins their
algorithm: suppose you subselect some rows of data, compute sample mean and
covariance, then define <a href="https://en.wikipedia.org/wiki/Mahalanobis_distance">Mahalanobis distances</a>
on that sample mean and covariance. If you then pick another subsample of the data
that has a smaller sum of Mahalanobis distances than your original sample,
that new subset will have a smaller covariance determinant. The implication is to
compute the Mahalanobis distances, take the subset with the smallest
distances, and iterate: the algorithm converges, with the determinant
decreasing at each step. (I confirmed with Kris that the objective is not convex,
so this method falls into local minima; to find a global minimum, you have to employ
some tricks.)
He followed up with
some examples showing how the algo works; toy data experiments confirm it helps
to have clairvoyance on the population outlier rate. His example computing
the minimum variance portfolio triggered me, as I am not convinced covariance shrinkage
should be used for portfolio construction. </p>
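<p>The C-step is simple enough to sketch in a few lines of R; this toy version (subset size, contamination level, and starting subset are all my own inventions) just iterates the "keep the smallest Mahalanobis distances" rule until the subset stops changing:</p>

```r
set.seed(1)
# simulated data with a clump of gross outliers
n <- 200; p <- 3
X <- matrix(rnorm(n * p), n, p)
X[1:10, ] <- X[1:10, ] + 8           # contamination

h <- 150                             # subset size to retain
idx <- sample(n, h)                  # arbitrary starting subset
for (iter in 1:20) {
  mu <- colMeans(X[idx, ])
  S  <- cov(X[idx, ])
  d2 <- mahalanobis(X, center = mu, cov = S)
  new_idx <- order(d2)[1:h]          # C-step: keep the h smallest distances
  if (setequal(new_idx, idx)) break  # converged (to a local minimum)
  idx <- new_idx
}
det(cov(X[idx, ]))                   # determinant shrinks at every C-step
```

<p>The real DetMCD adds the shrinkage step and smarter starting subsets; this only shows the iteration.</p>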
<hr>
<p>Majeed Simaan gave a talk on 'Rational Explanations for Rule-of-Thumb Practices in
Asset Allocation.' He seemed to be looking for conditions under which one would prefer
the Global Minimum Variance portfolio, the Mean Variance Portfolio, or the Naive
Allocation Portfolio. These are based on the likely estimation error. My interpretation
is that if, for example, the true Markowitz portfolio weights (err, MVP in Majeed's
terminology) are widely dispersed, you are less likely to make a deleterious estimation error
in constructing the sample Markowitz portfolio. These rules of thumb are translated
into more easily digestible forms which can be tested. I look forward to the paper.</p>
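<p>For concreteness, the three contenders are easy to write down in R (the mean vector and covariance below are invented numbers):</p>

```r
set.seed(2)
# toy inputs: mean and covariance for 4 assets
mu    <- c(0.05, 0.07, 0.06, 0.04)
Sigma <- crossprod(matrix(rnorm(16), 4, 4)) / 4 + diag(0.02, 4)
ones  <- rep(1, 4)

# Global Minimum Variance: w proportional to Sigma^{-1} 1
w_gmv <- solve(Sigma, ones); w_gmv <- w_gmv / sum(w_gmv)
# Markowitz mean-variance: w proportional to Sigma^{-1} mu (normalized to sum 1)
w_mv  <- solve(Sigma, mu);   w_mv  <- w_mv / sum(w_mv)
# naive 1/N allocation
w_naive <- ones / 4

round(cbind(gmv = w_gmv, mv = w_mv, naive = w_naive), 3)
```

<p>The question in the talk is which of these you should prefer once you admit that <code>mu</code> and <code>Sigma</code> are estimated with error.</p>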
<h2>Keynote</h2>
<p>Norm Matloff gave a keynote on 'Parallel Computation for the Rest of Us'. He notes
that there are a number of paradigms for parallelizing computation in R, with
varying levels of abstraction and sophistication. However, it is (still) the case
that using these packages requires some knowledge of hardware (<em>e.g.</em> caching)
and how the computations are parallelized. There is no 'silver bullet' that automatically
parallelizes computations. He described the two key design paradigms of the <code>partools</code> package:
Leave It There (<em>i.e.</em> bring computation to the data, and leave the data there), and
Software Alchemy (try to automatically convert regular problems into approximately
equivalent Embarrassingly Parallel problems, and solve those instead). </p>
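<p>A cartoon of the Software Alchemy idea using only the base <code>parallel</code> package: compute the estimator on chunks in parallel, then average the chunk estimates. Here the estimator is just a mean, so the trick happens to be exact rather than approximate:</p>

```r
library(parallel)
set.seed(3)
x <- rnorm(1e5, mean = 2)
# split the data into equal-sized chunks, estimate on each chunk in parallel
chunks <- split(x, rep(1:2, length.out = length(x)))
cl <- makeCluster(2)
chunk_means <- parLapply(cl, chunks, mean)   # embarrassingly parallel
stopCluster(cl)
est <- mean(unlist(chunk_means))             # combine by averaging chunk estimates
c(alchemy = est, direct = mean(x))
```

<p><code>partools</code> does this for a wide class of estimators, where the averaged answer is only asymptotically equivalent to the all-data answer.</p>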
<hr>
<h2>Day One, Afternoon Talks</h2>
<p>Matthew Ginley gave a talk about forecasting rare events under monotonicity constraints. I don't
think I can do his technique justice, but he seemed to start with a density estimator, then
estimate the proportion of rare events at values of the independent variable (or 'feature')
near where the rare events were observed, then he constructed a monotonic regression on those
values. My notes say I should look up ROSE (random over sampling examples) and
BART (Bayesian Additive Regression Trees), and that
the 'usual' metrics you might use to score performance (like MSE, or 0/1 loss) may give
counterintuitive results.</p>
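<p>Base R already has the monotone-regression piece via pooled adjacent violators; a toy version of the pipeline as I understood it (the binning scheme and simulated data are my own invention):</p>

```r
set.seed(4)
# rare-ish events whose probability increases monotonically in one feature
x <- sort(runif(500))
y <- rbinom(500, 1, pmin(0.9, x^2))

# estimate the event proportion in equal-count bins, then fit a monotone curve
bins <- cut(x, quantile(x, 0:10 / 10), include.lowest = TRUE)
prop <- tapply(y, bins, mean)
mid  <- tapply(x, bins, mean)
fit  <- isoreg(as.numeric(mid), as.numeric(prop))  # isotonic regression
fit$yf                                             # fitted proportions never decrease
```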
<hr>
<p>Rainer Hirk discussed multivariate ordinal regression models using
<a href="https://cran.r-project.org/web/packages/mvord/index.html"><code>mvord</code></a>. He presented some problems around credit rating data from the big three (Moody's, Fitch, S&P):
how are the ratings related to each other, how do they change over time, can they be explained
by independent variables (features of the rated companies), and so on. Ordinal regression
apparently works as a latent variable with some thresholds to determine the classes.
The <code>mvord</code> package can handle all the questions he threw at it.</p>
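<p>The latent-variable view is easy to simulate: an unobserved continuous score gets chopped into ordered classes by thresholds, and ordinal regression recovers the coefficients and thresholds from the observed classes. A made-up miniature:</p>

```r
set.seed(5)
# latent creditworthiness as a linear function of a feature plus noise
n <- 1000
leverage <- rnorm(n)
latent   <- -0.8 * leverage + rnorm(n)
# thresholds carve the latent scale into ordered rating classes
cutpts <- c(-Inf, -1, 0, 1, Inf)
rating <- cut(latent, breaks = cutpts, labels = c("B", "BB", "BBB", "A"),
              ordered_result = TRUE)
table(rating)
# MASS::polr, or mvord in the multivariate case, would recover the
# coefficient on leverage and the thresholds from (rating, leverage)
```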
<h3>Lightning Round</h3>
<ol>
<li>
<p>Wilmer Pineda had some of the prettiest slides I saw all day, but I had a difficult time
understanding his talk, I'm afraid.</p>
</li>
<li>
<p>Neil Hwang talked about 'bipartite block models'. The idea seemed that you might have a
bipartite graph with edges defining some commonality between nodes, and you want to detect
communities among them. (This reminds me of some work I did in my short stint in the film
industry on trying to detect similarities between actors and films based on the 'appeared-in'
definition of edges.)</p>
</li>
<li>
<p>Glenn Schultz gave an advertorial for <a href="http://bondlab.io">bondlab</a>, which appears to be a
bond pricing package of the same name, connected to data from their web site. I think if
you work in fixed income, you'll want to take a look at this.</p>
</li>
<li>
<p>Dirk Hugen gave a talk on using R in Postgres via the PL/R extension. This is a nice trick if
you use Postgres: you can basically ship R UDFs into the database and run them there. I am
always a fan of having the DB do the work that my desktop cannot. </p>
</li>
</ol>
<h2>Talk</h2>
<p>Michael Gordy, from the Federal Reserve, discussed 'spectral backtests'. Apparently banks
produce 1 day ahead forecasts of their profit and loss every night.
The banks also track their actual PnL (weird, right?), which is then translated
into quantiles under their forecasts.
The question is whether the forecasts are any good, and whether that can be quantified
without dichotomizing the data. (For example, by looking at
the proportion of actuals at or above the 0.05 upper tail of loss forecasts.)
I didn't follow the transform, but he took an integral with some weighting function, and
out popped some hypothesis tests. Go check out the
<a href="https://doi.org/10.17016/FEDS.2018.021">paper</a>, and apparently there is a package coming as
well.</p>
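<p>The raw ingredient is the probability integral transform: each night's realized PnL evaluated under that night's forecast CDF. A toy, unweighted uniformity check (the spectral tests in the paper instead apply weighting functions that emphasize, say, the loss tail):</p>

```r
set.seed(6)
n <- 750                               # roughly three years of daily forecasts
actual <- rnorm(n, sd = 1.1)           # bank slightly underestimates risk...
u <- pnorm(actual, mean = 0, sd = 1)   # ...so the PIT values pile up in the tails

# an unweighted check that the PITs are uniform on [0, 1]
pv <- ks.test(u, "punif")$p.value
pv
```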
<h3>Lightning Round</h3>
<ol>
<li>
<p>Mario Annau gave a progress report on <code>hdf5r</code>, which provides HDF5 file support for R.
HDF5 is still probably the best multi-language high performance data format, and this package
was apparently rewritten for performance by cutting out some C++ middleman code.
The roadmap for this package includes <code>dplyr</code> support, which would be a welcome feature.</p>
</li>
<li>
<p>David Smith gave a talk promoting the new Azure backend for the
<a href="https://cran.r-project.org/web/packages/foreach/index.html"><code>foreach</code></a> package.</p>
</li>
<li>
<p>Stephen Bronder gave a progress report on porting Stan to GPUs. Matrix operations like
inversion and Cholesky factorization are hard to parallelize, but they are coming to Stan.</p>
</li>
<li>
<p>Xin Chen presented the
<a href="https://github.com/chenx26/glmGammaNet"><code>glmGammaNet</code></a> package
to perform Elastic Net (L1 and L2 regularized) regression, but with Gamma distributed data,
which is appropriate for non-negative errors.</p>
</li>
<li>
<p>JJ Lay discussed multilevel Monte Carlo simulations for
stochastic volatility and interest rate modeling. Apparently he achieved a
10,000-fold reduction in runtime (!) over a serial computation by parallelizing in this way.</p>
</li>
</ol>
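<p>Xin Chen's setting is easy to sketch without the penalty: base R's <code>glm</code> fits the Gamma likelihood with a log link, and <code>glmGammaNet</code> layers the elastic-net penalty on top of that likelihood (the data below is simulated):</p>

```r
set.seed(15)
# strictly positive responses with mean exp(1 + 2x)
x  <- runif(200)
mu <- exp(1 + 2 * x)
y  <- rgamma(200, shape = 5, rate = 5 / mu)   # mean mu, positive support

fit <- glm(y ~ x, family = Gamma(link = "log"))
coef(fit)   # roughly (1, 2)
```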
<h2>Talks</h2>
<p>Michael Kane gave a talk on an analysis of cryptocurrency pairs prices from the
Bittrex market.
(The first part was a hilarious tour through the shady contraband-for-bitcoin market.)
He used SVD to approximate the returns of <span class="math">\(p=290\)</span> currency pairs down
to dimension <span class="math">\(d\)</span>.
He used Frobenius norm of error of this approximation, plus a <span class="math">\(d/\sqrt{p}\)</span> regularization
to optimize <span class="math">\(d\)</span>, finding that <span class="math">\(d\approx 2.5\)</span> was
consistent with his sample, fluctuating perhaps to 4 at times. My interpretation
is that people think of cryptocurrencies as Bitcoin and also-rans, although perhaps
there is a numeraire effect in there. As Michael put it, despite the
variety of coins, their returns are not well differentiated.</p>
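<p>A sketch of the rank-selection recipe on simulated data with two true factors; the penalty weight <code>lambda</code> below is my own invention, since I didn't catch the scaling he used:</p>

```r
set.seed(7)
# toy 'returns': p series driven by 2 common factors plus noise
n <- 250; p <- 50
F <- matrix(rnorm(n * 2), n, 2)
R <- F %*% matrix(rnorm(2 * p), 2, p) + 0.5 * matrix(rnorm(n * p), n, p)

s <- svd(R)
lambda <- 50   # made-up penalty weight
obj <- sapply(1:10, function(d) {
  # best rank-d approximation from the truncated SVD
  Rd <- s$u[, 1:d, drop = FALSE] %*% (s$d[1:d] * t(s$v[, 1:d, drop = FALSE]))
  norm(R - Rd, "F") + lambda * d / sqrt(p)
})
which.min(obj)   # recovers the factor dimension used to build R
```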
<hr>
<p>William Foote gave a presentation in the form of a shiny dashboard, instead of
a slideshow, on the topic of, I think, shipping metals and metals prices.
A fair amount of time was spent showing the source code for the shiny page,
rather than demo'ing the page. If you are in the business of shipping Copper,
Aluminium or Nickel around, you would definitely want a dashboard like this,
but it is not clear how to interpret all the plots (contours of correlation
in an animation?) or 'drive' actions from the dashboard.</p>
<h3>Lightning Round</h3>
<ol>
<li>
<p>Justin Shea discussed Hamilton's working paper, "Why you should never use the
Hodrick-Prescott filter." (I <em>love</em> unambiguous paper titles!) He implemented
Hamilton's suggested replacement for the HP filter for detrending time series, in the
<a href="https://cran.r-project.org/web/packages/neverhpfilter/index.html"><code>neverhpfilter</code></a> package, which implements this regression, returning a <code>glm</code> object.</p>
</li>
<li>
<p>Thomas Zakrzewski talked about using a <a href="https://en.wikipedia.org/wiki/Q-Gaussian_distribution">'Q-Gaussian' distribution</a>
(apparently it generalizes Gaussian, <span class="math">\(t\)</span>, and bounded Gaussian-like symmetric distributions)
in Merton's model for probability of default.</p>
</li>
<li>
<p>Paul Laux talked about inferring the cost of insuring against small
and large market movements from the returns of VIX futures and delta-neutral
SPX straddles. He looked at these inferred costs around news announcement
dates (jobs reports, FOMC meetings) versus other times, and found that the
costs were significantly non-zero around news dates.</p>
</li>
<li>
<p>Hernando Cortina gave a talk for <a href="https://justcapital.com/">Just Capital</a>,
which is apparently a non-profit, established by Paul Tudor Jones, that
analyzes companies based on ESG criteria (Environment, Social, Governance). He
created quintile portfolios based on these rankings, and found that, over a
1 year out-of-sample period, the "most socially responsible" quintile outperformed the
least responsible. (Years ago I worked at a fund that tried to build a SRI
vehicle for a client, without much luck.)
He then tried to decompose the 'alpha' in the responsible quintile in terms of the ingredients
in their Just companies index.</p>
</li>
</ol>
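<p>Hamilton's replacement is refreshingly low-tech: regress the value eight quarters ahead on the four most recent observations, and call the residual the cyclical component. A sketch on a fake trending series (the <code>neverhpfilter</code> package wraps this up properly, returning a <code>glm</code> object):</p>

```r
set.seed(8)
# toy trending quarterly series: random walk plus drift
y <- cumsum(rnorm(200)) + 0.1 * (1:200)

h  <- 8                      # Hamilton's suggested horizon for quarterly data
t0 <- 4:(200 - h)            # need 3 lags behind and an 8-step lead ahead
fit <- lm(y[t0 + h] ~ y[t0] + y[t0 - 1] + y[t0 - 2] + y[t0 - 3])
cycle <- resid(fit)          # the detrended, 'cyclical' component
```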
<h2>Keynote</h2>
<p>J. J. Allaire gave a talk on Machine Learning with TensorFlow. TensorFlow
is apparently a numerical computing library, which is hardware independent and open source,
running on CPU, GPU or TPUs (if you can find one). It defines a data flow
process which is executed in C++, and reminds me somewhat of Spark.
The models that are built are language-independent (again, like Spark's MLlib).
He then talked about Deep Learning, which, if I understood correctly, is
just neural nets with <em>lots</em> of layers ('deep'), like hundreds maybe. These
are good for 'perceptual-like' tasks, but maybe not so much other areas (uhh, finance?).
Apparently Deep Learning has become more popular now because we now have
the computational resources and the massive amounts of data to train such huge
neural nets, some of which have millions of coefficients. (I imagine if you could
analyze how a human brain recognizes digits, say, it would involve thousands
of neurons; encoding them all in ten thousand parameters seems about right.)
Deep Learning still has problems: the models are not interpretable, can be
fooled by adversarial examples, and require lots of data and computational power.</p>
<p>He introduced RStudio's <a href="https://tensorflow.rstudio.com">tensorflow packages</a>, including the
<a href="https://cran.r-project.org/web/packages/keras/index.html"><code>keras</code></a> package. The package gives you access to a plethora of layer types you might
want to put in a Deep Learning model, some appropriate for, say, graphical or image
learning, or time series or language processing <em>etc.</em> You do need a fair amount
of domain knowledge to create a good collection of layers, and apparently lots of
experimentation is required. (I'm predicting that Deep MetaLearning will be the big thing
in ten years when we have more data and computational power.)
He ran through example uses of TensorFlow from R: classification of images, weather forecasting,
fraud detection, <em>etc.</em> The package ecosystem here seems ready for use.
For more info, do check out
<a href="https://www.manning.com/books/deep-learning-with-r">Deep Learning with R</a>, or the more
theoretical book, <a href="http://www.deeplearningbook.org/">Deep Learning</a>.</p>
<hr>
<h2>Day Two</h2>
<h3>Lightning Round</h3>
<ol>
<li>
<p><a href="https://quantstrattrader.wordpress.com/">Ilya Kipnis</a> started the day by
describing some technical-based strategies on VIX ETNs. It's apparently hard
to do worse
than <a href="https://www.marketwatch.com/story/xiv-trader-ive-lost-4-million-3-years-of-work-and-other-peoples-money-2018-02-06">buy and hold XIV</a>.
I talked to Ilya after the conference, and he tells me he has "skin in the game,"
so this is not just another bunch of quant farts on a blog.</p>
</li>
<li>
<p>Matt Dancho gave a talk on
<a href="https://cran.r-project.org/web/packages/tibbletime/index.html"><code>tibbletime</code></a>,
which provides a time-aware layer over <code>tibble</code> objects, with 'collapse by time'
operators (which act like groupings, I think, but are applied in tandem with <code>group_by</code>),
a 'rollify' operator which (naively) applies functions at each point in time over a fixed window,
time subselection
operations and more. He also mentioned <a href="https://github.com/DavisVaughan/flyingfox"><code>flyingFox</code></a>
which uses <code>reticulate</code> to communicate with the Quantopian <code>zipline</code> package. <code>zipline</code>,
while rather weak compared to what most quant shops will develop in-house, is the <em>only</em>
open source backtesting engine that I know. It is good to see this is coming to R.
(I should note this package seems similar to the
<a href="https://cran.r-project.org/web/packages/tsibble/index.html"><code>tsibble</code></a> package.)</p>
</li>
<li>
<p>Carson Sievert talked about <code>dashR</code>, a not-yet-released package for using
<a href="http://dash.plot.ly"><code>dash</code></a>, which is Python's latest attempt to replicate <code>shiny</code> (<code>pyxley</code> having suffered an early death,
apparently). I suppose someone will find this useful, but I was not convinced by Carson's
arguments in favor of this approach: easy switching between Python and R, and the ability
to quickly import new React components. <em>If</em> the syntax of this framework were much easier
to think about than <code>shiny</code>, it would certainly win some converts, but I believe reactive
programming is just hard to reason about. At this point many users have learned to
embrace the weirdness of <code>shiny</code> and will be unlikely to defect.</p>
</li>
<li>
<p>Michael Kapler presented 'Interactively Exploring Seasonality Patterns in R', introducing
the <code>rtsviz</code> package. This seems to be a package with a shiny page that
can quickly give you a view of the seasonality of your time series data.</p>
</li>
<li>
<p>Bernhard Pfaff introduced the
<a href="https://github.com/bpfaff/rbtc"><code>rbtc</code></a> package. This wraps the bitcoin API for looking at the blockchain.
This is complementary to the <code>rbitcoin</code> and <code>coindeskr</code> packages, which
seem to provide <em>pricing information</em>. Expect more from this package in the
coming year (perhaps the ability to <em>mine</em> coins, or define your own wallet).</p>
</li>
</ol>
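<p>The 'rollify' idea is simple enough to fake in base R; <code>tibbletime::rollify</code> does this properly (and for use inside <code>mutate</code>), but the semantics are roughly this naive version of mine:</p>

```r
# a naive rollify: turn f into a function applied over a fixed trailing window
rollify <- function(f, window) {
  function(x) {
    out <- rep(NA_real_, length(x))
    for (i in window:length(x)) out[i] <- f(x[(i - window + 1):i])
    out
  }
}

roll_mean5 <- rollify(mean, 5)
x <- c(1, 2, 3, 4, 5, 6, 7)
roll_mean5(x)   # NA NA NA NA 3 4 5
```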
<h2>Talks</h2>
<p>Eran Raviv gave a talk about combining forecasts using the
<a href="https://cran.r-project.org/web/packages/ForecastComb/index.html"><code>ForecastComb</code></a> package. As an example he showed a few different forecast methods
applied to a time series of UK electricity supply. The
package supports <em>many</em> different methods of combining forecasts:
simple averaging; OLS combination (which outperforms simple averaging,
but might not be <em>convex</em> in the forecasts, sometimes extrapolating
from them); trivial methods like median, trimming, <em>etc.</em>;
accuracy based methods, like inverse Rank, inverse RMSE, Eigenvector approach;
regression based methods: OLS, LAD, CLS, subset regressions.
There are also summary and plotting functions. If you are combining
forecasts, this is the package to use.</p>
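<p>The two simplest combiners from that menu, on invented data: note the in-sample OLS combination can never do worse than the simple average, since the average is one of the linear combinations OLS gets to choose from; out of sample is another story:</p>

```r
set.seed(10)
# two flawed forecasts of the same target
target <- 100 + cumsum(rnorm(120))
f1 <- target + rnorm(120, mean = 2)      # biased high
f2 <- 0.9 * target + rnorm(120)          # scaled low

avg <- (f1 + f2) / 2                     # simple averaging
ols <- fitted(lm(target ~ f1 + f2))      # in-sample OLS combination

rmse <- function(e) sqrt(mean(e^2))
c(avg = rmse(target - avg), ols = rmse(target - ols))
```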
<hr>
<p>Leopoldo Catania motivated the
<a href="https://cran.r-project.org/web/packages/eDMA/index.html"><code>eDMA</code></a> (efficient dynamic model averaging)
package by looking at predicting cryptocurrency returns using
some predictive features: technical features on the returns
themselves and macroeconomic features. The model under consideration
looks like the setup for a Kalman Filter, with a linear model
where the coefficients change under an AR(1) model, but instead
somehow summarized by a 'forgetting factor'. A consequence
is that, somehow, you have to perform linear regressions on all
subsets, using multiple forgetting factors, and maybe evaluate
them all on a rolling basis. The good news is that this package is
fairly efficient, using <code>Rcpp</code> and <code>RcppArmadillo</code>, and is
perhaps 50 times faster than the
<a href="https://cran.r-project.org/web/packages/dma/index.html"><code>dma</code></a> package, but still it takes around an hour to run a regression
with 18 features and 500 rows. And the results were hard for
me to interpret, and seemed to be worse than the benchmark method
under MSE metric. (And the claim that predictability 'increased over time'
could possibly be attributed to the longer time series?)</p>
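<p>The 'forgetting factor' mechanism, at least as I understood it, is recursive least squares with geometric downweighting of old observations, which lets the coefficients drift; a single-coefficient sketch (all the constants here are made up):</p>

```r
set.seed(11)
n <- 300
x <- rnorm(n)
beta_true <- seq(1, 3, length.out = n)        # slowly drifting coefficient
y <- beta_true * x + rnorm(n, sd = 0.2)

lambda <- 0.95                                # the 'forgetting factor'
P <- 1e4; b <- 0                              # scalar-state recursive least squares
b_path <- numeric(n)
for (t in 1:n) {
  k <- P * x[t] / (lambda + x[t]^2 * P)       # gain
  b <- b + k * (y[t] - x[t] * b)              # update the coefficient
  P <- (P - k * x[t] * P) / lambda            # inflate P: old data is 'forgotten'
  b_path[t] <- b
}
c(early = b_path[50], late = b_path[n])       # tracks the drift from ~1.3 to ~3
```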
<h2>Talks</h2>
<p>Guanhao Feng gave a talk on "Deep Learning Alpha". As I understand it,
the motivation was that there is a veritable "zoo" of factors and factor models
(see Harvey & Liu (2016)), but factors are typically defined oddly.
That is, most factor returns are defined to be relatively robust to how
you would define the purported anomaly ('size', 'momentum', <em>etc.</em>),
and are rebalanced annually. The speaker, I think, was looking to
use Deep Learning to 'automatically' define factors which would be less
subject to our lame human ideas of what factors should look like.
(The speaker noted that you cannot use ML 'directly' to forecast cross
sectional returns because of imbalanced data and missing values: not all
features are defined at all times, not all stocks exist at all times, there
are mergers and acquisitions, <em>etc.</em>) I think I missed the part where
the model was compared to Fama French 3 or 5 factor models.</p>
<hr>
<p>Xiao Qiao gave an interesting talk on <em>correlated</em> idiosyncratic volatility shocks.
The idea is that idiosyncratic volatility has cross-sectional correlation (called,
"TVV" for Time Varying Vol), as well as autocorrelation ("VIN" (not <em>that</em> VIN), for
Volatility INnovations.) He built what he called a 'Dynamic Factor Correlation' model,
which generalizes Bollerslev's CCC and Engle & Kelly's DECO models. He found that
there <em>is</em> a significant cross-sectional correlation of GARCH residuals (TVV), then
built portfolios based on sorts (two sorts, if I recall), and showed that the
lowest quintile portfolios outperformed the highest quintile. The interpretation
from the speaker was roughly that
high VIN securities are a kind of 'insurance' against vol spikes, and
high TVV securities payout when vol is high in general.
(There was also a "Lake Volbegone" effect, where <em>all</em> the portfolios had above-average
excess returns, but Stephen Rush pointed out this was likely due to the difference
between simple averaging and value averaging.) My notes tell me to look up
Ang <em>et al.</em> (2006) and Herskovic <em>et al.</em> (2014).</p>
<h2>Keynote</h2>
<p>Li Deng gave a talk on using AI in finance. Li drove the AI effort at Microsoft before joining
Citadel. While he couldn't be terribly specific about what he is doing now, he gave a good
overview of the history of AI, including its successes in perceptual tasks. He was also
fairly honest about the challenges of using AI in finance: low (I would say, "very low")
signal-noise ratio compared to perceptual tasks, nonstationarity and adversarial landscape, and
the heterogeneity of big data. My guess is that the first of those is the biggest
problem, while the third is an engineering, or model design, challenge.</p>
<h3>Lightning Round</h3>
<ol>
<li>
<p>Keven Bluteau: gave a talk on sentiment analysis. (I think I approached him after the
conference, and after two drinks, and told him I enjoyed his <em>talk about hdf5</em>. Oooops! Sorry!
You don't really look like Mario!) This was one of a slew of talks about sentiment,
which also landed right when my computer decided to remount its filesystem read-only. (ack!)</p>
</li>
<li>
<p>Samuel Borms talked about the
<a href="https://cran.r-project.org/web/packages/sentometrics/index.html"><code>sentometrics</code></a> package for computing and aggregating textual sentiment.</p>
</li>
<li>
<p>Kyle Balkissoon gave a short talk on using weather data to create weather-based
signals on companies (I feel like this idea time traveled from the 60's), as well as
building text-based signals on companies. The latter is, as Kyle noted, fairly difficult,
(as is <em>any</em> signal construction) unless you can really represent what one company is
over time. (In addition to the Ship of Theseus argument, splits and mergers complicate
the picture, and they complicate our understanding of textual data about companies. I suspect
that everyone at the conference who uses CRSP data just sweeps this under the rug, which would be
worth the price of admission.)</p>
</li>
<li>
<p>Petra Bakosova, from Hull Tactical, gave an impromptu talk on seasonal effects, which includes calendar-based
effects (month boundaries, January effect, weekend effect, sell in May), as well
as 'announcement' dates (FOMC, and maybe earnings announcements?). Building several seasonal strategies,
she found that many had higher Sharpe than Buy and Hold around announcements (this seems odd if
they are long only), but lower overall return because the capital is not deployed at all
times. (On the other hand, if the seasonal strategies could 'share' capital with other kinds of
strategies, maybe it would all work out.)</p>
</li>
<li>
<p>Che Guan gave a talk on using Machine Learning for 'digital' (or you might say, 'crypto')
currency predictions, using technical factors on the coin returns as well as macroeconomic
features. I would like to see his results compared and contrasted with those of Leopoldo Catania,
who seemed to target the same application with different methods.</p>
</li>
</ol>
<hr>
<h2>Afternoon Talks</h2>
<p>David Ardia gave a talk about sparse forecasting using news-based sentiment.
The motivating problem was forecasting economic growth. In Europe, this
is apparently done by 'ESI', which is some kind of average of survey responses.
Can this be automated, sped up, even improved by text-based forecasts? David
pursued a penalized least squares approach. The recipe is:
classify texts by topic (economic, labor, government, <em>etc</em>), and choose a subset of topics;
using multiple lexicons (lexica?) compute the sentiment of each text at time <span class="math">\(t\)</span>;
aggregate within topics to obtain a bunch of topic-based sentiments;
get some time series aggregated values (a little hazy here);
take a linear combination to get the best forecasts.</p>
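<p>The bag-of-words scoring step, in miniature (the lexicons and texts here are invented; real lexicons like those shipped with <code>sentometrics</code> run to thousands of terms):</p>

```r
lex_pos <- c("growth", "gain", "strong")
lex_neg <- c("recession", "loss", "weak")
texts <- list(
  economy = "strong growth despite weak exports",
  labor   = "job loss and weak hiring"
)

# net sentiment: (positive hits - negative hits) / word count
score <- function(txt) {
  w <- strsplit(tolower(txt), "\\s+")[[1]]
  (sum(w %in% lex_pos) - sum(w %in% lex_neg)) / length(w)
}
sapply(texts, score)
```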
<p>Using Germany as an example, he looked at news from LexisNexis from the mid 90's to 2016,
filtered articles by geography, topic, article size, applied bag of words sentiment
calculation (I think these are 'bivalent' indicators) using 3 lexicons, collapse by
lexicon and then looked at sentiment by time and topic. The takeway was that the sentiment
indicator seemed to capture the same dynamics as ESI, but perhaps reacted more
quickly to the Great Financial Collapse. He also found that <em>combining</em> the sentiment
indicator and ESI improved forecasts.</p>
<hr>
<p>Dries Cornilly gave a really nice talk on the
<a href="https://github.com/cdries/rTrawl"><code>rTrawl</code></a> package for modeling High Frequency Financial Time Series.
In the setup he presented some stylized facts of high frequency returns
data.
He plotted the autoregressive coefficient for returns in an AR(1) model at different
observation frequencies. It exhibits an odd dip to around -0.2 or so at
a period of around 1 second, but is otherwise around zero. Why the dip?
He then also plots the variance of returns divided by T versus the observation
frequency, which goes from around 0.05 down to zero. Again, why?
He outlined some of the approaches to the problem, then described the
integer valued Lévy processes and 'trawl' processes. From what I understood,
you first generate some finite set of points in some space, one dimension
representing time. Then you imagine sweeping across time and computing the
sum of all points within the 'wake' of your sweep. In fact, you don't have to imagine the sweep,
he showed animations of the sweep. The Lévy processes have like a constant wake, while
the trawl processes are supposed to evoke a fisherman with a finite sized net
from which the 'fish' escape. He also described a combination of the
Lévy and trawl processes, which is like, uhh, a weird net, I guess. Anyway, the
<code>rTrawl</code> package apparently supports computing these things, as well as
estimating the parameters from an observed series. The parameters would be, I think,
the generating process for the points (err, 'fish'), and maybe the size of the 'net'
or something.
(I don't really know how we transitioned from SPX trades to fish, but it worked.)
The kicker at the end is that the combined trawl processes have closed form
AR(1) coefficients and variance, so he showed the plots from the beginning of the
talk along with the values from the trawl fit, and they match very well! </p>
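<p>The stylized-fact machinery from the opening of the talk is easy to reproduce: aggregate a fine-grained return series into bars of <code>k</code> ticks, then compute the lag-one autocorrelation at each scale. On iid noise, as below, the curve is flat; on real tick data it shows the dip he plotted:</p>

```r
set.seed(13)
ticks <- rnorm(2^16)                    # toy 'returns' at the finest resolution

ar1_at <- function(x, k) {
  r <- colSums(matrix(x, nrow = k))     # aggregate k ticks per bar
  cor(r[-1], r[-length(r)])             # lag-1 autocorrelation at this scale
}
ar1s <- sapply(c(1, 4, 16, 64), function(k) ar1_at(ticks, k))
ar1s                                    # all near zero for iid noise
```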
<p>Luis Damiano gave a talk on Hierarchical Hidden Markov Models in High-Frequency
Stock Markets. I think the idea was to create a Hidden Markov Model on stocks
("bullish", "bearish"), but then have another level of hidden Markov Models
on top of that (thus "hierarchical"). He backtested this system on a couple
of stocks over a short time period, but the story out of sample seemed
inconclusive (in contrast to a 2009 article by Tayal he referenced). As a side
note, apparently the github page for this project has some L1 and L2 tick data
that you can play along with.</p>
<hr>
<h2>Intermission</h2>
<p>So, this happened:</p>
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">Are you looking to attend an all-male conference in <a href="https://twitter.com/hashtag/DataScience?src=hash&ref_src=twsrc%5Etfw">#DataScience</a>? <a href="https://twitter.com/hashtag/rfinance2018?src=hash&ref_src=twsrc%5Etfw">#rfinance2018</a> has got you covered! 🧑🏻🙋🏻♂️🧓🏻👱🏻♂️🤵🏻👨🏻🧔🏻👴🏻👨🏻💼<br>100% male committee, 100% male speakers, no Code of Conduct. Yes, this is 2018! 📆 <a href="https://t.co/EfhR1QhwWj">https://t.co/EfhR1QhwWj</a> <a href="https://twitter.com/hashtag/BinderFullofMen?src=hash&ref_src=twsrc%5Etfw">#BinderFullofMen</a> 👬 <a href="https://t.co/NLbS31y43V">pic.twitter.com/NLbS31y43V</a></p>— Women in ML/DS (@wimlds) <a href="https://twitter.com/wimlds/status/1002597607468761088?ref_src=twsrc%5Etfw">June 1, 2018</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>This tweet went out on the first day of the conference, and there was a pile-on on twitter.
(It looks like I picked the right year not to give a talk!).
While I have to admit this tweet was very effective at drawing attention to the problem,
drawing attention is only a step towards solving a problem.
The organizers were able, on short notice, to get a talk from two members of "R Ladies", one
of whom I believe was attending the conference anyway, to talk solutions. They suggested
a Code of Conduct for the conference, which makes sense: the conference draws people from
different backgrounds and cultures, and it is better to make explicit how they should
be expected to behave. Moreover, if it increases attendance and improves audience diversity,
I am all for it. The two ladies also made a call to action to the audience for us to
proactively seek diversity. This is a much larger conversation than is appropriate
for this review (and nobody reads this blog for a conversation), but I do hope
to see positive changes in diversity and inclusion at the conference, but also in the
industry as a whole.</p>
<hr>
<h3>Lightning Round</h3>
<ol>
<li>
<p>Phillip Guerra talked about 'autotrading', which appears to be his terminology for
taking a backtest to market. Phillip is an anesthesiologist who moonlights in asset management.
One reason I love this conference is its big tent approach to speakers, who range from
academics to industry to independents.</p>
</li>
<li>
<p>Bryan Lewis gave a talk about Stat Arb Something Something, based on some
offhand comments he made at the conference a few years back about how you could
just quickly throw together a stat arb strategy. The idea is to find groups of
stocks with cointegration relationships, and trade in expectation of a reversal
when they diverge from the relationship. Bryan filled in some of the details to this general
sketch. One of the problems is there is a huge number of combinations of assets
to check for cointegration, and the classical tests do not scale well
(in terms of coverage, I believe) to large numbers of time series. Bryan talked
about using spectral clustering on the regularized covariance of returns
to get candidate sets of assets, then applying a Bayesian approach to cointegration.
The reference I am to check is a 2002 paper by
<a href="http://www.carolalexander.org/publish/download/JournalArticles/PDFs/RIBF_16_65-90.pdf">Alexander, Giblin and Weddington.</a></p>
</li>
</ol>
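<p>The pipeline Bryan sketched can be illustrated in miniature. Below is a base-R toy of the Engle-Granger flavor of cointegration testing (regress one price series on the other, then check the residuals for mean reversion); the spectral clustering and Bayesian machinery from the talk are not reproduced, and all the data here are simulated for the example.</p>

```r
# A toy Engle-Granger style cointegration check in base R; a sketch of the
# general idea only, not the pipeline from the talk.
set.seed(1234)
n <- 1000
common <- cumsum(rnorm(n))          # shared random walk
x <- common + rnorm(n, sd=0.5)      # two prices tied to the same walk
y <- 0.8 * common + rnorm(n, sd=0.5)
hedge <- lm(y ~ x)                  # estimate the hedge ratio
ee <- residuals(hedge)
# Dickey-Fuller style regression: a strongly negative t-statistic on the
# lagged residual suggests the spread mean-reverts, i.e. cointegration.
df_reg <- lm(diff(ee) ~ head(ee, -1))
summary(df_reg)$coefficients[2, ]   # estimate and t-statistic
```

<p>In practice one would use proper critical values (the residuals are estimated, so ordinary Dickey-Fuller tables do not apply), which is part of why the classical tests scale poorly across many candidate pairs.</p>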
<h2>Talks</h2>
<p>I was starting to think that cryptocurrency talks would outpace total meme count
this year, then Stephen Rush killed it for team meme with his talk on
Currency Risk and Information Diffusion. The motivating idea for the talk
is that information moves from currency markets to equity markets at different speeds:
can we measure that speed and figure out why it is faster or slower for
some companies?
The speaker computed VPIN from second-resolution NYSE TAQ data, then downsampled
it to daily frequency,
used CRSP daily data for around 20 years on about 17K firms,
then built a linear model for the returns of each firm taking into account some future information.
The normalized regression coefficients then give some idea of the 'price adjustment',
which is basically a measure of the <em>inefficiency</em> of each stock. The speaker
found that VPIN, size, turnover and analyst coverage had negative effects
on this price adjustment (<em>i.e.</em> they are indicative of higher efficiency), while
institutional ownership has a positive effect (lower efficiency). This latter
factor is associated with a significant alpha, on the order of 6% annualized
for the top decile.</p>
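<p>To make the lead-lag idea concrete, here is a toy regression in base R: two simulated firms load on a currency factor, one contemporaneously and one with a one-day delay, and the share of loading on the lag serves as a crude stand-in for the 'price adjustment' measure. This is my sketch, not the speaker's actual model.</p>

```r
# Toy version of the information-diffusion idea. Not the speaker's model:
# the factor, firms, and 'adjust' measure are all made up for illustration.
set.seed(101)
n <- 2000
fx <- rnorm(n)                        # currency factor returns
lagfx <- c(0, fx[-n])                 # the same factor, lagged one day
slow <- 0.3 * lagfx + rnorm(n, sd=1)  # firm that reacts with a one-day delay
fast <- 0.3 * fx + rnorm(n, sd=1)     # firm that reacts immediately
adjust <- function(r) {
  mod <- lm(r ~ fx + lagfx)
  b <- coef(mod)
  # share of the total loading that falls on the lag: higher means slower
  unname(abs(b['lagfx']) / (abs(b['fx']) + abs(b['lagfx'])))
}
c(slow=adjust(slow), fast=adjust(fast))  # slow firm should score higher
```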
<hr>
<p>Jasen Mackie gave a talk on 'Round Turn Trade Simulation'. This seemed to be
related to the idea of random portfolios, but focused on computing <em>e.g.</em>
the expected maximum drawdown of a trading strategy by sampling from its
'round turn' trades (that is, positions which are opened then closed, presumably
defined in a LIFO sense). Using the <code>blotter</code> object, the speaker extracted
some stylized facts of the trading strategy: the duration of these trades,
the ratio of long to short, perhaps position sizes, <em>etc.</em> Then random realizations
were drawn with similar properties. I guess you can think of this as a kind
of bootstrap of the backtest. I suppose the autocorrelation of trades would
be much trickier to establish (and I suspect would have a <em>huge</em> influence on
maximum drawdown). </p>
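<p>A back-of-the-envelope version of that bootstrap, ignoring the autocorrelation caveat above, might look like the following; the per-trade P&amp;L here is simulated rather than extracted from a <code>blotter</code> object, and only one stylized fact (the P&amp;L distribution) is preserved.</p>

```r
# Crude round-turn bootstrap: resample per-trade P&L with replacement and
# look at the distribution of maximum drawdown. Simulated stand-in data.
set.seed(42)
trade_pnl <- rnorm(250, mean=0.1, sd=1)   # stand-in for extracted round-turn P&Ls
max_drawdown <- function(pnl) {
  equity <- cumsum(pnl)                   # equity curve from trade P&Ls
  max(cummax(equity) - equity)            # worst peak-to-trough drop
}
boot_dd <- replicate(1000, max_drawdown(sample(trade_pnl, replace=TRUE)))
quantile(boot_dd, c(0.5, 0.95))           # median and tail drawdown estimates
```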
<hr>
<p>Thomas Harte closed the conference with a talk on
"Pricing Derivatives When Prices Are Not Observable". This is a bit different from
the incomplete markets setting. He built a linear model for private equity returns based
on some factors, then used that linear model somehow as a proxy in a Rubinstein-type
lattice pricing scheme. From this, Thomas was able to price certain options on
private equity firms (say, a leveraged buyout fund).</p>
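<p>For reference, the Rubinstein (Cox-Ross-Rubinstein) family of lattice schemes is straightforward to sketch in R; the private-equity proxy model from the talk is not reproduced here, and this is only a generic European option pricer under stated parameters.</p>

```r
# Sketch of a Cox-Ross-Rubinstein binomial lattice pricer for a European
# option; a generic illustration, not the pricing model from the talk.
crr_price <- function(S0, K, r, sigma, Tm, nstep, type=c('call','put')) {
  type <- match.arg(type)
  dt <- Tm / nstep
  u <- exp(sigma * sqrt(dt)); d <- 1 / u   # up/down moves per step
  q <- (exp(r * dt) - d) / (u - d)         # risk-neutral up probability
  ST <- S0 * u^(nstep:0) * d^(0:nstep)     # terminal prices on the lattice
  V <- if (type == 'call') pmax(ST - K, 0) else pmax(K - ST, 0)
  for (i in nstep:1) {                     # backward induction to time zero
    V <- exp(-r * dt) * (q * V[1:i] + (1 - q) * V[2:(i+1)])
  }
  V
}
crr_price(100, 100, 0.02, 0.2, 1, 200)     # near the Black-Scholes value, about 8.9
```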
<hr>
<p>This was another great conference, and I hope to be back next year. If you have
anything to add, feel free to comment.</p>
<script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#$@#')) {
var align = "center",
indent = "0em",
linebreak = "false";
if (false) {
align = (screen.width < 768) ? "left" : align;
indent = (screen.width < 768) ? "0em" : indent;
linebreak = (screen.width < 768) ? 'true' : linebreak;
}
var mathjaxscript = document.createElement('script');
mathjaxscript.id = 'mathjaxscript_pelican_#%@#$@#';
mathjaxscript.type = 'text/javascript';
mathjaxscript.src = 'https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML';
mathjaxscript[(window.opera ? "innerHTML" : "text")] =
"MathJax.Hub.Config({" +
" config: ['MMLorHTML.js']," +
" TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' } }," +
" jax: ['input/TeX','input/MathML','output/HTML-CSS']," +
" extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," +
" displayAlign: '"+ align +"'," +
" displayIndent: '"+ indent +"'," +
" showMathMenu: true," +
" messageStyle: 'normal'," +
" tex2jax: { " +
" inlineMath: [ ['\\\\(','\\\\)'] ], " +
" displayMath: [ ['$$','$$'] ]," +
" processEscapes: true," +
" preview: 'TeX'," +
" }, " +
" 'HTML-CSS': { " +
" styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! important'} }," +
" linebreaks: { automatic: "+ linebreak +", width: '90% container' }," +
" }, " +
"}); " +
"if ('default' !== 'default') {" +
"MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" +
"var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" +
"VARIANT['normal'].fonts.unshift('MathJax_default');" +
"VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
"VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
"});" +
"MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" +
"var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" +
"VARIANT['normal'].fonts.unshift('MathJax_default');" +
"VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
"VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
"});" +
"}";
(document.body || document.getElementsByTagName('head')[0]).appendChild(mathjaxscript);
}
</script>Another Confidence Limit for the Markowitz Signal Noise ratio2018-03-28T21:33:59-07:002018-03-28T21:33:59-07:00Steventag:www.gilgamath.com,2018-03-28:/new_mp_ci.html<p>Another confidence limit on the Signal Noise ratio of the Markowitz portfolio.</p><p>In a <a href="bad-cis">previous blog post</a>, I looked at two asymptotic confidence intervals
for the Signal-Noise ratio of the sample Markowitz portfolio, finding that
they generally did not give nominal type I rates even for large sample sizes
(50 years of daily data). In a <a href="markowitz-cov-elliptical">followup post</a>, I
looked at the covariance of some elements of the Markowitz portfolio, finding
that they seemed to be nearly normal for modest sample sizes. However, in that
post, I used the 'TAS' transform to a <span class="math">\(t\)</span> variate, and found, again, that large
sample sizes were required to pass the eyeball test in a Q-Q plot. </p>
<p>Here I mash up those two ideas to construct another confidence limit for the
Signal Noise ratio. So I take the asymptotic covariance in the Markowitz
portfolio elements, and use them with the TAS transform to get a confidence
limit. (You can get the gory details around equation (52) of version 5 of
my <a href="https://arxiv.org/abs/1312.0557">paper</a>, which also contains the
simulations below.)</p>
<p>Here we examine all three of those confidence limits, finding that none
of them achieves near-nominal type I rates. Again, I let <span class="math">\(p\)</span> be the number of
assets, <span class="math">\(n\)</span> the number of days observed, and <span class="math">\(\zeta\)</span> the population maximal
Signal Noise ratio. Here I am observing multivariate normal returns, so
the kurtosis factor is not used. I sweep across different values of these
parameters, each time performing 10000 simulations, computing the sample
Markowitz portfolio and its Signal Noise ratio.</p>
<!-- PELICAN_END_SUMMARY -->
<div class="highlight"><pre><span></span><span class="kp">suppressMessages</span><span class="p">({</span>
<span class="kn">library</span><span class="p">(</span>dplyr<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>tidyr<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>tibble<span class="p">)</span>
<span class="c1"># https://cran.r-project.org/web/packages/doFuture/vignettes/doFuture.html</span>
<span class="kn">library</span><span class="p">(</span>doFuture<span class="p">)</span>
registerDoFuture<span class="p">()</span>
plan<span class="p">(</span>multiprocess<span class="p">)</span>
<span class="p">})</span>
<span class="c1"># one simulation of n periods of data on p assets with true optimal</span>
<span class="c1"># SNR of (the vector of) pzeta</span>
onesim <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">)</span> <span class="p">{</span>
pmus <span class="o"><-</span> pzeta <span class="o">/</span> <span class="kp">sqrt</span><span class="p">(</span>p<span class="p">)</span>
<span class="c1"># simulate an X: too slow.</span>
<span class="c1">#X <- matrix(rnorm(n*p,mean=pmus[1],sd=1),ncol=p)</span>
<span class="c1">#smu1 <- colMeans(X)</span>
<span class="c1">#ssig <- ((n-1)/n) * cov(X)</span>
<span class="c1"># this is faster:</span>
smu1 <span class="o"><-</span> rnorm<span class="p">(</span>p<span class="p">,</span>mean<span class="o">=</span>pmus<span class="p">[</span><span class="m">1</span><span class="p">],</span>sd<span class="o">=</span><span class="m">1</span> <span class="o">/</span> <span class="kp">sqrt</span><span class="p">(</span>n<span class="p">))</span>
ssig <span class="o"><-</span> rWishart<span class="p">(</span><span class="m">1</span><span class="p">,</span>df<span class="o">=</span>n<span class="m">-1</span><span class="p">,</span>Sigma<span class="o">=</span><span class="kp">diag</span><span class="p">(</span><span class="m">1</span><span class="p">,</span>ncol<span class="o">=</span>p<span class="p">,</span>nrow<span class="o">=</span>p<span class="p">))</span> <span class="o">/</span> n <span class="c1"># sic n</span>
<span class="kp">dim</span><span class="p">(</span>ssig<span class="p">)</span> <span class="o"><-</span> <span class="kt">c</span><span class="p">(</span>p<span class="p">,</span>p<span class="p">)</span>
smus <span class="o"><-</span> <span class="kp">outer</span><span class="p">(</span>smu1<span class="p">,</span>pmus <span class="o">-</span> pmus<span class="p">[</span><span class="m">1</span><span class="p">],</span>FUN<span class="o">=</span><span class="s">'+'</span><span class="p">)</span>
smps <span class="o"><-</span> <span class="kp">solve</span><span class="p">(</span>ssig<span class="p">,</span>smus<span class="p">)</span>
szeta <span class="o"><-</span> <span class="kp">sqrt</span><span class="p">(</span><span class="kp">colSums</span><span class="p">(</span>smus <span class="o">*</span> smps<span class="p">))</span>
psnr <span class="o"><-</span> pmus <span class="o">*</span> <span class="kp">as.numeric</span><span class="p">(</span><span class="kp">colSums</span><span class="p">(</span>smps<span class="p">)</span> <span class="o">/</span> <span class="kp">sqrt</span><span class="p">(</span><span class="kp">colSums</span><span class="p">(</span>smps<span class="o">^</span><span class="m">2</span><span class="p">)))</span>
<span class="kp">cbind</span><span class="p">(</span>pzeta<span class="p">,</span>szeta<span class="p">,</span>psnr<span class="p">)</span>
<span class="p">}</span>
<span class="c1"># do that many times.</span>
repsim <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>nrep<span class="p">,</span>zetas<span class="p">,</span>n<span class="p">,</span>p<span class="p">)</span> <span class="p">{</span>
foo <span class="o"><-</span> <span class="kp">replicate</span><span class="p">(</span>nrep<span class="p">,</span>onesim<span class="p">(</span>pzeta<span class="o">=</span>zetas<span class="p">,</span>n<span class="p">,</span>p<span class="p">))</span>
baz <span class="o"><-</span> <span class="kp">aperm</span><span class="p">(</span>foo<span class="p">,</span><span class="kt">c</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="m">3</span><span class="p">,</span><span class="m">2</span><span class="p">))</span>
<span class="kp">dim</span><span class="p">(</span>baz<span class="p">)</span> <span class="o"><-</span> <span class="kt">c</span><span class="p">(</span>nrep <span class="o">*</span> <span class="kp">length</span><span class="p">(</span>zetas<span class="p">),</span><span class="kp">dim</span><span class="p">(</span>foo<span class="p">)[</span><span class="m">2</span><span class="p">])</span>
<span class="kp">colnames</span><span class="p">(</span>baz<span class="p">)</span> <span class="o"><-</span> <span class="kp">colnames</span><span class="p">(</span>foo<span class="p">)</span>
<span class="kp">invisible</span><span class="p">(</span><span class="kp">as.data.frame</span><span class="p">(</span>baz<span class="p">))</span>
<span class="p">}</span>
manysim <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>nrep<span class="p">,</span>zetas<span class="p">,</span>n<span class="p">,</span>p<span class="p">,</span>nnodes<span class="o">=</span><span class="m">7</span><span class="p">)</span> <span class="p">{</span>
<span class="kr">if</span> <span class="p">(</span>nrep <span class="o">></span> <span class="m">4</span><span class="o">*</span>nnodes<span class="p">)</span> <span class="p">{</span>
<span class="c1"># do in parallel.</span>
nper <span class="o"><-</span> <span class="kp">table</span><span class="p">(</span><span class="m">1</span> <span class="o">+</span> <span class="p">((</span><span class="m">0</span><span class="o">:</span><span class="p">(</span>nrep<span class="m">-1</span><span class="p">)</span> <span class="o">%%</span> nnodes<span class="p">)))</span>
retv <span class="o"><-</span> foreach<span class="p">(</span>i<span class="o">=</span><span class="m">1</span><span class="o">:</span>nnodes<span class="p">,</span><span class="m">.</span>export <span class="o">=</span> <span class="kt">c</span><span class="p">(</span><span class="s">'zetas'</span><span class="p">,</span><span class="s">'n'</span><span class="p">,</span><span class="s">'p'</span><span class="p">,</span><span class="s">'repsim'</span><span class="p">,</span><span class="s">'onesim'</span><span class="p">))</span> <span class="o">%dopar%</span> <span class="p">{</span>
repsim<span class="p">(</span>nrep<span class="o">=</span>nper<span class="p">[</span>i<span class="p">],</span>zetas<span class="o">=</span>zetas<span class="p">,</span>n<span class="o">=</span>n<span class="p">,</span>p<span class="o">=</span>p<span class="p">)</span>
<span class="p">}</span> <span class="o">%>%</span>
bind_rows<span class="p">()</span>
<span class="p">}</span> <span class="kr">else</span> <span class="p">{</span>
retv <span class="o"><-</span> repsim<span class="p">(</span>nrep<span class="o">=</span>nrep<span class="p">,</span>zetas<span class="o">=</span>zetas<span class="p">,</span>n<span class="o">=</span>n<span class="p">,</span>p<span class="o">=</span>p<span class="p">)</span>
<span class="p">}</span>
retv
<span class="p">}</span>
<span class="c1"># actually do it many times.</span>
ope <span class="o"><-</span> <span class="m">252</span>
zetasq <span class="o"><-</span> <span class="kt">c</span><span class="p">(</span><span class="m">1</span><span class="o">/</span><span class="m">8</span><span class="p">,</span><span class="m">1</span><span class="o">/</span><span class="m">4</span><span class="p">,</span><span class="m">1</span><span class="o">/</span><span class="m">2</span><span class="p">,</span><span class="m">1</span><span class="p">,</span><span class="m">2</span><span class="p">,</span><span class="m">4</span><span class="p">)</span> <span class="o">/</span> ope
zeta <span class="o"><-</span> <span class="kp">sqrt</span><span class="p">(</span>zetasq<span class="p">)</span>
params <span class="o"><-</span> tidyr<span class="o">::</span>crossing<span class="p">(</span>tibble<span class="o">::</span>tribble<span class="p">(</span><span class="o">~</span>n<span class="p">,</span><span class="m">100</span><span class="p">,</span><span class="m">200</span><span class="p">,</span><span class="m">400</span><span class="p">,</span><span class="m">800</span><span class="p">,</span><span class="m">1600</span><span class="p">,</span><span class="m">3200</span><span class="p">,</span><span class="m">6400</span><span class="p">,</span><span class="m">12800</span><span class="p">),</span>
tibble<span class="o">::</span>tribble<span class="p">(</span><span class="o">~</span>p<span class="p">,</span><span class="m">2</span><span class="p">,</span><span class="m">4</span><span class="p">,</span><span class="m">8</span><span class="p">,</span><span class="m">16</span><span class="p">),</span>
tibble<span class="o">::</span>tribble<span class="p">(</span><span class="o">~</span>kurty<span class="p">,</span><span class="m">1</span><span class="p">))</span>
nrep <span class="o"><-</span> <span class="m">10000</span>
<span class="kp">set.seed</span><span class="p">(</span><span class="m">2356</span><span class="p">)</span>
<span class="kp">system.time</span><span class="p">({</span>
results <span class="o"><-</span> params <span class="o">%>%</span>
group_by<span class="p">(</span>n<span class="p">,</span>p<span class="p">,</span>kurty<span class="p">)</span> <span class="o">%>%</span>
summarize<span class="p">(</span>sims<span class="o">=</span><span class="kt">list</span><span class="p">(</span>manysim<span class="p">(</span>nrep<span class="o">=</span>nrep<span class="p">,</span>zetas<span class="o">=</span>zeta<span class="p">,</span>n<span class="o">=</span>n<span class="p">,</span>p<span class="o">=</span>p<span class="p">)))</span> <span class="o">%>%</span>
ungroup<span class="p">()</span> <span class="o">%>%</span>
tidyr<span class="o">::</span>unnest<span class="p">()</span>
<span class="p">})</span>
</pre></div>
<div class="highlight"><pre><span></span> user system elapsed
113.488 406.360 74.088
</pre></div>
<p>Here I collect the simulations together, computing the three confidence limits and
then the empirical type I rates. I plot them below. </p>
<div class="highlight"><pre><span></span><span class="c1"># the nominal rate:</span>
typeI <span class="o"><-</span> <span class="m">0.05</span>
<span class="c1"># invert the TAS function</span>
anti_tas <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>x<span class="p">)</span> <span class="p">{</span> x <span class="o">/</span> <span class="kp">sqrt</span><span class="p">(</span><span class="m">1</span> <span class="o">+</span> x<span class="o">^</span><span class="m">2</span><span class="p">)</span> <span class="p">}</span>
<span class="c1"># confidence intervals and coverage:</span>
cires <span class="o"><-</span> results <span class="o">%>%</span>
mutate<span class="p">(</span>kurty<span class="o">=</span><span class="m">1</span><span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>bit1 <span class="o">=</span> <span class="p">(</span>kurty<span class="o">*</span>pzeta<span class="o">^</span><span class="m">2</span> <span class="o">+</span> <span class="m">1</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="m">1</span> <span class="o">-</span> p<span class="p">),</span>
bit2 <span class="o">=</span> <span class="p">(</span><span class="m">3</span> <span class="o">*</span> kurty <span class="o">-</span> <span class="m">1</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span>pzeta<span class="o">^</span><span class="m">2</span><span class="o">/</span><span class="m">4</span><span class="p">)</span> <span class="o">+</span> <span class="m">1</span><span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>lam1<span class="o">=</span>pzeta<span class="o">*</span><span class="kp">sqrt</span><span class="p">((</span><span class="m">2+3</span><span class="o">*</span><span class="p">(</span>kurty<span class="m">-1</span><span class="p">))</span><span class="o">/</span><span class="p">(</span><span class="m">4</span><span class="o">*</span>n<span class="p">)),</span>
lamp<span class="o">=</span><span class="kp">sqrt</span><span class="p">(</span><span class="m">1</span> <span class="o">+</span> kurty<span class="o">*</span>pzeta<span class="o">^</span><span class="m">2</span><span class="p">)</span><span class="o">/</span><span class="kp">sqrt</span><span class="p">(</span>n<span class="p">))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>tpart<span class="o">=</span>qt<span class="p">(</span>typeI<span class="p">,</span>df<span class="o">=</span>p<span class="m">-1</span><span class="p">,</span>ncp<span class="o">=</span>szeta<span class="o">/</span>lam1<span class="p">))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>ci_add <span class="o">=</span> szeta <span class="o">+</span> <span class="p">((</span>bit1 <span class="o">+</span> bit2<span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="m">2</span> <span class="o">*</span> n <span class="o">*</span> pzeta<span class="p">))</span> <span class="o">+</span> qnorm<span class="p">(</span>typeI<span class="p">)</span> <span class="o">*</span> <span class="kp">sqrt</span><span class="p">(</span>bit2<span class="o">/</span>n<span class="p">))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>ci_div <span class="o">=</span> szeta <span class="o">*</span> <span class="p">(</span><span class="m">1</span> <span class="o">+</span> <span class="p">((</span>bit1 <span class="o">+</span> <span class="m">3</span> <span class="o">*</span> bit2<span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="m">2</span> <span class="o">*</span> n <span class="o">*</span> pzeta <span class="o">*</span> pzeta<span class="p">))</span> <span class="o">+</span> qnorm<span class="p">(</span>typeI<span class="p">)</span> <span class="o">*</span> <span class="kp">sqrt</span><span class="p">(</span>bit2 <span class="o">/</span> <span class="p">(</span>n<span class="o">*</span>pzeta<span class="o">*</span>pzeta<span class="p">))))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>ci_tas <span class="o">=</span> pzeta <span class="o">*</span> anti_tas<span class="p">((</span>lam1 <span class="o">*</span> tpart<span class="p">)</span> <span class="o">/</span> <span class="p">(</span>lamp <span class="o">*</span> <span class="kp">sqrt</span><span class="p">(</span>p<span class="m">-1</span><span class="p">))))</span> <span class="o">%>%</span>
group_by<span class="p">(</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">,</span>kurty<span class="p">)</span> <span class="o">%>%</span>
summarize<span class="p">(</span>type1_add <span class="o">=</span> <span class="kp">mean</span><span class="p">(</span>psnr <span class="o"><</span> ci_add<span class="p">),</span>
type1_div <span class="o">=</span> <span class="kp">mean</span><span class="p">(</span>psnr <span class="o"><</span> ci_div<span class="p">),</span>
type1_tas <span class="o">=</span> <span class="kp">mean</span><span class="p">(</span>psnr <span class="o"><</span> ci_tas<span class="p">))</span> <span class="o">%>%</span>
ungroup<span class="p">()</span> <span class="o">%>%</span>
mutate<span class="p">(</span>zyr<span class="o">=</span><span class="kp">signif</span><span class="p">(</span>pzeta <span class="o">*</span> <span class="kp">sqrt</span><span class="p">(</span>ope<span class="p">),</span>digits<span class="o">=</span><span class="m">2</span><span class="p">))</span> <span class="o">%>%</span>
rename<span class="p">(</span><span class="sb">`annualized SNR`</span><span class="o">=</span>zyr<span class="p">)</span>
</pre></div>
<div class="highlight"><pre><span></span><span class="c1"># plot CIs:</span>
<span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span>
ph <span class="o"><-</span> cires <span class="o">%>%</span>
tidyr<span class="o">::</span>gather<span class="p">(</span>key<span class="o">=</span>type<span class="p">,</span>value<span class="o">=</span>type1<span class="p">,</span>matches<span class="p">(</span><span class="s">'^type1_'</span><span class="p">))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>type<span class="o">=</span>case_when<span class="p">(</span><span class="m">.</span><span class="o">$</span>type<span class="o">==</span><span class="s">'type1_add'</span> <span class="o">~</span> <span class="s">'type I rate, difference form'</span><span class="p">,</span>
<span class="m">.</span><span class="o">$</span>type<span class="o">==</span><span class="s">'type1_div'</span> <span class="o">~</span> <span class="s">'type I rate, ratio form'</span><span class="p">,</span>
<span class="m">.</span><span class="o">$</span>type<span class="o">==</span><span class="s">'type1_tas'</span> <span class="o">~</span> <span class="s">'type I rate, tas form'</span><span class="p">,</span>
<span class="kc">TRUE</span> <span class="o">~</span> <span class="s">'bad code'</span><span class="p">))</span> <span class="o">%>%</span>
ggplot<span class="p">(</span>aes<span class="p">(</span>n<span class="p">,</span>type1<span class="p">,</span>color<span class="o">=</span>type<span class="p">))</span> <span class="o">+</span>
geom_line<span class="p">()</span> <span class="o">+</span> geom_point<span class="p">()</span> <span class="o">+</span>
facet_grid<span class="p">(</span>p <span class="o">~</span> <span class="sb">`annualized SNR`</span><span class="p">,</span>scales<span class="o">=</span><span class="s">'free'</span><span class="p">,</span>labeller<span class="o">=</span>label_both<span class="p">)</span> <span class="o">+</span>
scale_x_log10<span class="p">()</span> <span class="o">+</span>
geom_hline<span class="p">(</span>yintercept<span class="o">=</span><span class="m">0.05</span><span class="p">,</span>linetype<span class="o">=</span><span class="m">2</span><span class="p">)</span> <span class="o">+</span>
labs<span class="p">(</span>x<span class="o">=</span><span class="s">'number of days data'</span><span class="p">,</span>
y<span class="o">=</span><span class="s">'empirical type I rates at nominal 0.05 level'</span><span class="p">,</span>
title<span class="o">=</span><span class="s">'Theoretical and empirical coverage of 0.05 CIs on SNR of Markowitz Portfolio, using some clairvoyance, normal returns.'</span><span class="p">)</span>
<span class="kp">print</span><span class="p">(</span>ph<span class="p">)</span>
</pre></div>
<p><img src="http://www.gilgamath.com/figure/new_mp_ci_ci_plots-1.png" title="plot of chunk ci_plots" alt="plot of chunk ci_plots" width="900px" height="700px" /></p>
<p>The new confidence limit, plotted in blue and called the "tas form" here, is apparently very optimistic and
much too high.
The empirical rate of type I errors is enormous, sometimes over 90%.
It should be noted that the simulations here use some amount of
'clairvoyance' on <span class="math">\(\zeta\)</span>; use of a sample estimate would further degrade them, but they are already unusable
except for unreasonably large sample sizes. So back to the drawing board.</p>
Markowitz Portfolio Covariance, Elliptical Returns2018-03-12T22:28:31-07:002018-03-12T22:28:31-07:00Steven E. Pavtag:www.gilgamath.com,2018-03-12:/markowitz-cov-elliptical.html<p>In a <a href="bad-cis">previous blog post</a>, I looked at asymptotic confidence
intervals for the Signal to Noise ratio of the (sample) Markowitz
portfolio, finding them to be deficient. (Perhaps they are useful if
one has hundreds of thousands of days of data, but are otherwise
awful.) Those confidence intervals came from revision four of my paper
on the <a href="https://arxiv.org/abs/1312.0557">Asymptotic distribution of the Markowitz Portfolio</a>.
In that same update, I also describe, albeit in an obfuscated form,
the asymptotic distribution of the sample Markowitz portfolio for
elliptical returns. Here I check that finding empirically.
<!-- PELICAN_END_SUMMARY --></p>
<p>Suppose you observe a <span class="math">\(p\)</span> vector of returns drawn from an elliptical
distribution with mean <span class="math">\(\mu\)</span>, covariance <span class="math">\(\Sigma\)</span> and 'kurtosis factor',
<span class="math">\(\kappa\)</span>. Three times the kurtosis factor is the kurtosis of marginals
under this assumed model. It takes value <span class="math">\(1\)</span> for a multivariate normal.
This model of returns is slightly more realistic than the multivariate normal,
but does not allow for skewness of asset returns, an omission which itself seems unrealistic.</p>
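<p>As a quick sanity check of that convention, one can estimate the kurtosis factor from simulated normal returns and confirm it is near one:</p>

```r
# Sanity check of the 'kurtosis factor' convention: for (multivariate) normal
# returns the marginal kurtosis is 3, so kappa = kurtosis / 3 should be near 1.
set.seed(7)
x <- rnorm(1e6)
kurt <- mean((x - mean(x))^4) / (mean((x - mean(x))^2)^2)
kappa_hat <- kurt / 3
kappa_hat   # close to 1
```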
<p>Nonetheless, let <span class="math">\(\hat{\nu}\)</span> be the Markowitz portfolio built on a sample
of <span class="math">\(n\)</span> days of independent returns:
</p>
<div class="math">$$
\hat{\nu} = \hat{\Sigma}^{-1} \hat{\mu},
$$</div>
<p>
where <span class="math">\(\hat{\mu}, \hat{\Sigma}\)</span> are the regular 'vanilla' estimates
of mean and covariance. The vector <span class="math">\(\hat{\nu}\)</span> is, in a sense, over-corrected,
and we need to cancel out a square root of <span class="math">\(\Sigma\)</span> (the population value). So
we will consider the distribution of <span class="math">\(Q \Sigma^{\top/2} \hat{\nu}\)</span>, where
<span class="math">\(\Sigma^{\top/2}\)</span> is the upper triangular Cholesky factor of <span class="math">\(\Sigma\)</span>,
and where <span class="math">\(Q\)</span> is an orthogonal matrix (<span class="math">\(Q Q^{\top} = I\)</span>), and where
<span class="math">\(Q\)</span> rotates <span class="math">\(\Sigma^{-1/2}\mu\)</span> onto <span class="math">\(e_1\)</span>, the first basis vector:
</p>
<div class="math">$$
Q \Sigma^{-1/2}\mu = \zeta e_1,
$$</div>
<p>
where <span class="math">\(\zeta\)</span> is the Signal to Noise ratio of the population Markowitz
portfolio: <span class="math">\(\zeta = \sqrt{\mu^{\top}\Sigma^{-1}\mu} = \left\Vert …</span></p><p>In a <a href="bad-cis">previous blog post</a>, I looked at asymptotic confidence
intervals for the Signal to Noise ratio of the (sample) Markowitz
portfolio, finding them to be deficient. (Perhaps they are useful if
one has hundreds of thousands of days of data, but are otherwise
awful.) Those confidence intervals came from revision four of my paper
on the <a href="https://arxiv.org/abs/1312.0557">Asymptotic distribution of the Markowitz Portfolio</a>.
In that same update, I also describe, albeit in an obfuscated form,
the asymptotic distribution of the sample Markowitz portfolio for
elliptical returns. Here I check that finding empirically.
<!-- PELICAN_END_SUMMARY --></p>
<p>Suppose you observe a <span class="math">\(p\)</span> vector of returns drawn from an elliptical
distribution with mean <span class="math">\(\mu\)</span>, covariance <span class="math">\(\Sigma\)</span> and 'kurtosis factor',
<span class="math">\(\kappa\)</span>. Under this model, three times the kurtosis factor equals the
kurtosis of the marginals, so the factor takes the value <span class="math">\(1\)</span> for a multivariate normal.
This model of returns is slightly more realistic than the multivariate normal,
but it does not allow for skewness of asset returns, and that restriction is itself unrealistic.</p>
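<p>As a quick sanity check of this definition (a sketch of my own, with an arbitrary seed and sample size): the marginal kurtosis of normal data should be near 3, so the estimated kurtosis factor should be near 1.</p>

```r
# Estimate the kurtosis factor from simulated normal returns.
# For a normal marginal the kurtosis is 3, so the factor should be near 1.
set.seed(123)
x <- rnorm(1e6)
m2 <- mean((x - mean(x))^2)           # second central moment
m4 <- mean((x - mean(x))^4)           # fourth central moment
kappa_hat <- (m4 / m2^2) / 3          # estimated kurtosis factor
print(kappa_hat)
```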
<p>Nonetheless, let <span class="math">\(\hat{\nu}\)</span> be the Markowitz portfolio built on a sample
of <span class="math">\(n\)</span> days of independent returns:
</p>
<div class="math">$$
\hat{\nu} = \hat{\Sigma}^{-1} \hat{\mu},
$$</div>
<p>
where <span class="math">\(\hat{\mu}, \hat{\Sigma}\)</span> are the regular 'vanilla' estimates
of mean and covariance. The vector <span class="math">\(\hat{\nu}\)</span> is, in a sense, over-corrected,
and we need to cancel out a square root of <span class="math">\(\Sigma\)</span> (the population value). So
we will consider the distribution of <span class="math">\(Q \Sigma^{\top/2} \hat{\nu}\)</span>, where
<span class="math">\(\Sigma^{\top/2}\)</span> is the upper triangular Cholesky factor of <span class="math">\(\Sigma\)</span>,
and where <span class="math">\(Q\)</span> is an orthogonal matrix (<span class="math">\(Q Q^{\top} = I\)</span>), and where
<span class="math">\(Q\)</span> rotates <span class="math">\(\Sigma^{-1/2}\mu\)</span> onto <span class="math">\(e_1\)</span>, the first basis vector:
</p>
<div class="math">$$
Q \Sigma^{-1/2}\mu = \zeta e_1,
$$</div>
<p>
where <span class="math">\(\zeta\)</span> is the Signal to Noise ratio of the population Markowitz
portfolio: <span class="math">\(\zeta = \sqrt{\mu^{\top}\Sigma^{-1}\mu} = \left\Vert \Sigma^{-1/2}\mu \right\Vert.\)</span>
A <a href="https://arxiv.org/abs/1803.01381">very recent paper</a> on arxiv calls
<span class="math">\(\Sigma^{-1/2}\mu\)</span> the 'Generalized Information Ratio', and I think
it may be productive to analyze this quantity.</p>
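<p>To make the transformation concrete, here is a minimal sketch (my own construction, not from the post) of one valid choice of <span class="math">\(Q\)</span>: a Householder reflection sending <span class="math">\(\Sigma^{-1/2}\mu\)</span> to <span class="math">\(\zeta e_1\)</span>. A reflection is not a proper rotation, but only the orthogonality <span class="math">\(Q Q^{\top} = I\)</span> is used in what follows.</p>

```r
# One valid choice of Q: a Householder reflection mapping
# v = Sigma^{-1/2} mu onto zeta * e_1. Toy population values.
set.seed(42)
p <- 4
A <- matrix(rnorm(p * p), p, p)
Sigma <- crossprod(A) + diag(p)        # a positive definite covariance
mu <- rnorm(p) / 10
R <- chol(Sigma)                       # upper Cholesky: Sigma = t(R) %*% R
v <- solve(t(R), mu)                   # v = Sigma^{-1/2} mu
zeta <- sqrt(sum(v^2))                 # SNR of the population Markowitz portfolio
e1 <- c(1, rep(0, p - 1))
w <- v - zeta * e1                     # Householder vector
Q <- diag(p) - 2 * tcrossprod(w) / sum(w^2)
max(abs(Q %*% v - zeta * e1))          # effectively zero: Q v = zeta e_1
max(abs(crossprod(Q) - diag(p)))       # effectively zero: Q is orthogonal
```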
<p>Back to our problem: as <span class="math">\(n\)</span> gets large, we expect <span class="math">\(\hat{\Sigma}\)</span> to approach
<span class="math">\(\Sigma\)</span>, in which case <span class="math">\(Q \Sigma^{\top/2} \hat{\nu}\)</span> should approach <span class="math">\(\zeta
e_1\)</span>. What I find, by the delta method, is that
</p>
<div class="math">$$
\sqrt{n}\left(Q \Sigma^{\top/2} \hat{\Sigma}^{-1}\hat{\mu} - \zeta e_1\right)
\rightsquigarrow \mathcal{N}\left(0,
\left(1+\kappa\zeta^2\right)I + \left(2\kappa - 1\right)\zeta^2 e_1 e_1^{\top}
\right).
$$</div>
<p>
Note:</p>
<ul>
<li>The true mean return of the sample Markowitz portfolio is equal to
<div class="math">$$
\mu^{\top} \hat{\Sigma}^{-1}\hat{\mu} =
\mu^{\top} \Sigma^{-1} \Sigma^{1/2} Q^{\top} Q\Sigma^{\top/2}\hat{\Sigma}^{-1}\hat{\mu} =
\zeta e_1^{\top} Q\Sigma^{\top/2}\hat{\Sigma}^{-1}\hat{\mu},
$$</div>
that is, all the expected return is due to the first element of <span class="math">\(Q\Sigma^{\top/2}\hat{\Sigma}^{-1}\hat{\mu}\)</span>.
The first element may have non-zero mean, but the remaining elements are
asymptotically zero mean.</li>
<li>The volatility of the sample Markowitz portfolio is equal to
<div class="math">$$
\sqrt{\hat{\mu}^{\top}\hat{\Sigma}^{-1}\Sigma\hat{\Sigma}^{-1}\hat{\mu}} =
\sqrt{\hat{\mu}^{\top}\hat{\Sigma}^{-1}\Sigma^{1/2}Q^{\top} Q \Sigma^{\top/2}\hat{\Sigma}^{-1}\hat{\mu}} =
\left\Vert Q \Sigma^{\top/2}\hat{\Sigma}^{-1}\hat{\mu} \right\Vert.
$$</div>
So the total length of the vector <span class="math">\(Q\Sigma^{\top/2}\hat{\Sigma}^{-1}\hat{\mu}\)</span> gives
the risk of our portfolio. </li>
<li>By means of <span class="math">\(Q\)</span> we have rotated the space such that the errors in the
elements of <span class="math">\(Q\Sigma^{\top/2}\hat{\Sigma}^{-1}\hat{\mu}\)</span> are asymptotically independent
(their covariance is diagonal).</li>
</ul>
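<p>The first two identities hold exactly for any portfolio vector, not just asymptotically, so they are easy to check numerically. A sketch with toy values (the Householder choice of <span class="math">\(Q\)</span> and all names here are mine):</p>

```r
# Check the mean-return and volatility identities for a sample
# Markowitz portfolio on simulated normal returns.
set.seed(7)
p <- 4; n <- 500
Sigma <- 0.7 * diag(p) + 0.3           # equicorrelated covariance
mu <- 0.001 * (1:p)
R <- chol(Sigma)                       # Sigma = t(R) %*% R
v <- solve(t(R), mu)
zeta <- sqrt(sum(v^2))
e1 <- c(1, rep(0, p - 1))
w <- v - zeta * e1
Q <- diag(p) - 2 * tcrossprod(w) / sum(w^2)   # orthogonal, Q v = zeta e_1
# simulate N(mu, Sigma) returns without extra packages
X <- matrix(rnorm(n * p), n, p) %*% R + matrix(mu, n, p, byrow = TRUE)
nu_hat <- solve(cov(X), colMeans(X))   # sample Markowitz portfolio
y <- Q %*% R %*% nu_hat                # Q Sigma^{T/2} nu_hat
# true mean return equals zeta times the first element of y:
c(sum(mu * nu_hat), zeta * y[1])
# volatility equals the length of y:
c(sqrt(sum(nu_hat * (Sigma %*% nu_hat))), sqrt(sum(y^2)))
```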
<p>I learned the hard way in the previous post that 'asymptotically' can require
very large sample sizes, much bigger than practical. So here I first check
these covariances for reasonable sample sizes. I draw returns from either
multivariate normal, or multivariate <span class="math">\(t\)</span> distribution with degrees of freedom
selected to achieve a fixed value of <span class="math">\(\kappa\)</span>, the kurtosis factor. I perform
simulations with the sample size ranging from 100 to 1600 days, with the
Signal Noise ratio of the population Markowitz portfolio ranging from 1/2
to 2 in annualized units, and I test universes of 4 or 16 assets. For each
choice of parameters, I perform 10K simulations. I compute the error
</p>
<div class="math">$$
\operatorname{diag}\left(\left(1+\kappa\zeta^2\right)I + \left(2\kappa - 1\right)\zeta^2 e_1 e_1^{\top}\right)^{-1/2}
\sqrt{n}\left(Q \Sigma^{\top/2} \hat{\Sigma}^{-1}\hat{\mu} - \zeta e_1\right).
$$</div>
<p>
I save the first and last element of that vector for each simulation. Then for
a fixed setting of the parameters, I will create a Q-Q plot of the actual
errors against Normal quantiles. We will not test independence of the elements,
but we should get a quick read on whether we have correctly expressed the
mean and covariance of <span class="math">\(Q \Sigma^{\top/2} \hat{\Sigma}^{-1}\hat{\mu}\)</span>, and
what sample size is required to reach 'asymptotically'. The simulations:</p>
<div class="highlight"><pre><span></span><span class="kp">suppressMessages</span><span class="p">({</span>
<span class="kn">library</span><span class="p">(</span>dplyr<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>tidyr<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>tibble<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>mvtnorm<span class="p">)</span>
<span class="c1"># https://cran.r-project.org/web/packages/doFuture/vignettes/doFuture.html</span>
<span class="kn">library</span><span class="p">(</span>doFuture<span class="p">)</span>
registerDoFuture<span class="p">()</span>
plan<span class="p">(</span>multiprocess<span class="p">)</span>
<span class="p">})</span>
<span class="c1"># one simulation of n periods of data on p assets with true optimal</span>
<span class="c1"># SNR of (the vector of) pzeta</span>
onesim <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">,</span>kurty<span class="p">,</span>beta<span class="o">=</span><span class="m">0.3</span><span class="p">)</span> <span class="p">{</span>
<span class="c1"># create the population</span>
sig <span class="o"><-</span> rWishart<span class="p">(</span><span class="m">1</span><span class="p">,</span>df<span class="o">=</span><span class="m">1000</span><span class="p">,</span>Sigma<span class="o">=</span><span class="p">(</span><span class="m">1</span><span class="o">-</span><span class="kp">beta</span><span class="p">)</span><span class="o">*</span><span class="kp">diag</span><span class="p">(</span>p<span class="p">)</span><span class="o">+</span><span class="kp">beta</span><span class="p">)[,,</span><span class="m">1</span><span class="p">]</span>
<span class="c1">#mu <- rnorm(p)</span>
<span class="c1"># to simplify our lives, force Q to be the identity</span>
hasig <span class="o"><-</span> <span class="kp">chol</span><span class="p">(</span>sig<span class="p">)</span>
<span class="c1"># don't worry, we rescale it later</span>
e1 <span class="o"><-</span> <span class="kt">c</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="kp">rep</span><span class="p">(</span><span class="m">0</span><span class="p">,</span>p<span class="m">-1</span><span class="p">))</span>
mu <span class="o"><-</span> <span class="kp">t</span><span class="p">(</span>hasig<span class="p">)</span> <span class="o">%*%</span> e1
<span class="c1"># true Markowitz portfolio</span>
pwopt <span class="o"><-</span> <span class="kp">solve</span><span class="p">(</span>sig<span class="p">,</span>mu<span class="p">)</span>
<span class="c1"># true optimal squared sharpe</span>
psrsqopt <span class="o"><-</span> <span class="kp">sum</span><span class="p">(</span>pwopt <span class="o">*</span> mu<span class="p">)</span>
psropt <span class="o"><-</span> <span class="kp">sqrt</span><span class="p">(</span>psrsqopt<span class="p">)</span>
<span class="c1"># rescale mu to achieve pzeta</span>
rescal <span class="o"><-</span> <span class="p">(</span>pzeta <span class="o">/</span> psropt<span class="p">)</span>
mu <span class="o"><-</span> rescal <span class="o">*</span> mu
pwopt <span class="o"><-</span> rescal <span class="o">*</span> pwopt
psropt <span class="o"><-</span> pzeta
psrsqopt <span class="o"><-</span> pzeta<span class="o">^</span><span class="m">2</span>
<span class="c1"># now sample</span>
<span class="c1"># kurty is the kurtosis factor </span>
<span class="c1"># =1 means normal, otherwise use a t distribution</span>
<span class="kr">if</span> <span class="p">(</span>kurty<span class="o">==</span><span class="m">1</span><span class="p">)</span> <span class="p">{</span>
X <span class="o"><-</span> rmvnorm<span class="p">(</span>n<span class="p">,</span>mean<span class="o">=</span>mu<span class="p">,</span>sigma<span class="o">=</span>sig<span class="p">)</span>
<span class="p">}</span> <span class="kr">else</span> <span class="p">{</span>
df <span class="o"><-</span> <span class="m">4</span> <span class="o">+</span> <span class="p">(</span><span class="m">6</span> <span class="o">/</span> <span class="p">(</span>kurty<span class="m">-1</span><span class="p">))</span>
<span class="c1"># for a t distribution, scale sigma by (df-2)/df so the covariance is sig</span>
X <span class="o"><-</span> rmvt<span class="p">(</span>n<span class="p">,</span>delta<span class="o">=</span>mu<span class="p">,</span>sigma<span class="o">=</span><span class="p">((</span>df<span class="m">-2</span><span class="p">)</span><span class="o">/</span>df<span class="p">)</span> <span class="o">*</span> sig<span class="p">,</span>type<span class="o">=</span><span class="s">'shifted'</span><span class="p">,</span>df<span class="o">=</span>df<span class="p">)</span>
<span class="p">}</span>
smu1 <span class="o"><-</span> <span class="kp">colMeans</span><span class="p">(</span>X<span class="p">)</span>
ssig <span class="o"><-</span> <span class="p">((</span>n<span class="m">-1</span><span class="p">)</span><span class="o">/</span>n<span class="p">)</span> <span class="o">*</span> cov<span class="p">(</span>X<span class="p">)</span>
swopt <span class="o"><-</span> <span class="kp">solve</span><span class="p">(</span>ssig<span class="p">,</span>smu1<span class="p">)</span>
<span class="c1"># scale by sigma^T/2</span>
ssmp <span class="o"><-</span> hasig <span class="o">%*%</span> swopt
stat <span class="o"><-</span> <span class="kp">sqrt</span><span class="p">(</span>n<span class="p">)</span> <span class="o">*</span> <span class="p">(</span>ssmp <span class="o">-</span> pzeta <span class="o">*</span> e1<span class="p">)</span>
<span class="c1"># the claim is that this is the covariance of that thing:</span>
<span class="c1"># (1 + kurty * psrsqopt) * diag(p) + (2 * kurty - 1) * psrsqopt * outer(e1,e1)</span>
Omegd <span class="o"><-</span> <span class="p">(</span><span class="m">1</span> <span class="o">+</span> kurty <span class="o">*</span> psrsqopt<span class="p">)</span> <span class="o">+</span> <span class="p">(</span><span class="m">2</span> <span class="o">*</span> kurty <span class="o">-</span> <span class="m">1</span><span class="p">)</span> <span class="o">*</span> psrsqopt <span class="o">*</span> e1
<span class="c1"># divide by the root covariance. it is diagonal</span>
adjstat <span class="o"><-</span> stat <span class="o">/</span> <span class="kp">sqrt</span><span class="p">(</span>Omegd<span class="p">)</span>
<span class="c1"># pick out just the first and last values</span>
firstv <span class="o"><-</span> adjstat<span class="p">[</span><span class="m">1</span><span class="p">]</span>
lastv <span class="o"><-</span> adjstat<span class="p">[</span>p<span class="p">]</span>
<span class="kp">cbind</span><span class="p">(</span>pzeta<span class="p">,</span>firstv<span class="p">,</span>lastv<span class="p">)</span>
<span class="p">}</span>
<span class="c1"># do that many times.</span>
repsim <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>nrep<span class="p">,</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">,</span>kurty<span class="p">,</span>beta<span class="o">=</span><span class="m">0.3</span><span class="p">)</span> <span class="p">{</span>
foo <span class="o"><-</span> <span class="kp">replicate</span><span class="p">(</span>nrep<span class="p">,</span>onesim<span class="p">(</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">,</span>kurty<span class="p">,</span><span class="kp">beta</span><span class="p">))</span>
baz <span class="o"><-</span> <span class="kp">aperm</span><span class="p">(</span>foo<span class="p">,</span><span class="kt">c</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="m">3</span><span class="p">,</span><span class="m">2</span><span class="p">))</span>
<span class="kp">dim</span><span class="p">(</span>baz<span class="p">)</span> <span class="o"><-</span> <span class="kt">c</span><span class="p">(</span>nrep <span class="o">*</span> <span class="kp">length</span><span class="p">(</span>pzeta<span class="p">),</span><span class="kp">dim</span><span class="p">(</span>foo<span class="p">)[</span><span class="m">2</span><span class="p">])</span>
<span class="kp">colnames</span><span class="p">(</span>baz<span class="p">)</span> <span class="o"><-</span> <span class="kp">colnames</span><span class="p">(</span>foo<span class="p">)</span>
<span class="kp">invisible</span><span class="p">(</span><span class="kp">as.data.frame</span><span class="p">(</span>baz<span class="p">))</span>
<span class="p">}</span>
manysim <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>nrep<span class="p">,</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">,</span>kurty<span class="p">,</span>beta<span class="o">=</span><span class="m">0.3</span><span class="p">,</span>nnodes<span class="o">=</span><span class="m">6</span><span class="p">)</span> <span class="p">{</span>
<span class="kr">if</span> <span class="p">(</span>nrep <span class="o">></span> <span class="m">2</span><span class="o">*</span>nnodes<span class="p">)</span> <span class="p">{</span>
<span class="c1"># do in parallel.</span>
nper <span class="o"><-</span> <span class="kp">table</span><span class="p">(</span><span class="m">1</span> <span class="o">+</span> <span class="p">((</span><span class="m">0</span><span class="o">:</span><span class="p">(</span>nrep<span class="m">-1</span><span class="p">)</span> <span class="o">%%</span> nnodes<span class="p">)))</span>
retv <span class="o"><-</span> foreach<span class="p">(</span>i<span class="o">=</span><span class="m">1</span><span class="o">:</span>nnodes<span class="p">,</span><span class="m">.</span>export <span class="o">=</span> <span class="kt">c</span><span class="p">(</span><span class="s">'pzeta'</span><span class="p">,</span><span class="s">'n'</span><span class="p">,</span><span class="s">'p'</span><span class="p">,</span><span class="s">'kurty'</span><span class="p">,</span><span class="s">'beta'</span><span class="p">,</span><span class="s">'repsim'</span><span class="p">,</span><span class="s">'onesim'</span><span class="p">))</span> <span class="o">%dopar%</span> <span class="p">{</span>
repsim<span class="p">(</span>nrep<span class="o">=</span>nper<span class="p">[</span>i<span class="p">],</span>pzeta<span class="o">=</span>pzeta<span class="p">,</span>n<span class="o">=</span>n<span class="p">,</span>p<span class="o">=</span>p<span class="p">,</span>kurty<span class="o">=</span>kurty<span class="p">,</span>beta<span class="o">=</span><span class="kp">beta</span><span class="p">)</span>
<span class="p">}</span> <span class="o">%>%</span>
bind_rows<span class="p">()</span>
<span class="p">}</span> <span class="kr">else</span> <span class="p">{</span>
retv <span class="o"><-</span> repsim<span class="p">(</span>nrep<span class="o">=</span>nrep<span class="p">,</span>pzeta<span class="o">=</span>pzeta<span class="p">,</span>n<span class="o">=</span>n<span class="p">,</span>p<span class="o">=</span>p<span class="p">,</span>kurty<span class="o">=</span>kurty<span class="p">,</span>beta<span class="o">=</span><span class="kp">beta</span><span class="p">)</span>
<span class="p">}</span>
retv
<span class="p">}</span>
<span class="c1"># actually do it many times.</span>
ope <span class="o"><-</span> <span class="m">252</span>
zetasq <span class="o"><-</span> <span class="kt">c</span><span class="p">(</span><span class="m">1</span><span class="o">/</span><span class="m">4</span><span class="p">,</span><span class="m">1</span><span class="p">,</span><span class="m">4</span><span class="p">)</span> <span class="o">/</span> ope
params <span class="o"><-</span> tidyr<span class="o">::</span>crossing<span class="p">(</span>tibble<span class="o">::</span>tribble<span class="p">(</span><span class="o">~</span>n<span class="p">,</span><span class="m">100</span><span class="p">,</span><span class="m">400</span><span class="p">,</span><span class="m">1600</span><span class="p">),</span>
tibble<span class="o">::</span>tribble<span class="p">(</span><span class="o">~</span>kurty<span class="p">,</span><span class="m">1</span><span class="p">,</span><span class="m">8</span><span class="p">,</span><span class="m">16</span><span class="p">),</span>
tibble<span class="o">::</span>tibble<span class="p">(</span>pzeta<span class="o">=</span><span class="kp">sqrt</span><span class="p">(</span>zetasq<span class="p">)),</span>
tibble<span class="o">::</span>tribble<span class="p">(</span><span class="o">~</span>p<span class="p">,</span><span class="m">4</span><span class="p">,</span><span class="m">16</span><span class="p">))</span>
nrep <span class="o"><-</span> <span class="m">10000</span>
<span class="kp">set.seed</span><span class="p">(</span><span class="m">1234</span><span class="p">)</span>
<span class="kp">system.time</span><span class="p">({</span>
results <span class="o"><-</span> params <span class="o">%>%</span>
group_by<span class="p">(</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">,</span>kurty<span class="p">)</span> <span class="o">%>%</span>
summarize<span class="p">(</span>sims<span class="o">=</span><span class="kt">list</span><span class="p">(</span>manysim<span class="p">(</span>nrep<span class="o">=</span>nrep<span class="p">,</span>pzeta<span class="o">=</span>pzeta<span class="p">,</span>n<span class="o">=</span>n<span class="p">,</span>p<span class="o">=</span>p<span class="p">,</span>kurty<span class="o">=</span>kurty<span class="p">)))</span> <span class="o">%>%</span>
ungroup<span class="p">()</span> <span class="o">%>%</span>
tidyr<span class="o">::</span>unnest<span class="p">()</span>
<span class="p">})</span>
</pre></div>
<div class="highlight"><pre><span></span> user system elapsed
2195.52 5716.40 1077.00
</pre></div>
<p>We collect the simulations now:</p>
<div class="highlight"><pre><span></span><span class="c1"># summarize the moments:</span>
sumres <span class="o"><-</span> results <span class="o">%>%</span>
dplyr<span class="o">::</span>select<span class="p">(</span><span class="o">-</span>pzeta1<span class="p">)</span> <span class="o">%>%</span>
arrange<span class="p">(</span>firstv<span class="p">)</span> <span class="o">%>%</span>
group_by<span class="p">(</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">,</span>kurty<span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>firstq<span class="o">=</span>qnorm<span class="p">(</span>ppoints<span class="p">(</span><span class="kp">length</span><span class="p">(</span>firstv<span class="p">))))</span> <span class="o">%>%</span>
ungroup<span class="p">()</span> <span class="o">%>%</span>
arrange<span class="p">(</span>lastv<span class="p">)</span> <span class="o">%>%</span>
group_by<span class="p">(</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">,</span>kurty<span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>lastq<span class="o">=</span>qnorm<span class="p">(</span>ppoints<span class="p">(</span><span class="kp">length</span><span class="p">(</span>lastv<span class="p">))))</span> <span class="o">%>%</span>
ungroup<span class="p">()</span> <span class="o">%>%</span>
mutate<span class="p">(</span>zyr<span class="o">=</span><span class="kp">sqrt</span><span class="p">(</span>ope<span class="p">)</span><span class="o">*</span>pzeta<span class="p">)</span> <span class="o">%>%</span>
rename<span class="p">(</span><span class="sb">`annualized SNR`</span><span class="o">=</span>zyr<span class="p">)</span> <span class="o">%>%</span>
rename<span class="p">(</span><span class="sb">`kurtosis factor`</span><span class="o">=</span>kurty<span class="p">)</span>
</pre></div>
<p>What follows are the Q-Q plots of, first, the first element of the vector
<span class="math">\(Q \Sigma^{\top/2} \hat{\Sigma}^{-1}\hat{\mu}\)</span>, and then the last element.
We have facet columns for <span class="math">\(\zeta\)</span> and <span class="math">\(\kappa\)</span>, and facet rows for
<span class="math">\(p\)</span> and <span class="math">\(n\)</span>. By my eye, these are all fairly encouraging, with near normal
quantiles of the standardized error, except for the <span class="math">\(n=100, p=16\)</span> case.
This suggests that larger sample sizes are required for a larger universe
of assets. Perhaps there are also issues when the kurtosis is very high, as
we see some deviations in the lower right corners of these plots.</p>
<div class="highlight"><pre><span></span><span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span>
ph <span class="o"><-</span> sumres <span class="o">%>%</span>
ggplot<span class="p">(</span>aes<span class="p">(</span>firstq<span class="p">,</span>firstv<span class="p">))</span> <span class="o">+</span>
geom_point<span class="p">()</span> <span class="o">+</span>
geom_abline<span class="p">(</span>slope<span class="o">=</span><span class="m">1</span><span class="p">,</span>intercept<span class="o">=</span><span class="m">0</span><span class="p">)</span> <span class="o">+</span>
facet_grid<span class="p">(</span>p <span class="o">+</span> n <span class="o">~</span> <span class="sb">`annualized SNR`</span><span class="o">+</span><span class="sb">`kurtosis factor`</span><span class="p">,</span>scales<span class="o">=</span><span class="s">'free'</span><span class="p">,</span>labeller<span class="o">=</span>label_both<span class="p">)</span> <span class="o">+</span>
labs<span class="p">(</span>x<span class="o">=</span><span class="s">'theoretical quantiles'</span><span class="p">,</span>
y<span class="o">=</span><span class="s">'empirical quantiles'</span><span class="p">,</span>
title<span class="o">=</span><span class="s">'QQ, first element of the transformed Markowitz portfolio.'</span><span class="p">)</span>
<span class="kp">print</span><span class="p">(</span>ph<span class="p">)</span>
</pre></div>
<p><img src="http://www.gilgamath.com/figure/marko_cov_ellip_firstv_qq_plots-1.png" title="plot of chunk firstv_qq_plots" alt="plot of chunk firstv_qq_plots" width="900px" height="700px" /></p>
<div class="highlight"><pre><span></span><span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span>
ph <span class="o"><-</span> sumres <span class="o">%>%</span>
ggplot<span class="p">(</span>aes<span class="p">(</span>lastq<span class="p">,</span>lastv<span class="p">))</span> <span class="o">+</span>
geom_point<span class="p">()</span> <span class="o">+</span>
geom_abline<span class="p">(</span>slope<span class="o">=</span><span class="m">1</span><span class="p">,</span>intercept<span class="o">=</span><span class="m">0</span><span class="p">)</span> <span class="o">+</span>
facet_grid<span class="p">(</span>p <span class="o">+</span> n <span class="o">~</span> <span class="sb">`annualized SNR`</span><span class="o">+</span><span class="sb">`kurtosis factor`</span><span class="p">,</span>scales<span class="o">=</span><span class="s">'free'</span><span class="p">,</span>labeller<span class="o">=</span>label_both<span class="p">)</span> <span class="o">+</span>
labs<span class="p">(</span>x<span class="o">=</span><span class="s">'theoretical quantiles'</span><span class="p">,</span>
y<span class="o">=</span><span class="s">'empirical quantiles'</span><span class="p">,</span>
title<span class="o">=</span><span class="s">'QQ, last element of the transformed Markowitz portfolio.'</span><span class="p">)</span>
<span class="kp">print</span><span class="p">(</span>ph<span class="p">)</span>
</pre></div>
<p><img src="http://www.gilgamath.com/figure/marko_cov_ellip_lastv_qq_plots-1.png" title="plot of chunk lastv_qq_plots" alt="plot of chunk lastv_qq_plots" width="900px" height="700px" /></p>
<h2>Quantiles of SNR</h2>
<p>Here we are interested in the Signal Noise ratio of the sample Markowitz
portfolio, which takes value
</p>
<div class="math">$$
u = \zeta \frac{e_1^{\top} Q\Sigma^{\top/2}\hat{\Sigma}^{-1}\hat{\mu}}{
\left\Vert Q \Sigma^{\top/2}\hat{\Sigma}^{-1}\hat{\mu} \right\Vert}.
$$</div>
<p>
Asymptotically we can think of this as
</p>
<div class="math">$$
u = \zeta \frac{\zeta + \sigma_1 z_1}{\sqrt{\left(\zeta + \sigma_1 z_1\right)^2
+ \sigma_p^2 \left(z_2^2 + \ldots + z_p^2\right)}},
$$</div>
<p>
where
</p>
<div class="math">$$
\sigma_1 = n^{-1/2}\sqrt{\left(1+\kappa\zeta^2\right) + \left(2\kappa - 1\right)\zeta^2},\quad\mbox{and}\quad
\sigma_p = n^{-1/2}\sqrt{\left(1+\kappa\zeta^2\right)},
$$</div>
<p>
and where the <span class="math">\(z_i\)</span> are independent standard normals.</p>
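<p>This scalar representation is cheap to simulate directly. A sketch (the parameter choices here are mine), which also confirms that <span class="math">\(u\)</span> never exceeds <span class="math">\(\zeta\)</span>:</p>

```r
# Simulate u from the scalar asymptotic representation above.
set.seed(101)
nsim <- 1e5
p <- 4; n <- 400; kurty <- 1
zeta <- 1 / sqrt(252)                  # annualized SNR of 1, in daily units
sig1 <- sqrt(((1 + kurty * zeta^2) + (2 * kurty - 1) * zeta^2) / n)
sigp <- sqrt((1 + kurty * zeta^2) / n)
z1 <- rnorm(nsim)
ssq <- rowSums(matrix(rnorm(nsim * (p - 1)), nsim, p - 1)^2)  # z_2^2+...+z_p^2
num <- zeta + sig1 * z1
u <- zeta * num / sqrt(num^2 + sigp^2 * ssq)
summary(sqrt(252) * u)                 # annualized; bounded above by 1
```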
<p>Now consider the 'TAS' transform, defined as the Tangent of ArcSine, <span class="math">\(f_{TAS}(x) = x / \sqrt{1-x^2}\)</span>. Apply
this transformation to our SNR, with some rescaling
</p>
<div class="math">$$
f_{TAS}\left(\frac{u}{\zeta}\right) =
\frac{\zeta + \sigma_1 z_1}{\sigma_p\sqrt{z_2^2 + \ldots + z_p^2}},
$$</div>
<p>
which looks a lot like a non-central <span class="math">\(t\)</span> random variable, up to scaling.
(I used the same trick in my <a href="https://arxiv.org/abs/1409.5936">paper on portfolio quality bounds</a>.)
So write
</p>
<div class="math">$$
f_{TAS}\left(\frac{u}{\zeta}\right)
= \frac{\sigma_1}{\sigma_p \sqrt{p-1}} \frac{\frac{\zeta}{\sigma_1} + z_1}{\sqrt{z_2^2 + \ldots + z_p^2}/\sqrt{p-1}}
= \frac{\sigma_1}{\sigma_p \sqrt{p-1}} t,
$$</div>
<p>
where <span class="math">\(t\)</span> is a non-central <span class="math">\(t\)</span> random variable with <span class="math">\(p-1\)</span> degrees of freedom
and non-centrality parameter <span class="math">\(\zeta/\sigma_1\)</span>.</p>
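<p>Putting the pieces together, the theoretical quantile function of <span class="math">\(u\)</span> can be sketched as follows (the function name and interface are mine, not from the post's code): compute non-central <span class="math">\(t\)</span> quantiles, rescale, then invert the TAS transform via <span class="math">\(y \mapsto y/\sqrt{1+y^2}\)</span>.</p>

```r
# Theoretical quantiles of the SNR of the sample Markowitz portfolio,
# via the non-central t representation and the inverse TAS transform.
mp_snr_qtile <- function(q, pzeta, n, p, kurty) {
  sig1 <- sqrt(((1 + kurty * pzeta^2) + (2 * kurty - 1) * pzeta^2) / n)
  sigp <- sqrt((1 + kurty * pzeta^2) / n)
  scal <- sig1 / (sigp * sqrt(p - 1))
  tq <- qt(q, df = p - 1, ncp = pzeta / sig1)   # non-central t quantiles
  y <- scal * tq                                # quantiles of f_TAS(u / pzeta)
  pzeta * y / sqrt(1 + y^2)                     # invert f_TAS and rescale
}
# quartiles of the SNR for annualized SNR 1, 400 days, 4 assets, normal returns:
qs <- mp_snr_qtile(c(0.25, 0.5, 0.75), pzeta = 1 / sqrt(252), n = 400, p = 4, kurty = 1)
sqrt(252) * qs                                  # in annualized units
```

Since the rescaling constant is positive and the inverse TAS transform is increasing, quantiles of the non-central <span class="math">\(t\)</span> map monotonically to quantiles of <span class="math">\(u\)</span>.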
<p>Here I test these quantiles briefly. For one setting of <span class="math">\(n, p, \zeta, \kappa\)</span>,
I perform 50000 simulations, then compute theoretical quantiles based on the
non-central <span class="math">\(t\)</span> distribution as above. I then Q-Q plot.</p>
<div class="highlight"><pre><span></span><span class="c1"># simulate the SNR of the sample Markowitz portfolio</span>
<span class="c1"># SNR of (the vector of) pzeta</span>
mp_snr_sim <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">,</span>kurty<span class="p">,</span>beta<span class="o">=</span><span class="m">0.3</span><span class="p">)</span> <span class="p">{</span>
<span class="c1"># create the population</span>
sig <span class="o"><-</span> rWishart<span class="p">(</span><span class="m">1</span><span class="p">,</span>df<span class="o">=</span><span class="m">1000</span><span class="p">,</span>Sigma<span class="o">=</span><span class="p">(</span><span class="m">1</span><span class="o">-</span><span class="kp">beta</span><span class="p">)</span><span class="o">*</span><span class="kp">diag</span><span class="p">(</span>p<span class="p">)</span><span class="o">+</span><span class="kp">beta</span><span class="p">)[,,</span><span class="m">1</span><span class="p">]</span>
<span class="c1"># to simplify our lives, force Q to be the identity</span>
hasig <span class="o"><-</span> <span class="kp">chol</span><span class="p">(</span>sig<span class="p">)</span>
<span class="c1"># don't worry, we rescale it later</span>
e1 <span class="o"><-</span> <span class="kt">c</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="kp">rep</span><span class="p">(</span><span class="m">0</span><span class="p">,</span>p<span class="m">-1</span><span class="p">))</span>
mu <span class="o"><-</span> <span class="kp">t</span><span class="p">(</span>hasig<span class="p">)</span> <span class="o">%*%</span> e1
<span class="c1"># true Markowitz portfolio</span>
pwopt <span class="o"><-</span> <span class="kp">solve</span><span class="p">(</span>sig<span class="p">,</span>mu<span class="p">)</span>
<span class="c1"># true optimal squared sharpe</span>
psrsqopt <span class="o"><-</span> <span class="kp">sum</span><span class="p">(</span>pwopt <span class="o">*</span> mu<span class="p">)</span>
psropt <span class="o"><-</span> <span class="kp">sqrt</span><span class="p">(</span>psrsqopt<span class="p">)</span>
<span class="c1"># rescale mu to achieve pzeta</span>
rescal <span class="o"><-</span> <span class="p">(</span>pzeta <span class="o">/</span> psropt<span class="p">)</span>
mu <span class="o"><-</span> rescal <span class="o">*</span> mu
pwopt <span class="o"><-</span> rescal <span class="o">*</span> pwopt
psropt <span class="o"><-</span> pzeta
psrsqopt <span class="o"><-</span> pzeta<span class="o">^</span><span class="m">2</span>
<span class="c1"># now sample</span>
<span class="c1"># kurty is the kurtosis factor </span>
<span class="c1"># =1 means normal, otherwise use a t distribution</span>
<span class="kr">if</span> <span class="p">(</span>kurty<span class="o">==</span><span class="m">1</span><span class="p">)</span> <span class="p">{</span>
X <span class="o"><-</span> rmvnorm<span class="p">(</span>n<span class="p">,</span>mean<span class="o">=</span>mu<span class="p">,</span>sigma<span class="o">=</span>sig<span class="p">)</span>
<span class="p">}</span> <span class="kr">else</span> <span class="p">{</span>
df <span class="o"><-</span> <span class="m">4</span> <span class="o">+</span> <span class="p">(</span><span class="m">6</span> <span class="o">/</span> <span class="p">(</span>kurty<span class="m">-1</span><span class="p">))</span>
<span class="c1"># for a t distribution, I have to shift the sigma by df / (df-2)</span>
X <span class="o"><-</span> rmvt<span class="p">(</span>n<span class="p">,</span>delta<span class="o">=</span>mu<span class="p">,</span>sigma<span class="o">=</span><span class="p">((</span>df<span class="m">-2</span><span class="p">)</span><span class="o">/</span>df<span class="p">)</span> <span class="o">*</span> sig<span class="p">,</span>type<span class="o">=</span><span class="s">'shifted'</span><span class="p">,</span>df<span class="o">=</span>df<span class="p">)</span>
<span class="p">}</span>
smu1 <span class="o"><-</span> <span class="kp">colMeans</span><span class="p">(</span>X<span class="p">)</span>
ssig <span class="o"><-</span> <span class="p">((</span>n<span class="m">-1</span><span class="p">)</span><span class="o">/</span>n<span class="p">)</span> <span class="o">*</span> cov<span class="p">(</span>X<span class="p">)</span>
swopt <span class="o"><-</span> <span class="kp">solve</span><span class="p">(</span>ssig<span class="p">,</span>smu1<span class="p">)</span>
<span class="c1"># compute the true SNR:</span>
snr <span class="o"><-</span> <span class="kp">sum</span><span class="p">(</span>mu <span class="o">*</span> swopt<span class="p">)</span> <span class="o">/</span> <span class="kp">sqrt</span><span class="p">(</span><span class="kp">t</span><span class="p">(</span>swopt<span class="p">)</span> <span class="o">%*%</span> <span class="p">(</span>sig <span class="o">%*%</span> swopt<span class="p">))</span>
<span class="p">}</span>
params <span class="o"><-</span> <span class="kt">data_frame</span><span class="p">(</span>pzeta<span class="o">=</span><span class="m">2</span><span class="o">/</span><span class="kp">sqrt</span><span class="p">(</span>ope<span class="p">),</span>n<span class="o">=</span><span class="m">10</span><span class="o">*</span>ope<span class="p">,</span>p<span class="o">=</span><span class="m">6</span><span class="p">,</span>kurty<span class="o">=</span><span class="m">4</span><span class="p">)</span>
ope <span class="o"><-</span> <span class="m">252</span>
nrep <span class="o"><-</span> <span class="m">50000</span>
<span class="kp">set.seed</span><span class="p">(</span><span class="m">1234</span><span class="p">)</span>
<span class="kp">system.time</span><span class="p">({</span>
results <span class="o"><-</span> params <span class="o">%>%</span>
group_by<span class="p">(</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">,</span>kurty<span class="p">)</span> <span class="o">%>%</span>
summarize<span class="p">(</span>resu<span class="o">=</span><span class="kt">list</span><span class="p">(</span><span class="kt">data_frame</span><span class="p">(</span>rvs<span class="o">=</span><span class="kp">replicate</span><span class="p">(</span>nrep<span class="p">,</span>mp_snr_sim<span class="p">(</span>pzeta<span class="o">=</span>pzeta<span class="p">,</span>n<span class="o">=</span>n<span class="p">,</span>p<span class="o">=</span>p<span class="p">,</span>kurty<span class="o">=</span>kurty<span class="p">)))))</span> <span class="o">%>%</span>
ungroup<span class="p">()</span> <span class="o">%>%</span>
unnest<span class="p">()</span>
<span class="p">})</span>
</pre></div>
<div class="highlight"><pre><span></span> user system elapsed
188.420 133.344 163.337
</pre></div>
<div class="highlight"><pre><span></span><span class="c1"># invert the TAS function</span>
anti_tas <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>x<span class="p">)</span> <span class="p">{</span> x <span class="o">/</span> <span class="kp">sqrt</span><span class="p">(</span><span class="m">1</span> <span class="o">+</span> x<span class="o">^</span><span class="m">2</span><span class="p">)</span> <span class="p">}</span>
<span class="c1"># here's a function which creates the associated quantile from the noncentral t</span>
qsnrs <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>x<span class="p">,</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">,</span>kurty<span class="p">)</span> <span class="p">{</span>
e1 <span class="o"><-</span> <span class="kt">c</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="m">0</span><span class="p">)</span>
Omegd <span class="o"><-</span> <span class="p">(</span><span class="m">1</span> <span class="o">+</span> kurty <span class="o">*</span> pzeta<span class="o">^</span><span class="m">2</span><span class="p">)</span> <span class="o">+</span> <span class="p">(</span><span class="m">2</span> <span class="o">*</span> kurty <span class="o">-</span> <span class="m">1</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span>pzeta<span class="o">^</span><span class="m">2</span><span class="p">)</span> <span class="o">*</span> e1
sigma_1 <span class="o"><-</span> <span class="kp">sqrt</span><span class="p">(</span>Omegd<span class="p">[</span><span class="m">1</span><span class="p">]</span> <span class="o">/</span> n<span class="p">)</span>
sigma_p <span class="o"><-</span> <span class="kp">sqrt</span><span class="p">(</span>Omegd<span class="p">[</span><span class="m">2</span><span class="p">]</span> <span class="o">/</span> n<span class="p">)</span>
tvals <span class="o"><-</span> qt<span class="p">(</span>x<span class="p">,</span>df<span class="o">=</span>p<span class="m">-1</span><span class="p">,</span>ncp<span class="o">=</span>pzeta <span class="o">/</span> sigma_1<span class="p">)</span>
<span class="c1"># those were t's; bring them back with tas inverse</span>
retv <span class="o"><-</span> pzeta <span class="o">*</span> anti_tas<span class="p">(</span>sigma_1 <span class="o">*</span> tvals <span class="o">/</span> <span class="p">(</span>sigma_p <span class="o">*</span> <span class="kp">sqrt</span><span class="p">(</span>p<span class="m">-1</span><span class="p">)))</span>
<span class="p">}</span>
sumres <span class="o"><-</span> results <span class="o">%>%</span>
arrange<span class="p">(</span>rvs<span class="p">)</span> <span class="o">%>%</span>
group_by<span class="p">(</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">,</span>kurty<span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>qvs<span class="o">=</span>qsnrs<span class="p">(</span>ppoints<span class="p">(</span><span class="kp">length</span><span class="p">(</span>rvs<span class="p">)),</span>pzeta<span class="o">=</span>pzeta<span class="p">,</span>n<span class="o">=</span>n<span class="p">,</span>p<span class="o">=</span>p<span class="p">,</span>kurty<span class="o">=</span>kurty<span class="p">))</span> <span class="o">%>%</span>
ungroup<span class="p">()</span> <span class="o">%>%</span>
mutate<span class="p">(</span>zyr<span class="o">=</span><span class="kp">sqrt</span><span class="p">(</span>ope<span class="p">)</span><span class="o">*</span>pzeta<span class="p">)</span> <span class="o">%>%</span>
rename<span class="p">(</span><span class="sb">`annualized SNR`</span><span class="o">=</span>zyr<span class="p">)</span> <span class="o">%>%</span>
rename<span class="p">(</span><span class="sb">`kurtosis factor`</span><span class="o">=</span>kurty<span class="p">)</span>
</pre></div>
<div class="highlight"><pre><span></span><span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span>
ph <span class="o"><-</span> sumres <span class="o">%>%</span>
mutate<span class="p">(</span>qvs<span class="o">=</span><span class="kp">sqrt</span><span class="p">(</span>ope<span class="p">)</span><span class="o">*</span>qvs<span class="p">,</span>rvs<span class="o">=</span><span class="kp">sqrt</span><span class="p">(</span>ope<span class="p">)</span><span class="o">*</span>rvs<span class="p">)</span> <span class="o">%>%</span>
ggplot<span class="p">(</span>aes<span class="p">(</span>qvs<span class="p">,</span>rvs<span class="p">))</span> <span class="o">+</span>
geom_point<span class="p">()</span> <span class="o">+</span>
geom_abline<span class="p">(</span>slope<span class="o">=</span><span class="m">1</span><span class="p">,</span>intercept<span class="o">=</span><span class="m">0</span><span class="p">)</span> <span class="o">+</span>
facet_grid<span class="p">(</span>p <span class="o">+</span> n <span class="o">~</span> <span class="sb">`annualized SNR`</span><span class="o">+</span><span class="sb">`kurtosis factor`</span><span class="p">,</span>scales<span class="o">=</span><span class="s">'free'</span><span class="p">,</span>labeller<span class="o">=</span>label_both<span class="p">)</span> <span class="o">+</span>
labs<span class="p">(</span>x<span class="o">=</span><span class="s">'theoretical quantiles, annualized SNR'</span><span class="p">,</span>
y<span class="o">=</span><span class="s">'empirical quantiles, annualized SNR'</span><span class="p">,</span>
title<span class="o">=</span><span class="s">'QQ plot, SNR of the sample Markowitz portfolio, 10 years data.'</span><span class="p">)</span>
<span class="kp">print</span><span class="p">(</span>ph<span class="p">)</span>
</pre></div>
<p><img src="http://www.gilgamath.com/figure/marko_cov_ellip_snr_qq_plots-1.png" title="plot of chunk snr_qq_plots" alt="plot of chunk snr_qq_plots" width="900px" height="700px" /></p>
<p>This is rather unfortunate, as it suggests there is still a bug in my code,
or in my derivation of the covariance, or both.</p>
<p><strong>Edit</strong> I did not think to check the simulations above at longer sample
sizes. Indeed, if you assume the portfolio manager has 100 years of daily
data (!), instead of the 10 years assumed above, the approximate
distribution of the signal to noise ratio of the Markowitz portfolio given above
is reasonably accurate, as demonstrated below. So this seems to be another
instance of 'asymptotically' requiring an unreasonably large sample size.</p>
<div class="highlight"><pre><span></span><span class="c1"># once again, but for 100 days of data:</span>
params <span class="o"><-</span> <span class="kt">data_frame</span><span class="p">(</span>pzeta<span class="o">=</span><span class="m">2</span><span class="o">/</span><span class="kp">sqrt</span><span class="p">(</span>ope<span class="p">),</span>n<span class="o">=</span><span class="m">100</span><span class="o">*</span>ope<span class="p">,</span>p<span class="o">=</span><span class="m">6</span><span class="p">,</span>kurty<span class="o">=</span><span class="m">4</span><span class="p">)</span>
ope <span class="o"><-</span> <span class="m">252</span>
nrep <span class="o"><-</span> <span class="m">50000</span>
<span class="kp">set.seed</span><span class="p">(</span><span class="m">1234</span><span class="p">)</span>
<span class="kp">system.time</span><span class="p">({</span>
results <span class="o"><-</span> params <span class="o">%>%</span>
group_by<span class="p">(</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">,</span>kurty<span class="p">)</span> <span class="o">%>%</span>
summarize<span class="p">(</span>resu<span class="o">=</span><span class="kt">list</span><span class="p">(</span><span class="kt">data_frame</span><span class="p">(</span>rvs<span class="o">=</span><span class="kp">replicate</span><span class="p">(</span>nrep<span class="p">,</span>mp_snr_sim<span class="p">(</span>pzeta<span class="o">=</span>pzeta<span class="p">,</span>n<span class="o">=</span>n<span class="p">,</span>p<span class="o">=</span>p<span class="p">,</span>kurty<span class="o">=</span>kurty<span class="p">)))))</span> <span class="o">%>%</span>
ungroup<span class="p">()</span> <span class="o">%>%</span>
unnest<span class="p">()</span>
<span class="p">})</span>
</pre></div>
<div class="highlight"><pre><span></span> user system elapsed
1440.804 977.568 1254.830
</pre></div>
<div class="highlight"><pre><span></span>sumres <span class="o"><-</span> results <span class="o">%>%</span>
arrange<span class="p">(</span>rvs<span class="p">)</span> <span class="o">%>%</span>
group_by<span class="p">(</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">,</span>kurty<span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>qvs<span class="o">=</span>qsnrs<span class="p">(</span>ppoints<span class="p">(</span><span class="kp">length</span><span class="p">(</span>rvs<span class="p">)),</span>pzeta<span class="o">=</span>pzeta<span class="p">,</span>n<span class="o">=</span>n<span class="p">,</span>p<span class="o">=</span>p<span class="p">,</span>kurty<span class="o">=</span>kurty<span class="p">))</span> <span class="o">%>%</span>
ungroup<span class="p">()</span> <span class="o">%>%</span>
mutate<span class="p">(</span>zyr<span class="o">=</span><span class="kp">sqrt</span><span class="p">(</span>ope<span class="p">)</span><span class="o">*</span>pzeta<span class="p">)</span> <span class="o">%>%</span>
rename<span class="p">(</span><span class="sb">`annualized SNR`</span><span class="o">=</span>zyr<span class="p">)</span> <span class="o">%>%</span>
rename<span class="p">(</span><span class="sb">`kurtosis factor`</span><span class="o">=</span>kurty<span class="p">)</span>
ph <span class="o"><-</span> sumres <span class="o">%>%</span>
mutate<span class="p">(</span>qvs<span class="o">=</span><span class="kp">sqrt</span><span class="p">(</span>ope<span class="p">)</span><span class="o">*</span>qvs<span class="p">,</span>rvs<span class="o">=</span><span class="kp">sqrt</span><span class="p">(</span>ope<span class="p">)</span><span class="o">*</span>rvs<span class="p">)</span> <span class="o">%>%</span>
ggplot<span class="p">(</span>aes<span class="p">(</span>qvs<span class="p">,</span>rvs<span class="p">))</span> <span class="o">+</span>
geom_point<span class="p">()</span> <span class="o">+</span>
geom_abline<span class="p">(</span>slope<span class="o">=</span><span class="m">1</span><span class="p">,</span>intercept<span class="o">=</span><span class="m">0</span><span class="p">)</span> <span class="o">+</span>
facet_grid<span class="p">(</span>p <span class="o">+</span> n <span class="o">~</span> <span class="sb">`annualized SNR`</span><span class="o">+</span><span class="sb">`kurtosis factor`</span><span class="p">,</span>scales<span class="o">=</span><span class="s">'free'</span><span class="p">,</span>labeller<span class="o">=</span>label_both<span class="p">)</span> <span class="o">+</span>
labs<span class="p">(</span>x<span class="o">=</span><span class="s">'theoretical quantiles, annualized SNR'</span><span class="p">,</span>
y<span class="o">=</span><span class="s">'empirical quantiles, annualized SNR'</span><span class="p">,</span>
title<span class="o">=</span><span class="s">'QQ plot, SNR of the sample Markowitz portfolio, 100 years data.'</span><span class="p">)</span>
<span class="kp">print</span><span class="p">(</span>ph<span class="p">)</span>
</pre></div>
<p><img src="http://www.gilgamath.com/figure/marko_cov_ellip_snr_qq_plots_100-1.png" title="plot of chunk snr_qq_plots_100" alt="plot of chunk snr_qq_plots_100" width="900px" height="700px" /></p>
<script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#$@#')) {
var align = "center",
indent = "0em",
linebreak = "false";
if (false) {
align = (screen.width < 768) ? "left" : align;
indent = (screen.width < 768) ? "0em" : indent;
linebreak = (screen.width < 768) ? 'true' : linebreak;
}
var mathjaxscript = document.createElement('script');
mathjaxscript.id = 'mathjaxscript_pelican_#%@#$@#';
mathjaxscript.type = 'text/javascript';
mathjaxscript.src = 'https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML';
mathjaxscript[(window.opera ? "innerHTML" : "text")] =
"MathJax.Hub.Config({" +
" config: ['MMLorHTML.js']," +
" TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' } }," +
" jax: ['input/TeX','input/MathML','output/HTML-CSS']," +
" extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," +
" displayAlign: '"+ align +"'," +
" displayIndent: '"+ indent +"'," +
" showMathMenu: true," +
" messageStyle: 'normal'," +
" tex2jax: { " +
" inlineMath: [ ['\\\\(','\\\\)'] ], " +
" displayMath: [ ['$$','$$'] ]," +
" processEscapes: true," +
" preview: 'TeX'," +
" }, " +
" 'HTML-CSS': { " +
" styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! important'} }," +
" linebreaks: { automatic: "+ linebreak +", width: '90% container' }," +
" }, " +
"}); " +
"if ('default' !== 'default') {" +
"MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" +
"var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" +
"VARIANT['normal'].fonts.unshift('MathJax_default');" +
"VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
"VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
"});" +
"MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" +
"var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" +
"VARIANT['normal'].fonts.unshift('MathJax_default');" +
"VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
"VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
"});" +
"}";
(document.body || document.getElementsByTagName('head')[0]).appendChild(mathjaxscript);
}
</script>A Lack of Confidence Interval2018-02-15T21:58:58-08:002018-02-15T21:58:58-08:00Steven E. Pavtag:www.gilgamath.com,2018-02-15:/bad-cis.html<p>For some years now I have been playing around with a certain problem
in portfolio statistics: suppose you observe <span class="math">\(n\)</span> independent observations
of a <span class="math">\(p\)</span> vector of returns, then form the Markowitz portfolio based on
those returns. What then is the distribution of what I call the 'signal to
noise ratio' of that Markowitz portfolio, defined as the true expected
return divided by the true volatility. That is, if <span class="math">\(\nu\)</span> is the Markowitz
portfolio, built on a sample, its 'SNR' is <span class="math">\(\nu^{\top}\mu /
\sqrt{\nu^{\top}\Sigma \nu}\)</span>, where <span class="math">\(\mu\)</span> is the population mean vector, and
<span class="math">\(\Sigma\)</span> is the population covariance matrix.
<!-- PELICAN_END_SUMMARY --></p>
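<p>To make the definition concrete, here is a minimal R sketch; the two-asset
population <code>mu</code> and <code>Sigma</code> below are made up purely for
illustration. Even with the population held fixed, the SNR changes with every
resample of <code>X</code>, which is what makes it a random quantity.</p>
<div class="highlight"><pre><span></span># hypothetical two-asset example; in practice mu and Sigma are unobservable
set.seed(101)
p <- 2
mu <- c(0.5, 0.3) / sqrt(252)   # population mean returns, daily units
Sigma <- diag(p)                # population covariance
n <- 252
# one year of simulated daily returns
X <- matrix(rnorm(n * p), ncol=p) %*% chol(Sigma) + matrix(mu, n, p, byrow=TRUE)
# the Markowitz portfolio is built from the *sample* moments
nu <- solve(cov(X), colMeans(X))
# but its SNR is measured against the *population* mu and Sigma
snr <- sum(nu * mu) / sqrt(sum(nu * (Sigma %*% nu)))
</pre></div>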
<p>This is an odd problem, somewhat unlike classical statistical inference, because the
unknown quantity, the SNR, depends not only on population parameters but also on the
sample: it is both random and unknown. What you learn in a basic statistics class is
inference on fixed unknowns. (Actually, I never really took a basic statistics
class, but I think that's right.)</p>
<p>Paulsen and Sohl made some progress on this problem in their 2016 paper on what
they call the
<a href="https://arxiv.org/abs/1602.06186">Sharpe Ratio Information Criterion.</a>
They find a sample statistic which is unbiased for the portfolio SNR when
returns are (multivariate) Gaussian. In my mad scribblings on the backs of
envelopes and scrap paper, I have been trying to find the <em>distribution</em> of the SNR.
I have been looking for this love, as they say, in all the wrong places,
usually hoping for some clever transformation that will lead to a slick proof.
(I was taught from a young age to look for slick proofs.) </p>
<p>Having failed that mission, I pivoted to looking for confidence intervals for
the SNR (and maybe even <em>prediction intervals</em> on the out-of-sample Sharpe ratio
of the in-sample Markowitz portfolio). I realized that some of the work I had
done on the
<a href="https://arxiv.org/abs/1312.0557">Asymptotic distribution of the Markowitz Portfolio</a>
could be adapted to this purpose. In fact I had even done much of the work when
I made the last revision to that paper, back in 2015. This would not be a slick
excursion, however, but rather hand-to-hand combat with lots of matrices and the
covariances of matrices; Kronecker products of Greek letters; a whole hairball of
symmetrizing matrices that only Jan Magnus would love. And I got a result, and
I pushed it out as version 4 of the paper.</p>
<p>And yet it sucks, to paraphrase Galileo. The confidence intervals are
plainly terrible. Below I perform a number of simulations, for 100 to
12800 days (about 50 years!) of daily data, for 2 to 16 stocks, and for true optimal
<em>squared</em> SNR ranging from 0.125 to 4 per year. At each setting I perform 25K
simulations, drawing from a Gaussian population. (As an efficiency hack, for
each setting of the number of stocks and days, the whole spectrum of optimal
SNRs is tested simultaneously, and so the errors will not be independent.)
<p>I then compute the lower 0.05 asymptotic confidence interval as quoted in the paper,
based on the difference between the SNR and the observed in-sample optimal
Sharpe (which is related to Hotelling's <span class="math">\(T^2\)</span> statistic). I compute another
confidence interval based on the ratio of those two statistics. These
confidence intervals are computed most optimistically, using the unknown
population maximal SNR, as if one were clairvoyant, rather than some sample
estimate. I compute a third confidence interval which is the average of the
additive and geometric intervals. </p>
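<p>For concreteness, the three intervals have roughly the following shape. This is
only a schematic sketch, not the paper's exact construction: <code>se_diff</code> and
<code>se_ratio</code> are hypothetical stand-ins for the asymptotic standard errors of
the difference and log-ratio statistics, whose actual values come from the
asymptotic covariance derived in the paper.</p>
<div class="highlight"><pre><span></span># schematic only: the real standard errors are derived in the paper
lower_ci <- function(szeta, se_diff, se_ratio, alpha=0.05) {
  zq <- qnorm(1 - alpha)
  add_lo <- szeta - zq * se_diff          # from the difference statistic
  geo_lo <- szeta * exp(-zq * se_ratio)   # from the (log) ratio statistic
  c(additive=add_lo, geometric=geo_lo, blended=(add_lo + geo_lo) / 2)
}
</pre></div>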
<div class="highlight"><pre><span></span><span class="kp">suppressMessages</span><span class="p">({</span>
<span class="kn">library</span><span class="p">(</span>dplyr<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>tidyr<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>tibble<span class="p">)</span>
<span class="c1"># https://cran.r-project.org/web/packages/doFuture/vignettes/doFuture.html</span>
<span class="kn">library</span><span class="p">(</span>doFuture<span class="p">)</span>
registerDoFuture<span class="p">()</span>
plan<span class="p">(</span>multiprocess<span class="p">)</span>
<span class="p">})</span>
<span class="c1"># one simulation of n periods of data on p assets with true optimal</span>
<span class="c1"># SNR of (the vector of) pzeta</span>
onesim <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">)</span> <span class="p">{</span>
pmus <span class="o"><-</span> pzeta <span class="o">/</span> <span class="kp">sqrt</span><span class="p">(</span>p<span class="p">)</span>
<span class="c1"># simulate an X: too slow.</span>
<span class="c1">#X <- matrix(rnorm(n*p,mean=pmus[1],sd=1),ncol=p)</span>
<span class="c1">#smu1 <- colMeans(X)</span>
<span class="c1">#ssig <- ((n-1)/n) * cov(X)</span>
<span class="c1"># this is faster:</span>
smu1 <span class="o"><-</span> rnorm<span class="p">(</span>p<span class="p">,</span>mean<span class="o">=</span>pmus<span class="p">[</span><span class="m">1</span><span class="p">],</span>sd<span class="o">=</span><span class="m">1</span> <span class="o">/</span> <span class="kp">sqrt</span><span class="p">(</span>n<span class="p">))</span>
ssig <span class="o"><-</span> rWishart<span class="p">(</span><span class="m">1</span><span class="p">,</span>df<span class="o">=</span>n<span class="m">-1</span><span class="p">,</span>Sigma<span class="o">=</span><span class="kp">diag</span><span class="p">(</span><span class="m">1</span><span class="p">,</span>ncol<span class="o">=</span>p<span class="p">,</span>nrow<span class="o">=</span>p<span class="p">))</span> <span class="o">/</span> n <span class="c1"># sic n</span>
<span class="kp">dim</span><span class="p">(</span>ssig<span class="p">)</span> <span class="o"><-</span> <span class="kt">c</span><span class="p">(</span>p<span class="p">,</span>p<span class="p">)</span>
smus <span class="o"><-</span> <span class="kp">outer</span><span class="p">(</span>smu1<span class="p">,</span>pmus <span class="o">-</span> pmus<span class="p">[</span><span class="m">1</span><span class="p">],</span>FUN<span class="o">=</span><span class="s">'+'</span><span class="p">)</span>
smps <span class="o"><-</span> <span class="kp">solve</span><span class="p">(</span>ssig<span class="p">,</span>smus<span class="p">)</span>
szeta <span class="o"><-</span> <span class="kp">sqrt</span><span class="p">(</span><span class="kp">colSums</span><span class="p">(</span>smus <span class="o">*</span> smps<span class="p">))</span>
psnr <span class="o"><-</span> pmus <span class="o">*</span> <span class="kp">as.numeric</span><span class="p">(</span><span class="kp">colSums</span><span class="p">(</span>smps<span class="p">)</span> <span class="o">/</span> <span class="kp">sqrt</span><span class="p">(</span><span class="kp">colSums</span><span class="p">(</span>smps<span class="o">^</span><span class="m">2</span><span class="p">)))</span>
<span class="kp">cbind</span><span class="p">(</span>pzeta<span class="p">,</span>szeta<span class="p">,</span>psnr<span class="p">)</span>
<span class="p">}</span>
<span class="c1"># do that many times.</span>
repsim <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>nrep<span class="p">,</span>zetas<span class="p">,</span>n<span class="p">,</span>p<span class="p">)</span> <span class="p">{</span>
foo <span class="o"><-</span> <span class="kp">replicate</span><span class="p">(</span>nrep<span class="p">,</span>onesim<span class="p">(</span>pzeta<span class="o">=</span>zetas<span class="p">,</span>n<span class="p">,</span>p<span class="p">))</span>
baz <span class="o"><-</span> <span class="kp">aperm</span><span class="p">(</span>foo<span class="p">,</span><span class="kt">c</span><span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="m">3</span><span class="p">,</span><span class="m">2</span><span class="p">))</span>
<span class="kp">dim</span><span class="p">(</span>baz<span class="p">)</span> <span class="o"><-</span> <span class="kt">c</span><span class="p">(</span>nrep <span class="o">*</span> <span class="kp">length</span><span class="p">(</span>zetas<span class="p">),</span><span class="kp">dim</span><span class="p">(</span>foo<span class="p">)[</span><span class="m">2</span><span class="p">])</span>
<span class="kp">colnames</span><span class="p">(</span>baz<span class="p">)</span> <span class="o"><-</span> <span class="kp">colnames</span><span class="p">(</span>foo<span class="p">)</span>
<span class="kp">invisible</span><span class="p">(</span><span class="kp">as.data.frame</span><span class="p">(</span>baz<span class="p">))</span>
<span class="p">}</span>
manysim <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>nrep<span class="p">,</span>zetas<span class="p">,</span>n<span class="p">,</span>p<span class="p">,</span>nnodes<span class="o">=</span><span class="m">7</span><span class="p">)</span> <span class="p">{</span>
<span class="kr">if</span> <span class="p">(</span>nrep <span class="o">></span> <span class="m">4</span><span class="o">*</span>nnodes<span class="p">)</span> <span class="p">{</span>
<span class="c1"># do in parallel.</span>
nper <span class="o"><-</span> <span class="kp">table</span><span class="p">(</span><span class="m">1</span> <span class="o">+</span> <span class="p">((</span><span class="m">0</span><span class="o">:</span><span class="p">(</span>nrep<span class="m">-1</span><span class="p">)</span> <span class="o">%%</span> nnodes<span class="p">)))</span>
retv <span class="o"><-</span> foreach<span class="p">(</span>i<span class="o">=</span><span class="m">1</span><span class="o">:</span>nnodes<span class="p">,</span><span class="m">.</span>export <span class="o">=</span> <span class="kt">c</span><span class="p">(</span><span class="s">'zetas'</span><span class="p">,</span><span class="s">'n'</span><span class="p">,</span><span class="s">'p'</span><span class="p">,</span><span class="s">'repsim'</span><span class="p">,</span><span class="s">'onesim'</span><span class="p">))</span> <span class="o">%dopar%</span> <span class="p">{</span>
repsim<span class="p">(</span>nrep<span class="o">=</span>nper<span class="p">[</span>i<span class="p">],</span>zetas<span class="o">=</span>zetas<span class="p">,</span>n<span class="o">=</span>n<span class="p">,</span>p<span class="o">=</span>p<span class="p">)</span>
<span class="p">}</span> <span class="o">%>%</span>
bind_rows<span class="p">()</span>
<span class="p">}</span> <span class="kr">else</span> <span class="p">{</span>
retv <span class="o"><-</span> repsim<span class="p">(</span>nrep<span class="o">=</span>nrep<span class="p">,</span>zetas<span class="o">=</span>zetas<span class="p">,</span>n<span class="o">=</span>n<span class="p">,</span>p<span class="o">=</span>p<span class="p">)</span>
<span class="p">}</span>
retv
<span class="p">}</span>
<span class="c1"># actually do it many times.</span>
ope <span class="o"><-</span> <span class="m">252</span>
zetasq <span class="o"><-</span> <span class="kt">c</span><span class="p">(</span><span class="m">1</span><span class="o">/</span><span class="m">8</span><span class="p">,</span><span class="m">1</span><span class="o">/</span><span class="m">4</span><span class="p">,</span><span class="m">1</span><span class="o">/</span><span class="m">2</span><span class="p">,</span><span class="m">1</span><span class="p">,</span><span class="m">2</span><span class="p">,</span><span class="m">4</span><span class="p">)</span> <span class="o">/</span> ope
zeta <span class="o"><-</span> <span class="kp">sqrt</span><span class="p">(</span>zetasq<span class="p">)</span>
params <span class="o"><-</span> tidyr<span class="o">::</span>crossing<span class="p">(</span>tibble<span class="o">::</span>tribble<span class="p">(</span><span class="o">~</span>n<span class="p">,</span><span class="m">100</span><span class="p">,</span><span class="m">200</span><span class="p">,</span><span class="m">400</span><span class="p">,</span><span class="m">800</span><span class="p">,</span><span class="m">1600</span><span class="p">,</span><span class="m">3200</span><span class="p">,</span><span class="m">6400</span><span class="p">,</span><span class="m">12800</span><span class="p">),</span>
tibble<span class="o">::</span>tribble<span class="p">(</span><span class="o">~</span>p<span class="p">,</span><span class="m">2</span><span class="p">,</span><span class="m">4</span><span class="p">,</span><span class="m">8</span><span class="p">,</span><span class="m">16</span><span class="p">))</span>
nrep <span class="o"><-</span> <span class="m">25000</span>
<span class="kp">set.seed</span><span class="p">(</span><span class="m">2356</span><span class="p">)</span>
<span class="kp">system.time</span><span class="p">({</span>
results <span class="o"><-</span> params <span class="o">%>%</span>
group_by<span class="p">(</span>n<span class="p">,</span>p<span class="p">)</span> <span class="o">%>%</span>
summarize<span class="p">(</span>sims<span class="o">=</span><span class="kt">list</span><span class="p">(</span>manysim<span class="p">(</span>nrep<span class="o">=</span>nrep<span class="p">,</span>zetas<span class="o">=</span>zeta<span class="p">,</span>n<span class="o">=</span>n<span class="p">,</span>p<span class="o">=</span>p<span class="p">)))</span> <span class="o">%>%</span>
ungroup<span class="p">()</span> <span class="o">%>%</span>
tidyr<span class="o">::</span>unnest<span class="p">()</span>
<span class="p">})</span>
</pre></div>
<div class="highlight"><pre><span></span>## user system elapsed
## 529.416 1747.076 311.907
</pre></div>
<div class="highlight"><pre><span></span>kurty <span class="o"><-</span> <span class="m">1</span>
<span class="c1"># summarize the moments:</span>
sumres <span class="o"><-</span> results <span class="o">%>%</span>
mutate<span class="p">(</span>diffo<span class="o">=</span>psnr <span class="o">-</span> szeta<span class="p">)</span> <span class="o">%>%</span>
group_by<span class="p">(</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">)</span> <span class="o">%>%</span>
summarize<span class="p">(</span>emp_mean_szeta<span class="o">=</span><span class="kp">mean</span><span class="p">(</span>szeta<span class="p">),</span>emp_var_szeta<span class="o">=</span>var<span class="p">(</span>szeta<span class="p">),</span>
emp_mean_szetasq<span class="o">=</span><span class="kp">mean</span><span class="p">(</span>szeta<span class="o">^</span><span class="m">2</span><span class="p">),</span>emp_var_szetasq<span class="o">=</span>var<span class="p">(</span>szeta<span class="o">^</span><span class="m">2</span><span class="p">),</span>
emp_mean_psnr<span class="o">=</span><span class="kp">mean</span><span class="p">(</span>psnr<span class="p">),</span>emp_var_psnr<span class="o">=</span>var<span class="p">(</span>psnr<span class="p">),</span>
emp_mean_diffo<span class="o">=</span><span class="kp">mean</span><span class="p">(</span>diffo<span class="p">),</span>emp_var_diffo<span class="o">=</span>var<span class="p">(</span>diffo<span class="p">))</span> <span class="o">%>%</span>
ungroup<span class="p">()</span> <span class="o">%>%</span>
mutate<span class="p">(</span>bit1 <span class="o">=</span> <span class="p">(</span>kurty<span class="o">*</span>pzeta<span class="o">^</span><span class="m">2</span> <span class="o">+</span> <span class="m">1</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="m">1</span> <span class="o">-</span> p<span class="p">),</span>
bit2 <span class="o">=</span> <span class="p">(</span><span class="m">3</span> <span class="o">*</span> kurty <span class="o">-</span> <span class="m">1</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span>pzeta<span class="o">^</span><span class="m">2</span><span class="o">/</span><span class="m">4</span><span class="p">)</span> <span class="o">+</span> <span class="m">1</span><span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>thr_mean_psnr<span class="o">=</span>pzeta <span class="o">+</span> bit1 <span class="o">/</span> <span class="p">(</span><span class="m">2</span> <span class="o">*</span> pzeta <span class="o">*</span> n<span class="p">))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>thr_mean_szeta<span class="o">=</span>pzeta <span class="o">-</span> bit2 <span class="o">/</span> <span class="p">(</span><span class="m">2</span> <span class="o">*</span> pzeta <span class="o">*</span> n<span class="p">))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>thr_mean_diffo<span class="o">=</span><span class="p">(</span>bit1 <span class="o">+</span> bit2<span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="m">2</span> <span class="o">*</span> pzeta <span class="o">*</span> n<span class="p">))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>thr_mean_weird<span class="o">=-</span><span class="p">(</span><span class="kc">pi</span><span class="o">/</span><span class="m">4</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span>thr_mean_psnr <span class="o">-</span> pzeta<span class="p">)</span> <span class="o">/</span> pzeta<span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>thr_var_szeta<span class="o">=</span>bit2 <span class="o">/</span> n<span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>thr_var_szetasq<span class="o">=</span><span class="p">(</span><span class="m">4</span> <span class="o">*</span> pzeta<span class="o">^</span><span class="m">2</span> <span class="o">+</span> <span class="m">2</span> <span class="o">*</span> pzeta<span class="o">^</span><span class="m">4</span><span class="p">)</span> <span class="o">/</span> n<span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>thr_var_diffo<span class="o">=</span>thr_var_szeta<span class="p">)</span> <span class="o">%>%</span>
tidyr<span class="o">::</span>gather<span class="p">(</span>key<span class="o">=</span>series<span class="p">,</span>value<span class="o">=</span>value<span class="p">,</span>matches<span class="p">(</span><span class="s">'^(thr|emp)_'</span><span class="p">))</span> <span class="o">%>%</span>
tidyr<span class="o">::</span>separate<span class="p">(</span>series<span class="p">,</span>into<span class="o">=</span><span class="kt">c</span><span class="p">(</span><span class="s">'flavor'</span><span class="p">,</span><span class="s">'stat'</span><span class="p">,</span><span class="s">'metric'</span><span class="p">))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>flavor<span class="o">=</span>case_when<span class="p">(</span><span class="m">.</span><span class="o">$</span>flavor<span class="o">==</span><span class="s">'emp'</span> <span class="o">~</span> <span class="s">'empirical'</span><span class="p">,</span>
<span class="m">.</span><span class="o">$</span>flavor<span class="o">==</span><span class="s">'thr'</span> <span class="o">~</span> <span class="s">'theoretical'</span><span class="p">))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>zyr<span class="o">=</span><span class="kp">signif</span><span class="p">(</span>pzeta <span class="o">*</span> <span class="kp">sqrt</span><span class="p">(</span>ope<span class="p">),</span>digits<span class="o">=</span><span class="m">2</span><span class="p">))</span>
<span class="c1"># confidence intervals and coverage:</span>
cires <span class="o"><-</span> results <span class="o">%>%</span>
mutate<span class="p">(</span>bit1 <span class="o">=</span> <span class="p">(</span>kurty<span class="o">*</span>pzeta<span class="o">^</span><span class="m">2</span> <span class="o">+</span> <span class="m">1</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="m">1</span> <span class="o">-</span> p<span class="p">),</span>
bit2 <span class="o">=</span> <span class="p">(</span><span class="m">3</span> <span class="o">*</span> kurty <span class="o">-</span> <span class="m">1</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span>pzeta<span class="o">^</span><span class="m">2</span><span class="o">/</span><span class="m">4</span><span class="p">)</span> <span class="o">+</span> <span class="m">1</span><span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>ci_add <span class="o">=</span> szeta <span class="o">+</span> <span class="p">((</span>bit1 <span class="o">+</span> bit2<span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="m">2</span> <span class="o">*</span> n <span class="o">*</span> pzeta<span class="p">))</span> <span class="o">+</span> qnorm<span class="p">(</span><span class="m">0.05</span><span class="p">)</span> <span class="o">*</span> <span class="kp">sqrt</span><span class="p">(</span>bit2<span class="o">/</span>n<span class="p">))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>ci_div <span class="o">=</span> szeta <span class="o">*</span> <span class="p">(</span><span class="m">1</span> <span class="o">+</span> <span class="p">((</span>bit1 <span class="o">+</span> <span class="m">3</span> <span class="o">*</span> bit2<span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="m">2</span> <span class="o">*</span> n <span class="o">*</span> pzeta <span class="o">*</span> pzeta<span class="p">))</span> <span class="o">+</span> qnorm<span class="p">(</span><span class="m">0.05</span><span class="p">)</span> <span class="o">*</span> <span class="kp">sqrt</span><span class="p">(</span>bit2 <span class="o">/</span> <span class="p">(</span>n<span class="o">*</span>pzeta<span class="o">*</span>pzeta<span class="p">))))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>ci_com <span class="o">=</span> <span class="m">0.5</span> <span class="o">*</span> <span class="p">(</span>ci_add <span class="o">+</span> ci_div<span class="p">))</span> <span class="o">%>%</span>
group_by<span class="p">(</span>pzeta<span class="p">,</span>n<span class="p">,</span>p<span class="p">)</span> <span class="o">%>%</span>
summarize<span class="p">(</span>type1_add <span class="o">=</span> <span class="kp">mean</span><span class="p">(</span>psnr <span class="o"><</span> ci_add<span class="p">),</span>
type1_div <span class="o">=</span> <span class="kp">mean</span><span class="p">(</span>psnr <span class="o"><</span> ci_div<span class="p">),</span>
type1_com <span class="o">=</span> <span class="kp">mean</span><span class="p">(</span>psnr <span class="o"><</span> ci_com<span class="p">))</span> <span class="o">%>%</span>
ungroup<span class="p">()</span> <span class="o">%>%</span>
mutate<span class="p">(</span>zyr<span class="o">=</span><span class="kp">signif</span><span class="p">(</span>pzeta <span class="o">*</span> <span class="kp">sqrt</span><span class="p">(</span>ope<span class="p">),</span>digits<span class="o">=</span><span class="m">2</span><span class="p">))</span>
</pre></div>
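<p>Spelled out, the theoretical approximations computed above are, writing
<span class="math">\(\kappa\)</span> for the kurtosis factor (<code>kurty</code>, set to 1 here),
<span class="math">\(b_1 = \left(\kappa\zeta^2 + 1\right)\left(1 - p\right)\)</span> and
<span class="math">\(b_2 = \left(3\kappa - 1\right)\zeta^2/4 + 1\)</span>:</p>
<div class="math">$$
\operatorname{E}\left[\hat{\zeta}\right] \approx \zeta - \frac{b_2}{2 \zeta n},\qquad
\operatorname{Var}\left[\hat{\zeta}\right] \approx \frac{b_2}{n},\qquad
\operatorname{E}\left[\psi\right] \approx \zeta + \frac{b_1}{2 \zeta n},
$$</div>
<p>where <span class="math">\(\hat{\zeta}\)</span> is the sample maximal SNR (<code>szeta</code>) and
<span class="math">\(\psi\)</span> is the achieved SNR of the sample Markowitz portfolio (<code>psnr</code>).
The two one-sided intervals are then</p>
<div class="math">$$
\mathrm{ci}_{add} = \hat{\zeta} + \frac{b_1 + b_2}{2 n \zeta} + z_{0.05}\sqrt{\frac{b_2}{n}},\qquad
\mathrm{ci}_{div} = \hat{\zeta}\left(1 + \frac{b_1 + 3 b_2}{2 n \zeta^2} + z_{0.05}\sqrt{\frac{b_2}{n \zeta^2}}\right),
$$</div>
<p>with <span class="math">\(z_{0.05}\)</span> the normal 5% quantile. Note that both intervals lean on
the population <span class="math">\(\zeta\)</span>, which is the "clairvoyance" mentioned in the plot title.</p>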
<div class="highlight"><pre><span></span><span class="c1"># plot CIs:</span>
<span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span>
ph <span class="o"><-</span> cires <span class="o">%>%</span>
tidyr<span class="o">::</span>gather<span class="p">(</span>key<span class="o">=</span>type<span class="p">,</span>value<span class="o">=</span>type1<span class="p">,</span>matches<span class="p">(</span><span class="s">'^type1_'</span><span class="p">))</span> <span class="o">%>%</span>
ggplot<span class="p">(</span>aes<span class="p">(</span>n<span class="p">,</span>type1<span class="p">,</span>color<span class="o">=</span>type<span class="p">))</span> <span class="o">+</span>
geom_line<span class="p">()</span> <span class="o">+</span> geom_point<span class="p">()</span> <span class="o">+</span>
facet_grid<span class="p">(</span>p <span class="o">~</span> zyr<span class="p">,</span>scales<span class="o">=</span><span class="s">'free'</span><span class="p">)</span> <span class="o">+</span>
scale_x_log10<span class="p">()</span> <span class="o">+</span>
geom_hline<span class="p">(</span>yintercept<span class="o">=</span><span class="m">0.05</span><span class="p">,</span>linetype<span class="o">=</span><span class="m">2</span><span class="p">)</span> <span class="o">+</span>
labs<span class="p">(</span>x<span class="o">=</span><span class="s">'number of days data'</span><span class="p">,</span>
y<span class="o">=</span><span class="s">'empirical type I rates at nominal 0.05 level'</span><span class="p">,</span>
title<span class="o">=</span><span class="s">'theoretical and empirical coverage of 0.05 CIs on SNR of Markowitz Portfolio, using some clairvoyance'</span><span class="p">)</span>
<span class="kp">print</span><span class="p">(</span>ph<span class="p">)</span>
</pre></div>
<p><img src="http://www.gilgamath.com/figure/ci_plots-1.png" title="plot of chunk ci_plots" alt="plot of chunk ci_plots" width="900px" height="700px" /></p>
<p>I suppose that conceivably, the confidence intervals will <em>asymptotically</em>
achieve the nominal coverage. After all, I make no claims about the rate of
convergence in the sample size. And there is no reason to think the rate of
convergence is uniform across population parameters. However, nobody has enough
data to make these confidence intervals even remotely trustworthy;
they're a load of dingo's kidneys.</p>
<p>Looking forward, you can plainly see there are a few problems with these
results. For one, the asymptotic distribution of Hotelling's <span class="math">\(T^2\)</span>, which I am
effectively using here, has terrible performance for the small <span class="math">\(n\)</span> case.
Well-known distributional results based on the non-central <span class="math">\(F\)</span> should be used
instead. However, the distribution of the Markowitz SNR itself is also pretty
poor, as shown in the plots below, using the same simulations. Sure, these
approximations might work asymptotically (and I have my doubts), but they are
useless in our world. </p>
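<p>For reference, the exact small-sample result I am alluding to: under Gaussian returns,
with <span class="math">\(\hat{\zeta}^2\)</span> the maximal squared sample SNR on <span class="math">\(p\)</span> assets over
<span class="math">\(n\)</span> days, the classical Hotelling connection <span class="math">\(T^2 = n\hat{\zeta}^2\)</span>
gives (assuming I have the constants right)</p>
<div class="math">$$
\frac{n - p}{p\left(n - 1\right)}\, n \hat{\zeta}^2 \sim F\left(p,\, n - p;\, n \zeta^2\right),
$$</div>
<p>a non-central <span class="math">\(F\)</span> with <span class="math">\(p\)</span> and <span class="math">\(n-p\)</span> degrees of freedom
and non-centrality parameter <span class="math">\(n\zeta^2\)</span>.</p>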
<div class="highlight"><pre><span></span><span class="c1"># plot mean snr:</span>
<span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span>
ph <span class="o"><-</span> sumres <span class="o">%>%</span>
filter<span class="p">(</span>metric<span class="o">==</span><span class="s">'psnr'</span><span class="p">,</span>stat<span class="o">==</span><span class="s">'mean'</span><span class="p">)</span> <span class="o">%>%</span>
ggplot<span class="p">(</span>aes<span class="p">(</span>n<span class="p">,</span>value<span class="p">,</span>color<span class="o">=</span>flavor<span class="p">))</span> <span class="o">+</span>
geom_line<span class="p">()</span> <span class="o">+</span> geom_point<span class="p">()</span> <span class="o">+</span>
facet_grid<span class="p">(</span>p <span class="o">~</span> zyr<span class="p">,</span>scales<span class="o">=</span><span class="s">'free'</span><span class="p">)</span> <span class="o">+</span>
scale_x_log10<span class="p">()</span> <span class="o">+</span>
labs<span class="p">(</span>y<span class="o">=</span><span class="s">'mean SNR'</span><span class="p">,</span>x<span class="o">=</span><span class="s">'number of days data'</span><span class="p">,</span>
title<span class="o">=</span><span class="s">'theoretical and empirical mean SNR of sample Markowitz Portfolio'</span><span class="p">)</span>
<span class="kp">print</span><span class="p">(</span>ph<span class="p">)</span>
</pre></div>
<p><img src="http://www.gilgamath.com/figure/mean_mp_snr_plots-1.png" title="plot of chunk mean_mp_snr_plots" alt="plot of chunk mean_mp_snr_plots" width="900px" height="700px" /></p>
<p>I was worried that I had solved my problem and wouldn't know what to do with
myself anymore. (That's not true, I'm supposed to be working on my book.)
The good news, and also the bad news, is that I can continue to work on this problem.</p>
<script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#$@#')) {
var align = "center",
indent = "0em",
linebreak = "false";
if (false) {
align = (screen.width < 768) ? "left" : align;
indent = (screen.width < 768) ? "0em" : indent;
linebreak = (screen.width < 768) ? 'true' : linebreak;
}
var mathjaxscript = document.createElement('script');
mathjaxscript.id = 'mathjaxscript_pelican_#%@#$@#';
mathjaxscript.type = 'text/javascript';
mathjaxscript.src = 'https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML';
mathjaxscript[(window.opera ? "innerHTML" : "text")] =
"MathJax.Hub.Config({" +
" config: ['MMLorHTML.js']," +
" TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' } }," +
" jax: ['input/TeX','input/MathML','output/HTML-CSS']," +
" extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," +
" displayAlign: '"+ align +"'," +
" displayIndent: '"+ indent +"'," +
" showMathMenu: true," +
" messageStyle: 'normal'," +
" tex2jax: { " +
" inlineMath: [ ['\\\\(','\\\\)'] ], " +
" displayMath: [ ['$$','$$'] ]," +
" processEscapes: true," +
" preview: 'TeX'," +
" }, " +
" 'HTML-CSS': { " +
" styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! important'} }," +
" linebreaks: { automatic: "+ linebreak +", width: '90% container' }," +
" }, " +
"}); " +
"if ('default' !== 'default') {" +
"MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" +
"var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" +
"VARIANT['normal'].fonts.unshift('MathJax_default');" +
"VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
"VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
"});" +
"MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" +
"var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" +
"VARIANT['normal'].fonts.unshift('MathJax_default');" +
"VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
"VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
"});" +
"}";
(document.body || document.getElementsByTagName('head')[0]).appendChild(mathjaxscript);
}
</script>geom cloud.2017-09-21T21:51:28-07:002017-09-21T21:51:28-07:00Steven E. Pavtag:www.gilgamath.com,2017-09-21:/geom-cloud.html<p>A drop-in replacement for <code>geom_errorbar</code> in <code>ggplot2</code> that plots a density cloud of uncertainty.</p><p>I wanted a drop-in replacement for <code>geom_errorbar</code> in <code>ggplot2</code> that would
plot a density cloud of uncertainty.
<!-- PELICAN_END_SUMMARY -->
The idea is that typically (well, where I work),
the <code>ymin</code> and <code>ymax</code> of an errorbar are plotted at plus and minus
one standard deviation. A 'cloud' where the alpha is proportional to a normal
density with the same standard deviations could show the same information
on a plot with a little less clutter. I found out how to do this with
a very ugly function, but wanted to do it the 'right' way by spawning my
own geom. So the <code>geom_cloud</code>.</p>
<p>After looking at a bunch of other <code>ggplot2</code> extensions, and some amount of
tinkering and hair-pulling, we have the following code. The first part
just computes standard deviations which are equally spaced in normal density.
This is then used to create a list of <code>geom_ribbon</code> with equal alpha, but
the right size. A little trickery is used to get the scales right. There
are three parameters: <code>steps</code>, which controls how many ribbons are drawn.
The default value is a little conservative; a larger value, like 15, gives
very smooth clouds. The <code>se_mult</code> is the number of standard deviations at
which the <code>ymax</code> and <code>ymin</code> are plotted, defaulting to 1 here. If you plot
your errorbars at 2 standard errors, change this to 2. The <code>max_alpha</code> is the
alpha at the maximal density, <em>i.e.</em> around <code>y</code>.</p>
<div class="highlight"><pre><span></span><span class="c1"># get points equally spaced in density </span>
equal_ses <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>steps<span class="p">)</span> <span class="p">{</span>
xend <span class="o"><-</span> <span class="kt">c</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="m">4</span><span class="p">)</span>
endpnts <span class="o"><-</span> dnorm<span class="p">(</span>xend<span class="p">)</span>
<span class="c1"># perhaps use ppoints instead?</span>
deql <span class="o"><-</span> <span class="kp">seq</span><span class="p">(</span>from<span class="o">=</span>endpnts<span class="p">[</span><span class="m">1</span><span class="p">],</span>to<span class="o">=</span>endpnts<span class="p">[</span><span class="m">2</span><span class="p">],</span>length.out<span class="o">=</span>steps<span class="m">+1</span><span class="p">)</span>
davg <span class="o"><-</span> <span class="p">(</span>deql<span class="p">[</span><span class="m">-1</span><span class="p">]</span> <span class="o">+</span> deql<span class="p">[</span><span class="o">-</span><span class="kp">length</span><span class="p">(</span>deql<span class="p">)])</span><span class="o">/</span><span class="m">2</span>
<span class="c1"># invert</span>
xeql <span class="o"><-</span> <span class="kp">unlist</span><span class="p">(</span><span class="kp">lapply</span><span class="p">(</span>davg<span class="p">,</span><span class="kr">function</span><span class="p">(</span>d<span class="p">)</span> <span class="p">{</span>
uniroot<span class="p">(</span>f<span class="o">=</span><span class="kr">function</span><span class="p">(</span>x<span class="p">)</span> <span class="p">{</span> dnorm<span class="p">(</span>x<span class="p">)</span> <span class="o">-</span> d <span class="p">},</span>interval<span class="o">=</span>xend<span class="p">)</span><span class="o">$</span>root
<span class="p">}))</span>
xeql
<span class="p">}</span>
<span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>grid<span class="p">)</span>
geom_cloud <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>mapping <span class="o">=</span> <span class="kc">NULL</span><span class="p">,</span> data <span class="o">=</span> <span class="kc">NULL</span><span class="p">,</span> <span class="kc">...</span><span class="p">,</span>
na.rm <span class="o">=</span> <span class="kc">TRUE</span><span class="p">,</span>
steps <span class="o">=</span> <span class="m">7</span><span class="p">,</span> se_mult<span class="o">=</span><span class="m">1</span><span class="p">,</span> max_alpha<span class="o">=</span><span class="m">1</span><span class="p">,</span>
inherit.aes <span class="o">=</span> <span class="kc">TRUE</span><span class="p">)</span> <span class="p">{</span>
layer<span class="p">(</span>
data <span class="o">=</span> data<span class="p">,</span>
mapping <span class="o">=</span> mapping<span class="p">,</span>
stat <span class="o">=</span> <span class="s">"identity"</span><span class="p">,</span>
geom <span class="o">=</span> GeomCloud<span class="p">,</span>
position <span class="o">=</span> <span class="s">"identity"</span><span class="p">,</span>
inherit.aes <span class="o">=</span> inherit.aes<span class="p">,</span>
params <span class="o">=</span> <span class="kt">list</span><span class="p">(</span>
na.rm <span class="o">=</span> na.rm<span class="p">,</span>
steps <span class="o">=</span> steps<span class="p">,</span>
se_mult <span class="o">=</span> se_mult<span class="p">,</span>
max_alpha <span class="o">=</span> max_alpha<span class="p">,</span>
<span class="kc">...</span>
<span class="p">)</span>
<span class="p">)</span>
<span class="p">}</span>
GeomCloud <span class="o"><-</span> ggproto<span class="p">(</span><span class="s">"GeomCloud"</span><span class="p">,</span> Geom<span class="p">,</span>
required_aes <span class="o">=</span> <span class="kt">c</span><span class="p">(</span><span class="s">"x"</span><span class="p">,</span> <span class="s">"y"</span><span class="p">,</span> <span class="s">"ymin"</span><span class="p">,</span> <span class="s">"ymax"</span><span class="p">),</span>
non_missing_aes <span class="o">=</span> <span class="kt">c</span><span class="p">(</span><span class="s">"fill"</span><span class="p">),</span>
default_aes <span class="o">=</span> aes<span class="p">(</span>
colour <span class="o">=</span> <span class="kc">NA</span><span class="p">,</span> fill <span class="o">=</span> <span class="kc">NA</span><span class="p">,</span> alpha <span class="o">=</span> <span class="m">1</span><span class="p">,</span> size<span class="o">=</span><span class="m">1</span><span class="p">,</span> linetype<span class="o">=</span><span class="m">1</span>
<span class="p">),</span>
setup_data <span class="o">=</span> <span class="kr">function</span><span class="p">(</span>data<span class="p">,</span>params<span class="p">)</span> <span class="p">{</span>
<span class="kr">if</span> <span class="p">(</span>params<span class="o">$</span>na.rm<span class="p">)</span> <span class="p">{</span>
ok_row <span class="o"><-</span> <span class="o">!</span><span class="p">(</span><span class="kp">is.na</span><span class="p">(</span>data<span class="o">$</span>x<span class="p">)</span> <span class="o">|</span> <span class="kp">is.na</span><span class="p">(</span>data<span class="o">$</span>y<span class="p">)</span> <span class="o">|</span> <span class="p">(</span><span class="kp">is.na</span><span class="p">(</span>data<span class="o">$</span>ymin<span class="p">)</span> <span class="o">&</span> <span class="kp">is.na</span><span class="p">(</span>data<span class="o">$</span>ymax<span class="p">)))</span>
data <span class="o"><-</span> data<span class="p">[</span>ok_row<span class="p">,]</span>
<span class="p">}</span>
ses <span class="o"><-</span> equal_ses<span class="p">(</span>params<span class="o">$</span>steps<span class="p">)</span>
data<span class="o">$</span>up_se <span class="o"><-</span> <span class="p">(</span><span class="m">1</span><span class="o">/</span>params<span class="o">$</span>se_mult<span class="p">)</span> <span class="o">*</span> <span class="p">(</span>data<span class="o">$</span>ymax <span class="o">-</span> data<span class="o">$</span>y<span class="p">)</span>
data<span class="o">$</span>dn_se <span class="o"><-</span> <span class="p">(</span><span class="m">1</span><span class="o">/</span>params<span class="o">$</span>se_mult<span class="p">)</span> <span class="o">*</span> <span class="p">(</span>data<span class="o">$</span>y <span class="o">-</span> data<span class="o">$</span>ymin<span class="p">)</span>
<span class="c1"># a trick to get positions ok: puff up the ymax and ymin for now</span>
maxse <span class="o"><-</span> <span class="kp">max</span><span class="p">(</span>ses<span class="p">)</span>
data<span class="o">$</span>ymax <span class="o"><-</span> data<span class="o">$</span>y <span class="o">+</span> maxse <span class="o">*</span> data<span class="o">$</span>up_se
data<span class="o">$</span>ymin <span class="o"><-</span> data<span class="o">$</span>y <span class="o">-</span> maxse <span class="o">*</span> data<span class="o">$</span>dn_se
data
<span class="p">},</span>
draw_group <span class="o">=</span> <span class="kr">function</span><span class="p">(</span>data<span class="p">,</span> panel_scales<span class="p">,</span> coord<span class="p">,</span>
na.rm <span class="o">=</span> <span class="kc">TRUE</span><span class="p">,</span>
steps <span class="o">=</span> <span class="m">5</span><span class="p">,</span> se_mult<span class="o">=</span><span class="m">1</span><span class="p">,</span> max_alpha<span class="o">=</span><span class="m">1</span><span class="p">)</span> <span class="p">{</span>
data<span class="o">$</span>alpha <span class="o"><-</span> max_alpha <span class="o">/</span> steps
<span class="c1"># 2FIX: use the coordinate transform? or just forget it?</span>
ses <span class="o"><-</span> equal_ses<span class="p">(</span>steps<span class="p">)</span>
grobs <span class="o"><-</span> <span class="kp">Map</span><span class="p">(</span><span class="kr">function</span><span class="p">(</span>myse<span class="p">)</span> <span class="p">{</span>
this_data <span class="o"><-</span> data
this_data<span class="o">$</span>ymax <span class="o"><-</span> this_data<span class="o">$</span>y <span class="o">+</span> myse <span class="o">*</span> this_data<span class="o">$</span>up_se
this_data<span class="o">$</span>ymin <span class="o"><-</span> this_data<span class="o">$</span>y <span class="o">-</span> myse <span class="o">*</span> this_data<span class="o">$</span>dn_se
ggplot2<span class="o">::</span>GeomRibbon<span class="o">$</span>draw_group<span class="p">(</span>this_data<span class="p">,</span> panel_scales<span class="p">,</span> coord<span class="p">,</span> na.rm<span class="o">=</span>na.rm<span class="p">)</span>
<span class="p">},</span>ses<span class="p">)</span>
<span class="kp">do.call</span><span class="p">(</span><span class="s">"gList"</span><span class="p">,</span>grobs<span class="p">)</span>
<span class="p">},</span>
draw_key <span class="o">=</span> draw_key_polygon
<span class="p">)</span>
</pre></div>
<p>Ok, now I use it. I construct some data in multiple groups, splitting into
column and row facets by some random variable. Within each facet there are two
groups. The standard error is proportional to the square root of x. I then plot the
clouds. I put these in square root space to convince myself I had not
goofed up scale transformations. (Well, still not sure, really.)</p>
<div class="highlight"><pre><span></span><span class="kn">library</span><span class="p">(</span>dplyr<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span>
nobs <span class="o"><-</span> <span class="m">1000</span>
<span class="kp">set.seed</span><span class="p">(</span><span class="m">2134</span><span class="p">)</span>
mydat <span class="o"><-</span> <span class="kt">data.frame</span><span class="p">(</span>grp<span class="o">=</span><span class="kp">sample</span><span class="p">(</span><span class="kt">c</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="m">1</span><span class="p">),</span>nobs<span class="p">,</span>replace<span class="o">=</span><span class="kc">TRUE</span><span class="p">),</span>
colfac<span class="o">=</span><span class="kp">sample</span><span class="p">(</span><span class="kc">letters</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="m">2</span><span class="p">],</span>nobs<span class="p">,</span>replace<span class="o">=</span><span class="kc">TRUE</span><span class="p">),</span>
rowfac<span class="o">=</span><span class="kp">sample</span><span class="p">(</span><span class="kc">letters</span><span class="p">[</span><span class="m">10</span> <span class="o">+</span> <span class="p">(</span><span class="m">1</span><span class="o">:</span><span class="m">3</span><span class="p">)],</span>nobs<span class="p">,</span>replace<span class="o">=</span><span class="kc">TRUE</span><span class="p">))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>x<span class="o">=</span><span class="kp">seq</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="m">1</span><span class="p">,</span>length.out<span class="o">=</span>nobs<span class="p">)</span> <span class="o">+</span> <span class="m">0.33</span> <span class="o">*</span> grp<span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>y<span class="o">=</span><span class="m">0.25</span><span class="o">*</span>rnorm<span class="p">(</span>nobs<span class="p">)</span> <span class="o">+</span> <span class="m">2</span><span class="o">*</span>grp<span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>grp<span class="o">=</span><span class="kp">factor</span><span class="p">(</span>grp<span class="p">))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>se<span class="o">=</span><span class="kp">sqrt</span><span class="p">(</span>x<span class="p">))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>ymin<span class="o">=</span>y<span class="o">-</span>se<span class="p">,</span>ymax<span class="o">=</span>y<span class="o">+</span>se<span class="p">)</span>
offs <span class="o"><-</span> <span class="m">2</span>
ph <span class="o"><-</span> mydat <span class="o">%>%</span>
mutate<span class="p">(</span>y<span class="o">=</span>y<span class="o">+</span>offs<span class="p">,</span>ymin<span class="o">=</span>ymin<span class="o">+</span>offs<span class="p">,</span>ymax<span class="o">=</span>ymax<span class="o">+</span>offs<span class="p">)</span> <span class="o">%>%</span>
ggplot<span class="p">(</span>aes<span class="p">(</span>x<span class="o">=</span>x<span class="p">,</span>y<span class="o">=</span>y<span class="p">,</span>ymin<span class="o">=</span>ymin<span class="p">,</span>ymax<span class="o">=</span>ymax<span class="p">,</span>color<span class="o">=</span>grp<span class="p">,</span>fill<span class="o">=</span>grp<span class="p">))</span> <span class="o">+</span>
facet_grid<span class="p">(</span>rowfac <span class="o">~</span> colfac<span class="p">)</span> <span class="o">+</span>
scale_y_sqrt<span class="p">()</span> <span class="o">+</span> geom_line<span class="p">()</span> <span class="o">+</span>
geom_cloud<span class="p">(</span>aes<span class="p">(</span>fill<span class="o">=</span>grp<span class="p">),</span>steps<span class="o">=</span><span class="m">15</span><span class="p">,</span>max_alpha<span class="o">=</span><span class="m">0.85</span><span class="p">,</span>color<span class="o">=</span><span class="kc">NA</span><span class="p">)</span> <span class="o">+</span> <span class="c1"># important! set color to NA!</span>
labs<span class="p">(</span>title<span class="o">=</span><span class="s">'geom cloud'</span><span class="p">)</span>
<span class="kp">print</span><span class="p">(</span>ph<span class="p">)</span>
</pre></div>
<p><img src="http://www.gilgamath.com/figure/geom_cloud_first_plot-1.png" title="plot of chunk first_plot" alt="plot of chunk first_plot" width="800px" height="700px" /></p>
<p>At some point I will put this in a package or try to push it off into one of
the fine <code>ggplot2</code> extension packages. There are still some bugs to be worked
out: I would like the fill to default to the color when the fill is not given,
and to force the color to <code>NA</code> automatically, since otherwise <code>geom_ribbon</code> adds really
ugly lines, and so on.</p>Spy vs Spy vs Wald Wolfowitz.2017-09-05T21:34:15-07:002017-09-05T21:34:15-07:00Steven E. Pavtag:www.gilgamath.com,2017-09-05:/spy-vs-wald-wolfowitz.html<p>I turned my kids on to the great Spy vs Spy cartoon from Mad Magazine.
This strip is pure gold for two young boys: Rube Goldberg plus
explosions with not much dialog (one child is still too young to read).
I became curious whether the one Spy had the upper hand, whether
Prohias worked to keep the score 'even', and so on.
<!-- PELICAN_END_SUMMARY --></p>
<p>Not finding any data out there, I collected the data to the best
of my ability from the Spy vs Spy Omnibus, which collects all
248 strips that appeared in Mad Magazine (plus two special issues).
I think there are more strips out there by Prohias that appeared
only in collected books, but I have not collected those yet.
I entered the data into a Google spreadsheet, then converted it to
CSV, then to <a href="http://www.github.com/shabbychef/SPYvsSPY">an R data package</a>.
Now you can play along at home.</p>
<p>On to the simplest form of my question: did Prohias alternate between
Black and White Spy victories? or did he choose at random?
Up until 1968 it was common for two strips to appear in one issue
of Mad, with one victory per Spy. In some cases <em>three</em> strips
appeared per issue, with the Grey Spy appearing in the third;
the Black and White Spies always receive a comeuppance when she
appears, and so the balance of power was maintained.
After 1972, it seems that only a single strip appeared per issue,
and we can examine the time series of victories. </p>
<div class="highlight"><pre><span></span><span class="kn">library</span><span class="p">(</span>SPYvsSPY<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>dplyr<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>knitr<span class="p">)</span>
data<span class="p">(</span>svs<span class="p">)</span>
<span class="c1"># show that there are multiple per strip</span>
svs <span class="o">%>%</span>
group_by<span class="p">(</span>Mad_no<span class="p">,</span>yrmo<span class="p">)</span> <span class="o">%>%</span>
summarize<span class="p">(</span>nstrips<span class="o">=</span>n<span class="p">(),</span>
net_victories<span class="o">=</span><span class="kp">sum</span><span class="p">(</span><span class="kp">as.numeric</span><span class="p">(</span>white_comeuppance<span class="p">)</span> <span class="o">-</span> <span class="kp">as.numeric</span><span class="p">(</span>black_comeuppance<span class="p">)))</span> <span class="o">%>%</span>
ungroup<span class="p">()</span> <span class="o">%>%</span>
select<span class="p">(</span>yrmo<span class="p">,</span>nstrips<span class="p">,</span>net_victories<span class="p">)</span> <span class="o">%>%</span>
<span class="kp">head</span><span class="p">(</span>n<span class="o">=</span><span class="m">20</span><span class="p">)</span> <span class="o">%>%</span>
kable<span class="p">()</span>
</pre></div>
<table>
<thead>
<tr>
<th align="left">yrmo</th>
<th align="right">nstrips</th>
<th align="right">net_victories</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">1961-01</td>
<td align="right">3</td>
<td align="right">-1</td>
</tr>
<tr>
<td align="left">1961-03</td>
<td align="right">2</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">1961-04</td>
<td align="right">2</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">1961-06</td>
<td align="right">2</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">1961-07</td>
<td align="right">2</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">1961-09</td>
<td align="right">2</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">1961-12</td>
<td align="right">1</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">1962-03</td>
<td align="right">2</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">1962-04</td>
<td align="right">2</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">1962-06</td>
<td align="right">2</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">1962-07</td>
<td align="right">2</td>
<td align="right">0</td>
</tr>
<tr>
<td align="left">1962-09</td>
<td align="right">2</td>
<td align="right">-1</td>
</tr>
<tr>
<td align="left">1962-10</td>
<td align="right">2</td>
<td align="right">-1</td>
</tr>
<tr>
<td align="left">1962-12</td>
<td align="right">2</td>
<td align="right">1</td>
</tr>
<tr>
<td align="left">1963-03</td>
<td align="right">2</td>
<td align="right">-1</td>
</tr>
<tr>
<td align="left">1963-04</td>
<td align="right">2</td>
<td align="right">-1</td>
</tr>
<tr>
<td align="left">1963-06</td>
<td align="right">3</td>
<td align="right">-1</td>
</tr>
<tr>
<td align="left">1963-09</td>
<td align="right">2</td>
<td align="right">1</td>
</tr>
<tr>
<td align="left">1963-10</td>
<td align="right">2</td>
<td align="right">1</td>
</tr>
<tr>
<td align="left">1963-12</td>
<td align="right">3</td>
<td align="right">0</td>
</tr>
</tbody>
</table>
<p>Here I plot the 'net black score', the cumulative sum of White comeuppances minus those of Black. Note that
when the Grey Spy appears (or in the rare cases where neither Spy seems to suffer, of which I found two),
there is no net movement of the score. It seems that the Black Spy was the net loser, suffering most
in the 1970's, with a comeback in the 1980's. </p>
<div class="highlight"><pre><span></span><span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span>
ph <span class="o"><-</span> svs <span class="o">%>%</span>
mutate<span class="p">(</span>snapdate<span class="o">=</span><span class="kp">as.Date</span><span class="p">(</span><span class="kp">paste0</span><span class="p">(</span>yrmo<span class="p">,</span><span class="s">'-01'</span><span class="p">),</span>format<span class="o">=</span><span class="s">'%Y-%m-%d'</span><span class="p">),</span>
black_victory<span class="o">=</span><span class="kp">as.numeric</span><span class="p">(</span>white_comeuppance<span class="p">)</span> <span class="o">-</span> <span class="kp">as.numeric</span><span class="p">(</span>black_comeuppance<span class="p">))</span> <span class="o">%>%</span>
group_by<span class="p">(</span>Mad_no<span class="p">,</span>snapdate<span class="p">)</span> <span class="o">%>%</span>
summarize<span class="p">(</span>black_victory<span class="o">=</span><span class="kp">sum</span><span class="p">(</span>black_victory<span class="p">))</span> <span class="o">%>%</span>
ungroup<span class="p">()</span> <span class="o">%>%</span>
mutate<span class="p">(</span>black_score<span class="o">=</span><span class="kp">cumsum</span><span class="p">(</span>black_victory<span class="p">))</span> <span class="o">%>%</span>
ggplot<span class="p">(</span>aes<span class="p">(</span>snapdate<span class="p">,</span>black_score<span class="p">))</span> <span class="o">+</span>
geom_line<span class="p">()</span> <span class="o">+</span> geom_point<span class="p">(</span>alpha<span class="o">=</span><span class="m">0.5</span><span class="p">)</span> <span class="o">+</span>
labs<span class="p">(</span>title<span class="o">=</span><span class="s">'Spy vs Spy tally'</span><span class="p">,</span>
y<span class="o">=</span><span class="s">'net Y victories'</span><span class="p">,</span>
x<span class="o">=</span><span class="s">'issue date'</span><span class="p">)</span>
ph
</pre></div>
<p><img src="http://www.gilgamath.com/figure/main_plot1-1.png" title="plot of chunk main_plot1" alt="plot of chunk main_plot1" width="1100px" height="800px" /></p>
<h2>Wald Wolfowitz</h2>
<p>The <a href="https://en.wikipedia.org/wiki/Wald%E2%80%93Wolfowitz_runs_test">Wald Wolfowitz test</a>
is a non-parametric test for the presence of serial correlation that is appropriate for
binary series like this. The test is performed by computing the number of 'runs', which
is to say the number of clusters of consecutive victories by one of the Spies. When the
test statistic is too high
(compared to what would be observed if the data were serially independent), the data are too 'flippy',
often reversing.
This would be the case if Prohias tried to keep score balance by always reversing the previous outcome.
If the test statistic is too low, the data are too 'sticky', with long periods of one Spy prevailing
over the other. This could happen if Prohias got moody and picked favorites, perhaps.</p>
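To make the mechanics concrete, the z-statistic behind the runs test can be computed by hand. The helper below is my own sketch (the name <code>runs_zstat</code> is invented, not part of <code>randtests</code>), assuming a plus/minus one series with no ties, and using the usual normal approximation for the number of runs.

```r
# Hand-rolled runs-test z-statistic for a +/-1 series (normal approximation).
# An illustrative sketch only; not from the randtests package.
runs_zstat <- function(x) {
  x <- sign(x)
  n1 <- sum(x > 0)                              # victories for one Spy
  n2 <- sum(x < 0)                              # victories for the other
  n  <- n1 + n2
  nruns <- 1 + sum(head(x, -1) != tail(x, -1))  # one run per sign change, plus one
  mu   <- 1 + (2 * n1 * n2) / n                 # expected runs under independence
  sig2 <- (2 * n1 * n2) * (2 * n1 * n2 - n) / (n^2 * (n - 1))
  (nruns - mu) / sqrt(sig2)
}
# A perfectly alternating series is too 'flippy': z is large and positive.
runs_zstat(rep(c(1, -1), 20))         # about 6.09
# Long streaks are too 'sticky': z is large and negative.
runs_zstat(rep(c(1, -1), each = 20))  # about -6.09
```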
<p>The test is easy enough to run in R:</p>
<div class="highlight"><pre><span></span><span class="kn">library</span><span class="p">(</span>randtests<span class="p">)</span>
subdata <span class="o"><-</span> svs <span class="o">%>%</span>
filter<span class="p">(</span>Mad_no <span class="o">></span> <span class="m">152</span><span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>black_victory<span class="o">=</span><span class="kp">as.numeric</span><span class="p">(</span>white_comeuppance<span class="p">)</span> <span class="o">-</span> <span class="kp">as.numeric</span><span class="p">(</span>black_comeuppance<span class="p">))</span> <span class="o">%>%</span>
filter<span class="p">(</span><span class="kp">abs</span><span class="p">(</span>black_victory<span class="p">)</span> <span class="o">></span> <span class="m">0.5</span><span class="p">)</span>
<span class="kp">set.seed</span><span class="p">(</span><span class="m">1234</span><span class="p">)</span>
resu <span class="o"><-</span> randtests<span class="o">::</span>runs.test<span class="p">(</span>subdata<span class="o">$</span>black_victory<span class="p">,</span>threshold<span class="o">=</span><span class="m">0</span><span class="p">)</span>
<span class="kp">print</span><span class="p">(</span>resu<span class="p">)</span>
</pre></div>
<div class="highlight"><pre><span></span>##
## Runs Test
##
## data: subdata$black_victory
## statistic = 0.8532, runs = 50, n1 = 44, n2 = 46, n = 90, p-value =
## 0.394
## alternative hypothesis: nonrandomness
</pre></div>
<p>We get back a test statistic of 0.85, indicating a slightly
greater than random amount of reversal. However, this is not statistically
significantly different from the expected value of 0, with a p-value of
0.39. </p>
<p>In conclusion, we have no evidence that Prohias kept a running tally of Black
and White Spy victories, and the data are consistent with the victor being
chosen independently of the previous victories.</p>R in Finance 20172017-05-19T09:30:24-07:002017-05-19T09:30:24-07:00Steventag:www.gilgamath.com,2017-05-19:/rfin2017.html<p>Review of R in Finance 2017 conference</p><!-- cf. http://stackoverflow.com/a/255182 -->
<p><img style="float: right;" src="http://www.gilgamath.com/images/rfin2017_cover.png" width="200px" title="It is not the eighth annual conference, that was last year." alt="second annual eighth annual">
This was my second year at the eighth annual R in Finance conference, my sixth year
in Chicago. The weather was a bummer, but the content was fresh.
What follows are my (biased, incomplete, sketchy) notes. You should go
look up <a href="https://github.com/robertzk/rfinance17-notes">Robert Krzyzanowski's notes</a>
for another view. The talks were, for the first time, recorded, and I think
can be <a href="https://event.microsoft.com/events/2017/1705/RFinance/">viewed later</a>.</p>
<!-- Date: Fri May 19 2017 09:30:24 -->
<!-- PELICAN_END_SUMMARY -->
<h3>Day One, Morning Lightning Round</h3>
<p>Without a morning keynote, the welcome quickly segued into the lightning round.</p>
<ol>
<li>
<p>Marcelo Perlin kicked off with a talk about his
<a href="https://cran.r-project.org/web/packages/GetHFData/index.html"><code>GetHFData</code></a> package, which
gets high frequency data from BOVESPA, the Brazilian exchange. You read that correctly, you
can get trade and quote data for Brazilian equities and derivatives for free via an
R command, with order book data 'coming soon'. I'll take it! He also has a
<a href="https://sites.google.com/site/marceloperlin/my-books">free book</a>.</p>
</li>
<li>
<p>Jeffrey Mazar talked about the <a href="https://github.com/jmazar/obmodeling"><code>obmodeling</code></a>
package for modeling Order Book data, which provides some 'standard' analytics on OB: spread,
imbalance, depth, VPIN and so on. </p>
</li>
<li>
<p>Yuting Tan then gave a presentation on institutional investors and volatility. The key technical
point here is that the 'classical' measure of volatility of <em>returns</em> is confounded by noise
in the <em>price</em>. This is a problem I have looked at in the context of the 'bid-ask bounce', which
is a phantom negative correlation in returns. Suppose that <span class="math">\(P_t = L_t + z_t\)</span>, where <span class="math">\(L_t\)</span> is the
'latent' or 'true' price, and <span class="math">\(z_t\)</span> is a price noise. Suppose the log returns of the latent
price follow some stochastic law, where the <span class="math">\(z_t\)</span> might, for example, reflect whether the last
print was at the bid or ask. Then log returns of <span class="math">\(P_t\)</span> will show a negative autocorrelation
because of price noise, with the magnitude determined by the volatility in price noise and
the volatility of latent returns. Popping the stack on that diversion, Yuting presented an
estimator of volatility, TSRV, that is less affected (not affected?) by price noise. Then
she used it to analyze whether trading by institutional investors is correlated with certain
market characteristics, finding positive correlation with volatility and price noise. As a
future work, computing Amihud Liquidity still seems to be confused by price noise.</p>
</li>
<li>
<p><a href="https://twitter.com/ProbablePattern">Stephen Rush</a> opened his talk with some
shade on data scientists. He analyzed the trade execution quality at
different trade execution venues: execution speed, spread, and best execution,
finding that market share affects the same, but, if I have understood this correctly,
some of the smaller venues give better execution at the expense of slower execution
time. Stephen can correct me on this.</p>
</li>
<li>
<p>Jerzy Pawlowski was a no-show due to time zones, but maybe he will reschedule.</p>
</li>
</ol>
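The bid-ask bounce effect from Yuting's talk is easy to demonstrate by simulation. The following is my own sketch with invented parameters: the latent log returns are serially independent, yet the observed returns show a negative lag-one autocorrelation of about minus the noise variance over the sum of the return variance and twice the noise variance.

```r
# My own illustration of the bid-ask bounce: observed log price P_t = L_t + z_t,
# where L_t is a latent random walk and z_t is i.i.d. print noise.
set.seed(101)
n <- 100000
latent   <- cumsum(rnorm(n, sd = 0.01))                  # latent log price L_t
noise    <- 0.005 * sample(c(-1, 1), n, replace = TRUE)  # z_t: print at bid or ask
observed <- latent + noise                               # observed log price P_t
lag1_acf <- function(p) { r <- diff(p); cor(r[-1], r[-length(r)]) }
lag1_acf(latent)     # near zero: latent returns are serially independent
lag1_acf(observed)   # near -0.167 = -sigma_z^2 / (sigma_L^2 + 2 * sigma_z^2)
```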
<h2>Day One, Morning Talks</h2>
<p>Michael Hirsh talked about some analysis of trades data from ASX, the Australian Stock Exchange.
He has about a year of high frequency trade (and book?) data, which also includes the IDs of the
buyer and seller, presumably anonymized. Having IDs allows one to perform longitudinal analyses.
Michael characterized trades as active or passive, then presented some graphs showing changes
in behavior for certain HFT players on certain stocks over time. Some players were liquidity
makers, but on the whole they appeared to take liquidity. One interesting analysis was the
construction of 2D plots of the 'time to next trade' versus 'time from previous trade', though
I did not catch all the details. It would seem this would be a basic tool for
reverse-engineering market players. (I realize that makes it sound like it would be easy;
it shouldn't be.) Then followed some network plots where the nodes were pie charts, and the
jet lag started in.</p>
<hr>
<p>Ross Bennett gave a presentation on his work on building factor portfolios for the Arizona
Pension system. He started by mentioning the
<a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2249314">factor zoo</a>, <em>i.e.</em> the
growing list of published results on nearly 60 factors which are purported to drive
investor behavior or otherwise affect returns.
It turns out that different index providers have different ideas about how to build
indices on these:</p>
<blockquote class="twitter-tweet" data-lang="en">
<p lang="en" dir="ltr">Amazing! Methodology matters when constructing factors. Reviewer 2 is vindicated. <a href="https://twitter.com/hashtag/RFin2017?src=hash">#RFin2017</a></p>— \m/ (-_-) \m/ (@ProbablePattern) <a href="https://twitter.com/ProbablePattern/status/865591933590163456">May 19, 2017</a></blockquote>
<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>Another takeaway is that many of these factors have very short out-of-sample
histories from time of publication, and presumably the authors have dredged the
hell out of the history up to publication; moreover, some index providers appear
to have avoided huge drawdowns in the 2008 crisis by making questionable
choices in weighting functions, as the indices were constructed in the interim.
Then followed some kabuki about random portfolio construction with constraints
and different objectives, and bam, everyone loves Low Vol. As a refugee from a
Quant Fund, I will attest that you
cannot charge your "two and twenty" for replicating SPLV unless you're really
good at bullshitting clients. That aside, analysis of portfolios is
<a href="https://arxiv.org/abs/1312.0557">my jam</a>, so I was happy to see.</p>
<hr>
<p>Seoyung Kim gave a very nice talk on sentiment analysis of a corpus of
Enron emails, connecting characteristics of emails to external data
on the Enron collapse: public news, the stock price, and so on. She was
able to find a shift in email length, but not necessarily sentiment,
at the onset of collapse. Joint work with Sanjiv Das, who maybe
never sleeps?</p>
<p>Break.</p>
<hr>
<p>Szilard Pafka gave the first keynote, on 'No Bullshit Data Science'. He compared
the performance of
<a href="https://github.com/szilard/benchm-databases">different databases</a>. And he
<a href="https://github.com/szilard/benchm-ml">benchmarked implementations</a> of
standard machine learning techniques across platforms: Spark, <a href="http://h2o.ai/">H2O</a>, python, R, xgboost.
The time performance and achieved objective were roughly compared, with
H2O and xgboost looking pretty. There was a bunch of proselytizing about using
open source; I'm a convert already, but the takeaway is to fight your problem, don't
fight your tools.</p>
<p>Lunch. </p>
<h2>Day One, Afternoon</h2>
<h3>Lightning Round.</h3>
<ol>
<li>
<p>Jerzy Pawlowski snuck in a lightning talk: Machine Learning, backtesting, and cross validation.
His talk was marred by technical difficulties, so I need to take
a <a href="https://github.com/algoquant/presentations/blob/master/Jerzy_Pawlowski_RFinance_2017.html">second look</a>. (Mental note: never use HTML for presentations.)
He looks at the chances of selecting a zero-alpha strategy or manager based on in-sample performance,
as a function of sample size and breadth. Again, this is my jam. In a previous job I looked at
this situation in the context of selecting an allocation on sub-strategies (with a positivity
constraint! It's hard to explain to an investor that you've built 10 alpha models, but are
shorting two of them!), and tested many of the same heuristics: winner take all, equal dollar,
one over var, Markowitz with positivity and so on. The statistical question of the out-of-sample
performance is still really interesting.</p>
</li>
<li>
<p>Francesco Bianchi, lightning talk on coGARCH and the <a href="https://r-forge.r-project.org/projects/cogarch-rm/"><code>cogarch.rm</code></a>
package. This one flew by, and we landed on forecasting VaR and Expected Shortfall, forward,
using a co-GARCH. </p>
</li>
<li>
<p>Eina Ooka, from The Energy Authority, on modeling the risk of a portfolio of energy producing
assets, using Monte Carlo methods. (Side note: power prices are sometimes negative?) The
technical challenge was tuning the number of Random Forest trees to achieve the 'proper' amount
of volatility? There were a lot of plots here of different kinds,
check out the <a href="https://github.com/einaooka/useR2016/blob/master/useR2016-Conference-Poster.pdf">poster</a>.
She lamented the lack of a good scoring function for simulations.</p>
</li>
<li>
<p>Matteo Crimella, from GS, on Operational Risk Stress Testing. This touches near to my
professional interests. The setup: summarize historical macroeconomic variables using PCA, use
latent variables as independent variables to forecast loss, and then apply the PCA weightings
to scenarios for walking forward.
Then the talk got faster: we had a drive-by overview of
Multivariate Adaptive Regression Splines (Breiman 1996),
Neural Networks,
Error Correction Models;
then plots of scenario forecasts for different models;
then recommend bagged MARS. It's always nice to see a bunch of different models give similar
results (again, something I am starting to appreciate more in my professional life).</p>
</li>
<li>
<p>Thomas Zakrzewski, from S&P Global Market Intelligence, talking about using R for Stress testing.
This is my bread and butter now. Review of the regulatory environment: DFAST and CCAR. Then an
advertorial on Risk Services from S&P Market Intelligence. Using ARIMA to model PD & LGD,
from market, yield, inflation, and so on. Using BUGS?</p>
</li>
<li>
<p>Andy Tang, from William Blair, asking, "how much structure is best?" In particular, structure
of the covariance matrix. At a high level, compare unstructured and structured covariance
estimates. Unstructured include sample covariance, Shrinkage estimators, etc. Structured
include some factor structure like Barra, or something like CO-GARCH. The structured models
consume fewer degrees of freedom, but may be misspecified. A third approach, apparently, is
that of a 'conditionally structured' covariance? DGM (Dynamic Graph Methods?) were used to model 100
stocks during the European Debt Crisis, somehow with the Fama French factors seeding the
structure? Made a nice plot showing error versus the amount of structure, with DGM in the
middle, balancing dimensionality and model error.</p>
</li>
</ol>
<h3>Keynote</h3>
<p>Bob McDonald comes back to RFin. He talked about categorized ratings, such as Morningstar
(bond) ratings, which are ordinal factors. How are investors affected by ratings, by changes
in ratings, and by differences in rating standards across asset classes, and can category
ratings improve investor decision making? A sociological experiment
was conducted where subjects were given 12 dollars in cash to invest in
six investments (plus a seventh, 'hold cash'). Subjects in two groups were presented the
same data on these investments, with the experimental group given an additional 'Category
rating' breaking the investments into two arbitrary categories, with a double line separating
the categories. The investments are sometimes presented with stars, based on the Sharpe within
asset classes (<em>i.e.</em> over all investments when no categorical division is given). Investors
are told how the stars are defined.
So, are investors affected by stars, and how does the effect change when 'grading on a curve'
based on asset category? Findings: adding or losing a star
affects, in the direction you would expect, the amount invested in an asset. Lots of stuff
to unpack here, including a replication of the study on faculty and staff (the first round
was on undergrads at U. Iowa). I love behavioural studies like this.
Mental note, look up the <a href="https://cran.r-project.org/web/packages/texreg/index.html"><code>texreg</code> package</a>
for smoothing the R to LaTeX bridge.</p>
<p>Break.</p>
<hr>
<p>Dries Cornilly on co-moment estimation with factors and linear shrinkage. That's a lot of
words: you can think of co-moments as a generalization of the covariance matrix of
multiple assets. The coskewness array, for example is 2 by 2 by 2, but usually expressed
as a 2 by 4 matrix. Symmetry eliminates most of the apparent degrees of freedom: the co-kurtosis
matrix is 2 by 8, but has only 5 unique elements. Expressing these quantities as matrices
is useful in the following way: if <span class="math">\(\Phi\)</span> is the co-skewness matrix of some assets,
the unnormalized skewness (third central moment) of the returns of the portfolio with dollarwise
allocation <span class="math">\(w\)</span> is
</p>
<div class="math">$$w^{\top} \Phi \left(w \otimes w\right).$$</div>
<p>
See how that generalizes covariance? Nice, right? That aside, how do you <em>estimate</em>
these matrices? There are a few issues here: </p>
<ol>
<li>One should abuse symmetry rather than perform redundant computations.</li>
<li>How do you make these computations <em>numerically</em> robust? (That's my paranoia.)</li>
<li>Given that one is estimating potentially thousands of co-moments,
will the errors swamp your application?</li>
</ol>
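<p>The aggregation identity above is cheap to sanity-check numerically. Here is a minimal sketch of my own
(in Python rather than R, with made-up data): build the co-skewness matrix by brute force and compare
<span class="math">\(w^{\top} \Phi \left(w \otimes w\right)\)</span> against the third central moment of the portfolio returns.</p>

```python
import numpy as np

rng = np.random.default_rng(1)
# two skewed return series (lognormal), centered below
x = rng.lognormal(size=(10000, 2))
n, d = x.shape
xc = x - x.mean(axis=0)

# co-skewness as a d x d^2 matrix: Phi[i, j*d + k] = E[xc_i xc_j xc_k]
Phi = (np.einsum('ti,tj,tk->ijk', xc, xc, xc) / n).reshape(d, d * d)

w = np.array([0.3, 0.7])
lhs = w @ Phi @ np.kron(w, w)            # w' Phi (w kron w)
rhs = ((xc @ w) ** 3).mean()             # third central moment of portfolio returns
assert np.isclose(lhs, rhs)
```

Note the 2 by 4 shape of <code>Phi</code>, matching the dimensions mentioned above; a serious implementation would exploit symmetry instead of this brute-force <code>einsum</code>.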
<p>To deal with estimation error, assume some structure. This can reduce error, but subjects
you to model misspecification error. One way to impose structure is via shrinkage. Another
is to use observed (Boudt, Lu & Peeters, 2015)
or unobserved factors (working paper).</p>
<hr>
<p>Bernhard Pfaff returns again to Rfin. Consider ERC, or 'risk parity' (<em>c.f.</em> Qian, 2005, 2006, 2011,
Maillard <em>et al.</em> 2010, Roncalli, 2013) where each element in the portfolio contributes equal
standard deviation of risk. This segued into a multiple criteria risk optimization. How do you
do multiple risk criteria optimization? Take a linear combination of your objectives, for some
choice of weights, and then something something? How do you choose the weights?
He then presents the <a href="https://github.com/bpfaff/mcrp">mcrp package</a>, and takes it for a spin
with six objectives on five assets. </p>
<h3>Lightning Round.</h3>
<ol>
<li>
<p>Oliver Haynold on Practical Options Modeling with the
<a href="http://azzalini.stat.unipd.it/SN/"><code>sn</code> package</a>.
Heavy stuff here: modeling volatility that captures smirk, using four parameters. Somehow this is described as a
skew-t distribution, which has parameters for location, scale, stretch and tails? Ended
with a market forecast for December 2017.</p>
</li>
<li>
<p>Shuang Zhou on Nonparametric Estimate of the Risk-Neutral Density for options. I missed the
first part, and then it was a lot of tables, and then some demos, and it's a wrap.</p>
</li>
<li>
<p>Luis Damiano: A Quick Intro to Hidden Markov Models Applied to Stock Volatility. The
key insight is that markets do not behave the same every day, and so have some kind of
hidden state. Thus a hidden Markov Model. Nice
<a href="https://github.com/luisdamiano/rfinance17">slides</a>, no equations, good talk. </p>
</li>
<li>
<p>Oleg Bondarenko: Rearrangement Algorithm and Maximum Entropy. "Can you infer dependence of
variables given marginal distributions of the assets and of their weighted sum?" Cue the
Block Rearrangement Algorithm (BRA). Create a matrix of quantiles of returns, then
perform a greedy rearrangement. (Am I really describing this on my blog? It's too technical!)</p>
</li>
<li>
<p>Xin Chen: Risk and Performance Estimator Standard Errors for Serially Correlated Returns.
Hey, guess what? Performance estimators, like Sharpe, are random variables, and come with
noise, which can be biased or large. OK, I knew that. Let's check out some Hedge Fund
returns data. I didn't get what the method was, but check out
<a href="https://github.com/chenx26/EstimatorStandardError">the package</a>, and look forward to
results from GSoC 2017.</p>
</li>
</ol>
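<p>Since I brought up the rearrangement idea, the greedy step can be sketched in a few lines. This is my own
toy version (Python; a real Block Rearrangement Algorithm rearranges blocks of columns, not single columns,
and iterates to convergence): each column is re-ordered anti-monotone to the sum of the others, which can
only shrink the variance of the row sums.</p>

```python
import numpy as np

rng = np.random.default_rng(2)
# quantile columns: each column sorted, so rows start out comonotone
X = np.sort(rng.standard_normal((100, 3)), axis=0)

def rearrange_once(X):
    """One greedy sweep: re-order each column anti-monotone to the sum of the rest."""
    X = X.copy()
    for j in range(X.shape[1]):
        rest = X.sum(axis=1) - X[:, j]
        ranks = np.argsort(np.argsort(-rest))   # rank 0 = largest rest
        X[:, j] = np.sort(X[:, j])[ranks]       # largest rest gets smallest value
    return X

before = X.sum(axis=1).var()
after = rearrange_once(X).sum(axis=1).var()
assert after <= before
```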
<hr>
<p><img style="float: right;" src="http://www.gilgamath.com/images/rfin2017_kk_lstm.png" width="325px" title="Thou knee strafks now Genetty" alt="Mostensio">
Qiang Kou ('KK') on Text analysis using Apache MxNet. MxNet is a deep learning platform, with
buy-in from Amazon. <a href="http://mxnet.io">MxNet</a> is designed to work on multiple platforms, with
bindings in many languages besides just R.
As an example usage, consider Amazon product review data: a bunch of text with some
characteristics around them, with a whole lot of data. They get very good accuracy on binary
classification, merely by considering the reviews as <em>matrices</em>, not by extracting keywords.
That is, an <span class="math">\(n\)</span> character review is encoded as a <span class="math">\(k \times n\)</span> matrix of 0/1 values, where <span class="math">\(k\)</span> is
the alphabet size (like <span class="math">\(k=63\)</span> to include upper and lower Roman letters and numerals and a few
symbols). This seems like voodoo. He then trains an LSTM model on Shakespeare (?) and generates
some random text using it. This is going to raise the authorship debate to a new level.
Check out his <a href="http://github.com/thirdwing/r_fin_2017">talk</a>.</p>
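<p>The character-level encoding is simple to reproduce. A sketch of my own (Python; the exact alphabet used
in the talk is an assumption, but 26 lower, 26 upper, 10 digits and a space gives the <span class="math">\(k=63\)</span>
of the example above):</p>

```python
import numpy as np

# fixed alphabet of k = 63 characters
alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 "
idx = {c: i for i, c in enumerate(alphabet)}

def one_hot(text):
    """Encode a string as a k x n 0/1 matrix, one column per character."""
    m = np.zeros((len(alphabet), len(text)), dtype=np.int8)
    for j, c in enumerate(text):
        if c in idx:                      # unknown characters become zero columns
            m[idx[c], j] = 1
    return m

m = one_hot("Great product")
assert m.shape == (63, 13)
assert m.sum() == 13                      # every character was in the alphabet
```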
<p>Robert Krzyzanowski on <a href="http://syberia.io">Syberia</a>, a 'development framework for R'.
I was disappointed he fell sick last year and couldn't give a talk. The impetus for
Syberia: R workflows are loosely organized scripts, which are a hairball. Sharing
code and making it reproducible is hard as the number of developers grow.
The solution: impose order via a framework. That would be Syberia. Introduces
'adapters' to abstract away sinking and sourcing of data. And code? So
"everything is a resource". He mentions 'wide' vs. 'narrow' transformations.
Idempotent files? Package dependency management for reproducibility using a
lockbox. Check out his <a href="http://github.com/robertzk/rfinance17">talk</a>, and
go look up <a href="http://syberia.io">Syberia</a>. In talking with Rob, we agree, I think,
that a lot of R programming has a gunslinger nature to it, while the more CS-oriented
people seem to land in Python or Haskell, and so on. Syberia is Rob's answer, and I
think I'm sold.</p>
<h3>Lightning Round.</h3>
<ol>
<li>
<p>Matt Dancho, New Tools for Performing Financial Analysis Within the 'Tidy' Ecosystem.
That would be <code>tidyquant</code>. Apparently 'tidy' is a dirty word in these parts of
Chicago. The idea is to use the flexibility of the
<a href="http://tidyverse.org/">tidyverse</a> and the speed of xts. Keep your eye on this package.</p>
</li>
<li>
<p>Leonardo Silvestri, on <a href="http://www.ztsdb.org/"><code>ztsdb</code></a>, a time-series DBMS for R users.
It seems pretty cool: seamless R integration, C/C++ bindings, and too much to mention in
6 minutes. Take it for a spin in <a href="https://hub.docker.com/r/lsilvest/ztsdb/">docker</a>.</p>
</li>
</ol>
<p>On to drinks! Not at the tower that shall not be named!</p>
<h2>Day Two</h2>
<h3>Lightning Round</h3>
<ol>
<li>
<p>Stephen Bronder: Integrating Forecasting and Machine Learning in the
<a href="https://github.com/mlr-org/mlr"><code>mlr</code></a> Framework.
I am sad to say I overslept and missed this one. I believe the talks were taped,
(they certainly were <a href="https://event.microsoft.com/events/2017/1705/RFinance/">live streamed</a>)
so I am going to go back and look this one up, since <code>mlr</code> seems like a nice framework
for ML work.</p>
</li>
<li>
<p>Leopoldo Catania: Generalized Autoregressive Score Models in R: The <code>GAS</code> Package.
I have never seen the Generalized Autoregressive Score idea before. The idea seems to be
to compute the likelihoodist's Score function at each time point, and then compute a kind
of moving average of scores as an estimate of the underlying population parameter of interest.
I do want to follow up on this. See the <a href="https://github.com/LeopoldoCatania/GAS">package</a>.</p>
</li>
<li>
<p>Guanhao Feng: Regularizing Bayesian Predictive Regressions. Again, too early in the morning
for me, but it looked like Bayesian something something.</p>
</li>
<li>
<p>Jonas Rende on <a href="https://cran.r-project.org/web/packages/partialCI/index.html"><code>partialCI</code></a>: An R package for the analysis of
partially cointegrated time series. Partial cointegration generalizes the notion of
cointegration to allow the residual series to contain both mean-reverting and random-walk
components. See <a href="https://ideas.repec.org/p/zbw/iwqwdp/052017.html">the paper</a>.</p>
</li>
<li>
<p>Carson Sievert: Interactive visualization for multiple time series. I was hoping for
new methods of visualizing <em>multivariate</em> time series. The talk was nice, however, covering
some tools for plotting, via <a href="https://cpsievert.github.io/plotly_book/">plotly</a>, time series.
One upside is that the underlying engine, WebGL, can comfortably handle many plot points.
(The downside is that it runs in a web browser, so will never work in the constraints of
my professional work environment.)</p>
</li>
</ol>
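<p>To make the score-driven idea from the GAS talk concrete, here is a toy filter of my own construction
(Python, not anything from the <code>GAS</code> package): for a time-varying Gaussian mean with known variance,
the score of the log-likelihood in the mean is just the scaled residual, and the parameter is updated by
a damped moving average of scores.</p>

```python
import numpy as np

def gas_mean_filter(y, omega=0.0, a=0.1, b=0.98, sigma2=1.0, mu0=0.0):
    """Score-driven update: mu_{t+1} = omega + a * score_t + b * mu_t,
    with score_t = (y_t - mu_t) / sigma2."""
    mu = np.empty(len(y))
    m = mu0
    for t, yt in enumerate(y):
        mu[t] = m
        m = omega + a * (yt - m) / sigma2 + b * m
    return mu

rng = np.random.default_rng(0)
# a level shift halfway through the sample
true_mu = np.concatenate([np.zeros(500), np.full(500, 2.0)])
y = true_mu + rng.standard_normal(1000)
mu = gas_mean_filter(y)
assert abs(mu[:400].mean()) < 0.5        # tracks zero before the break
assert mu[900:].mean() > 1.0             # moves up after the break
```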
<h3>Talks</h3>
<p>In one of the more memorable moments of the conference, Emanuele Guidotti presented a movie of
using the <code>yuimaGUI</code> package in the style of 'The Matrix'.
<a href="https://cran.r-project.org/web/packages/yuimaGUI/index.html"><code>yuimaGUI</code></a>, a wrapper on
<code>yuima</code>, is intended to be enabling to users, rather than just another interface. I am married
to my <code>vim</code> setup, so am unlikely to switch, but I am curious about <code>yuima</code>, which seems to be
(yet another) framework for defining models, estimating parameters, simulating models, and so
on.</p>
<p>Daniel Kowal then gave a talk on Bayesian Multivariate Functional Dynamic Linear Model,
with code in the <a href="https://github.com/drkowal/FDLM"><code>FDLM</code> package</a>. The idea appears to
be to model evolution in observed <em>functional</em> relationships. The running example was the
U.S. Yield curve, modeled as a function of time to maturity, with that function changing
over time. There is an added constraint of some kind of autocorrelation of the functional
relationship, and its decomposition into basis functions. Besides having the blessing of
the Bayesian Elders, I was a bit confused why you would run to a Gibbs sampler for this kind
of thing, when it seems you could set up a simple matrix factorization with regularization
and call it a day. </p>
<p>Break</p>
<hr>
<p>Jason Foster on Scenario Analysis of Risk Parity using <code>RcppParallel</code>. OK, this one
kind of burned my butter. Jason wrote and maintains the <code>roll</code> package for rolling
computations using <code>RcppParallel</code>. As was noted in the talk, and I have lamented
<a href="https://gist.github.com/shabbychef/04dfc10edd1cb492301b991b233788f2">elsewhere</a>,
the runtimes of <code>roll</code> functions grow with window size, and they should not. On the
upside, I should say that <code>roll</code> functions are 'obviously' correct, since they apply
the correct upstream functions on windowed views of the vector. But since they do not
reuse computations, they are slower than they should be.<br>
(Full disclosure/advertisement: I maintain <a href="https://github.com/shabbychef/fromo"><code>fromo</code></a>, which
is an alternative to <code>roll</code>.)
Already triggered, I nearly had an aneurysm when Jason trotted out
Euler's decomposition as an expression of the risk of each asset in a portfolio.
(<em>c.f.</em> <a href="http://www.optimization-online.org/DB_FILE/2013/10/4089.pdf">Bai, Scheinberg, Tutuncu</a>,
oh hey, my former Optimization prof!) While this is fine for some uses, it is <strong>not</strong>
the "risk in each asset". It can be <em>negative</em>. You do not have -8% risk in an asset,
that makes no goddamned sense. </p>
<p>If I can calm down for a moment, this is a problem I have considered before. If you hold
a dollarwise portfolio <span class="math">\(w\)</span> on assets with covariance <span class="math">\(\Sigma\)</span>, your volatility is
<span class="math">\(\sigma=\sqrt{w^{\top}\Sigma w}\)</span>. Then if you let <span class="math">\(\Sigma^{1/2}\)</span> be the symmetric square root
of <span class="math">\(\Sigma\)</span>, define <span class="math">\(r=\left|\Sigma^{1/2}w\right|\)</span>. I will claim that the elements of <span class="math">\(r\)</span> are the
'risks' of each asset: we have <span class="math">\(\sigma = \sqrt{r^{\top} r}\)</span>, or the total volatility is
the length of <span class="math">\(r\)</span>. The elements of <span class="math">\(r\)</span> are positive, so no negative risks.
And furthermore, when you use the symmetric square root, the answers
you get are not dependent on the ordering of the assets in your vector (which is
arbitrary), as would be the case for a Cholesky factorization. <em>That's</em> how you define
the risk of each asset. (Sorry for the rant.)</p>
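<p>The claims in that paragraph are cheap to verify numerically. A sketch (Python, random covariance):
the per-asset risks are nonnegative, their Euclidean length recovers the portfolio volatility, and
permuting the assets just permutes the risks.</p>

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((3, 3))
Sigma = A @ A.T                           # a PSD covariance matrix
w = np.array([0.5, -0.3, 0.8])            # dollarwise portfolio

def sym_sqrt(S):
    """Symmetric square root via eigendecomposition."""
    vals, vecs = np.linalg.eigh(S)
    return vecs @ np.diag(np.sqrt(vals)) @ vecs.T

r = np.abs(sym_sqrt(Sigma) @ w)           # per-asset 'risks', nonnegative
sigma = np.sqrt(w @ Sigma @ w)
assert (r >= 0).all()
assert np.isclose(np.sqrt(r @ r), sigma)  # length of r is the total volatility

# ordering invariance: permuting assets permutes r the same way
p = np.array([2, 0, 1])
rp = np.abs(sym_sqrt(Sigma[np.ix_(p, p)]) @ w[p])
assert np.allclose(rp, r[p])
```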
<p><strong>EDIT</strong> My rant here was undeservedly pissy. Rather than pretend I didn't write it, I will
leave it here, but will write a more restrained and balanced followup, and invite Jason to
give his view. I'll put the link here when it exists.</p>
<h3>Lightning Round</h3>
<ol>
<li>
<p>Michael Weylandt: Convex Optimization for High-Dimensional Portfolio Construction
The idea is to recast Portfolio optimization with an <span class="math">\(L_0\)</span> constraint (which is NP hard)
as a statistical problem (presumably also hard), which can be transformed into a LASSO,
which is <span class="math">\(L_1\)</span> regularization.</p>
</li>
<li>
<p>Lukas Elmiger: Risk Parity Under Parameter Uncertainty.
This was a comparison of common portfolio construction techniques on returns from the
S&P 500 universe, and from global futures data. The gist appears to be to measure
the uncertainty in the outcome as a function of the uncertainty in the portfolio.</p>
</li>
<li>
<p><a href="https://quantstrattrader.wordpress.com/">Ilya Kipnis</a>: Global Adaptive Asset Allocation,
and the Possible End of Momentum. Ilya presented a strategy on a list of ETFs given
in Meb Faber's 2015 book, tuning the lookback window for the momentum computation and the risk
estimation, finding that the former affects performance while the latter does not.
That said, there is a huge gap between in- and out-of-sample Sharpe.</p>
</li>
<li>
<p>Vyacheslav Arbuzov traveled from Siberia to give a talk on a
Dividend strategy. The interesting idea here is looking at price changes
around the ex-divided date for an equity. It turns out the gap in stock
prices are not explained by dividend size, rather they find that the
stock price drops less than the dividend payment would suggest. This
implies a trading strategy, which he analyzes. There is uncertainty
around the execution costs and so on, but a nice talk.</p>
</li>
</ol>
<!-- He finds a negative correlation between gap 'returns' and the dividend yield -->
<ol start="5">
<li>Nabil Bouamara on The Alpha and Beta of Equity Hedge UCITS Funds.
I am late to the party here, but 'UCITs' are European Mutual funds
which follow guidelines for transparency, liquidity, risk management,
regulatory oversight and so on. Nabil used them as building blocks (like ETFs, say)
for a fund of funds portfolio. There was a focus on false discovery.</li>
</ol>
<h3>Keynote</h3>
<p>Dave DeMers gave a talk on Risk Fast and Slow. This talk was chock-full of
war stories ('bedtime stories,' as the speaker put it) about risk management
at a few very large funds over the last 20 years. He talked about the
ways of forecasting risk in real time: liquidity risk, a kind of crowded-market
risk, and so on. He gave some interesting heuristics which could/should be
used by a fund. One of these is the 'Absorption ratio', which is the ratio of
short to long term "percent variance explained by first <span class="math">\(k\)</span> PCA factors",
or something along those lines. He mentioned how his fund had started to de-lever
before the 'Quantquake of 2007', though not quickly enough.
As a funny aside note, at my first hedge fund job, we launched our quant fund on
August 1, 2007, taking off right <em>into</em> the shitstorm. Good times.</p>
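<p>The "percent variance explained by the first <span class="math">\(k\)</span> PCA factors" piece of the absorption
ratio is easy to sketch (Python, toy data of my own; DeMers's actual ratio compares this quantity over short
and long horizons, which I omit here):</p>

```python
import numpy as np

def absorption(returns, k):
    """Fraction of total variance captured by the top-k principal components."""
    vals = np.linalg.eigvalsh(np.cov(returns, rowvar=False))  # ascending
    return vals[-k:].sum() / vals.sum()

rng = np.random.default_rng(9)
# toy returns: one common factor plus idiosyncratic noise
f = rng.standard_normal((500, 1))
eps = rng.standard_normal((500, 10))
tight = 2 * f @ np.ones((1, 10)) + eps    # strongly coupled market
loose = eps                               # no common factor

# a crowded market absorbs much more variance into its first factor
assert absorption(tight, 1) > absorption(loose, 1)
```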
<p>Lunch. A different lunch than last year. I somehow got the low sodium option.
The dessert was coconut gloop with fruit, which is actually not too bad. And more
coffee.</p>
<h2>Day Two, Afternoon</h2>
<p>Matthew Dixon on
<a href="https://github.com/mfrdixon/MLEMVD"><code>MLEMVD</code></a>, an R Package for Maximum Likelihood
Estimation of Multivariate Diffusion Models. Matthew is a math guy, my people, and he
talked about techniques for modeling multivariate time-homogeneous stochastic
diffusions. I don't think I caught all the details: there was something interesting
about using the transition function directly, as developed by Aït-Sahalia, instead of
relying on least squares (in general the errors do not seem to follow a nice form,
so summed quadratic error is not necessarily the objective).</p>
<hr>
<p>Jonathan Regenstein on Reproducible Finance with R: A Global ETF Map.
This is a shiny app with clickable maps and time series,
then some stuff about code blocks and markdown. As a side
note, there is a lot of good work going on around mapping
and geospatial data in R. The speaker brought out the
<code>naturalEarth</code> package, which apparently simplifies making
maps. Also, as a mental note,
I should check out the <a href="https://github.com/edzer/sfr"><code>sf</code> package</a> for
representing 'simple features'.</p>
<hr>
<p>David Ardia on Markov-Switching GARCH Models in R via the <code>MSGARCH</code> package.
MSGARCH generalizes GARCH, which models volatility clustering via
conditional variance, by adding regime switching, or 'Markov-Switching',
to deal with structural breaks.
The vanilla GARCH is then declared a 'single regime' model. David compares
MS and SR, built using MCMC and MLE, and using different models for the
underlying (Normal? t for fat tails? skewed t?) for some returns data.
MSGARCH comes out looking good compared to brand Z. </p>
<p>Keven Bluteau followed up with a talk describing the capabilities
of the <a href="https://cran.r-project.org/web/packages/MSGARCH/index.html"><code>MSGARCH</code></a> package.
See also the <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2845809">paper</a>.</p>
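<p>As a toy illustration of why regime switching helps with volatility clustering, here is a two-state
Markov chain driving the volatility of simulated returns (Python, my own caricature: constant volatility
per regime, so far simpler than a real MSGARCH):</p>

```python
import numpy as np

rng = np.random.default_rng(11)
sig = np.array([0.01, 0.03])              # calm vs stressed volatility
P = np.array([[0.99, 0.01],               # persistent transition matrix
              [0.02, 0.98]])

n, s = 5000, 0
states = np.empty(n, dtype=int)
for t in range(n):
    states[t] = s
    s = rng.choice(2, p=P[s])             # Markov step
r = rng.standard_normal(n) * sig[states]  # regime-dependent returns

# realized volatility differs sharply across the two regimes
assert r[states == 1].std() > 2 * r[states == 0].std()
```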
<h3>Lightning Round</h3>
<ol>
<li>
<p>Riccardo Porreca. The title was 'Efficient, Consistent and Flexible Credit Default
Simulation', but the real action in this talk was the use of a new Pseudo Random
Number Generator, 'TRNG', which is wrapped by
the <a href="https://github.com/miraisolutions/rTRNG"><code>rTRNG</code> package</a>. This fancy
PRNG was needed to allow parallelized Monte Carlo simulations which are
reproducible, which is a neat trick.</p>
</li>
<li>
<p>Maisa Aniceto: Machine Learning and the Analysis of Consumer Lending.
The speaker applied a bunch of ML techniques on the classification
problem of predicting default in consumer lending. She used a database
of around 100K consumers, with 21 independent variables, from a
Brazilian financial institution. She compared logistic regression,
RF, bagging, SVM, boosting. Conclusion: the ensemble methods work
better than logistic, though not by terribly much.</p>
</li>
</ol>
<h3>Talks</h3>
<p>David Smith: Detecting Fraud at 1 Million Transactions per Second.
This was a demo of the R computation stack provided by RevolutionR,
integrated with the MS R server, some kind of Data Science VM. I'm not
a Windows person, and I cannot use frameworks like this in my professional
life, but it was interesting nonetheless.
For the finale, David parallelized his computations by using three
remote desktop sessions.</p>
<p>Break </p>
<hr>
<p>Thomas Harte on the <code>PE</code> package, 'Modeling private equity in the 21st century.'
PE is the high-yield part of alternative assets. Total AUM in PE is around
US$3 trillion. <em>But</em> few quants play in this space.
A rundown of PE: limited partners (LP) invest in funds managed by the
General Partners (GP), who invest in portfolio companies; talked about
the timeline of PE funds.
The landscape of PE data is: TVE/Thomson ONE was the former gold standard;
Cepres is a question mark;
Cambridge Associates provides indices;
and Preqin, who use FOIAs to get data.
The GPs use discounted cash flow (DCF) models, and report modeled NAV.
LPs build models from scratch, if that.
The Yale Endowment Model is the model used by secondary parties.</p>
<p>Why is PE difficult: PE investment are long term and illiquid, with
fund lifetimes around 10 years or so. In addition to market and
liquidity risks, there is the funding risk: the LPs are on the hook
to pay cash in (up to some amount) to the GPs when requested. This
cash piece apparently complicates computation of VaR. So there
are adjustments to the VaR to deal with liquidity and so on.</p>
<p>He does some computations to aggregate the various kinds of fees
which have to be added on to any VaR computation. As a mental note:
investing in PE seems like a good way to lose your lunch money.</p>
<hr>
<p>Guanhao Feng on The Market for English Premier League (EPL) Odds.
Using math to beat bookies in Soccer? Feng wanted to prove you cannot
do this. EPL gambling is a trillion dollar market. He creates
a model for real time odds of game outcomes, and calibrates to
real data. They build a model of the scores of each team as
Poisson processes with different intensities, but some
correlated component. The difference of Poissons is a
<a href="https://en.wikipedia.org/wiki/Skellam_distribution">Skellam</a>, apparently.
This gives the probabilities of outcomes in terms of Skellams.
Calibration is performed by taking the empirical odds,
provided by the bookies, and using the identities of the Skellam
mean and variance to get the intensities. I really enjoyed this
talk. It seemed to follow from the stochastic processes/applied probability
school of thought, rather than, say, a typical statistical framework.</p>
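<p>The calibration step can be sketched with independent Poissons (Python, my own simplification; the talk's
model had a correlated component, which I drop): the Skellam mean is the difference of the intensities and
its variance is their sum, so empirical moments invert directly to the intensities.</p>

```python
import numpy as np

rng = np.random.default_rng(3)
lam_home, lam_away = 1.6, 1.1

# goal difference: the difference of two Poissons is Skellam distributed
diff = rng.poisson(lam_home, 100000) - rng.poisson(lam_away, 100000)

# method of moments: mean = l1 - l2, variance = l1 + l2
m, v = diff.mean(), diff.var()
l1_hat, l2_hat = (v + m) / 2, (v - m) / 2
assert abs(l1_hat - lam_home) < 0.05
assert abs(l2_hat - lam_away) < 0.05
```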
<hr>
<p>Bryan Lewis closes the conference with <em>Project and conquer!</em>
Bryan talked about the idea of projection onto a subspace
(a line in his toy examples), pointing out that points
in a space are at least as far away from each other
as they are in the projection. Leveraging this idea allows
you to quickly perform a number of computations, somewhat
surprisingly.
For example, fast thresholded distances and correlations can
be so computed, apparently in subquadratic time.
As an example (from the <code>tcor</code> vignette),
finding all pairs of columns of a 1K x 20K matrix
with greater than 0.99 Pearson correlation requires more than
200 Gflop via brute force, but around 1 Gflop using <code>tcor</code>.
Bryan has extended this analysis to cases of <em>millions</em> of columns.
He also mentioned Krylov subspaces, the span of
<span class="math">\(\beta, X\beta, X^2\beta, \ldots, X^k\beta\)</span>, which is a real
old school Numerical Analysis trick (IIRC, they are used in
the analysis of the conjugate gradient method).
Look up the <a href="https://github.com/bwlewis/tcor"><code>tcor</code> package</a> for
more information.</p>
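<p>The projection trick itself can be demonstrated in a few lines of my own (Python; <code>tcor</code> is far
more refined): projection onto a unit vector never increases distances, so projected gaps give a cheap lower
bound that prunes candidate pairs before any full distance is computed.</p>

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((200, 10))        # 200 points in 10 dimensions

u = rng.standard_normal(10)
u /= np.linalg.norm(u)
proj = X @ u                              # project onto a 1-d subspace

# projection onto a unit vector is 1-Lipschitz: projected gaps are lower bounds
gaps = np.abs(proj[:, None] - proj[None, :])
dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
assert (gaps <= dists + 1e-9).all()

# so pairs with projected gap > t can be discarded without computing distances
t = 2.0
kept = gaps <= t
assert (dists[~kept] > t).all()           # nothing with true distance <= t was pruned
```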
<p>I had to leave after that to catch the flight home.
In all, another great conference. I hope to be back next year.</p>
<p><strong> EDITS </strong></p>
<ol>
<li><em>Mon May 22 2017 21:05:22</em> edit affiliation of Matteo Crimella.
Add note regarding 'Risk Parity', with intention of writing
further blog post.</li>
</ol>
<!-- modelines -->
<!-- vim:ts=2:sw=2:tw=96:fdm=marker:syn=markdown:ft=markdown:ai:nocin:nu:fo=ncroqlt:cms=<!--%s-->
<script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#$@#')) {
var align = "center",
indent = "0em",
linebreak = "false";
if (false) {
align = (screen.width < 768) ? "left" : align;
indent = (screen.width < 768) ? "0em" : indent;
linebreak = (screen.width < 768) ? 'true' : linebreak;
}
var mathjaxscript = document.createElement('script');
mathjaxscript.id = 'mathjaxscript_pelican_#%@#$@#';
mathjaxscript.type = 'text/javascript';
mathjaxscript.src = 'https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML';
mathjaxscript[(window.opera ? "innerHTML" : "text")] =
"MathJax.Hub.Config({" +
" config: ['MMLorHTML.js']," +
" TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' } }," +
" jax: ['input/TeX','input/MathML','output/HTML-CSS']," +
" extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," +
" displayAlign: '"+ align +"'," +
" displayIndent: '"+ indent +"'," +
" showMathMenu: true," +
" messageStyle: 'normal'," +
" tex2jax: { " +
" inlineMath: [ ['\\\\(','\\\\)'] ], " +
" displayMath: [ ['$$','$$'] ]," +
" processEscapes: true," +
" preview: 'TeX'," +
" }, " +
" 'HTML-CSS': { " +
" styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! important'} }," +
" linebreaks: { automatic: "+ linebreak +", width: '90% container' }," +
" }, " +
"}); " +
"if ('default' !== 'default') {" +
"MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" +
"var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" +
"VARIANT['normal'].fonts.unshift('MathJax_default');" +
"VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
"VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
"});" +
"MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" +
"var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" +
"VARIANT['normal'].fonts.unshift('MathJax_default');" +
"VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
"VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
"});" +
"}";
(document.body || document.getElementsByTagName('head')[0]).appendChild(mathjaxscript);
}
</script>Calendar plots in ggplot2.2017-05-18T17:21:54-07:002017-05-18T17:21:54-07:00Steven E. Pavtag:www.gilgamath.com,2017-05-18:/calendar-plots-ggplot2.html<p>I like the calendar 'heatmap' plots of commits you can see on
<a href="https://github.com/shabbychef">github user pages</a>, and wanted to play around with some.
Of course, if I just wanted to make some plots, I could have just googled around, and then
followed <a href="http://r-statistics.co/Top50-Ggplot2-Visualizations-MasterList-R-Code.html#Calendar%20Heat%20Map">this recipe</a>,
or maybe used the <a href="https://github.com/ramnathv/rChartsCalmap">rChartsCalmap package</a>.
Instead I set out, as an exercise, to make my own using ggplot2. </p>
<!-- PELICAN_END_SUMMARY -->
<p>For data, I am using the daily GHCND observations data for station <code>USC00047880</code>, which is
located in the San Rafael, CA, Civic Center. I downloaded this data as part of a project
to join weather data to campground data (yes, it's been done before), directly from
the <a href="ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily">NOAA FTP site</a>, then read the fixed width
file. I then processed the data, subselected to 2016 and beyond, and converted the units.
I am left with a dataframe of dates, the element name, and the value, which is a temperature
in Celsius. The first ten values I show here:</p>
<table>
<thead>
<tr>
<th align="left">date</th>
<th align="left">element</th>
<th align="right">value</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">2016-01-01</td>
<td align="left">TMAX</td>
<td align="right">9.4</td>
</tr>
<tr>
<td align="left">2016-01-01</td>
<td align="left">TMIN</td>
<td align="right">0.0</td>
</tr>
<tr>
<td align="left">2016-01-02</td>
<td align="left">TMAX</td>
<td align="right">10.0</td>
</tr>
<tr>
<td align="left">2016-01-02</td>
<td align="left">TMIN</td>
<td align="right">3.9</td>
</tr>
<tr>
<td align="left">2016-01-03</td>
<td align="left">TMAX</td>
<td align="right">11.7</td>
</tr>
<tr>
<td align="left">2016-01-03</td>
<td align="left">TMIN</td>
<td align="right">6.7</td>
</tr>
<tr>
<td align="left">2016-01-04</td>
<td align="left">TMAX</td>
<td align="right">12.8</td>
</tr>
<tr>
<td align="left">2016-01-04</td>
<td align="left">TMIN</td>
<td align="right">6.7</td>
</tr>
<tr>
<td align="left">2016-01-05</td>
<td align="left">TMAX</td>
<td align="right">12.8</td>
</tr>
<tr>
<td align="left">2016-01-05</td>
<td align="left">TMIN</td>
<td align="right">8.3</td>
</tr>
</tbody>
</table>
<p>Here is the code to produce the heatmap itself. I first use the <code>date</code> field
to compute the x axis labels and locations: the dates are converted essentially
to 'Julian' days since January 4, 1970 (a Sunday), then divided by seven to
get a 'Julian' week number. The week number containing the tenth of the month is
then set as the location of the month name in the x axis labels. I add years to
the January labels.</p>
<p>I then compute the Julian week number and day number of the week. I create a variable
which alternates between plus and minus one each month, then color the 'grout' between
my tiles in different gray colors to delineate the month boundaries. I then <code>geom_tile</code>
the values (using <code>viridis</code> to get scales visible to the colorblind);
create facet rows for the minimum and maximum temperature;
add the x breaks and labels;
fiddle with the guides and impose a minimal theme;
then set the coordinates to 'fixed' to make the tiles square. Voila:</p>
<div class="highlight"><pre><span></span><span class="kn">library</span><span class="p">(</span>dplyr<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>lubridate<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>viridis<span class="p">)</span>
jules <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>x<span class="p">)</span> <span class="p">{</span> <span class="kp">as.numeric</span><span class="p">(</span>base<span class="o">::</span><span class="kp">julian</span><span class="p">(</span>x<span class="p">,</span>origin<span class="o">=</span><span class="kp">as.Date</span><span class="p">(</span><span class="s">'1970-01-04'</span><span class="p">)))</span> <span class="p">}</span>
<span class="c1"># get weeknumber containing the 10th of each month;</span>
aseq <span class="o"><-</span> <span class="kt">data.frame</span><span class="p">(</span>alldate<span class="o">=</span><span class="kp">seq</span><span class="p">(</span><span class="kp">min</span><span class="p">(</span>pltdat<span class="o">$</span><span class="kp">date</span><span class="p">),</span><span class="kp">max</span><span class="p">(</span>pltdat<span class="o">$</span><span class="kp">date</span><span class="p">),</span>by<span class="o">=</span><span class="m">1</span><span class="p">))</span> <span class="o">%>%</span>
filter<span class="p">(</span>lubridate<span class="o">::</span>day<span class="p">(</span>alldate<span class="p">)</span> <span class="o">==</span> <span class="m">10</span><span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>moname<span class="o">=</span>month<span class="p">(</span>alldate<span class="p">,</span>label<span class="o">=</span><span class="kc">TRUE</span><span class="p">),</span>wnum<span class="o">=</span>jules<span class="p">(</span>alldate<span class="p">)</span><span class="o">/</span><span class="m">7.0</span><span class="p">,</span>yrnum<span class="o">=</span>year<span class="p">(</span>alldate<span class="p">))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>label<span class="o">=</span><span class="kp">ifelse</span><span class="p">(</span>moname<span class="o">==</span><span class="s">'Jan'</span><span class="p">,</span><span class="kp">paste0</span><span class="p">(</span>moname<span class="p">,</span><span class="s">'\n'</span><span class="p">,</span>yrnum<span class="p">),</span><span class="kp">as.character</span><span class="p">(</span>moname<span class="p">)))</span>
<span class="c1"># now the plot itself</span>
ph <span class="o"><-</span> pltdat <span class="o">%>%</span>
mutate<span class="p">(</span>juld<span class="o">=</span>jules<span class="p">(</span><span class="kp">date</span><span class="p">),</span>
mono<span class="o">=</span>month<span class="p">(</span><span class="kp">date</span><span class="p">,</span>label<span class="o">=</span><span class="kc">FALSE</span><span class="p">),</span>
dayname<span class="o">=</span><span class="kp">factor</span><span class="p">(</span><span class="kp">weekdays</span><span class="p">(</span><span class="kp">date</span><span class="p">,</span>abbreviate<span class="o">=</span><span class="kc">TRUE</span><span class="p">),</span>levels<span class="o">=</span><span class="kp">rev</span><span class="p">(</span><span class="kt">c</span><span class="p">(</span><span class="s">'Sun'</span><span class="p">,</span><span class="s">'Mon'</span><span class="p">,</span><span class="s">'Tue'</span><span class="p">,</span><span class="s">'Wed'</span><span class="p">,</span><span class="s">'Thu'</span><span class="p">,</span><span class="s">'Fri'</span><span class="p">,</span><span class="s">'Sat'</span><span class="p">))))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>wnum<span class="o">=</span><span class="kp">floor</span><span class="p">(</span>juld<span class="p">)</span> <span class="o">%/%</span> <span class="m">7</span><span class="p">,</span>
moalt<span class="o">=</span><span class="kp">factor</span><span class="p">(</span><span class="kp">sign</span><span class="p">((</span><span class="m">-1</span><span class="p">)</span><span class="o">^</span>mono<span class="p">)))</span> <span class="o">%>%</span>
ggplot<span class="p">(</span>aes<span class="p">(</span>wnum<span class="p">,</span>dayname<span class="p">,</span>fill<span class="o">=</span>value<span class="p">))</span> <span class="o">+</span>
geom_tile<span class="p">(</span>aes<span class="p">(</span>color<span class="o">=</span>moalt<span class="p">),</span> na.rm<span class="o">=</span><span class="kc">TRUE</span><span class="p">,</span>size<span class="o">=</span><span class="m">0.7</span><span class="p">)</span> <span class="o">+</span>
scale_fill_viridis<span class="p">(</span>option<span class="o">=</span><span class="s">'viridis'</span><span class="p">,</span>na.value<span class="o">=</span><span class="s">'white'</span><span class="p">)</span> <span class="o">+</span>
scale_color_manual<span class="p">(</span>values<span class="o">=</span><span class="kt">c</span><span class="p">(</span><span class="s">'gray40'</span><span class="p">,</span><span class="s">'gray80'</span><span class="p">))</span> <span class="o">+</span>
scale_y_discrete<span class="p">(</span>breaks<span class="o">=</span><span class="kt">c</span><span class="p">(</span><span class="s">'Mon'</span><span class="p">,</span><span class="s">'Wed'</span><span class="p">,</span><span class="s">'Fri'</span><span class="p">))</span> <span class="o">+</span>
scale_x_continuous<span class="p">(</span>breaks<span class="o">=</span>aseq<span class="o">$</span>wnum<span class="p">,</span>labels<span class="o">=</span>aseq<span class="o">$</span>label<span class="p">)</span> <span class="o">+</span>
facet_grid<span class="p">(</span>element <span class="o">~</span> <span class="m">.</span><span class="p">)</span> <span class="o">+</span>
guides<span class="p">(</span>color<span class="o">=</span><span class="s">'none'</span><span class="p">)</span> <span class="o">+</span> theme_minimal<span class="p">()</span> <span class="o">+</span> theme<span class="p">(</span>panel.grid<span class="o">=</span>element_blank<span class="p">())</span> <span class="o">+</span> coord_fixed<span class="p">()</span> <span class="o">+</span>
labs<span class="p">(</span>y<span class="o">=</span><span class="s">''</span><span class="p">,</span>x<span class="o">=</span><span class="s">''</span><span class="p">,</span>fill<span class="o">=</span><span class="s">'temp C'</span><span class="p">,</span>title<span class="o">=</span><span class="s">'Min and Max daily temperature, NOAA station USC00047880, San Rafael, California'</span><span class="p">)</span>
<span class="kp">print</span><span class="p">(</span>ph<span class="p">)</span>
</pre></div>
<p><img src="http://www.gilgamath.com/figure/first_plot-1.png" title="plot of chunk first_plot" alt="plot of chunk first_plot" width="1200px" height="400px" /></p>
<p>Well, that was fun. So much fun, I will run it again on my win rate from
<a href="chesstempo-intro">Chesstempo tactics attempts</a>, which at least might show
some weekday/weekend pattern: </p>
<p><img src="http://www.gilgamath.com/figure/second_plot-1.png" title="plot of chunk second_plot" alt="plot of chunk second_plot" width="1200px" height="400px" /></p>Elo and Draws.2017-05-04T20:57:54-07:002017-05-04T20:57:54-07:00Steven E. Pavtag:www.gilgamath.com,2017-05-04:/elo-ties.html<p>I still had some nagging thoughts after my recent
<a href="elo-distribution">examination of the distribution of Elo</a>. In that
blog post, I recognized that a higher probability of a draw would lead
to tighter standard error around the true 'ability' of a player, as
estimated by an Elo ranking. Without any data, I punted on what that
probability should be. So I decided to look at some real data. </p>
<!-- PELICAN_END_SUMMARY -->
<p>I started working in a risk role about a year ago. Compared to my
previous gig, there is a much greater focus on discrete event
modeling than on continuous outcomes. Logistic regression and
survival analysis are the tools of the trade. However,
financial risk modeling is more complex than the textbook
presentation of these methods. As is chess. A loan holder might
go bankrupt, stop paying, die, <em>etc.</em> Similarly, a chess player
might win, lose or draw.</p>
<p>There are two main ways of approaching multiple outcome discrete
models that leverage the simpler binary models: the <em>competing hazards</em>
view, and the <em>sequential hazards</em> view. Briefly, risk under
competing hazards would be like traversing the Fire Swamp: at any time,
the spurting flames, the lightning sand or the rodents of unusual
size might harm you. The risks all come at you at once.
An example of a sequential hazard is undergoing
surgery: you might die in surgery, and if you survive you might incur
an infection and die of complications; the risks present themselves
conditional on surviving other risks. (Both of these
views are mostly just conveniences, and real risks are never so
neatly defined.)</p>
<p>Returning to chess, I will consider sequential hazards.
Assume two players, and let the difference in true abilities between
them be denoted <span class="math">\(\Delta a\)</span>.
As with Elo, we want the difference in abilities to be such that
the odds that the …</p><p>I still had some nagging thoughts after my recent
<a href="elo-distribution">examination of the distribution of Elo</a>. In that
blog post, I recognized that a higher probability of a draw would lead
to tighter standard error around the true 'ability' of a player, as
estimated by an Elo ranking. Without any data, I punted on what that
probability should be. So I decided to look at some real data. </p>
<!-- PELICAN_END_SUMMARY -->
<p>I started working in a risk role about a year ago. Compared to my
previous gig, there is a much greater focus on discrete event
modeling than on continuous outcomes. Logistic regression and
survival analysis are the tools of the trade. However,
financial risk modeling is more complex than the textbook
presentation of these methods. As is chess. A loan holder might
go bankrupt, stop paying, die, <em>etc.</em> Similarly, a chess player
might win, lose or draw.</p>
<p>There are two main ways of approaching multiple outcome discrete
models that leverage the simpler binary models: the <em>competing hazards</em>
view, and the <em>sequential hazards</em> view. Briefly, risk under
competing hazards would be like traversing the Fire Swamp: at any time,
the spurting flames, the lightning sand or the rodents of unusual
size might harm you. The risks all come at you at once.
An example of a sequential hazard is undergoing
surgery: you might die in surgery, and if you survive you might incur
an infection and die of complications; the risks present themselves
conditional on surviving other risks. (Both of these
views are mostly just conveniences, and real risks are never so
neatly defined.)</p>
<p>Returning to chess, I will consider sequential hazards.
Assume two players, and let the difference in true abilities between
them be denoted <span class="math">\(\Delta a\)</span>.
As with Elo, we want the difference in abilities to be such that
the odds that the first player wins a match between them
is <span class="math">\(10^{\Delta a / 400}\)</span>. But we have to consider ties. So
instead, let <span class="math">\(E_A\)</span> be the <em>expected value</em> of the game from
the viewpoint of player <span class="math">\(A\)</span>. That is, 1 times the probability
of an outright win for <span class="math">\(A\)</span>, plus one-half the probability of a draw.
Now we want
</p>
<div class="math">$$
\frac{E_A}{1 - E_A} = 10^{\Delta a /400}.
$$</div>
<p>Now suppose that <span class="math">\(p_1\)</span> is the probability of an outright victory for
<span class="math">\(A\)</span>, and <span class="math">\(p_2\)</span> is the probability of a draw, <em>conditional on</em> <span class="math">\(A\)</span> <em>not
achieving victory.</em> These two are related
to <span class="math">\(E_A\)</span> by
</p>
<div class="math">$$
E_A = p_1 + \frac{1}{2} \left(1 - p_1\right) p_2
$$</div>
<p>
Moreover, the <em>variance</em> of the outcome is also a function of <span class="math">\(p_1, p_2\)</span>.
We need the variance to talk about the noise in the Elo rating.</p>
<p>If one had the outcomes from a bunch of games where the true
abilities of the players were known, one would then proceed as follows:</p>
<ol>
<li>Use logistic regression to estimate <span class="math">\(\log \frac{p_1}{1-p_1}\)</span> as a function
of the abilities of the players. The 'True' events are those where the
first player wins, while 'False' are losses and draws.</li>
<li>Use logistic regression to estimate <span class="math">\(\log \frac{p_2}{1-p_2}\)</span> as a function
of the abilities of the players. The 'True' events are those where
the game was a draw, while the 'False' events are those where the first
player loses. This second logistic regression has a smaller sample size
than the first, and they have correlated errors.</li>
</ol>
<p>Note, however, that if we are to keep the calibration of <span class="math">\(E_A\)</span> in terms of
<span class="math">\(\Delta a\)</span>, we don't actually have to perform the second logistic regression.
That is, if <span class="math">\(E_A\)</span> and <span class="math">\(p_1\)</span> are known, we can compute <span class="math">\(p_2\)</span>.</p>
<h2>Data Please</h2>
<p>Enough of the talky bits, let's look at some data. I put together
<a href="https://github.com/shabbychef/chessdata">some scripts</a> for
downloading, extracting and processing some pgn data from
the <a href="http://www.top-5000.nl/dl">top-5000 site</a>. The upstream data
has the outcomes of nearly 2 <em>million</em> games played. I restrict
my attention to games where the Elo of both players was measured
at the time the match was played, which brings my sample down to
1,179,200 games. Here I load the
<a href="../data/millionbase-2.22_summa.csv.gz">data</a>, then perform
some transforms to compute the mean 'level' of the game (Grandmaster,
FIDE master, Expert, or other) based on average Elo, and then
detect the win and tie conditions.</p>
<div class="highlight"><pre><span></span><span class="kp">suppressMessages</span><span class="p">({</span>
<span class="kn">library</span><span class="p">(</span>readr<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>dplyr<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>tidyr<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>forcats<span class="p">)</span>
<span class="p">})</span>
indat <span class="o"><-</span> readr<span class="o">::</span>read_csv<span class="p">(</span><span class="s">'../data/millionbase-2.22_summa.csv.gz'</span><span class="p">,</span>
col_types<span class="o">=</span>cols<span class="p">(</span>Year <span class="o">=</span> col_integer<span class="p">(),</span>
deltaElo <span class="o">=</span> col_integer<span class="p">(),</span>
meanElo <span class="o">=</span> col_integer<span class="p">(),</span>
result <span class="o">=</span> col_double<span class="p">()))</span>
sdat <span class="o"><-</span> indat <span class="o">%>%</span>
filter<span class="p">(</span><span class="o">!</span><span class="kp">is.na</span><span class="p">(</span>deltaElo<span class="p">),</span><span class="o">!</span><span class="kp">is.na</span><span class="p">(</span>meanElo<span class="p">),</span><span class="o">!</span><span class="kp">is.na</span><span class="p">(</span>result<span class="p">))</span> <span class="o">%>%</span>
filter<span class="p">(</span>deltaElo <span class="o"><=</span> <span class="m">400</span><span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>skills<span class="o">=</span><span class="kp">ifelse</span><span class="p">(</span>meanElo <span class="o">></span> <span class="m">2500</span><span class="p">,</span><span class="s">"GM"</span><span class="p">,</span> <span class="c1"># the more tidy way is via case_when</span>
<span class="kp">ifelse</span><span class="p">(</span>meanElo <span class="o">></span> <span class="m">2300</span><span class="p">,</span><span class="s">"FM"</span><span class="p">,</span>
<span class="kp">ifelse</span><span class="p">(</span>meanElo <span class="o">></span> <span class="m">2000</span><span class="p">,</span><span class="s">"Expert"</span><span class="p">,</span><span class="s">"other"</span><span class="p">))))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>skills<span class="o">=</span>forcats<span class="o">::</span>fct_reorder<span class="p">(</span><span class="kp">factor</span><span class="p">(</span>skills<span class="p">),</span>meanElo<span class="p">,</span><span class="m">.</span>desc<span class="o">=</span><span class="kc">TRUE</span><span class="p">))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>is_tie<span class="o">=</span><span class="p">(</span>result<span class="o">==</span><span class="m">0.5</span><span class="p">),</span>
is_win<span class="o">=</span><span class="p">(</span>result<span class="o">==</span><span class="m">1.0</span><span class="p">))</span>
</pre></div>
<p>Now I plot the data. I use a non-parametric fit to give a smooth estimate of
the probability of a win, and the probability of a tie,
from the point of view of the player with the higher Elo rating,
as a function of Elo. I plot facets for the four different 'levels' of play.
Note that the probability of a tie shown here is <em>unconditional</em>: it is not
conditioned on the first player failing to win.</p>
<div class="highlight"><pre><span></span><span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>scales<span class="p">)</span>
<span class="c1"># scales: elo logit</span>
elogit <span class="o"><-</span> scales<span class="o">::</span>trans_new<span class="p">(</span><span class="s">'inverse elo logit'</span><span class="p">,</span><span class="kr">function</span><span class="p">(</span>x<span class="p">)</span> <span class="p">{</span> <span class="m">1</span> <span class="o">/</span> <span class="p">(</span><span class="m">1</span> <span class="o">+</span> <span class="kp">exp</span><span class="p">(</span><span class="o">-</span>x<span class="o">/</span><span class="m">400</span><span class="p">))</span> <span class="p">},</span><span class="kr">function</span><span class="p">(</span>y<span class="p">)</span> <span class="p">{</span> <span class="m">400</span><span class="o">*</span><span class="kp">log</span><span class="p">(</span>y<span class="o">/</span><span class="p">(</span><span class="m">1</span><span class="o">-</span>y<span class="p">))</span> <span class="p">})</span>
<span class="c1"># plot it</span>
ph <span class="o"><-</span> sdat <span class="o">%>%</span>
tidyr<span class="o">::</span>gather<span class="p">(</span>key<span class="o">=</span><span class="s">'iswhat'</span><span class="p">,</span>value<span class="o">=</span><span class="s">'yesno'</span><span class="p">,</span>is_tie<span class="p">,</span>is_win<span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>iswhat<span class="o">=</span><span class="kp">gsub</span><span class="p">(</span><span class="s">'_'</span><span class="p">,</span><span class="s">' '</span><span class="p">,</span>iswhat<span class="p">),</span>yesno<span class="o">=</span><span class="kp">as.numeric</span><span class="p">(</span>yesno<span class="p">))</span> <span class="o">%>%</span>
ggplot<span class="p">(</span>aes<span class="p">(</span>x<span class="o">=</span>deltaElo<span class="p">,</span>y<span class="o">=</span>yesno<span class="p">,</span>color<span class="o">=</span>iswhat<span class="p">))</span> <span class="o">+</span>
stat_smooth<span class="p">()</span> <span class="o">+</span>
facet_grid<span class="p">(</span>skills <span class="o">~</span> <span class="m">.</span><span class="p">)</span> <span class="o">+</span>
scale_x_continuous<span class="p">(</span>trans<span class="o">=</span>elogit<span class="p">)</span> <span class="o">+</span>
labs<span class="p">(</span>x<span class="o">=</span><span class="s">'delta elo'</span><span class="p">,</span>y<span class="o">=</span><span class="s">'probability'</span><span class="p">,</span>color<span class="o">=</span><span class="s">'outcome'</span><span class="p">)</span>
<span class="kp">print</span><span class="p">(</span>ph<span class="p">)</span>
</pre></div>
<p><img src="http://www.gilgamath.com/figure/plot_data-1.png" title="plot of chunk plot_data" alt="plot of chunk plot_data" width="1000px" height="750px" /></p>
<p>The pattern you see is an increased chance of a draw at higher levels of play.
For two equally matched Grandmasters, the probability of a tie is even
greater than one half. This should match one's experience observing
championship tournaments, where it feels like all the matches end in a draw.
It is not clear which way the causality goes with this: it could be that
players more likely to draw will experience fewer hits to their Elo and so make it
to the Grandmaster level; the less cynical interpretation is that better
players are better able to avoid a loss. Maybe those aren't really different
explanations for the effect.</p>
<p>Whatever the explanation, this does introduce a complication into the rating
systems: previously one only needed the <em>difference</em> in abilities to estimate
<span class="math">\(E_A\)</span>. Thus a rating could only be defined up to some arbitrary additive
constant (kind of like the operation of taking an indefinite integral). The
additive constant problem was 'solved' by starting all players at 1500 and
letting them basically fight each other for Elo points. This seems insensitive
to changes in the overall pool (causing inflation or deflation, perhaps),
and does not seem like a good way to compare players across eras.</p>
<p>However, the plot above seems to suggest there is a measurable effect that
depends not on the <em>difference</em> in abilities, but on the <em>average</em> ability.
Using this marker would in effect 'anchor' a rating system, a feature that Elo
lacks. However, it presents a degree of freedom that we would have to
pin down somehow. That is, just as Elo was defined, somewhat arbitrarily, so that
odds scale by a decade (a factor of ten) every 400 points, we can define our anchor point based on a certain
probability of a tie. From eyeballing the plot it would be something like
the following: "a one half probability of an outright win for either player
occurs when two players both of rating 2400 play."</p>
<p>I shudder to think of the perverse incentives that might arise from players
being rated under such a system, as it might encourage more junior players
to settle for a draw if it would lift both their ratings. This is a topic
for another time. First let us see if these ideas jibe with the data.</p>
<h2>Next Logistic</h2>
<p>Here I perform the first logistic regression, and print its summary. </p>
<div class="highlight"><pre><span></span>mod1 <span class="o"><-</span> glm<span class="p">(</span>is_win <span class="o">~</span> deltaElo <span class="o">+</span> meanElo<span class="p">,</span>data<span class="o">=</span>sdat<span class="p">,</span>family<span class="o">=</span>binomial<span class="p">())</span>
<span class="kp">print</span><span class="p">(</span><span class="kp">summary</span><span class="p">(</span>mod1<span class="p">))</span>
</pre></div>
<div class="highlight"><pre><span></span>##
## Call:
## glm(formula = is_win ~ deltaElo + meanElo, family = binomial(),
## data = sdat)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.09 -1.03 -0.79 1.13 1.71
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.019787 0.026946 -0.73 0.46
## deltaElo 0.006867 0.000024 286.22 <2e-16 ***
## meanElo -0.000421 0.000011 -38.17 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1574897 on 1137821 degrees of freedom
## Residual deviance: 1465087 on 1137819 degrees of freedom
## AIC: 1465093
##
## Number of Fisher Scoring iterations: 4
</pre></div>
<div class="highlight"><pre><span></span><span class="c1"># sequential hazard:</span>
<span class="c1"># mod2 <- glm(is_tie ~ deltaElo + meanElo,data=sdat %>% filter(!is_win),family=binomial())</span>
is_2400 <span class="o"><-</span> <span class="p">(</span><span class="kp">log</span><span class="p">(</span><span class="m">1</span><span class="o">/</span><span class="m">3</span><span class="p">)</span> <span class="o">-</span> coef<span class="p">(</span>mod1<span class="p">)[</span><span class="s">'(Intercept)'</span><span class="p">])</span> <span class="o">/</span> <span class="p">(</span>coef<span class="p">(</span>mod1<span class="p">)[</span><span class="s">'meanElo'</span><span class="p">])</span>
</pre></div>
<p>We see that both the difference in ability and the mean ability
are significant in this logistic regression. The sign on the mean-ability
coefficient is negative: better players are less likely to secure an outright
win, all else equal, but the effect is small. We can compute the 'anchor'
point by finding the rating at which the probability of a win by the first
player is exactly one quarter: this corresponds to one-to-three odds, and
we estimate that anchor point as an Elo of roughly 2560, not terribly far
from our rough estimate of 2400.</p>
<p>This suggests the following definitions:
</p>
<div class="math">$$
p_1 = \frac{1}{1 + e^{0.0198 - 0.00687 \Delta_a + 0.000421 \mu_a}},
$$</div>
<p>
where <span class="math">\(\mu_a\)</span> is the average ability. Then define <span class="math">\(p_2\)</span> as
</p>
<div class="math">$$
p_2 = 2 \frac{E_A - p_1}{1 - p_1},
$$</div>
<p>
and, as always,
</p>
<div class="math">$$
E_A = \frac{1}{1 + 10^{- \Delta_a / 400}}.
$$</div>
<p>The variance of the outcome, from the first player's perspective, is then
</p>
<div class="math">$$
p_1 + \frac{1}{4} (1-p_1)p_2 - E_A^2.
$$</div>
<p>
This number can then feed into an estimate of standard error around Elo
ratings,
which will have to wait for a future blog post.</p>
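Putting the pieces together, here is a sketch of that variance computation in R, using the rounded coefficients from the regression above (treat the constants as approximate):

```r
# win probability from the fitted logistic regression (coefficients rounded)
p1_hat <- function(delta_a, mu_a) {
  1 / (1 + exp(0.0198 - 0.00687 * delta_a + 0.000421 * mu_a))
}
# expected value of the game under the Elo calibration
expected_A <- function(delta_a) {
  1 / (1 + 10^(-delta_a / 400))
}
# variance of the outcome z, which takes values 1, 1/2, 0
outcome_var <- function(delta_a, mu_a) {
  E_A <- expected_A(delta_a)
  p1  <- p1_hat(delta_a, mu_a)
  p2  <- 2 * (E_A - p1) / (1 - p1)
  # Var[z] = E[z^2] - E[z]^2, with E[z^2] = p_1 + (1/4)(1 - p_1) p_2
  p1 + 0.25 * (1 - p1) * p2 - E_A^2
}
outcome_var(0, 2400)  # roughly 0.13
```

Note that this is well below the 0.25 variance of a draw-free coin-flip game: the high draw rate at strong levels of play is what tightens the standard error of the rating.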
<script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#$@#')) {
var align = "center",
indent = "0em",
linebreak = "false";
if (false) {
align = (screen.width < 768) ? "left" : align;
indent = (screen.width < 768) ? "0em" : indent;
linebreak = (screen.width < 768) ? 'true' : linebreak;
}
var mathjaxscript = document.createElement('script');
mathjaxscript.id = 'mathjaxscript_pelican_#%@#$@#';
mathjaxscript.type = 'text/javascript';
mathjaxscript.src = 'https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML';
mathjaxscript[(window.opera ? "innerHTML" : "text")] =
"MathJax.Hub.Config({" +
" config: ['MMLorHTML.js']," +
" TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' } }," +
" jax: ['input/TeX','input/MathML','output/HTML-CSS']," +
" extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," +
" displayAlign: '"+ align +"'," +
" displayIndent: '"+ indent +"'," +
" showMathMenu: true," +
" messageStyle: 'normal'," +
" tex2jax: { " +
" inlineMath: [ ['\\\\(','\\\\)'] ], " +
" displayMath: [ ['$$','$$'] ]," +
" processEscapes: true," +
" preview: 'TeX'," +
" }, " +
" 'HTML-CSS': { " +
" styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! important'} }," +
" linebreaks: { automatic: "+ linebreak +", width: '90% container' }," +
" }, " +
"}); " +
"if ('default' !== 'default') {" +
"MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" +
"var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" +
"VARIANT['normal'].fonts.unshift('MathJax_default');" +
"VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
"VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
"});" +
"MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" +
"var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" +
"VARIANT['normal'].fonts.unshift('MathJax_default');" +
"VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
"VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
"});" +
"}";
(document.body || document.getElementsByTagName('head')[0]).appendChild(mathjaxscript);
}
</script>Distribution of Elo.2017-04-15T20:30:10-07:002017-04-15T20:30:10-07:00Steven E. Pavtag:www.gilgamath.com,2017-04-15:/elo-distribution.html<p>I have been thinking about Elo ratings recently, after
<a href="chesstempo-intro">analyzing my tactics ratings</a>. I have a lot of
questions about Elo: is it really predictive of performance? why don't we
calibrate Elo to a quantitative strategy? can we really compare players
across different eras? why not use an extended Kalman Filter instead of
Elo? <em>etc.</em> One question I had which I consider here is, "what is the
standard error of Elo?"</p>
<!-- PELICAN_END_SUMMARY -->
<p>Consider two players. Let the difference in true abilities between
them be denoted <span class="math">\(\Delta a\)</span>, and let the difference in their
Elo ratings be <span class="math">\(\Delta r\)</span>. The difference in abilities is such that
the odds that the first player wins a match between them
is <span class="math">\(10^{\Delta a / 400}\)</span>. Note that the raw abilities and ratings
will not be used here, only the differences, since they are only
defined up to an arbitrary additive offset.</p>
<p>When the two play a game, both their scores are updated according
to the outcome. Let <span class="math">\(z\)</span> be the outcome of the match from the
point of view of the first player. That is <span class="math">\(z=1\)</span> if the first player
wins, <span class="math">\(0\)</span> if they lose, and <span class="math">\(1/2\)</span> in the case of a draw. We update
their Elo ratings by
</p>
<div class="math">$$
\Delta r \Leftarrow \Delta r + 2 k \left(z - g\left(\Delta r\right) \right),
$$</div>
<p>
where <span class="math">\(k\)</span> is the <span class="math">\(k\)</span>-factor (typically between 10 and 40), and <span class="math">\(g\)</span>
gives the expected value of the outcome based on the difference in
ratings, with
</p>
<div class="math">$$
g(x) = \frac{10^{x/400}}{1 + 10^{x/400}}.
$$</div>
<p>
Because we add and subtract the same update to both players' ratings, the
difference between them gets twice that update, thus the <span class="math">\(2\)</span>.</p>
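<p>As a minimal sketch of the update rule above in R (the function names here are mine, purely illustrative):</p>

```r
# expected score of the first player, given the rating difference x
gfunc <- function(x) {
  lod <- 10^(x / 400)
  lod / (1 + lod)
}
# one update of the rating *difference* after a match; z is the outcome
# from the first player's point of view (1 win, 0 loss, 1/2 draw)
update_delta_r <- function(delta_r, z, k = 10) {
  delta_r + 2 * k * (z - gfunc(delta_r))
}
update_delta_r(0, z = 1, k = 10)  # equally rated players: the difference jumps by 10
```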
<p>Let <span class="math">\(\epsilon\)</span> be the error in the ratings: <span class="math">\(\Delta r = \Delta a + \epsilon\)</span>.
Then the error updates as
</p>
<div class="math">$$
\epsilon \Leftarrow \epsilon + 2 k \left(z - g\left(\Delta a + \epsilon\right) \right).
$$</div>
<p>
Using Taylor's Theorem, linearize the function <span class="math">\(g\)</span> to get approximately
</p>
<div class="math">$$
\epsilon \Leftarrow \epsilon + 2 k \left(z - g\left(\Delta a\right) - \epsilon g'\left(\Delta a\right) \right).
$$</div>
<p>Now note that the expected value of <span class="math">\(z\)</span> is exactly <span class="math">\(g\left(\Delta a \right)\)</span>.
Let <span class="math">\(\sigma^2\)</span> be the variance of <span class="math">\(z\)</span>. We have defined an AR(1) process on
<span class="math">\(\epsilon\)</span>. Its asymptotic expected value is zero, so the Elo rating difference
is unbiased. The asymptotic variance is
</p>
<div class="math">$$
\frac{k\sigma^2}{g'\left(\Delta a\right) \left(1 - k g'\left(\Delta a\right)\right)}.
$$</div>
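<p>To spell out the intermediate step: substituting the linearized update, the recursion is
<span class="math">\(\epsilon \Leftarrow \left(1 - 2 k g'\left(\Delta a\right)\right)\epsilon + 2 k \left(z - g\left(\Delta a\right)\right)\)</span>,
so the stationary variance <span class="math">\(v\)</span> satisfies</p>
<div class="math">$$
v = \left(1 - 2 k g'\left(\Delta a\right)\right)^2 v + 4 k^2 \sigma^2,
$$</div>
<p>and solving for <span class="math">\(v\)</span> recovers the expression above, since
<span class="math">\(1 - \left(1 - 2 k g'\right)^2 = 4 k g' \left(1 - k g'\right)\)</span>.</p>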
<p>Let's test this here. We will assume that <span class="math">\(z\)</span> is <span class="math">\(1/2\)</span> with fixed probability <span class="math">\(p\)</span>.
(Note that more than half of high-level tournament matches end in draws.) Since
the expected value of <span class="math">\(z\)</span> must remain <span class="math">\(g\left(\Delta a\right)\)</span>, this
implies that <span class="math">\(z=1\)</span> with probability <span class="math">\(g\left(\Delta a\right) - p/2\)</span> and equals
zero otherwise. The variance of <span class="math">\(z\)</span> is
</p>
<div class="math">$$
\sigma^2 = g\left(\Delta a\right)\left(1 - g\left(\Delta a\right)\right) - \frac{p}{4}.
$$</div>
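<p>As a quick numerical check of these moments (a throwaway sketch, mirroring the outcome probabilities used in the simulation code below):</p>

```r
gfunc <- function(x) { lod <- 10^(x / 400); lod / (1 + lod) }
delta_a <- 50; p <- 0.4
# P(z = 0), P(z = 1/2), P(z = 1), chosen so that E[z] = gfunc(delta_a)
probs <- c(1 - gfunc(delta_a) - p / 2, p, gfunc(delta_a) - p / 2)
zvals <- c(0, 0.5, 1)
mu <- sum(probs * zvals)                # should equal gfunc(delta_a)
sigmasq <- sum(probs * zvals^2) - mu^2  # should equal gfunc(delta_a) * (1 - gfunc(delta_a)) - p / 4
```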
<p>Now I set up a simulation for fixed <span class="math">\(\Delta a\)</span>, <span class="math">\(k\)</span>, and <span class="math">\(p\)</span> and compare the
theoretical standard deviation of the rating error to the empirical value. I perform
simulations for varying values of <span class="math">\(p\)</span>, the probability of a tie, as well as
varying <span class="math">\(k\)</span> values. I simulate 50,000 matches for each setting of the
parameters, then measure the empirical standard deviation, plotting as a
point. The theoretical values are drawn as lines. Since the theory rests on the
linear approximation from Taylor's theorem, some discrepancy between the two is
to be expected.</p>
<div class="highlight"><pre><span></span>gfunc <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>x<span class="p">)</span> <span class="p">{</span>
lod <span class="o"><-</span> <span class="m">10</span><span class="o">^</span><span class="p">(</span>x<span class="o">/</span><span class="m">400</span><span class="p">)</span>
lod <span class="o">/</span> <span class="p">(</span><span class="m">1</span> <span class="o">+</span> lod<span class="p">)</span>
<span class="p">}</span>
gprime <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>x<span class="p">)</span> <span class="p">{</span> <span class="p">(</span><span class="kp">log</span><span class="p">(</span><span class="m">10</span><span class="p">)</span><span class="o">/</span><span class="m">400</span><span class="p">)</span> <span class="o">*</span> gfunc<span class="p">(</span>x<span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="m">1</span> <span class="o">-</span> gfunc<span class="p">(</span>x<span class="p">))</span> <span class="p">}</span>
ratsim <span class="o"><-</span> <span class="kr">function</span><span class="p">(</span>nsim<span class="o">=</span><span class="m">10000</span><span class="p">,</span>delta_a<span class="o">=</span><span class="m">0</span><span class="p">,</span>k<span class="o">=</span><span class="m">10</span><span class="p">,</span>p<span class="o">=</span><span class="m">0.5</span><span class="p">,</span>delta_r<span class="o">=</span><span class="m">0</span><span class="p">)</span> <span class="p">{</span>
sims <span class="o"><-</span> <span class="kp">rep</span><span class="p">(</span><span class="m">0</span><span class="p">,</span>nsim<span class="p">)</span>
probs <span class="o"><-</span> <span class="kt">c</span><span class="p">(</span><span class="m">1</span><span class="o">-</span>gfunc<span class="p">(</span>delta_a<span class="p">)</span><span class="m">-0.5</span><span class="o">*</span>p<span class="p">,</span>p<span class="p">,</span>gfunc<span class="p">(</span>delta_a<span class="p">)</span><span class="m">-0.5</span><span class="o">*</span>p<span class="p">)</span>
zvals <span class="o"><-</span> <span class="kp">sample</span><span class="p">(</span><span class="kt">c</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="m">0.5</span><span class="p">,</span><span class="m">1</span><span class="p">),</span>size<span class="o">=</span>nsim<span class="p">,</span>replace<span class="o">=</span><span class="kc">TRUE</span><span class="p">,</span>prob<span class="o">=</span>probs<span class="p">)</span>
<span class="kr">for</span> <span class="p">(</span>mmm <span class="kr">in</span> <span class="m">1</span><span class="o">:</span>nsim<span class="p">)</span> <span class="p">{</span>
delta_r <span class="o"><-</span> delta_r <span class="o">+</span> <span class="m">2</span> <span class="o">*</span> k <span class="o">*</span> <span class="p">(</span>zvals<span class="p">[</span>mmm<span class="p">]</span> <span class="o">-</span> gfunc<span class="p">(</span>delta_r<span class="p">))</span>
sims<span class="p">[</span>mmm<span class="p">]</span> <span class="o"><-</span> delta_r
<span class="p">}</span>
<span class="c1"># recall that epsilon is Delta r - Delta a</span>
retv <span class="o"><-</span> sims <span class="o">-</span> delta_a
retv
<span class="p">}</span>
<span class="kn">library</span><span class="p">(</span>tibble<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>tidyr<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>dplyr<span class="p">)</span>
<span class="kp">set.seed</span><span class="p">(</span><span class="m">1234</span><span class="p">)</span>
simv <span class="o"><-</span> tribble<span class="p">(</span><span class="o">~</span>k<span class="p">,</span><span class="o">~</span>p<span class="p">,</span><span class="o">~</span>delta_a<span class="p">,</span>
<span class="m">5</span><span class="p">,</span><span class="m">0</span><span class="p">,</span><span class="m">0</span><span class="p">,</span>
<span class="m">10</span><span class="p">,</span><span class="m">0.25</span><span class="p">,</span><span class="m">0</span><span class="p">,</span>
<span class="m">20</span><span class="p">,</span><span class="m">0.375</span><span class="p">,</span><span class="m">0</span><span class="p">,</span>
<span class="m">40</span><span class="p">,</span><span class="m">0.5</span><span class="p">,</span><span class="m">0</span><span class="p">)</span> <span class="o">%>%</span>
complete<span class="p">(</span>k<span class="p">,</span>p<span class="p">,</span>delta_a<span class="p">)</span> <span class="o">%>%</span>
group_by<span class="p">(</span>k<span class="p">,</span>p<span class="p">,</span>delta_a<span class="p">)</span> <span class="o">%>%</span>
mutate<span class="p">(</span>empirical<span class="o">=</span><span class="kt">list</span><span class="p">(</span>sd<span class="p">(</span>ratsim<span class="p">(</span>nsim<span class="o">=</span><span class="m">50000</span><span class="p">,</span>delta_a<span class="o">=</span>delta_a<span class="p">,</span>k<span class="o">=</span>k<span class="p">,</span>p<span class="o">=</span>p<span class="p">))))</span> <span class="o">%>%</span>
unnest<span class="p">()</span> <span class="o">%>%</span>
ungroup<span class="p">()</span> <span class="o">%>%</span>
mutate<span class="p">(</span>sigmasq<span class="o">=</span>gfunc<span class="p">(</span>delta_a<span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="m">1</span> <span class="o">-</span> gfunc<span class="p">(</span>delta_a<span class="p">))</span> <span class="o">-</span> <span class="p">(</span>p<span class="o">/</span><span class="m">4</span><span class="p">))</span> <span class="o">%>%</span>
mutate<span class="p">(</span>theoretical<span class="o">=</span><span class="kp">sqrt</span><span class="p">(</span>k<span class="o">*</span>sigmasq <span class="o">/</span> <span class="p">(</span>gprime<span class="p">(</span>delta_a<span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="m">1</span> <span class="o">-</span> k <span class="o">*</span> gprime<span class="p">(</span>delta_a<span class="p">)))))</span>
<span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span>
<span class="kn">library</span><span class="p">(</span>forcats<span class="p">)</span>
ph <span class="o"><-</span> simv <span class="o">%>%</span>
mutate<span class="p">(</span>p<span class="o">=</span>forcats<span class="o">::</span>fct_reorder<span class="p">(</span><span class="kp">factor</span><span class="p">(</span>p<span class="p">),</span>empirical<span class="p">,</span><span class="m">.</span>desc<span class="o">=</span><span class="kc">TRUE</span><span class="p">))</span> <span class="o">%>%</span>
ggplot<span class="p">(</span>aes<span class="p">(</span>x<span class="o">=</span>k<span class="p">,</span>y<span class="o">=</span>empirical<span class="p">,</span>colour<span class="o">=</span>p<span class="p">))</span> <span class="o">+</span>
geom_point<span class="p">()</span> <span class="o">+</span>
geom_line<span class="p">(</span>aes<span class="p">(</span>y<span class="o">=</span>theoretical<span class="p">))</span> <span class="o">+</span>
scale_x_sqrt<span class="p">()</span> <span class="o">+</span>
labs<span class="p">(</span>x<span class="o">=</span><span class="s">'k factor'</span><span class="p">,</span>y<span class="o">=</span><span class="s">'empirical standard deviation of Elo'</span><span class="p">,</span>
title<span class="o">=</span><span class="s">'empirical and theoretical standard deviation of Elo'</span><span class="p">)</span>
<span class="kp">print</span><span class="p">(</span>ph<span class="p">)</span>
</pre></div>
<p><img src="http://www.gilgamath.com/figure/elo_sims-1.png" title="plot of chunk elo_sims" alt="plot of chunk elo_sims" width="600px" height="500px" /></p>
<p>Note that even for very large values of <span class="math">\(p\)</span>, which suppress the
variance of <span class="math">\(z\)</span>, and for small values of <span class="math">\(k\)</span>,
the standard deviation of <span class="math">\(\epsilon\)</span> is around 30. This is the
standard deviation of the error in the <em>difference</em> in ratings; the error around a
single player's Elo rating should be smaller by a factor of
<span class="math">\(1/\sqrt{2}\)</span>. Thus a standard deviation of around 20
Elo points is to be expected for a player whose <span class="math">\(k\)</span> factor is <span class="math">\(10\)</span>, with the
error growing like the square root of <span class="math">\(k\)</span>.</p>
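<p>As a sanity check on that closing claim, plug <span class="math">\(\Delta a = 0\)</span>, <span class="math">\(p = 1/2\)</span>, and <span class="math">\(k = 10\)</span> into the asymptotic formula (a quick sketch reusing the helper functions from the simulation):</p>

```r
gfunc <- function(x) { lod <- 10^(x / 400); lod / (1 + lod) }
gprime <- function(x) { (log(10) / 400) * gfunc(x) * (1 - gfunc(x)) }
delta_a <- 0; p <- 0.5; k <- 10
sigmasq <- gfunc(delta_a) * (1 - gfunc(delta_a)) - p / 4
# asymptotic standard deviation of the error in the rating *difference*
sd_diff <- sqrt(k * sigmasq / (gprime(delta_a) * (1 - k * gprime(delta_a))))
sd_single <- sd_diff / sqrt(2)  # error around a single player's rating
round(c(sd_diff, sd_single))    # roughly 30 and 21
```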
<script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#$@#')) {
var align = "center",
indent = "0em",
linebreak = "false";
if (false) {
align = (screen.width < 768) ? "left" : align;
indent = (screen.width < 768) ? "0em" : indent;
linebreak = (screen.width < 768) ? 'true' : linebreak;
}
var mathjaxscript = document.createElement('script');
mathjaxscript.id = 'mathjaxscript_pelican_#%@#$@#';
mathjaxscript.type = 'text/javascript';
mathjaxscript.src = 'https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS-MML_HTMLorMML';
mathjaxscript[(window.opera ? "innerHTML" : "text")] =
"MathJax.Hub.Config({" +
" config: ['MMLorHTML.js']," +
" TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' } }," +
" jax: ['input/TeX','input/MathML','output/HTML-CSS']," +
" extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," +
" displayAlign: '"+ align +"'," +
" displayIndent: '"+ indent +"'," +
" showMathMenu: true," +
" messageStyle: 'normal'," +
" tex2jax: { " +
" inlineMath: [ ['\\\\(','\\\\)'] ], " +
" displayMath: [ ['$$','$$'] ]," +
" processEscapes: true," +
" preview: 'TeX'," +
" }, " +
" 'HTML-CSS': { " +
" styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! important'} }," +
" linebreaks: { automatic: "+ linebreak +", width: '90% container' }," +
" }, " +
"}); " +
"if ('default' !== 'default') {" +
"MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" +
"var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" +
"VARIANT['normal'].fonts.unshift('MathJax_default');" +
"VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
"VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
"});" +
"MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" +
"var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" +
"VARIANT['normal'].fonts.unshift('MathJax_default');" +
"VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
"VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
"});" +
"}";
(document.body || document.getElementsByTagName('head')[0]).appendChild(mathjaxscript);
}
</script>