gilgamath


Markowitz Portfolio Covariance, Elliptical Returns

Mon 12 March 2018 by Steven E. Pav

In a previous blog post, I looked at asymptotic confidence intervals for the Signal to Noise ratio of the (sample) Markowitz portfolio, finding them to be deficient. (Perhaps they are useful if one has hundreds of thousands of days of data, but are otherwise awful.) Those confidence intervals came from revision four of my paper on the Asymptotic distribution of the Markowitz Portfolio. In that same update, I also describe, albeit in an obfuscated form, the asymptotic distribution of the sample Markowitz portfolio for elliptical returns. Here I check that finding empirically.

Suppose you observe a \(p\) vector of returns drawn from an elliptical distribution with mean \(\mu\), covariance \(\Sigma\) and 'kurtosis factor', \(\kappa\). Three times the kurtosis factor is the kurtosis of marginals under this assumed model. It takes value \(1\) for a multivariate normal. This model of returns is slightly more realistic than multivariate normal, but does not allow for skewness of asset returns, which seems unrealistic.
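
As a concrete instance of the kurtosis factor (a standard textbook fact, not something taken from the paper; the symbol \(m\) for degrees of freedom is my own notation here), consider a multivariate \(t\) distribution with \(m > 4\) degrees of freedom. Its marginals have kurtosis \(3 + 6/(m-4)\), so

$$ 3\kappa = \frac{3\left(m-2\right)}{m-4}, \qquad \kappa = \frac{m-2}{m-4}. $$

For example, \(m = 10\) gives \(\kappa = 4/3\) and a marginal kurtosis of \(4\), while \(\kappa \to 1\) as \(m \to \infty\), recovering the multivariate normal case.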

Nonetheless, let \(\hat{\nu}\) be the Markowitz portfolio built on a sample of \(n\) days of independent returns:

$$ \hat{\nu} = \hat{\Sigma}^{-1} \hat{\mu}, $$

where \(\hat{\mu}, \hat{\Sigma}\) are the regular 'vanilla' estimates of mean and covariance. The vector \(\hat{\nu}\) is, in a sense, over-corrected, and we need to cancel out a square root of \(\Sigma\) (the population value). So we will consider the distribution of \(Q \Sigma^{\top/2} \hat{\nu}\), where \(\Sigma^{\top/2}\) is the upper triangular Cholesky factor of \(\Sigma\), and where \(Q\) is an orthogonal matrix (\(Q Q^{\top} = I\)) that rotates \(\Sigma^{-1/2}\mu\) onto \(e_1\), the first basis vector:

$$ Q \Sigma^{-1/2}\mu = \zeta e_1, $$

where \(\zeta\) is the Signal to Noise ratio of the population Markowitz portfolio: \(\zeta = \sqrt{\mu^{\top}\Sigma^{-1}\mu} = \left\Vert …
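
One way to build such a \(Q\) numerically is with a Householder reflection. The following is a minimal sketch under assumed, made-up values of \(\mu\) and \(\Sigma\); it is not the code from the paper, and a reflection is only one of many orthogonal matrices that satisfy the requirement.

```r
set.seed(1)
p <- 4
mu <- c(0.10, 0.05, 0.02, 0.08)          # illustrative mean vector (assumption)
A <- matrix(rnorm(p * p), p, p)
Sigma <- crossprod(A) + diag(p)          # illustrative positive definite covariance

# chol() returns the upper triangular U with Sigma = t(U) %*% U,
# so U plays the role of Sigma^{T/2} and t(U) that of Sigma^{1/2}
U <- chol(Sigma)

# v = Sigma^{-1/2} mu, found by solving t(U) v = mu
v <- backsolve(U, mu, transpose = TRUE)
zeta <- sqrt(sum(v^2))                   # equals sqrt(mu' Sigma^{-1} mu)

# Householder reflection sending v to zeta * e_1
# (breaks down only if v is already exactly zeta * e_1, where w would be zero)
e1 <- c(1, rep(0, p - 1))
w <- v - zeta * e1
Q <- diag(p) - 2 * tcrossprod(w) / sum(w^2)

# population analogue: Q Sigma^{T/2} nu, with nu = Sigma^{-1} mu, is zeta * e_1
nu <- solve(Sigma, mu)
rotated <- drop(Q %*% U %*% nu)
```

Here `rotated` should agree with `zeta * e1` up to roundoff, which is the population counterpart of the quantity \(Q \Sigma^{\top/2} \hat{\nu}\) considered above.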

read more

A Lack of Confidence Interval

Thu 15 February 2018 by Steven E. Pav

For some years now I have been playing around with a certain problem in portfolio statistics: suppose you observe \(n\) independent observations of a \(p\) vector of returns, then form the Markowitz portfolio based on those returns. What then is the distribution of what I call the 'signal to noise ratio' of that Markowitz portfolio, defined as the true expected return divided by the true volatility? That is, if \(\nu\) is the Markowitz portfolio, built on a sample, its 'SNR' is \(\nu^{\top}\mu / \sqrt{\nu^{\top}\Sigma \nu}\), where \(\mu\) is the population mean vector, and \(\Sigma\) is the population covariance matrix.
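
To make the definition concrete, here is a small simulation sketch. The values of `n`, `p`, `mu` and `Sigma` are arbitrary assumptions chosen for illustration, and the returns are taken to be Gaussian only for convenience.

```r
set.seed(2018)
n <- 500                                   # days of returns (assumption)
p <- 4                                     # number of assets (assumption)
mu <- c(0.0010, 0.0005, 0.0008, 0.0002)    # population mean (assumption)
Sigma <- 0.01^2 * (0.7 * diag(p) + 0.3)    # population covariance, correlation 0.3

# simulate n independent Gaussian returns with mean mu and covariance Sigma
X <- sweep(matrix(rnorm(n * p), n, p) %*% chol(Sigma), 2, mu, "+")

# sample Markowitz portfolio
nu_hat <- solve(cov(X), colMeans(X))

# its SNR uses the *population* mu and Sigma, so it is random (through nu_hat)
# and would be unobservable with real data
snr <- sum(nu_hat * mu) / sqrt(drop(t(nu_hat) %*% Sigma %*% nu_hat))

# SNR of the population Markowitz portfolio, an upper bound on snr
zeta <- sqrt(drop(t(mu) %*% solve(Sigma, mu)))
```

By the Cauchy-Schwarz inequality, `snr` can never exceed `zeta`; how far below it falls depends on the particular sample.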

This is an odd problem, somewhat unlike classical statistical inference, because the unknown quantity, the SNR, depends not only on population parameters but also on the sample. It is random and unknown. What you learn in your basic statistics class is inference on fixed unknowns. (Actually, I never really took a basic statistics class, but I think that's right.)

Paulsen and Sohl made some progress on this problem in their 2016 paper on what they call the Sharpe Ratio Information Criterion. They find a sample statistic which is unbiased for the portfolio SNR when returns are (multivariate) Gaussian. In my mad scribblings on the backs of envelopes and scrap paper, I have been trying to find the distribution of the SNR. I have been looking for this love, as they say, in all the wrong places, usually hoping for some clever transformation that will lead to a slick proof. (I was taught from a young age to look for slick proofs.)

Having failed at that mission, I pivoted to looking for confidence intervals for the SNR (and maybe even prediction intervals on the out-of-sample Sharpe ratio of the in-sample Markowitz portfolio). I realized that some of the work I had done …

read more

geom cloud.

Thu 21 September 2017 by Steven E. Pav

I wanted a drop-in replacement for geom_errorbar in ggplot2 that would plot a density cloud of uncertainty. The idea is that typically (well, where I work), the ymin and ymax of an errorbar are plotted at plus and minus one standard deviation. A 'cloud' where the alpha is proportional to a normal density with the same standard deviations could show the same information on a plot with a little less clutter. I found out how to do this with a very ugly function, but wanted to do it the 'right' way by spawning my own geom. So the geom_cloud.

After looking at a bunch of other ggplot2 extensions, and some amount of tinkering and hair-pulling, we have the following code. The first part just computes standard deviations which are equally spaced in normal density. These are then used to create a list of geom_ribbon layers with equal alpha, but the right size. A little trickery is used to get the scales right. There are three parameters. The steps parameter controls how many ribbons are drawn. The default value is a little conservative; a larger value, like 15, gives very smooth clouds. The se_mult is the number of standard deviations that the ymax and ymin are plotted at, defaulting to 1 here. If you plot your errorbars at 2 standard errors, change this to 2. The max_alpha is the alpha at the maximal density, i.e. around y.

# get points equally spaced in density 
equal_ses <- function(steps) {
    xend <- c(0,4)
    endpnts <- dnorm(xend)
# perhaps use ppoints instead?
    deql <- seq(from=endpnts[1],to=endpnts[2],length.out=steps+1)
    davg <- (deql[-1] + deql[-length(deql)])/2
# invert
    xeql <- unlist(lapply(davg,function(d) {
                     uniroot(f=function(x) { dnorm(x) - d },interval=xend)$root
    }))
    xeql
}

library(ggplot2)
library(grid)

geom_cloud <- function(mapping …
read more

Spy vs Spy vs Wald Wolfowitz.

Tue 05 September 2017 by Steven E. Pav

I turned my kids on to the great Spy vs Spy cartoon from Mad Magazine. This strip is pure gold for two young boys: Rube Goldberg plus explosions with not much dialog (one child is still too young to read). I became curious whether one Spy had the upper hand, whether Prohias worked to keep the score 'even', and so on.

Not finding any data out there, I collected the data to the best of my ability from the Spy vs Spy Omnibus, which collects all 248 strips that appeared in Mad Magazine (plus two special issues). I think there are more strips out there by Prohias that appeared only in collected books, but I have not collected them yet. I entered the data into a Google spreadsheet, then converted it into CSV, then into an R data package. Now you can play along at home.

On to the simplest form of my question: did Prohias alternate between Black and White Spy victories, or did he choose at random? Up until 1968 it was common for two strips to appear in one issue of Mad, with one victory per Spy. In some cases three strips appeared per issue, with the Grey Spy appearing in the third; the Black and White Spies always receive a comeuppance when she appears, and so the balance of power was maintained. After 1972, it seems that only a single strip appeared per issue, and we can examine the time series of victories.
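
The alternate-or-random question is the kind that the Wald-Wolfowitz runs test of the title addresses. Below is a minimal sketch of the normal approximation to that test, run on a made-up sequence of winners rather than on the svs data; the function name `runs_test` and the toy sequence are my own illustrations.

```r
# Wald-Wolfowitz runs test (normal approximation) for a two-valued sequence
runs_test <- function(x) {
    x <- as.character(x)
    lev <- unique(x)
    stopifnot(length(lev) == 2)
    n1 <- sum(x == lev[1])
    n2 <- sum(x == lev[2])
    N <- n1 + n2
    # a run is a maximal stretch of identical consecutive values
    nruns <- length(rle(x)$lengths)
    # mean and variance of the number of runs under random ordering
    mu_r <- 2 * n1 * n2 / N + 1
    var_r <- 2 * n1 * n2 * (2 * n1 * n2 - N) / (N^2 * (N - 1))
    z <- (nruns - mu_r) / sqrt(var_r)
    list(runs = nruns, expected = mu_r, z = z, p_value = 2 * pnorm(-abs(z)))
}

# toy example: perfectly alternating winners (not real data)
toy <- rep(c("black", "white"), times = 10)
res <- runs_test(toy)
```

A large positive `z` means more runs than random ordering would suggest, which is what strict alternation looks like; a large negative `z` would instead point to streaks.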

library(SPYvsSPY)
library(dplyr)
library(knitr)  # provides kable()
data(svs)

# show that there are multiple strips per issue
svs %>%
    group_by(Mad_no, yrmo) %>%
    summarize(nstrips = n(),
              net_victories = sum(as.numeric(white_comeuppance) - as.numeric(black_comeuppance))) %>%
    ungroup() %>%
    select(yrmo, nstrips, net_victories) %>%
    head(n = 20) %>%
    kable()
yrmo nstrips net_victories
1961-01 3 -1
1961-03 2 0
1961-04 2 0
1961-06 2 0
1961-07 2 …
read more