Conditional Portfolios with Feature Flattening

Wed 19 June 2019 by Steven E. Pav

Conditional Portfolios

When I first started working at a quant fund I tried to read about portfolio theory. (Beyond, you know, "Hedge Funds for Dummies.") I learned about various objectives and portfolio constraints, including the Markowitz portfolio, which felt very natural. Markowitz solves the mean-variance optimization problem, as well as the Sharpe maximization problem, namely

$$ \operatorname{argmax}_w \frac{w^{\top}\mu}{\sqrt{w^{\top} \Sigma w}}. $$

This is solved, up to scaling, by the Markowitz portfolio \(\Sigma^{-1}\mu\).
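As a quick numerical sanity check of that claim, here is a minimal sketch in plain Python for two assets. The values of `mu` and `sigma` are made up for illustration, and `solve_2x2` and `sharpe` are hypothetical helper names, not part of any library.

```python
import math
import random

def solve_2x2(sigma, mu):
    """Return sigma^{-1} mu for a 2x2 matrix, using the closed-form inverse."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    return [(d * mu[0] - b * mu[1]) / det,
            (-c * mu[0] + a * mu[1]) / det]

def sharpe(w, mu, sigma):
    """Population Sharpe ratio w'mu / sqrt(w' Sigma w) of portfolio w."""
    mean = sum(wi * mi for wi, mi in zip(w, mu))
    var = sum(w[i] * sigma[i][j] * w[j] for i in range(2) for j in range(2))
    return mean / math.sqrt(var)

# Illustrative (assumed) population parameters for two assets.
mu = [0.05, 0.03]
sigma = [[0.04, 0.01],
         [0.01, 0.09]]

w_star = solve_2x2(sigma, mu)      # the Markowitz portfolio, up to scaling
best = sharpe(w_star, mu, sigma)   # equals sqrt(mu' Sigma^{-1} mu)

# No randomly drawn portfolio should beat the Markowitz portfolio.
random.seed(0)
trials = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(1000)]
beaten = any(sharpe(w, mu, sigma) > best + 1e-12 for w in trials)
print(w_star, best, beaten)
```

With these numbers the portfolio comes out at roughly \((1.2, 0.2)\), and rescaling it by any positive constant leaves the Sharpe ratio unchanged, which is the "up to scaling" part.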

When I first read about the theory behind Markowitz, I did not read anything about where \(\mu\) and \(\Sigma\) come from. I assumed the authors I was reading were talking about the vanilla sample estimates of the mean and covariance, though the theory does not require this.

There are some problems with the Markowitz portfolio. For us, as a small quant fund, the most pressing issue was that holding the Markowitz portfolio based on the historical mean and covariance was not a good look. You don't get paid "two and twenty" for computing some long-term averages.

Rather than holding an unconditional portfolio, we sought to construct a conditional one, conditional on some "features". (I now believe this topic falls under the rubric of "Tactical Asset Allocation".) We stumbled on two simple methods for adapting Markowitz theory to accept conditioning information: Conditional Markowitz, and "Flattening".

Conditional Markowitz

Suppose you observe an \(l\)-vector of features, \(f_i\), prior to the time you have to allocate into \(p\) assets to enjoy returns \(x_i\). Assume that the expected returns are linear in the features, but the covariance is a long-term average. That is,

$$ E\left[x_i \left|f_i\right.\right] = B f_i,\quad\mbox{Var}\left(x_i \left|f_i\right.\right) = \Sigma. $$
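One natural reading of this model, sketched below under assumptions, is to plug the conditional mean \(B f_i\) into the usual Markowitz formula, giving a portfolio proportional to \(\Sigma^{-1} B f_i\). The matrices `B` and `sigma` and the feature vector `f_today` are made-up numbers for \(p = 2\) assets and \(l = 3\) features; in practice they would have to be estimated. The helper names are hypothetical.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def solve_2x2(sigma, rhs):
    """Return sigma^{-1} rhs for a 2x2 matrix, using the closed-form inverse."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    return [(d * rhs[0] - b * rhs[1]) / det,
            (-c * rhs[0] + a * rhs[1]) / det]

def conditional_markowitz(B, sigma, f):
    """Markowitz portfolio built on the conditional mean B f (an assumed plug-in rule)."""
    cond_mean = matvec(B, f)
    return solve_2x2(sigma, cond_mean)

# Illustrative (assumed) parameters: B is p x l, sigma is p x p.
B = [[0.02, -0.01, 0.00],
     [0.01,  0.03, 0.01]]
sigma = [[0.04, 0.01],
         [0.01, 0.09]]
f_today = [1.0, 0.5, -0.2]   # features observed before allocating

w = conditional_markowitz(B, sigma, f_today)
print(w)
```

Under this plug-in rule the portfolio is linear in the features: doubling \(f_i\) doubles the position, and setting one feature to a constant 1 recovers an unconditional component.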

Note that Markowitz theory never really said how to estimate mean …


No Parity like a Risk Parity.

Sun 09 June 2019 by Steven E. Pav

Portfolio Selection and Exchangeability

Consider the problem of portfolio selection, where you observe some historical data on \(p\) assets, say \(n\) days' worth in an \(n\times p\) matrix, \(X\), and then are required to construct a (dollarwise) portfolio \(w\). You can view this task as a function \(w\left(X\right)\). There are a few different kinds of \(w\) function: Markowitz, equal dollar, Minimum Variance, Equal Risk Contribution ('Risk Parity'), and so on.

How are we to choose among these competing approaches? Their supporters can point to theoretical underpinnings, but these often seem a bit shaky even from a distance. Usually evidence is provided in the form of backtests on the historical returns of some universe of assets. It can be hard to generalize from a single history, and these backtests rarely offer theoretical justification for the differential performance in methods.

One way to consider these different methods of portfolio construction is via the lens of exchangeability. Roughly speaking, how does the function \(w\left(X\right)\) react under certain systematic changes in \(X\) that "shouldn't" matter? For example, suppose that the ticker changed on one stock in your universe. Suppose you order the columns of \(X\) alphabetically, so now you must reorder your \(X\). Assuming no new data has been observed, shouldn't \(w\left(X\right)\) simply reorder its output in the same way?

Put another way, suppose a method \(w\) systematically overweights the first element of the universe (this seems more like a bug than a feature), and you observe backtests over the 2000s on U.S. equities where AAPL happened to be the first stock in the universe. Your \(w\) might seem to outperform other methods for no good reason.
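The reordering test described above is easy to run numerically. The following is a minimal sketch in plain Python: `inverse_variance_weights` is a simple stand-in for a \(w\left(X\right)\) function, `first_asset_tilt` is a deliberately broken one that overweights the first column, and the returns in `X` are simulated. All of these names and numbers are assumptions made for illustration.

```python
import random

def inverse_variance_weights(X):
    """A simple w(X): weight each asset by 1 / sample variance, normalized to sum to 1."""
    n, p = len(X), len(X[0])
    raw = []
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / (n - 1)
        raw.append(1.0 / var)
    total = sum(raw)
    return [r / total for r in raw]

def first_asset_tilt(X):
    """A deliberately broken w(X): doubles the weight on column 0, then renormalizes."""
    w = inverse_variance_weights(X)
    w[0] *= 2.0
    total = sum(w)
    return [wi / total for wi in w]

def permute_columns(X, perm):
    """Reorder the columns of X so that new column k is old column perm[k]."""
    return [[row[j] for j in perm] for row in X]

def is_order_equivariant(w_func, X, perm, tol=1e-12):
    """Check that reordering the assets simply reorders the portfolio."""
    w_then_perm = [w_func(X)[j] for j in perm]
    perm_then_w = w_func(permute_columns(X, perm))
    return all(abs(a - b) <= tol for a, b in zip(w_then_perm, perm_then_w))

# Simulated returns: 250 days on 4 assets with different volatilities.
random.seed(0)
X = [[random.gauss(0.0, 0.01 * (j + 1)) for j in range(4)] for _ in range(250)]
perm = [2, 0, 3, 1]

print(is_order_equivariant(inverse_variance_weights, X, perm))
print(is_order_equivariant(first_asset_tilt, X, perm))
```

The inverse-variance rule passes the check, while the tilted rule fails it, because which asset gets the extra weight depends only on where it happens to sit in the column order.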

Equivariance to order is a kind of exchangeability condition. The 'right' kind of \(w\) is 'order …


fromo 0.2.0

Sun 13 January 2019 by Steven E. Pav

I recently pushed version 0.2.0 of my fromo package to CRAN. This package implements (relatively) fast, numerically robust computation of moments via Rcpp.

The big changes in this release are:

  • Support for weighted moment estimation.
  • Computation of running moments over windows defined by time (or some other increasing index), rather than vector index.
  • Some modest improvements in speed for the 'dangerous' use cases (no checking for NA, no weights, etc.)

The time-based running moments are supported via the t_running_* operations, and we support means, standard deviation, skew, kurtosis, centered and standardized moments and cumulants, z-score, Sharpe, and t-stat. The idea is that your observations are associated with some increasing index, which you can think of as the observation time, and you wish to compute moments over a fixed time window. To bloat the API, the times from which you 'look back' can optionally be something other than the time indices of the input, so the input and output size can be different.

Some example uses might be:

  • Compute the volatility of an asset's returns over the previous 6 months, on every trade day.
  • Compute the total monthly sales of a company at month ends.
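To make the idea of a time-based window concrete, here is a minimal pure-Python sketch of a running mean over a fixed time window. This is not the fromo API and makes no attempt at numerical robustness: the function name `time_window_mean`, its arguments, and the half-open window convention \((t - \mathrm{window}, t]\) are all assumptions chosen for illustration.

```python
def time_window_mean(times, values, window, lookback_from=None):
    """Mean of the values whose time lies in (t - window, t], for each t.

    `times` must be non-decreasing. `lookback_from` holds the (non-decreasing)
    times to look back from; it defaults to the observation times themselves,
    so the output length can differ from the input length.
    """
    if lookback_from is None:
        lookback_from = times
    out = []
    lo = hi = 0          # current window is the index range [lo, hi)
    total = 0.0
    for t in lookback_from:
        # Admit observations that have occurred by time t.
        while hi < len(times) and times[hi] <= t:
            total += values[hi]
            hi += 1
        # Evict observations that have fallen out of the window.
        while lo < hi and times[lo] <= t - window:
            total -= values[lo]
            lo += 1
        count = hi - lo
        out.append(total / count if count else float("nan"))
    return out

times = [1, 2, 4, 7, 8]
values = [10, 20, 30, 40, 50]

same_grid = time_window_mean(times, values, window=3)
other_grid = time_window_mean(times, values, window=3, lookback_from=[3, 6, 9])
print(same_grid)   # [10.0, 15.0, 25.0, 40.0, 45.0]
print(other_grid)  # [15.0, 30.0, 45.0]
```

The second call looks back from times that are not in the input, so five observations produce three outputs.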

Because the API also allows you to use weights as implicit time deltas, you can also do weird and unadvisable things like compute the Sharpe of an asset over the last 1 million shares traded.

Speed improvements come from my random walk through C++ design idioms. I also implemented a 'swap' procedure for the running standard deviation which incorporates a Welford's method addition and removal into a single step. I do not believe that Welford's method is the fastest algorithm for a summarizing moment computation: probably a two-pass solution to compute the mean first, then the centered moments, is faster. However, for the …
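For readers unfamiliar with it, Welford's method keeps a running count, mean, and sum of squared deviations, each of which can be updated when an observation is added or removed. The sketch below is plain Python and performs the addition and the removal as two separate steps, so it is not the fused 'swap' described above and is not the package's implementation. The class and function names are hypothetical.

```python
import math
import statistics
from collections import deque

class WelfordWindow:
    """Running count, mean, and sum of squared deviations (M2)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0

    def add(self, x):
        """Welford update for adding one observation."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def remove(self, x):
        """Reverse Welford update for removing one observation."""
        if self.n <= 1:
            self.n, self.mean, self.m2 = 0, 0.0, 0.0
            return
        self.n -= 1
        delta = x - self.mean               # deviation from the old mean
        self.mean -= delta / self.n         # mean of the remaining points
        self.m2 -= delta * (x - self.mean)

    def sd(self):
        """Sample standard deviation, or NaN with fewer than two points."""
        if self.n < 2:
            return float("nan")
        return math.sqrt(max(self.m2, 0.0) / (self.n - 1))

def running_sd(values, window):
    """Standard deviation over the trailing `window` observations."""
    state, buf, out = WelfordWindow(), deque(), []
    for x in values:
        state.add(x)
        buf.append(x)
        if len(buf) > window:
            state.remove(buf.popleft())   # drop the oldest observation
        out.append(state.sd())
    return out

values = [0.3, -1.2, 2.5, 0.7, 0.7, -0.4, 1.9, 3.1, -2.2, 0.0]
sds = running_sd(values, window=4)

# Cross-check against a direct recomputation on each window.
direct = [statistics.stdev(values[max(0, i - 3):i + 1]) for i in range(1, len(values))]
print(max(abs(a - b) for a, b in zip(sds[1:], direct)))
```

Each step costs a constant amount of work regardless of the window length, which is the appeal for running computations, even if a two-pass approach may win for a single summary.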


Markowitz Portfolio Covariance, Elliptical Returns

Mon 12 March 2018 by Steven E. Pav

In a previous blog post, I looked at asymptotic confidence intervals for the Signal to Noise ratio of the (sample) Markowitz portfolio, finding them to be deficient. (Perhaps they are useful if one has hundreds of thousands of days of data, but are otherwise awful.) Those confidence intervals came from revision four of my paper on the Asymptotic distribution of the Markowitz Portfolio. In that same update, I also describe, albeit in an obfuscated form, the asymptotic distribution of the sample Markowitz portfolio for elliptical returns. Here I check that finding empirically.

Suppose you observe a \(p\) vector of returns drawn from an elliptical distribution with mean \(\mu\), covariance \(\Sigma\) and 'kurtosis factor', \(\kappa\). Three times the kurtosis factor is the kurtosis of marginals under this assumed model. It takes value \(1\) for a multivariate normal. This model of returns is slightly more realistic than multivariate normal, but does not allow for skewness of asset returns, which seems unrealistic.

Nonetheless, let \(\hat{\nu}\) be the Markowitz portfolio built on a sample of \(n\) days of independent returns:

$$ \hat{\nu} = \hat{\Sigma}^{-1} \hat{\mu}, $$

where \(\hat{\mu}, \hat{\Sigma}\) are the regular 'vanilla' estimates of mean and covariance. The vector \(\hat{\nu}\) is, in a sense, over-corrected, and we need to cancel out a square root of \(\Sigma\) (the population value). So we will consider the distribution of \(Q \Sigma^{\top/2} \hat{\nu}\), where \(\Sigma^{\top/2}\) is the upper triangular Cholesky factor of \(\Sigma\), and where \(Q\) is an orthogonal matrix (\(Q Q^{\top} = I\)) that rotates \(\Sigma^{-1/2}\mu\) onto a multiple of \(e_1\), the first basis vector:

$$ Q \Sigma^{-1/2}\mu = \zeta e_1, $$

where \(\zeta\) is the Signal to Noise ratio of the population Markowitz portfolio: \(\zeta = \sqrt{\mu^{\top}\Sigma^{-1}\mu} = \left\Vert …
