Conditional Portfolios with Feature Flattening

Wed 19 June 2019 by Steven E. Pav

Conditional Portfolios

When I first started working at a quant fund I tried to read about portfolio theory. (Beyond, you know, "Hedge Funds for Dummies.") I learned about various objectives and portfolio constraints, including the Markowitz portfolio, which felt very natural. The Markowitz portfolio solves the mean-variance optimization problem, as well as the Sharpe maximization problem, namely

$$ \operatorname{argmax}_w \frac{w^{\top}\mu}{\sqrt{w^{\top} \Sigma w}}. $$

This is solved, up to scaling, by the Markowitz portfolio \(\Sigma^{-1}\mu\).
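As a quick numerical illustration of that claim, here is a minimal sketch in Python with NumPy. The values of `mu` and `Sigma` are made up for the example, and the helper names `markowitz_portfolio` and `sharpe` are hypothetical, not taken from any library.

```python
import numpy as np

def markowitz_portfolio(mu, Sigma):
    """Return Sigma^{-1} mu, the Sharpe maximizer up to positive scaling."""
    # Solve the linear system rather than forming the inverse explicitly.
    return np.linalg.solve(Sigma, mu)

def sharpe(w, mu, Sigma):
    """The objective from the display above: w'mu / sqrt(w' Sigma w)."""
    return float(w @ mu / np.sqrt(w @ Sigma @ w))

# Made-up population parameters, for illustration only.
rng = np.random.default_rng(0)
p = 4
mu = rng.normal(size=p)
A = rng.normal(size=(p, p))
Sigma = A @ A.T + p * np.eye(p)  # symmetric positive definite by construction

w_star = markowitz_portfolio(mu, Sigma)
best = sharpe(w_star, mu, Sigma)

# Rescaling the portfolio by a positive constant leaves the objective unchanged.
rescaled = sharpe(3.0 * w_star, mu, Sigma)
```

In this sketch the attained objective equals \(\sqrt{\mu^{\top}\Sigma^{-1}\mu}\), and by the Cauchy-Schwarz inequality no other portfolio can exceed it.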

When I first read about the theory behind Markowitz, I did not read anything about where \(\mu\) and \(\Sigma\) come from. I assumed the authors I was reading were talking about the vanilla sample estimates of the mean and covariance, though the theory does not require this.

There are some problems with the Markowitz portfolio. For us, as a small quant fund, the most pressing issue was that holding the Markowitz portfolio based on the historical mean and covariance was not a good look. You don't get paid "2 and 20" for computing some long-term averages.

Rather than holding an unconditional portfolio, we sought to construct a conditional one, conditional on some "features". (I now believe this topic falls under the rubric of "Tactical Asset Allocation".) We stumbled on two simple methods for adapting Markowitz theory to accept conditioning information: Conditional Markowitz, and "Flattening".

Conditional Markowitz

Suppose you observe some \(l\)-vector of features, \(f_i\), prior to the time you have to allocate into \(p\) assets to enjoy returns \(x_i\). Assume that the expected returns are linear in the features, but the covariance is a long-term average. That is,

$$ E\left[x_i \left|f_i\right.\right] = B f_i,\quad\mbox{Var}\left(x_i \left|f_i\right.\right) = \Sigma. $$
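Under this model, plugging the conditional mean into the Markowitz formula suggests holding something proportional to \(\Sigma^{-1} B f_i\). The following is only a sketch of that idea on simulated data. Estimating \(B\) by least squares and \(\Sigma\) from the residuals is an assumption made for the example, and the function names are hypothetical.

```python
import numpy as np

def fit_conditional_model(F, X):
    """Least-squares fit of X ~ F B^T.

    F is n x l (features), X is n x p (returns).
    Returns (B_hat, Sigma_hat) with shapes p x l and p x p.
    """
    coef, *_ = np.linalg.lstsq(F, X, rcond=None)  # coef is l x p
    resid = X - F @ coef
    dof = F.shape[0] - F.shape[1]
    Sigma_hat = resid.T @ resid / dof  # residual covariance estimate
    return coef.T, Sigma_hat

def conditional_markowitz(B, Sigma, f):
    """Portfolio proportional to Sigma^{-1} B f for one feature vector f."""
    return np.linalg.solve(Sigma, B @ f)

# Simulated data with made-up parameters, for illustration only.
rng = np.random.default_rng(0)
n, l, p = 5000, 2, 3
B_true = np.array([[0.5, -0.2],
                   [0.1, 0.3],
                   [-0.4, 0.2]])
Sigma_true = 0.5 * np.eye(p) + 0.1 * np.ones((p, p))
F = rng.normal(size=(n, l))
noise = rng.multivariate_normal(np.zeros(p), Sigma_true, size=n)
X = F @ B_true.T + noise

B_hat, Sigma_hat = fit_conditional_model(F, X)
f_new = np.array([1.0, -0.5])  # features observed before allocating
w = conditional_markowitz(B_hat, Sigma_hat, f_new)
```

Note that the resulting portfolio is linear in the observed features: doubling `f_new` doubles every position.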

Note that Markowitz theory never really said how to estimate mean …


No Parity like a Risk Parity.

Sun 09 June 2019 by Steven E. Pav

Portfolio Selection and Exchangeability

Consider the problem of portfolio selection, where you observe some historical data on \(p\) assets, say \(n\) days' worth in an \(n\times p\) matrix, \(X\), and then are required to construct a (dollarwise) portfolio \(w\). You can view this task as a function \(w\left(X\right)\). There are a few different kinds of \(w\) function: Markowitz, Equal Dollar, Minimum Variance, Equal Risk Contribution ('Risk Parity'), and so on.

How are we to choose among these competing approaches? Their supporters can point to theoretical underpinnings, but these often seem a bit shaky even from a distance. Usually evidence is provided in the form of backtests on the historical returns of some universe of assets. It can be hard to generalize from a single history, and these backtests rarely offer theoretical justification for the differences in performance between methods.

One way to consider these different methods of portfolio construction is through the lens of exchangeability. Roughly speaking, how does the function \(w\left(X\right)\) react to certain systematic changes in \(X\) that "shouldn't" matter? For example, suppose that the ticker changed on one stock in your universe. Suppose you order the columns of \(X\) alphabetically, so now you must reorder your \(X\). Assuming no new data has been observed, shouldn't \(w\left(X\right)\) simply reorder its output in the same way?

Put another way, suppose a method \(w\) systematically overweights the first element of the universe (this seems more like a bug than a feature), and you observe backtests over the 2000s on U.S. equities where AAPL happened to be the first stock in the universe. Your \(w\) might seem to outperform other methods for no good reason.
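The reordering check described above can be sketched numerically: permute the columns of \(X\) and see whether the output permutes the same way. In this sketch, `markowitz_w` uses vanilla sample estimates, `first_heavy_w` is a deliberately buggy hypothetical rule that overweights the first column, and the returns are simulated. All names are illustrative assumptions.

```python
import numpy as np

def markowitz_w(X):
    """Markowitz portfolio from the sample mean and sample covariance."""
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    return np.linalg.solve(Sigma, mu)

def first_heavy_w(X):
    """Hypothetical buggy rule: equal dollar, but the first column gets double."""
    w = np.ones(X.shape[1])
    w[0] = 2.0
    return w / w.sum()

def is_order_equivariant(w_func, X, perm):
    """Check that reordering the columns of X reorders the portfolio identically."""
    return bool(np.allclose(w_func(X[:, perm]), w_func(X)[perm]))

# Simulated daily returns on five assets, for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(loc=0.001, scale=0.01, size=(500, 5))
perm = np.array([2, 0, 4, 1, 3])  # e.g. the new alphabetical order after a ticker change

markowitz_ok = is_order_equivariant(markowitz_w, X, perm)
first_heavy_ok = is_order_equivariant(first_heavy_w, X, perm)
```

The sample-based Markowitz rule passes this check, while the first-heavy rule fails it, since its output depends on which asset happens to sit in the first column.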

Equivariance to order is a kind of exchangeability condition. The 'right' kind of \(w\) is 'order …


fromo 0.2.0

Sun 13 January 2019 by Steven E. Pav

I recently pushed version 0.2.0 of my fromo package to CRAN. This package implements (relatively) fast, numerically robust computation of moments via Rcpp.

The big changes in this release are:

  • Support for weighted moment estimation.
  • Computation of running moments over windows defined by time (or some other increasing index), rather than vector index.
  • Some modest improvements in speed for the 'dangerous' use cases (no checking for NA, no weights, etc.).

The time-based running moments are supported via the t_running_* operations, and we support means, standard deviation, skew, kurtosis, centered and standardized moments and cumulants, z-score, Sharpe, and t-stat. The idea is that your observations are associated with some increasing index, which you can think of as the observation time, and you wish to compute moments over a fixed time window. To bloat the API, the times from which you 'look back' can optionally be something other than the time indices of the input, so the input and output size can be different.

Some example uses might be:

  • Compute the volatility of an asset's returns over the previous 6 months, on every trade day.
  • Compute the total monthly sales of a company at month ends.

Because the API also allows you to use weights as implicit time deltas, you can also do weird and unadvisable things like compute the Sharpe of an asset over the last 1 million shares traded.
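To make the idea of a time-based window concrete, here is a minimal pure-Python sketch of a running standard deviation over a lookback window defined by observation time. It is not the fromo implementation or its API. The function name `time_window_sd`, the half-open window convention, and the use of a Welford-style add and remove update are all assumptions made for the example.

```python
import math
from collections import deque

def time_window_sd(x, times, window):
    """Running sample standard deviation over a time-based lookback window.

    For each i, uses the observations j with
    times[i] - window < times[j] <= times[i].
    `times` must be non-decreasing. Returns NaN when fewer than two
    observations fall in the window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    n, mean, m2 = 0, 0.0, 0.0  # count, running mean, sum of squared deviations
    buf = deque()              # (time, value) pairs currently in the window
    out = []
    for xi, ti in zip(x, times):
        # Welford-style addition of the new observation.
        n += 1
        delta = xi - mean
        mean += delta / n
        m2 += delta * (xi - mean)
        buf.append((ti, xi))
        # Reverse update: remove observations that have aged out.
        # The newest observation never expires, so n stays at least 1.
        while buf[0][0] <= ti - window:
            _, xo = buf.popleft()
            delta = xo - mean
            new_mean = mean - delta / (n - 1)
            m2 -= delta * (xo - new_mean)
            mean = new_mean
            n -= 1
        out.append(math.sqrt(max(m2, 0.0) / (n - 1)) if n > 1 else float("nan"))
    return out

# Irregularly spaced observation times: the last point arrives after a long gap,
# so every earlier observation has left the 5-unit window by then.
sds = time_window_sd([1.0, 2.0, 3.0, 4.0], [0, 1, 2, 10], window=5)
```

Because the window is defined by time rather than by vector index, the number of observations contributing to each output varies with how densely the observations arrive.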

Speed improvements come from my random walk through C++ design idioms. I also implemented a 'swap' procedure for the running standard deviation which incorporates a Welford's method addition and removal into a single step. I do not believe that Welford's method is the fastest algorithm for a summarizing moment computation: probably a two-pass solution that computes the mean first, then the centered moments, is faster. However, for the …


Twelve Dimensional Chess is Stupid

Tue 16 October 2018 by Steven

Chess and the Curse of Dimensionality
