Portfolio Selection and Exchangeability

Consider the problem of portfolio selection, where you observe some historical data on \(p\) assets, say \(n\) days' worth in an \(n\times p\) matrix, \(X\), and then are required to construct a (dollarwise) portfolio \(w\). You can view this task as a function \(w\left(X\right)\). There are a few different kinds of \(w\) function: Markowitz, equal dollar, Minimum Variance, Equal Risk Contribution ('Risk Parity'), and so on.

How are we to choose among these competing approaches? Their supporters can point to theoretical underpinnings, but these often seem a bit shaky even from a distance. Usually evidence is provided in the form of backtests on the historical returns of some universe of assets. It can be hard to generalize from a single history, and these backtests rarely offer theoretical justification for the differential performance in methods.

One way to consider these different methods of portfolio construction is via the lens of exchangeability. Roughly speaking, how does the function \(w\left(X\right)\) react under certain systematic changes in \(X\) that "shouldn't" matter? For example, suppose that the ticker changed on one stock in your universe. Suppose you order the columns of \(X\) alphabetically, so now you must reorder your \(X\). Assuming no new data has been observed, shouldn't \(w\left(X\right)\) simply reorder its output in the same way?

Put another way, suppose a method \(w\) systematically overweights the first element of the universe (this seems more like a bug than a feature), and you observe backtests over the 2000s on U.S. equities where AAPL happened to be the first stock in the universe. Your \(w\) might seem to outperform other methods for no good reason.

Equivariance to order is a kind of exchangeability condition. The 'right' kind of \(w\) is 'order exchangeable'. Other examples come from considering rotations or basketization. Suppose that today your universe consists of stocks A and B, which you can hold long or short, but tomorrow you can only buy basket C which is equal dollars long in A and B, and basket D which is equal dollars long A and short B. Tomorrow you can achieve the same holdings that you wanted today, but by holding the baskets. Your portfolio function should be exchangeable with respect to this transformation, suggesting you hold the same equivalent position.

In math, let \(Q\) be an invertible \(p\times p\) matrix. We will consider what should happen if returns are transformed by \(Q^{\top}\). Exchangeability holds when

$$ w\left(X Q\right) = Q^{-1}w\left(X\right). $$

If this holds for all invertible \(Q\) then \(w\) satisfies the exchangeability condition. Some \(w\) might maintain the above relationship only for some kinds of \(Q\), leading to weaker forms of exchangeability. Here we name them by the class of \(Q\):

  • A \(w\) satisfies the 'order exchangeability' property if it is exchangeable for all permutation matrices \(Q\);
  • 'leverage exchangeability' if it is exchangeable for all diagonal \(Q\);
  • 'rotational exchangeability' if it is exchangeable for all orthogonal \(Q\).

Leverage exchangeability is illustrated by considering what would happen if each asset was replaced by, say, a 2x or 3x levered version of the same asset.
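As a minimal sketch of how these classes can be probed numerically, the snippet below builds one \(Q\) of each kind and tests the defining relation on the equal dollar rule. The helper names (`is_exchangeable`, `w_equal_dollar`) and the simulated returns are assumptions made for illustration, not part of any package.

```r
set.seed(1)
n <- 200; p <- 4
X <- matrix(rnorm(n * p, sd = 0.01), ncol = p)   # simulated returns, n days by p assets

# hypothetical helper: does w(XQ) equal Q^{-1} w(X), up to numerical tolerance?
is_exchangeable <- function(wfun, X, Q, tol = 1e-8) {
    lhs <- wfun(X %*% Q)
    rhs <- solve(Q, wfun(X))
    max(abs(lhs - rhs)) < tol
}

# the equal dollar rule ignores the data entirely
w_equal_dollar <- function(X) { rep(1 / ncol(X), ncol(X)) }

Q_perm <- diag(p)[, sample(p)]                     # permutation matrix
Q_diag <- diag(c(1, 2, 3, 0.5))                    # relevering of each asset
Q_orth <- qr.Q(qr(matrix(rnorm(p * p), p, p)))     # random orthogonal matrix

ex_perm <- is_exchangeable(w_equal_dollar, X, Q_perm)
ex_diag <- is_exchangeable(w_equal_dollar, X, Q_diag)
ex_orth <- is_exchangeable(w_equal_dollar, X, Q_orth)
print(c(order = ex_perm, leverage = ex_diag, rotation = ex_orth))
```

Any other rule can be dropped in for `w_equal_dollar`, as long as it takes the returns matrix and gives back a vector of \(p\) holdings.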

One consequence of exchangeability is that "only returns matter". That is, if we transform returns \(x\mapsto Q^{\top}x\) and the portfolio \(w\mapsto Q^{-1}w\), then the returns achieved by that portfolio map to \(x^{\top}w \mapsto x^{\top}Q Q^{-1}w = x^{\top}w\). The returns you achieve are the same under the transformation. This dependence of \(w\) on returns was a key assumption in my work on portfolio quality bounds.
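The identity is easy to check numerically. Here is a small sketch with a made-up day of returns, a made-up portfolio and a random \(Q\); all of the numbers are assumptions for illustration only.

```r
set.seed(2)
p <- 3
x <- rnorm(p, sd = 0.01)                 # one day of asset returns
w <- c(0.5, 0.3, 0.2)                    # some portfolio in the original assets
Q <- matrix(rnorm(p * p), p, p)          # a random Q, invertible with probability one

x_new <- as.numeric(t(Q) %*% x)          # returns of the transformed assets
w_new <- solve(Q, w)                     # the equivalent portfolio, Q^{-1} w

ret_old <- sum(x * w)
ret_new <- sum(x_new * w_new)
print(c(original = ret_old, transformed = ret_new))
```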

It should be recognized that "only returns matter" is questionable in practical portfolio construction, since the real world often imposes constraints (long only, max concentration), exhibits different costs and frictions for different assets, and contains other oddities like tax implications, and so on.

Constraints in particular complicate the general definition of exchangeability because in the transformation by \(Q\) the original constraints should also be translated. In some cases, say where the constraint is an upper bound on risk, the constraint definition is identical under the transformation. However, the image of a long-only constraint under a general linear transformation by \(Q\) will not in general still be a long-only constraint. In a long-only world, we can perhaps only expect order- or (positive) leverage exchangeability, and not the general form.
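The basket example from earlier makes this concrete. In the sketch below the columns of \(Q\) hold the positions in A and B that make up one unit of basket C and of basket D; the 0.5 scaling is my assumption, and any positive scaling tells the same story.

```r
# columns give the holdings in A and B of one unit of basket C and of basket D
# (the 0.5 scaling is an assumption made for this illustration)
Q <- cbind(C = c(A = 0.5, B = 0.5),
           D = c(A = 0.5, B = -0.5))

w_long_only <- c(A = 0, B = 1)            # a long-only portfolio: everything in B
v_baskets <- solve(Q, w_long_only)        # the same position expressed in baskets
print(v_baskets)

# the basket holdings reproduce the original position in A and B
w_back <- as.numeric(Q %*% v_baskets)
print(w_back)
```

Holding only B requires being long basket C and short basket D, so a long-only constraint on A and B does not carry over to a long-only constraint on the baskets.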

Setting aside the issues with constraints, it is still useful, I think, to consider the objectives of portfolio construction techniques with respect to exchangeability, inasmuch as they have them.

For example, the "one over N" (or "equal dollar", "Talmudic", etc.) rule clearly does not satisfy general exchangeability, nor leverage exchangeability. The Equal Risk Contribution portfolio, which we will describe below, also fails exchangeability. The Markowitz Portfolio, however, does satisfy exchangeability:

$$ \Sigma^{-1}\mu \mapsto Q^{-1}\Sigma^{-1}Q^{-\top}Q^{\top}\mu = Q^{-1}\Sigma^{-1}\mu, $$

as needed.
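A quick numerical check of this, as a sketch: the function `w_markowitz` below is my own shorthand for the sample version \(\hat{\Sigma}^{-1}\hat{\mu}\) (up to scaling), and the returns are simulated, so treat the names and data as assumptions.

```r
set.seed(3)
n <- 500; p <- 4
X <- matrix(rnorm(n * p, mean = 0.001, sd = 0.01), ncol = p)   # simulated returns

# sample Markowitz portfolio, up to scaling: Sigma^{-1} mu
w_markowitz <- function(X) { solve(cov(X), colMeans(X)) }

Q <- matrix(rnorm(p * p), p, p)      # a random Q, invertible with probability one

w_direct <- w_markowitz(X %*% Q)            # portfolio built on transformed returns
w_mapped <- solve(Q, w_markowitz(X))        # Q^{-1} times the original portfolio
print(cbind(w_direct, w_mapped))
```

The two columns should agree up to floating point error, since the sample mean and sample covariance transform the same way as their population counterparts.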

In fact, "equal dollar" seems not so much an objective as a constraint of the portfolio allocation. There is no objective beyond perhaps "make it seem like we are doing something with client money." The same complaint will apply to ERC. In fact, you can express Markowitz, Minimum Variance, ERC (and I believe equal dollar) as similar optimization problems with risk constraints. However, the objectives do look a lot like make-work.

Equal Risk Contribution

The set-up for the Equal Risk Contribution portfolio, or Risk Parity, is as follows: define the risk of a portfolio \(w\) as the standard deviation of returns, \(r = \sqrt{w^{\top}\Sigma w}\). This function is homogeneous of order 1, meaning that if you positively rescale your whole portfolio by \(k\), the risk scales by \(k\). That is, if you map \(w \mapsto k w\) then \(\sqrt{w^{\top}\Sigma w} \mapsto k \sqrt{w^{\top}\Sigma w}\) for positive \(k\).

Using Euler's homogeneous function theorem, we can express the risk as

$$r = w^{\top} \nabla_{w}r = w^{\top} \frac{\Sigma w}{\sqrt{w^{\top}\Sigma w}}.$$

The theory behind Risk Parity then says because of this equation, the vector \(w \odot \frac{\Sigma w}{\sqrt{w^{\top}\Sigma w}}\) is the "risk in each asset," where \(\odot\) is the Hadamard (elementwise) multiplication. This is very tempting because the sum of the elements of this vector is exactly \(r\) by Euler's Theorem. The Equal Risk Portfolio is the one such that each element of \(w \odot \frac{\Sigma w}{\sqrt{w^{\top}\Sigma w}}\) is the same. It has "equal risk in each asset".
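To see the Euler identity at work, here is a minimal sketch with a made-up covariance matrix and portfolio (both are assumptions for illustration). It computes the vector in question and confirms that its elements sum to \(r\), and that doubling the portfolio doubles the risk.

```r
Sigma <- matrix(c(0.04, 0.01, 0.00,
                  0.01, 0.09, 0.02,
                  0.00, 0.02, 0.16), nrow = 3, byrow = TRUE)   # illustrative covariance
w <- c(0.5, 0.3, 0.2)

Sw <- as.numeric(Sigma %*% w)
r <- sqrt(sum(w * Sw))              # portfolio risk
grad_r <- Sw / r                    # gradient of r with respect to w
contrib <- w * grad_r               # the Hadamard product from the text

print(contrib)
print(c(risk = r, sum_of_contributions = sum(contrib)))

# homogeneity of order one: doubling the portfolio doubles the risk
r_doubled <- sqrt(sum((2 * w) * as.numeric(Sigma %*% (2 * w))))
print(r_doubled / r)
```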

However, I can see no principled reason to view this vector as the risk in each asset. By definition it happens to be the marginal contribution to risk from each asset due to a proportional change in holdings. That is, it is equal to \(\nabla_{\log(w)}r\), and expresses how risk would change under a small proportional change in weight in your portfolio. However, it is clearly not the risk in each asset because it can contain negative elements! If you hold an asset that diversifies (i.e. has negative correlation with) existing holdings, then increasing your holding in it can decrease risk. The fact that the elements of this vector sum to the total risk is also not convincing: one could just as easily say that each asset has \(r / p\) risk in it, and capture the same property.
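A two asset sketch shows both points. The covariance and holdings below are made up for illustration: the second asset is negatively correlated with the first and held in small size, so its element of the vector comes out negative. The finite difference at the end bumps each holding by a small proportion, which is one way to read \(\nabla_{\log(w)}r\).

```r
Sigma <- matrix(c( 1.0, -0.8,
                  -0.8,  1.0), nrow = 2, byrow = TRUE)   # two negatively correlated assets
w <- c(1.0, 0.2)

risk <- function(w, Sigma) { sqrt(as.numeric(w %*% Sigma %*% w)) }
r <- risk(w, Sigma)
contrib <- w * as.numeric(Sigma %*% w) / r
print(contrib)       # the second element is negative

# finite difference check: bump each holding by a small proportion eps
eps <- 1e-6
fd <- sapply(seq_along(w), function(i) {
    w_bump <- w
    w_bump[i] <- w[i] * (1 + eps)
    (risk(w_bump, Sigma) - r) / eps
})
print(fd)
```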

As mentioned above, the risk contribution vector does not satisfy an exchangeability condition. Taking \(x\mapsto Q^{\top}x\) and assuming exchangeability, \(w\mapsto Q^{-1}w\), then \(r \mapsto r\) and

$$ w \odot \frac{\Sigma w}{r} \mapsto \left(Q^{-1} w\right) \odot \frac{Q^{\top}\Sigma w}{r}. $$

That is, if \(w\) was the ERC portfolio, then \(Q^{-1}w\) is not, in general, the ERC portfolio in the transformed space.

You can confirm this in code, which I have lifted from the riskParityPortfolio vignette. The ERC is not exchangeable for general \(Q\) or orthogonal \(Q\), but is for diagonal \(Q\). We check them here:

suppressMessages({
    library(riskParityPortfolio)
    library(mvtnorm)
})

risk <- function(w,Sigma) { sqrt(as.numeric(w %*% Sigma %*% w)) }
riskcon <- function(w,Sigma) {
    Sw <- Sigma %*% w
    as.numeric(w * (Sw) / sqrt(as.numeric(w %*% Sw)))
}

# from the excellent vignette:
# generate synthetic data
set.seed(42)
N <- 5
V <- matrix(rnorm(N*(N+50)), ncol = N)
Sigma <- cov(V)
portfolio <- riskParityPortfolio(Sigma=Sigma) 

# check general exchangeability
Q <- rWishart(1,50,Sigma=diag(N))
dim(Q) <- c(N,N)
knitr::kable(data.frame(start=riskcon(portfolio$w,Sigma),
                        Q_trans=riskcon(solve(Q,portfolio$w),t(Q) %*% Sigma %*% Q)))

   start   Q_trans
0.076136  0.076276
0.076136  0.078032
0.076136  0.070880
0.076136  0.077396
0.076136  0.078098

# check orthogonal exchangeability
set.seed(123)
B <- rWishart(1,50,Sigma=diag(N))
dim(B) <- c(N,N)
Q <- eigen(B)$vectors
knitr::kable(data.frame(start=riskcon(portfolio$w,Sigma),
                        Q_trans=riskcon(solve(Q,portfolio$w),t(Q) %*% Sigma %*% Q)))

   start   Q_trans
0.076136  0.046762
0.076136  0.021432
0.076136  0.226237
0.076136  0.082872
0.076136  0.003379

# check leverage exchangeability
set.seed(17)
Q <- diag(runif(N,min=0.5,max=2.0))

knitr::kable(data.frame(start=riskcon(portfolio$w,Sigma),
                        Q_trans=riskcon(solve(Q,portfolio$w),t(Q) %*% Sigma %*% Q)))

   start   Q_trans
0.076136  0.076136
0.076136  0.076136
0.076136  0.076136
0.076136  0.076136
0.076136  0.076136

The Symmetric Square Root

One of the reasons I wanted to write this post was to draw attention to the symmetric square root, which we typically do not use for portfolio construction, but which is useful for risk decomposition. We can express the risk of a portfolio as

$$ r^2 = \| \Sigma^{1/2} w \|_2^2, $$

where \(\Sigma^{1/2}\) is any matrix square root of \(\Sigma\), that is, any matrix with \(\left(\Sigma^{1/2}\right)^{\top}\Sigma^{1/2} = \Sigma\). Then the elements of \(\Sigma^{1/2} w\) would seem to decompose the risk of your portfolio, in a squared error sense. That is, the elements of \(\Sigma^{1/2} w\), when squared, sum to the risk squared. That vector may contain negative elements, but this does not affect the square sum. We can just square the elements of \(\Sigma^{1/2} w\), and claim we have "decomposed risk". Whether this is a useful decomposition, or has any real meaning, is debatable. We can check if this is an exchangeable function.

If you use the Cholesky square root, this risk decomposition does not satisfy order exchangeability! This clearly seems like a bad way to express risk. If, however, you use the symmetric square root, then the decomposition is exchangeable with respect to reordering and even to rotation, but not, in general, to relevering or to general transformation by \(Q\): the symmetric square root of \(Q^{\top}\Sigma Q\) is \(Q^{\top}\Sigma^{1/2}Q\) only when \(Q\) is orthogonal. Under an orthogonal \(Q\) we have \(\Sigma^{1/2} \mapsto Q^{\top}\Sigma^{1/2}Q\) and so if \(w\mapsto Q^{-1}w\), then \(\Sigma^{1/2} w \mapsto Q^{\top}\Sigma^{1/2}w\).
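The order and rotation claims can be checked with a short sketch in base R. The helper names (`sym_sqrt`, `chol_sqrt`, `is_equivariant`) and the simulated covariance are assumptions for illustration. The check asks whether the decomposition vector computed in the transformed space equals \(Q^{\top}\) times the original one, which is the relation given above for orthogonal \(Q\) (permutation matrices included).

```r
set.seed(4)
p <- 4
A <- matrix(rnorm(p * (p + 20)), ncol = p)
Sigma <- cov(A)                       # an illustrative covariance matrix
w <- c(0.4, 0.3, 0.2, 0.1)

# symmetric square root via the eigendecomposition
sym_sqrt <- function(S) {
    e <- eigen(S, symmetric = TRUE)
    e$vectors %*% diag(sqrt(e$values)) %*% t(e$vectors)
}
# chol() returns an upper triangular R with t(R) %*% R equal to S
chol_sqrt <- function(S) { chol(S) }

# hypothetical helper, meant for orthogonal Q only:
# is root(Q' S Q) Q^{-1} w equal to Q' root(S) w ?
is_equivariant <- function(root, S, w, Q, tol = 1e-8) {
    lhs <- root(t(Q) %*% S %*% Q) %*% solve(Q, w)
    rhs <- t(Q) %*% root(S) %*% w
    max(abs(lhs - rhs)) < tol
}

Q_perm <- diag(p)[, c(2, 3, 4, 1)]                   # reorder the assets
Q_orth <- qr.Q(qr(matrix(rnorm(p * p), p, p)))       # random orthogonal matrix

# both roots give squared elements that sum to the squared risk
both_sum <- c(sum((sym_sqrt(Sigma) %*% w)^2), sum((chol_sqrt(Sigma) %*% w)^2))
risk_sq <- as.numeric(w %*% Sigma %*% w)

res <- c(sym_order  = is_equivariant(sym_sqrt, Sigma, w, Q_perm),
         sym_rotate = is_equivariant(sym_sqrt, Sigma, w, Q_orth),
         chol_order = is_equivariant(chol_sqrt, Sigma, w, Q_perm))
print(res)
```

The Cholesky factor depends on the order in which the assets are listed, which is why it fails the reordering check while the symmetric root passes.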

Again it is not clear this is a meaningful decomposition of risk. Whether it is or not, I am not aware of this definition being used to construct an ERC portfolio, though I suspect it is only a matter of time.