Gilgamath

It's Madness!

Sat 02 January 2016 by Steven E. Pav

I recently released a package to CRAN called madness. The eponymous object supports 'multivariate' automatic differentiation by forward accumulation. By 'multivariate', I mean it allows you to track (and automatically computes) the derivative of a scalar, or vector, or matrix, or multidimensional array with respect to a scalar, vector, matrix or multidimensional array.

The primary use case in mind is the multivariate delta method, where one has an estimate of a population quantity and the variance-covariance of the same, and wants to perform inference on some transform of that population quantity. With the values stored in a madness object, one merely performs the transforms directly on the estimate, and the derivatives are computed automatically. A secondary use case would be for the automatic computation of gradients when optimizing some complex function, e.g. in the computation of the MLE of some quantity.
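For reference, the multivariate delta method can be stated roughly as follows. The symbols here (\(\hat{\theta}\), \(\Omega\), \(f\), \(J\)) are my own notation for this note, not names used by the package:

\[
\hat{\theta} \approx \mathcal{N}\left(\theta, \Omega\right)
\quad\Longrightarrow\quad
f\left(\hat{\theta}\right) \approx \mathcal{N}\left(f\left(\theta\right),\, J \Omega J^{\top}\right),
\qquad
J = \left.\frac{\partial f}{\partial \theta}\right|_{\theta},
\]

where \(\Omega\) is the variance-covariance of the estimate and \(J\) is the Jacobian of the transform \(f\), with one row per output of \(f\) and one column per element of \(\theta\). In practice \(J\) is evaluated at \(\hat{\theta}\), which is exactly the derivative that automatic differentiation supplies.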

A madness object contains a value, val, as well as the derivative of val with respect to some \(X\), called dvdx. The derivative is stored as a matrix in 'numerator layout' convention: if val holds \(m\) values, and \(X\) holds \(n\) values, then dvdx is an \(m \times n\) matrix. This unfortunately means that a gradient is stored as a row vector. Numerator layout feels more natural (to me, at least) when propagating derivatives via the chain rule.
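To make the layout concrete, here is a minimal sketch of forward accumulation in plain Python. It is not the madness package's API: the class `Dual`, its `apply` method, and the helper `matmul` are hypothetical names invented for this illustration. When no derivative is supplied, the sketch treats the value as \(X\) itself, so the Jacobian is the identity.

```python
# Hypothetical illustration of forward accumulation in numerator layout.
# None of these names come from the madness package.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


class Dual:
    """A vector of m values plus its m x n Jacobian with respect to X."""

    def __init__(self, val, dvdx=None):
        self.val = list(val)
        n = len(self.val)
        if dvdx is None:
            # Assumption of this sketch: the value is X, so d(val)/dX = I.
            dvdx = [[1.0 if i == j else 0.0 for j in range(n)]
                    for i in range(n)]
        self.dvdx = dvdx

    def apply(self, f, jac):
        """Apply f, whose local Jacobian is jac, using the chain rule.

        In numerator layout the new Jacobian is jac(val) times the old one.
        """
        return Dual(f(self.val), matmul(jac(self.val), self.dvdx))


x = Dual([1.0, 2.0, 3.0])

# Sum of squares: 3 inputs to 1 output, so the Jacobian is a 1 x 3 row.
y = x.apply(lambda v: [sum(t * t for t in v)],
            lambda v: [[2.0 * t for t in v]])

# Reciprocal of that scalar: the 1 x 1 local Jacobian multiplies on the left.
z = y.apply(lambda v: [1.0 / v[0]],
            lambda v: [[-1.0 / (v[0] * v[0])]])
```

Here `y.dvdx` is the single row `[[2.0, 4.0, 6.0]]`, which is the gradient stored as a row vector, and each further transform simply left-multiplies by its own local Jacobian.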

For convenience, one can also store the 'tags' of the value and \(X\), in vtag and xtag, respectively. The vtag will be modified when computations are performed, which can be useful for debugging. One can also store the variance-covariance matrix of \(X\) in varx.

Here is an example session showing the use of a madness object. Note that by default if one does not feed in dvdx, the object constructor assumes that the value is equal to \(X\), and so …

Inference on Sorts

Wed 30 December 2015 by Steven E. Pav

Previously, I described a model for taste preference appropriate for some experiments in cocktail design I conducted years ago. I noted that this model was so elegant and simple that it must have been discovered previously and have a rich theory around it. In the two weeks since then, I discovered a new paper on arXiv about inference on ranks from comparisons. The authors review a model much like the one I outlined, calling it the Bradley-Terry-Luce model. (Hey, look, there is indeed a package on CRAN for this with a vignette!)
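For readers who have not met it, the Bradley-Terry-Luce model in its usual pairwise form assigns each contestant a positive 'strength' and sets the win probability by the ratio of strengths. The symbol \(w_i\) is my notation for this note, and the paper may parametrize things differently:

\[
\Pr\left(i \text{ beats } j\right) = \frac{w_i}{w_i + w_j}, \qquad w_i > 0.
\]

Writing \(w_i = e^{\beta_i}\) turns this into a logistic function of \(\beta_i - \beta_j\).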

The paper by Shah and Wainwright outlines a very simple method for estimating the top \(k\) of \(n\) participants when the contests include exactly two participants each. If I am reading it correctly, you take the average number of observed wins for each contestant, then grab the top \(k\). They prove that this algorithm is optimal under certain conditions. This seems to me like an ideal outcome for a research result: the algorithm is dead simple, and people have likely been using it for years, while the proof is somewhat intricate. Unfortunately, it does not seem straightforward to generalize the algorithm to the case where there are covariates, or 'features' about the various contestants, nor necessarily to the case of multiple contestants in a given contest. The Bradley-Terry model, on the other hand, is readily adaptable to these modifications.
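The counting procedure, as I read it, can be sketched in a few lines of Python. This is my paraphrase rather than the authors' code: the function name `top_k_by_wins` is hypothetical, 'average number of observed wins' is interpreted as the fraction of contests won, and breaking ties by name is an arbitrary choice of this sketch.

```python
from collections import Counter


def top_k_by_wins(contests, k):
    """Return the k contestants with the highest observed win fraction.

    contests is an iterable of (winner, loser) pairs, one per
    two-contestant contest.
    """
    wins = Counter()
    played = Counter()
    for winner, loser in contests:
        wins[winner] += 1
        played[winner] += 1
        played[loser] += 1
    # Fraction of contests won; contestants with no wins get 0.
    rate = {c: wins[c] / played[c] for c in played}
    # Sort by descending win fraction, then by name to break ties.
    return sorted(rate, key=lambda c: (-rate[c], str(c)))[:k]


# Hypothetical round-robin among four cocktails.
contests = [("A", "B"), ("A", "C"), ("A", "D"),
            ("B", "C"), ("B", "D"), ("C", "D")]
best_two = top_k_by_wins(contests, 2)
```

Nothing here uses covariates or handles contests with more than two entrants, which is the limitation noted above.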

No Accounting for Taste

Sat 19 December 2015 by Steven E. Pav

Many years ago, before I had kids, I was afflicted with a mania for Italian bitters. A particularly ugly chapter of this time included participating (at least in my own mind) in a contest to design a cocktail containing [obnoxious brand] Amaro. I was determined to win this contest, and win it with science. After weeks of scattershot development (with permanent damage to our livers), the field of potential candidates was winnowed down to around 12 or so. I then planned a 'party' with a few dozen friend-tasters to determine the final entrant into the contest.

As I had no experience with market research or experimental design, I was nervous about making rookie mistakes. I was careful, or so I thought, about the experimental design--assigning raters to cocktails in a balanced design, assigning random one-time codes to the cocktails, adding control cocktails, double blinding the tastings, and so on. The part that I was completely fanatical about was that tasters should not assign numerical ratings to the cocktails. I reasoned that the intra-rater and inter-rater reliability was far too poor. Instead, each rater would be presented with two cocktails, and state their preference. While in some situations, with experienced raters using the same rating scale, this might result in a loss of power, in my situation, with a gaggle of half-drunk friends, it solved the problem of inconsistent application of numerical ratings. The remaining issue was how to interpret the data to select a winner.

Tell me what you really think.

You can consider my cocktail party as a series of 'elections', each with two or more 'candidates' (in my case exactly two every time), and a single winner for each election. For each candidate in each election, you have some 'covariates', or 'features' as the ML people would call …

Why not Matlab

Sat 04 October 2014 by Steven E. Pav

A longtime Matlab user, stuck in a marriage of convenience, mumbles into his beer.