No Accounting for Taste

Sat 19 December 2015 by Steven E. Pav

Many years ago, before I had kids, I was afflicted with a mania for Italian bitters. A particularly ugly chapter of this time included participating (at least in my own mind) in a contest to design a cocktail containing [obnoxious brand] Amaro. I was determined to win this contest, and win it with science. After weeks of scattershot development (with permanent damage to our livers), the field of potential candidates was winnowed down to around 12 or so. I then planned a 'party' with a few dozen friend-tasters to determine the final entrant into the contest.

As I had no experience with market research or experimental design, I was nervous about making rookie mistakes. I was careful, or so I thought, about the experimental design--assigning raters to cocktails in a balanced design, assigning random one-time codes to the cocktails, adding control cocktails, double blinding the tastings, and so on. The part that I was completely fanatical about was that tasters should not assign numerical ratings to the cocktails. I reasoned that the intra-rater and inter-rater reliability was far too poor. Instead, each rater would be presented with two cocktails, and state their preference. While in some situations, with experienced raters using the same rating scale, this might result in a loss of power, in my situation, with a gaggle of half-drunk friends, it solved the problem of inconsistent application of numerical ratings. The remaining issue was how to interpret the data to select a winner.

Tell me what you really think.

You can consider my cocktail party as a series of 'elections', each with two or more 'candidates' (in my case exactly two every time), and a single winner for each election. For each candidate in each election, you have some 'covariates', or 'features' as the ML people would call them. The covariates could be things like alcohol content, pH, temperature, sweetness, specific gravity, and so on. Or they might just be indicator variables for each of the 12 cocktails in my stable of candidates. The goal of the experiment is to model the probability of winning an election as some function of the covariates.

There is a decent basic model for this kind of experiment. To simplify the exposition, just consider a single election with $n$ candidates. Let $X_i$ be the vector of covariates about candidate $i$, a $p$-vector. Let $\beta$ be some $p$-vector of population coefficients, which are unknown in practice and have to be estimated. Then model some 'proto-probabilities' as log-linear in the covariates. That is, define

$$ \log\left(\pi_i\right) = \beta^{\top}X_i. $$

If $p_i$ is the probability that candidate $i$ wins the election, just normalize the proto-probabilities to sum to 1, so you set

$$ p_i = \frac{\exp\left(\beta^{\top}X_i\right)}{\sum_{1\le j \le n} \exp\left(\beta^{\top}X_j\right)}. $$
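As a minimal sketch of that normalization, here is the computation for a single election in plain Python. The function name `win_probabilities` and the toy numbers are made up for illustration; subtracting the largest score before exponentiating is a standard numerical precaution and does not change the result.

```python
import math

def win_probabilities(beta, X):
    """Return p_i for each candidate in one election.

    beta: list of p coefficients.
    X: list of n covariate vectors, each of length p.
    """
    # Linear scores beta^T X_i, i.e. the logs of the proto-probabilities.
    scores = [sum(b * x for b, x in zip(beta, x_i)) for x_i in X]
    # Shift by the max score to avoid overflow in exp; the shift
    # cancels in the normalization.
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical election: three candidates, two covariates each.
beta = [0.5, -1.0]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
probs = win_probabilities(beta, X)
print(probs)
```

The scores here are 0.5, -1.0 and -0.5, so the first candidate is the favorite and the second the long shot.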

From this definition, it is fairly straightforward to write out the log-likelihood function when one observes data from multiple elections. With a bit more elbow grease the gradient and Hessian can also be computed. From there, one can then rely on any of the MLE helper packages on CRAN (or an equivalent in your favorite language) to compute the MLE of $\beta$ and the estimated variance-covariance matrix.
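To make that concrete, here is a rough sketch in plain Python, not the code I used at the time. The log-likelihood is the sum over elections of $\log p_w$ for the observed winner $w$, and its gradient is the sum of $X_w - \sum_j p_j X_j$. The function names, the crude gradient-ascent fitter, and the toy data (three cocktails A, B, C with A as the reference level) are all assumptions for illustration; a real analysis would hand the likelihood to a proper optimizer and also compute the Hessian.

```python
import math

def win_probabilities(beta, X):
    """Normalized win probabilities for one election."""
    scores = [sum(b * x for b, x in zip(beta, x_i)) for x_i in X]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def log_likelihood(beta, elections):
    """Sum of log p_winner; each election is a pair (X, winner_index)."""
    return sum(math.log(win_probabilities(beta, X)[w]) for X, w in elections)

def gradient(beta, elections):
    """Gradient of the log-likelihood: sum of X_winner - sum_j p_j X_j."""
    grad = [0.0] * len(beta)
    for X, w in elections:
        probs = win_probabilities(beta, X)
        for k in range(len(beta)):
            expected = sum(p * x_j[k] for p, x_j in zip(probs, X))
            grad[k] += X[w][k] - expected
    return grad

def fit(elections, p, step=1.0, iters=2000):
    """Plain gradient ascent on the mean log-likelihood (illustrative only)."""
    beta = [0.0] * p
    n = len(elections)
    for _ in range(iters):
        g = gradient(beta, elections)
        beta = [b + step * g_k / n for b, g_k in zip(beta, g)]
    return beta

# Hypothetical tasting data: indicator covariates for cocktails B and C,
# with A as the reference. Each entry is ([covariates...], winner index).
A, B, C = [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]
elections = (
    [([A, B], 0)] * 3 + [([A, B], 1)] * 7    # B beats A 7 times out of 10
    + [([A, C], 0)] * 6 + [([A, C], 1)] * 4  # A beats C 6 times out of 10
    + [([B, C], 0)] * 8 + [([B, C], 1)] * 2  # B beats C 8 times out of 10
)
beta_hat = fit(elections, p=2)
print(beta_hat, log_likelihood(beta_hat, elections))
```

Note that one cocktail is left out of the indicators on purpose: with an indicator for every cocktail, $\beta$ would only be determined up to an additive constant.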

It's logistic (boogie woogie woogie).

One notices immediately that the probability of winning an election is invariant with respect to the 'DC level' of each of the covariates. That is, if you add the fixed vector $Z$ to each of the covariates $X_i$ in one election, then the proto-probabilities are all scaled by the same multiplicative factor, $\exp\left(\beta^{\top}Z\right)$, and the probabilities $p_i$ are unaffected due to the normalization. So, without loss of generality, you can always assume, say, that the sum of the $X_i$ in a given election is the zero vector (or anything else for that matter).
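Spelled out, the cancellation uses nothing beyond the definition of $p_i$:

$$ \frac{\exp\left(\beta^{\top}\left(X_i + Z\right)\right)}{\sum_{1\le j \le n} \exp\left(\beta^{\top}\left(X_j + Z\right)\right)} = \frac{\exp\left(\beta^{\top}Z\right)\exp\left(\beta^{\top}X_i\right)}{\exp\left(\beta^{\top}Z\right)\sum_{1\le j \le n} \exp\left(\beta^{\top}X_j\right)} = p_i. $$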

Consider what happens, then, in a two candidate election under the zero-sum-normalization. Because $X_2 = - X_1$, we have

$$ p_1 = \frac{\exp\left(\beta^{\top}X_1\right)}{\exp\left(\beta^{\top}X_1\right) + \exp\left(-\beta^{\top}X_1\right)} = \frac{\exp\left(2\beta^{\top}X_1\right)}{1 + \exp\left(2\beta^{\top}X_1\right)}. $$

That's just logistic regression. As this model is so simple, and naturally generalizes logistic regression, I would wager dollars to donuts that it has been analyzed before and has a proper name and rich theory around it. My coy posting on Cross Validated about this problem received no reply, however, which I think is only an indicator that Cross Validated is no longer terribly useful. At the very least, it can be expressed as a specific case of multinomial logistic regression, although I suspect there might be some difficulties forcing a multinomial fitting routine to take the structure I have adopted here. In the multinomial case, the data are 'about' the election, and not the candidates, so it is a bit awkward to make the translation.
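The two-candidate reduction is easy to check numerically. The sketch below, with made-up coefficients and covariates, computes the first candidate's win probability three ways: from the normalized model directly, from the logistic function of $2\beta^{\top}X_1$ after centering, and from the logistic function of $\beta^{\top}\left(X_1 - X_2\right)$ on the raw covariates. The last form suggests that an ordinary logistic regression on covariate differences (with no intercept) would fit the pairwise case, though I have not tried that here.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def win_probabilities(beta, X):
    """Normalized win probabilities for one election."""
    scores = [dot(beta, x_i) for x_i in X]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))

# Hypothetical coefficients and raw covariates for a two-candidate election.
beta = [0.8, -0.3]
X1, X2 = [1.2, 0.5], [0.4, 2.0]

# Center so the covariates sum to zero; then X2c == -X1c.
center = [(a + b) / 2 for a, b in zip(X1, X2)]
X1c = [a - c for a, c in zip(X1, center)]

p1_direct = win_probabilities(beta, [X1, X2])[0]
p1_centered = logistic(2 * dot(beta, X1c))
p1_difference = logistic(dot(beta, [a - b for a, b in zip(X1, X2)]))
print(p1_direct, p1_centered, p1_difference)
```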

A bitter aftertaste

Having a good model will not take you as far as having some street smarts and domain knowledge. For example, in the cocktail selection experiment, if you use the alcohol content as one of the covariates, your model might suggest, depending on the sign of the corresponding element of $\beta$, that the optimal cocktail is either pure ethanol or a Shirley Temple. You need to convert the raw data into covariates which could give you an acceptable and sane answer.

Regrettably, it was this dash of domain knowledge that was missing from my final cocktail contest submission. When [obnoxious company] held this contest, their obvious motivation was to raise awareness of their obnoxious brand, which they did by encouraging professional bartenders to use their product. Moreover, they could, one suspects, engender a large amount of goodwill in influential bartenders by feting them with prizes for their creations. One should suspect that the lack of any professional affiliation next to my name means that my contest submission received less than a moment's attention, science be damned!

As an appendix to this sad tale, I present the top two winners from our night of a thousand comparisons. For 'medium amaro', I recommend you use Ramazzotti or Abano. Better yet, save yourself the trouble and just have a shot of pure Braulio and call it a night.

the 325

1 oz medium amaro
1 oz der Lachs Goldwasser (not Goldschlager. totally different)
3/4 oz Cynar
pastis wash
tonic
orange bitters
blood orange peel garnish

Wash a cocktail glass out with pastis. Dump it out. Shake the amaro, Goldwasser and Cynar with ice; strain into cocktail glass. Top with tonic water (fizz to taste), add a dash of orange bitters. Garnish.

the Waller Fizz

1 oz medium amaro
1 oz Aperol
1 oz Campari
0.5 oz sweet vermouth (Carpano Antica works)
2 oz tonic water
4 oz plain kombucha (make sure it's fresh)
lemon peel garnish

Shake the booze together with ice; strain into a 12 oz highball glass. Stir in the tonic and kombucha. Fizzy and refreshing. Garnish.
