gilgamath

## Inference on Sorts

Wed 30 December 2015 by Steven E. Pav

Previously, I described a model for taste preference appropriate for some experiments in cocktail design I conducted years ago. I noted that this model was so elegant and simple that it must have been discovered previously, and must have a rich theory around it. In the two weeks since then, I discovered a new paper on arXiv about inference on ranks from comparisons. The authors review a model much like the one I outlined, calling it the Bradley-Terry-Luce model. (Hey, look, there is indeed a package on CRAN for this with a vignette!)

The paper by Shah and Wainwright outlines a very simple method for estimating the top $$k$$ of $$n$$ participants when the contests include exactly two participants each. If I am reading it correctly, you take the average number of observed wins for each contestant, then grab the top $$k$$. They prove that this algorithm is optimal under certain conditions. This seems to me like an ideal outcome for a research result: the algorithm is dead simple, and people have likely been using it for years, while the proof is somewhat intricate. Unfortunately, it does not seem straightforward to generalize the algorithm to the case where there are covariates, or 'features', describing the various contestants, nor necessarily to the case of multiple contestants in a given contest. The Bradley-Terry model, on the other hand, is readily adaptable to these modifications.
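
As a rough illustration of that counting rule as I read it, here is a minimal sketch in Python (not the paper's code, and not R like the rest of these posts). The function name `top_k_by_win_rate` and the `(winner, loser)` input format are assumptions made for illustration, and ties are broken alphabetically, which is an arbitrary choice.

```python
from collections import defaultdict


def top_k_by_win_rate(contests, k):
    """Return the k contestants with the highest observed win rate.

    `contests` is an iterable of (winner, loser) pairs, one per
    two-party contest. This input format is a hypothetical choice.
    """
    wins = defaultdict(int)
    played = defaultdict(int)
    for winner, loser in contests:
        wins[winner] += 1
        played[winner] += 1
        played[loser] += 1

    # Average number of wins per contest entered, for each contestant.
    rate = {c: wins[c] / played[c] for c in played}

    # Sort by descending win rate; break ties by name (arbitrary).
    ranked = sorted(rate, key=lambda c: (-rate[c], str(c)))
    return ranked[:k]


# Toy data: A wins all three of its contests, C wins 1 of 3, B wins 1 of 4.
contests = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "B"), ("A", "B")]
print(top_k_by_win_rate(contests, 2))  # ['A', 'C']
```

Note that nothing in this sketch uses any information about the contestants beyond who won, which is why extending it to covariates is not obvious.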

## Using vim as an IDE

Tue 29 December 2015 by Steven E. Pav

For a number of years now, I have been using vim as a lightweight IDE. The ecosystem of vim addons is rich. There are numerous plugins for creating tags to navigate a project, browsing files in directories, highlighting syntax, and so on. What really makes it an IDE is the ability to execute code within the context of vim. I realize this probably sounds 'charming' to disciples of that other text editor, but it might seem like an unnatural urge to my vim coreligionists. The piece that glues it all together is vim-conque. The easiest way to get conque in Ubuntu is via apt as follows:

```shell
sudo apt-get install vim-addon-manager vim-conque
```


The skinny on using conque is that you can visual-select code that you are editing, hit <F9>, and it will be transferred to the execution window, newlines and all. So you can test out code while you are writing it. You can also work the other way: test out code in a REPL, then, when it is working as expected, escape insert mode in the REPL, yank the working code to a register, and paste it into the file you are working on.

## Dockerfile or it didn't happen!

This kind of advice is a bit abstract, so I put a working example on GitHub and Docker Hub. You can run it yourself via docker:

```shell
# this might take a little while to download
docker pull shabbychef/vim-conque
docker run --rm -it shabbychef/vim-conque
```


This will feel a bit odd: when you run the last command, you are in vim, but you are in vim in a docker container. When you terminate, your changes will not be saved (this is the --rm flag). Directions are given in the file on how to start conque with a screen …

## You Deserve Expensive Champagne ... If You Buy It.

Sat 26 December 2015 by Steven E. Pav

I received some taster ratings from the champagne party we attended last week. I joined the raw ratings with the bottle information to create a single aggregated dataset. This is a 'non-normal' form, but simplest to distribute. Here is a taste:

```r
library(dplyr)
# show the first eight rows of the joined ratings
champ %>% select(winery, purchase_price_per_liter, raternum, rating) %>%
  head(n = 8)
```

| winery | purchase_price_per_liter | raternum | rating |
|---|---|---|---|
| Barons de Rothschild | 80.00000 | 1 | 10 |
| Onward Petillant Naturel 2014 Malavasia Bianca | 33.33333 | 1 | 4 |
| Chandon Rose Method Traditionnelle | 18.66667 | 1 | 8 |
| Martini Prosecco from Italy | 21.32000 | 1 | 8 |
| Roederer Estate Brut | 33.33333 | 1 | 8 |
| Kirkland Asolo Prosecco Superiore | 9.32000 | 1 | 7 |
| Champagne Tattinger Brute La Francaise | 46.66667 | 1 | 6 |
| Schramsberg Reserver 2001 | 132.00000 | 1 | 6 |

Recall that the rules of the contest dictate that the average rating of each bottle was computed, then divided by 25 dollars more than the price (presumably for a 750ml bottle). Depending on whether the average ratings were compressed around the high end of the zero to ten scale, or around the low end, one would wager on either the cheapest bottles, or more moderately priced offerings. (Based on my previous analysis, I brought the Menage a Trois Prosecco, rated at 91 points, but available at Safeway for 10 dollars.) It is easy to compute the raw averages using dplyr:

```r
avrat <- champ %>%
  group_by(winery, bottle_num, purchase_price_per_liter) %>%
  summarize(avg_rating = mean(rating)) %>%
  ungroup() %>%
  arrange(desc(avg_rating))
```

| winery | bottle_num | purchase_price_per_liter | avg_rating |
|---|---|---|---|
| Desuderi Jeio | 4 | 22.66667 | 6.750000 |
| Gloria Ferrer Sonoma Brut | 19 | 20.00000 | 6.750000 |
| Roederer Estate Brut | 12 | 34.66667 | 6.642857 |
| Charles Collin Rose | 34 | 33.33333 | 6.636364 |
| Roederer Estate Brut | 13 | 33.33333 | 6.500000 |
| Gloria Ferrer Sonoma Brut | 11 | 21 … | |
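
The scoring rule recalled above can be sketched in a few lines of Python (used here only for illustration; the analysis itself is in R). This is a sketch under stated assumptions: the function name `contest_score` is hypothetical, and I assume the per-liter price converts to a bottle price by multiplying by 0.75, following the 750ml presumption above. The inputs are two rows from the table of raw averages.

```python
def contest_score(avg_rating, price_per_liter,
                  handicap=25.0, bottle_liters=0.75):
    """Average rating divided by (bottle price + 25 dollars).

    The conversion from per-liter price to bottle price assumes a
    750ml bottle; that is a presumption, not a stated contest rule.
    """
    bottle_price = price_per_liter * bottle_liters
    return avg_rating / (bottle_price + handicap)


# Two bottles with the same raw average rating but different prices.
desuderi = contest_score(6.75, 22.66667)
gloria = contest_score(6.75, 20.0)
print(desuderi, gloria)
```

With equal average ratings, the cheaper bottle gets the higher score, which is what makes the choice between cheap and moderately priced bottles a real wager.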

## No Accounting for Taste

Sat 19 December 2015 by Steven E. Pav

Many years ago, before I had kids, I was afflicted with a mania for Italian bitters. A particularly ugly chapter of this time included participating (at least in my own mind) in a contest to design a cocktail containing [obnoxious brand] Amaro. I was determined to win this contest, and win it with science. After weeks of scattershot development (with permanent damage to our livers), the field of potential candidates was winnowed down to around 12 or so. I then planned a 'party' with a few dozen friend-tasters to determine the final entrant into the contest.

As I had no experience with market research or experimental design, I was nervous about making rookie mistakes. I was careful, or so I thought, about the experimental design: assigning raters to cocktails in a balanced design, assigning random one-time codes to the cocktails, adding control cocktails, double blinding the tastings, and so on. The part that I was completely fanatical about was that tasters should not assign numerical ratings to the cocktails. I reasoned that the intra-rater and inter-rater reliability would be far too poor. Instead, each rater would be presented with two cocktails, and state their preference. While in some situations, with experienced raters using the same rating scale, this might result in a loss of power, in my situation, with a gaggle of half-drunk friends, it solved the problem of inconsistent application of numerical ratings. The remaining issue was how to interpret the data to select a winner.

## Tell me what you really think.

You can consider my cocktail party as a series of 'elections', each with two or more 'candidates' (in my case exactly two every time), and a single winner for each election. For each candidate in each election, you have some 'covariates', or 'features' as the ML people would call …