## Chess Tactics.

Thu 30 March 2017 by Steven E. Pav

I have become more interested in chess in the last year, though I'm still
pretty much crap at it. Rather than play games, I am practicing tactics
at chesstempo. Basically you are presented
with a chess puzzle, which is selected based on your estimated tactical
'Elo' rating, and your rating (and the puzzle's) is adjusted based
on whether you solve it correctly. (There is no time limit for standard
problems, though I believe one can also train in 'blitz' mode.)
I decided to look at the data.
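
To fix ideas, the adjustment is Elo-like, with the size of the update
depending on the rating gap between solver and puzzle. Below is a minimal
Elo-style sketch; chesstempo's actual rating system and its K-factor are
not documented here, so the update rule is an illustrative assumption only.

```r
# Illustrative Elo-style update after one puzzle attempt (not chesstempo's
# actual rule). 'player' and 'puzzle' are current ratings; 'solved' is TRUE
# if the puzzle was answered correctly.
elo_update <- function(player, puzzle, solved, K = 32) {
  expected <- 1 / (1 + 10^((puzzle - player) / 400))  # expected score
  score <- ifelse(solved, 1, 0)
  delta <- K * (score - expected)
  c(player = player + delta, puzzle = puzzle - delta)
}

elo_update(1500, 1600, solved = TRUE)
```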

read more
## Lego Pricing.

Mon 06 March 2017 by Steven E. Pav

It is time to get kiddo a new Lego set, as he's been on a bender
this week, building everything he can get his hands on. I wanted to
optimize play time per dollar spent, so I set out to look
for Lego pricing data.

read more
## Odds of Winning Your Oscar Pool.

Mon 30 January 2017 by Steven E. Pav

In a previous blog post, I used a Bradley-Terry
model to analyze Oscar Best Picture winners, using the
best picture dataset.
In that post I presented the results of likelihood tests
which showed 'significant' relationships between winning the Best
Picture category and co-nomination for other awards, MovieLens ratings, and
(spuriously) number of IMDb votes. It can be hard to interpret the
effect sizes and \(t\) statistics from a Bradley-Terry model. So here
I will try to estimate the probability of correctly guessing the
Best Picture winner using this model.
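
For reference, Bradley-Terry models use an exponential-score
parameterization, and I will assume here (my reading, not something spelled
out above) that a year with \(k\) nominees is handled in the Plackett-Luce
style, so that the implied probability that nominee \(i\), with covariates
\(x_i\), wins is

\[
\Pr(i \text{ wins}) = \frac{\exp(x_i^{\top}\beta)}{\sum_{j=1}^{k} \exp(x_j^{\top}\beta)}.
\]

This probability depends on the whole field of nominees in a given year,
not on any single coefficient.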

There is no apparent direct translation from the coefficients
of the model fit to the probability of correctly forecasting
a winner, nor can you get there by transforming the maximized
likelihood or an R-squared. Moreover, that probability depends on the
number of nominees (traditionally there were only 5 Best Picture
nominations; these days it's upwards of 9): a uniform random guess is
right 20% of the time with 5 nominees but only about 11% of the time
with 9. It also depends on how much the nominees differ in the
independent variables. Here I will keep it simple and use cross
validation.

I modified the oslm code to include a `predict` method. Here I load the
data and the code, remove duplicates, and restrict the data to the period
after 1945. I construct the model formula, based on co-nomination, then
test it in three ways (a rough sketch of the walk-forward loop follows the
list):

- A purely 'in sample' validation where all the data are
used to build the model, then tested. (The film with the
highest forecast probability of winning is chosen as
the predicted winner, of course.) This should give the
most optimistic view of performance, even though the
likelihood maximization problem does not directly
select for this metric.
- A walk-forward cross validation where the data
up through year \(y-1\) are used to build the model,
which is then used to forecast the winner in year \(y\).
This is perhaps the most honest kind of cross validation
for ...
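
For concreteness, here is a rough sketch of that walk-forward loop. The
data frame columns (`year`, `winner`) and the helpers `fit_model()` and
`predict_probs()` are hypothetical stand-ins rather than the actual oslm
interface; the point is only the structure of fitting on years up through
\(y-1\) and then scoring year \(y\).

```r
# Walk-forward validation sketch: fit on years < y, pick the nominee with
# the highest predicted win probability in year y, and score the guess.
# 'oscars' is assumed to have one row per nominee with columns 'year' and
# 'winner' (0/1); fit_model() and predict_probs() are hypothetical
# stand-ins for the actual fitting and prediction code.
walk_forward <- function(oscars, fit_model, predict_probs, first_year = 1960) {
  years <- sort(unique(oscars$year))
  years <- years[years >= first_year]  # leave some initial history to train on
  hits <- sapply(years, function(y) {
    train <- oscars[oscars$year <  y, ]
    test  <- oscars[oscars$year == y, ]
    fitted <- fit_model(train)
    probs  <- predict_probs(fitted, test)
    # the predicted winner is the nominee with the highest forecast probability
    test$winner[which.max(probs)] == 1
  })
  mean(hits)  # fraction of years in which the winner was guessed correctly
}
```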

read more