Getting Hired as a Data Scientist

Thu 19 May 2016 by Steven

A few months back I wrote about my experiences trying to hire a data scientist. It took a fair amount of work on our part. When we finally found the right candidate, our parent company told us that there wasn't actually any money to pay a candidate. This came as rather a surprise to all of us at our three-person startup. It was the first indication that the wheels were coming off the bus, and two months later we were all laid off and the company dissolved. Within just three months I went from hiring to scrambling for a job. Would I follow my own advice for job candidates? What's the startup climate like? Is it easy to find a job in the field?

getting a foot in the door

I decided to make a habit of (nearly) always writing a cover letter, although I quickly settled on two or three cover letter templates, depending on the job function and industry. I found the address of each company and included it in the letter, mostly to confirm that the office was in the city of San Francisco. When I submitted an application, I would save the letter in my (private) applications repo on GitHub, with a message. I had been warned by a friend who works in HR that cover letters were ignored in her office. In my experience, a cover letter, even a somewhat generic one, set the serious candidates apart from the casual applicants.

I also made the not uncontroversial decision to send out my lengthy CV, rather than a one- or two-page resume. My thinking here was that it is easier for a hiring manager to read more details in your CV if they are interested than to try to infer details from a terse one page …


ML in Trading Talk

Wed 11 May 2016 by Steven

My script for a talk on backtesting


Backtest Boo-Boos.

Wed 09 March 2016 by Steven

I gave a talk today for the Bloomington Data Collective. You can view the slides or watch the recorded talk. The TL;DR is that profitable strategies are hard to make, but bugs are easy to make, so Bayes' Rule suggests profitable-looking backtests are likely bugs, and here's a catalogue of some of the errors I have seen and committed in my time.


CRAN check like a bot with docker.

Tue 08 March 2016 by Steven

If you're like me, you just blindly check boxes when submitting packages to CRAN. (The 'submit' button should be labeled 'yolo' as far as I'm concerned.) After getting burned yet again for not actually checking my package with the development build of R, I decided to be slightly less stupid in the future. Rather than install R-devel, I made a docker base image for CRAN checking.

As an example, to check my sadists package, I made essentially the following Dockerfile:

# preamble
FROM shabbychef/crancheck

# tweak this to force re-install
ENV DOCKER_INSTALL_NONCE 97c22800_9f88_4830_806a_2614e06600f2

# install some R packages from CRAN
RUN /usr/local/bin/install2.r PDQutils hypergeo orthopolynom shiny testthat ggplot2 xtable knitr

It starts FROM the crancheck image on Docker Hub. The general recipe is to install any system packages via apt-get, then any CRAN packages via install2.r, then any GitHub packages via /usr/local/bin/installGithub.r. The base image 'does the right thing' with respect to the entrypoint, so you give the package file as the command.
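The three-step recipe can be sketched as a small shell function that writes out a Dockerfile. This is only an illustration: the function name, the system library libgsl-dev, the CRAN packages, and the github repository someuser/somepkg are all placeholders, and it assumes the base image is Debian-based and runs build steps as root, so that apt-get works.

```shell
#!/bin/sh
# Sketch: emit a Dockerfile that follows the apt-get / install2.r /
# installGithub.r recipe. Every package name below is a placeholder;
# substitute whatever your own package needs.
write_crancheck_dockerfile() {
  cat <<'EOF'
FROM shabbychef/crancheck

# 1. system packages (assumes a Debian-based base image, running as root)
RUN apt-get update && apt-get install -y --no-install-recommends libgsl-dev \
    && rm -rf /var/lib/apt/lists/*

# 2. CRAN packages
RUN /usr/local/bin/install2.r testthat knitr

# 3. github packages (hypothetical repository)
RUN /usr/local/bin/installGithub.r someuser/somepkg
EOF
}

# Typical use: write_crancheck_dockerfile > docker/Dockerfile
```

The sadists Dockerfile shown earlier needs only the second step, since all of its dependencies are on CRAN.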

I built it via:

docker build --rm -t shabbychef/sadists-crancheck docker/

Once the image is built, checking a package is as 'simple' as mounting the local directory as /srv in the container via a volume and giving the name of the package file as the command. (When the command to the container is sadists_0.2.2.5000.tar.gz, it will try to check, as CRAN would, the file /srv/sadists_0.2.2.5000.tar.gz, so the directory containing the package must be mounted at /srv.) In summary, run it like this:

docker run -it --rm --volume "$(pwd)":/srv:ro shabbychef/sadists-crancheck sadists_0.2.2.5000.tar.gz
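Since the container only sees what is mounted at /srv, it is easy to trip over a missing or misplaced tarball. A minimal sketch of a guard around the run command follows. The function name crancheck and the DOCKER override variable are my own inventions for illustration and are not part of the base image.

```shell
#!/bin/sh
# Sketch: run a crancheck-style image against a tarball in the current directory.
# Usage: crancheck IMAGE TARBALL
# Set DOCKER=echo to print the arguments instead of running docker.
crancheck() {
  image="$1"    # e.g. shabbychef/sadists-crancheck
  tarball="$2"  # e.g. sadists_0.2.2.5000.tar.gz
  # The container looks for /srv/<name>, so only a bare file name makes sense.
  case "$tarball" in
    */*) echo "give a bare file name, not a path: $tarball" >&2; return 1 ;;
  esac
  if [ ! -f "$tarball" ]; then
    echo "no such file in $(pwd): $tarball" >&2
    return 1
  fi
  # Mount the current directory read-only at /srv and pass the file name as the command.
  "${DOCKER:-docker}" run -it --rm --volume "$(pwd)":/srv:ro "$image" "$tarball"
}
```

Run from the directory holding the tarball, `crancheck shabbychef/sadists-crancheck sadists_0.2.2.5000.tar.gz` issues the same docker command as above, but fails early if the file is not there.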

You get output as follows:

* using log directory …