Getting Hired as a Data Scientist
Thu 19 May 2016
A few months back I wrote about my experiences trying to
hire a data scientist. It took some amount of work on our part. When we finally
found the right candidate, our parent company told us that there wasn't actually any
money to pay a candidate. This came as rather a surprise to all of us at our three-person
startup. This was the first indication that the wheels were coming off the bus, and two
months later, we were all laid off and the company dissolved. Within just three months
I went from hiring to scrambling for a job. Would I follow my own advice for job
candidates? What's the startup climate like? Is it easy to find a job in the field?
getting a foot in the door
I decided to make a habit of (nearly) always writing a cover letter, although I quickly settled
on two or three cover letter templates, depending on the job function and industry. I found
the address of each company and included it in the letter, mostly to confirm that the office
was in the city of San Francisco. When I submitted an application, I would save the letter
in my (private) applications repo on GitHub, with a commit message. I had been warned by a friend
who works in HR that cover letters were ignored in her office. In my experience, a cover
letter, even a somewhat generic one, set apart casual applicants from the serious candidates.
I also made the not uncontroversial decision to send out my lengthy CV, rather than a one or two
page resume. My thinking here was that it is easier for a hiring manager to read more details
in your CV if they are interested than try to infer details from a terse one page …
ML in Trading Talk
Wed 11 May 2016
My script for a talk on backtesting
Wed 09 March 2016
I gave a talk today for the Bloomington Data Collective. You can view
the slides or watch
the recorded talk.
The TL;DR is that profitable strategies are hard to make, but bugs are easy to make,
so Bayes' Rule suggests profitable-looking backtests are likely bugs, and here's a
catalogue of some of the errors I have seen and committed in my time.
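As a rough illustration of the Bayes' Rule argument, here is a worked example with made-up numbers (none of these figures come from the talk). Suppose 1% of candidate strategies are genuinely profitable (G), 20% of backtests contain a bug that inflates results (B), and the remaining 79% are neither (N), treating the three cases as mutually exclusive for simplicity. Suppose further that a backtest looks profitable (L) with probability 0.9 given G, 0.5 given B, and 0.02 given N. Then:

```latex
\[
P(B \mid L)
  = \frac{P(L \mid B)\,P(B)}
         {P(L \mid B)\,P(B) + P(L \mid G)\,P(G) + P(L \mid N)\,P(N)}
  = \frac{0.5 \times 0.20}
         {0.5 \times 0.20 + 0.9 \times 0.01 + 0.02 \times 0.79}
  = \frac{0.1}{0.1248}
  \approx 0.80
\]
```

Under these assumed numbers, roughly four out of five profitable-looking backtests owe their result to a bug rather than to a real edge.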
CRAN check like a bot with docker.
Tue 08 March 2016
If you're like me, you just blindly check boxes when submitting packages to CRAN. (The
'submit' button should be labeled 'yolo' as far as I'm concerned.) After getting
burned yet again for not actually checking my package with the development build
of R, I decided to be slightly less stupid in the future. Rather than install
R-devel, I made a docker base image
for CRAN checking.
As an example, to check my sadists package,
I made essentially the following Dockerfile:
MAINTAINER Steven E. Pav, email@example.com
# tweak this to force re-install
ENV DOCKER_INSTALL_NONCE 97c22800_9f88_4830_806a_2614e06600f2
# rinstall somethings...
RUN /usr/local/bin/install2.r PDQutils hypergeo orthopolynom shiny testthat ggplot2 xtable knitr
This builds FROM the crancheck image on Docker Hub. The general recipe would be to install any
system packages via
apt-get, then any CRAN packages via
install2.r, then any GitHub packages via
/usr/local/bin/installGithub.r. The base image 'does the right thing' with respect to the
entrypoint, and you give the package file as the command.
I built it via:
docker build --rm -t shabbychef/sadists-crancheck docker/
Once the image is built, checking a package is as 'simple' as mounting the local directory
at /srv in the container via a volume, and giving the name of the package file. (That is,
when the command to the container is
sadists_0.2.2.5000.tar.gz, it will try to check, as CRAN would,
/srv/sadists_0.2.2.5000.tar.gz. You had better make sure it is available there,
so mount the directory containing the package at
/srv in the container.)
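Since a missing package file is the easy mistake here, a small wrapper can check for it before the container starts. This is only a sketch: the function name crancheck and the DRY_RUN variable are hypothetical and not part of the base image. It assumes the image expects the package file name as its command and the package directory mounted read-only at /srv, as described.

```shell
# crancheck IMAGE TARBALL
# Hypothetical helper: verify that TARBALL exists, then mount its directory
# read-only at /srv and pass only the file name to the container.
# Set DRY_RUN=1 to print the docker command instead of running it.
crancheck() {
  if [ "$#" -ne 2 ]; then
    echo "usage: crancheck IMAGE TARBALL" >&2
    return 2
  fi
  cc_image=$1
  cc_tarball=$2

  # Fail early if the package file is not where we think it is.
  if [ ! -f "$cc_tarball" ]; then
    echo "crancheck: no such package file: $cc_tarball" >&2
    return 1
  fi

  # docker needs an absolute host path for the volume.
  cc_dir=$(cd "$(dirname "$cc_tarball")" && pwd) || return 1
  cc_file=$(basename "$cc_tarball")

  # Reuse the positional parameters to hold the docker command.
  set -- docker run -it --rm --volume "$cc_dir:/srv:ro" "$cc_image" "$cc_file"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

With this sourced into the shell, `crancheck shabbychef/sadists-crancheck sadists_0.2.2.5000.tar.gz` would run the check, and setting DRY_RUN=1 first only prints the command that would be run.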
In summary, run it like this:
docker run -it --rm --volume $(pwd):/srv:ro shabbychef/sadists-crancheck sadists_0.2.2.5000.tar.gz
You get output as follows: