I like the calendar 'heatmap' plots of commits you can see on GitHub user pages, and wanted to play around with some. Of course, if I just wanted to make some plots, I could have googled around and followed this recipe, or maybe used the rChartsCalmap package. Instead I set out, as an exercise, to make my own using ggplot2.

For data, I am using the daily GHCND observations for station USC00047880, which is located at the San Rafael, CA, Civic Center. I downloaded this data directly from the NOAA FTP site as part of a project to join weather data to campground data (yes, it's been done before), then read the fixed-width file. I then processed the data, subset it to 2016 and beyond, and converted the units. I am left with a dataframe of dates, element names, and values, which are temperatures in Celsius. The first ten rows are shown here:

date        element  value
2016-01-01  TMAX       9.4
2016-01-01  TMIN       0.0
2016-01-02  TMAX      10.0
2016-01-02  TMIN       3.9
2016-01-03  TMAX      11.7
2016-01-03  TMIN       6.7
2016-01-04  TMAX      12.8
2016-01-04  TMIN       6.7
2016-01-05  TMAX      12.8
2016-01-05  TMIN       8.3
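
The reading and conversion steps described above might look roughly like the following. This is a minimal sketch, not the code used for this post: the function name `parse_dly` is made up, and it assumes the standard GHCN-Daily `.dly` layout (an 11-character station id, then year, month, and element name, followed by 31 eight-character day slots each holding a value and three flags), with temperatures stored in tenths of a degree Celsius and -9999 marking missing days. The input line is synthetic, not real station data.

```r
# Hypothetical helper: parse GHCN-Daily ".dly" lines into long format.
# Assumed layout: id (1-11), year (12-15), month (16-17), element (18-21),
# then 31 slots of 8 characters each (5-character value + 3 flag characters).
parse_dly <- function(lines, keep = c("TMAX", "TMIN"),
                      from = as.Date("2016-01-01")) {
  year    <- as.integer(substr(lines, 12, 15))
  month   <- as.integer(substr(lines, 16, 17))
  element <- substr(lines, 18, 21)
  days <- lapply(1:31, function(d) {
    start <- 22 + (d - 1) * 8
    data.frame(
      # impossible dates such as February 30 become NA and are dropped below
      date    = as.Date(sprintf("%04d-%02d-%02d", year, month, d),
                        format = "%Y-%m-%d"),
      element = element,
      value   = as.integer(substr(lines, start, start + 4)),
      stringsAsFactors = FALSE
    )
  })
  out <- do.call(rbind, days)
  ok  <- !is.na(out$date) & !is.na(out$value) & out$value != -9999 &
         out$element %in% keep & out$date >= from
  out <- out[ok, ]
  out$value <- out$value / 10          # tenths of a degree C -> degrees C
  out <- out[order(out$date, out$element), ]
  rownames(out) <- NULL
  out
}

# Synthetic one-line input: two observed days, the other 29 marked missing.
vals <- c(94L, 100L, rep(-9999L, 29))
line <- paste0("USC00047880", "2016", "01", "TMAX",
               paste0(sprintf("%5d   ", vals), collapse = ""))
pltdat_demo <- parse_dly(line)
print(pltdat_demo)
```

With a real file, the lines would come from calling `readLines()` on the downloaded `.dly` file.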

Here is the code to produce the heatmap itself. I first use the date field to compute the x axis labels and locations: the dates are converted to 'Julian' days counted from January 4, 1970 (a Sunday), then divided by seven to get a 'Julian' week number. The week position of the tenth of each month is then used as the location of the month name in the x axis labels. I add years to the January labels.
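
To see why the Sunday origin matters, here is a small check of that arithmetic. It is only an illustration with arbitrary sample dates: because day zero is a Sunday, integer division by seven gives a week index that ticks over every Sunday, and the remainder gives the day of the week with Sunday as zero.

```r
# days since Sunday 1970-01-04
jules <- function(x) { as.numeric(base::julian(x, origin = as.Date('1970-01-04'))) }

d    <- as.Date(c('2016-01-01', '2016-01-02', '2016-01-03', '2016-01-10'))
juld <- jules(d)
wk   <- data.frame(date = d,
                   juld = juld,
                   wnum = juld %/% 7,   # week index, increments on Sundays
                   wday = juld %% 7)    # 0 = Sunday, ..., 6 = Saturday
print(wk)
# 2016-01-01 (a Friday) is day 16798: week 2399, weekday 5
# 2016-01-03 (a Sunday) is day 16800: week 2400, weekday 0
```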

I then compute the Julian week number and the day of the week. I create a variable which alternates between plus and minus one each month, then color the 'grout' between my tiles in different shades of gray to delineate the month boundaries. I then geom_tile the values (using viridis to get scales visible to the colorblind); create facet rows for the minimum and maximum temperature; add the x breaks and labels; fiddle with the guides and impose a minimal theme; then set the coordinates to 'fixed' to make the tiles square. Voila:

library(dplyr)
library(lubridate)
library(ggplot2)
library(viridis)


# days since Sunday 1970-01-04, so that weeks start on Sundays
jules <- function(x) { as.numeric(base::julian(x,origin=as.Date('1970-01-04'))) }

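# pltdat is the data frame described above (date, element, value).
# get the (fractional) week position of the 10th of each month, for the month labels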
aseq <- data.frame(alldate=seq(min(pltdat$date),max(pltdat$date),by=1)) %>%
    filter(lubridate::day(alldate) == 10) %>%
    mutate(moname=month(alldate,label=TRUE),wnum=jules(alldate)/7.0,yrnum=year(alldate)) %>%
    mutate(label=ifelse(moname=='Jan',paste0(moname,'\n',yrnum),as.character(moname)))

# now the plot itself
ph <- pltdat %>%
    mutate(juld=jules(date),
                 mono=month(date,label=FALSE),
                 dayname=factor(weekdays(date,abbreviate=TRUE),levels=rev(c('Sun','Mon','Tue','Wed','Thu','Fri','Sat')))) %>%
    mutate(wnum=floor(juld) %/% 7,
                 moalt=factor(sign((-1)^mono))) %>%
    ggplot(aes(wnum,dayname,fill=value)) +
    geom_tile(aes(color=moalt), na.rm=TRUE,size=0.7) + 
    scale_fill_viridis(option='viridis',na.value='white') + 
    scale_color_manual(values=c('gray40','gray80')) +
    scale_y_discrete(breaks=c('Mon','Wed','Fri')) +
    scale_x_continuous(breaks=aseq$wnum,labels=aseq$label) +
    facet_grid(element ~ .) + 
    guides(color='none') + theme_minimal() + theme(panel.grid=element_blank()) + coord_fixed() +
    labs(y='',x='',fill='temp C',title='Min and Max daily temperature, NOAA station USC00047880, San Rafael, California')
print(ph)

[Figure: calendar heatmap of min and max daily temperature, NOAA station USC00047880, San Rafael, California]
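
The alternating month variable that drives the 'grout' color is easy to check on its own. The snippet below is just an illustration of what `sign((-1)^mono)` produces: odd months map to -1 and even months to 1. Since `scale_color_manual` matches unnamed values to the factor levels in order, odd months get 'gray40' and even months get 'gray80'. The `grout` column here is only a stand-in for that matching, not something the plotting code computes.

```r
mono  <- 1:12
moalt <- factor(sign((-1)^mono))                    # levels are "-1" and "1"
grout <- c('gray40', 'gray80')[as.integer(moalt)]   # same order as the factor levels
alt   <- data.frame(month = month.abb, moalt = moalt, grout = grout)
print(alt)
```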

Well, that was fun. So much fun, I will run it again on my win rate from Chesstempo tactics attempts, which at least might show some weekday/weekend pattern:

[Figure: calendar heatmap of daily win rate on Chesstempo tactics attempts]