Jonathan Olmsted

Advanced Statistical Programming Camp

Topics covered include:

  • parallel computing at the R and C++ levels
  • using HPC schedulers
  • basic C++ integration with R through Rcpp

These topics are demonstrated over the course of multiple days with a set of running examples:

  • efficiently calculating pairwise distances over the surface of the earth
  • parametric and non-parametric bootstraps
  • cross-validation for “out-of-sample” prediction error
  • Bayesian Probit regression (EM/MAP and MCMC)


The first session was mostly a teaser demonstrating the performance gains from various techniques applied to a common example: calculating pairwise distances. This particular set of slides (as a PDF) is here.
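
For readers who have not seen the pairwise-distance example, the following is a minimal sketch of what such a computation might look like in plain C++. It is not taken from the workshop slides: the haversine formula, the mean Earth radius of 6371 km, and the names haversine_km and pairwise_km are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumed mean Earth radius in kilometres (spherical approximation).
constexpr double kEarthRadiusKm = 6371.0;
constexpr double kPi = 3.14159265358979323846;

inline double deg2rad(double deg) { return deg * kPi / 180.0; }

// Great-circle distance between two points given in degrees,
// using the haversine formula (an illustrative choice).
double haversine_km(double lat1, double lon1, double lat2, double lon2) {
    const double phi1 = deg2rad(lat1);
    const double phi2 = deg2rad(lat2);
    const double s1 = std::sin((phi2 - phi1) / 2.0);
    const double s2 = std::sin(deg2rad(lon2 - lon1) / 2.0);
    const double a = s1 * s1 + std::cos(phi1) * std::cos(phi2) * s2 * s2;
    // Clamp to 1 in case rounding pushes a slightly above it.
    return 2.0 * kEarthRadiusKm * std::asin(std::sqrt(std::fmin(1.0, a)));
}

// Returns an n x n distance matrix in row-major order.
// Each unordered pair is computed once and mirrored across the diagonal.
std::vector<double> pairwise_km(const std::vector<double>& lat,
                                const std::vector<double>& lon) {
    assert(lat.size() == lon.size());
    const std::size_t n = lat.size();
    std::vector<double> d(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            const double dist = haversine_km(lat[i], lon[i], lat[j], lon[j]);
            d[i * n + j] = dist;
            d[j * n + i] = dist;
        }
    }
    return d;
}
```

Calling pairwise_km with vectors of latitudes and longitudes in degrees returns the full symmetric matrix. Because the work for each pair is independent of every other pair, a double loop like this is the kind of computation that compiled code and parallelization tend to speed up.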

The workshop materials comprise:

  • slides for each session’s presentation [PDF]
  • handouts with further documentation/examples [PDF]
  • all R code from slides and handouts [R]
  • all C++ code from slides [CPP/HPP]
  • job submission scripts [SLURM]

and can be downloaded here as a zipped directory.

If you have any questions, please don’t hesitate to ask. I’m happy to answer when I can.