Recent Posts

After teaching data science for several years, and through a global pandemic, I have decided to reboot my teaching philosophy. …

Reproducible presentations: why, and with what tools? Reproducible reports, a single document contains both the code needed to create figures …

Food for thought as you approach your next writing project: Imagine you are painting your bedroom. How do you do this? Do you take the …



Python Packages

This book is aimed at intermediate Python users who want to package up their code to share it with their collaborators (including their future selves) and the wider Python community. Its scope and intent are inspired by the R Packages book written by Hadley Wickham and Jenny Bryan.


The goal of {canlang} is to easily share language data collected in the 2016 Canadian census. This data was retrieved from the 2016 Canadian census data set using the {cancensus} R package.


rudaux sets up a course where students complete homework on a JupyterHub server that they access via a course management system (via LTI authentication), and homework is graded via nbgrader (which has both manual and autograding capabilities). Grades are posted to the course management system. In its current implementation these docs support only the Canvas course management system, but they could easily be extended to other platforms that use LTI and that have a gradebook API.


The goal of {ubccv} is to allow you to use R Markdown to create and edit your UBC Faculty CV without having to touch a Word document.

Introduction to Data Science

An open source textbook aimed at introducing undergraduate students to data science. It was originally written for the University of British Columbia’s DSCI 100 - Introduction to Data Science course. In this book, we define data science as the study and development of reproducible, auditable processes to obtain value (i.e., insight) from data.

Accelerating Gene Discovery by Phenotyping WGS MMP Strains and Using SKAT

Using C. elegans as a model system, we used a whole-genome sequenced multi-mutation library, from the Million Mutation Project, together with the Sequence Kernel Association Test (SKAT), to rapidly screen for and identify genes associated with a phenotype of interest, namely defects in dye-filling of ciliated sensory neurons. Such anomalies in dye-filling are often associated with the disruption of cilia, organelles which in humans are implicated in sensory physiology (including vision, smell and hearing), development and disease.

Recent & Upcoming Talks

An example talk using Academic’s Markdown slides feature.