University of Alberta

July 12-13, 2017

9:00am - 4:00pm

Instructors: John Simpson, Brian Rusk, Alicia Cappello

Helpers: TBD

General Information

Data Carpentry aims to help researchers get their work done in less time and with less pain by teaching them basic research computing skills. This hands-on workshop will cover basic concepts and tools, including program design, version control, data management, and task automation. Participants will be encouraged to help one another and to apply what they have learned to their own research problems.

For more information on what we teach and why, please see our paper "Best Practices for Scientific Computing".

Who: The course is aimed at graduate students and other researchers. You don't need to have any previous knowledge of the tools that will be presented at the workshop.

Where: Natural Resources Engineering (NRE) 2-125. Get directions with OpenStreetMap or Google Maps.

When: July 12-13, 2017. Add to your Google Calendar.

Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) that they have administrative privileges on. They should have a few specific software packages installed (listed below). They are also required to abide by Data Carpentry's Code of Conduct.

Accessibility: We are committed to making this workshop accessible to everybody. The workshop organizers have checked that:

Materials will be provided in advance of the workshop, and large-print handouts are available if you notify the organizers in advance. If we can help make learning easier for you (e.g. sign-language interpreters, lactation facilities), please get in touch (using the contact details below) and we will attempt to provide them.

Contact: Please email for more information.

No refreshments provided: Contrary to our previous practice, we are not charging a fee for this workshop. One consequence is that refreshments will not be provided. Coffee and snacks are available on the main floor of the building and a short distance away via a between-building bridge.

If you are looking for the GitHub repository that powers this site it is at



Please be sure to complete these surveys before and after the workshop.

Pre-workshop Survey

Post-workshop Survey


Day 1

9:00   Welcome and Introduction
9:15   Data organization in spreadsheets
10:30  Coffee Break
10:45  OpenRefine for data cleaning
12:00  Lunch Break
1:00   Introduction to the command line
2:30   Coffee Break
2:45   Command line for automation
4:00   Wrap up

Day 2

9:00-9:15  Introducing Day 2
9:15       Getting started with RStudio and R
10:30      Coffee Break
10:45      Reading Data into R
12:00      Lunch Break
1:00       Cleaning Data in R
2:30       Coffee Break
2:45       Data Visualization in R
4:00       Wrap up

Bonus R content is available HERE.


To participate in a Data Carpentry workshop, you will need access to the software described HERE. In addition, you will need an up-to-date web browser.

We maintain a list of common installation issues on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but it may be useful if you run into trouble during setup.

We will use this collaborative document for chatting, taking notes, and sharing URLs and bits of code.