Workshop Title: Introduction

Introduction

Spreadsheet programs are very useful graphical interfaces for designing data tables and handling basic data quality control functions.

Spreadsheet programs

Commands may differ a bit between programs, but the general ideas for thinking about spreadsheets are the same.

Spreadsheets encompass a lot of the things we need to be able to do as researchers, from designing and organizing data tables to generating summary statistics and making figures.

How many people have used spreadsheets in their research?

Spreadsheets can be very useful, but they can also be frustrating and can sometimes give us incorrect results.

What are some things that you’ve accidentally done in a spreadsheet, or have been frustrated that you can’t do easily?

Spreadsheet outline

In this lesson we’re going to talk about how to use spreadsheets to organize and clean data. We’re not going to go through the built-in functions available in most spreadsheet software.

The cardinal rules of using spreadsheet programs for organizing data:

1. Put all your variables in columns.
2. Put each observation in its own row.
3. Leave the raw data raw - don’t mess with it!
4. Export to a text-based format like CSV.
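
The four rules above can be sketched in a few lines of Python. This is only an illustration, not part of the spreadsheet workflow itself: the column names (plot, species, weight_g), the sample values, and the helper to_csv_text are all made up for the example. It shows one variable per column, one observation per row, cleaning done on a copy so the raw rows stay untouched, and export to CSV text.

```python
import csv
import io

# Hypothetical observations: one dict per observation (row),
# one key per variable (column). Values are invented for illustration.
raw_rows = [
    {"plot": "1", "species": "dm ", "weight_g": "40"},
    {"plot": "1", "species": "DO", "weight_g": "48"},
    {"plot": "2", "species": "DM", "weight_g": "37"},
]


def to_csv_text(rows, fieldnames):
    """Serialize rows to CSV text with a single header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Rule 3: leave the raw data raw. Clean a copy, never the original rows.
cleaned_rows = [dict(row) for row in raw_rows]
for row in cleaned_rows:
    row["species"] = row["species"].strip().upper()

# Rule 4: export to a text-based format such as CSV.
csv_text = to_csv_text(cleaned_rows, ["plot", "species", "weight_g"])
print(csv_text)
```

In a spreadsheet program the same idea applies: keep the original file as it is, do the cleaning in a copy, and save the result as CSV.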

In reality, though, many researchers use spreadsheet programs for much more than this. We use them to create data tables for publications, to generate summary statistics, and to make figures.

In this lesson, we will assume that you are most likely using Excel as your primary spreadsheet program.

In this lesson, we’re going to talk about:

  1. Formatting data tables in spreadsheets.
  2. Common formatting mistakes by spreadsheet users.
  3. Dates as data.
  4. Basic quality control and data manipulation in spreadsheets.
  5. Exporting data from spreadsheets.
  6. Caveats of data export formats.
