#1 (2018-09-24)

VisStat HT18 #1

Homework

  • Read Wilson et al. 2017. Good Enough Practices in Scientific Computing. PLOS Computational Biology 13 (6): e1005510. https://doi.org/10.1371/journal.pcbi.1005510 and prepare a question or comment for the next meeting
  • Follow the instructions in the following slides to:

    1. Install the tidyverse library
    2. Experiment with writing ‘functions’ to package together a bunch of commands into a meaningful action
  • Work through notes from class. These are posted as 01.introduction-notes.Rmd. Please experiment! Email me any questions and I will address them in the next meeting.

Tidyverse

  • The tidyverse is a large collection of packages (loosely called ‘libraries’ in R speak, because they are loaded with library()) that impose a consistent design philosophy on R
  • For people who have learned ‘old-fashioned’ R this counts as advanced
  • But for complete beginners it can offer a better entry point to the language
    • no need to deal with the legacy of messy, illogical design choices* that have been made over the years
  • To activate a library enter the following (you will need to do this once each session and at the start of each script)
library(tidyverse)

* not actually choices in many cases

Install a package

  • Install the tidyverse package (Warning: takes a long time!)
  • Go to the “Tools” → “Install Packages…” menu item
  • Leave the “Install from” field at the default
  • Make sure “Install dependencies” is checked
  • The default “Install to library” setting on my computer makes a global install (all users); on the lab computer it makes a local install (only for you). If you have problems installing packages we might need to fix this.

A note on R ‘names’

  • Functions, variables etc are the nouns and verbs of the R language.
  • Legal R names:

    • a sequence of letters, digits, periods and underscores that starts with a letter (or with a period not followed by a digit) and is not a reserved word
    • otherwise must be in `backticks`

i.e. backticks are used to refer to names that are otherwise reserved or illegal

This is important for column names in tables, since they often have e.g. spaces in them
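
As a minimal sketch of the naming rules above, using only base R: the names rainfall_mm, measurements, `site name` and `rainfall (mm)` are made up for illustration and do not come from the course data.

```r
# A legal name needs no quoting
rainfall_mm <- 12.5

# A name containing a space and brackets is illegal, so it needs backticks
`rainfall (mm)` <- 12.5

# A small table whose column names contain spaces.
# check.names = FALSE stops data.frame() from rewriting them into legal names.
measurements <- data.frame(`site name` = c("A", "B"),
                           `rainfall (mm)` = c(12.5, 3.1),
                           check.names = FALSE)

# Backticks are needed to reach such a column with $
measurements$`rainfall (mm)`
```

Without the backticks, R would try to read the column name as several separate pieces of code and stop with an error.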

What’s a function?

sqrt(4) # square root
## [1] 2
sqrt(19)
## [1] 4.359
  • The [1] in the console output is the position of the first value printed on that line; you can ignore it for now
  • R has lots of built-in functions
  • You can define your own functions with:
# This isn't a real function, so it won't work
FUNCTION_NAME <- function(ARGUMENTS) {ACTIONS TO CARRY OUT ON ARGS}

The grammar of functions

# This isn't a real function, so it won't work
want <- function(subject, object){
  subject "WANTS" object
}
want(👱, 🍰)
R> Michael wants cake
sit <- function(subject, destination=NA){
  subject "SITS" (IN destination)
  }
sit(🐨)
sit(🐨, 🌳)
R> The koala is sitting
R> The koala is sitting in the tree
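
The pseudo-functions above can be turned into something R will actually run. This is only one possible sketch: it assumes the subject and object are passed as text strings and uses paste() to glue the words together, which is a choice made here and not something the slides prescribe.

```r
# want() needs both a subject and an object
want <- function(subject, object) {
  paste(subject, "wants", object)
}
want("Michael", "cake")
## [1] "Michael wants cake"

# sit() has an optional argument: destination defaults to NA (missing)
sit <- function(subject, destination = NA) {
  if (is.na(destination)) {
    paste(subject, "is sitting")
  } else {
    paste(subject, "is sitting in", destination)
  }
}
sit("The koala")
## [1] "The koala is sitting"
sit("The koala", "the tree")
## [1] "The koala is sitting in the tree"
```

The default value destination = NA is what lets sit() be called with one argument or with two.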

Make a function

a² + b² = c²

get.hypotenuse <- function(a, b){
  a2 <- a ^ 2
  b2 <- b ^ 2
  c2 <- a2 + b2
  c <- sqrt(c2) # square root
  c # this is the value that will be "returned"
}
get.hypotenuse(3, 4)
## [1] 5

Markdown notebooks

  • “File” → “New File” → “R Notebook”

  • Exercises:

    • Redo the hypotenuse example in a notebook (play around, e.g. try defining the function in one line)
    • Can you think of other formulae that you could turn into a function? (e.g. the area of a circle is πr²; the volume of a sphere is 4/3 πr³; hint: in R, π is written pi)
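
If you get stuck on the second exercise, here is one possible way to write the two example formulae as functions. The names circle_area and sphere_volume are invented for this sketch; any legal R name would do.

```r
# Area of a circle with radius r: pi * r^2
circle_area <- function(r) pi * r ^ 2

# Volume of a sphere with radius r: 4/3 * pi * r^3
sphere_volume <- function(r) {
  4 / 3 * pi * r ^ 3
}

circle_area(1)
sphere_volume(1)
```

Both are defined with the same function(ARGUMENTS) pattern as get.hypotenuse; the first one shows that the curly braces can be left out when the body is a single expression.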

Help function

  • help() is a function that returns information about other functions. Here’s how to get help for the read_delim family of functions:
help(read_delim)
  • You would get the same output for e.g. help(read_csv), help(read_csv2) etc., because the whole family shares one help page

  • The read_delim functions return a tibble, a special format of tabular data like a table from a spreadsheet

  • help(read_delim) only finds tidyverse functions if tidyverse is loaded; otherwise you have to name the package explicitly, e.g. help("read_delim", package = "readr")