Getting Started - R

Author

ONS Data Science Campus

1 Packages

The purpose of this course is to explore how to structure code, rather than look in depth at analysis or other software packages.

However, the chosen case study requires a few packages in order to run all of the code.

As the examples used are basic, versioning issues are unlikely. For completeness, the dependencies are listed below.

When completing data analysis in R there is a wide range of packages to choose from, many of which achieve similar outcomes. For consistency, this course uses packages from the tidyverse family.

The packages used are:

  • tidyr
  • dplyr
  • stringr
  • readr

Please ensure you have the following versions of the above packages, or newer:

package   version
tidyr     1.1.2
dplyr     1.0.2
stringr   1.4.0
readr     1.4.0

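To check which versions you currently have, load the relevant package in the console and then call sessionInfo(). The attached packages and their versions are printed under "other attached packages", which is stored as $otherPkgs in the object that sessionInfo() returns.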

>library("library_name")

>sessionInfo()
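As an alternative to reading the sessionInfo() output by eye, the minimum versions in the table above can be compared programmatically. The following is a minimal sketch, not part of the course material: check_versions() is a hypothetical helper written for this illustration, built on the base R functions requireNamespace(), packageVersion() and package_version().

```r
# Minimum versions taken from the table above.
required <- c(tidyr = "1.1.2", dplyr = "1.0.2",
              stringr = "1.4.0", readr = "1.4.0")

# check_versions() is a hypothetical helper for illustration only.
check_versions <- function(required) {
  # Installed version as text, or NA if the package is not installed.
  installed <- vapply(names(required), function(pkg) {
    if (requireNamespace(pkg, quietly = TRUE)) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_
    }
  }, character(1))

  # package_version() compares versions component by component,
  # so "1.10.0" is correctly treated as newer than "1.9.0".
  meets <- vapply(names(required), function(pkg) {
    !is.na(installed[[pkg]]) &&
      package_version(installed[[pkg]]) >= package_version(required[[pkg]])
  }, logical(1))

  data.frame(package = names(required),
             required = unname(required),
             installed = unname(installed),
             meets_minimum = unname(meets),
             stringsAsFactors = FALSE)
}

result <- check_versions(required)
print(result)
```

A row with NA in the installed column means the package was not found, and FALSE in meets_minimum means the installed version is older than the one listed.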

All of the required packages can be installed at once by installing the tidyverse package. Once installed, library(tidyverse) loads all four of them.

On ONS Windows machines, the installation is done using the following command:

>install.packages("tidyverse", type = "win.binary", dependencies = TRUE)

Other devices and departments may have different options for installing packages. Please consult your IT team if you are not sure how this is meant to be done.

The most common way to install a package in R is:

>install.packages("tidyverse")

Your version of the tidyverse package should be 1.3.0 or greater.

If you would like to ensure that you are using the specified versions of the R packages above, you can use the remotes package to do so. However, this also depends on Rtools and may be restricted on certain systems. If remotes and Rtools are available and unrestricted on your machine, you may run the code below to install a specific earlier version of a package. This may help to avoid breaking changes.

# Install one specific version of a package (here tidyr 1.1.2)
library(remotes)
install_version("tidyr", version = "1.1.2")

2 Exercises

To complete this course you will need to work in R to edit .R files.

To complete the exercises you will need to run whole files, rather than executing them line by line.
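One way to run a whole file from the console is the base R function source(). The sketch below is only an illustration: it writes a small throwaway script to a temporary folder so that it is self-contained, and the file name whole_file_demo.R is hypothetical. In the course you would point source() at an existing .R file instead.

```r
# Create a small throwaway script so this sketch runs anywhere.
# The file name is hypothetical and not part of the course material.
script_path <- file.path(tempdir(), "whole_file_demo.R")
writeLines(c("total <- sum(1:10)",
             "message('total is ', total)"),
           script_path)

# source() evaluates every line of the file, top to bottom, in one go.
# Running it in its own environment keeps the global workspace clean.
demo_env <- new.env()
source(script_path, local = demo_env)

# Objects created by the script are now available in demo_env.
print(demo_env$total)
```

From a terminal, the equivalent is to call Rscript followed by the path to the file.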

Within the folder data/ are all of the data files required to complete this course.

Example code, and the code used to complete the exercises, is contained within the folder example_code_r/.

The exercises are spread across several different sets of files, so please ensure you are in the folder location described in each exercise.
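A quick way to confirm where R is currently working is to print the working directory and check that the course folders are visible from it. This is a minimal sketch that assumes data/ and example_code_r/ sit directly inside your working directory; individual exercises may ask you to work from a different location.

```r
# Folder names are taken from the course text. Their location relative
# to the working directory is an assumption made for this sketch.
course_folders <- c("data", "example_code_r")

# The folder from which R resolves relative paths.
print(getwd())

# TRUE for each folder that is visible from the working directory.
folders_found <- dir.exists(course_folders)
names(folders_found) <- course_folders
print(folders_found)

if (!all(folders_found)) {
  message("Not found from here: ",
          paste(course_folders[!folders_found], collapse = ", "))
}
```

If a folder is reported as not found, change the working directory with setwd() before running the exercise files.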

Continue on to Modular Programming in R