Data Visualisation in R

Author

Government Analysis Function and ONS Data Science Campus


1 Course Summary

This course is a companion to the Introduction to Data Visualisation theory courses provided by the Analysis Function, and it follows the most up-to-date AF guidelines. It introduces a range of Data Visualisation techniques and practices in the programming language R, using the popular ggplot2 package for plots and the gt package for tables. Good practice guidelines are followed throughout, alongside strategies that reduce code repetition by defining design elements of visualisations once in the global environment so that they can be reused.
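As a rough illustration of what defining design elements once can look like, the sketch below stores a theme and a colour as variables and registers the theme with ggplot2's `theme_set()`. The names `course_theme` and `main_colour`, and the specific theme settings and hex value, are assumptions made for this example and are not taken from the course materials or AF guidance.

```r
library(ggplot2)

# Hypothetical house style: the settings below are illustrative only
course_theme <- theme_minimal(base_size = 14) +
  theme(
    panel.grid.minor = element_blank(),
    legend.position = "bottom"
  )

# Register the theme so every later plot in the session uses it
theme_set(course_theme)

# Hypothetical reusable colour, stored once instead of retyped per plot
main_colour <- "#12436D"

# The plot picks up the global theme without any further theme code
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(colour = main_colour) +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")
```

Changing `course_theme` or `main_colour` in one place then updates every plot built afterwards, which is the kind of repetition saving described above.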

The course is split into core and reference material, as signposted in Chapter 1 - Introduction. Chapters 1-4 are the core material, whereas chapters 5-7 are reference material that introduces less commonly applicable plot types, as well as tables. Since pie charts, donut charts and violin plots are less recommended for use in Data Science than other standard plot types, they have been placed in the reference material for those who need them.

2 Course Materials

The course materials come in several formats:

  • HTML pages such as the one you are reading now

  • Data we will use during the course. We highly recommend creating a project with a ‘data’ folder and downloading all the required datasets before starting the course.
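One possible way to set up the ‘data’ folder from the R console is sketched below. It assumes the working directory is the root of your project; the variable name `data_dir` is chosen for this example only.

```r
# Assumes the working directory is the project root
data_dir <- "data"

# Create the folder only if it does not already exist
if (!dir.exists(data_dir)) {
  dir.create(data_dir)
}

# List whatever has been downloaded into the folder so far
list.files(data_dir)
```

Datasets saved into this folder can then be referenced with relative paths built by `file.path(data_dir, ...)`.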

You can also navigate to the course GitHub repository and clone or fork the website structure for yourself. If you are new to programming and version control, we recommend you remain on the website for the best experience.

3 Learning Outcomes

  • To evaluate the capabilities of different visualisation tools and techniques to identify the most appropriate.

  • To apply visualisation techniques to produce a variety of plots that assist with the exploration of datasets.

  • To create static plots ready for publication that follow good practice and accessibility guidelines.

  • To implement clean code principles that reduce repetition by setting design elements as variables.

  • To examine and critically evaluate plots to draw out meaningful insight from data.

4 Pre-Requisite Summary

Some prior knowledge of R is expected, including knowledge of reading in data and manipulating it using the tidyverse package collection. As such, it is recommended that the learner has completed the Introduction to R course.

Learners are also expected to have taken Introduction to Data Visualisation before taking this course. That short course covers the theory behind Data Visualisation and good practice guidelines, and introduces the numerous plot types available for publication. It serves as a useful reference point for applying these techniques and creating these visualisations in programming languages, and it aligns with the reference material of the Python/R courses, which break plots down into their visualisation layers.

5 Software Requirements

  • RStudio/Posit

  • Packages:

    • tidyverse (dplyr, ggplot2 and readr used frequently throughout)

    • janitor

    • showtext

    • patchwork

    • ggthemes

    • RColorBrewer

    • scales

    • gghighlight

    • gt

    • ggrepel
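The packages listed above could be checked and installed along the following lines. This is a minimal sketch, not a prescribed setup: it uses the case-sensitive CRAN names (patchwork, RColorBrewer, scales), the variable names are chosen for this example, and it only calls `install.packages()` in an interactive session so that scripts never trigger downloads unexpectedly.

```r
# Package names as published on CRAN (R package names are case-sensitive)
course_packages <- c(
  "tidyverse", "janitor", "showtext", "patchwork", "ggthemes",
  "RColorBrewer", "scales", "gghighlight", "gt", "ggrepel"
)

# Work out which of the course packages are not yet installed
installed <- rownames(installed.packages())
missing_packages <- setdiff(course_packages, installed)

if (length(missing_packages) == 0) {
  message("All course packages are already installed.")
} else if (interactive()) {
  # Install only what is missing, and only when run from the console
  install.packages(missing_packages)
} else {
  message("Missing packages: ", paste(missing_packages, collapse = ", "))
}
```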

Reuse

Open Government Licence 3.0