Summary & Further Resources - R

In this course we have worked through:

By using the ideas and skills in the course your analysis will be:

This course has focussed on moving from scripts to functions and modules. You can avoid this refactoring step by writing new projects as functions, split across different files, from the start. This takes practice, but planning and experience will make it easier.

1 Further Reading

Structuring your code in projects is an important part of building reproducible analysis.

The concepts introduced in this material are a starting point to help break down your scripts into functions and files.

You can further improve your projects by taking into account other concepts contained within the Quality Assurance of Code for Analysis and Research.

Below is a brief summary of related concepts that will help you structure your code further. These topics will be covered in later material once it has been created; for now, links to relevant resources are included.

1.1 Documentation

When we split a project across multiple files it becomes even more important than usual to document our code well. Good documentation helps us understand what is happening across functions and modules without having to learn every single detail.

Minimal function documentation has been added to the case study analysis.

Better-documented functions would contain full docstrings that explain the parameters, behaviour and outputs of each function in detail.
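As a minimal sketch of what a fuller docstring could look like, the example below uses roxygen2-style comments, one common convention in R. The function `column_mean()` and its parameters are hypothetical and are not taken from the case study.

```r
#' Calculate the mean of a numeric column
#'
#' Hypothetical example function used only to illustrate documentation.
#'
#' @param data A data frame.
#' @param column Name of a numeric column in `data`, given as a string.
#' @param na_rm Logical. Should missing values be dropped before averaging?
#'
#' @return A single numeric value: the mean of the chosen column.
#'
#' @examples
#' column_mean(data.frame(x = c(1, 2, 3)), "x")
column_mean <- function(data, column, na_rm = TRUE) {
  # Fail early with a clear message if the column does not exist
  if (!column %in% names(data)) {
    stop("Column '", column, "' not found in data.")
  }
  mean(data[[column]], na.rm = na_rm)
}

# Missing values are dropped by default, so this returns 2
column_mean(data.frame(x = c(1, 2, NA, 3)), "x")
```

Someone reading only the comment block can see what to pass in and what comes back without reading the function body.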

The best docstring format is the one already being used in a team/project. The next best is following a style guide such as:

R:

In this course we have grouped functions together into files based on what they do. For each file (module) we want to document what code is in that file, and what it does.
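One possible way to document a file is a short comment header at the top that states its purpose and lists its contents. The sketch below assumes a hypothetical file called `cleaning_functions.R`; the file name, the function names and the header layout are all illustrative, not a required format.

```r
# cleaning_functions.R  (hypothetical file name)
#
# Purpose: functions that tidy raw input data before analysis.
#
# Contents:
#   remove_empty_rows() - drop rows where every value is missing
#   standardise_names() - lower-case column names and replace
#                         non-alphanumeric characters with "_"
#
# Used by: a main script (hypothetical), via source("cleaning_functions.R")

remove_empty_rows <- function(data) {
  # Keep rows that have at least one non-missing value
  data[rowSums(!is.na(data)) > 0, , drop = FALSE]
}

standardise_names <- function(data) {
  # Lower-case the names, then replace runs of other characters with "_"
  names(data) <- gsub("[^a-z0-9]+", "_", tolower(names(data)))
  data
}
```

Keeping the contents list up to date is extra work, so a team may prefer a shorter header that only states the purpose of the file.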

1.2 Packaging Code

A new course that works through the process of creating packages of code is planned.

In the meantime, the materials below cover that process at a basic level.

R:
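As a rough sketch of the basic workflow, the commands below assume the usethis and devtools packages are installed and are run interactively. The package name `mypackage` and the file name `cleaning_functions` are hypothetical.

```r
# Run interactively. Assumes usethis and devtools are installed:
# install.packages(c("usethis", "devtools"))

# Create a package skeleton (DESCRIPTION, NAMESPACE, R/ folder).
# "mypackage" is a hypothetical name, created here in a temporary folder.
usethis::create_package(file.path(tempdir(), "mypackage"))

# Then, from inside the new package project:
usethis::use_r("cleaning_functions")  # creates R/cleaning_functions.R for your functions
devtools::document()                  # builds help files and NAMESPACE from roxygen2 comments
devtools::load_all()                  # loads the package so you can try the functions
devtools::check()                     # runs R CMD check on the package
```

The functions already grouped into files during this course map naturally onto files in the package's `R/` folder.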

1.3 Environment Management

The code you write depends on the versions of the packages you used. For others to run your code, they need compatible, ideally identical, package versions.

Recording which package versions were used to run the code allows others to run it later on.
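Before reaching for a dedicated tool, a simple record can be made with base R alone. The sketch below uses `stats` and `utils` only because they ship with R; swap in the packages your own project loads. The output file name is hypothetical.

```r
# Packages whose versions we want to record (placeholders that ship with R)
packages_used <- c("stats", "utils")

# Look up the installed version of each package as a string
versions <- data.frame(
  package = packages_used,
  version = vapply(
    packages_used,
    function(pkg) as.character(utils::packageVersion(pkg)),
    character(1),
    USE.NAMES = FALSE
  )
)

# Show the R version and the package versions
print(R.version.string)
print(versions)

# Save the record next to the analysis outputs (written to a temp folder here)
versions_file <- file.path(tempdir(), "package_versions.csv")
write.csv(versions, versions_file, row.names = FALSE)
```

Calling `sessionInfo()` prints a fuller record, including the operating system and every attached package.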

We can ensure the correct environments are loaded and recorded using “virtual environments”, which let you keep several separate sets of installed packages and pick which set to use at any time. The local libraries created for virtual environments do not affect the versions of packages installed in your global library, so using an older version of a package in one virtual environment will not affect your other projects.

In R there is one main environment manager: renv. You may see references to Packrat, which was a commonly used system; however, it has been soft-deprecated and superseded by renv.
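A typical renv workflow can be sketched with the calls below. This assumes renv is installed and that the commands are run interactively from the project's root directory; they change the project's library, so they are not meant to be pasted into an analysis script.

```r
# Assumes the renv package is installed: install.packages("renv")

# Set up a project-specific library and create the renv.lock lockfile
renv::init()

# After installing or updating packages, record their versions in renv.lock
renv::snapshot()

# On another machine, or later on, reinstall the versions listed in renv.lock
renv::restore()

# Check whether the project library and the lockfile are in sync
renv::status()
```

Sharing `renv.lock` alongside the code is what lets someone else rebuild the same set of package versions.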

Continue on to Programming Styles in R