Intermediate Statistical Programming
1 Introduction
Landing page for the Intermediate Statistical Programming training pathway.
This pathway is for people with existing experience in writing and maintaining code who need to upskill to an intermediate level, so they can expand existing pipelines and apply good practice principles to keep their code maintainable.
2 Prerequisites
If the following haven’t been completed as part of the Introduction to Statistical Programming Pathway, please complete them before progressing:
3 Reproducible Reporting
Reproducibility is an important aspect of analytical projects. There are two methods available to ONS for creating accessible, reproducible documents from R or Python code.
3.1 Rmarkdown
Rmarkdown is a package that can be installed using your installation of RStudio. It is simple to set up and works extremely well with R code. It also supports Python, so it is worth learning regardless of your choice of language.
Complete Reproducible Reporting in Rmarkdown to understand the importance of reproducibility in your work, gain experience of linting code in Python and using parameterised reports.
3.2 Quarto
Quarto is an evolution of Rmarkdown, built on similar syntax and processes, but it is language agnostic because it is driven from the command line. This means it integrates with Python as well as it does with R. The official documentation is the best place to start: https://quarto.org/docs/get-started/
4 Editing and Imputation
Editing and imputation are both methods of data processing. Editing refers to the detection and correction of errors in the data. Imputation refers to estimating values for missing or inconsistent data items. One way in which you can correct for errors in the data is by applying imputation.
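The distinction can be illustrated with a minimal Python sketch. The function name `edit_and_impute`, the plausible range of 0 to 120, and the sample values are all hypothetical, and mean imputation is only one of many possible methods, chosen here for brevity.

```python
from statistics import mean


def edit_and_impute(values, lower=0, upper=120):
    """Blank out implausible values (editing), then fill gaps with the mean (imputation)."""
    # Editing: detect values outside the assumed plausible range and mark them as missing
    edited = [v if v is not None and lower <= v <= upper else None for v in values]

    # Imputation: estimate each missing item from the remaining valid values
    valid = [v for v in edited if v is not None]
    if not valid:
        raise ValueError("no valid values to impute from")
    fill = mean(valid)
    return [fill if v is None else v for v in edited]


# Hypothetical ages: one missing value and one obvious error (450)
ages = [30, None, 40, 450, 50]
cleaned = edit_and_impute(ages)
print(cleaned)
```

Here the value 450 is caught by the editing step and then corrected by imputation, alongside the item that was missing from the start.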
Complete one of the following:
5 Version Control using the Command Line
Many projects within ONS use version control software called Git to record changes to files and enable collaboration with colleagues. The command line interface is a powerful tool used for working with computers and is essential to getting started with Git.
Complete Command Line Basics followed by Introduction to Git to gain experience working in a version control system, both locally and in collaboration.
6 Unit Testing
Unit testing is crucial in guaranteeing the quality of your code and helps to increase efficiency in development. Complete Introduction to Unit Testing to gain experience designing, creating and executing tests for your code in both Python and R.
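As a rough illustration of what a unit test looks like in Python, the sketch below tests a small function. The function `percentage_change` and both test cases are hypothetical and are not taken from the course. The tests are plain functions using `assert`, a style that a test runner such as pytest can discover by the `test_` prefix.

```python
def percentage_change(old, new):
    """Return the percentage change from old to new."""
    if old == 0:
        raise ValueError("old must be non-zero")
    return (new - old) / old * 100


def test_percentage_change_increase():
    # A rise from 50 to 75 is a 50% increase
    assert percentage_change(50, 75) == 50.0


def test_percentage_change_zero_baseline():
    # A zero baseline has no defined percentage change, so an error is expected
    try:
        percentage_change(0, 10)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for a zero baseline")
```

Each test checks one behaviour, so a failure points directly at the part of the code that has broken.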
7 Continuous Integration
Complete Continuous Integration to learn more about it.
8 Object Oriented Programming in Python
This course is only relevant to Python.
Object Oriented Programming is a fundamental part of Python. Every Python user, from those learning it for the first time to those experienced in performing complex data analysis and writing software packages, will have used it, whether they know it or not.
Complete Object Oriented Programming in Python to learn about Python objects and classes in more detail.
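The point that every Python user already works with objects can be seen in a short sketch. The `Survey` class and its data are hypothetical, included only to show how a user-defined class mirrors the built-in ones.

```python
# Everyday values are already objects: each has a class and methods
number_type = type(42).__name__   # the class of an integer literal
shouted = "ons".upper()           # calling a method on a string object


class Survey:
    """A hypothetical class bundling data with the behaviour that uses it."""

    def __init__(self, name, responses):
        self.name = name
        self.responses = list(responses)

    def response_count(self):
        # Methods operate on the data stored in the instance
        return len(self.responses)


survey = Survey("example survey", [3, 5, 4])
print(number_type, shouted, survey.response_count())
```

Calling `survey.response_count()` works in the same way as calling `"ons".upper()`: in both cases a method defined on a class is applied to one instance of that class.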
9 Packaging and Documentation
Complete Packaging and Documentation to learn more about it!