Unit Testing in Python

Consider the following trivial function, which we will use as an example shortly:

```python
def add_one(number):
    number += 1
    return number

number_n = 5
number_n_plus_one = add_one(number_n)
```
1 What is the purpose of functions?
Functions should:
- split up the tasks in the code.
- not be too small or too specific.
- not contain entire programs.
Functions that are too small or specific can be unnecessary; you don’t need to create functions for single-line operations. For a function like add_one() above, it would be much clearer to simply add 1 to the variable directly rather than calling a function.
Putting entire programs, or large scripts into a function to call it makes the code hard to generalise and difficult to maintain.
Where possible we want to write functions that are deterministic and pure.
1.1 Deterministic Functions
For a given input there is a fixed output.
You can think of this like a mathematical function f(X) = y, where f is our function and X is the input. For a given input of parameters X into function f there is a fixed output or outcome y.
We want functions that do what we think they will do, so the result of any given input can be known. If this is not the case we are unable to predict the result of a function, and this will impact how we allow it to interact with other parts of our code base.
Below are examples of non-deterministic and deterministic functions that add an integer to an input number.
(These are simple examples for demonstration purposes; as discussed in the previous section, you would not write such functions in practice.)
```python
import random

def add_single_integer(initial_integer):
    integer_to_add = random.randint(0, 10)
    sum_of_integers = initial_integer + integer_to_add
    return sum_of_integers

add_single_integer(5)
```

```
9
```

We cannot determine what the returned value for add_single_integer(5) will be.
```python
def add_single_integer(initial_integer):
    integer_to_add = 7
    sum_of_integers = initial_integer + integer_to_add
    return sum_of_integers

add_single_integer(5)
```

```
12
```

We can predict what the value of add_single_integer(5) will be, or what the outcome of add_single_integer(initial_integer) will be for any reasonable input.
1.2 Pure Functions
Pure functions are deterministic functions whose outputs don’t depend on variables that are not passed into the function. For example, a pure function does not depend on reading or writing a file, and it will not change the values of external variables.
If the output of our function is dependent on factors that are not input into the function then we cannot guarantee what the output or effect of the function will be. Nor do we want a function to impact other states in the program without our explicit request.
```python
string_to_add = "Global"

def combine_string(initial_string):
    new_string = initial_string + string_to_add
    return new_string

combine_string("Argument")
```

```
'ArgumentGlobal'
```

We can determine what the returned value for combine_string("Argument") will be, but the value of string_to_add could change, and that would change what our function does without us explicitly telling it to. We cannot determine the returned value based only on the inputs of the function, as it depends on external variables.
```python
def combine_string(initial_string, string_to_add):
    new_string = initial_string + string_to_add
    return new_string

combine_string("Argument_first", "Argument_second")
```

```
'Argument_firstArgument_second'
```

We can now determine what the returned value will be when given the input arguments, for any reasonable inputs.
These concepts are quite theoretical; the important takeaways are:
Will your function do what you expect it to?
Will your function impact, or be impacted by other parts of your program unintentionally?
It’s then up to you whether you want this to be the case.
There are some obvious exceptions to the goal of deterministic and pure functions. If a function requires random number generators to work it will not be deterministic. Functions that read and write files are necessary, but not pure. We want to make it clear when this is the case.
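One common pattern for making such exceptions easier to test (a sketch, not part of the course code; the function name and rng parameter are illustrative) is to pass the source of randomness into the function as an argument, so tests can supply a seeded generator:

```python
import random

def add_random_integer(initial_integer, rng=None):
    # the randomness is still there, but it is now explicit:
    # callers (and tests) may pass in a seeded random.Random instance
    if rng is None:
        rng = random.Random()
    return initial_integer + rng.randint(0, 10)

# a seeded generator makes the call reproducible
print(add_random_integer(5, rng=random.Random(42)))
```

Tests can then assert an exact value by seeding the generator, while normal callers simply omit the argument.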
1.3 The Mean Function
Below is the initial version of our new function arithmetic_mean()
to find the mean of a list/vector of numbers.
```python
def arithmetic_mean(input_list):
    value_sum = sum(input_list)
    number_of_values = len(input_list)
    mean_value = value_sum / number_of_values
    return mean_value

test_list = [1, 2, 3]
print(arithmetic_mean(test_list))
```

```
2.0
```
1.4 Making Functions with Clarity
As with all code it is important that what we write is clean and what it does is clear. The benefits are numerous; it is easier for you and others to understand, it is therefore easier to maintain, and easier to find bugs within.
These benefits are also true when writing functions. We want them to be clean, simple and what they do obvious. Without going through language specific syntax there are some principles to keep in mind.
Much of this content is generally described in programming language style guides, but the ideas are included in this course to point to relevant concepts to unit testing.
The python language has an official style guide known as PEP8, the eighth Python Enhancement Proposal. It is given here, and was written by the creator of python, Guido van Rossum.
R does not have one single agreed upon style guide, but a commonly used one is the Google Style Guide. This is linked to here.
Both of these guides are definitely worth a read at some point if you are trying to write better code.
Naming Functions
What the function does should be clear from its name. The reader should be able to understand the purpose of the function just by seeing it called, which makes code easier to understand. For example, if we have a function that cleans a dataframe:
We do not want names too short and unclear about their purpose such as:
cdf()
We don’t want functions that are too long and confusing:
cleanDataframeByFillingMissingValuesButAlsoLowercasingAndReformattingTheColumns()
When something like cleanDataframe()
would suffice. We can then clearly show how it does that in the function itself and documentation.
This idea follows directly from best practice in variable naming. We avoid names that:
- are too short and non-descriptive (h)
- are misleading or irrelevantly named
- are very generic (This, That, The_Other)
- conflict or nearly conflict with the language’s base names
- do not follow the same style as the rest of our code
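As a quick illustration of these naming points (the functions here are hypothetical):

```python
# Too short and non-descriptive: unclear at the call site
def cs(s):
    return s.strip().lower()

# Clear: the purpose is obvious wherever it is called
def clean_string(raw_string):
    return raw_string.strip().lower()

print(clean_string("  Mixed Case  "))   # mixed case
```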
Documentation
What your code is doing should be clear from the naming of variables.
It is useful to include comments as to why the code is doing what it is doing to help others understand it. Your code itself explains what it is doing, especially when it is clearly written.
What your comments add are what the code cannot tell the reader. Your comments can give context to those who didn’t write the code (and yourself down the line) which is invaluable. The comments can detail why you chose a certain approach which the code cannot explain.
Your code itself should strive to be self documenting, but it will never be completely. So write comments that fill in the gaps.
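For example (an illustrative snippet), a comment that records why is far more useful than one restating what:

```python
counter = 0

# Unhelpful: restates what the code already says
counter += 1  # add one to counter

# Helpful: records a reason the code cannot express
counter += 1  # offset by one because the input's header row is skipped
```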
In addition to general documentation, functions should contain an additional level of description. This is called the docstring.
There is information we want to include in order to properly describe a function (where applicable):
- What the function does.
- What arguments the function takes.
- The return values the function produces.
- The side effects produced when invoked.
- The exceptions involved (more on these later!).
Documentation Example
Again, we are going to use our arithmetic_mean()
function to demonstrate how this works.
For basic functions we could use a single line docstring that explains what the function does. Docstrings are created using triple quotation marks: """your text goes here"""
. We create multi-line docstrings using the following syntax. This example is structured using the Google Style Guide.
```python
def arithmetic_mean(input_list):
    """Function to calculate the mean of a list of numbers.

    Args:
        input_list (list): A list of numbers.

    Returns:
        The mean of the list of numbers as a float.
    """
    value_sum = sum(input_list)
    number_of_values = len(input_list)
    mean_value = value_sum / number_of_values
    return mean_value
```
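For comparison, a single-line docstring on the trivial add_one() function from earlier might look like:

```python
def add_one(number):
    """Return the input number increased by one."""
    return number + 1
```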
Clear Logic
There are many aspects to writing clear and logical code.
One of these is to ensure that there is a flow in how the code is designed.
Where possible, the program’s statements are grouped naturally according to what they do and organised sequentially. This is encouraged because, regardless of how code is executed, the reader is likely to be reading sections from the top down.
Another aspect of having clear logic is to ensure that code within a function is not unnecessarily opaque, nor is it short at the expense of readability and flow.
There are quirks and features that make code shorter. These are useful and powerful tools, but should be used sparingly if they make the purpose of the code less clear.
For example, list comprehensions are useful and have their place in condensing code, but with many conditional statements and nestings they can make the code less clear than simple for loops and control flow.
```python
original_list = [1, 2, 3, 4, 5]
new_list = [each_element + 1 for each_element in original_list]
```
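For comparison, the same operation written as an explicit for loop:

```python
original_list = [1, 2, 3, 4, 5]

# build the new list element by element
new_list = []
for each_element in original_list:
    new_list.append(each_element + 1)

print(new_list)   # [2, 3, 4, 5, 6]
```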
There are no hard-and-fast rules for when to use these sorts of features, but consider: “Is my code more or less simple to read because of what I have done here?”
When first drafting code the resulting product may contain redundant parts; these are prime segments to remove or restructure, which will increase the clarity of the code.
The best code is simple; if working code is written such that it cannot easily be read, or takes significant effort to understand, then it is not well-written code.
Key point: write code that you can come back to in a few years and understand quickly.
You are doing your future self and others an important favour in writing legible and clearly constructed functions.
Logic Example
Here are some examples of complex statements and what they can be reduced to. This gives the reader less mental overhead, allowing more energy to go into fixing whatever bug has been found.
Try out these two statements in a script of your own, assigning True/False values to boolean_variable
in order to check whether or not the statements return the same values.
The principle of reducing complex logic where possible will save you and your colleagues time in the future.
```python
boolean_variable = True

# Long
if not boolean_variable == False:
    print(boolean_variable)

# Clear
if boolean_variable == True:
    print(boolean_variable)

# Shortest
if boolean_variable:
    print(boolean_variable)
```
Appropriate Tasks
Functions should do one thing, and do it well. It is important when progressing through the software development cycle to be restructuring/rewriting your code, breaking it up where necessary and rewriting inefficient features. This helps to prevent unwieldy large functions from being produced.
In general we want each function to have one purpose, and for it to follow the Single Responsibility Principle. This means the function only needs to perform one task for the program.
There is no hard-and-fast rule regarding the size of a function, but by breaking existing functions down into functions for their sub-tasks, clarity, modularity and testability all increase.
Task Splitting Example
The following example shows an appropriate time to separate one function into multiple. The calculate_sd()
function returns the standard deviation of a list/vector of numbers.
This way we can now use the arithmetic_mean()
function in other areas of our code too, which is an additional bonus to making our code clearer.
This is a simple example to demonstrate how to rewrite functions in order to make them clearer and to have explicit purpose. The new functions can then be used by other processes in the code, and can be debugged separately.
```python
def calculate_sd(input_list):
    """Function to calculate the standard deviation of a list of numbers.

    Args:
        input_list (list): A list of numbers.

    Returns:
        The standard deviation of the list of numbers as a float.
    """
    value_sum = sum(input_list)
    number_of_values = len(input_list)
    mean_value = value_sum / number_of_values

    difference_squared_sum = 0
    for number in input_list:
        difference_squared_sum += (number - mean_value)**2

    variance = difference_squared_sum / number_of_values

    standard_deviation = variance**0.5
    return standard_deviation
```
```python
def arithmetic_mean(input_list):
    """Function to calculate the mean of a list of numbers.

    Args:
        input_list (list): A list of numbers.

    Returns:
        The mean of the list of numbers as a float.
    """
    value_sum = sum(input_list)
    number_of_values = len(input_list)
    mean_value = value_sum / number_of_values
    return mean_value

def calculate_sd(input_list):
    """Function to calculate the standard deviation of a list of numbers.

    Args:
        input_list (list): A list of numbers.

    Returns:
        The standard deviation of the list of numbers as a float.
    """
    mean_value = arithmetic_mean(input_list)

    difference_squared_sum = 0
    for number in input_list:
        difference_squared_sum += (number - mean_value)**2

    variance = difference_squared_sum / len(input_list)

    standard_deviation = variance**0.5
    return standard_deviation
```
1.5 Parameter Validation
By creating a function you define what happens within it; however, it is not possible to completely control what may be input into your function.
There may be cases when “bad” values are accidentally (or not!) input into the function.
This is especially important in dynamically typed languages such as python, as the data type of a variable is not declared when it is assigned.
To prevent this type of issue becoming a problem, parameter validation is used. This means that we check whether the input of a function is what is expected.
Defining what is expected is dependent on the program, but this is typically factors such as:
- the data type of the function argument (numeric, list, string and more).
- the value range of the function argument (for example, non-negative, length greater than 5).
- whether missing values are allowed in data.
The way these parameters are typically checked is using control flow — if, elif/else if, and else statements — in conjunction with exceptions. Exceptions terminate the flow of a function with a message explaining the type of error that occurred.
Error handling should:
- Be able to handle all “bad” inputs into the function.
- Be informative as to what has occurred when an error is encountered.
- Be situated at an appropriate place in the code.
If our program is to fail due to bad inputs, we want it to fail properly and in an informative manner.
The parameter validation should occur as soon as the data enters the function, so unnecessary computation is avoided.
Example Syntax
Below is an example using the arithmetic_mean()
function to check whether our input data structure is what it should be, whether it contains any data, and whether that data is all numeric.
Try to run this new code with a range of inputs, does it raise exceptions?
In python we use a conditional statement, and if it evaluates to True then we raise a specific error containing a string explaining what went wrong.
```python
def arithmetic_mean(input_list):
    """Function to calculate the mean of a list of numbers.

    Args:
        input_list (list): A list of numbers.

    Returns:
        The mean of the list of numbers as a float.

    Raises:
        TypeError: if the data is not a list.
        ValueError: if the list is empty.
        TypeError: if the list is not all numbers.
    """
    if not isinstance(input_list, list):
        raise TypeError("The input data should be a list, it was a {}".format(
            type(input_list).__name__))
    if len(input_list) == 0:
        raise ValueError("The list is empty")
    if not all(isinstance(each_number, (int, float)) for each_number in input_list):
        raise TypeError("The list must contain ints and/or floats")
    value_sum = sum(input_list)
    number_of_values = len(input_list)
    mean_value = value_sum / number_of_values
    return mean_value
```
2 Writing Unit Tests
Now that we are writing effective functions, we can test units of our code.
Unit tests are an automated way for us to test that our function does what we think it should do.
In general, we input an argument into a function, define what we think should happen as a result, and compare that with what actually does happen.
Whether this is an effective way to measure the quality of our code is dependent on the quality of the unit tests we write. The unit tests must:
- Be accurate (does what we think it does)
- Account for all possible behaviours of the function (different parameters, values, data types)
- Be independent of one another (doesn’t rely on other tests)
Good and comprehensive unit tests allow the developer to be confident that the code performs as expected.
2.1 File Convention and Packages
There are many ways to organise the tests in different files. It is generally convention for each function tested to have its own script of tests, and if necessary, for each type of test to have a separate script.
Within the /content/
folder there is a folder /content/unit_testing_practice/
. Navigate in your file explorer to this location. In this folder there are two .py
files that are relevant. The first, functions.py
, will contain the functions we want to test. The second, test_functions.py
, will contain the tests we will run.
functions.py
contains the arithmetic_mean(input_list)
function below. Perform a manual test (run it with some input) so you are certain it has been copied over correctly.
OPEN IDE (Spyder, VSCODE etc) of your choice and check that the code below is in functions.py
. Check the function runs by calling it with different arguments
```python
def arithmetic_mean(input_list):
    """Function to calculate the mean of a list of numbers.

    Args:
        input_list (list): A list of numbers.

    Returns:
        The mean of the list of numbers as a float.

    Raises:
        TypeError: if the data is not a list.
        ValueError: if the list is empty.
        TypeError: if the list is not all numbers.
    """
    if not isinstance(input_list, list):
        raise TypeError("The input data should be a list, it was a {}".format(
            type(input_list).__name__))
    if len(input_list) == 0:
        raise ValueError("The list is empty")
    if not all(isinstance(each_number, (int, float)) for each_number in input_list):
        raise TypeError("The list must contain ints and/or floats")
    value_sum = sum(input_list)
    number_of_values = len(input_list)
    mean_value = value_sum / number_of_values
    return mean_value
```
There are a range of different packages that can be used to create unit tests in python
, such as unittest
which is included in the python
standard library.
OPEN ANACONDA PROMPT to write the code below
The package chosen for this course, due to its clarity and versatility, is pytest. This package can be installed using the following command in Anaconda Prompt:
```shell
pip install pytest
```
Within the script where you write the tests, the pytest module should be imported using import pytest. It contains a range of classes that aid in testing functions.
To call a group of tests, the pytest command must be run in the command line. The pytest program will recursively search for folders, files, then functions whose names begin with test_, and execute them. This search starts from the working directory where the command is called. For simplicity we will do all of our work within the working directory.
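Under these conventions, a minimal project layout might look like this (file names illustrative):

```
working_directory/
├── functions.py         # the code under test
└── test_functions.py    # tests that pytest will discover and run
```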
The steps to design and run unit tests in this course are:
- Create your functions to test in the working directory.
- Create the unit tests in a script location accessible from the working directory.
- Run
pytest
with optional arguments in the command line (Anaconda Prompt) from the relevant directory.
This is done by typing:
```shell
pytest
```
This is what actually runs your tests.
- Analyse the results from the tests.
RUN pytest
in ANACONDA PROMPT in the same directory as “functions.py”, you should see that there are no tests written
2.2 Checking Returned Values
This section uses the arithmetic_mean()
function as a subject for testing.
One of the most useful tests we can write checks whether our function returns the value we are expecting.
This is crucial as we are able to take known results and ensure that the output of our function meets those values.
Add the tests we work through to your test_functions.py
file so you can check the results.
In python
we define a function called test_<what you are testing>
which will be a unit test. This is what is called when we run pytest
in our command line later.
In order to check a returned value is as expected we use the assert
command followed by a conditional statement. If the conditional statement returns a True
value then the test passes, if False
it fails.
We therefore typically write something of the form assert function(input) == expected_value
.
```python
# In file "test_functions.py"
# We import the function
from functions import arithmetic_mean

def test_all_ones():
    assert arithmetic_mean([1, 1, 1, 1]) == 1
```
The above test produces the following output:
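An illustrative version of that output (exact headers, paths and timings will differ on your machine):

```
============================= test session starts =============================
collected 1 item

test_functions.py .                                                     [100%]

============================== 1 passed in 0.01s ==============================
```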
Here we can see a little more about what happens when we execute a test suite. pytest first collects all the tests it can find, then executes them one by one, file by file. Each . after the script name indicates a test passed.
Our function passes this test; however, let’s see what happens when it fails a test that we know it should fail.
This is done for demonstration purposes to show what a failed test looks like.
```python
# In file "test_functions.py"
# We import the function
from functions import arithmetic_mean

def test_all_ones():
    assert arithmetic_mean([1, 1, 1, 1]) == 1

def test_all_twos():
    assert arithmetic_mean([2, 2, 2, 2]) == 3
```
This produces a new output; as expected, the test_all_twos test fails.
This is denoted in the FAILURES section. It produces an AssertionError
from this test. This means that we have asserted that one value is equal to another, and this is not the case. By changing the expected value of test_all_twos
to 2
the tests will now all pass.
Now you can change the test_all_twos()
function to check that the mean of [2, 2, 2, 2] == 2
.
2.3 Check Variable Types
Similarly to the previous section, we can check other attributes of what the function returns.
One such example is the data type returned. As our languages are “dynamically typed” we cannot guarantee what type will be produced at the end of the function’s execution, therefore it makes sense to test this.
The syntax is similar to the previous examples of checking values, except we add another layer of logic to the statements involved.
Add the tests we work through to your test_functions.py
file so you can check the results.
Again, we use assert statements to ensure that the type of the returned values is the same as our expected type.
Using the isinstance(data, type)
syntax allows us to receive a boolean statement whether or not a data object matches a given data type.
We add the following tests to test_functions.py
.
```python
def test_returns_float_given_floats():
    assert isinstance(arithmetic_mean([3.0, 4.0, 5.0]), float)

def test_returns_float_given_ints():
    assert isinstance(arithmetic_mean([int(3), int(4), int(5)]), float)
```
Our function passes these two new tests: regardless of the numeric data type passed into the function, python converts the result to a float in order to perform certain actions such as division.
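You can verify this behaviour directly; true division in python always returns a float, even when the operands are ints that divide evenly:

```python
# division of two ints still produces a float
print(4 / 2)        # 2.0
print(type(4 / 2))  # <class 'float'>
```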
We are also able to use “type annotations” in functions; these allow us to hint in the code as to what type the data should be. They let the reader see what the argument types are supposed to be, and what the result of the function will be. This course will not cover their implementation, but they are useful to know about. For more information click here.
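As a brief illustration (not covered further in this course), an annotated version of our mean function might look like:

```python
def arithmetic_mean(input_list: list) -> float:
    """Return the arithmetic mean of a list of numbers."""
    # the annotations are hints only; python does not enforce them at runtime
    return sum(input_list) / len(input_list)

print(arithmetic_mean([1, 2, 3]))   # 2.0
```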
2.4 Checking for Errors Raised
Whether the value of the output of a function is correct is different from whether or not the data returned is the correct type.
As the languages used are “dynamically typed” we cannot guarantee the type of a variable, an incorrect data type could pass improper data from one process to the next.
Here is an overview of what “dynamically typed” means if you are interested in learning more about how programming languages work. In short, it means that we don’t explicitly state variable types when assigning a variable.
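A two-line demonstration of this:

```python
value = 10            # value currently refers to an int
value = "ten"         # rebinding the same name to a str is perfectly legal
print(type(value))    # <class 'str'>
```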
We are going to check that our function produces appropriate errors when the wrong data type is passed to the function.
Well built functions will contain processes for generating proper errors when parameters are not valid, it is important to be able to check these errors at the right times.
Add the tests we work through to your test_functions.py
file so you can check the results.
We will now introduce a concept called “context management”, which lets us selectively and explicitly allocate and release resources in certain areas of your code. For more information on context managers in python see here.

Context management in python uses the with statement. In a testing context it lets us execute code that is expected to raise an error without terminating the test run: the commands within the with block are tried, and any expected exception is handled by the context manager.
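A familiar context manager outside of testing is open(); the with block guarantees the file is closed when the block exits, even if an error occurs inside it (the file path here is illustrative):

```python
import os
import tempfile

# use a temporary directory so the example is self-contained
example_path = os.path.join(tempfile.mkdtemp(), "example.txt")

with open(example_path, "w") as file_handle:
    file_handle.write("hello")
# the file handle is closed automatically at this point

with open(example_path) as file_handle:
    print(file_handle.read())   # hello
```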
pytest contains a method raises() which takes an error type as an argument. This allows a test to pass when an exception is the expected result: if the given error is raised, the test passes.

There are a range of error types that can be passed into the raises() function. These include:
- TypeError
- ValueError
- ImportError
- NameError
- ZeroDivisionError
- OverflowError
- and many others
```python
from functions import arithmetic_mean
import pytest

# Previous tests
def test_all_ones():
    assert arithmetic_mean([1, 1, 1, 1]) == 1

def test_all_twos():
    assert arithmetic_mean([2, 2, 2, 2]) == 2

def test_returns_float_given_floats():
    assert isinstance(arithmetic_mean([3.0, 4.0, 5.0]), float)

def test_returns_float_given_ints():
    assert isinstance(arithmetic_mean([int(3), int(4), int(5)]), float)

# New tests
def test_data_value_types():
    with pytest.raises(TypeError):
        arithmetic_mean([1, 2, 3, "four"])

def test_data_structure():
    with pytest.raises(TypeError):
        arithmetic_mean((1, 1, 1, 1))

def test_empty_data():
    with pytest.raises(ValueError):
        arithmetic_mean([])
```
If you add these new functions to the test_functions.py file and run pytest, the function will pass all seven tests. The new tests check that the right type of data structure is passed in and that the list is not empty.
To show what happens without our exceptions we can comment out the Error code and run our tests again.
The code without the relevant errors is given below.
```python
def arithmetic_mean(input_list):
    """Function to calculate the mean of a list of numbers.

    Args:
        input_list (list): A list of numbers.

    Returns:
        The mean of the list of numbers as a float.
    """
    #if not isinstance(input_list, list):
    #    raise TypeError("The input data should be a list, it was a {}".format(
    #        type(input_list).__name__))
    #if len(input_list) == 0:
    #    raise ValueError("The list is empty")
    #if not all(isinstance(each_number, (int, float)) for each_number in input_list):
    #    raise TypeError("The list must contain ints and/or floats")
    value_sum = sum(input_list)
    number_of_values = len(input_list)
    mean_value = value_sum / number_of_values
    return mean_value
```
This causes the function to pass the original four tests and fail two of the three new ones. The failures are shown below.
test_data_structure
is telling us that it expected an error to be raised and that this did not happen, causing the test to fail. The report states which specific error was not raised, so it is important that our errors match our expectations.
test_empty_data
here shows that the test case given causes the function to produce a new error. The function is trying to divide by the length of the data structure, zero, and therefore a ZeroDivisionError
is raised. This is why the original ValueError
exception was introduced in the first place.
Importantly, the function does not fail the test_data_value_types
because we are expecting the function to raise a TypeError
and it does, because you cannot add strings to integers. However, this is because of the sum()
function, not because we validated parameters.
2.5 Multiple Parameter Tests
As time goes on we will want to restructure/rewrite and improve our code.
We can also change our unit tests to ensure there is greater coverage and therefore confidence in our code.
So far we have tested similar properties of our functions using separate discrete tests. We can combine tests that have the same structure in order to test more cases quickly and clearly.
We are now going to rewrite all of the previous tests into a more succinct manner. We will replace the old tests with the new versions in test_functions.py
/test_functions.R
.
For each type of test there is a new test included. It is much easier to keep adding tests when we have set up the structure.
In order to test multiple parameters we use a “decorator” (denoted by the @ symbol on the line above the function). The specific decorator used is pytest.mark.parametrize()
. The arguments required are as follows:
- first: a string which contains the names of the variables to be used to test the function. This looks quite odd from a typical python syntax perspective, but is just the required format.
- second: a list which contains tuples where each element in the tuple corresponds to the variables in the first argument.
Below, the decorator is applied to a function which takes the same arguments as those named in the .parametrize string. The function is structured just like our previous tests, but instead of directly putting in the data we want to test, we use the placeholder argument names.
We can now generate multiple tests from one function.
```python
from functions import arithmetic_mean
import pytest

mean_value_test_cases = [
    ([1, 1, 1, 1], 1),
    ([2, 2, 2, 2], 2),
    ([-10, -20, -30, -40, -50], -30)
]

@pytest.mark.parametrize("input_data, expected_mean", mean_value_test_cases)
def test_mean_values(input_data, expected_mean):
    assert arithmetic_mean(input_data) == expected_mean

mean_type_test_cases = [
    ([3.0, 4.0, 5.0], float),
    ([int(3), int(4), int(5)], float),
    ([-1, -2, -3, -4], float)
]

@pytest.mark.parametrize("input_data, expected_type", mean_type_test_cases)
def test_mean_type(input_data, expected_type):
    assert isinstance(arithmetic_mean(input_data), expected_type)

mean_errors_test_cases = [
    ((1, 1, 1, 1), TypeError),
    ([], ValueError),
    (["not", "a", "number"], TypeError)
]

@pytest.mark.parametrize("input_data, expected_error", mean_errors_test_cases)
def test_mean_errors(input_data, expected_error):
    with pytest.raises(expected_error):
        arithmetic_mean(input_data)
```
3 Exercises
The below exercises will help you consolidate the knowledge presented in this course, allowing you to perform basic unit testing in your own work.
3.1 Writing Tests
Using the information in this course you can now write your own unit tests relevant to your team’s projects. While it is advised to write tests as you add new features, or preferably before, sometimes you will want to check existing code’s functionality. This exercise will give you practice in writing unit tests to check code behaviour.
You are working in a data science team with a project on text data. There is a function in the code base called string_cleaning()
. The function takes an input of a string of characters and should output the input text with:
- leading and trailing whitespace removed
- the text lowercased
- any punctuation removed
The team is unsure as to whether the function is actually doing what it is supposed to, as slight differences in what this function produces will impact the project down the line. You have been tasked with writing unit tests to ensure that the function produces the expected outputs.
The unit tests should check that:
- A TypeError (python) is raised when something other than a string of characters is passed into the function.
- There is no leading whitespace returned
- There is no trailing whitespace returned
- All characters in the string returned are lowercase
- There is no punctuation in the returned string
- The function returns a string
The function to test is contained in the text_processing.py file. Write your tests in test_text_processing.py, and change your working directory to /exercises/writing_tests/.
You should write a minimum of 15 unit tests in total to complete this task.
The syntax of your answers may be different to those below and still be correct! You must be sure that your tests check the behaviours you want.
from writing_tests.text_processing import string_cleaning
import pytest

string_lowercase_test_cases = [
    ("ALLCAPS", "allcaps"),
    ("Onecap", "onecap"),
    ("rAnDoMcApS", "randomcaps")
]

@pytest.mark.parametrize("example_string, expected_output", string_lowercase_test_cases)
def test_string_lowercasing(example_string, expected_output):
    assert string_cleaning(example_string) == expected_output
string_white_space_test_cases = [
    (" leading", "leading"),
    ("trailing ", "trailing"),
    (" both ", "both"),
    ("   long   ", "long"),
    (" short ", "short")
]

@pytest.mark.parametrize("example_string, expected_output", string_white_space_test_cases)
def test_string_white_space(example_string, expected_output):
    assert string_cleaning(example_string) == expected_output
string_punctuation_test_cases = [
    (",", ""),
    (":", ""),
    ("hi!", "hi"),
    ("&pointer", "pointer"),
    ("will this be removed?", "will this be removed"),
    ("'quote'", "quote"),
    ("%%%%%%%the", "the"),
    (r"/n", "n")
]

@pytest.mark.parametrize("example_string, expected_output", string_punctuation_test_cases)
def test_string_punctuation(example_string, expected_output):
    assert string_cleaning(example_string) == expected_output
string_type_test_cases = [
    ("test", str),
    (" test ! ", str),
    (" int(1) ", str),
    (" list(1, 2, 3)! ", str),
    (r"/n", str)
]

@pytest.mark.parametrize("example_string, expected_output_type", string_type_test_cases)
def test_string_types(example_string, expected_output_type):
    assert isinstance(string_cleaning(example_string), expected_output_type)
string_error_test_cases = [
    (1, TypeError),
    (list("hello"), TypeError),
    (set(), TypeError),
    (("h", "i"), TypeError),
    (None, TypeError)
]

@pytest.mark.parametrize("example_string, expected_error", string_error_test_cases)
def test_error_raised(example_string, expected_error):
    with pytest.raises(expected_error):
        string_cleaning(example_string)
3.2 Test-Driven Development
This exercise is an example of how you may choose to write programs: first specify the expectations of your code as tests, then write units that pass those tests.
Your task is to rewrite the calculate_sd() function in the file ./exercises/test_driven_development/standard_deviation.(py/R) so that it passes all of the tests contained within the ./exercises/test_driven_development/ folder under the file name test_standard_deviation.(py/R). You will need to change your working directory to the ./exercises/test_driven_development/ folder.
Run the tests first and use the failures to work out what you need to do to get the function to perform to the tests' specifications.
The following functions are examples of ways to pass the shown unit tests; can you write a better function?
def calculate_sd(input_list):
    """Function to calculate the standard deviation of a list of numbers.

    Args:
        input_list (list): A list of numbers.

    Returns:
        The standard deviation of the list of numbers as a float.
    """
    if not isinstance(input_list, list):
        raise TypeError("The input data should be a list, it was a {}".format(
            type(input_list).__name__))
    if len(input_list) == 0:
        raise ValueError("The list is empty")
    if not all(isinstance(each_number, (int, float)) for each_number in input_list):
        raise ValueError("The list must contain ints and/or floats")
    mean_value = arithmetic_mean(input_list)
    difference_squared_sum = 0
    for number in input_list:
        difference_squared_sum += (number - mean_value)**2
    variance = difference_squared_sum / len(input_list)
    standard_deviation = variance**0.5
    return standard_deviation
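The reference answer above can be shortened. As one sketch of an alternative, the hand-rolled loop could be replaced with the standard library's statistics module, keeping the same validation (this is a design choice, not the course's reference answer):

```python
import statistics

def calculate_sd(input_list):
    """Calculate the population standard deviation of a list of numbers."""
    if not isinstance(input_list, list):
        raise TypeError("The input data should be a list, it was a {}".format(
            type(input_list).__name__))
    if len(input_list) == 0:
        raise ValueError("The list is empty")
    if not all(isinstance(number, (int, float)) for number in input_list):
        raise ValueError("The list must contain ints and/or floats")
    # statistics.pstdev computes the population standard deviation,
    # matching the divide-by-len(input_list) definition used above
    return statistics.pstdev(input_list)
```

Because pstdev divides by the length of the list (not length minus one), it matches the variance calculation in the longer function, so the same tests should pass.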
4 Summary
In this course we have delved into a range of theory and practical examples about good practice for writing functions and creating unit tests.
Before testing units of our code we need to be sure that the units (in our case functions) are well designed. There are many ways to design functions, but we have shown that some guiding principles can help. These include:
- Function docstring
- Clear logic
- Single purpose of functions
- Validating parameters
By writing good functions we are therefore able to test the performance of these functions in a much easier way.
We have looked at only one type of software test: the unit test. These are the simplest, lowest-level and fastest kind of test, and we have only scratched the surface of what they can do.
This course introduced basic types of unit tests; these allow us to check elements of our code's behaviour such as:
- Returned Data Values
- Returned Data Types
- Errors Raised
- Parameterised Tests
This is not an exhaustive list of what testing can do, but it should be a start towards ensuring your code does what it is meant to. The next section introduces some further concepts for you to explore.
Please ensure you complete the post-course survey.
5 Further Study
This is a short course that has introduced some aspects of unit testing, but there are many more elements that you can add to your code to test different aspects.
5.1 Repository Structure
Tests often sit within a wider package of code so it is important to consider how best to structure our repositories.
For Python, pytest supports either keeping tests in the same location as the source code or annexing them to their own folder. What this looks like in practice, and the pros and cons of each approach, are described in the pytest documentation.
5.2 Testing Data Frames
Most of the testing we have looked at so far has checked things related to built-in data types. Often, for analysis, what we need to test are data frames and related data structures.
There are two additional groups of functions which can help test these data structures.
- pandas comes with methods to test the equality of properties of data frames; a description is given in the documentation.
- numpy has a suite of testing functions to assert properties of arrays, which can be used with pandas too. Descriptions are contained within their documentation.
5.3 Assertions
In this course we have touched upon some good practice in writing functions, and how to test them. In the python section of this course we have briefly introduced assertions, but not used them to their full potential.
Assertions in programming can be a powerful, but lightweight way to ensure your code is performing as expected. In a similar way to how we raised errors when our data was not what we expected, we can use assertions throughout our code to check the data within is what is expected.
In python we can use the assert command to check whether a statement about our data is true. If the statement evaluates to True, the code continues running. If the statement evaluates to False, an AssertionError is raised. We can also add a useful message to the assertion to help with debugging.
Example of assertion passing
first_number = 10
second_number = 3

assert second_number != 0

divided = first_number / second_number

print(divided)
Example of assertion NOT passing
first_number = 10
second_number = 0

try:
    assert second_number != 0
    divided = first_number / second_number
    print(divided)
except AssertionError as e:
    print("AssertionError:", e)

AssertionError:
We can see that the statement second_number != 0 is False in this case, so the AssertionError is raised. This can sometimes replace more formal exception handling, reducing the amount of code written while still confirming the values of data in our programs.
Example of assertion NOT passing, with a message Below is the same as the example above, except the assertion includes a more useful message.
first_number = 10
second_number = 0

try:
    assert second_number != 0, "second_number cannot be zero as it is later a divisor"
    divided = first_number / second_number
    print(divided)
except AssertionError as e:
    print("AssertionError:", e)

AssertionError: second_number cannot be zero as it is later a divisor
These assert statements can be used in a wide variety of cases, such as data input validation, checking of logic and data types.
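For example, a sketch of input validation with assertions (scale_scores is an illustrative function, not part of the course materials):

```python
def scale_scores(scores, factor):
    # Validate inputs up front, with a clear message for each failure mode
    assert isinstance(scores, list), "scores must be a list"
    assert all(isinstance(score, (int, float)) for score in scores), \
        "scores must all be numeric"
    assert factor > 0, "factor must be positive"
    return [score * factor for score in scores]

print(scale_scores([1, 2, 3], 2))
```

Calling scale_scores([1, 2, 3], 2) returns [2, 4, 6], while passing a string or a non-positive factor fails immediately with a message naming the broken assumption.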
For more information about assertions and defensive programming check here.
5.4 Advanced Errors
In this course we have only looked at the default errors produced in python. We can raise different flags and explanations which allow more information about a program to be passed.
We can raise Warning calls, which do not result in the termination of our program. These tell us that the behaviour may not be as expected, but the program still completes its run. Some example warnings may already be familiar to you; they include:
UserWarning
DeprecationWarning
RuntimeWarning
SyntaxWarning
In addition to the default warnings and errors, we can create custom classes to which we give specific meanings. Below is a quick example of the basic syntax for writing custom errors: first a base Error class is created from the built-in Exception class, then the custom class itself. The script below will iterate over a range of values and raise (and handle) a specific error when the value is seven.
class Error(Exception):
    """Base class for other exceptions"""
    pass

class ValueCannotBeSevenError(Error):
    """Raised when the input value is equal to 7"""
    pass

for integer in range(0, 10):
    try:
        if integer == 7:
            raise ValueCannotBeSevenError
        else:
            print(integer)
    except ValueCannotBeSevenError:
        print("The value should never, ever be seven!")
5.5 Widening Test Coverage
So far we have been able to improve our test coverage only by adding more tests ourselves, and by parameterising our tests, making it easier to expand the number of cases they check.
However, there are packages which can further increase our test coverage by allowing us to generate strategies for testing, which greatly increases the number of edge cases and situations we may not be able to spot ourselves.
One such package to do this in python is Hypothesis.
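The underlying idea, property-based testing, can be sketched with the standard library alone: generate many random inputs and assert a property that must hold for every one of them. (Hypothesis automates this generation and also shrinks any failing case to a minimal example. The arithmetic_mean function is redefined here to keep the sketch self-contained.)

```python
import random

def arithmetic_mean(input_list):
    return sum(input_list) / len(input_list)

# Generate 100 random lists of varying lengths and values, and check a
# property that must always hold: the mean of a list lies between the
# minimum and maximum values in that list
random.seed(42)  # fixed seed so the run is reproducible
for _ in range(100):
    data = [random.uniform(-1000, 1000) for _ in range(random.randint(1, 50))]
    result = arithmetic_mean(data)
    assert min(data) <= result <= max(data)
```

A hand-picked test case checks one input; a generated strategy like this checks a whole property of the function's behaviour.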
5.6 Integration Tests
We determined that the purpose of unit tests was to check that the smallest element of code behaves as expected. This is good, but does not cover the entirety of what our code actually does.
Integration tests allow us to check how many of our units behave when working together as a whole system. We can use many of the same techniques used to test individual units, but at a higher level of abstraction, with many elements working together.
Integration tests and unit tests complement each other: as the system is made of individual units, we can be confident both in the component parts and in how they work together.
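A minimal sketch of the idea, reusing simplified versions of this course's arithmetic_mean and calculate_sd (validation is omitted here for brevity): the test exercises both units together as one small pipeline rather than in isolation.

```python
def arithmetic_mean(input_list):
    return sum(input_list) / len(input_list)

def calculate_sd(input_list):
    mean_value = arithmetic_mean(input_list)
    variance = sum((n - mean_value) ** 2 for n in input_list) / len(input_list)
    return variance ** 0.5

def test_standardising_pipeline():
    # Integration check: standardising data using both units together
    # should produce a list with mean 0 and standard deviation 1
    data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    mean_value = arithmetic_mean(data)
    sd_value = calculate_sd(data)
    standardised = [(n - mean_value) / sd_value for n in data]
    assert abs(arithmetic_mean(standardised)) < 1e-9
    assert abs(calculate_sd(standardised) - 1.0) < 1e-9

test_standardising_pipeline()
```

Even if each unit passes its own tests, a check like this can catch mismatched assumptions between them, such as one function expecting the population standard deviation and the other the sample version.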
5.7 Continuous Integration
It is not enough simply to have a test suite that we call whenever we like to check the behaviour of our code base; ideally we would have a method to check our code in a more consistent manner.
Continuous Integration (CI) is the practice of merging small changes to our code base often. The opposite of this is a more ad hoc method, typical of less structured projects where large merges are added infrequently or at the end of the development cycle. Doing continuous integration allows the developers in a project immediate feedback on whether changes pass tests.
CI can be used for:
- Automated testing (running the test suite on each GitLab commit/merge request)
- Style checking
- Generating code metrics
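What the automated-testing case might look like in practice, as a hypothetical GitLab CI configuration (the file name, image and job names here are illustrative assumptions, not an ONS standard):

```yaml
# .gitlab-ci.yml (illustrative sketch)
stages:
  - test

unit-tests:
  stage: test
  image: python:3.11
  script:
    - pip install pytest
    - pytest   # runs every test_*.py file on each commit/merge request
```

With a configuration like this, every push runs the full test suite, so a failing test blocks a merge rather than being discovered later.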
In development and analysis it is desirable to do CI as it allows:
- Faster development due to a smaller reliance on manual testing
- Fast, regular deployment
- Early bug detection
- Simple code governance
At the ONS one of the Continuous Integration services that is used is Jenkins, which is described in more detail here. There is also a short course on CI which can be found at .