Basic Programming Styles in Python

Author

ONS Data Science Campus

1 Styles For Analysis

There are two main styles we will discuss in this section to help us think about how to structure our code.

Here is an example piece of code we will be converting into the different styles. As it’s a small piece of code it is quite simple, so writing functions for it may be a bit of overkill, but this is done to show the differences in approaches.

The code:

takes a list of strings
converts the strings to floats
finds the sum of the numbers
counts how many numbers are in the list (the length)
divides the sum by the length of the list

Of course, in reality you would use the built-in functions sum() and len() (and their R equivalents) to perform this calculation. By building these simple functions from scratch you can see how the functions, and the way they are called, vary between the different programming styles.

Try running it yourself to see what it does, or even write it yourself.

1.1 Example Code

input_strings = ["5.5", "7.4", "9.3", "4.6", "5.4", "10.1", "2.5", "6.1"]

# convert strings to floats
list_of_numbers = []
for string_num in input_strings:
    list_of_numbers.append(float(string_num))
    
# find the sum
total = 0
for num in list_of_numbers:
    total += num
    
# find the length - usually you would use len() but this is illustrative!
length = 0
for num in list_of_numbers:
    length += 1
    
# divide by count of numbers to get the mean
mean = total / length

print(mean)

6.362500000000001
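For comparison, here is a minimal sketch of the same calculation using the built-in sum() and len() mentioned earlier. The list comprehension is our own choice for the conversion step and is not part of the course code.

```python
# Illustrative only: the same mean using Python's built-in sum() and len()
input_strings = ["5.5", "7.4", "9.3", "4.6", "5.4", "10.1", "2.5", "6.1"]

# convert each string to a float with a list comprehension
list_of_numbers = [float(string_num) for string_num in input_strings]

# the built-ins replace the hand-written loops for the sum and the length
mean = sum(list_of_numbers) / len(list_of_numbers)

print(mean)
```

The result should agree with the loop-based version, up to floating point rounding.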

1.2 Style Examples

1.2.1 Procedural Style

Procedural code is a set of instructions run one after another. Each instruction is wrapped inside a function. The functions are called step by step.

Iteration is a common feature of procedural code, more so in Python than R.

At each step the result of the previous step is given to the next function.

This style is common in Python and R code.

All the instructions are written as functions that define what will happen when variables are given to them. These functions are defined first in the file.

Below the functions is the code that “runs” the script. Each line executes one after the other, with data being passed from one function to the next.

To see what the program does we just need to look at the end of the code; each function describes how it changes the data.

If we wanted to change the behaviour of any step, we would only change the relevant function. If we wanted to perform the steps in a different order, we would change the order in which they are run at the end of the script.

Advantages of Procedural Style

The code is easy to adapt, as to change the behaviour of any given step we only need to change the relevant function.
It’s easy for a collaborator or client to understand the flow of the program and what is being done at each step.
Because procedural style is very popular with new programmers and easy to learn, there are lots of resources available with example code that can help you get started.
Procedural code has a “top-down” structure which suits programmers who prefer to work their way through a program without a lot of prior planning.

Disadvantages of Procedural Style

Since each function is designed to perform a specific step in a sequence, code in procedural style is not very re-useable even within the same program.
In procedural style, code is broken down into smaller pieces and functions, which can make it difficult to track down errors when debugging.

1.3 Example Code

Procedural code is a common way to structure code in Python. Notice how in the final lines of the script, the output of a function is passed to the next function.

## Define functions to achieve each step
def convert_to_num(strings):
    """Convert each string to a float"""
    numbers = []
    for each_string in strings:
        numbers.append(float(each_string))
    return numbers


def sum_numbers(numbers):
    """Add the numbers together"""
    total = 0
    for num in numbers:
        total += num
    return total


def find_length(numbers):
    """return the length of the list"""
    count_element = 0
    for num in numbers:
        count_element += 1
    return count_element


def calculate_mean(total, length):
    """return the total of the numbers divided by the length"""
    return total / length


## Call functions on our data one after another
initial_strings = ["5.5", "7.4", "9.3", "4.6", "5.4", "10.1", "2.5", "6.1"]

numbers = convert_to_num(initial_strings)
total = sum_numbers(numbers)
length = find_length(numbers)
mean = calculate_mean(total, length)

print(mean)

6.362500000000001

Note that the collection of function calls at the end of the file presents the entire algorithm - what you are trying to achieve. As the function names are self-explanatory (and the functions only do what their names say), it is straightforward to follow what is being attempted.

1.4 Functional Style

Functional programming is an approach to solving problems using key principles:

Immutability

Data isn’t changed once it is created - a new variable can be made but existing ones are not altered.
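As a rough illustration of immutability, the sketch below contrasts a function that alters the list it is given with one that returns a new value and leaves its input untouched. The function names are hypothetical and are not used elsewhere in the course.

```python
# Hypothetical helper names, for illustration only
def add_one_in_place(numbers):
    """Alters the list it is given (not in the immutable style)."""
    for index in range(len(numbers)):
        numbers[index] += 1
    return numbers


def add_one_as_new(numbers):
    """Leaves the input alone and returns a new tuple."""
    return tuple(number + 1 for number in numbers)


original = (1.0, 2.0, 3.0)  # a tuple cannot be changed after it is created
shifted = add_one_as_new(original)

print(original)  # unchanged
print(shifted)   # a new variable holds the result

mutable = [1.0, 2.0, 3.0]
add_one_in_place(mutable)
print(mutable)   # the original list has been altered
```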

Higher-order functions

Functions that can take other functions as arguments are used to break up parts of a problem. We will see an example using the map() and reduce() functions below.
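A minimal sketch of the idea, assuming a hypothetical helper called apply_to_each() that we define ourselves: it receives another function as its first argument and applies it to every value. The built-in map() does the same job.

```python
def apply_to_each(function, values):
    """Hypothetical higher-order function: takes another function as an argument."""
    return [function(value) for value in values]


def double(number):
    """Return twice the given number."""
    return number * 2


numbers_from_strings = apply_to_each(float, ["1.5", "2.5"])
doubled = apply_to_each(double, numbers_from_strings)

# the built-in map() is itself a higher-order function
doubled_with_map = list(map(double, numbers_from_strings))

print(doubled)
print(doubled_with_map)
```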

Function purity

Purity means that a function doesn’t interact with the rest of the program. The function has no “side effects” - it doesn’t alter other variables or objects outside itself.

In addition, pure functions always give the same output when given the same input.
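The difference can be sketched as follows. Both functions and the state dictionary are made up for this illustration; the impure version changes data that lives outside itself, so calling it twice with the same input gives different results.

```python
# Hypothetical example; none of these names come from the course code
state = {"total": 0}  # data that lives outside the functions


def add_to_total_impure(number):
    """Impure: alters `state`, so repeated calls give different results."""
    state["total"] += number
    return state["total"]


def add_pure(total, number):
    """Pure: depends only on its arguments and changes nothing else."""
    return total + number


impure_first = add_to_total_impure(5)
impure_second = add_to_total_impure(5)
pure_first = add_pure(0, 5)
pure_second = add_pure(0, 5)

print(impure_first, impure_second)  # same input, different outputs
print(pure_first, pure_second)      # same input, same output
```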

Function purity and related concepts are discussed further in the Introduction to Unit Testing course. Government analysts can access the Unit Testing course.

Advantages of Functional Style

The principles of functional programming help when we convert our scripts into functions and then modules.
Code organised into pure functions is more reliable - you can easily write unit tests for the functions, and having pure functions with no “side effects” makes it easier to debug our code.
Writing code in the functional style provides clearly organised, pure functions and immutable (can’t be edited) variables which make your code easier to understand.
Because functional code is organised into functions, it is highly re-useable and easy to adapt - you only need to change the function in question to change a step.

Disadvantages of Functional Style

Immutability requires new variables to be assigned for every step, which can lead to functional code requiring a lot of memory.
In some cases, functional code can actually be less readable than other styles, for example when recursion is used instead of for loops.
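To make the readability point concrete, here is a small sketch that sums a list with a for loop and then with recursion. The function names are illustrative. Note that the recursive version also builds a new, shorter list on every call, which relates to the memory cost mentioned above.

```python
def sum_with_loop(numbers):
    """Add numbers using a for loop and a reassigned total."""
    total = 0
    for number in numbers:
        total += number
    return total


def sum_with_recursion(numbers):
    """Add numbers recursively, with no loop and no reassignment."""
    if not numbers:  # base case: an empty list sums to 0
        return 0
    # numbers[1:] creates a new list on every call, which costs memory
    return numbers[0] + sum_with_recursion(numbers[1:])


values = [1.0, 2.0, 3.0]
loop_total = sum_with_loop(values)
recursive_total = sum_with_recursion(values)

print(loop_total, recursive_total)
```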

Note: there is a distinction between “functional programming”, a defined style of writing code, and “writing programs that use functions”. We can write code using functions with a procedural style, but our code is only “functional” if it follows functional principles. Annoying naming conventions!

1.5 Example Code

Notice that we do not define a function to convert each string in the list to a float or to add together all of the numbers in the list. Instead we define a function to perform a particular task once, and use map() and reduce() to apply that function to all of the values in a list.

reduce() comes from a module called functools, which contains many useful functions for functional programming. functools comes with Python as part of the standard library, so you don’t need to install anything to use it.
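Before the full example, a small sketch of what reduce() does with a two-argument function: it combines the first two elements, then combines that result with the third, and so on. The short list of integers here is just for illustration.

```python
from functools import reduce


def sum_numbers(num1, num2):
    """Return the sum of the given numbers"""
    return num1 + num2


numbers = [1, 2, 3, 4]

# reduce() works through the list from left to right: ((1 + 2) + 3) + 4
total = reduce(sum_numbers, numbers)

# the same steps written out by hand
manual_total = sum_numbers(sum_numbers(sum_numbers(1, 2), 3), 4)

# an optional third argument gives a starting value, which is also
# what is returned when the iterable is empty
empty_total = reduce(sum_numbers, [], 0)

print(total, manual_total, empty_total)
```

Without the starting value, calling reduce() on an empty list raises a TypeError.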

from functools import reduce

## Define functions to perform each action
def convert_to_num(string):
    """Convert a string to a number"""
    number = float(string)
    return number

def sum_numbers(num1, num2):
    """Return the sum of the given numbers"""
    return num1 + num2

# not using a functional approach for the below
def find_length_of_list(numbers):
    """Count the elements in a list"""
    count_element = 0
    for num in numbers:
        count_element += 1
    return count_element

def mean(numbers):
    """Return the mean of a list of numbers"""
    total = reduce(sum_numbers, numbers)
    return total / find_length_of_list(numbers)

## Apply the functions across each element of the data
initial_strings = ["5.5", "7.4", "9.3", "4.6", "5.4", "10.1", "2.5", "6.1"]

# `map()` takes a function and an iterable, applying the function
# to each element of the iterable
numbers = map(convert_to_num, initial_strings)

# store the result under a new name so the mean() function is not overwritten
mean_value = mean(list(numbers))
print(mean_value)

6.362500000000001

Python has a range of other language features that help with functional programming.
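For instance, lambda expressions, the built-in filter() and list comprehensions are commonly used in this way. The sketch below is an illustrative selection chosen for this example rather than a complete list, and the variable names are our own.

```python
numbers = [5.5, 7.4, 9.3, 4.6]

# lambda: a small anonymous function defined inline
squared = list(map(lambda number: number ** 2, numbers))

# filter(): keep only the elements for which the function returns True
large = list(filter(lambda number: number > 5, numbers))

# list comprehension: builds a new list without changing the original
halved = [number / 2 for number in numbers]

print(squared)
print(large)
print(halved)
```

In each case the original list is left unchanged and a new list is created, in keeping with the immutability principle.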

1.6 Comparison

Each style has its appropriate uses, often in combination with each other.

Throughout this course we will try to design our code as functions following functional principles. Each of the new code styles shown relies on us breaking our scripts up into functions appropriately.

1.7 Example Code

Imperative: the code does what we want but isn’t structured to be reusable.

Procedural: each function completes a task for us, and we call the functions in order.

Functional: each function is applied to all elements of our data, following the principles of functional programming.

1.8 Exercise

The imperative script below contains steps for calculating the standard deviation of a given list of numbers (which are provided as strings). Recreate the calculation using procedural and functional styles.

Consider how in the functional style you could re-use one function in different parts of the calculation.

Rewrite the code below in a procedural style, then a functional style.

Note: writing a function to find the length of an iterable is beyond the scope of the course; doing so is considered an extension exercise.

input_strings = ["5.5", "7.4", "9.3", "4.6", "5.4", "10.1", "2.5", "6.1"]

# convert strings to floats
list_of_numbers = []
for string_num in input_strings:
    list_of_numbers.append(float(string_num))
    
# find the sum
total = 0
for number in list_of_numbers:
    total += number

# count the number of numbers in the list
length_of_list = 0
for number in list_of_numbers:
    length_of_list += 1

# divide by count of numbers to get the mean
mean = total / length_of_list

# subtract the mean from each number
diffs = []
for num in list_of_numbers:
    diffs.append(num - mean)

# square each difference
diff_sq = []
for diff in diffs:
    diff_sq.append(diff ** 2)

# add the squared differences together
diff_sq_total = 0
for diff in diff_sq:
    diff_sq_total += diff

# divide by the count of numbers
var = diff_sq_total / length_of_list

# square root
sd = var**0.5

print(sd)

2.3302025126585026

Re-writing the imperative code in a procedural style. This answer is one example; you could split up your code into functions differently.

def convert_to_num(strings):
    """convert list of strings to floats"""
    numbers = []
    for string in strings:
        numbers.append(float(string))
    return numbers

def sum_numbers(numbers):
    """find sum of list of numbers"""
    total = 0
    for number in numbers:
        total += number
    return total
    
def find_length(values):
    """find length of a list"""
    length = 0
    for value in values:
        length += 1
    return length

def calculate_mean(numbers):
    """given list of numbers return mean"""
    total_sum = sum_numbers(numbers)
    count = find_length(numbers)
    return total_sum / count
    
def sum_squared_diffs(numbers):
    """calculate the sum of the squared differences"""
    mean = calculate_mean(numbers)
    squared_differences = []
    for value in numbers:
        difference = value - mean
        difference_squared = difference ** 2
        squared_differences.append(difference_squared)
    return sum_numbers(squared_differences)

def standard_deviation(numbers):
    """calculate standard deviation of list of numbers"""
    variance = sum_squared_diffs(numbers) / find_length(numbers)
    sd = variance ** 0.5
    return sd


input_strings = ["5.5", "7.4", "9.3", "4.6", "5.4", "10.1", "2.5", "6.1"]

numerical_values = convert_to_num(input_strings)

sd = standard_deviation(numerical_values)

print(sd)

2.3302025126585026

Re-writing the imperative code in a functional style. This answer is one example; you could split up your code into functions differently.

The find_length() function is a bit impractical! An interesting design, but not something we would use in practice.

from functools import reduce

# there are no longer any for loops in the code

def convert_to_num(strings):
    """convert list of strings to floats"""
    return list(map(float, strings)) 

def sum_numbers(number1, number2):
    """find sum of two numbers"""
    return number1 + number2
    
def find_length(values):
    """find length of a list"""
    # map() gives a 1 for every value; adding these together gives the length
    # the starting value of 0 means an empty list returns 0 instead of raising an error
    return reduce(sum_numbers, map(lambda _: 1, values), 0)

def calculate_mean(numbers):
    """given list of numbers return mean"""
    return reduce(sum_numbers, numbers) / find_length(numbers)
    
def sum_squared_diffs(numbers):
    """calculate the sum of the squared differences"""
    mean = calculate_mean(numbers)
    # calculate the squared difference for each value, then add them together
    squared_differences = map(lambda value: (value - mean) ** 2, numbers)
    return reduce(sum_numbers, squared_differences)

def standard_deviation(numbers):
    """calculate standard deviation of list of numbers"""
    variance = sum_squared_diffs(numbers) / find_length(numbers)
    sd = variance ** 0.5
    return sd


input_strings = ["5.5", "7.4", "9.3", "4.6", "5.4", "10.1", "2.5", "6.1"]

numerical_values = convert_to_num(input_strings)

sd = standard_deviation(numerical_values)

print(sd)

2.3302025126585026
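As a cross-check on the answers above, the standard library's statistics module can calculate the same quantity. This sketch assumes the population standard deviation is intended, since the exercise divides by the count of numbers rather than the count minus one. That corresponds to statistics.pstdev(), not statistics.stdev().

```python
import statistics

input_strings = ["5.5", "7.4", "9.3", "4.6", "5.4", "10.1", "2.5", "6.1"]
numerical_values = [float(string_num) for string_num in input_strings]

# pstdev() divides by N, matching the hand-written calculation
population_sd = statistics.pstdev(numerical_values)

# stdev() divides by N - 1, so it gives a slightly larger value
sample_sd = statistics.stdev(numerical_values)

print(population_sd)
print(sample_sd)
```

The population value should agree with the exercise answers, up to floating point rounding.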

Congratulations, you’ve completed the Modular Programming in Python course.