x = 4 + 3
x
7
Python was created by Guido van Rossum and first released in 1991. The name is a reference to Monty Python’s Flying Circus.
It is a general-purpose programming language that has four key aims:
Python is extensible – we can use a broad range of additional packages to enhance our code and make processes easier. These packages (for example pandas and plotly) are mostly open source and are often well supported by their developers.
In practical terms, Python is a programming language that you use to write instructions for a computer. These instructions are typically organized in the following ways:
A script is a file containing Python code that is meant to be run directly. Scripts are often used for automation, data analysis, or any task you want to execute from start to finish. Any file ending with .py is a Python file. These files can contain scripts, reusable functions, classes, or any Python code. You can run them directly or import them into other Python files.
A module is simply a Python file (or a collection of files) that can be imported into other Python code. Modules help organize code into reusable pieces.
A package is a collection of Python modules.
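As a small sketch of how modules and packages are used in practice, the example below imports from the standard library. The choice of `math` and `json` is ours for illustration; any installed module or package is imported in the same way.

```python
import math              # a single module from the standard library
import json              # a package: a folder of modules, imported the same way
from math import sqrt    # import just one name from a module

print(math.pi)
print(sqrt(16))

# Packages carry a __path__ attribute pointing at their folder; plain modules do not.
json_is_package = hasattr(json, "__path__")
math_is_package = hasattr(math, "__path__")
print(json_is_package, math_is_package)
```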
When we create an object in Python we assign it to a variable. A variable is a name that acts like a label: a reference to an object that lives in memory. Without this variable we can’t “find” the object again in memory and won’t be able to use it for analysis purposes.
In Python we assign variables using the equals sign (=), where our label (or variable name) goes on the left and the object we want to store goes on the right. Unlike some other languages, in Python we do not have to state the data type of the variable we are storing in memory.
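A minimal sketch of assignment is shown here. The variable names and values are made up for illustration.

```python
# The name goes on the left of "=", the object to store goes on the right.
age = 42              # Python works out that this is an integer
height_m = 1.75       # a decimal (floating point) number
first_name = "Ada"    # a piece of text (a string)

# A variable can be reused to build a new value and then re-assigned.
age = age + 1
print(age)            # 43
```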
Naming your variables can be one of the trickier parts of coding. Choosing sensible names saves time and energy later, when you try to remember what you’ve called something or need to refer to an object many times in code (after all, it can become tiring to repeatedly type out long variable names!).
Clever naming allows you to figure out what an object contains without having to inspect it first, a practice heavily adopted in code production and development.
Generally, a variable name:
We’ll deal with two main numeric data types in Python.
The handy type() function in Python allows us to check the type of whatever we put within the brackets.
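The sketch below shows type() in action. It assumes the two numeric types referred to above are integers and floating point numbers. The variable names are illustrative.

```python
whole_number = 7
decimal_number = 7.0

print(type(whole_number))     # <class 'int'>
print(type(decimal_number))   # <class 'float'>

# The result of type() can also be compared directly.
is_integer = type(whole_number) == int
print(is_integer)             # True
```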
Strings are sequences of character (word/text) data; the type in Python is called str. They are enclosed in either ‘single’ or “double” quotation marks, and you should stay consistent with whichever you choose.
We recommend that if you’re creating strings that use apostrophes or single quote marks within them, use double quotes to open and close your string.
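To illustrate the recommendation above, here is a short sketch with made-up example strings.

```python
single = 'hello'
double = "hello"
print(single == double)          # True: both quote styles create the same string

# Double quotes let us include an apostrophe without any extra work.
quote = "It's a lovely day"

# With single quotes the apostrophe has to be escaped with a backslash.
escaped = 'It\'s a lovely day'
print(quote == escaped)          # True
print(type(quote))               # <class 'str'>
```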
Boolean values are sometimes called logical values in other languages and consist of two unique values, True and False.
In Python they must be spelt out in full and have a capital first letter. They are not text values, so they do not require quote marks. They are reserved words (special words in Python that cannot be used as variable names), and so are displayed in bold green text. We will see many more examples of these reserved keywords later.
As we’ll see later, Python often evaluates expressions in a Boolean context: something is either True or False. Booleans even have parallels in the integers: True has the value 1 and False has the value 0.
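The link between Booleans and integers can be checked directly. This is a small sketch, and the list of answers is made up.

```python
print(True == 1)       # True
print(False == 0)      # True
print(True + True)     # 2

# Because True counts as 1, summing Booleans counts how many are True.
answers = [True, False, True, True]
number_true = sum(answers)
print(number_true)     # 3
```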
In this section we are going to explore some of the common data structures in Python. So far we have only stored one piece of information in memory; whereas we will usually want to store many. These data structures provide us with a particular way of organising data so it can be accessed efficiently. How you store it will often depend on how you want to use it later, so the choice is often an important one.
A list is a type of container; it holds a collection of items. The items in a list have an order, and each item’s position is known as its index. They:
We create lists in Python using square brackets [ ] and separate each item with a comma.
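A minimal sketch of creating a list and looking items up by position follows. The contents are illustrative.

```python
colours = ["red", "green", "blue"]

print(colours[0])      # red   (Python starts counting at 0)
print(colours[2])      # blue
print(len(colours))    # 3

# A list can hold items of different types.
mixed = ["text", 10, 2.5, True]
print(type(mixed))     # <class 'list'>
```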
Store at least three hobbies within a list. Choose an appropriate variable name.
Tuples are similar to lists but have two major differences that separate them into their own object category with its own niche uses. These are:
('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday')
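One well-known difference from lists is that a tuple cannot be changed once it has been created. The sketch below builds the tuple shown above (the variable name `days` is our own) and shows what happens when we try to change it.

```python
days = ('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday')

print(days[0])       # Monday
print(len(days))     # 7

# Trying to replace an item raises a TypeError, because tuples are immutable.
try:
    days[0] = 'Funday'
except TypeError as error:
    print("Tuples cannot be changed:", error)
```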
The third object type we’ll look at is the dictionary, which also stores a collection of objects, similarly to lists and tuples. They are:
To create a dictionary in Python, we use the only type of bracket we have yet to use: curly braces { }, which often feature in mathematics. Dictionaries contain key-value pairs (these are different from tuples!), where keys are usually integers or strings (immutable data types) and values can be any type of object. Syntactically, each pair is joined by a colon, written as “Key:Value”.
{'Nigeria': 2229, 'Wales': 2183, 'Malawi': 1801}
{'name': 'Bob', 'grades': {'math': 90, 'science': 85}, 'John': ['123-4567', 'john@email.com']}
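The two dictionaries shown above can be created and queried as in the sketch below. The variable names `country_values` and `record` are our own choices.

```python
country_values = {'Nigeria': 2229, 'Wales': 2183, 'Malawi': 1801}

# Look up a value using its key inside square brackets.
print(country_values['Wales'])        # 2183
print('Wales' in country_values)      # True
print(list(country_values.keys()))    # ['Nigeria', 'Wales', 'Malawi']

# Values can themselves be dictionaries or lists.
record = {'name': 'Bob',
          'grades': {'math': 90, 'science': 85},
          'John': ['123-4567', 'john@email.com']}
print(record['grades']['science'])    # 85
print(record['John'][1])              # john@email.com
```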
Create a dictionary called student that stores the following information:
Pandas gives us two new object types, the Series and the DataFrame; the DataFrame is by far the more widely used.
“Series” are:
In the following example we will create a series from scratch, but this is not something we often do in practice.
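A minimal sketch of building a Series by hand is given here. It assumes pandas is installed, and the values and the name "age" are made up for illustration.

```python
import pandas as pd

# A Series is a single labelled column of values.
ages = pd.Series([29, 35, 41], name="age")

print(ages)
print(list(ages.index))   # [0, 1, 2]  (the default row labels start at 0)
print(ages.mean())        # 35.0
```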
DataFrames are essentially a collection of Series objects (one Series per column), where the dimensions are labelled similarly to a Series object:
index refers to the row labels, which default to starting at 0
columns refers to the column labels, or headers
DataFrames share some methods with Series and have some of their own. The methods they share are heavily used in data analysis, and no Series-specific method is worth singling out over those of DataFrames.
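To make the index and columns labels concrete, here is a small sketch that assumes pandas is installed. The table contents are made up.

```python
import pandas as pd

# Each key becomes a column label; each list becomes one column (a Series).
animals = pd.DataFrame({"animal": ["cat", "dog", "owl"],
                        "legs": [4, 4, 2]})

print(list(animals.index))     # [0, 1, 2]
print(list(animals.columns))   # ['animal', 'legs']
print(animals.shape)           # (3, 2)

# Selecting a single column gives back a Series.
legs = animals["legs"]
print(type(legs))
```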
If I want to give you the location of a file, I can use the absolute file path. Let’s say, for example, that I have saved the “Intro_to_Python” folder in my C drive and I want to access the file “animals.csv”.
The full or absolute location of this file is:
This is clear and explicit about where the data is stored. However, to use this path yourself you would need to change parts of it; for example, your username is not “username”.
Because my working directory is automatically set to “C:/Users/ianbanda/Intro_to_Python/notebooks”, I can use what’s called a relative path.
A relative path is the location relative to the working directory, i.e., we specify the filepath starting from where we currently are in the folder structure.
For example, I can load the same file as above using the path
This will work for any user, as long as their working directory is set to the “notebooks” folder in the “Intro_to_Python” parent folder. You may notice here I’ve used two full stops (..), which is something we have yet to see. These move us back one level in the folder structure.
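The sketch below shows how the two full stops are resolved against a working directory. Note the assumptions: the “data” folder is a hypothetical name (the notes do not show where “animals.csv” sits), and “username” is a placeholder.

```python
import posixpath

working_directory = "C:/Users/username/Intro_to_Python/notebooks"
relative_path = "../data/animals.csv"   # "data" is a hypothetical folder name

# Join the relative path onto the working directory, then tidy up the "..".
joined = posixpath.join(working_directory, relative_path)
absolute_path = posixpath.normpath(joined)

print(joined)          # C:/Users/username/Intro_to_Python/notebooks/../data/animals.csv
print(absolute_path)   # C:/Users/username/Intro_to_Python/data/animals.csv
```

The “..” step cancels out the “notebooks” folder, so the path ends up one level higher before moving into the next folder.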
We highly recommend using forward slashes “/” within file paths. However, when copying a file path from Windows Explorer it will often have the “\” backslash character instead of a forward slash.
This causes issues in three ways. Firstly, backslashes in paths are specific to Windows, as Mac and Linux operating systems use the forward slash “/”. Secondly, the backslash symbol is used as an escape character within Python strings. Thirdly, although Python will often accept backslashes, other commonly used languages, like the statistical programming language “R”, will not. It’s worth getting into the good practice of using forward slashes.
Lastly, if you absolutely must use backslashes you should prefix the string with the letter “r” so that it is treated as a raw string, in which backslashes are not interpreted as escape characters. This is commented out below due to conflicts with the software these notes are written in, but will work for you if you have followed thus far!
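As a hedged illustration of why the “r” prefix matters, the sketch below uses a made-up path in which the folder and file names happen to start with “n” and “t”.

```python
# Without the r prefix, "\n" becomes a newline and "\t" becomes a tab.
escaped = "C:\new_folder\table.csv"

# With the r prefix the backslashes are kept exactly as typed.
raw = r"C:\new_folder\table.csv"

# Forward slashes avoid the problem altogether.
forward = "C:/new_folder/table.csv"

print(escaped)
print(raw)
print(raw.replace("\\", "/") == forward)   # True
```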
| | pclass | survived | name_of_passenger | sex_of_passenger | age_of_passenger | sibsp | parch | ticket | fare | cabin | embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | Allen, Miss. Elisabeth Walton | female | 29.0000 | 0 | 0 | 24160 | 211.3375 | B5 | S |
| 1 | 1 | 1 | Allison, Master. Hudson Trevor | male | 0.9167 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S |
| 2 | 1 | 0 | Allison, Miss. Helen Loraine | female | 2.0000 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S |
| 3 | 1 | 0 | Allison, Mr. Hudson Joshua Creighton | male | 30.0000 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S |
| 4 | 1 | 0 | Allison, Mrs. Hudson J C (Bessie Waldo Daniels) | female | 25.0000 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S |
As mentioned earlier, we can also use the “pd.read_” functions to read in Excel files. These have the file extension “.xlsx” and differ slightly from comma separated value files.
The function for reading in these files is pd.read_excel(), which also takes a minimum of one argument: the location of the file, including the file extension.
Import the data police_data.xlsx. We specifically want the second sheet, which has the name “Table P1”. You will need to specify some additional parameters. Look in the help documentation to see which one you should specify.
Hint - If referencing the sheet by index position; remember that Python starts counting at 0!
JSON (JavaScript Object Notation) files are text files that store data in a structured, human-readable format using key-value pairs, lists, and nested objects. The structure is similar to Python dictionaries and lists.
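The similarity to dictionaries can be seen with Python’s built-in json module. This is a minimal sketch that converts a dictionary to JSON text and back. The `student` data is illustrative.

```python
import json

student = {"name": "Bob", "grades": {"math": 90, "science": 85}}

# Convert the dictionary into JSON-formatted text.
json_text = json.dumps(student, indent=2)
print(json_text)

# Convert the JSON text back into Python objects.
restored = json.loads(json_text)
print(restored["grades"]["math"])   # 90
print(restored == student)          # True
```

The same module offers json.dump() and json.load() for writing to and reading from an open file rather than a string.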
Why use JSON files in Python?
Typical uses in Python:
A function is a bit of code which, when called, performs a task. It can take various inputs, called arguments, and return outputs.
Functions can help us to write code that is consistent, readable, maintainable and reproducible.
Functions are especially useful for reducing repetition. Repetitive code is harder to read and harder to maintain.
Functions within Python generally fall into three categories:
These are built into Python and are always available for use. Examples: print(), help().
Created by users to carry out specific tasks. Declared using the def keyword.
User defined - generally one line functions used within a larger piece of code.
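The three kinds of function can be sketched side by side. We assume here that the one-line functions described above are those written with the lambda keyword, since the notes do not name it. The function `double` is a made-up example.

```python
# 1. Built-in: always available, nothing to import.
print(len("Python"))                # 6

# 2. User defined: declared with the def keyword.
def double(number):
    """Return twice the number passed in."""
    return number * 2

print(double(4))                    # 8

# 3. One-line function written with lambda, used inside a larger piece of code.
numbers = [3, 1, 2]
largest_first = sorted(numbers, key=lambda n: -n)
print(largest_first)                # [3, 2, 1]
```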
3
Our functions start with the keyword def (define) with syntax highlighting, which is followed by our function name. This:
Assess whether each name follows good naming conventions (clarity, consistency, verbs vs. nouns, etc.)
def create_age_sex_pivot_tables()
def PrintData()
def x()
def disease_prevalance()
def get_number_of_patients()
def False()
def import_excel_file()
def process1()
def clear_temp_directory()
def calc()
def calculate_bmi()
def pivot_and_save_to_excel()
def process_data_and_return_result()
def process_data_and_return_result_or_error_if_fails()
def process_text()
As well as a name, functions can have arguments. Arguments are information, such as data, that are passed into the function.
In the example above, we have passed values into the function add_two_values(), but arguments can also be other data types such as strings.
A function can have multiple arguments.
In the function body, the arguments take the place of the data in the code.
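A short sketch of arguments in use follows. add_two_values() mirrors the function named in these notes, while describe_date() is a hypothetical function of our own that takes three arguments.

```python
def add_two_values(value_1, value_2):
    # value_1 and value_2 stand in for whatever data is passed in.
    total = value_1 + value_2
    return total

print(add_two_values(1, 2))               # 3
print(add_two_values("data", "frame"))    # dataframe

# Hypothetical example: a function with three arguments.
def describe_date(day, month, year):
    message = f"The date is {day} {month} {year}"
    return message

print(describe_date(30, "March", 2026))
```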
The date is 30 March 2026
All of the code inside the function body should be indented.
The return statement ends the function and sends a value back to the caller. It can return any data type.
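The sketch below uses two made-up functions to show a return value being captured in a variable, and return ending a function early.

```python
def make_number_list(last_number):
    # Build the whole numbers from 1 up to and including last_number.
    numbers = list(range(1, last_number + 1))
    return numbers            # a list is sent back to the caller

result = make_number_list(5)
print(result)

def describe(value):
    if value < 0:
        return "negative"     # the function stops here for negative values
    return "zero or positive"

print(describe(-3))           # negative
print(describe(10))           # zero or positive
```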
[1, 2, 3, 4, 5]
Throughout our courses we strongly encourage reading docstrings (short for documentation strings), the built-in “help” documents. They’re very useful for finding out what a function or method does, what its parameters are called, and what it expects to be passed as arguments.
Docstrings commonly describe:
But in general, there is scope to add any information that you consider relevant to an end-user of this particular function.
def add_two_values(value_1, value_2):
    """
    This function will add together two values

    Parameters
    ----------
    value_1 : The first value to add
    value_2 : The second value to add

    Returns
    -------
    total: The sum / concatenation of the two values specified.

    Notes
    -----
    A TypeError is raised if the two types cannot be added together.
    """
    total = value_1 + value_2
    return total

def add_two_values(value_1, value_2):
    """
    This function will add together two values

    Args:
        value_1(number) : The first value to add
        value_2(number) : The second value to add

    Returns:
        total(number): The sum / concatenation of the two values specified.

    Notes:
        A TypeError is raised if the two types cannot be added together.
    """
    total = value_1 + value_2
    return total

Help on function add_two_values in module __main__:

add_two_values(value_1, value_2)
    This function will add together two values

    Args:
        value_1(number) : The first value to add
        value_2(number) : The second value to add

    Returns:
        total(number): The sum / concatenation of the two values specified.

    Notes:
        A TypeError is raised if the two types cannot be added together.
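The behaviour described in the docstring can be checked with a short sketch. For brevity the function is redefined here with a one-line docstring of our own wording.

```python
def add_two_values(value_1, value_2):
    """Add together two values and return the total."""
    total = value_1 + value_2
    return total

# The docstring is stored on the function and is what help() displays.
print(add_two_values.__doc__)

print(add_two_values(2, 3))          # 5     (numbers are summed)
print(add_two_values("ab", "cd"))    # abcd  (strings are concatenated)

# Mixing a number and a string cannot be added, so a TypeError is raised.
try:
    add_two_values(2, "3")
except TypeError as error:
    print("TypeError:", error)
```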
Add a docstring to your function that explains what it does and describes its parameters and return value.
A virtual environment is a folder containing a self-contained Python installation and libraries. It helps you avoid conflicts between different projects’ dependencies.
Open your command line (Terminal, Command Prompt, or PowerShell).
Navigate to your project folder (optional):
Create a virtual environment named venv:
This creates a folder called venv in your project directory.
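For reference, the same folder can also be created from within Python using the standard library’s venv module, which is the module the command line route relies on. This is only a sketch: it builds the environment inside a temporary folder (our choice, to avoid touching a real project) and skips installing pip to keep it quick.

```python
import tempfile
import venv
from pathlib import Path

# Use a throwaway folder to stand in for the project directory.
with tempfile.TemporaryDirectory() as project_folder:
    environment_path = Path(project_folder) / "venv"

    # Create the virtual environment folder (with_pip=False keeps this fast).
    venv.create(environment_path, with_pip=False)

    created = environment_path.is_dir()
    has_config = (environment_path / "pyvenv.cfg").exists()

print(created, has_config)
```

The pyvenv.cfg file is what marks the folder as a virtual environment.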
Once activated, use pip to install packages:
Create and activate a virtual environment. Install a package and check it’s available.