31  File Handling

So far, we have only looked at storing data temporarily (in variables). This data is lost when we shut down. Similarly, we have got all our inputs from the user when the program runs, but that’s likely not practical for many purposes.

A better approach is to use files to store data that we need longer term. This might include : Input data for our code (e.g. parameters for our model) Output data from our code (e.g. results from our model)

There are lots of different ways we can do this in Python. However, for most of the data you’ll be using, you’ll likely want to interact with comma separated value (.csv) files.

.csv files are simply text files that have discrete values separated using commas. This makes it easy to read them into / export them from tables and spreadsheets.

In a .csv file, we can think of the data as being separated into rows or records (lines) and columns (with each column of data separated by a comma).

.csv files can be read and opened using any text editor (including VSCode). They can also be opened using spreadsheet software (eg Excel), which will open them as a spreadsheet - but be warned, it can also add things to it that can cause problems…

31.1 Reading .csv files in Python

There’s a library in Python called csv which allows us to easily read from and write to .csv files. (There’s an even easier approach to using .csv files in Python using Pandas, which you’ll learn about later). Let’s say we wanted to read in a .csv which contained name and age data in the format name, age :

import csv # import the csv library

# Set up two empty lists - one to store names, one to store ages
list_of_names = []
list_of_ages = []

# do the following block of code holding the csv file open in "read only" mode
# ("r"), and henceforth we'll refer to this file opened in read only mode as f
# (alias)
with open("filename.csv", "r") as f:
    # Create a csv reader object (we'll come back to that in Object Oriented
    # Programming), which uses the file opened in read-only mode (f), and where
    # the delimiter is a comma
    reader = csv.reader(f, delimiter=",")

    # For each row it reads in, take the first value before a comma and add it
    # to the list of names, and take the second value, and add it to the list
    # of ages.
    for row in reader:
        list_of_names.append(row[0])
        list_of_ages.append(row[1])

31.2 Writing .csv files in Python

We can also write to .csv files in Python using the same library.

import csv # import the csv library

# Here's a list of column titles
list_of_column_titles = ["Name", "ID"]

# Here's a few lists containing data we want to write out to file
data_row_1 = ["Dan", 43102]
data_row_2 = ["Sammi", 31023]
data_row_3 = ["Amy", 54098]

# Let's combine the data rows into a single list of lists
combined_data = [data_row_1, data_row_2, data_row_3]

# Do the following block of code with output.csv opened in write mode ("w"),
# and setting the newline parameter to null (this is a necessary inclusion on
# Windows, otherwise it'll add an extra line between lines), and refer to it as
# f
with open("output.csv", "w", newline='') as f:
    # Create a csv writer object, which uses f (the file opened in write mode
    # etc) with a comma delimiter
    writer = csv.writer(f, delimiter=",")

    # Write the first row (the column titles)
    writer.writerow(list_of_column_titles)

    # Loop through the other data rows and write them (one on each line)
    for data_row in combined_data:
        writer.writerow(data_row)

The resulting csv from this code would look like this.

31.3 File access Modes

There are a huge number of different file access modes. The most common ones you will use are :

“r” - open the file read-only, start at the beginning of the file

“w” - open the file write-only, start at the beginning of the file

“w+” - open the file read and write, start at the beginning of the file

“a” - open the file in append mode (a write mode that starts at the end of the file)

“w”, “w+” and “a” will create a new file with the given name if one does not exist. “r” will throw up an error if the file does not exist.

IMPORTANT - if you want to add to a file, you MUST use append mode. If you use a write mode, it will start at the beginning, and therefore wipe out your file before it starts.