33  An Introduction to NumPy

NumPy (“Numerical Python”) is an important Python library for scientific computing. It provides a number of features, and key amongst them is the ability to work easily with large multidimensional arrays, and to do so much more efficiently (computationally) than with standard Python lists.

Because of this, NumPy is particularly important when working in Machine Learning (as you’ll see later in the course).

NumPy is already installed in many scientific distributions of Python, such as the Anaconda distribution you have installed. Therefore, to work with it in our code, we just need to import the library. By convention, we import the library with the alias “np”, so that we can refer to NumPy as “np” in the rest of our code.

import numpy as np

33.1 Multidimensional Arrays

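The main object in NumPy is a multidimensional array. Before we go any further, let’s explain what we mean by a multidimensional array: a collection of values arranged along one or more dimensions (also called axes), where each value is located by one index per dimension.

That seems rather abstract. Let’s think about some more practical examples.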

A list of ages is a 1-dimensional array:

[32, 71, 56, 54, 28]
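
As a minimal sketch, the list above could be turned into a NumPy array with np.array. The variable name ages is purely illustrative. The ndim attribute reports the number of dimensions, and shape reports the size along each one.

```python
import numpy as np

# Illustrative variable name; the values are the ages listed above
ages = np.array([32, 71, 56, 54, 28])

print(ages.ndim)   # number of dimensions: 1
print(ages.shape)  # size along each dimension: (5,)
```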

A list of ages for each of a number of groups is a 2-dimensional array:

Group 1 : 32, 71, 56, 54, 28
Group 2 : 17, 28, 22, 18, 56
Group 3 : 98, 88, 10, 12, 5
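
One way to represent the groups above in NumPy is as a list of lists, one inner list per group. This is only a sketch, and the name group_ages is an assumption made for illustration.

```python
import numpy as np

# Illustrative name; each inner list holds the ages for one group
group_ages = np.array([
    [32, 71, 56, 54, 28],  # Group 1
    [17, 28, 22, 18, 56],  # Group 2
    [98, 88, 10, 12, 5],   # Group 3
])

print(group_ages.ndim)   # 2
print(group_ages.shape)  # (3, 5): 3 groups, 5 ages in each
```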

A list of ages for each group within each department is a 3-dimensional array:

Department 1
Group 1 : 32, 71, 56, 54, 28
Group 2 : 17, 28, 22, 18, 56
Group 3 : 98, 88, 10, 12, 5

Department 2
Group 1 : 8, 9, 15, 16, 23
Group 2 : 21, 37, 45, 42, 46
Group 3 : 51, 67, 16, 16, 21
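
The two departments above can be sketched as a single 3-dimensional NumPy array by nesting the lists one level deeper. The name department_ages is hypothetical and chosen only for this example.

```python
import numpy as np

# Illustrative name; outer level = department, middle = group, inner = ages
department_ages = np.array([
    [   # Department 1
        [32, 71, 56, 54, 28],
        [17, 28, 22, 18, 56],
        [98, 88, 10, 12, 5],
    ],
    [   # Department 2
        [8, 9, 15, 16, 23],
        [21, 37, 45, 42, 46],
        [51, 67, 16, 16, 21],
    ],
])

print(department_ages.ndim)   # 3
print(department_ages.shape)  # (2, 3, 5): 2 departments, 3 groups, 5 ages
```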

A 4-dimensional array might hold the ages for each group within each department within each organisation, and so on.

In Machine Learning, multidimensional arrays are common because we are looking to predict an outcome based on input values (which we call features), such as patient age or number of admissions. There may be lots of these features, and each one adds another dimension to our problem.

So we need a solution that allows us to work with multidimensional arrays easily and efficiently. NumPy provides us with this via NumPy arrays: arrays that can be multidimensional and upon which we can perform rapid calculations.
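
To give a flavour of what calculating on a whole array looks like, here is a small sketch that reuses the group ages from earlier. The variable names are illustrative assumptions. Each calculation is applied across the array in a single step, without writing an explicit loop.

```python
import numpy as np

# Illustrative data: the three groups of ages used earlier in this chapter
group_ages = np.array([
    [32, 71, 56, 54, 28],
    [17, 28, 22, 18, 56],
    [98, 88, 10, 12, 5],
])

# Add one year to every age in the array in a single step
ages_next_year = group_ages + 1
print(ages_next_year[0])   # first group: [33 72 57 55 29]

# Mean age within each group (averaging along the second dimension)
mean_age_per_group = group_ages.mean(axis=1)
print(mean_age_per_group)  # approximately 48.2, 28.2 and 42.6
```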