Multi-dimensional arrays. Why do we need them?
For example, let's say I have a whole bunch of data points, like
Person  The Matrix  Inception  Force Awakens  Wall-E  ...
A       5           4          5              ?
B       4           ?          4              ?
C       ?           ?          5              5
...
We would like to guess how Person C would rate Inception. Not an easy problem, but how would we store this kind of data in the first place?
If we just want to store the matrix, we can use a list of lists.
xs = [[5, 4, 5, 3], [4, -1, 4, -1], [-1, -1, 5, 5]]
There are 3 lists (one per person) and each one has 4 entries, so it's a 3x4 matrix; -1 stands in for a missing rating. We can do a lot with lists of lists, but it will be slow. Eventually we need a more efficient and flexible way of working with multi-dimensional arrays.
The standard way of working with data sets in Python is to use the NumPy library. A NumPy array is a multi-dimensional array.
import numpy as np
arr = np.array([[5, 4, 5, 3], [4, -1, 4, -1], [-1, -1, 5, 5]])
print(arr)
type(arr)
Every array has a shape:
arr.shape
arr = np.array([[1,2,3], [4,5,6]])
print(arr)
arr.shape
# accessing elements:
arr[0] # 0th row
arr[1] # 1st row
arr[0, 0] # very different from lists of lists, for those, we would have done arr[0][0]
arr[0, 1]
We can use slicing too:
arr = np.array([[1,2,3],[4,5,6],[7,8,9]])
arr
arr[1:3,1:3]
arr[0,:]
arr[:,0]
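A small sketch of what those slices return, next to the list-of-lists way of getting a column. The names `sub`, `first_row`, `first_col` and `rows` are only introduced here for illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

sub = arr[1:3, 1:3]    # rows 1-2 and columns 1-2: a 2x2 block
first_row = arr[0, :]  # the whole 0th row
first_col = arr[:, 0]  # the whole 0th column

print(sub)        # [[5 6]
                  #  [8 9]]
print(first_row)  # [1 2 3]
print(first_col)  # [1 4 7]

# A plain list of lists has no column slice,
# so we have to walk over the rows ourselves.
rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
first_col_list = [row[0] for row in rows]
print(first_col_list)  # [1, 4, 7]
```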
We can also have arrays with 3 or more dimensions:
ar3 = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(ar3)
ar3.shape
ar3[0,1,2]
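One way to read a 3-dimensional index is one level of nesting at a time. The sketch below uses the same `ar3` and peels off one index per step.

```python
import numpy as np

ar3 = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

print(ar3.shape)     # (2, 2, 3): 2 blocks, each with 2 rows and 3 columns
print(ar3[0])        # block 0, a 2x3 array
print(ar3[0, 1])     # row 1 of block 0: [4 5 6]
print(ar3[0, 1, 2])  # entry 2 of that row: 6
```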
arr = np.zeros([3,3]) # you put the shape in as a list
arr
The identity matrix
arr = np.identity(4)
arr
Equally spaced points:
np.linspace(1,2,11) # 11 points between 1 and 2 inclusive
Of course we could have done this with a list comprehension ([1 + 0.1*x for x in range(11)]), but NumPy operations on large arrays are usually much faster than the equivalent Python loops.
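As a quick check that the two approaches give the same points, here is a minimal sketch. The names `pts` and `pts_list` are made up for this example.

```python
import numpy as np

pts = np.linspace(1, 2, 11)                  # 1.0, 1.1, ..., 2.0
pts_list = [1 + 0.1 * x for x in range(11)]  # the list comprehension version

# Floating point rounding means the values may differ in the last digits,
# so compare with a tolerance rather than with ==.
print(np.allclose(pts, pts_list))  # True
print(len(pts), pts[0], pts[-1])   # 11 1.0 2.0
```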
# guess what will happen?
arr = np.zeros([3,3])
arr = arr + 1
print(arr)
Similarly
def f(x):
    return x*x + x + 1
f(arr) # again this would never work for lists
# Remark: functions that only accept a single number (such as math.cos) need to be wrapped with `np.vectorize` first
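A minimal sketch of that remark, assuming `math.cos` as the scalar-only library function. The name `vcos` is hypothetical.

```python
import math
import numpy as np

arr = np.zeros([3, 3]) + 1

# math.cos only accepts a single number, so math.cos(arr) raises a TypeError.
vcos = np.vectorize(math.cos)  # wrap it so it is applied entry by entry
print(vcos(arr))

# NumPy's own np.cos already works on whole arrays and gives the same values.
print(np.allclose(vcos(arr), np.cos(arr)))  # True
```

When NumPy already provides the function (as with `np.cos`), that version is the better choice; `np.vectorize` is mainly a convenience and still calls the Python function once per entry.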
Even more interesting:
np.array([1,2,3]) + np.array([4,5,6]) # if these were lists, it would be concatenation
It added the arrays as if they were vectors. NumPy applies the operation entry by entry to the arrays you gave it.
But maybe you want to control it yourself:
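To make the contrast with lists concrete, here is a short side-by-side sketch. The variable names are only for illustration.

```python
import numpy as np

list_sum = [1, 2, 3] + [4, 5, 6]                      # lists: concatenation
array_sum = np.array([1, 2, 3]) + np.array([4, 5, 6])   # arrays: entry-by-entry sum
array_prod = np.array([1, 2, 3]) * np.array([4, 5, 6])  # also entry by entry

print(list_sum)    # [1, 2, 3, 4, 5, 6]
print(array_sum)   # [5 7 9]
print(array_prod)  # [ 4 10 18]
```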
arr = np.array(range(9)).reshape(3,3) + 1
arr
np.sum(arr)
np.apply_along_axis(np.sum, 0, arr)
np.apply_along_axis(np.sum, 1, arr)
There is also `np.apply_over_axes`, which applies a function over several axes in turn.
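For sums specifically, the `axis` argument of `np.sum` gives the same results more directly. The sketch below compares the two and also shows `np.apply_over_axes`; `col_sums` and `row_sums` are names made up for this example.

```python
import numpy as np

arr = np.array(range(9)).reshape(3, 3) + 1
# [[1 2 3]
#  [4 5 6]
#  [7 8 9]]

col_sums = np.sum(arr, axis=0)  # add down each column: [12 15 18]
row_sums = np.sum(arr, axis=1)  # add across each row:  [ 6 15 24]

# Same results as the apply_along_axis calls
print(np.array_equal(col_sums, np.apply_along_axis(np.sum, 0, arr)))  # True
print(np.array_equal(row_sums, np.apply_along_axis(np.sum, 1, arr)))  # True

# apply_over_axes keeps the number of dimensions, so the result is 1x3 here
print(np.apply_over_axes(np.sum, arr, [0]))  # [[12 15 18]]
```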
arr = np.array(range(16))
arr.reshape(4,4)
arr.reshape(2,2,-1) # if you put -1, it figures out what the shape should be
arr
np.reshape(arr, (2,2,-1))
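A short sketch of what the cell above shows: `reshape` returns a new array and leaves `arr` alone, and `-1` is filled in from the total number of entries. The names `a` and `b` are only for illustration.

```python
import numpy as np

arr = np.array(range(16))

a = arr.reshape(4, 4)
b = arr.reshape(2, 2, -1)  # 16 / (2 * 2) = 4, so the last dimension is 4

print(a.shape)    # (4, 4)
print(b.shape)    # (2, 2, 4)
print(arr.shape)  # (16,): arr itself is unchanged

# The sizes have to multiply to 16, otherwise NumPy raises a ValueError.
try:
    arr.reshape(3, -1)
except ValueError as err:
    print("cannot reshape:", err)
```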
Plotting with matplotlib is very similar to MATLAB's plotting functions.
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
from math import pi
xs = np.linspace(0, 2*pi, 200)
ys = np.cos(xs)
zs = np.sin(xs)
plt.plot(xs, ys) # first one is blue
plt.plot(xs, zs) # second one is orange (green in matplotlib versions before 2.0)