Last time: More about mutable vs immutable. Selection-sort.
Today: making our selection sort code better. Complexity, big-O notation. Recursion.
We want to write a function that will sort a list of numbers:
e.g. we want sort([2, 15, -1, 8, 7])
to return:
[15, 8, 7, 2, -1]
Idea for an algorithm: move the maximum element to the top of the list, then move the maximum of the rest to the spot just below it, and so on...
Pseudo-code first level:
input: xs
output: a list with the same entries as xs, but with xs[i] >= xs[j] for all i < j

N = length of xs
for i = 0, ..., N-1:
    mloc = the location of the maximum of xs from i to N-1
    swap xs[mloc] and xs[i]
This algorithm is called selection sort. There are much better algorithms, such as merge sort and quicksort.
Of course we need to expand "the location of the maximum of xs from i to N-1" into code as well.
# returns the index of the maximum of xs among indices start (included) to end (not included)
def max_loc_of_part(xs, start, end):  # this is actually officially called argmax
    current_max = xs[start]
    current_max_location = start
    for i in range(start, end):
        if current_max < xs[i]:
            current_max = xs[i]
            current_max_location = i
    return current_max_location
# let's test:
max_loc_of_part([1,2,3,4,5], 0, 5)  # 4
max_loc_of_part([6,2,3,4,5], 0, 2)  # 0
max_loc_of_part([6,2,3,4,5], 2, 3)  # 2
During the sorting, we'll also need to swap things. Let's make that into a function too:
# note that this swaps *in place*
def swap(xs, i, j):
    dum = xs[i]
    xs[i] = xs[j]
    xs[j] = dum
    # return xs  # (we don't need a return, because xs is changed by the function, but we could add one)
# let's test:
xs = [1,2,3]
swap(xs, 0, 1)
print(xs)  # [2, 1, 3]
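As an aside, Python can swap two entries without a temporary variable, using tuple assignment; this is an equivalent way to write swap:

def swap(xs, i, j):
    # the right-hand side is evaluated first, then unpacked,
    # so the two entries are exchanged in place
    xs[i], xs[j] = xs[j], xs[i]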
def sort(xs):
    N = len(xs)
    for i in range(N):
        swap(xs, max_loc_of_part(xs, i, N), i)
    return xs
xs = [2, 15, -1, 8, 7]
sort(xs)  # [15, 8, 7, 2, -1]
Note that we could have written the same algorithm in one go like this:
def sort_with_not_great_code(xs):
    N = len(xs)
    for i in range(N):
        current_max = xs[i]
        current_max_location = i
        for j in range(i, N):
            if current_max < xs[j]:
                current_max = xs[j]
                current_max_location = j
        dum = xs[i]
        xs[i] = xs[current_max_location]
        xs[current_max_location] = dum
    return xs
# test
xs = [2, 15, -1, 8, 7]
sort_with_not_great_code(xs)  # [15, 8, 7, 2, -1]
But it is harder to read, and most importantly, you don't get any parts that you can test independently and make sure are OK. So it's better to break the problem down into smaller parts.
A few other small improvements:

- We don't need max_loc_of_part(xs, start, end); we can instead write a function argmax which gives the location of the maximum of a whole list, and apply it to the sublist xs[i:N] obtained with slicing notation. The way slicing works is as follows: if xs = [1,4,6,4,1,5,10], then xs[2:5] is [6,4,1]. It is the part of the list from the index on the left (included) to the index on the right (not included).
- We can use for i, x in enumerate(xs): to cycle through the list and keep an index at the same time (see the small demo after this list).
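A quick demonstration of what enumerate produces:

for i, x in enumerate([10, 20, 30]):
    print(i, x)
# prints:
# 0 10
# 1 20
# 2 30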
def argmax(xs):
    current_max = xs[0]
    current_max_location = 0
    for i, x in enumerate(xs):  # use enumerate to cycle through i and xs[i] at the same time
        if current_max < x:
            current_max = x
            current_max_location = i
    return current_max_location
def sort(xs):
    N = len(xs)
    for i in range(N):
        # argmax returns an index within the slice xs[i:N], so shift it by i
        swap(xs, i + argmax(xs[i:N]), i)
    return xs
# test
xs = [2, 15, -1, 8, 7]
sort(xs)  # [15, 8, 7, 2, -1]
Remark: We didn't deal with empty lists in our code; argmax fails on [] right away, because it reads xs[0]. We probably should handle that.
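Here is a minimal sketch of one way to handle it (raising a ValueError is our choice here, not the only option):

def argmax(xs):
    if len(xs) == 0:
        raise ValueError("argmax of an empty list is undefined")
    current_max = xs[0]
    current_max_location = 0
    for i, x in enumerate(xs):
        if current_max < x:
            current_max = x
            current_max_location = i
    return current_max_location

Note that sort itself is fine on []: its loop body never runs, so it just returns the empty list.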
A little more about slicing:
xs = [1,2,3,4,5,6,7]
print("the element at index 2 is:", xs[2])                                        # 3
print("the sublist from indices 2 (included) to 4 (not included) is:", xs[2:4])  # [3, 4]
print("the part up to index 2 (not included) is:", xs[:2])                       # [1, 2]
print("the part starting from index 2 (included) is:", xs[2:])                   # [3, 4, 5, 6, 7]
If we run sort(xs) on a list xs of length $n$, how many operations are done by Python?
If we look through the code, we can see that we loop through the list once, and at step $i$ we scan the part from xs[i] to xs[N-1], which has $n-i$ entries; summing over $i$ gives $n + (n-1) + \dots + 1 = n(n+1)/2$ scans, each of which does only a handful of operations (a comparison and a couple of assignments), plus a few operations for swapping etc. All in all, it is safe to say that we do fewer than $$20\frac{n(n+1)}{2} = 10n^2 + 10n$$ operations.
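If you want to check the $n(n+1)/2$ count concretely, here is a small sketch (the helper name count_inner_steps is ours) that mirrors the loop structure of sort and counts how many times the inner scan runs:

def count_inner_steps(n):
    # same loop structure as sort: for each i, scan the part from i to n-1
    count = 0
    for i in range(n):
        for j in range(i, n):
            count += 1
    return count

for n in [5, 10, 100]:
    print(n, count_inner_steps(n), n * (n + 1) // 2)  # the two counts agree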
As $n$ becomes large, $n^2$ is clearly much, much bigger than $n$, and the factor of $10$ is not that interesting, since it's not the real source of growth in the amount of time it will take.
So we say, the algorithm takes time in $\operatorname{O}(n^2)$.
Officially: let us call the largest number of steps that the algorithm takes on any list of length $n$, $T(n)$; then $T: \mathbb{N} \rightarrow \mathbb{N}$ is a function called the worst-case time complexity of the algorithm. $T$ is in $\operatorname{O}(n^k)$ if there is a constant $C$ such that, for large enough $n$, we have: $$T(n) < Cn^k$$
Conclusion: the running time of the selection sort algorithm is (for large $n$) less than $Cn^2$; for example, $10n^2 + 10n \le 20n^2$ as soon as $n \ge 1$, so $C = 20$ works. So we say it's an $\operatorname{O}(n^2)$ algorithm.
By the way, the best comparison-based sorting algorithms (e.g. merge sort) are $\operatorname{O}(n\log n)$.
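In practice you would use Python's built-in sorting, which also runs in $\operatorname{O}(n\log n)$ time; to get our descending order:

xs = [2, 15, -1, 8, 7]
print(sorted(xs, reverse=True))  # [15, 8, 7, 2, -1]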
Exercise (insertion sort): given a list xs, make a new list ys = []. For each element x in xs, insert x into ys in a way so that ys stays sorted. E.g. if ys = [10,8,4,1] and we are inserting x = 5, then ys will be [10,8,5,4,1]. (You can use ys.insert(place, new_element) if you want, or write it yourself.)
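For reference, here is how the built-in list.insert behaves on the example above:

ys = [10, 8, 4, 1]
ys.insert(2, 5)   # insert the value 5 at index 2; later elements shift right
print(ys)         # [10, 8, 5, 4, 1]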