Python

How to Brute Force Sort a List in Python: Bubble, Insertion, and Selection

Earlier in this series, I wrote a couple of articles on how to sort different types of lists in Python. For instance, I wrote one article on how to sort a list of strings. Then, later I wrote an article on how to sort a list of dictionaries. In both of those article, I used a few elegant solutions that are afforded to use by the Python standard library. Of course, what if we want to write our own sorting algorithm? That’s our topic for today!

As it turns out, there are a lot of ways to write your own brute force sorting algorithm in Python. For instance, you could try to implement selection sort, bubble sort, or insertion sort. For fun, you might even roll your own bogosort. In this article, we’ll take a look at solutions for all four algorithms.

Problem Description

If you’ve ever taken a data structures or algorithms course, you’re probably familiar with the different ways we can store and manage data in a program. For example, we might store information in a list because we want to be able to access it at random quickly. Alternatively, we might opt for a dictionary because we want a quick way to lookup values.

Whatever data structure we choose, there are various ways we can interact with it. For example, a stack usually has push and pop operations. Meanwhile, a list might have insert and remove operations.

In this article, we’ll be taking a look at the Python list which can function as a lot of different data structures (e.g. stacks, queues, etc.). For our purposes, we’ll be treating it like an array of integers:

1
2
my_list = [4, -7, 5, 4]
my_sorted_list = [-7, 4, 4, 5]

Now, the question is: what can we do with a list of integers? Well, we could try summing them up. Alternatively, we might look for the mean, median, and mode. That said, you’re not here to do any of that. You want to know how to sort this thing.

That said, sorting can mean a lot of different things depending on the context. Of course, as my buddy Robert said:

To sort means to flip the proverbial bird at entropy

In other words, the goal of sorting is to take the chaos of some list and organize it in some specific order. For example, if we sort this list of integers, we could organize the values in ascending or descending order. Luckily, most of the algorithms we’ll be looking at in this article will work for any sortable data like strings and characters.

Specifically, our goal will be to write a few list sorting algorithms by hand. In other words, we won’t be using any of the straightforward solutions outlined in the previous articles. Instead, we’ll write our own loops to implement some of the common poor-performing algorithms like bubble sort, insertion sort, and selection sort (i.e O(N2)). After all, each of these poor-performing algorithms work on the basis of brute force: sort one element per pass.

For now, we won’t bother talking about Big O notation, but if you’re interested in that sort of thing, I wrote an article about it ages ago.

Solutions

As I mentioned already, we’ll take a look at three typical brute force sorting algorithms: bubble sort, insertion sort, and selection sort. Of course, we won’t leave here without at least one fun sorting algorithm (hint: it’s bogo sort).

Sort a List With Bubble Sort

If you’re not familiar with bubble sort, we’ve written about the algorithm for the Sample Programs repo. To summarize, bubble sort is an algorithm which relies on swapping consecutive pairs of elements. As a result, large values tend to “bubble” up to the top of the list. To see this algorithm in action, check out the following video:

At any rate, here’s a simple Python implementation of bubble sort:

1
2
3
4
5
6
7
8
my_list = [4, -7, 5, 4]
is_sorted = False
while not is_sorted:
  is_sorted = True
  for i in range(len(my_list) - 1):
    if my_list[i] > my_list[i + 1]:
      my_list[i], my_list[i + 1] = my_list[i + 1], my_list[i]
      is_sorted = False

I wrote this algorithm based on the pseudocode provided in Dr. Shun Yan Cheung’s bubble sort notes. Essentially, it works by continually swapping pairs of consecutive elements that are out of order until there are no more swaps to be made. For example, on the first pass, we end up with the following change:

1
2
[4, -7, 5, 4]  # Initial list
[-7, 4, 4, 5]  # After the initial iteration

Interestingly, we actually end up with a sorted list after the first pass in this case. Of course, that is almost never the case. For example, if we change the list as follows:

1
[5, 4, 3, 2, 1]

We will only see the 5 move on the first pass:

1
2
[5, 4, 3, 2, 1]  # Initial list
[4, 3, 2, 1, 5]  # After the first iteration

In other words, we end up with our worst nightmare: a list that’s in reverse sorted order.

At any rate, the portion of the code that performs each swap is the inner loop:

1
2
3
4
for i in range(len(my_list) - 1):
  if my_list[i] > my_list[i + 1]:
    my_list[i], my_list[i + 1] = my_list[i + 1], my_list[i]
    is_sorted = False

Meanwhile, the code that detects if the list is sorted is the outer loop:

1
2
3
is_sorted = False
while not is_sorted:
  is_sorted = True

Of course, the actual mechanism that tells us if the list is not sorted is the line is_sorted = False in the inner loop. If there aren’t any swaps needed for a pass of the list, the is_sorted variable stays true. In other words, we’re done!

As you can probably imagine, there are some minor optimizations we can make with this algorithm. For example, we know that each pass moves the current largest element to the end of the list. As a result, we could reduce our number of checks by “shrinking” our list by one each iteration. Of course, I’ll leave that exercise up to you.

Sort a List With Insertion Sort

If bubble sort isn’t your style, perhaps you might like to try insertion sort. Once again, I won’t go into too much detail on this algorithm because we’ve written about it for the Sample Programs repo. That said, the basic idea behind insertion sort is to treat a subset of the list as sorted and increasing grow that collection by inserting elements in it from the unsorted set—or visually:

In terms of implementation, we can write the insertion sort algorithm as follows:

1
2
3
4
5
6
7
8
my_list = [4, -7, 5, 4]
for i in range(1, len(my_list)):
  to_swap = my_list[i]
  j = i - 1
  while j >= 0 and my_list[j] > to_swap:
    my_list[j + 1] = my_list[j]
    j -= 1
  my_list[j + 1] = to_swap

Once again, this solution was borrowed from the pseudocode on Algorithmist. It works by starting at the first index (i.e. i = 1) and comparing that element with the element at the zeroth index (i.e. j < 1). If a swap is needed, the items are swapped. In this case, the second item is smaller than the first, so we end up with the following change:

1
2
[4, -7, 5, 4]  # Initial list
[-7, 4, 5, 4]  # After the first iteration

Next, the algorithm moves to the second index (i.e. i = 2) and begins working backward (i.e. j < 2) to find where that item fits in the first two items. In this case, 5 is already larger than 4, so we don’t need to perform any swaps:

1
2
3
[4, -7, 5, 4]  # Initial list
[-7, 4, 5, 4]  # After the first iteration
[-7, 4, 5, 4]  # After the second iteration

Finally, the outer loop moves to the final element (i.e i = 3) and begins scanning through the sorted portion of the list (i.e. j < 3) to find where the current item goes. In this case, we only need to check as far back as the first index to figure out where 4 goes. As a result, we’re finished:

1
2
3
4
[4, -7, 5, 4]  # Initial list
[-7, 4, 5, 4]  # After the first iteration
[-7, 4, 5, 4]  # After the second iteration
[-7, 4, 4, 5]  # After the third iteration

One thing to note is that swaps occur as we work backward through the sorted list. For example, in the last iteration, we found out that 5 was bigger than 4. At that point, we were able to move 5 into the last position. The portion of the code that handles swapping is the inner loop:

1
2
3
while j >= 0 and my_list[j] > to_swap:
  my_list[j + 1] = my_list[j]
  j -= 1

Meanwhile, the outer loop tracks the point that divides the sorted portion of the list from the unsorted portion and performs insertion:

1
2
3
4
5
for i in range(1, len(my_list)):
  to_swap = my_list[i]
  j = i - 1
  # Inner loop
  my_list[j + 1] = to_swap

As you can probably imagine, there are more pythonic ways to write this solution. For example, Haseeb Majid chose to split the list in half and reassemble it with the latest item inserted in the correct place. If you know of any better solutions, feel free to share them in the comments.

Sort a List With Selection Sort

Now that we’ve seen insertion sort, it’s not too much of a stretch to begin talking about selection sort. After all, the algorithm is quite similar. However, instead of inserting an item in a sorted sublist, we seek out the smallest item from the unsorted sublist and add it to the end of the sorted sublist. For more information, check out the description of selection sort in the Sample Programs repo. Otherwise, here’s a nice visualization:

In terms of actual code, here’s a potential solution in Python:

1
2
3
4
5
6
7
my_list = [4, -7, 5, 4]
for i in range(len(my_list)):
  min_index = i
  for j in range(i + 1, len(my_list)):
    if my_list[j] < my_list[min_index]:
      min_index = j
  my_list[i], my_list[min_index] = my_list[min_index], my_list[i]

As usual, I based this solution off a solution written in C on the selection sort Wikipedia page. It works by starting from the first element in the list (i.e. i = 0) and searching for the smallest element in the list (i.e. j > 0). After a complete pass, we know we’ve found the smallest element (min_index = 1), so we can perform our swap. On the first pass, we end up with the following change:

1
2
[4, -7, 5, 4]  # Initial list
[-7, 4, 5, 4]  # After first iteration

Then, we move our main pointer (i.e. i = 1) and begin searching the unsorted portion of the list (i.e. j > 1) for the smallest value. On the second pass, we end up with the following change:

1
2
3
[4, -7, 5, 4]  # Initial list
[-7, 4, 5, 4]  # After the first iteration
[-7, 4, 5, 4]  # After the second iteration

In this case nothing changes because 4 is in the correct position. Then, on the next iteration (i.e. i = 2), we search the unsorted portion of the list (i.e j > 2) for the smallest remaining value. In this case, it’s the other 4:

1
2
3
4
[4, -7, 5, 4]  # Initial list
[-7, 4, 5, 4]  # After the first iteration
[-7, 4, 5, 4]  # After the second iteration
[-7, 4, 4, 5]  # After the third iteration

At this point, the list is sorted.

Naturally, the portion of the code responsible for performing the search is the inner loop:

1
2
3
for j in range(i + 1, len(my_list)):
    if my_list[j] < my_list[min_index]:
      min_index = j

Meanwhile, the portion of code responsible for tracking the end of the sorted list and performing the swap is the outer loop:

1
2
3
4
for i in range(len(my_list)):
  min_index = i
  # Inner loop
  my_list[i], my_list[min_index] = my_list[min_index], my_list[i]

Again, I’m sure there are more clever ways to write this solution using Python. For example, we could use a two-list approach (as Haseeb did) which allows us to use the min, append, and remove functions. In other words, no explicit loops. If you know of other clever ways to implement selection sort, let me know in the comments.

Sort a List With Bogosort

Now that we’ve gone through the three main brute force sorting algorithms, I figured we could look at another brute force method: bogosort. Rather than continually placing one element in the correct place on each pass, we’ll just move the elements at random until we sort the list. Here’s what that might look like in Python:

01
02
03
04
05
06
07
08
09
10
11
12
my_list = [4, -7, 5, 4]
 
import random
is_sorted = False
while not is_sorted:
  random.shuffle(my_list)
  last_item = my_list[0]
  is_sorted = True
  for item in my_list:
    if last_item > item:
      is_sorted = False
    last_item = item

Here, we leverage a helpful package called random which has a utility for shuffling lists. To start, we shuffle the list assuming the list isn’t already sorted. Then, we check to see if the list is sorted. If so, we’re done. Otherwise, we repeat the cycle.

To see this in action, let’s look at what could happen. First, we’ll shuffle the list:

1
2
[4, -7, 5, 4]  # Initial list
[5, 4, 4, -7]  # After first iteration

As we can see, the list isn’t sorted. We’ll confirm that by checking each pair of values in sequential order. If we don’t see any pairs out of order, we stop. However, in this case, 5 is greater than 4, so we know the list isn’t sorted. As a result, we shuffle again:

1
2
3
[4, -7, 5, 4]  # Initial list
[5, 4, 4, -7]  # After first iteration
[-7, 4, 5, 4]  # After second iteration

As we can imagine, this process could go on for a long time. Here’s an actual sequence of permutations I got when I ran the solution above:

1
2
3
4
5
6
7
8
9
[5, 4, 4, -7]
[-7, 4, 5, 4]
[5, 4, -7, 4]
[4, 4, -7, 5]
[4, 5, 4, -7]
[4, 5, 4, -7]
[4, 5, -7, 4]
[4, 5, 4, -7]
[-7, 4, 4, 5]

Now, that’s just for four elements. Imagine how long this could take with even more elements. Or, better yet, don’t imagine it at all. Here’s a visualization of the algorithm failing repeatedly for 100 elements:

Fortunately, there is a slight improvement that can be made to this algorithm. Instead of generating states at random, we could keep track of states we’ve already made and only generate new states. That way, we wouldn’t waste time generating repeated states.

Unfortunately, the deterministic version of bogosort is still very, very bad. Specifically, the algorithm is O(N!). In our four item case, we’d have a worst case runtime of checking 4! (24) states. Meanwhile, all of the algorithms mentioned thus far operate at O(N2) which means at worst 16 comparisons. As you can probably imagine, this is bad news for bogosort in the long term:

NO(N2) ComparisonsO(N!) Comparisons
41624
525120
636720
7495040
86440320

For fun, we’ll take a look at the performance of these algorithms in the next section.

Performance

To test each solution, we’ll need to build up some strings:

01
02
03
04
05
06
07
08
09
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
setup = """
import random
size = 4
max = 30
"""
 
bubble_sort = """
my_list = random.sample(range(max), size)
is_sorted = False
while not is_sorted:
  is_sorted = True
  for i in range(len(my_list) - 1):
    if my_list[i] > my_list[i + 1]:
      my_list[i], my_list[i + 1] = my_list[i + 1], my_list[i]
      is_sorted = False
"""
 
insertion_sort = """
my_list = random.sample(range(max), size)
for i in range(1, len(my_list)):
  to_swap = my_list[i]
  j = i - 1
  while j >= 0 and my_list[j] > to_swap:
    my_list[j + 1] = my_list[j]
    j -= 1
  my_list[j + 1] = to_swap
"""
 
selection_sort = """
my_list = random.sample(range(max), size)
for i in range(len(my_list)):
  min_index = i
  for j in range(i + 1, len(my_list)):
    if my_list[j] < my_list[min_index]:
      min_index = j
  my_list[i], my_list[min_index] = my_list[min_index], my_list[i]
"""
 
bogo_sort = """
my_list = random.sample(range(max), size)
is_sorted = False
while not is_sorted:
  random.shuffle(my_list)
  last_item = my_list[0]
  is_sorted = True
  for item in my_list:
    if last_item > item:
      is_sorted = False
    last_item = item
"""

For this test, I introduced random list generation, so we could get more consistent testing. Unfortunately, the random sampling does add to the test time. However, since it’s the same line of code for all the snippets, I suspect it only adds an overhead.

At any rate, to actually test these snippets, we just need to invoke timeit:

1
2
3
4
5
6
7
8
9
>>> import timeit
>>> min(timeit.repeat(setup=setup, stmt=bubble_sort))
9.461616800001138
>>> min(timeit.repeat(setup=setup, stmt=insertion_sort))
7.850697500000024
>>> min(timeit.repeat(setup=setup, stmt=selection_sort))
9.171850900000209
>>> min(timeit.repeat(setup=setup, stmt=bogo_sort))
92.38232779999998

As you can probably imagine, I waited a concerning amount of time for that bogosort test to finish. Beyond that, I was most surprised by the performance of the selection sort algorithm. As it turns out, insertion sort generally performs less swaps than bubble sort and less comparisons than selection sort.

If you’re interested in seeing how these solutions scale, I modified the size parameter just for you. However, I didn’t retest bogosort:

01
02
03
04
05
06
07
08
09
10
11
>>> setup = """
import random
size = 10
max = 30
"""
>>> min(timeit.repeat(setup=setup, stmt=bubble_sort))
29.55873109999993
>>> min(timeit.repeat(setup=setup, stmt=insertion_sort))
20.157115599999088
>>> min(timeit.repeat(setup=setup, stmt=selection_sort))
23.557934999998906

Here, we can see that selection sort is beginning to overtake bubble sort. However, it’s still not quite as fast as insertion sort. Naturally, I took to Google to find out exactly why this discrepancy exists. Thankfully, Stack Overflow user Cody Gray has a comprehensive answer. In short, they claimed that these discrepancies are expected. In fact, insertion sort is expected to outperform selection sort which is expected to outperform bubble sort. How cool is that?!

Challenge

If you liked learning about the different brute force sorting algorithms, I have a challenge for you:

Rewrite one of the brute force sorting algorithms (e.g. bubble sort, selection sort, insertion sort, or bogo sort) for your favorite data type (e.g. string, float, tuple, etc.).

There are a ton of different data types out there that you might be interested in sorting. For instance, maybe you want to alphabetize a list of names. Perhaps you have a list of address, and you want to sort them by distance from you.

Whatever data type you choose, find a way to rewrite the existing algorithms to accommodate them. As always, I’ll come up with a solution for my favorite data type, and I’ll share it below in the comments. I recommend you do the same!

A Little Recap

As always, let’s take a look at all of our solutions in one place:

01
02
03
04
05
06
07
08
09
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
my_list = random.sample(range(max), size)
 
def bubble_sort(my_list):
  is_sorted = False
  while not is_sorted:
    is_sorted = True
    for i in range(len(my_list) - 1):
      if my_list[i] > my_list[i + 1]:
        my_list[i], my_list[i + 1] = my_list[i + 1], my_list[i]
        is_sorted = False
 
def insertion_sort(my_list):
  for i in range(1, len(my_list)):
    to_swap = my_list[i]
    j = i - 1
    while j >= 0 and my_list[j] > to_swap:
      my_list[j + 1] = my_list[j]
      j -= 1
    my_list[j + 1] = to_swap
 
def selection_sort(my_list):
  for i in range(len(my_list)):
    min_index = i
    for j in range(i + 1, len(my_list)):
      if my_list[j] < my_list[min_index]:
        min_index = j
    my_list[i], my_list[min_index] = my_list[min_index], my_list[i]
 
def bogosort(my_list):
  is_sorted = False
  while not is_sorted:
    random.shuffle(my_list)
    last_item = my_list[0]
    is_sorted = True
    for item in my_list:
      if last_item > item:
        is_sorted = False
      last_item = item

This time around, I decided to wrap the solutions in functions, so you could snag the code for yourself. Let me know if that’s helpful.

Published on Web Code Geeks with permission by Jeremy Grifski, partner at our WCG program. See the original article here: How to Brute Force Sort a List in Python: Bubble, Insertion, and Selection

Opinions expressed by Web Code Geeks contributors are their own.

Jeremy Grifski

Jeremy is the founder of The Renegade Coder, a software curriculum website launched in 2017. In addition, he is a PhD student with an interest in education and data visualization.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Back to top button