All You Need to Know About Sorting Algorithms

In this article, we'll be comparing famous sorting algorithms and in the end, you can find a code-wise implementation of the discussed algorithms.

Foreword

Sorting algorithms are some procedural methods that we use to rearrange the elements in an ordered list of items based on a criterion. A developer must study this subject extensively and regularly because sorting algorithms are:

  • Always present in any data-driven projects (practically speaking, any worthwhile subject)

  • A common source of performance degradation, if not implemented well

  • Very effective at optimizing the performance of a logical operation, if chosen and implemented well enough

  • Easy to reimplement once enough expertise is achieved

  • Infamous for being presented as hard interview questions

Overview

There are many known sorting algorithms but we'll be only covering some of the most well-known and practical ones. Here is a comparative list:

Algorithm NameBest CaseAverage CaseWorst CaseMemoryStable
Quick Sortnlog(n)nlog(n)nlog(n)No
Merge Sortnlog(n)nlog(n)nlog(n)nYes
Heap Sortnlog(n)nlog(n)nlog(n)1No
Tree Sortnlog(n)nlog(n)nlog(n)nYes
Selection Sort1No
Bubble Sortn1Yes
Tim Sortnnlog(n)nlog(n)nYes

(Source: GeeksForGeeks)

Before explaining each one, let's review two basic concepts presented in the table:

  • Each case is the time complexity of the sorting algorithm, measured using

    • Big-O notation for the worst-case scenario

    • Theta Notation for an average scenario

    • Omega Notation for the best-case scenario

  • A sorting algorithm is said to be stable if it retains the relative order between two elements that are deemed to be equal according to the sorting criteria. Unstable sorting algorithms do not care and might change their places with one another.

Introduction

Quick Sort

The first step in a quick sort algorithm is choosing a pivot element, and there are many ways to go about it, depending on how one chooses the pivot element. For example, the pivot element could be:

  • The last element

  • The first element

  • The median

  • Random element

Regardless of how to choose the pivot element, the next step is to divide the array into elements higher than the pivot and elements lower than the pivot and to repeat this process until all elements are examined.

Merge Sort

In merge sort, one will divide the subject array into two subarrays and sort each subarray in a recursive manner, then finally put together all the parts in a sorted manner by merging the smaller arrays.

Tip

Note that the merge function performs well when number of subarrays is a power of 2.

(Source)

Heap Sort

For Heap sort, you first have to form a heap from your array, whether it's a maximum or a minimum, doesn't matter.

Short Background

Heap is a basic data structure (like an array or int) and they represent a complete binary tree, but they come strictly in two forms:

  1. Minimum Heap: Each parent node is less than both of its children, and the top node is the minimum among all the values

  2. Maximum Heap: Each parent node is more than both of its children, and the top node is the maximum among all the values.

What's the difference with an array?

That's a good question, and the answer is Time Complexity. In certain use cases, it's much faster to use a heap than to use an array. Here's a short comparison between the two:

OperationT-Complexity for ArrayT-Complexity for Heap
Find Min/MaxO(n)O(1)
Add/Remove ElementO(1) ~ O(n)*O(log(n))
Remove Min/MaxO(n)O(log(n))

* If the necessary space is already allocated for the new element, then it's O(1), if not, a new array should be created which would be O(n).

Sorting Process

  1. Create a maximum heap from the array

  2. Pop the max element

  3. Reevaluate Heap

  4. Repeat (2-3) until the heap is empty

For example, the Mergesort algorithm is exceedingly fast but requires a lot of space to do the operations. On the other side, Bubble Sort is exceedingly slow but requires the minimum space.

Tree Sort

The sorting process involves creating a Binary Search Tree.

Short Background

A tree data structure is called binary when each node has at most 2 children. And for that to be a binary search tree it has to have 3 more properties:

  1. The left child of a node must have a value lesser than the value of the node.

  2. The right child of a node must have a value greater than the value of the node.

  3. Each left and right subtree must be binary search trees themselves.

Sorting Process

  1. Create a binary search tree from the array.

  2. Perform an in-order traversal on the tree.

Tip

If we know the median and set it as the root of BST, we reduce the search space by half in each step.

Selection Sort

In this sorting algorithm, one iteratively searches for the minimum (or maximum) value and moves it to the correct position in the array by swapping elements.

Bubble Sort

This is the simplest and most widely known sorting algorithm. For this technique, one will iteratively swap the position of two adjacent elements that are in the wrong order.

Tim Sort

This is an efficient and very popular sorting algorithm that is also picked up by some programming languages as the native sorting method like the sorted method in Python. The algorithm implements two sorting methods:

  • Insertion Sort

  • Merge Sort

The algorithms steps are as follows:

  1. Divide the unsorted array into subarrays (called run)

    Because merging is most efficient when the number of runs is equal to, or slightly less than, a power of two, and notably less efficient when the number of runs is slightly more than a power of two, Timsort chooses minimum run size to try to ensure the former condition

    (Source)

    The minimum run size is chosen as a number between 32 and 64.

  2. Use Insertion Sort to sort runs.

  3. Use Merge Sort to merge the sorted arrays.

Code Implementations

You can refer to this repository for the coding implementation of the formerly described sorting algorithms*

* Note: With the exception of Tim Sort. Because the native sorted method in Python already supports that and I didn't implement it for ruby just yet.