by Onel Harrison
Stability in Sorting Algorithms — A Treatment of Equality
Algorithms are at the heart of computer science. Algorithms used for sorting are some of the most fundamental, useful, and consequently, ubiquitous.
Algorithm — a finite set of unambiguous steps for solving a specific problem.
We constantly and often unconsciously sort and rely on the order of grouped objects. For instance, we rank tasks on a list according to priority. We stack books on shelves by their height. We sort rows in a spreadsheet or database, or rely on the alphabetical order of words in a dictionary. Sometimes, we even perceive a certain kind of beauty in ordered arrangements.
As programmers, knowing how we sort is important because it affects what a sorted arrangement might look like. Not all sorts order objects in the same way! Because of this, the results of sorting operations differ based on the algorithms used. If this goes unacknowledged, we might surprise ourselves or the people who use our software.
The stability of sorting algorithms is one of the distinguishing properties among them. It deals with how the algorithm treats comparable items with equal sort keys.
Sort key — A key used to determine the ordering of items in a collection, e.g. age, height, position in the alphabet, etc.
A stable sorting algorithm maintains the relative order of the items with equal sort keys. An unstable sorting algorithm does not. In other words, when a collection is sorted with a stable sorting algorithm, items with the same sort keys preserve their order after the collection is sorted.
An Example, Code, and a Demo
The image above illustrates the effect of a stable sort. On the left, the data was sorted alphabetically by Name. After sorting the data by Grade, you can see that the alphabetical order of the names was maintained for each row with the same Grade.
With an unstable sort, there is no guarantee the alphabetical ordering is maintained as shown in the image above.
You don’t always need a stable sort
Knowing whether or not the sort you use is stable is particularly important. Especially in situations when your data already has some order to it that you would like to maintain when you sort it by another sort key. For example, you have rows in a spreadsheet containing student data that is, by default, sorted by name. You would like also to sort it by grades while maintaining the sorted order of the names.
On the other hand, the stability of the sort doesn’t matter when the sort keys of the objects in a collection are the objects themselves — an array of integers or strings, for example — because we can’t tell the difference among the duplicated keys.
// $5 bucks if you can correctly tell which 4 in the sorted// array was the first 4 when the array was unsorted.
var numbers = [5, 4, 3, 4, 9];numbers.sort(); // [3, 4, 4, 5, 9]
// A one second trip around the world, courtesy of the Flash, to// whomever correctly tells me which 'harry' in the sorted array was// the second 'harry' in the unsorted array.
var names = ['harry', 'barry', 'harry', 'cisco'];names.sort(); // ['barry', 'cisco', 'harry', 'harry']
Sorts are everywhere — know your sorts
Here are a few common sorting algorithms and their stability:
- Insertion Sort — Stable
- Selection Sort — Unstable
- Bubble Sort — Stable
- Merge Sort — Stable
- Shell Sort — Unstable
- Timsort — Stable
See Wikipedia for a more exhaustive list.
It’s demo time ??
This demo shows the effect of using a stable (insertion sort) and unstable sorting (selection sort) algorithm to sort a small table of data. I had a bit of fun and practically reverse engineered React while building this. Have a look at the source.
If you are thirsty for more knowledge about the stability of other sorting algorithms, Wikipedia has a good comparison table and additional information on well known sorting algorithms.
Until next time, peace.