Sorting and Efficiency
Everyone has sorted a deck of cards or a set of numbers at some point. These tasks seem relatively simple to us, but it may be surprising how complicated sorting can be for a computer. Analyzing different sorting algorithms gives a better sense of how many ways sorting can be done, and is a great starting point for learning about algorithm efficiency.
One of the first sorts that most people learn is insertion sort. The insertion sort algorithm is as follows:
Start at the second item in the list.
Compare it to the item that precedes it.
If the item is less than the item that precedes it, swap the two.
Continue swapping until the item that precedes it is less than the selected item, or the item reaches the beginning of the list.
Repeat this process for each remaining item in the list.
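The steps above can be sketched in Python roughly as follows (the function name and structure are my own, not from any particular library):

```python
def insertion_sort(items):
    """Sort a list in place by inserting each item into the
    already-sorted portion of the list to its left."""
    for i in range(1, len(items)):  # start at the second item
        j = i
        # swap leftward while the preceding item is larger
        while j > 0 and items[j] < items[j - 1]:
            items[j], items[j - 1] = items[j - 1], items[j]
            j -= 1
    return items
```

Each item "bubbles" left until it finds its place or hits the front of the list.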
An algorithm's worst-case run time can be described in big-Oh notation. If a sorting algorithm's run time grows proportionally to the size of the list (n), the run time is O(n). In insertion sort's case, the worst-case run time is O(n^2), which occurs when the list is sorted in reverse order. When the list is reversed, each of the n items must be compared against the items before it, giving roughly n*n = n^2 comparisons.
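To see this quadratic growth concretely, here is a small sketch (my own instrumented re-implementation of insertion sort, not part of the original post) that counts swaps on reversed lists:

```python
def count_swaps(items):
    """Insertion sort, instrumented to count how many swaps it performs."""
    items = list(items)
    swaps = 0
    for i in range(1, len(items)):
        j = i
        while j > 0 and items[j] < items[j - 1]:
            items[j], items[j - 1] = items[j - 1], items[j]
            swaps += 1
            j -= 1
    return swaps

# A reversed list of n items needs n*(n-1)/2 swaps:
print(count_swaps(range(10, 0, -1)))  # 45  == 10*9/2
print(count_swaps(range(20, 0, -1)))  # 190 == 20*19/2
```

Doubling the list size roughly quadruples the work, which is the hallmark of O(n^2) growth.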
There are several ways to describe the run time of an algorithm. Big-Oh describes the worst-case run time, big-Omega describes the best-case run time, and big-Theta describes a run time bounded both above and below. The insertion sort algorithm described previously has a best-case run time of Ω(n) on an already sorted list.
Another sort with a better run time than insertion sort is merge sort. Merge sort is a recursive algorithm that repeatedly divides the list in half, sorts each half separately, then merges the two sorted halves together. The base case of this algorithm is a list of size one, since a single item is already sorted. The process can be visualized by the following gif:
http://en.wikipedia.org/wiki/File:Merge-sort-example-300px.gif
The worst-case run time of this algorithm is O(n log n), and the best case is also O(n log n). This is an example of an algorithm with a tight-bound run time (meaning the best and worst cases are of the same order), so its run time can also be described as Θ(n log n).
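The split-sort-merge process described above can be sketched in Python like this (a minimal version for illustration, not an optimized implementation):

```python
def merge_sort(items):
    """Return a new sorted list: split in half, recurse, then merge."""
    if len(items) <= 1:  # base case: a single item is already sorted
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # merge the two sorted halves into one sorted list
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

The list is halved log n times, and each level of merging touches all n items, which is where the n log n run time comes from.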
A great resource for viewing various sorting algorithms is http://www.sorting-algorithms.com/
In the real world, data is often partially sorted, and some sorting algorithms can take advantage of this. One such sort is Timsort. Timsort is Python's built-in sort, with a worst-case run time of O(n log n) and a best case of O(n). Timsort combines insertion sort and merge sort, and as a result has a better best-case run time than merge sort.
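In practice this means there is rarely a reason to hand-write a sort in Python; the built-in `sorted` function and `list.sort` method both use Timsort:

```python
data = [5, 1, 4, 2, 3]

# sorted() returns a new list, leaving the original untouched
print(sorted(data))      # [1, 2, 3, 4, 5]

# list.sort() sorts in place; reverse=True gives descending order
data.sort(reverse=True)
print(data)              # [5, 4, 3, 2, 1]
```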
There are many different types of sorting algorithms, some better than others, and some better suited to certain types of data. Analyzing the run times of various algorithms helps one make informed decisions about which algorithm to use on a given set of data. More importantly, analyzing sorting algorithms is a great way to introduce students to efficiency.