Monday, January 24, 2011

Lesson 12. Sorting

Lesson 12. Sorting

29 14 11 50 3 99

If you are asked to sort a list of items like these from smallest to largest, one way to sort them that comes to mind may be to go through pairs of items, and if either item in the pair is in the wrong order, swap them and repeat going through the list until they are in the correct order.  This is called bubble sort because with each time going through the list, an item “bubbles” toward its destination.  Another possibility is to make a new list, and go through the old list and each time through the list remove the old list’s smallest item and insert it into the new list, which is called insertion sort.  However, the one problem with these sorting algorithms is that they have a worst-case runtime of O(n^2).  In this tutorial, you will learn three new sorting algorithms that make sorting much faster, aiming for a general runtime of O(nlog2n).

The first algorithm we will be learning about is mergesort.  In this algorithm, we break up the list until every item constitutes its own single-element list.  Then, we merge the items together into increasingly large lists that are then sorted until eventually we have the whole list restored and sorted.  I’m going to just show you how it’s done with the picture before showing you the code because mergesort is easiest to understand with a picture.



void quicksort(int* list, int first, int last)
{
     if(first != last)//1
     {
           int pivot = list[first];
           int leftIter = first + 1;
           int rightIter = last;
           while(leftIter < rightIter)//2
           {
                if(list[leftIter] > pivot)//3
                     while(list[rightIter] > pivot && rightIter > leftIter)
                           rightIter--;
                if(rightIter > leftIter && list[rightIter] < pivot && list[leftIter] > pivot)//4
                {
                     int temp = list[rightIter];
                     list[rightIter] = list[leftIter];
                     list[leftIter] = temp;
                }
                leftIter++;
           }
           int lastLess = rightIter;//5
           while(list[lastLess] > pivot && lastLess > first)
           {
                lastLess--;
           }
           if(lastLess > first)//6
           {
                list[first] = list[lastLess];
                list[lastLess] = pivot;
           }
           if(lastLess == first)//7
                quicksort(list, first + 1, last);
           else if(lastLess == last)//8
                quicksort(list, first, last - 1);
           else//9
           {
                quicksort(list, first, lastLess - 1);
                quicksort(list, lastLess + 1, last);
           }
     }
}

1. If first does not equal last (we are not working with a one-item sublist in this recursion), that means that we have to call recursions of the function until we have one-item sublists.

2. We make a temp array to store all of the items as we sort them.  These items will then be returned to the list in the order in which they are stored in temp.

3. We now iterate through the two sublists to combine them into one list.

4. If we’re not at the end of either sublist, we add the smaller item between the items indexed by searchA and searchB and then increment its searchA or searchB iterator.

5. If we are at the end of the right sublist, we add the item in the left sublist that is indexed by searchA.

6. If we are at the end of the left sublist, we add the item in the right sublist that is indexed by searchB.

7. We now have the list sorted in temp, so we transfer the items from temp to the list in the order given in temp.




Another sorting algorithm that generally has a runtime of O(nlog2n) is quicksort.

In quicksort, we once again use recursion to split the list into two sublists.  Here’s how we do it:

We have the list {25, 70, 46, 11, 33, 0, 85, 4}                 

With each recursion, we have one element of the list, a pivot (for simplicity we will always use the first item as the pivot, although we can select different pivots, which can be more efficient).  We will arrange the elements so that there will be two groups of elements: one that contains all elements less than the pivot and one that contains all elements greater than the pivot.

To do this, we traverse the list starting at the first item after the pivot (the second item) and we also traverse the list backwards starting at the last item.  With the forward iterator, we keep going until we find an item greater than the pivot.  If we can find an item less than the pivot with the backward iterator without passing the forward iterator, we swap the items.

We keep iterating until the forward and backward iterators meet, and then we swap the pivot with the last item less than the pivot.  Then, we call recursions on the sublist of elements less than the pivot and the sublist of elements greater than the pivot.

This is the quicksort implementation:

void quicksort(int* list, int first, int last)
{
     if(first != last)//1
     {
           int pivot = list[first];
           int leftIter = first + 1;
           int rightIter = last;
           while(leftIter < rightIter)//2
           {
                if(list[leftIter] > pivot)//3
                     while(list[rightIter] > pivot && rightIter > leftIter)
                           rightIter--;
                if(rightIter > leftIter && list[rightIter] < pivot && list[leftIter] > pivot)//4
                {
                     int temp = list[rightIter];
                     list[rightIter] = list[leftIter];
                     list[leftIter] = temp;
                }
                leftIter++;
           }
           int lastLess = rightIter;//5
           while(list[lastLess] > pivot && lastLess > first)
           {
                lastLess--;
           }
           if(lastLess > first)//6
           {
                list[first] = list[lastLess];
                list[lastLess] = pivot;
           }
           if(lastLess == first)//7
                quicksort(list, first + 1, last);
           else if(lastLess == last)//8
                quicksort(list, first, last - 1);
           else//9
           {
                quicksort(list, first, lastLess - 1);
                quicksort(list, lastLess + 1, last);
           }
     }
}
1. We only go through all of the code in the function if the index of the first item of the sublist and the index of the last item of the sublist are different because if they are the same index, that means that there is only one item in the sublist, so that single-item sublist would be sorted to begin with.

2. We iterate through the list with the left iterator searching for items greater than the pivot and with the right iterator searching for items less than the pivot.  The iterations continue until the left and right iterators meet or pass each other.

3. If we find an item with the left iterator that has a value greater than the pivot, we then have the right iterator search for items less than the pivot, iterating until it finds an item less than the pivot or meets the left iterator.

4. If after we finish iterating the right iterator is still to the right of the left iterator, the right iterator has found a value less than the pivot, and the left iterator has found a value greater than the pivot, we swap the elements indexed by the two iterators and then continue the left iterator’s iteration to the right.

5. After the two iterators meet or pass each other, we have an iterator lastLess iterate from the right to the left until it either reaches the first element of the sublist (the pivot) or it finds an element less than the pivot (the last item in the sublist that is less than the pivot).

6. If lastLess is greater than first (which would mean that there is an item in the sublist less than the pivot), then the first element of the sublist (the pivot) is swapped with the element indexed by lastLess.

7. If lastLess indexes the first element of the sublist, then there are no items in the list smaller than the pivot, so we only need to call quicksort on all of the other elements in the list, which make the sublist of items greater than the pivot.

8. If lastLess indexes the last item in the sublist, then there are no items larger than the pivot, so we only need to call quicksort on all of the other elements in the list, making a sublist of items less than the pivot.

9. Otherwise, the list has items greater than and less than the pivot.  Because of this, we call quicksort on the sublist of items greater than the pivot as well as on the sublist of items less than the pivot.
Once again, we get a runtime of O(nlog2n), which is fast for a sorting algorithm.  However, quicksort can have a runtime of O(n^2) if we are sorting lists that are either already sorted in the right order or sorted in reverse order.  Because of this, if you know going into writing the program that you will be dealing with a lot of mostly-sorted elements, you may want to pick a different sorting algorithm or do some modifications of quicksort to make it faster.  Generally, though, quicksort is a reliable sorting algorithm, which is why a lot of programming languages with built-in sorting functions (for example PHP) use quicksort.
 
I am going to give you two pictures to represent quicksort.  One of them is the traditional representation of quicksort that is shown in textbooks.  The other picture is a detailed representation that is hard to read but shows the related arrays, iterators, and swaps, so it is more step-by-step.  However, the best way to look at quicksort is to trace it yourself, so if you are still confused after my detailed representation of quicksort, try following along with the code and doing it with a pencil and paper, which is as step-by-step as it gets.




Finally, we have heapsort, which uses those heaps covered in the last lesson.

Heapsort is where writing this tutorial gets easy!  All I need to do is use the maxheap code from the last lesson and then write one little function.  As you will see, I don’t even need to change it to be written for minheaps!

Here’s how heapsort works:
  • We first set up an int size, which is the size of the heap we are making out of the given list.
  • We then use heapify on the list and size to give the list a maxheap property.
  • Finally, we remove elements from the heap one by one with removeMax.
Here is the code for heapsort:


void heapsort(int* list, int size)
{
     int* heapArray = new int[size+1];//1
     for(int i = 0; i < size; i++)
           heapArray[i+1] = list[i];
     heapify(heapArray, size);//2
     for(int i = size - 1; i >= 0; i--)//3
     {
           int j = i + 1;
           list[i] = removeMax(heapArray, j);
     }
}

1. Since most arrays use zero-indexing, unless we implement a zero-indexing version of heaps, we need to make a new array that is one element larger than the array we want to sort and then copy the elements we are sorting to this new one-indexed array.  The new array’s size is one item larger than that of the array we are sorting because in C++ arrays are zero-indexed, so the 0th space still has to be in the array even though it won’t be used.

2. We now use heapify on the new array.

3. We then loop through the array we want to sort and use removeMax with each iteration.  We are iterating through the array backwards because the heap is a maxheap; if we used a minheap we would loop through forwards since with each call to the remove function we would be removing the next smallest item in the heap. 
Heapsort has a runtime of O(nlog2n).

Below is a picture of heapsort being used on the sample heap that the heap functions were demonstrated on in the last lesson.  Since the other heap functions were shown in the last lesson, the steps that make up each call to heapify and removeMax are not shown.


Finally, one side note I would like to mention is that often times in the real world, groups of items come somewhat sorted.  This is an important point to remember in sorting algorithms since sorting algorithms can be optimized to take advantage of parts of the list that are pre-sorted.