Operation Algorithm: Quicksort in C and Scheme
Learning Hadoop and Spark?
I've scoured the internet and I think this free Big Data course from UC San Diego is a great way to jump in. It's hosted on Coursera, so you can audit the course for free.
This is the first post in my quest to be a better programmer. Quicksort is one of the ‘standard’ sorting algorithms. Its fairly easy to implement, and normally runs fairly quickly. I’ve implemented quicksort in both C and Scheme, although I think my Scheme implementation is more of a pseudo-quicksort due to a reluctance to break from the functional style encouraged by Scheme syntax.
The ‘Partition’ algorithm is at the core of a quicksort implementation, and this partitions an array into groups of ‘less than x’ and ‘more than x’, with x itself being placed inbetween. The final position of x is the value returned by the partition function. In my simple implementation, the choice of x remains constant (either the first or last element). Randomizing this is the main way to increase the worst case speed of this algorithm.
After partitioning the initial input, quicksort recursivly calls itself on the ‘less-than’ and ‘more-than’ groups returned from partition. The result is a chain of recursive calls which continues until quicksort is trying to sort a single number.
Check out Wikipedia to get a more in-depth understanding of quicksort.
Quicksort is good because it sorts ‘in-place’. In other words it doesn’t use a secondary array to sort the elements into. This can save a lot of memory, especially if you are sorting a list of complicated data structures which are a few KB each.
My C Implementation
My Scheme Implementation
I think its worth noting that the scheme implementation doesn’t exactly sort ‘in-place’. Instead it recursivly builds up secondary lists then joins them together at the end as the return value instead of moving the elements around in a single list. However the amount of memory used should still be in the order of the number of elements in the input array unlike other sorting algorithms (counting-sort, and bucket-sort come to mind) which use secondary arrays which do not form part of the solution.
If you're looking to learn more about basic CS algorithms like Quicksort, I suggest checking out the book Introduction to Algorithms on Amazon