Topics covered:

  • Sorting
    • Sorting and classification
  • Review of the most popular sorting algorithms
    • Quick sort
    • Merge sort
    • Bubble sort
    • Bucket sort
  • Searching
    • Linear search
    • Binary search
    • Interpolation search
  • Shuffling

Video (in Bulgarian)

Presentation Content

What is a Sorting Algorithm?

  • Sorting algorithm
    • An algorithm that puts elements of a list in a certain order (most common lexicographically)
  • More formally:
    • The output is in some (non-decreasing) order
    • The output is a permutation of the input
  • Efficient sorting is important for
    • Producing human-readable output
    • Canonicalizing data
    • Optimizing the use of other algorithms
  • Sorting presents many important techniques

Classification

  • Sorting algorithms are often classified by
    • Computational complexity
      • worst, average and best behavior
    • Memory usage
    • Recursive or non-recursive
    • Stability
    • Whether or not they are a comparison sort
    • General method
      • insertion, exchange (bubble sort and quicksort), selection (heapsort), merging, serial or parallel…

Stability of Sorting

  • Stable sorting algorithms
    • Maintain the relative order of records with equal values
  • If two items compare as equal, then their relative order will be preserved
    • When sorting only part of the data is examined when determining the sort order

Selection sort

  • Very simple and very inefficient algorithm

    • Best, worst and average case: n2
    • Memory: 1 (constant, only for the min element)
    • Stable: No
    • Method: Selection
    for j = 0 ... n-2
        // find the best element in a[j .. n-1]
        best = j;
        for i = j + 1 ... n -1
            if a[i] < a[best]
              best = i;
        if best is not j
          swap a[j], a[best]
    

Bubble sort

  • Repeatedly stepping through the list
    • Comparing each pair of adjacent items
      • Swap them if they are in the wrong order
    • Best case: n, worst and average case: n2
    • Memory: 1, Stable: Yes, Method: Exchanging
while swapIsDone
  swapIsDone = false
  for i = 0 ... n - 1
    if a[i-1] > a[i]
      swap a[i] a[i-i]
      swapIsDone = true

Insertion sort

  • Builds the final sorted array one item at a time
    • Best case: n, worst and average case: n2
    • Memory: 1, Stable: Yes, Method: Insertion
for i = 1 ... n - 1
  valueToInsert = a[i]
  holePos = i
  while holePos > 0 and valueToInsert < A[holePos - 1]
    a[holePos] = A[holePos - 1] // shift the larger value up
    holePos = holePos - 1       // move the hole position down
  A[holePos] = valueToInsert

Quicksort

  • First divides a large list into two smaller sub-lists then recursively sort the sub-lists
    • Best and average case: n*log(n), worst: n2
    • Memory: log(n) stack space
    • Stable: Depends
    • Method: Partitioning
function quicksort('array')
  if len('array') <= 1:
    return
  select and remove a pivot value from 'array'
  create empty lists 'less' and 'greater'
  for each 'x' in 'array':
    if 'x' <= 'pivot':
      append 'x' to 'less'
    else:
      append 'x' to greater
  return concatenate (quicksort('less'), 'pivot', quicksort('greater'))

Merge Sort

  • Conceptually, a merge sort works as follows
    • Divide the unsorted list into n sublists, each containing 1 element (list of 1 element is sorted)
    • Repeatedly merge sublists to produce new sublists until there is only 1 sublist remaining
  • Best, average and worst case: n*log(n)
  • Memory: Depends; worst case is n
  • Stable: Yes;
  • Method: Merging
  • Highly parallelizable (up to O(log(n)) using the Three Hungarian’s Algorithm
  • http://en.wikipedia.org/wiki/Merge_sort

Merge Sort Pseudocode

function merge_sort(list m)
    // if list size is 0 (empty) or 1, consider it sorted
    // (using less than or equal prevents infinite recursion
    // for a zero length m)
  if length(m) <= 1
      return m
    // else list size is > 1, so split the list into two sublists
  var list left, right
  var integer middle = length(m) / 2
  for each x in m before middle
       add x to left
  for each x in m after or equal middle
       add x to right
    // recursively call merge_sort() to further split each sublist
    // until sublist size is 1
  left = merge_sort(left)
  right = merge_sort(right)
    // merge the sublists returned from prior calls to merge_sort()
    // and return the resulting merged sublist
  return merge(left, right)
function merge(left, right)
    var list result
    while length(left) > 0 or length(right) > 0
        if length(left) > 0 and length(right) > 0
            if first(left) <= first(right)
                append first(left) to result
                left = rest(left)
            else
                append first(right) to result
                right = rest(right)
        else if length(left) > 0
            append first(left) to result
            left = rest(left)
        else if length(right) > 0
            append first(right) to result
            right = rest(right)
    end while
    return result

Heap

  • Specialized tree-based data structure that satisfies the heap property:
    • Parent nodes are always greater (less) than or equal to the children
      • No implied ordering between siblings or cousins

Heapsort

  • Can be divided into two parts
    • In the first step, a heap is built out of the data
    • A sorted array is created by repeatedly removing the top element from the heap
  • Best, average and worst case: n*log(n)
  • Memory: Constant - O(1)
  • Stable: No
  • Method: Selection
  • http://en.wikipedia.org/wiki/Heapsort

Counting sort

  • Algorithm for sorting a collection of objects according to keys that are small integers
    • Or big integers and a map
  • Not a comparison sort
  • Average case: n + r
  • Worst case: n + r
    • r is the range of numbers to be sorted
  • Stable: Yes
  • Memory: n + r
  • http://en.wikipedia.org/wiki/Counting_sort

Bucket sort

  • Partitioning an array into a number of buckets
    • Each bucket is then sorted individually
  • Not a comparison sort
  • Average case: n + k
    • k = the number of buckets
  • Worst case: n2 * k
  • Stable: Yes
  • Memory: n * k
  • http://en.wikipedia.org/wiki/Bucket_sort

Comparison of Sorting Algorithms

  • There are hundreds of sorting algorithms
    • Some of them are:
Name Best Avg Worst Memory Stable
Bubble n n2 n2 1 Yes
Insertion n n2 n2 1 Yes
Quick n*log(n) n*log(n) n2 log(n) Depends
Merge n*log(n) n*log(n) n*log(n) n Yes
Heap n*log(n) n*log(n) n*log(n) n No
Bogo n n*n! n*n! 1 No

Search Algorithm

  • An algorithm for finding an item with specified properties among a collection of items
  • Different types of searching algorithms
    • For virtual search spaces
      • satisfy specific mathematical equations
      • try to exploit partial knowledge about structure
    • For sub-structures of a given structure
      • graph, a string, a finite group
    • Search for the max (min) of a function
    • etc.

Linear Search

  • Method for finding a particular value in a list
    • Checking every one of the elements
    • One at a time in sequence
    • Until the desired one is found
  • Worst and average performance: O(n)
 for each item in the list:
     if that item has the desired value,
         stop the search and return the location.
 return nothing.

Binary Search

  • Finds the position of a specified value within an ordered data structure
  • In each step, compare the input with the middle
    • The algorithm repeats its action to the left or right sub-structure
  • Average performance: O(log(n))

Recursive Binary Search

  • Example: Recursive binary search
function binarySearch(int items[], int key, int from, int to)
  if (to < from):
    // set is empty, so return value showing not found
    return KEY_NOT_FOUND;
  // calculate midpoint to cut set in half
  int middle = midpoint(from, to);
  if (a[middle] > key):
    // key is in lower subset
    return binarySearch(a, key, from, middle - 1);
  else if (a[middle] < key):
    // key is in upper subset
    return binarySearch(a, key, middle + 1, to);
  else:
    // key has been found
    return middle;

Iterative Binary Search

  • Example: Iterative binary search
int binarySearch(int a[], int key, int from, int to)
  // continue searching while [imin,imax] is not empty
  while (from <= to):
    //calculate the midpoint for roughly equal partition x/
    int middle = midpoint(from, to);
    // determine which subarray to search
    if (a[middle] < key)
      // change from index to search upper subarray
       from = middle + 1;
    else if (a[middle] > key)
      // change to index to search lower subarray
       to = middle - 1
    else
      // key found at index middle
      return middle;
  return KEY_NOT_FOUND;

Interpolation Search

  • An algorithm for searching for a given key value in an indexed array that has been ordered by the values of the key
    • Parallels how humans search through a telephone book
    • Calculates where in the remaining search space the sought item might be
      • Binary search always chooses the middle element
  • Average case: log(log(n)), Worst case: O(n)
  • http://youtube.com/watch?v=l1ed_bTv7Hw

Interpolation SearchSample Implementation

public int interpolationSearch(int[] sortedArray, int toFind){
  // Returns index of toFind in sortedArray, or -1 if not found
  int low = 0;
  int high = sortedArray.length - 1;
  int mid;
  while(sortedArray[low] <= toFind && sortedArray[high] >= toFind) {
    mid = low + ((toFind - sortedArray[low]) x (high - low)) /
                (sortedArray[high] - sortedArray[low]);
                // out of range   is possible here
    if (sortedArray[mid] < toFind)
      low = mid + 1;
    else if (sortedArray[mid] > toFind)
      high = mid - 1;
    else
      return mid;
  }
  if (sortedArray[low] == toFind) return low;
    else return -1; // Not found
}

Shuffling

Fisher–Yates shuffle algorithm

public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source)
{
    var array = source.ToArray();
    var n = array.Length;
    for (var i = 0; i < n; i++)
    {
        // Exchange a[i] with random element in a[i..n-1]
        int r = i + RandomProvider.Instance.Next(0, n - i);
        var temp = array[i];
        array[i] = array[r];
        array[r] = temp;
    }
    return array;
}
public static class RandomProvider
{
    private static Random Instance = new Random();
}