## Topics covered:

• Sorting
• Sorting and classification
• Review of the most popular sorting algorithms
• Quick sort
• Merge sort
• Bubble sort
• Bucket sort
• Searching
• Linear search
• Binary search
• Interpolation search
• Shuffling

## Presentation Content

### What is a Sorting Algorithm?

• Sorting algorithm
• An algorithm that puts elements of a list in a certain order (most common lexicographically)
• More formally:
• The output is in some (non-decreasing) order
• The output is a permutation of the input
• Efficient sorting is important for
• Canonicalizing data
• Optimizing the use of other algorithms
• Sorting presents many important techniques

### Classification

• Sorting algorithms are often classified by
• Computational complexity
• worst, average and best behavior
• Memory usage
• Recursive or non-recursive
• Stability
• Whether or not they are a comparison sort
• General method
• insertion, exchange (bubble sort and quicksort), selection (heapsort), merging, serial or parallel…

### Stability of Sorting

• Stable sorting algorithms
• Maintain the relative order of records with equal values
• If two items compare as equal, then their relative order will be preserved
• When sorting only part of the data is examined when determining the sort order

### Selection sort

• Very simple and very inefficient algorithm

• Best, worst and average case: `n``2`
• Memory: `1` (constant, only for the min element)
• Stable: No
• Method: Selection
``````for j = 0 ... n-2
// find the best element in a[j .. n-1]
best = j;
for i = j + 1 ... n -1
if a[i] < a[best]
best = i;
if best is not j
swap a[j], a[best]
``````

### Bubble sort

• Repeatedly stepping through the list
• Comparing each pair of adjacent items
• Swap them if they are in the wrong order
• Best case: `n`, worst and average case: `n``2`
• Memory: `1`, Stable: Yes, Method: Exchanging
``````while swapIsDone
swapIsDone = false
for i = 0 ... n - 1
if a[i-1] > a[i]
swap a[i] a[i-i]
swapIsDone = true
``````

### Insertion sort

• Builds the final sorted array one item at a time
• Best case: `n`, worst and average case: `n``2`
• Memory: `1`, Stable: Yes, Method: Insertion
``````for i = 1 ... n - 1
valueToInsert = a[i]
holePos = i
while holePos > 0 and valueToInsert < A[holePos - 1]
a[holePos] = A[holePos - 1] // shift the larger value up
holePos = holePos - 1       // move the hole position down
A[holePos] = valueToInsert
``````

### Quicksort

• First divides a large list into two smaller sub-lists then recursively sort the sub-lists
• Best and average case: `n*log(n)`, worst: `n``2`
• Memory: `log(n)` stack space
• Stable: Depends
• Method: Partitioning
``````function quicksort('array')
if len('array') <= 1:
return
select and remove a pivot value from 'array'
create empty lists 'less' and 'greater'
for each 'x' in 'array':
if 'x' <= 'pivot':
append 'x' to 'less'
else:
append 'x' to greater
return concatenate (quicksort('less'), 'pivot', quicksort('greater'))
``````

### Merge Sort

• Conceptually, a merge sort works as follows
• Divide the unsorted list into `n` sublists, each containing `1` element (list of 1 element is sorted)
• Repeatedly merge sublists to produce new sublists until there is only 1 sublist remaining
• Best, average and worst case: `n*log(n)`
• Memory: Depends; worst case is `n`
• Stable: Yes;
• Method: Merging
• Highly parallelizable (up to `O(log(n)`) using the Three Hungarian’s Algorithm
• http://en.wikipedia.org/wiki/Merge_sort

### Merge Sort Pseudocode

``````function merge_sort(list m)
// if list size is 0 (empty) or 1, consider it sorted
// (using less than or equal prevents infinite recursion
// for a zero length m)
if length(m) <= 1
return m
// else list size is > 1, so split the list into two sublists
var list left, right
var integer middle = length(m) / 2
for each x in m before middle
for each x in m after or equal middle
// recursively call merge_sort() to further split each sublist
// until sublist size is 1
left = merge_sort(left)
right = merge_sort(right)
// merge the sublists returned from prior calls to merge_sort()
// and return the resulting merged sublist
return merge(left, right)
``````
``````function merge(left, right)
var list result
while length(left) > 0 or length(right) > 0
if length(left) > 0 and length(right) > 0
if first(left) <= first(right)
append first(left) to result
left = rest(left)
else
append first(right) to result
right = rest(right)
else if length(left) > 0
append first(left) to result
left = rest(left)
else if length(right) > 0
append first(right) to result
right = rest(right)
end while
return result
``````

### Heap

• Specialized tree-based data structure that satisfies the heap property:
• Parent nodes are always greater (less) than or equal to the children
• No implied ordering between siblings or cousins

### Heapsort

• Can be divided into two parts
• In the first step, a heap is built out of the data
• A sorted array is created by repeatedly removing the top element from the heap
• Best, average and worst case: `n*log(n)`
• Memory: Constant - `O(1)`
• Stable: No
• Method: Selection
• http://en.wikipedia.org/wiki/Heapsort

### Counting sort

• Algorithm for sorting a collection of objects according to keys that are small integers
• Or big integers and a `map`
• Not a comparison sort
• Average case: `n + r`
• Worst case: `n + r`
• `r` is the range of numbers to be sorted
• Stable: Yes
• Memory: `n + r`
• http://en.wikipedia.org/wiki/Counting_sort

### Bucket sort

• Partitioning an array into a number of buckets
• Each bucket is then sorted individually
• Not a comparison sort
• Average case: `n + k`
• `k` = the number of buckets
• Worst case: `n2 * k`
• Stable: Yes
• Memory: `n * k`
• http://en.wikipedia.org/wiki/Bucket_sort

### Comparison of Sorting Algorithms

• There are hundreds of sorting algorithms
• Some of them are:
Name Best Avg Worst Memory Stable
Bubble n n2 n2 1 Yes
Insertion n n2 n2 1 Yes
Quick n*log(n) n*log(n) n2 log(n) Depends
Merge n*log(n) n*log(n) n*log(n) n Yes
Heap n*log(n) n*log(n) n*log(n) n No
Bogo n n*n! n*n! 1 No

### Search Algorithm

• An algorithm for finding an item with specified properties among a collection of items
• Different types of searching algorithms
• For virtual search spaces
• satisfy specific mathematical equations
• try to exploit partial knowledge about structure
• For sub-structures of a given structure
• graph, a string, a finite group
• Search for the max (min) of a function
• etc.

### Linear Search

• Method for finding a particular value in a list
• Checking every one of the elements
• One at a time in sequence
• Until the desired one is found
• Worst and average performance: `O(n)`
`````` for each item in the list:
if that item has the desired value,
stop the search and return the location.
return nothing.
``````

### Binary Search

• Finds the position of a specified value within an ordered data structure
• In each step, compare the input with the middle
• The algorithm repeats its action to the left or right sub-structure
• Average performance: `O(log(n))`

### Recursive Binary Search

• Example: Recursive binary search
``````function binarySearch(int items[], int key, int from, int to)
if (to < from):
return KEY_NOT_FOUND;
// calculate midpoint to cut set in half
int middle = midpoint(from, to);
if (a[middle] > key):
// key is in lower subset
return binarySearch(a, key, from, middle - 1);
else if (a[middle] < key):
// key is in upper subset
return binarySearch(a, key, middle + 1, to);
else:
// key has been found
return middle;
``````

### Iterative Binary Search

• Example: Iterative binary search
``````int binarySearch(int a[], int key, int from, int to)
// continue searching while [imin,imax] is not empty
while (from <= to):
//calculate the midpoint for roughly equal partition x/
int middle = midpoint(from, to);
// determine which subarray to search
if (a[middle] < key)
// change from index to search upper subarray
from = middle + 1;
else if (a[middle] > key)
// change to index to search lower subarray
to = middle - 1
else
// key found at index middle
return middle;
return KEY_NOT_FOUND;
``````

### Interpolation Search

• An algorithm for searching for a given key value in an indexed array that has been ordered by the values of the key
• Parallels how humans search through a telephone book
• Calculates where in the remaining search space the sought item might be
• Binary search always chooses the middle element
• Average case: `log(log(n))`, Worst case: `O(n)`

### Interpolation SearchSample Implementation

``````public int interpolationSearch(int[] sortedArray, int toFind){
// Returns index of toFind in sortedArray, or -1 if not found
int low = 0;
int high = sortedArray.length - 1;
int mid;
while(sortedArray[low] <= toFind && sortedArray[high] >= toFind) {
mid = low + ((toFind - sortedArray[low]) x (high - low)) /
(sortedArray[high] - sortedArray[low]);
// out of range   is possible here
if (sortedArray[mid] < toFind)
low = mid + 1;
else if (sortedArray[mid] > toFind)
high = mid - 1;
else
return mid;
}
if (sortedArray[low] == toFind) return low;
}
``````

### Fisher–Yates shuffle algorithm

``````public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source)
{
var array = source.ToArray();
var n = array.Length;
for (var i = 0; i < n; i++)
{
// Exchange a[i] with random element in a[i..n-1]
int r = i + RandomProvider.Instance.Next(0, n - i);
var temp = array[i];
array[i] = array[r];
array[r] = temp;
}
return array;
}
public static class RandomProvider
{
private static Random Instance = new Random();
}
``````