Video: Sorting and Searching Algorithms (Bulgarian)

October 01, 2020
Nikolay Kostov (Nikolay.IT)

Topics covered:

Sorting
- Sorting and classification
Review of the most popular sorting algorithms
- Quick sort
- Merge sort
- Bubble sort
- Bucket sort
Searching
- Linear search
- Binary search
- Interpolation search
Shuffling

Video (in Bulgarian)

Presentation Content

What is a Sorting Algorithm?

Sorting algorithm
- An algorithm that puts elements of a list in a certain order (most common lexicographically)
More formally:
- The output is in some (non-decreasing) order
- The output is a permutation of the input
Efficient sorting is important for
- Producing human-readable output
- Canonicalizing data
- Optimizing the use of other algorithms
Sorting presents many important techniques

Classification

Sorting algorithms are often classified by
- Computational complexity
  - worst, average and best behavior
- Memory usage
- Recursive or non-recursive
- Stability
- Whether or not they are a comparison sort
- General method
  - insertion, exchange (bubble sort and quicksort), selection (heapsort), merging, serial or parallel…

Stability of Sorting

Stable sorting algorithms
- Maintain the relative order of records with equal values
If two items compare as equal, then their relative order will be preserved
- When sorting only part of the data is examined when determining the sort order

Selection sort

Very simple and very inefficient algorithm

Best, worst and average case: n²
Memory: 1 (constant, only for the min element)
Stable: No
Method: Selection

for j = 0 ... n-2
    // find the best element in a[j .. n-1]
    best = j;
    for i = j + 1 ... n -1
        if a[i] < a[best]
          best = i;
    if best is not j
      swap a[j], a[best]

http://en.wikipedia.org/wiki/Selection_sort

Bubble sort

Repeatedly stepping through the list
- Comparing each pair of adjacent items
  - Swap them if they are in the wrong order
- Best case: n, worst and average case: n²
- Memory: 1, Stable: Yes, Method: Exchanging

while swapIsDone
  swapIsDone = false
  for i = 0 ... n - 1
    if a[i-1] > a[i]
      swap a[i] a[i-i]
      swapIsDone = true

http://en.wikipedia.org/wiki/Bubble_sort

Insertion sort

Builds the final sorted array one item at a time
- Best case: n, worst and average case: n²
- Memory: 1, Stable: Yes, Method: Insertion

for i = 1 ... n - 1
  valueToInsert = a[i]
  holePos = i
  while holePos > 0 and valueToInsert < A[holePos - 1]
    a[holePos] = A[holePos - 1] // shift the larger value up
    holePos = holePos - 1       // move the hole position down
  A[holePos] = valueToInsert

http://en.wikipedia.org/wiki/Insertion_sort

Quicksort

First divides a large list into two smaller sub-lists then recursively sort the sub-lists
- Best and average case: n*log(n), worst: n²
- Memory: log(n) stack space
- Stable: Depends
- Method: Partitioning

function quicksort('array')
  if len('array') <= 1:
    return
  select and remove a pivot value from 'array'
  create empty lists 'less' and 'greater'
  for each 'x' in 'array':
    if 'x' <= 'pivot':
      append 'x' to 'less'
    else:
      append 'x' to greater
  return concatenate (quicksort('less'), 'pivot', quicksort('greater'))

http://en.wikipedia.org/wiki/Quicksort

Stable implementation

Merge Sort

Conceptually, a merge sort works as follows
- Divide the unsorted list into n sublists, each containing 1 element (list of 1 element is sorted)
- Repeatedly merge sublists to produce new sublists until there is only 1 sublist remaining
Best, average and worst case: n*log(n)
Memory: Depends; worst case is n
Stable: Yes;
Method: Merging
Highly parallelizable (up to O(log(n)) using the Three Hungarian’s Algorithm
http://en.wikipedia.org/wiki/Merge_sort

Merge Sort Pseudocode

function merge_sort(list m)
    // if list size is 0 (empty) or 1, consider it sorted
    // (using less than or equal prevents infinite recursion
    // for a zero length m)
  if length(m) <= 1
      return m
    // else list size is > 1, so split the list into two sublists
  var list left, right
  var integer middle = length(m) / 2
  for each x in m before middle
       add x to left
  for each x in m after or equal middle
       add x to right
    // recursively call merge_sort() to further split each sublist
    // until sublist size is 1
  left = merge_sort(left)
  right = merge_sort(right)
    // merge the sublists returned from prior calls to merge_sort()
    // and return the resulting merged sublist
  return merge(left, right)

function merge(left, right)
    var list result
    while length(left) > 0 or length(right) > 0
        if length(left) > 0 and length(right) > 0
            if first(left) <= first(right)
                append first(left) to result
                left = rest(left)
            else
                append first(right) to result
                right = rest(right)
        else if length(left) > 0
            append first(left) to result
            left = rest(left)
        else if length(right) > 0
            append first(right) to result
            right = rest(right)
    end while
    return result

Heap

Specialized tree-based data structure that satisfies the heap property:
- Parent nodes are always greater (less) than or equal to the children
  - No implied ordering between siblings or cousins

https://en.wikipedia.org/wiki/Heap_data_structure

Heapsort

Can be divided into two parts
- In the first step, a heap is built out of the data
- A sorted array is created by repeatedly removing the top element from the heap
Best, average and worst case: n*log(n)
Memory: Constant - O(1)
Stable: No
Method: Selection
http://en.wikipedia.org/wiki/Heapsort

Counting sort

Algorithm for sorting a collection of objects according to keys that are small integers
- Or big integers and a map
Not a comparison sort
Average case: n + r
Worst case: n + r
- r is the range of numbers to be sorted
Stable: Yes
Memory: n + r
http://en.wikipedia.org/wiki/Counting_sort

Bucket sort

Partitioning an array into a number of buckets
- Each bucket is then sorted individually
Not a comparison sort
Average case: n + k
- k = the number of buckets
Worst case: n2 * k
Stable: Yes
Memory: n * k
http://en.wikipedia.org/wiki/Bucket_sort

Comparison of Sorting Algorithms

There are hundreds of sorting algorithms
- Some of them are:

Name	Best	Avg	Worst	Memory	Stable
Bubble	n	n²	n²	1	Yes
Insertion	n	n²	n²	1	Yes
Quick	n*log(n)	n*log(n)	n²	log(n)	Depends
Merge	n*log(n)	n*log(n)	n*log(n)	n	Yes
Heap	n*log(n)	n*log(n)	n*log(n)	n	No
Bogo	n	n*n!	n*n!	1	No

Search Algorithm

An algorithm for finding an item with specified properties among a collection of items
Different types of searching algorithms
- For virtual search spaces
  - satisfy specific mathematical equations
  - try to exploit partial knowledge about structure
- For sub-structures of a given structure
  - graph, a string, a finite group
- Search for the max (min) of a function
- etc.

Linear Search

Method for finding a particular value in a list
- Checking every one of the elements
- One at a time in sequence
- Until the desired one is found
Worst and average performance: O(n)

 for each item in the list:
     if that item has the desired value,
         stop the search and return the location.
 return nothing.

Binary Search

Finds the position of a specified value within an ordered data structure
In each step, compare the input with the middle
- The algorithm repeats its action to the left or right sub-structure
Average performance: O(log(n))

Recursive Binary Search

Example: Recursive binary search

function binarySearch(int items[], int key, int from, int to)
  if (to < from):
    // set is empty, so return value showing not found
    return KEY_NOT_FOUND;
  // calculate midpoint to cut set in half
  int middle = midpoint(from, to);
  if (a[middle] > key):
    // key is in lower subset
    return binarySearch(a, key, from, middle - 1);
  else if (a[middle] < key):
    // key is in upper subset
    return binarySearch(a, key, middle + 1, to);
  else:
    // key has been found
    return middle;

Iterative Binary Search

Example: Iterative binary search

int binarySearch(int a[], int key, int from, int to)
  // continue searching while [imin,imax] is not empty
  while (from <= to):
    //calculate the midpoint for roughly equal partition x/
    int middle = midpoint(from, to);
    // determine which subarray to search
    if (a[middle] < key)
      // change from index to search upper subarray
       from = middle + 1;
    else if (a[middle] > key)
      // change to index to search lower subarray
       to = middle - 1
    else
      // key found at index middle
      return middle;
  return KEY_NOT_FOUND;

Interpolation Search

An algorithm for searching for a given key value in an indexed array that has been ordered by the values of the key
- Parallels how humans search through a telephone book
- Calculates where in the remaining search space the sought item might be
  - Binary search always chooses the middle element
Average case: log(log(n)), Worst case: O(n)
http://youtube.com/watch?v=l1ed_bTv7Hw

Interpolation SearchSample Implementation

public int interpolationSearch(int[] sortedArray, int toFind){
  // Returns index of toFind in sortedArray, or -1 if not found
  int low = 0;
  int high = sortedArray.length - 1;
  int mid;
  while(sortedArray[low] <= toFind && sortedArray[high] >= toFind) {
    mid = low + ((toFind - sortedArray[low]) x (high - low)) /
                (sortedArray[high] - sortedArray[low]);
                // out of range   is possible here
    if (sortedArray[mid] < toFind)
      low = mid + 1;
    else if (sortedArray[mid] > toFind)
      high = mid - 1;
    else
      return mid;
  }
  if (sortedArray[low] == toFind) return low;
    else return -1; // Not found
}

Shuffling

A procedure used to randomize the order of items in a collection
- Generating random permutation
http://en.wikipedia.org/wiki/Shuffling

Fisher–Yates shuffle algorithm

public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source)
{
    var array = source.ToArray();
    var n = array.Length;
    for (var i = 0; i < n; i++)
    {
        // Exchange a[i] with random element in a[i..n-1]
        int r = i + RandomProvider.Instance.Next(0, n - i);
        var temp = array[i];
        array[i] = array[r];
        array[r] = temp;
    }
    return array;
}
public static class RandomProvider
{
    private static Random Instance = new Random();
}