Spring 2023

Analysis of Algorithms

  • In the previous week, we looked into
    • the difficulty of merge-sort (upper bound)
    • the difficulty of sorting (lower bound)
  • But there are many more questions we can ask:
    • How long will an implementation take on a particular computer?
    • How does merge-sort compare to other \(O(N\log N)\) algorithms?
    • How does merge-sort compare to algorithms that are fast on average but slow in the worst case?
    • How does it compare to algorithms that are not based on comparisons?

Steps of Analysis

  • Implement the algorithm completely
  • Determine the time required for each basic operation
  • Identify unknown quantities that can be used to describe the frequency of execution of the basic operations.
  • Develop a realistic model for the input to the program.
  • Analyze the unknown quantities, assuming the modeled input.
  • Calculate the total running time by multiplying the time by the frequency for each operation, then adding all the products.

Implementation

  • program: a careful implementation of an algorithm
  • one algorithm may correspond to many programs
  • a program gives an object to study
  • a program provides useful experimental data
  • do not over-emphasize efficiency too early
    • analysis will result in better efficiency

Estimation of operation times

  • Can be done quite easily in most cases
  • profilers help a lot
  • We will be mostly interested in machine-independent implementations in this course
    • so we don’t actually care much about this step

Identify frequency of execution

  • Study branching structure
  • Compute execution frequencies as unknowns
  • Concentrate on high frequency/cost items

Develop input model

  • Input size determines the unknown quantities computed in the previous step
  • We usually refer to the size of the input as \(N\)
  • By “model” we mean the characteristics of the input
    • A classic example for sorting is: “a randomly ordered array of \(N\) distinct numbers”
    • An alternative is: “a random array of integers between 1 and 1000”
    • The algorithm's behaviour and performance may change based on the input model
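The two input models above can be sketched as generators. This is a minimal illustration, not part of the original slides: the class name `InputModels` and both method names are hypothetical, and the first model is assumed to mean a uniformly random permutation of the values 1..N.

```java
import java.util.Random;

class InputModels {
    // Model 1: a randomly ordered array of the distinct values 1..n.
    // A Fisher-Yates shuffle makes every permutation equally likely.
    static int[] randomPermutation(int n, Random rng) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i + 1;
        for (int i = n - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);          // uniform in [0, i]
            int t = a[i]; a[i] = a[j]; a[j] = t;
        }
        return a;
    }

    // Model 2: n independent integers drawn uniformly from [1, 1000].
    // Duplicates become likely as n grows, which can change how a sort behaves.
    static int[] randomBounded(int n, Random rng) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = 1 + rng.nextInt(1000);
        return a;
    }
}
```

Running the same sorting program on both generators is one way to see how the input model affects the measured cost.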

Analyze the quantities based on the input

  • For average-case analysis,
    • compute average frequencies
    • multiply by operation costs
    • sum the products
  • For worst-case analysis,
    • compute maximum frequencies
    • multiply by operation costs
    • sum the products

Approximations

  • This scheme can provide very nice results in most cases
  • However, the details are daunting
  • So, we usually seek approximate models that can be used to estimate costs
    • Computing the cost of each operation can be a really tedious task
    • Instead, we focus only on the inner-loop operations
    • For example, for sorting we only count compares

Average Case Analysis

  • Our focus in this course
    • Formulate a reasonable input model
    • Analyze the expected time based on inputs from this model
  • Effective for two reasons:
    • straightforward models of randomness are often extremely accurate
    • we can often inject randomness into the problem instance

Random Models

  • How to compute the mean?

Distributional Approach

  • Let \(\Pi_N\) be the number of possible inputs of size \(N\)
  • Let \(\Pi_{Nk}\) be the number of inputs of size \(N\) that cause the algorithm to have cost \(k\)
  • Therefore, \(\Pi_N=\sum_k{\Pi_{Nk}}\)
    • Probability that the cost is \(k\): \[\Pi_{Nk}/\Pi_N\]
    • Expected cost of the algorithm: \[\frac{1}{\Pi_N}\sum_k{k\Pi_{Nk}}\]

Cumulative Approach

  • Let \(\Sigma_N\) be the total (or cumulated) cost of the algorithm on all inputs of size \(N\)
    • That is, \(\Sigma_N = \sum_k{k\Pi_{Nk}}\)
    • Then, the average cost is \(\Sigma_N / \Pi_N\)
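Both approaches can be checked by brute force for small \(N\). The sketch below enumerates every permutation of 1..N, tallies how many inputs have each cost \(k\), and computes the mean both ways. The cost used here (the number of compares made by insertion sort) is only an illustrative choice, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

class CostDistribution {
    // Cost of one input: compares made by insertion sort (illustrative choice).
    static int cost(int[] input) {
        int[] a = input.clone();
        int compares = 0;
        for (int i = 1; i < a.length; i++) {
            for (int j = i; j > 0; j--) {
                compares++;
                if (a[j - 1] <= a[j]) break;
                int t = a[j]; a[j] = a[j - 1]; a[j - 1] = t;
            }
        }
        return compares;
    }

    // Maps each cost k to the number of permutations of 1..n with that cost.
    static Map<Integer, Long> histogram(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i + 1;
        Map<Integer, Long> pi = new TreeMap<>();
        permute(a, 0, pi);
        return pi;
    }

    // Generates all permutations by swapping each candidate into position k.
    private static void permute(int[] a, int k, Map<Integer, Long> pi) {
        if (k == a.length) { pi.merge(cost(a), 1L, Long::sum); return; }
        for (int i = k; i < a.length; i++) {
            int t = a[k]; a[k] = a[i]; a[i] = t;
            permute(a, k + 1, pi);
            t = a[k]; a[k] = a[i]; a[i] = t;
        }
    }

    // Number of inputs: the sum over k of the per-cost counts.
    static long inputs(Map<Integer, Long> pi) {
        long sum = 0;
        for (long count : pi.values()) sum += count;
        return sum;
    }

    // Total cost over all inputs: the sum over k of k times the count.
    static long totalCost(Map<Integer, Long> pi) {
        long sum = 0;
        for (Map.Entry<Integer, Long> e : pi.entrySet()) sum += e.getKey() * e.getValue();
        return sum;
    }

    // Distributional approach: sum over k of k times the probability of cost k.
    static double distributionalMean(Map<Integer, Long> pi) {
        double n = inputs(pi), mean = 0;
        for (Map.Entry<Integer, Long> e : pi.entrySet()) mean += e.getKey() * (e.getValue() / n);
        return mean;
    }

    // Cumulative approach: total cost divided by the number of inputs.
    static double cumulativeMean(Map<Integer, Long> pi) {
        return (double) totalCost(pi) / inputs(pi);
    }
}
```

For \(N=3\) the six permutations split into two inputs of cost 2 and four of cost 3, so the total cost is 16 and both approaches give an average of \(16/6\).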

An Analysis of Quicksort

Quicksort: a Java Implementation

private void quicksort(int[] a, int lo, int hi) // lo: start, hi: end
{
  if (hi <= lo) return;   // stop when array is of size 1 or less
  int i = lo-1, j = hi;   // i: index from left, j: index from right
  int t, v = a[hi];       // t: temp var, v: pivot (last element)
  while (true)            // repeat until the two scans meet
  {
    // skip all elements less than the pivot
    while (a[++i] < v) ;
    // skip all elements greater than the pivot
    while (v < a[--j]) if (j == lo) break;
    if (i >= j) break;    // stop when all processed
    t = a[i]; a[i] = a[j]; a[j] = t;  // swap the two out-of-place elements
  }
  t = a[i]; a[i] = a[hi]; a[hi] = t;  // put the pivot in place
  quicksort(a, lo, i-1);   // recurse left
  quicksort(a, i+1, hi);   // recurse right
}

Quicksort: Analysis

  • Identify resource requirements
    • while (a[++i] < v) ; translates into
      LOOP INC I,1      # increment i
           CMP V,A(I)   # compare v with A(i)
           BL LOOP      # branch if less
  • 4 unit memory access operations
  • The other while loop is similar

Identify Frequencies

  • A – the number of partitioning stages

  • B – the number of exchanges

  • C – the number of compares

  • On a typical computer, the total running time is \[4C + 11B + 35A\]

  • The exact coefficients depend on the compiler and the computer architecture

  • The coefficient of C is significantly lower compared to mergesort
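The three frequencies can be measured by instrumenting the program. The sketch below wraps the lecture's quicksort with counters for A, B and C and evaluates \(4C + 11B + 35A\). The class name `CountingQuicksort` and its helpers are hypothetical, and counting the final pivot placement as an exchange is an assumption made here, not something the slides specify.

```java
class CountingQuicksort {
    long stages;     // A: number of partitioning stages
    long exchanges;  // B: number of exchanges (assumption: pivot placement included)
    long compares;   // C: number of compares between a key and the pivot

    void sort(int[] a) { quicksort(a, 0, a.length - 1); }

    // Every key comparison goes through here so that it is counted.
    private boolean less(int x, int y) { compares++; return x < y; }

    // Every exchange goes through here so that it is counted.
    private void swap(int[] a, int i, int j) {
        exchanges++;
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    private void quicksort(int[] a, int lo, int hi) {
        if (hi <= lo) return;
        stages++;
        int i = lo - 1, j = hi;
        int v = a[hi];                               // pivot: last element
        while (true) {
            while (less(a[++i], v)) ;                // scan from the left
            while (less(v, a[--j])) if (j == lo) break; // scan from the right
            if (i >= j) break;
            swap(a, i, j);
        }
        swap(a, i, hi);                              // put the pivot in place
        quicksort(a, lo, i - 1);
        quicksort(a, i + 1, hi);
    }

    // Running-time estimate using the coefficients quoted on the slide.
    long modelCost() { return 4 * compares + 11 * exchanges + 35 * stages; }
}
```

Sorting the array {3, 1, 2} with this sketch takes one partitioning stage, two exchanges and four compares, for a model cost of \(4\cdot 4 + 11\cdot 2 + 35\cdot 1 = 73\).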

Quicksort Analysis

Theorem: Quicksort uses, on average,

  • \((N - 1)/2\) partitioning stages,
  • \(2(N + 1) (H_{N+1} - 3/2) \approx 2N\ln N - 1.846N\) compares, and
  • \((N + 1) (H_{N+1} - 3)/3 + 1 \approx .333N\ln N - .865N\) exchanges

to sort an array of \(N\) randomly ordered distinct elements,

where \(H_N = \sum_{1\leq k\leq N}{1/k}\) is the \(N\)-th harmonic number.

Proof: Full proof in the book, if you are interested.
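To get a feel for the compare count in the theorem, the harmonic numbers can be computed directly and compared with the logarithmic approximation. This is a small sketch with hypothetical names; it uses the standard approximation \(H_N \approx \ln N + \gamma\), where \(\gamma \approx 0.5772\) is the Euler–Mascheroni constant.

```java
class Harmonic {
    static final double EULER_GAMMA = 0.5772156649015329;

    // H_n = 1 + 1/2 + ... + 1/n, summed from the smallest term upward.
    static double h(int n) {
        double sum = 0.0;
        for (int k = n; k >= 1; k--) sum += 1.0 / k;
        return sum;
    }

    // Approximation H_n ~ ln n + gamma.
    static double hApprox(int n) { return Math.log(n) + EULER_GAMMA; }

    // Exact expression for the average number of compares from the theorem.
    static double averageCompares(int n) { return 2.0 * (n + 1) * (h(n + 1) - 1.5); }

    // Approximate expression for the same quantity from the theorem.
    static double approxCompares(int n) { return 2.0 * n * Math.log(n) - 1.846 * n; }
}
```

For \(N = 1000\), the exact and approximate compare counts computed this way differ by well under one percent.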

Quicksort Analysis

\[C_N = N+1 + \frac{1}{N}\sum_{1\leq j\leq N}{(C_{j-1}+C_{N-j})}\]

  • \(N+1\) compares in the first partitioning stage
  • \(\displaystyle\sum_{1\leq j\leq N}{(C_{j-1}+C_{N-j})}\) for the two sub-arrays when the pivot ends up at the \(j^{th}\) position
    • multiply by \(1/N\), the probability of each position \(j\)

Quicksort Analysis

\[C_N = N+1 + \frac{1}{N}\sum_{1\leq j\leq N}{(C_{j-1}+C_{N-j})}\]

  • Note that the two sums inside \(\displaystyle\sum_{1\leq j\leq N}{(C_{j-1}+C_{N-j})}\) are equal: the second lists the same terms as the first, in reverse order
    • \(\displaystyle\sum_{1\leq j\leq N}{C_{j-1}} = \sum_{1\leq j\leq N}{C_{N-j}}\)
  • So, \(\displaystyle\sum_{1\leq j\leq N}{(C_{j-1}+C_{N-j})} = 2\sum_{1\leq j\leq N}{C_{j-1}}\)

Quicksort Analysis

\[C_N = N+1 + \frac{2}{N}\sum_{1\leq j\leq N}{C_{j-1}}\]

  • Multiply by \(N\):

\[NC_N = N^2+N + 2\sum_{1\leq j\leq N}{C_{j-1}}\]

  • and write the same equation for \(N-1\):

\[(N-1)C_{N-1} = (N-1)^2+N-1 + 2\sum_{1\leq j\leq N-1}{C_{j-1}}\]

Quicksort Analysis

  • Subtract the equation for \(N-1\) from the equation for \(N\):

\[NC_N - (N-1)C_{N-1}= 2N+2C_{N-1}\]

  • rearrange:

\[NC_N = (N+1)C_{N-1} + 2N\]

  • divide both sides by \(N(N+1)\):

\[\frac{C_N}{N+1} = \frac{C_{N-1}}{N} + \frac{2}{N+1}\]

Quicksort Analysis

  • Iterating gives:

\[\frac{C_N}{N+1} = \frac{C_1}{2} + 2\sum_{3\leq k\leq N+1}{1/k}\]

  • \(C_1=0\) and \(\displaystyle\sum_{3\leq k\leq N+1}{1/k} = H_{N+1}-3/2\), so

\[C_N = 2(N+1)\left(H_{N+1}-3/2\right) \approx 2N\ln N - 1.846N\]

  • using the approximation \(H_N \approx \ln N + \gamma\), where \(\gamma \approx 0.5772\) is the Euler–Mascheroni constant.
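The telescoping step can be sanity-checked numerically. The sketch below iterates the simplified recurrence \(NC_N = (N+1)C_{N-1} + 2N\) starting from \(C_1 = 0\) and compares the result with the closed form \(2(N+1)(H_{N+1}-3/2)\). The class and method names are hypothetical, and applying the simplified recurrence from \(N = 2\) onward is an assumption of this sketch.

```java
class CompareRecurrence {
    // Returns c[0..maxN] with c[1] = 0 and, for n >= 2,
    // c[n] = ((n + 1) * c[n - 1] + 2n) / n. The entry c[0] is unused.
    static double[] iterate(int maxN) {
        double[] c = new double[maxN + 1];
        for (int n = 2; n <= maxN; n++) {
            c[n] = ((n + 1) * c[n - 1] + 2.0 * n) / n;
        }
        return c;
    }

    // Closed form 2(n + 1)(H_{n+1} - 3/2), with H computed by direct summation.
    static double closedForm(int n) {
        double h = 0.0;
        for (int k = n + 1; k >= 1; k--) h += 1.0 / k;
        return 2.0 * (n + 1) * (h - 1.5);
    }
}
```

For example, the recurrence gives \(C_2 = 2\) and \(C_3 = 14/3\), which agree with the closed form.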

How to Improve Quicksort?

  • Small subarrays: A small array can be sorted with simpler techniques much faster
    • An array of size 2 requires one compare and one potential exchange
    • Use insertion sort for small arrays
    • At what subarray size should the quicksort recursion switch to insertion sort?
  • Median-of-three: take a small sample and use the median as a pivot
  • Radix-exchange sort: Consider the keys to be bit strings and partition bit by bit
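The first two improvements can be combined in one sketch. The code below is an illustration under stated assumptions, not the lecture's implementation: the class name and the `CUTOFF` value of 10 are hypothetical, and the best cutoff in practice has to be found by analysis or experiment. Radix-exchange sort is not shown.

```java
class ImprovedQuicksort {
    static final int CUTOFF = 10; // hypothetical threshold for small subarrays

    static void sort(int[] a) { quicksort(a, 0, a.length - 1); }

    private static void quicksort(int[] a, int lo, int hi) {
        // Small subarrays: hand over to insertion sort.
        if (hi - lo + 1 <= CUTOFF) { insertionSort(a, lo, hi); return; }

        // Median-of-three: order a[lo], a[mid], a[hi], then move the median to a[hi].
        int mid = lo + (hi - lo) / 2;
        if (a[mid] < a[lo]) swap(a, lo, mid);
        if (a[hi] < a[lo]) swap(a, lo, hi);
        if (a[hi] < a[mid]) swap(a, mid, hi);
        swap(a, mid, hi);

        // Partition as in the lecture's program, with the pivot in a[hi].
        int i = lo - 1, j = hi;
        int v = a[hi];
        while (true) {
            while (a[++i] < v) ;
            while (v < a[--j]) if (j == lo) break;
            if (i >= j) break;
            swap(a, i, j);
        }
        swap(a, i, hi); // put the pivot in place
        quicksort(a, lo, i - 1);
        quicksort(a, i + 1, hi);
    }

    // Sorts a[lo..hi] by inserting each element into the sorted prefix.
    private static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            for (int j = i; j > lo && a[j] < a[j - 1]; j--) {
                swap(a, j, j - 1);
            }
        }
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}
```

Because the cutoff is at least 3, the three sampled positions are always distinct when median-of-three runs.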