Spring 2023
The course syllabus can be found on my website.
You can also download all lecture slides from the same web page.
There will be one midterm and one final exam.
Definition Given a function \(f(N)\),
\(O(f(N))\) denotes the set of all \(g(N)\) such that \(|g(N)/f(N)|\) is bounded from above as \(N \rightarrow \infty\)
\(\Omega(f(N))\) denotes the set of all \(g(N)\) such that \(|g(N)/f(N)|\) is bounded from below by a (strictly) positive number as \(N \rightarrow \infty\)
\(\Theta(f(N))\) denotes the set of all \(g(N)\) such that \(|g(N)/f(N)|\) is bounded from both above and below as \(N \rightarrow \infty\)
Knuth (1976)
\(O(f(N))\) denotes an upper bound
\(\Omega(f(N))\) denotes a lower bound
\(\Theta(f(N))\) denotes matching upper and lower bounds
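As a worked illustration of the three definitions, consider a hypothetical function \(g(N) = 3N^2 + 5N\), chosen only for this example, and compare it first against \(f(N) = N^2\) and then against \(N^3\):

```latex
\begin{align*}
g(N) &= 3N^2 + 5N, \qquad f(N) = N^2 \\
\left|\frac{g(N)}{f(N)}\right| &= 3 + \frac{5}{N}, \qquad
  3 < 3 + \frac{5}{N} \le 8 \quad \text{for all } N \ge 1 \\
&\Rightarrow\; g(N) \in O(N^2), \quad g(N) \in \Omega(N^2), \quad g(N) \in \Theta(N^2) \\
\left|\frac{g(N)}{N^3}\right| &= \frac{3}{N} + \frac{5}{N^2} \to 0
  \;\Rightarrow\; g(N) \in O(N^3) \text{ but } g(N) \notin \Omega(N^3)
\end{align*}
```

The ratio against \(N^3\) is bounded from above but not from below by a positive number, so \(N^3\) is an upper bound that is not tight.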
Typically used to
When you hide the constants, derivations become simpler:
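// b, c: auxiliary arrays; INFTY: sentinel larger than any key (fields declared elsewhere)
private void mergesort(int[] a, int lo, int hi) {
    if (hi <= lo) return;                 // zero or one element: already sorted
    int mid = lo + (hi - lo) / 2;
    mergesort(a, lo, mid);                // sort the left half
    mergesort(a, mid + 1, hi);            // sort the right half
    for (int k = lo; k <= mid; k++)       // copy the left half into b
        b[k-lo] = a[k];
    for (int k = mid+1; k <= hi; k++)     // copy the right half into c
        c[k-mid-1] = a[k];
    b[mid-lo+1] = INFTY;                  // sentinels end each half
    c[hi - mid] = INFTY;
    int i = 0, j = 0;
    for (int k = lo; k <= hi; k++)        // merge back into a: one compare per element
        if (c[j] < b[i]) a[k] = c[j++];
        else a[k] = b[i++];
}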
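The method above relies on fields that the slide does not show. A minimal self-contained sketch, assuming `b` and `c` are scratch arrays sized for half the input plus a sentinel and `INFTY` is `Integer.MAX_VALUE`, might look as follows. The class name `MergesortSketch` and the `sort` wrapper are hypothetical names introduced for this example.

```java
// Self-contained sketch of the sentinel-based mergesort from the slide.
// Assumptions (not in the original): class name, sort() wrapper,
// static scratch arrays, and INFTY = Integer.MAX_VALUE.
class MergesortSketch {
    private static final int INFTY = Integer.MAX_VALUE; // every key must be smaller
    private static int[] b, c;                          // scratch space for the two halves

    static void sort(int[] a) {
        // Each half holds at most ceil(N/2) keys plus one sentinel slot.
        b = new int[a.length / 2 + 2];
        c = new int[a.length / 2 + 2];
        mergesort(a, 0, a.length - 1);
    }

    private static void mergesort(int[] a, int lo, int hi) {
        if (hi <= lo) return;
        int mid = lo + (hi - lo) / 2;
        mergesort(a, lo, mid);
        mergesort(a, mid + 1, hi);
        for (int k = lo; k <= mid; k++) b[k - lo] = a[k];
        for (int k = mid + 1; k <= hi; k++) c[k - mid - 1] = a[k];
        b[mid - lo + 1] = INFTY;
        c[hi - mid] = INFTY;
        int i = 0, j = 0;
        for (int k = lo; k <= hi; k++) {
            // Ties take from the left half, which keeps the sort stable.
            if (c[j] < b[i]) a[k] = c[j++];
            else             a[k] = b[i++];
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 9, 1, 5, 6, 0, 3};
        sort(a);
        System.out.println(java.util.Arrays.toString(a)); // [0, 1, 2, 3, 5, 5, 6, 9]
    }
}
```

The sentinels remove the need to test whether either half is exhausted, at the cost of reserving one key value (`INFTY`) that must not appear in the input.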
If we were to split the array into three parts, sort each and then do a three-way merge, would it make a difference?
Theorem (Mergesort Compares) Mergesort uses \(N\lg N + O(N)\) compares to sort an array of N elements.
Proof. Let \(C_N\) be the number of compares for \(N\) elements. Sorting the first half of the array requires \(C_{N/2}\) compares, and so does sorting the second half. The merge makes \(N\) more compares. Hence,
\(C_N = C_{N/2} + C_{N/2} + N\)
Assume \(N = 2^n\). Then,
\(C_{2^n} = 2C_{2^{n-1}}+2^n\)
Divide both sides by \(2^n\):
\(\frac{C_{2^n}}{2^n} = \frac{C_{2^{n-1}}}{2^{n-1}}+1 = \frac{C_{2^{n-2}}}{2^{n-2}}+2 = \frac{C_{2^{n-3}}}{2^{n-3}}+3 = \cdots = \frac{C_{2^0}}{2^0}+n = n\), since \(C_{2^0} = C_1 = 0\) (a single element needs no compares).
Therefore,
\(C_{2^n} = n2^n\), that is, \(C_N = N\lg N\).
We will later look into the general case (where \(N\neq 2^n\)).
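The closed form for powers of two can be checked numerically. The sketch below evaluates the recurrence \(C_N = 2C_{N/2} + N\) directly, assuming the base case \(C_1 = 0\), and compares it with \(N\lg N\). The class and method names are hypothetical.

```java
// Evaluates C_N = 2*C_{N/2} + N with C_1 = 0 for N = 2^n
// and compares the result with the closed form N lg N = n * 2^n.
class MergesortRecurrence {
    static long compares(long n) {
        if (n <= 1) return 0;              // base case: one element, no compares
        return 2 * compares(n / 2) + n;    // two halves plus N compares for the merge
    }

    public static void main(String[] args) {
        for (int e = 0; e <= 20; e++) {
            long n = 1L << e;              // N = 2^e
            long closedForm = n * e;       // N lg N
            System.out.println(n + "\t" + compares(n) + "\t" + closedForm);
        }
    }
}
```

For every power of two in the loop, the second and third columns agree, as the derivation predicts.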
Ignoring details for now, we can assume that a reasonable implementation of mergesort has a running time within a constant factor of \(N\lg N\).
From a theoretical point of view, mergesort provides an upper bound on sorting:
There exists an algorithm that can sort any N-element file in time proportional to \(N\log N\).
Theorem (Complexity of Sorting) Every compare-based sorting program uses at least \(\lceil\lg N!\rceil > N\lg N - N/(\ln 2)\) compares for some input.
Proof. We will not do a full proof, but the idea is as follows. Consider all permutations of the given array; there are \(N!\) different arrangements. With each comparison, the best you can do is eliminate half of the remaining arrangements (the ones that conflict with the comparison result). Hence you need at least \(\lceil\lg N!\rceil\) comparisons to reach a single remaining arrangement, which is the sorted array. Using Stirling’s approximation, one can show that \(\lceil\lg N!\rceil > N\lg N - N/(\ln 2)\).
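To get a feel for how close the two sides of the inequality are, the following sketch computes \(\lg N!\) as a sum of logarithms (to avoid overflowing on \(N!\) itself) and prints it next to \(N\lg N - N/\ln 2\). This is a numerical illustration using floating-point arithmetic, not a proof. The class and method names are hypothetical.

```java
// Compares ceil(lg N!) with the Stirling-style bound N lg N - N / ln 2.
class SortingLowerBound {
    // lg(N!) = (ln 2 + ln 3 + ... + ln N) / ln 2
    static double lgFactorial(int n) {
        double sum = 0.0;
        for (int k = 2; k <= n; k++) sum += Math.log(k);
        return sum / Math.log(2);
    }

    // N lg N - N / ln 2
    static double stirlingBound(int n) {
        return n * (Math.log(n) / Math.log(2)) - n / Math.log(2);
    }

    public static void main(String[] args) {
        int[] sizes = {10, 100, 1000, 10000};
        for (int n : sizes) {
            System.out.println(n + "\t" + Math.ceil(lgFactorial(n)) + "\t" + stirlingBound(n));
        }
    }
}
```

The gap between the two columns grows only logarithmically in \(N\), which is why both are usually summarized as \(N\lg N\) to leading order.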
All compare-based sorting algorithms require time proportional to \(N\log N\) to sort some N-element input file.
Suppose that it is known that each of the items in an N-item array has one of two distinct values. Give a sorting method that takes time proportional to N.
Answer the previous exercise for three distinct values.