Sept 28, 2022
Base Case: $T(1) \leq c \cdot 1$
$$c_0 \leq c$$
True for any $c > c_0$
Assume: $T(\frac{n}{2}) \leq c \frac{n}{2} \log\left(\frac{n}{2}\right)$
Show: $T(n) \leq c n \log\left(n\right)$
$$2\cdot T(\frac{n}{2}) + c_1 + c_2 n \leq c n \log(n)$$
By the assumption and transitivity, showing the following inequality suffices:
$$2 c \frac{n}{2} \log\left(\frac{n}{2}\right) + c_1 + c_2 n \leq c n \log(n)$$
$$c n \log(n) - c n \log(2) + c_1 + c_2 n \leq c n \log(n)$$
$$c_1 + c_2 n \leq c n \log(2)$$
$$\frac{c_1}{n \log(2)} + \frac{c_2}{\log(2)} \leq c$$
True for any $n_0 \geq \frac{c_1}{\log(2)}$ and $c > \frac{c_2}{\log(2)}+1$
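The constants above can be sanity-checked numerically. A minimal Scala sketch, assuming hypothetical constants $c_0 = 1$, $c_1 = 5$, $c_2 = 3$, and a generous $c = 10$ (any $c$ large enough to also cover the small cases works); `log` is the natural log, matching the $\log(2)$ factor in the derivation:

```scala
// Hypothetical constants for illustration only (not fixed by the proof)
val c0 = 1.0
val c1 = 5.0
val c2 = 3.0
val c  = 10.0  // satisfies c > c2/log(2) + 1 with room to spare

// T(n) = 2 T(n/2) + c1 + c2 n, for n a power of two
def T(n: Int): Double =
  if (n <= 1) c0 else 2 * T(n / 2) + c1 + c2 * n

// Check T(n) <= c * n * ln(n) for n = 2, 4, ..., 4096
val holds = (1 to 12).forall { k =>
  val n = 1 << k
  T(n) <= c * n * math.log(n)
}
```

This is a spot check, not a proof; the induction above is what guarantees the bound for all sufficiently large $n$.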
All of the "work" is in the combine step.
Can we put the work in the divide step?
Idea 1: Partition the data on the median value.
Idea 2: Partition the data in-place.
def idealizedQuickSort(arr: Array[Int], from: Int, until: Int): Unit =
{
  // assumes a swap(arr, i, j) helper that exchanges two elements
  if(until - from <= 1){ return }
  val pivot = ???
  var low = from
  var high = until - 1
  while(low < high){
    while(arr(low) <= pivot && low < high){ low += 1 }
    if(low < high){
      while(arr(high) > pivot && low < high){ high -= 1 }
      swap(arr, low, high)
    }
  }
  idealizedQuickSort(arr, from = from, until = low)
  idealizedQuickSort(arr, from = low, until = until)
}
If we can obtain a pivot in $O(1)$, what's the complexity?
$$T_{quicksort}(n) = \begin{cases} \Theta(1) & \textbf{if } n = 1\\ 2 \cdot T(\frac{n}{2}) + \Theta(n) + 0 & \textbf{otherwise} \end{cases}$$
Contrast with MergeSort: $$T_{mergesort}(n) = \begin{cases} \Theta(1) & \textbf{if } n = 1\\ 2 \cdot T(\frac{n}{2}) + \Theta(1) + \Theta(n) & \textbf{otherwise} \end{cases}$$
Problem: Naively, finding the median value of an unsorted collection (e.g., by sorting it first) takes $O(n\log(n))$
(We'll talk about heaps later)
Idea: If we pick a value at random,
on average half the values will be lower.
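The random-pivot idea can be made concrete. A runnable sketch of in-place quicksort with a random pivot (the helper names and the particular partition scheme here are my own choices, not the lecture's):

```scala
import scala.util.Random

// Exchange two elements of the array in place.
def swap(arr: Array[Int], i: Int, j: Int): Unit = {
  val t = arr(i); arr(i) = arr(j); arr(j) = t
}

def quickSort(arr: Array[Int], from: Int, until: Int): Unit = {
  if (until - from <= 1) return
  // Move a randomly chosen pivot to the front, then partition around it.
  val p = from + Random.nextInt(until - from)
  swap(arr, from, p)
  val pivot = arr(from)
  var low = from + 1
  var high = until - 1
  while (low <= high) {
    if (arr(low) <= pivot) low += 1
    else { swap(arr, low, high); high -= 1 }
  }
  swap(arr, from, high)           // place the pivot between the partitions
  quickSort(arr, from, high)      // elements <= pivot
  quickSort(arr, high + 1, until) // elements >  pivot
}
```

Because the pivot itself is excluded from both recursive calls, each call strictly shrinks the range, so the recursion terminates even on bad pivot draws.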
What's the worst-case runtime?
What if we always pick the worst pivot?
[8, 7, 6, 5, 4, 3, 2, 1]
[7, 6, 5, 4, 3, 2, 1], 8, []
[6, 5, 4, 3, 2, 1], 7, [], 8
[5, 4, 3, 2, 1], 6, [], 7, 8
...
$$T_{quicksort}(n) \in O(n^2)$$
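The arithmetic behind the $O(n^2)$ bound: if every pivot is the maximum, each call partitions into sizes $(n-1, 0)$, so the total partitioning work is $n + (n-1) + \dots + 1 = \frac{n(n+1)}{2}$. A one-line check (counting one unit of work per element touched, a simplifying assumption):

```scala
// Total partitioning work when every pivot is the worst choice:
// n + (n-1) + ... + 1 = n(n+1)/2, which is Θ(n^2).
def worstCaseWork(n: Int): Int = (1 to n).sum
```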
Is the worst case runtime representative?
No! (it'll almost always be faster)
Is there something we can say about the runtime?
Let's say we pick the $X$th smallest element as the pivot.
What's the recursive runtime for $T(n)$?
$$\begin{cases} T(0) + T(n-1) + \Theta(n) & \textbf{if } X = 1\\ T(1) + T(n-2) + \Theta(n) & \textbf{if } X = 2\\ T(2) + T(n-3) + \Theta(n) & \textbf{if } X = 3\\ \vdots\\ T(n-2) + T(1) + \Theta(n) & \textbf{if } X = n-1\\ T(n-1) + T(0) + \Theta(n) & \textbf{if } X = n\\ \end{cases}$$
How likely are we to pick $X = k$ for any specific $k$?
$P[X = k] = \frac{1}{n}$
... a brief aside...
If I roll d6 (a 6-sided die 🎲) $k$ times,
what is the average over all possible outcomes?
If I roll d6 (a 6-sided die 🎲) $1$ time...
Roll | Probability $P_i$ | Value $X_i$ | Contribution $P_i \cdot X_i$ |
---|---|---|---|
⚀ | $\frac{1}{6}$ | 1 | $\frac{1}{6}$ |
⚁ | $\frac{1}{6}$ | 2 | $\frac{2}{6}$ |
⚂ | $\frac{1}{6}$ | 3 | $\frac{3}{6}$ |
⚃ | $\frac{1}{6}$ | 4 | $\frac{4}{6}$ |
⚄ | $\frac{1}{6}$ | 5 | $\frac{5}{6}$ |
⚅ | $\frac{1}{6}$ | 6 | $\frac{6}{6}$ |
Summing the contributions: $\frac{1+2+3+4+5+6}{6} = \frac{21}{6} = 3.5$
If $X$ is a random variable representing the outcome of the roll, we call this the expectation of $X$, or $E[X]$
$$E[X] = \sum_{i} P_i \cdot X_i$$
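The formula for one d6 roll, computed directly (each outcome has probability $\frac{1}{6}$):

```scala
// E[X] = sum over i of P_i * X_i, with P_i = 1/6 and X_i = i
val expectation = (1 to 6).map(i => (1.0 / 6) * i).sum
// expectation ≈ 3.5
```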
If I roll d6 (a 6-sided die 🎲) $2$ times...
Does the outcome of one roll affect the other?
No: Each roll is an independent event.
If $X$ and $Y$ are random variables representing the outcomes of the two rolls, then $E[X + Y] = E[X] + E[Y]$. (This is linearity of expectation; it actually holds even for dependent random variables, but independence makes it easy to see here.)
$= 3.5 + 3.5 = 7$
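The $3.5 + 3.5 = 7$ claim can also be verified by brute force, averaging the sum over all $36$ equally likely outcomes of two rolls:

```scala
// Enumerate all 36 (x, y) pairs of two d6 rolls and average x + y.
val outcomes = for { x <- 1 to 6; y <- 1 to 6 } yield x + y
val avg = outcomes.sum.toDouble / outcomes.size
// avg == 7.0
```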