Idea 1: Run each plan
If we can't get the exact cost of a plan, what can we do?
Idea 2: Run each plan on a small sample of the data.
Idea 3: Analytically estimate the cost of a plan.
Figure out the cost of each individual operator.
Only count the number of IOs added by each operator.
Operation | RA | IOs Added (#pages) | Memory (#tuples) |
---|---|---|---|
Table Scan | $R$ | $\frac{|R|}{\mathcal P}$ | $O(1)$ |
Projection | $\pi(R)$ | $0$ | $O(1)$ |
Selection | $\sigma(R)$ | $0$ | $O(1)$ |
Union | $R \uplus S$ | $0$ | $O(1)$ |
Sort (In-Mem) | $\tau(R)$ | $0$ | $O(|R|)$ |
Sort (On-Disk) | $\tau(R)$ | $\frac{2 \cdot \lfloor log_{\mathcal B}(|R|) \rfloor}{\mathcal P}$ | $O(\mathcal B)$ |
(B+Tree) Index Scan | $Index(R, c)$ | $\log_{\mathcal I}(|R|) + \frac{|\sigma_c(R)|}{\mathcal P}$ | $O(1)$ |
(Hash) Index Scan | $Index(R, c)$ | $1$ | $O(1)$ |
Operation | RA | IOs Added (#pages) | Memory (#tuples) |
---|---|---|---|
Nested Loop Join (Buffer $S$ in mem) | $R \times S$ | $0$ | $O(|S|)$ |
Nested Loop Join (Buffer $S$ on disk) | $R \times_{disk} S$ | $(1+ |R|) \cdot \frac{|S|}{\mathcal P}$ | $O(1)$ |
1-Pass Hash Join | $R \bowtie_{1PH, c} S$ | $0$ | $O(|S|)$ |
2-Pass Hash Join | $R \bowtie_{2PH, c} S$ | $\frac{2|R| + 2|S|}{\mathcal P}$ | $O(1)$ |
Sort-Merge Join | $R \bowtie_{SM, c} S$ | [Sort] | [Sort] |
(Tree) Index NLJ | $R \bowtie_{INL, c}$ | $|R| \cdot (\log_{\mathcal I}(|S|) + \frac{|\sigma_c(S)|}{\mathcal P})$ | $O(1)$ |
(Hash) Index NLJ | $R \bowtie_{INL, c}$ | $|R| \cdot 1$ | $O(1)$ |
(In-Mem) Aggregate | $\gamma_A(R)$ | $0$ | $adom(A)$ |
(Sort/Merge) Aggregate | $\gamma_A(R)$ | [Sort] | [Sort] |
Next Class: How to estimate $|R|$