CSE-4/562 Spring 2019 - Query Evaluation

February 11, 2019

- All-At-Once (Collections)
    - Bottom-up, one operator at a time.
- Volcano-Style (Iterators)
    - Operators "request" one tuple at a time from children.
- Push-Style (Buffers)
    - Operators continuously produce/consume tuples.
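To make the Volcano style concrete, here is a minimal sketch of an iterator-based pipeline. The operator classes and their interface are illustrative, not from any particular system: each operator pulls one tuple at a time from its child.

```python
# Volcano-style sketch: each operator is an iterator over its child's tuples.

class Scan:
    """Leaf operator: iterates over an in-memory list of tuples."""
    def __init__(self, tuples):
        self.tuples = tuples
    def __iter__(self):
        yield from self.tuples

class Select:
    """Pulls tuples from its child, emitting only those satisfying predicate p."""
    def __init__(self, p, child):
        self.p, self.child = p, child
    def __iter__(self):
        for t in self.child:
            if self.p(t):
                yield t

class Project:
    """Pulls tuples from its child, applying projection f to each."""
    def __init__(self, f, child):
        self.f, self.child = f, child
    def __iter__(self):
        for t in self.child:
            yield self.f(t)

# sigma_{x > 1}(pi_{first}(R)) over a tiny relation R:
plan = Select(lambda x: x > 1,
              Project(lambda t: t[0],
                      Scan([(1, 'a'), (2, 'b'), (3, 'c')])))
print(list(plan))  # → [2, 3]
```

Note that no operator materializes its full output: tuples flow through the plan one at a time as the root requests them.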

- Memory Bounds
- Disk IO Used
- CPU Used

**Databases** are usually IO- or memory-bound

- Constant
- Scales with output
- Scales with part of the input
- Worse

**Core Question:** Do we have enough memory to use this operator?

IO measured in:

- Number of Tuples
- Number of Data Pages (absolute size)

Figure out the cost of each **individual** operator.

Only count the number of IOs **added** by each operator.

- Memory Required?
    - Constant!
- IOs added?
    - $|R|$ tuples read

- Memory Required?
    - Constant!
- IOs added?
    - None! (Can "inline" into cost of $R$)

- Memory Required?
    - Constant!
- IOs added?
    - None!

- Memory Required?
    - Constant!
- IOs added?
    - None!

- Memory Required?
    - Constant!
- IOs added?
    - $(|R|-1) \cdot \texttt{cost}(S)$ tuples read
    - (or $(|S|-1) \cdot \texttt{cost}(R)$)

If $S$ is a query, this can get very expensive
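The $(|R|-1) \cdot \texttt{cost}(S)$ term comes from re-evaluating the inner input once per outer tuple after the first. A rough sketch (the `naive_cross` helper and its counter are illustrative, not from the lecture):

```python
# Naive tuple-at-a-time cross product: the inner input is re-evaluated
# once per outer tuple.  `evals` counts evaluations of S; everything
# beyond the first is the added (|R| - 1) * cost(S) term.

def naive_cross(R, make_S):
    result, evals = [], 0
    for r in R:
        evals += 1                  # every outer tuple re-runs the inner query
        for s in make_S():
            result.append(r + s)
    return result, evals

R = [(1,), (2,), (3,)]
out, evals = naive_cross(R, lambda: [('a',), ('b',)])
print(len(out), evals - 1)   # → 6 2  (6 pairs; S re-evaluated |R|-1 = 2 extra times)
```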

**Optimization 1:** Cache $S$ in memory

- Memory Required?
    - $O(|S|)$
- IOs added?
    - None!

**Optimization 2:** Cache $S$ on disk

- Memory Required?
    - Constant
- IOs added?
    - $|S|$ tuples written.
    - $(|R| - 1) \cdot |S|$ tuples read.

Is there a middle ground?

**Problem**: We need to evaluate the `rhs` iterator once per record in `lhs`.

**Better Solution**: Load both `lhs` and `rhs` records in blocks.

```
def apply_cross(lhs, rhs):
    result = []
    while r_block := lhs.take(100):       # one block of lhs at a time
        while s_block := rhs.take(100):   # stream rhs past the block
            for r in r_block:
                for s in s_block:
                    result += [r + s]
        rhs.reset()                       # rewind rhs for the next lhs block
    return result
```

(with $\mathcal B$ as the block size for $R$)

(and with caching $S$ to disk)

- Memory Required?
    - $O(\mathcal B)$
- IOs added?
    - $|S|$ tuples written.
    - $(\frac{|R|}{\mathcal B} - 1) \cdot |S|$ tuples read.

In-memory caching is a special case of block-nested loop with $\mathcal B = |S|$

Does the block size for $R$ matter?

How big should the blocks be?
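It does: each block of $R$ triggers one scan of $S$, so $S$ is scanned $\lceil |R| / \mathcal B \rceil$ times in total. A small self-contained sketch (the `block_cross` helper is hypothetical, with tuples held in memory for illustration):

```python
# Counts how many times the inner input is scanned under block-nested loop,
# for a given block size B.  Larger B => fewer scans of S, more memory.

def block_cross(R, S, B):
    result, s_scans = [], 0
    for i in range(0, len(R), B):    # one pass over R, B tuples at a time
        r_block = R[i:i + B]
        s_scans += 1                 # each R-block triggers one scan of S
        for s in S:
            for r in r_block:
                result.append(r + s)
    return result, s_scans

R = [(i,) for i in range(6)]
S = [(c,) for c in "ab"]
out, scans = block_cross(R, S, B=2)
print(len(out), scans)   # → 12 3  (12 pairs; ceil(|R|/B) = 3 scans of S)
```

With $\mathcal B = 1$ this degenerates to the tuple-at-a-time nested loop; with $\mathcal B = |R|$, $S$ is scanned exactly once.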

Cross product is expensive!

Can we do better?

$\sigma_c(R\times S) \equiv R\bowtie_c S$

**Problem**: Naively, any tuple matches any other

**Solution**: First organize the data

- In-Memory Index Join (1-pass Hash; Hash Join)
    - Build an in-memory index on one table, scan the other.
- Partition Join (2-pass Hash; External Hash Join)
    - Partition both sides so that tuples don't join across partitions.
- Sort/Merge Join
    - Sort all of the data upfront, then scan over both sides.

- A hash function is a function that maps a large data value to a small, fixed-size value
    - Typically deterministic & pseudorandom

- Used in Checksums, Hash Tables, Partitioning, Bloom Filters, Caching, Cryptography, Password Storage, …
- Examples: MD5, SHA1, SHA2
    - MD5() is part of OpenSSL (on most OSX / Linux / Unix)

- Can map h(k) to range [0,N) with h(k) % N (modulus)

$$h(X) \mod N$$

- Pseudorandom output between $[0, N)$
- Always the same output for a given $X$
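As a quick sketch using Python's standard `hashlib` (the bucket-mapping helper `h` is illustrative):

```python
import hashlib

def h(x, N):
    """Map an arbitrary value to a pseudorandom bucket in [0, N).
    Uses MD5 (mentioned above); any hash function would do."""
    digest = hashlib.md5(repr(x).encode()).hexdigest()
    return int(digest, 16) % N

# Deterministic: the same key always lands in the same bucket.
print(h("alice", 10) == h("alice", 10))       # → True
# Pseudorandom: buckets spread roughly evenly over [0, N).
print(sorted({h(k, 4) for k in range(100)}))  # typically all of [0, 1, 2, 3]
```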

**In-Memory Index Join (Hash Join)**

- Limited Queries
    - Only supports join conditions of the form $R.A = S.B$
- Moderate-High Memory
    - Keeps 1 full relation in memory
- Low Added IO Cost
    - Only requires 1 scan over each input.
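A minimal sketch of the 1-pass hash join (the `hash_join` helper and key-extractor arguments are illustrative):

```python
from collections import defaultdict

def hash_join(R, S, r_key, s_key):
    """1-pass hash join for conditions of the form R.A = S.B.
    Builds an in-memory hash table on S (one scan), then probes it
    while scanning R (one scan) -- no IO added beyond the two scans."""
    index = defaultdict(list)
    for s in S:                      # build phase: keeps all of S in memory
        index[s_key(s)].append(s)
    result = []
    for r in R:                      # probe phase
        for s in index.get(r_key(r), []):
            result.append(r + s)
    return result

R = [(1, "x"), (2, "y"), (3, "z")]
S = [(2, "b"), (3, "c"), (4, "d")]
print(hash_join(R, S, r_key=lambda r: r[0], s_key=lambda s: s[0]))
# → [(2, 'y', 2, 'b'), (3, 'z', 3, 'c')]
```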

**Alternative: ** Build an in-memory tree (e.g., B+Tree) instead of a hash table!

- Limited Queries
    - Also supports $R.A \geq S.B$, $R.A > S.B$
- Moderate-High Memory
    - Keeps 1 full relation in memory
- Low Added IO Cost
    - Only requires 1 scan over each input.

**Partition Join (External Hash Join)**

- Limited Queries
    - Only supports join conditions of the form $R.A = S.B$
- Low Memory
    - Never need more than 1 pair of partitions in memory
- High IO Cost
    - $|R| + |S|$ tuples written out
    - $|R| + |S|$ tuples read in
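A sketch of the 2-pass partition join (the `partition_join` helper is illustrative; in-memory lists stand in for the on-disk partition files):

```python
def partition_join(R, S, r_key, s_key, N=4):
    """2-pass (partition) hash join for R.A = S.B.
    Pass 1 hashes both inputs into N partitions (|R| + |S| tuples written);
    pass 2 joins each pair of matching partitions independently
    (|R| + |S| tuples read back).  Equal keys hash identically, so
    matching tuples always land in the same partition pair."""
    r_parts = [[] for _ in range(N)]
    s_parts = [[] for _ in range(N)]
    for r in R:
        r_parts[hash(r_key(r)) % N].append(r)   # pass 1: partition R
    for s in S:
        s_parts[hash(s_key(s)) % N].append(s)   # pass 1: partition S
    result = []
    for rp, sp in zip(r_parts, s_parts):        # pass 2: join partition pairs
        for r in rp:
            for s in sp:
                if r_key(r) == s_key(s):
                    result.append(r + s)
    return result

R = [(1, "x"), (2, "y"), (3, "z")]
S = [(2, "b"), (3, "c")]
print(sorted(partition_join(R, S, lambda r: r[0], lambda s: s[0])))
# → [(2, 'y', 2, 'b'), (3, 'z', 3, 'c')]
```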

Why is it important that the hash function is pseudorandom?

What if the data is already organized (e.g., sorted) in a useful way?

**Sort/Merge Join**

- Limited Queries
    - Only supports join conditions of the form $R.A = S.B$
- Low Memory
    - Only needs to keep ~2 rows in memory at a time (not counting the sort).
- Low Added IO Cost
    - No added IO! (not counting the sort).
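A sketch of the merge phase (the `merge_join` helper is illustrative; it assumes both inputs arrive already sorted on their join keys):

```python
def merge_join(R, S, r_key, s_key):
    """Sort/merge join for R.A = S.B, assuming R and S are already sorted
    on the join keys.  The merge keeps only the current group of matching
    rows in memory and adds no IO beyond the two input scans."""
    result, i, j = [], 0, 0
    while i < len(R) and j < len(S):
        rk, sk = r_key(R[i]), s_key(S[j])
        if rk < sk:
            i += 1                       # advance the side with the smaller key
        elif rk > sk:
            j += 1
        else:
            # Emit all pairs for this key (handles duplicates on both sides).
            j_end = j
            while j_end < len(S) and s_key(S[j_end]) == rk:
                j_end += 1
            while i < len(R) and r_key(R[i]) == rk:
                for jj in range(j, j_end):
                    result.append(R[i] + S[jj])
                i += 1
            j = j_end
    return result

R = sorted([(3, "z"), (1, "x"), (2, "y")])
S = sorted([(2, "b"), (3, "c"), (3, "d")])
print(merge_join(R, S, lambda r: r[0], lambda s: s[0]))
# → [(2, 'y', 2, 'b'), (3, 'z', 3, 'c'), (3, 'z', 3, 'd')]
```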