
Algorithms for Queries

CSE-4/562 Spring 2021

February 18, 2021

Garcia-Molina/Ullman/Widom: Ch. 15.1-15.5, 16.7

News

  • Homework 1 assigned last night, due Wednesday night.
  • Checkpoint 1 posted Sunday. Submissions open tonight.

Query Evaluation Styles

All-At-Once (Collections)
Bottom-up, one operator at a time.
Volcano-Style (Iterators)
Operators "request" one tuple at a time from children.
Push-Style (Buffers)
Operators continuously produce/consume tuples.
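
A minimal sketch of the Volcano style in Python (the class and method names are illustrative, not from any particular system): each operator exposes a next() that pulls one tuple at a time from its child and returns None when exhausted.

    class Scan:
        """Leaf operator: produces a stored relation's tuples one at a time."""
        def __init__(self, relation):
            self.tuples = iter(relation)

        def next(self):
            return next(self.tuples, None)   # None signals "exhausted"

    class Select:
        """Pulls tuples from its child until one satisfies the predicate."""
        def __init__(self, predicate, child):
            self.predicate = predicate
            self.child = child

        def next(self):
            while (t := self.child.next()) is not None:
                if self.predicate(t):
                    return t
            return None

    # sigma_{a > 1}(R): the root pulls tuples on demand, one at a time.
    q = Select(lambda t: t[0] > 1, Scan([(1,), (2,), (3,)]))
    while (t := q.next()) is not None:
        print(t)                             # (2,) then (3,)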

Analyzing Volcano Operators

  • CPU Used
  • Memory Bounds
  • Disk IO Used

Databases are usually IO- or Memory-bound

Memory Bounds

  • Constant
  • Scales with output
  • Scales with part of the input
  • Worse

Core Question: Do we have enough memory to use this operator?

Disk IO

IO measured in:

  • Number of Tuples
  • Number of Data Pages (absolute size)

Accounting

Figure out the cost of each individual operator.

Only count the number of IOs added by each operator.

Note

So far, we've been pretending that each operator has one algorithm.

Often, there are many algorithms, some of which cover multiple operators.

This is why Spark has a PhysicalPlan.

In the suggested iterator-based approach, the iterators are your PhysicalPlan.

Table Scan ($R$)

Memory Required?
Constant!
IOs added?
$|R|$ tuples read

Select ($\sigma(R)$)

Memory Required?
Constant!
IOs added?
None! (can be "inlined" into the cost of $R$)

Example

Assume $R$ is 100 tuples.

How many IOs do we need to compute $Q := R$?

How many IOs do we need to compute $Q := \sigma(R)$?

Project ($\pi(R)$)

Memory Required?
Constant!
IOs added?
None!

Example

Assume $R$ is 100 tuples.

How many IOs do we need to compute $Q := \pi(R)$?

How many IOs do we need to compute $Q := \pi(\sigma(R))$?

Projection and Selection do not add IO.

Union ($R \cup S$)

Memory Required?
Constant!
IOs added?
None!

Cross ($R \times S$)

Memory Required?
It depends
IOs added?
It depends

Cross ($R \times S$)

How do you "reset" $S$?

"Materialize" $S$ into memory (sketched below)
No extra IOs (but $O(|S|)$ memory)
Rerun the entire iterator
$(|R|-1) \cdot \texttt{cost}(S)$ extra tuples read
"Materialize" $S$ onto disk
$|S|$ tuples written
$(|R|-1) \cdot |S|$ extra tuples read

This can get very expensive
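
As a sketch of the first strategy, here is a cross product in Python that materializes $S$ in memory (function and parameter names are illustrative): no extra IO, at the price of $O(|S|)$ memory.

    def cross_materialize(lhs, rhs):
        """Cross product with the right-hand side held in memory."""
        s_tuples = list(rhs)        # one scan of S, kept in memory
        for r in lhs:               # one scan of R
            for s in s_tuples:      # no IO: S is already in memory
                yield r + s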

Example

Assume $R$ and $S$ are both 100 tuples.

How many IOs do we need to compute $Q := R \cup S$?

  1. Getting an Iterator on $R$: 100 tuples
  2. Getting an Iterator on $S$: 100 tuples
  3. Getting an Iterator on $R \cup S$ using the above iterators: 0 extra tuples

Example

Assume $R$ is 20 tuples and $S$ is 100 tuples.

How many IOs do we need to compute $Q := R \times S$?

  1. Getting an Iterator on $R$: 20 tuples
  2. Getting an Iterator on $S$: 100 tuples
  3. Getting an Iterator on $R \times S$ using the above iterators:
  • Memory: 0 extra tuples
  • Replay: $(|R|-1) \times \texttt{cost}(S) = 19 \times 100 = 1900$ extra tuples
  • Cache: $|R| \times |S| = 20 \times 100 = 2000$ extra tuples

Best Total Cost (assuming we lack the memory to materialize $S$): $100 + 20 + 1900 = 2010$

Example

Assume $R$ is 20 tuples, $S$ is 100 tuples,
and $c$ filters out 90% of tuples.

How many IOs do we need to compute $Q := R \times \sigma_c(R \times S)$?

  1. Getting an Iterator on $\sigma_c(R \times S)$: 2010 tuples
  2. Getting an Iterator on $R$: 20 tuples
  3. Getting an Iterator on $R \times \sigma_c(R \times S)$ using the above iterators:
  • Memory: 0 extra tuples
  • Replay: $(|R|-1) \times \texttt{cost}(\sigma_c(R \times S)) = 19 \times 2010 = 38190$ extra tuples
  • Cache: $|R| \times (0.1 \times (|R| \times |S|)) = 20 \times 200 = 4000$ extra tuples

Best Total Cost (again excluding the in-memory option): $2010 + 20 + 4000 = 6030$

Can we do better with Cartesian product (and joins)?

Nested-Loop Join

Problem: We need to evaluate the rhs iterator once per record in the lhs.

Preloading Data

Better Solution: Load both lhs and rhs records in blocks.


    def apply_cross(lhs, rhs):
        result = []

        # Assumes take(n) returns up to n tuples (an empty list once the
        # iterator is exhausted) and reset() rewinds rhs to the beginning.
        while (r_block := lhs.take(100)):
            while (s_block := rhs.take(100)):
                for r in r_block:
                    for s in s_block:
                        result += [r + s]
            rhs.reset()

        return result
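
Each pass over rhs now serves an entire block of lhs tuples at once, so rhs is re-read once per block of lhs rather than once per tuple.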
    

Block-Nested Loop Join

Block-Nested Loop ($R \times S$)

(with $\mathcal B$ as the block size for $R$)

(and with caching $S$ to disk)

Memory Required?
$O(\mathcal B)$
IOs added?
$|S|$ tuples written.
$(\frac{|R|}{\mathcal B} - 1) \cdot |S|$ tuples read.

In-memory caching is a special case of block-nested loop with $\mathcal B = |R|$
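
For example, with $|R| = 20$, $|S| = 100$, and $\mathcal B = 10$: $100$ tuples written, plus $(\frac{20}{10} - 1) \cdot 100 = 100$ extra tuples read, for $200$ added IOs instead of the $1900$ that tuple-at-a-time replay needed above.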

Does the block size for $S$ matter?

How big should the blocks be?

As big as possible!

... but more on that later.

Cross product is expensive!
Can we do better?

$\sigma_c(R\times S) \equiv R\bowtie_c S$

Cross Product

Problem: Naively, any tuple might match any other, so every pair must be checked

Join Conditions

Solution: First organize the data

Strategies for Implementing $R \bowtie_{R.A = S.A} S$

In-Memory Index Join (1-pass Hash; Hash Join)
Build an in-memory index on one table, scan the other.
Partition Join (2-pass Hash; External Hash Join)
Partition both sides so that tuples don't join across partitions.
Sort/Merge Join
Sort all of the data upfront, then scan over both sides.

Hash Functions

  • A hash function is a function that maps a large data value to a small fixed-size value
    • Typically deterministic & pseudorandom
  • Used in Checksums, Hash Tables, Partitioning, Bloom Filters, Caching, Cryptography, Password Storage, …
  • Examples: MD5, SHA1, SHA2
    • MD5() part of OpenSSL (on most OSX / Linux / Unix)
  • Can map h(k) to range [0,N) with h(k) % N (modulus)

Hash Functions

$$h(X) \mod N$$

  • Pseudorandom output in $[0, N)$
  • Always the same output for a given $X$
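
A minimal sketch in Python, using MD5 from the standard library's hashlib (the function name h is just for illustration):

    import hashlib

    def h(x, n):
        """Map a key to a pseudorandom bucket in [0, n)."""
        digest = hashlib.md5(str(x).encode()).hexdigest()
        return int(digest, 16) % n           # modulus maps into [0, n)

    print(h("alice", 10))   # the same key always lands in the same bucket
    print(h("bob", 10))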

1-Pass Hash Join

Limited Queries
Only supports join conditions of the form $R.A = S.B$
Moderate-High Memory
Keeps 1 full relation in memory
Low Added IO Cost
Only requires 1 scan over each input.

Alternative: Build an in-memory tree (e.g., B+Tree) instead of a hash table!

Limited Queries
Also supports $R.A \geq S.B$, $R.A > S.B$
Moderate-High Memory
Keeps 1 full relation in memory
Low Added IO Cost
Only requires 1 scan over each input.
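
A minimal sketch of the hash-table variant in Python (assuming r and s are iterables of tuples, and r_key/s_key extract the join attributes; all names are illustrative):

    from collections import defaultdict

    def hash_join_1pass(r, s, r_key, s_key):
        """Build an in-memory hash index on R, then stream S past it."""
        index = defaultdict(list)
        for r_tup in r:                      # build: one scan of R, O(|R|) memory
            index[r_key(r_tup)].append(r_tup)
        for s_tup in s:                      # probe: one scan of S
            for r_tup in index.get(s_key(s_tup), []):
                yield r_tup + s_tup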

2-Pass Hash Join

Limited Queries
Only supports join conditions of the form $R.A = S.B$
Low Memory
Never need more than 1 pair of partitions in memory
High IO Cost
$|R| + |S|$ tuples written out
$|R| + |S|$ tuples read in
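
A minimal sketch in Python (in-memory lists stand in for the partition files a real system would write to disk; names are illustrative):

    def hash_join_2pass(r, s, r_key, s_key, n_parts=16):
        """Pass 1: route tuples to partitions by hashing the join key.
        Pass 2: join each pair of partitions, one pair in memory at a time."""
        r_parts = [[] for _ in range(n_parts)]
        s_parts = [[] for _ in range(n_parts)]
        for t in r:                                     # |R| tuples "written out"
            r_parts[hash(r_key(t)) % n_parts].append(t)
        for t in s:                                     # |S| tuples "written out"
            s_parts[hash(s_key(t)) % n_parts].append(t)
        for rp, sp in zip(r_parts, s_parts):            # |R| + |S| tuples "read in"
            # Matching tuples always hash to the same partition, so any
            # join algorithm (e.g., the 1-pass hash join) works per pair.
            for r_tup in rp:
                for s_tup in sp:
                    if r_key(r_tup) == s_key(s_tup):
                        yield r_tup + s_tup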

Why is it important that the hash function is pseudorandom?

What if the data is already organized (e.g., sorted) in a useful way?

Sort/Merge Join

Limited Queries
Only supports join conditions of the form $R.A = S.B$
Low Memory
Only needs to keep ~2 rows in memory at a time (not counting sort).
Low Added IO Cost
No added IO! (not counting sort).
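
A minimal sketch in Python, assuming r and s are lists already sorted on their join keys (names are illustrative):

    def merge_join(r, s, r_key, s_key):
        """One synchronized scan over both sorted inputs."""
        i, j = 0, 0
        while i < len(r) and j < len(s):
            rk, sk = r_key(r[i]), s_key(s[j])
            if rk < sk:
                i += 1          # advance the side with the smaller key
            elif rk > sk:
                j += 1
            else:
                # Find the run of tuples sharing this key on each side, then
                # emit every pairing (this handles duplicate join keys).
                i_end, j_end = i, j
                while i_end < len(r) and r_key(r[i_end]) == rk:
                    i_end += 1
                while j_end < len(s) and s_key(s[j_end]) == sk:
                    j_end += 1
                for r_tup in r[i:i_end]:
                    for s_tup in s[j:j_end]:
                        yield r_tup + s_tup
                i, j = i_end, j_end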

Recap: Joins

Block-Nested Loop Join
Moderate Memory, Moderate IO, High CPU
In-Memory Index Join (e.g., 1-Pass Hash)
High Memory, Low IO
Partition Join (e.g., 2-Pass Hash)
High IO, Low Memory
Sort/Merge Join
Low IO, Low Memory (But need sorted data)

Next time...

Extended Relational Algebra