October 22
Same great algorithms,
awesome new hardware flavor
Hardware adaptation uses standard transformations:
The hard part is picking where to apply the transformations and selecting values for the transformation's parameters
Key Insights:
Automate the search!
What do we need?
A collection of rules that assign a property called a type to the parts of a computer program: variables, expressions, etc...
A type system allows you to:
A type
($\tau := D\;|\;[\tau]\;|<\tau,\tau>\;|\;\tau\rightarrow\tau$)
A set of inference rules
($\frac{e\;:\;\tau}{[e]\;:\;[\tau]}$)
These example types are part of the monad algebra
Type | Meaning |
---|---|
$D$ | Primitive Type (int, float, etc...) |
$[\tau]$ | An array of elements with type $\tau$ |
$<\tau_1,\tau_2>$ | A pair of elements of types $\tau_1$ and $\tau_2$. |
$\tau_1\rightarrow\tau_2$ | A function with one argument of type $\tau_1$ and one return value of type $\tau_2$. |
Defined over a language of expressions like $f(a, b)$.
$$\frac{a : \tau_a\;\;\;b : \tau_b}{f(a,b):\tau_f(\tau_a, \tau_b)}$$
If expression $a$ has type $\tau_a$ and expression $b$ has type $\tau_b$, then expression $f(a, b)$ has type $\tau_f(\tau_a, \tau_b)$.
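The type grammar and inference rules can be sketched in a few lines of Python. This is only an illustration: the class names (`Prim`, `Array`, `Pair`, `Func`) and the two helper functions are hypothetical and not part of the original material. `type_of_singleton` mirrors the rule "if $e : \tau$ then $[e] : [\tau]$", and `type_of_pair` is one instance of the general rule for $f(a, b)$, assuming $\tau_f(\tau_a, \tau_b) = <\tau_a, \tau_b>$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prim:
    """Primitive type D, e.g. int or float (hypothetical class name)."""
    name: str


@dataclass(frozen=True)
class Array:
    """Array type [tau]."""
    elem: object


@dataclass(frozen=True)
class Pair:
    """Pair type <tau_1, tau_2>."""
    left: object
    right: object


@dataclass(frozen=True)
class Func:
    """Function type tau_1 -> tau_2."""
    arg: object
    ret: object


def type_of_singleton(elem_type):
    """Rule: if e : tau, then [e] : [tau]."""
    return Array(elem_type)


def type_of_pair(type_a, type_b):
    """Rule: if a : tau_a and b : tau_b, then <a, b> : <tau_a, tau_b>."""
    return Pair(type_a, type_b)
```

For example, `type_of_singleton(Prim("int"))` yields `Array(Prim("int"))`, the representation of $[D]$ for an integer primitive.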
A primitive language for describing data processing.
Operator | Meaning |
---|---|
$\lambda x.e$ | Define a function with body $e$ that uses variable $x$. |
$e_1\;e_2$ | Apply the function defined by $e_1$ to the value obtained from $e_2$. |
$ \textbf{if}\;c\;\textbf{then}\;e_1\;\textbf{else}\;e_2$ | If $c$ is true then evaluate $e_1$, and otherwise evaluate $e_2$. |
Operator | Meaning |
---|---|
$ < e_1, e_2 >$ | Construct a tuple from $e_1$ and $e_2$. |
$ e.i $ | Extract attribute $i$ from the tuple $e$. |
$ [e] $ | Construct a single-element array from $e$. |
$ [] $ | Construct an empty array. |
$ e_1 \sqcup e_2 $ | Concatenate the arrays $e_1$ and $e_2$. |
$$\textbf{flatMap}(f : \tau_1 \rightarrow [\tau_2])(e : [\tau_1])$$
Apply function $f$ to every element of array $e$, then concatenate all of the arrays returned by $f$.
$$\textbf{foldL}(c : \tau_2, f : < \tau_2, \tau_1 > \rightarrow \tau_2)(e : [\tau_1])$$
Apply function $f$ to every element of array $e$, with each invocation passing its return value to the next call (e.g., aggregation). The initial value $c$ seeds the first call.
$$\textbf{for}(xB : [\tau_1] [k] \leftarrow e_{in} : [\tau_1])(e_{loop} : [\tau_2])$$
Extract blocks of size $k$ from $e_{in}$. For each block compute a flatMap using expression $e_{loop}$.
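The three collection operators can be approximated with ordinary Python lists. This is a minimal sketch under stated assumptions: the function names `flat_map`, `fold_l`, and `blocked_for` are hypothetical, the fold function receives a `(previous, current)` tuple to match the pair-typed argument, and the last block may be shorter than $k$ when the input length is not a multiple of $k$.

```python
from functools import reduce


def flat_map(f, e):
    """Apply f to every element of e and concatenate the returned arrays."""
    out = []
    for item in e:
        out.extend(f(item))
    return out


def fold_l(c, f, e):
    """Left fold: f takes a (previous, current) pair and returns the next value."""
    return reduce(lambda acc, item: f((acc, item)), e, c)


def blocked_for(k, e_in, e_loop):
    """Split e_in into blocks of at most k elements, then flatMap e_loop over them."""
    blocks = [e_in[i:i + k] for i in range(0, len(e_in), k)]
    return flat_map(e_loop, blocks)
```

As a usage example, `blocked_for(2, [1, 2, 3, 4, 5], lambda xB: [sum(xB)])` sums each block of two elements and returns `[3, 7, 5]`.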
Fold implements aggregation
Fold takes a 'previous' $a$ and a 'current' $x$
We need a sum, and a count
Initial sum and count are both 0
Postprocess with division ($\lambda$ creates a variable $tot$)
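The averaging recipe above can be sketched in Python as follows. The function name `average` and the use of `functools.reduce` as a stand-in for the fold are assumptions for illustration; the accumulator `a` is a `(sum, count)` pair, and the final division binds the fold result to `tot` with a lambda, as described.

```python
from functools import reduce


def average(values):
    """Average via a left fold followed by a division."""
    # Fold step: 'a' is the previous (sum, count), 'x' is the current element.
    step = lambda a, x: (a[0] + x, a[1] + 1)
    # Initial sum and count are both 0.
    folded = reduce(step, values, (0, 0))
    # Postprocess: the lambda introduces the variable 'tot'.
    return (lambda tot: tot[0] / tot[1])(folded)
```

`average([2, 4, 9])` returns `5.0`. An empty input raises `ZeroDivisionError`, since the count stays at 0; this sketch does not handle that case.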
We need...
Basic Approach: Define a second type for tracking data sizes
$$\alpha\;:=\;[\alpha]^x\;|\;< \alpha_1, \alpha_2 >|\;c$$
e.g., $[ < 1, [1]^y > ]^x$ corresponds to:
$R(\Gamma, e)$ computes the cardinality type for $e$
$\Gamma : x \rightarrow \alpha$ is a context / scope
$\frac{cardinality(R(\Gamma, e_1))}{k}\cdot R(\Gamma', e_2)$
The cardinality is based on that of $e_2$
Repeated once for every time through the loop
And $e_2$ is evaluated in the context of a $k$ element array.
$max(R\left(\Gamma, e_1\right), R\left(\Gamma, e_2\right))$
Pessimistic assumption of biggest possible size.
Avoids needing to estimate $p(c = true)$.
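The two size rules can be sketched numerically. This is a simplification, not the full definition of $R$: it tracks only the outermost cardinality as a plain number rather than a structured size type, and the names `size_for` and `size_if` are hypothetical.

```python
def size_for(card_in, k, body_size):
    """for(xB[k] <- e1) e2: the body is evaluated card_in / k times."""
    return (card_in / k) * body_size


def size_if(then_size, else_size):
    """if c then e1 else e2: pessimistically assume the larger branch."""
    return max(then_size, else_size)


# A filter over a 1000-element array, one element at a time (k = 1):
# each iteration emits at most one element, so the bound is 1000.
filter_bound = size_for(1000, 1, size_if(1, 0))

# The same array read in blocks of 100, with a body that may emit a full block.
blocked_bound = size_for(1000, 100, 100)
```

Both bounds come out to 1000: changing the block size changes how often the body runs, not the pessimistic estimate of the output.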
Expression | Context | Result Size |
---|---|---|
$\textbf{for}( xB [k_1] \leftarrow R )$ | $\Gamma_1 = R \mapsto [1]^x, S \mapsto [1]^y$ | $[ < 1, 1 > ]^{\frac{x}{k_1} \cdot \frac{y}{k_2} \cdot k_1 \cdot k_2}$ |
$\textbf{for}( yB [k_2] \leftarrow S )$ | $\Gamma_2 = \Gamma_1 \cup xB \mapsto [1]^{k_1}$ | $[ < 1, 1 > ]^{\frac{y}{k_2} \cdot k_1 \cdot k_2}$ |
$\textbf{for}( x \leftarrow xB )$ | $\Gamma_3 = \Gamma_2 \cup yB \mapsto [1]^{k_2}$ | $[ < 1, 1 > ]^{k_1 \cdot k_2}$ |
$\textbf{for}( y \leftarrow yB )$ | $\Gamma_4 = \Gamma_3 \cup x \mapsto 1$ | $[ < 1, 1 > ]^{k_2}$ |
$\textbf{if}\;joinCond(x, y)$ | $\Gamma_5 = \Gamma_4 \cup y \mapsto 1$ | $[ < 1, 1 > ]^1$ |
$\textbf{then}\;[< x, y >]$ | $\Gamma_5$ | $[ < 1, 1 > ]^1$ |
$\textbf{else}\;[]$ | $\Gamma_5$ | $0$ |
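Reading the table bottom-up, each enclosing loop multiplies the size of its body by its iteration count. Simplifying the outermost exponent shows that the block sizes cancel:

$$\frac{x}{k_1} \cdot \frac{y}{k_2} \cdot k_1 \cdot k_2 = x \cdot y$$

So the estimated result size is $[ < 1, 1 > ]^{x \cdot y}$, the full cross product. This follows from the pessimistic $max$ rule, which assumes the join condition always takes the non-empty branch, and it does not depend on the choice of $k_1$ and $k_2$.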
IO Costs have 2 components:
Costs are defined for every pair of memory hierarchy levels:
Expression | Result Size | HDD to RAM | RAM to HDD |
---|---|---|---|
$\textbf{for}( xB [k_1] \leftarrow R )$ | $[ < 1, 1 > ]^{\frac{x}{k_1} \cdot \frac{y}{k_2} \cdot k_1 \cdot k_2}$ | $x+\frac{x}{k_1}y$ | $2xy$ |
$\textbf{for}( yB [k_2] \leftarrow S )$ | $[ < 1, 1 > ]^{\frac{y}{k_2} \cdot k_1 \cdot k_2}$ | $y$ | $2k_1y$ |
$\textbf{for}( x \leftarrow xB )$ | $[ < 1, 1 > ]^{k_1 \cdot k_2}$ | $0$ | $2k_1k_2$ |
$\textbf{for}( y \leftarrow yB )$ | $[ < 1, 1 > ]^{k_2}$ | $0$ | $(1+1)k_2$ |
$\textbf{if}\;joinCond(x, y)$ | $[ < 1, 1 > ]^1$ | $0$ | $(1+1)\cdot 1$ |
$\textbf{then}\;[< x, y >]$ | $[ < 1, 1 > ]^1$ | $0$ | $(1+1)\cdot 1$ |
$\textbf{else}\;[]$ | $0$ | $0$ | $0$ |
HDD: $R, S, Result$
RAM: $x, xB, y, yB$
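The top row of the cost table can be evaluated directly to see how the block size $k_1$ affects IO. The sketch below is illustrative only: the function names are hypothetical, costs are in the same abstract units as the size types, and the candidate block sizes are made up. A real search would also need to bound $k_1$ by the available RAM, which this sketch does not model.

```python
def join_io_costs(x, y, k1):
    """IO costs of the blocked nested-loop join, from the top row of the table."""
    hdd_to_ram = x + (x / k1) * y  # read R once, and S once per block of R
    ram_to_hdd = 2 * x * y         # write the pessimistic result of x * y pairs
    return hdd_to_ram, ram_to_hdd


def best_block_size(x, y, candidates):
    """Pick the candidate k1 with the lowest HDD-to-RAM cost."""
    return min(candidates, key=lambda k1: join_io_costs(x, y, k1)[0])
```

With `x = 1000` and `y = 500`, a block size of 100 costs 6000 units of reads and a block size of 500 costs 2000, so `best_block_size(1000, 500, [10, 100, 500])` selects 500. The write cost is unaffected by $k_1$.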