∂T(A,B,C,…) = {+/- 1} * (A ^= {X}) * (A ^= {Y}) * (A ^= {Z}) * …

∂S(…) = {0}

∂(Q1 + Q2) = ∂Q1 + ∂Q2

∂(Q1 * Q2) = (Q1 * ∂Q2) + (∂Q1 * Q2) + (∂Q1 * ∂Q2)

∂(AggSum([…], Q)) = AggSum([…], ∂Q)

∂({…}) = {0}

∂(V ^= {…}) = {0} (for the simplified Lift operation only)Now, we've been down in the nitty gritty of AGCA for a while now. Let's pop our heads up for a moment to remember where we're going with all of this. We have a query (let's call it Q) and we want to be able to incrementally maintain it. That is, we want to store a copy of the results of evaluating Q on a database (stored on disk, in memory, anywhere really), and

**every time the database changes in some way, we want to update the stored results to match**.

### Applying the Delta Transform

That's fairly easy to do somewhat efficiently if we have these deltas. Let's say we have the queryQ := AggSum([], R(A,B) * S(B,C) * A)or in SQL

SELECT SUM(A) AS Q FROM R, S WHERE R.B = S.B;Let's say R contains

`___A__B______#__`

< 1, 1 > -> 1

< 1, 2 > -> 1

< 2, 2 > -> 1and S contains

`___B__C______#__`

< 1, 1 > -> 2

< 2, 2 > -> 1For this data, Q = 5. Let's say we insert a new row: S(2,1). We could certainly re-evaluate the entire query from scratch and discover that the new result is Q = 8, but this would be pretty inefficient. Even in the best case, where everything fits in memory, re-evaluating the join requires O(|R| + |S|) work. That's where the deltas come in. The delta of Q tells us how the query results change with respect to a change in the table. So let's take the delta of Q with respect to an insertion of tuple <@Y,@Z> into S.

∂Q := ∂(AggSum([], R(A,B) * S(B,C) * {A})

:= AggSum([], ∂(R(A,B) * S(B,C) * {A}))

:= AggSum([], R(A,B) * ∂(S(B,C) * {A})

+ ∂R(A,B) * (S(B,C) * {A})

+ ∂R(A,B) * ∂(S(B,C) * {A}))

:= AggSum([], R(A,B) * ∂(S(B,C) * {A)

+ {0} * (S(B,C) * {A})

+ {0} * ∂(S(B,C) * {A}))

:= AggSum([], R(A,B) * ∂(S(B,C) * {A}))

:= AggSum([], R(A,B) * (S(B,C) * ∂{A}

+ ∂S(B,C) * {A}

+ ∂S(B,C) * ∂{A}))

:= AggSum([], R(A,B) * (S(B,C) * {0}

+ ((B ^= {@Y}) * (C ^= {@Z}) * {A})

+ ∂S(B,C) * {0}))

:= AggSum([], R(A,B) * (B ^= {@Y}) * (C ^= {@Z}) * {A})I'm not going to get into the details of optimizing AGCA expressions (yet), but trust me for now that the following (simpler) query is equivalent

∂Q := AggSum([], R(A,@Y) * {A})or in SQL

SELECT SUM(A) FROM R WHERE R.B = @Y;Note, by the way, that @Y is a parameter to this delta query. When you evaluate a delta query (for example our delta for insertions into S), these parameters take their value from the tuple being modified (so when you insert <2,1> into S, then @Y = 2). That said, @Y is just a normal variable/column. There's nothing special about it (other than the @ in the name). The delta query tells us how the query results change. If we insert <2,1> into S, then we evaluate the delta query for insertions into S (∂Q above), setting @Y to 2, and @Z to 1.

AggSum([], R(A, 2) * {A})… which, for our initial dataset above, gives us ∂Q = 3. To figure out what Q will give us on the modified database (after inserting <2,1> into S), we just add ∂Q to our initial result (5 + 3 = 8).

### Parameters and AggSums

I keep saying that parameters just normal variables, and that there's nothing special about them. That's mostly true. I actually oversimplified a bit on the delta rules. We want these parameters to be visible from the outside so that evaluating ∂Q for a specific insertion (or deletion) essentially amounts to selecting a single row from the output of ∂Q. In other words, the AggSums need to be rewritten slightly so that the parameters appear in the group-by variables (where appropriate). That is, the correct delta with respect to an insertion into S : +S(@Y,@Z) isAggSum([@Y], R(A, @Y) * {A})or in SQL

SELECT R.B, SUM(R.A) FROM R GROUP BY R.B;That little hiccup out of the way, let's get to the actual viewlet transform

### Auxiliary Views

The delta query is an improvement over evaluating the entire query from scratch. For this particular example though, we still need to scan over multiple rows of R (even if it is only a small subset of R). We can do even better. Right now, every time the database changes we update Q with ∂Q.ON +S(@X, @Y)

Q += AggSum([@Y], R(A,@Y) * {A})But recall that for any AGCA query Q, ∂Q is just an ordinary, simple, unexceptional AGCA query (no funny business is introduced by the ∂). If we can store ('materialize' to use the technical term) the results of Q, what's to stop us from storing the results of ∂Q? Nothing! Let's say we had another view materialized (let's call it M_S), this time with a group by variable:

M_S[Y] := AggSum([Y], R(A, Y) * {A})For the initial dataset above, this would contain

`___Y______#__`

< 1 > -> 1

< 2 > -> 3This view can help us substantially when we need to update Q after an insertion into S. Expressing this update as a trigger:

ON +S(@Y, @Z)

Q += M_S[@Y]In other words, to update Q, we just need to look up one row of M_S. The update can be done in constant time! That said, we now have an extra view that we need to maintain. Fortunately, M_S is simpler than Q, and has only one table, in this case R. Whenever R changes, we need to update M_S. Since M_S is defined in terms of a normal, ordinary AGCA expression, we update it in exactly the same way that we update Q, using the delta of M_S. For an insertion of <@X,@Y> into R, this would be:

∂M_S[Y] := AggSum([Y], R(A,Y) * {A})

:= AggSum([@X, @Y, Y], (A ^= {@X}) * (Y ^= {@Y}) * {A})

:= (Y ^= {@Y}) * {@X}Or, expressed as a trigger

ON +R(@X, @Y)

M_S[Y] += (Y ^= {@Y}) * {@X}This can be simplified a bit, since the update operation only produces one row

ON +R(@X, @Y)

M_S[@Y] += {@X}(also a constant time operation) Recursive Deltas What's happening here is that we're saving the results of Q and maintaining them with (several instantiations of) ∂Q. The key idea of the viewlet transform is that we can also save the results of ∂Q and maintain them with (several instantiations of) ∂(∂Q). This process repeats recursively, giving us ∂(∂(∂Q)), ∂(∂(∂(∂Q))), and so on. Every time we add another ∂, another table drops out, making it the delta query simpler. After enough repetitions, we end up with a query that doesn't depend on the database at all (e.g., ∂M_S above). At this point, we can stop, since the query can be evaluated in constant time. This is the viewlet transform.

**Start by materializing the original query, and then alternate computing its delta(s), and recursively materializing the delta(s)**. I've obscured the issue a bit by not subscripting my ∂s, remember that each delta is taken with respect to a particular event. Q has four deltas: for both insertion and deletion of both R and S (∂+R, ∂-R, ∂+S, ∂-S). Similarly, M_S has two deltas: for both the insertion and deletion of R and S. The viewlet transform of Q produces five views: one for Q, and one for each of the first-tier deltas (the second-tier deltas are all constants). As it turns out, we can be even more efficient than that! Furthermore, the procedure I describe above runs into problems with special tables (as a bit of a teaser for this, take a close look at the delta of Q with respect to insertions into R). This

*naive*viewlet transform is insufficient. Next week, I'll start discussing some changes to the naive viewlet transform that make it more practical and efficient.