Those Marvelous Lifts and Exists (Part 3)

We've been talking for the last few weeks about how the Lift operator can be used to express nested subqueries.  This gives AGCA nearly the full power of (non-recursive)SQL.  

That said, there's one thing that AGCA can't do with what I've said before: existential quantification.  For example, consider the query:

SELECT COUNT(*) FROM R WHERE EXISTS (SELECT * FROM S WHERE R.A = S.A)

Now, if you stare at this query long enough you might think up with the following potential encoding:

AggSum([], R(A) * S(A))

In a way, this makes sense.  You get the count of R(A), but only if there's a matching S(A).  Unfortunately, this encoding isn't correct.  Remember, we're dealing with bags, not sets.  What if S looks like:

_<_A_>____#_
 < 1 > -> 2

And R looks like

_<_A_>____#_
 < 1 > -> 1

That is, there are two copies of the tuple <1> in S, and our query result will be 2 (instead of 1).  We're not looking for the specific number of tuples in S (that match a particular pattern), we're just looking to test whether there are ANY tuples in S (that match a particular pattern).  

We need an operator that can act as (not quite, but something sort of like) a step function.  Inside the operator is a nested query (just like Lift and AggSum).  If the nested query evaluates to 0, the operator evaluates to 0.  If the nested query evaluates to something other than 0, the operator evaluates to 1, regardless of what precisely the nested expression evaluates to.

This operation is actually something that you can't do with AGCA as I've described it up to this point (try it yourself if you don't believe me).  Thus, we have the Exists operator (which I've just described), and we can express the example query as:

AggSum([], R(A) * Exists(S(A)))

What about deltas?  Well, it turns out the exists operator is actually quite close to the lift operator.  The exists operator doesn't introduce any new columns into the schema (like the lift), but it is a non-linear operation (unlike AggSum).  Furthermore, unlike the only other non-linear operation (comparison), it can have a non-zero delta.  So, we get:

∂Exists(Q) = Exists(Q+∂Q) - Exists(Q)

Again, the delta has the original query in it, but this can be addressed using the same materialization tricks we talked about last week.

And that's it.  That's all there is to AGCA!


This page last updated 2019-06-28 15:47:51 -0400