The ODIn Lab - AGCA Summary

For the past month, I've been talking about AGCA, a language for incremental processing. Next week, I'll go into AGCA's primary application: The viewlet transform. But before I get to the transform, I'm going to do a quick overview of AGCA so that the basics of the language are all in one post.

I also lied a bit... I want to introduce one more (extremely simple) concept: a simplified form of an operation AGCA calls Lift.

Any real query language needs the ability to define tuples inline. In SQL, you might write this as:

(SELECT 1 AS A)

AGCA uses the Lift operation for this purpose:

(A ^= 1)

Think of the Lift operation as variable assignment in a programming language. It creates a single-column relation with a single row in it:

___A______#__

 < 1 > -> 1
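To make the table above concrete, here is a minimal Python sketch of Lift. The representation (a dict mapping each row, stored as a frozenset of (column, value) pairs, to its multiplicity) and the function name `lift` are assumptions made for illustration, not AGCA's actual implementation.

```python
def lift(var, value):
    """Sketch of (var ^= value): one row {var: value} with multiplicity 1."""
    row = frozenset({(var, value)})  # a row is a set of (column, value) pairs
    return {row: 1}                  # relation: row -> multiplicity


# (A ^= 1): a single-column, single-row relation
a = lift("A", 1)
```

The frozenset makes rows hashable and independent of column order, which is convenient when rows are used as dictionary keys.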

Lift can be combined with the Natural Join to construct arbitrarily wide single-row relations. For example:

(SELECT 1 AS A, 2 AS B, 3 AS C)

would be expressed in AGCA as

(A ^= 1) * (B ^= 2) * (C ^= 3)

Lift and Union combine similarly to create multi-row relations:

(A ^= 1) + (A ^= 2) + (A ^= 3)

We'll need the lift when we talk about incrementality. That said, let's get to the summary.

Relation (Table)

NAME(COL1, COL2, ...)

Represents the contents of a base relation (aka a Table). The output is a mapping from every distinct row of the relation to the tuple's multiplicity in the relation. If the same relation appears more than once in the same expression, each occurrence of the relation can have different column names.
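One way to picture a base relation is as a counter over rows. The sketch below is a hedged illustration only: the table `R_DATA`, the helper `relation`, and the frozenset-of-pairs row encoding are all hypothetical. It shows a duplicate row turning into a multiplicity of 2, and the same data being read under two different sets of column names.

```python
from collections import Counter


def relation(rows, columns):
    """Map each distinct row to its multiplicity, naming columns as requested."""
    out = Counter()
    for values in rows:
        if len(values) != len(columns):
            raise ValueError("row width does not match column list")
        out[frozenset(zip(columns, values))] += 1
    return dict(out)


# Hypothetical contents of a table R; the row (1, 2) appears twice.
R_DATA = [(1, 2), (1, 2), (3, 4)]

r_ab = relation(R_DATA, ["A", "B"])  # R(A, B)
r_bc = relation(R_DATA, ["B", "C"])  # R(B, C): same data, different column names
```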

Natural Join

A * B

The natural join of the relations defined by expressions A and B. Every row in the output of A is matched with every row in the output of B that has the same values for columns with the same name. If there are no columns with the same name, this is effectively the Cartesian cross-product. For every row in A matched with a row in B, a single row is output, with a multiplicity equal to the product of the multiplicities of the two matched rows.
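The matching rule can be sketched in Python as a nested loop. This is an illustrative sketch, not AGCA's evaluation strategy; the names `natural_join` and `row`, the sample data, and the dict-of-frozensets encoding of relations are assumptions.

```python
def row(**cols):
    """Build a hashable row from column=value keyword arguments."""
    return frozenset(cols.items())


def natural_join(a, b):
    """Join rows that agree on all shared columns; multiply multiplicities."""
    out = {}
    for row_a, mult_a in a.items():
        vals_a = dict(row_a)
        for row_b, mult_b in b.items():
            vals_b = dict(row_b)
            shared = vals_a.keys() & vals_b.keys()
            if all(vals_a[c] == vals_b[c] for c in shared):
                merged = frozenset({**vals_a, **vals_b}.items())
                out[merged] = out.get(merged, 0) + mult_a * mult_b
    return out


# Hypothetical R(A, B) and S(B, C): only the rows with B = 2 match.
r = {row(A=1, B=2): 2, row(A=3, B=4): 1}
s = {row(B=2, C=5): 3, row(B=9, C=6): 1}
joined = natural_join(r, s)

# No shared columns: behaves like a cross-product, as in (A ^= 1) * (B ^= 2).
wide = natural_join({row(A=1): 1}, {row(B=2): 1})
```

Here the one matching pair has multiplicities 2 and 3, so the joined row carries multiplicity 6.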

Bag Union

A + B

The bag union of the relations defined by expressions A and B. In general, A and B should have the same schemas (although AGCA does support the case where they don't). Every row of the output has a multiplicity equal to the sum of the row's multiplicities in A and B.
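As a rough Python sketch (with the hypothetical names `bag_union` and `row`, and relations encoded as dicts from rows to multiplicities), bag union just adds multiplicities row by row:

```python
def row(**cols):
    """Build a hashable row from column=value keyword arguments."""
    return frozenset(cols.items())


def bag_union(a, b):
    """Add the multiplicities of each row across both inputs."""
    out = dict(a)
    for r, mult in b.items():
        out[r] = out.get(r, 0) + mult
    return out


# (A ^= 1) + (A ^= 2) + (A ^= 3): three distinct rows, each with multiplicity 1
three_rows = bag_union(bag_union({row(A=1): 1}, {row(A=2): 1}), {row(A=3): 1})

# The same row on both sides: multiplicities add up.
doubled = bag_union({row(A=1): 1}, {row(A=1): 1})
```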

Sum/Count Aggregate & Projection

AggSum([col1, col2, ...], A)

The Sum aggregate (grouping by col1, col2, ...) of A. This is equivalent in AGCA to projecting away everything except for col1, col2, ... The output rows have schema col1, col2, ..., and any given row in the output has a multiplicity equal to the sum of the multiplicities of all rows that got projected down to that output row.

Value Expression

A * { f(var1, var2, ...) }

When applied to an expression A by the natural join, multiplies each row's multiplicity by an arbitrary single-valued function f over columns var1, var2, ... of A.
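A minimal sketch of this scaling step, assuming relations are dicts from rows to multiplicities and that `f` receives a row's columns as a plain dict (both are illustrative choices, as are the names `apply_value` and `row`):

```python
def row(**cols):
    """Build a hashable row from column=value keyword arguments."""
    return frozenset(cols.items())


def apply_value(a, f):
    """Sketch of A * { f(...) }: scale each row's multiplicity by f(row)."""
    return {r: mult * f(dict(r)) for r, mult in a.items()}


# Hypothetical R(A, B)
r = {row(A=1, B=10): 2, row(A=2, B=3): 1}

scaled = apply_value(r, lambda v: v["B"])  # R(A, B) * { B }
sum_of_b = sum(scaled.values())            # adds B once per copy of each row
```

Summing the scaled multiplicities gives the same number as adding up B over every copy of every row in the bag, which is why a value expression followed by a sum aggregate behaves like SQL's SUM.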

Comparison Predicate

A * { f(var1, var2, ...) θ g(var1, var2, ...) }

When applied to an expression A by the natural join, filters out rows that do not satisfy the predicate (f θ g) where θ is a comparison operation, and f and g are arbitrary single-valued functions over columns var1, var2, ... of A.
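The filtering behavior can be sketched in Python like this. The names `apply_predicate` and `row` are hypothetical, θ is passed in as a function from the standard `operator` module, and relations are dicts from rows to multiplicities.

```python
import operator


def row(**cols):
    """Build a hashable row from column=value keyword arguments."""
    return frozenset(cols.items())


def apply_predicate(a, f, theta, g):
    """Sketch of A * { f θ g }: keep only rows where theta(f(row), g(row)) holds."""
    return {
        r: mult
        for r, mult in a.items()
        if theta(f(dict(r)), g(dict(r)))
    }


# Hypothetical R(A, B)
r = {row(A=1, B=2): 2, row(A=5, B=3): 1}

# R(A, B) * { A < B }
kept = apply_predicate(r, lambda v: v["A"], operator.lt, lambda v: v["B"])
```

Rows that pass the predicate keep their original multiplicity, and rows that fail are dropped entirely.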

Lift

(var ^= value)

Outputs a single row with column named var, with the indicated value, and a multiplicity of 1.