CSE 662 Fall 2019 - $Bloom^L$

Sept. 10

How is distributed consistency enforced?

  1. Don't allow the program to get into an inconsistent state.
  2. Detect inconsistencies and fix them after the fact.
  3. Eventually converge to a consistent state.

The CALM Principle

A monotonic program converges to a consistent state naturally, without any coordination.

Monotonicity

"Once you learn a fact, it never becomes false" (although you might never learn all available facts)

Computation under monotonicity

  1. What facts do you know right now?
  2. What facts can you compute given what you know?
  3. Broadcast/react to newly discovered facts

What causes consistency violations?

A computation step needs a complete input before it can produce a complete output

The output is incorrect if...

  • The input is incomplete, and...
  • An incomplete output is not correct.

  • Avoid incomplete inputs
    • Block until you know all inputs are ready (Point of Order)
  • Avoid computations where incomplete outputs are incorrect
    • Monotonic programs never produce incorrect outputs, just incomplete ones.

What isn't monotonic?

Negation

$$R = \{A, B, C\}; S = \{C, D\}$$

Let's say that $T = R - S$

You know two facts about $T$: $A \in T$ and $B \in T$

If you ever learn that $A \in S$,
the "fact" that $A \in T$ becomes false
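The negation example can be replayed in a few lines of plain Ruby. This is only a sketch using the standard `Set` class; the helper name `difference` is made up for illustration and is not part of Bloom.

```ruby
require 'set'

# T = R - S, recomputed from whatever facts are known right now.
def difference(r, s)
  r - s
end

r = Set['A', 'B', 'C']
s = Set['C', 'D']

t_before = difference(r, s)          # A and B
t_after  = difference(r, s | ['A'])  # S grew by one fact, so T lost A

# A monotonic result would only ever grow; this one shrank.
puts t_before.subset?(t_after)       # prints "false"
```

The input `S` only gained a fact, yet the output lost one, which is exactly what makes set difference non-monotonic.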

Aggregation

$$R = \{1, 2, 3\}$$

Let's say that $T = \sum_{i \in R} i$

You know several facts about $T$ including: $T = 6$

If you ever learn that $4 \in R$,
the "fact" that $T = 6$ becomes false
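The same effect shows up for the sum. A minimal sketch in plain Ruby, where `total` is a hypothetical helper name:

```ruby
# T = sum of every element of R known so far.
def total(r)
  r.sum
end

known = [1, 2, 3]
puts total(known)          # prints 6

# Learning that 4 is in R replaces the old answer instead of extending it.
puts total(known + [4])    # prints 10
```

Read as an exact value, `T = 6` is retracted by the new element. Nothing is added alongside it.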

What is monotonic?

Sets!

Datalog

Atoms: $Parent(A, B)$
($A$ is a $Parent$ of $B$)

Rules: $Ancestor(A, B)$ :- $Parent(A, B)$
(If $A$ is a $Parent$ of $B$, then $A$ is also an $Ancestor$ of $B$)

$Ancestor(A, B)$ :- $Parent(A, B)$

$Ancestor(A, C)$ :- $Parent(A, B), Ancestor(B, C)$

($A$ is an $Ancestor$ of $C$ if $A$ is a $Parent$ of $B$ and $B$ is an $Ancestor$ of $C$)


$Ancestor$ computes the transitive closure of $Parent$
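One way to see the two rules at work is a naive bottom-up evaluation: keep applying the rules to the facts known so far until no new fact appears. The sketch below does this in plain Ruby. The method name `ancestors` and the sample family tree are assumptions made for illustration, and real Datalog engines evaluate far more efficiently.

```ruby
require 'set'

# Naive bottom-up evaluation of:
#   Ancestor(A, B) :- Parent(A, B)
#   Ancestor(A, C) :- Parent(A, B), Ancestor(B, C)
# `parent` is a collection of [a, b] pairs; returns a Set of [a, c] pairs.
def ancestors(parent)
  ancestor = Set.new(parent)                 # first rule
  loop do
    derived = Set.new
    parent.each do |a, b|                    # second rule: join on B
      ancestor.each { |b2, c| derived << [a, c] if b == b2 }
    end
    break if derived.subset?(ancestor)       # fixpoint: nothing new
    ancestor.merge(derived)                  # facts are only ever added
  end
  ancestor
end

# Hypothetical family tree used only for illustration.
parent = [%w[alice bob], %w[bob carol], %w[carol dave]]
ancestors(parent).sort.each { |a, c| puts "Ancestor(#{a}, #{c})" }
```

The loop never deletes from `ancestor`, which is the monotonicity property in miniature.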


No fact (atom) that you can ever learn will invalidate a fact that you've already learned.

Datalog is Monotonic

(unless you need negation or aggregation)

Bloom


Datalog with timesteps and asynchronous events

| Symbol | Meaning |
|--------|---------|
| <= | Add a new fact right now |
| <+ | Add a new fact in the next timestep |
| <- | Remove a fact in the next timestep |
| <~ | Send a fact to another node |

Example: Shortest Paths (Dijkstra's)

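require 'bud'

class ShortestPaths
    include Bud

    state {
        # Persistent edges, keyed by their endpoints
        table :link, [:from, :to] => [:cost]
        # Scratch collections are recomputed every timestep
        scratch :path, [:from, :to, :next_hop, :cost]
        scratch :min_cost, [:from, :to] => [:cost]
    }

    bloom {
        # Base case: every link is a one-hop path
        path <= link {|l| [l.from, l.to, l.to, l.cost]}
        # Recursive case: extend a known path by one link
        path <= (link*path).pairs(:to => :from) { |l,p|
          [l.from, p.to, l.to, l.cost + p.cost]
        }
        # Cheapest known path for each (from, to) pair
        min_cost <= path.group([:from, :to], min(:cost))
    }
end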

Optimization


  path <= (link*path).pairs(:to => :from) { |l,p|
    [l.from, p.to, l.to, l.cost + p.cost]
  }
		

Compute only new facts: This step can be performed incrementally as path entries are added
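The idea of computing only new facts is usually called semi-naive evaluation. The sketch below shows it in plain Ruby rather than Bloom: each round joins `link` only against the paths found in the previous round. The method name `all_paths` and the sample graph are assumptions, and the sketch assumes an acyclic graph.

```ruby
require 'set'

# Semi-naive evaluation of the recursive path rule: each round joins `link`
# only against the paths discovered in the previous round (the delta).
# Tuples follow the slide: link = [from, to, cost],
# path = [from, to, next_hop, cost]. Assumes an acyclic graph; with cycles,
# costs grow without bound and the loop would not terminate.
def all_paths(link)
  path  = Set.new(link.map { |from, to, cost| [from, to, to, cost] })
  delta = path.dup
  until delta.empty?
    derived = Set.new
    link.each do |l_from, l_to, l_cost|
      delta.each do |p_from, p_to, _hop, p_cost|
        derived << [l_from, p_to, l_to, l_cost + p_cost] if l_to == p_from
      end
    end
    delta = derived - path     # keep only genuinely new facts
    path.merge(delta)
  end
  path
end

# Hypothetical three-node graph used only for illustration.
link = [['a', 'b', 1], ['b', 'c', 2], ['a', 'c', 5]]
all_paths(link).each { |tuple| p tuple }
```

Because the join is monotonic, old `path` entries never need to be revisited. Only the delta can produce something new.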

Example: Quorum Voting


class QuorumVote 
	include Bud
	state { 
		channel :vote_chn, [:@addr, :voter_id]
		channel :result_chn, [:@addr]
		table :votes, [:voter_id]
		scratch :cnt, [] => [:cnt]
	}

	bloom {
		# RET_ADDR and QUORUM_SIZE are constants assumed to be defined elsewhere
		votes <= vote_chn {|v| [v.voter_id]}
		cnt <= votes.group(nil, count(:voter_id))
		result_chn <~ cnt {|c| [RET_ADDR] if c.cnt >= QUORUM_SIZE}
	}
end

Problem!


	cnt <= votes.group(nil, count(:voter_id))
	result_chn <~ cnt {|c| [RET_ADDR] if c.cnt >= QUORUM_SIZE}
		

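cnt isn't monotonic as a set of facts: each new vote retracts the old count tuple and replaces it with a larger one, so Bloom can't safely use cnt until all votes are present and available!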

(Positive) Set operations are monotonic

... but not the only thing that's monotonic

  • Growing Integers
  • Facts that become true
  • Many aggregates (MAX, MIN, COUNT)
  • Vector Clocks

There's a term for these... bounded join semilattices

Bounded Join Semilattice

$$\langle S, \sqcup, \bot \rangle$$

$S$: A set (Integers, Boolean values, Sets of facts)

$\sqcup : S \times S \rightarrow S$: A 'merge' operation for elements of $S$

$\bot \in S$: A 'starting' element of $S$

"Merge"

(Least upper bound)

  • Associative: $(a \sqcup b) \sqcup c = a \sqcup (b \sqcup c)$
  • Commutative: $a \sqcup b = b \sqcup a$
  • Idempotent: $a \sqcup a = a$

Defines a partial order: $a \leq b$ if $a \sqcup b = b$
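The three laws and the induced order can be checked mechanically on a small example. The sketch below hand-rolls a max lattice over numbers in plain Ruby. `MaxInt`, `leq?`, and `BOTTOM` are names invented for this illustration and are not Bud's own lattice classes.

```ruby
# A hand-rolled max lattice over numbers (not Bud's own MaxLattice class).
MaxInt = Struct.new(:value) do
  # Least upper bound (join) of two elements.
  def merge(other)
    MaxInt.new([value, other.value].max)
  end

  # a <= b in the induced partial order exactly when join(a, b) == b.
  def leq?(other)
    merge(other) == other
  end
end

BOTTOM = MaxInt.new(-Float::INFINITY)

a, b, c = MaxInt.new(6), MaxInt.new(1), MaxInt.new(7)

puts a.merge(b).merge(c) == a.merge(b.merge(c))  # associative
puts a.merge(b) == b.merge(a)                    # commutative
puts a.merge(a) == a                             # idempotent
puts BOTTOM.merge(a) == a                        # bottom is the identity
puts b.leq?(a)                                   # 1 <= 6 because join(1, 6) == 6
```

Every line prints `true` for these sample values. That illustrates the laws but does not prove them.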

Examples

$$\langle \mathbb R \cup \{-\infty\}, MAX, -\infty \rangle$$
$$\langle \mathbb R \cup \{+\infty\}, MIN, +\infty \rangle$$
$$\langle \mathbb B, \wedge, T \rangle$$
$$\langle \mathbb B, \vee, F \rangle$$
$$\langle sets\ of\ \mathbb R, \cup, \emptyset \rangle$$

New notion of 'Fact': How 'far' up the lattice's partial order you are.

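| Event | Alice | Bob | Carol | Dave |
|-------|-------|-----|-------|------|
| Initial State | 6 | 1 | 7 | 10 |
| $Alice \sqcup Bob$ | 6 | 6 | 7 | 10 |
| $Carol \sqcup Dave$ | 6 | 6 | 10 | 10 |
| $Alice \sqcup Dave$ | 10 | 6 | 10 | 10 |
| $Bob \sqcup Carol$ | 10 | 10 | 10 | 10 |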

The lmax lattice always goes up
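The sequence of merges in the table can be replayed with a short plain-Ruby sketch. The helper name `gossip` is hypothetical. It models one pairwise exchange in which both nodes end up holding the larger of their two values.

```ruby
# Pairwise gossip over the max lattice: after a merge, both nodes hold
# the larger of their two values.
def gossip(state, x, y)
  merged = [state[x], state[y]].max
  state.merge(x => merged, y => merged)
end

state = { alice: 6, bob: 1, carol: 7, dave: 10 }
[[:alice, :bob], [:carol, :dave], [:alice, :dave], [:bob, :carol]].each do |x, y|
  state = gossip(state, x, y)
  puts "merge(#{x}, #{y}): #{state.values.join(' ')}"
end
```

No node's value ever decreases, and after the four exchanges every node holds 10.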

Programs don't rely exclusively on one type!

We need mappings between different lattice types

Monotone Functions

$$f : S \rightarrow T$$

For any monotone $f$,
whenever $a \leq_S b$ then $f(a) \leq_T f(b)$


Monotone functions preserve partial orders

Example Monotone Functions

  • sizeof : $set \rightarrow \mathbb N$
  • $\sum$ : $set\ of\ \mathbb R^+ \rightarrow \mathbb R^+$
  • $\cap$ : $set \times set \rightarrow set$
  • $>(\mathbb R)$ : $lmax \rightarrow \mathbb B$
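Chaining two of these functions gives a monotone version of the quorum test: the size of a growing vote set feeds a threshold check. The plain-Ruby sketch below assumes a hypothetical `QUORUM` of 3 and uses `>=` rather than `>` for the threshold. The method names are made up for illustration.

```ruby
require 'set'

QUORUM = 3  # hypothetical threshold for this sketch

# sizeof : set -> lmax. A bigger vote set never has a smaller size.
def vote_count(votes)
  votes.size
end

# >=(QUORUM) : lmax -> bool. Once true, a larger count keeps it true.
def quorum?(count)
  count >= QUORUM
end

votes = Set.new
%w[v1 v2 v2 v3 v4].each do |voter|   # duplicate delivery of v2 is harmless
  votes << voter
  puts "#{voter}: count=#{vote_count(votes)} quorum=#{quorum?(vote_count(votes))}"
end
```

The printed `quorum` value flips from false to true once and never flips back, whatever order the votes arrive in.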

If all computations in a program are monotone functions, the program is naturally eventually consistent

... but we can do better

Morphism

$$f : S \rightarrow T$$

$f$ is a morphism if
$f$ is monotone and $f(a \sqcup b) = f(a) \sqcup f(b)$
($f$ commutes with $\sqcup$)


Morphisms are decomposable

Example Morphisms

  • $\cap$ : $set \times set \rightarrow set$
  • $>(\mathbb R)$ : $lmax \rightarrow \mathbb B$
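The morphism condition can be spot-checked on sample inputs. In the plain-Ruby sketch below, `distributes?` is a hypothetical helper that tests $f(a \sqcup b) = f(a) \sqcup f(b)$ for one pair of values. It is evidence for or against the property, not a proof. The fixed set `allowed` is also an assumption.

```ruby
require 'set'

# Checks f(join(a, b)) == join(f(a), f(b)) for one pair of inputs. Passing
# for a sample pair is evidence, not a proof, that f is a morphism.
def distributes?(f, join_in, join_out, a, b)
  f.call(join_in.call(a, b)) == join_out.call(f.call(a), f.call(b))
end

union   = ->(x, y) { x | y }        # merge for the set lattice
max     = ->(x, y) { [x, y].max }   # merge for lmax
bool_or = ->(x, y) { x || y }       # merge for the boolean (OR) lattice

allowed   = Set[2, 3, 4]            # hypothetical fixed operand
intersect = ->(s) { s & allowed }
over_five = ->(n) { n > 5 }
size      = ->(s) { s.size }

a = Set[1, 2]
b = Set[2, 3]

puts distributes?(intersect, union, union, a, b)    # true
puts distributes?(over_five, max, bool_or, 4, 9)    # true
puts distributes?(size, union, max, a, b)           # false: 3 vs 2
```

The last line shows why `sizeof` is listed as monotone but not as a morphism: the size of a union is not the maximum of the two sizes.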

	path <= link {|l| [l.from, l.to, l.to, l.cost]}
	path <= (link*path).pairs(:to => :from) { |l,p|
		[l.from, p.to, l.to, l.cost + p.cost]
	}
	min_cost <= path.group([:from, :to]) { |group| 
		group.project(:cost).min 
   	}
			
Morphisms (using bags & lmin)

  • (link * path).pairs
  • +
  • project
  • group
  • min

Incremental Computation

We need to update an input with new data and compute: $$f(old \sqcup new)$$

We (probably) already have $f(old)$.

Insight: Computing $f(old) \sqcup f(new)$ is probably cheaper.

... but is only correct if $f$ is a morphism
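A minimal sketch of this insight in plain Ruby, for a morphism on sets where merge is union. The class name `IncrementalView` and the even-number filter are assumptions chosen for illustration.

```ruby
require 'set'

# Incrementally maintains f(input) for a morphism f on sets (merge = union):
# each batch applies f to the delta only and merges it into the stored result.
class IncrementalView
  attr_reader :result

  def initialize(&f)
    @f = f
    @result = Set.new       # assumes f maps the empty set to the empty set
  end

  def add(delta)
    @result |= @f.call(delta)   # join(f(old), f(new)), old input is not re-read
    self
  end
end

# Hypothetical filter: keep only even numbers (distributes over union).
evens = IncrementalView.new { |s| s.select(&:even?).to_set }
evens.add(Set[1, 2, 3]).add(Set[4, 5])

full_recompute = (Set[1, 2, 3] | Set[4, 5]).select(&:even?).to_set
puts evens.result == full_recompute   # true
```

Swapping in a function that is not a morphism, such as set size, would make the incremental result diverge from the full recomputation.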

Example: Set Lattice

class Bud::SetLattice < Bud::Lattice 
	wrapper_name :lset
	def initialize(x=[])
		@v = x.uniq # Remove duplicates from input
	end
	def merge(i)
		self.class.new(@v | i.reveal)
	end
	morph :intersect do |i| 
		self.class.new(@v & i.reveal)
	end
	morph :contains? do |i| 
		Bud::BoolLattice.new(@v.member? i)
	end
	monotone :size do 
		Bud::MaxLattice.new(@v.size)
	end 
end

Example: A Key Value Store

class KvsReplica 
	include Bud
	include KvsProtocol
	state { lmap :kv_store }
	bloom do
		# Fulfil any put requests
		kv_store <= kvput {|c| {c.key => c.val}}
		# Acknowledge any put requests
		kvput_resp <~ kvput {|c| 
			[ c.reqid, c.client_addr, ip_port ]} 
		# Respond to any get requests
		kvget_resp <~ kvget {|c| 
			[ c.reqid, c.client_addr,
			  kv_store.at(c.key), ip_port ]}
	end 
end
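What makes the replica converge is that merging two maps merges their values key by key. The plain-Ruby sketch below imitates that behaviour with integers under max as a stand-in value lattice. `merge_maps` and the sample requests are hypothetical and are not Bud's `lmap` implementation.

```ruby
# A map lattice in plain Ruby: merging two maps merges values key by key.
# Values here are integers under max, a stand-in for whatever value lattice
# a real store would use.
def merge_maps(left, right)
  left.merge(right) { |_key, a, b| [a, b].max }
end

# Hypothetical put requests arriving at two replicas in different orders.
requests = [{ 'x' => 1 }, { 'y' => 4 }, { 'x' => 3 }]

replica_a = requests.reduce({}) { |store, put| merge_maps(store, put) }
replica_b = requests.reverse.reduce({}) { |store, put| merge_maps(store, put) }

puts replica_a == replica_b      # true: both hold x => 3, y => 4
```

Because the merge is associative, commutative, and idempotent, delivery order and duplicates do not affect the final store.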