CSE 662 Fall 2019 - Functional Data Structures

Sept. 3

Mutable vs Immutable Data


      X = [ "Alice", "Bob", "Carol", "Dave" ]
    
X : Alice Bob Carol Dave

      print(X[2])  // ->  "Carol"
    

      X[2] = "Eve"
    
X : Alice Bob Eve Dave

      print(X[2])  // -> "Eve"
    

      X = [ "Alice", "Bob", "Carol", "Dave" ]
    
X : Alice Bob Carol Dave
Thread 1:

    X[2] = "Eve"

Thread 2:

     print(X[2])  // -> "Carol" or "Eve"?

🤔

Mutable Data Structures

  • The programmer's intended ordering is unclear.
  • Atomicity/Correctness requires locking.
  • Versioning requires copying the data structure.
  • Cache coherency is expensive.

Can these problems be avoided?

Mutable vs Immutable Data


      X = [ "Alice", "Bob", "Carol", "Dave" ]
    
X : Alice Bob Carol Dave

      print(X[2])  // ->  "Carol"
    

      X[2] = "Eve"
    

Don't allow writes!

But what if we need to update the structure?

Idea 1: Copy

X : Alice Bob Carol Dave
X' : Alice Bob Eve Dave

Slooooooooooooooooooooooow!

Idea 2: Break it Down

Data is always added, not replaced!

Immutable Data Structures
(aka 'Functional' or 'Persistent' Data Structures)

  • Once an object is created it never changes.
  • The object persists until all pointers to it go away, at which point it is garbage collected.
  • Only the "root" pointer is ever allowed to change, to point to a new version.

Linked List Stacks
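
A minimal Python sketch of a linked-list stack built from immutable cells, assuming a stack is either `None` (empty) or a `Node`. The names `Node`, `push`, `pop`, and `to_list` are illustrative choices, not taken from the course materials. Pushing allocates one new cell that points at the old stack, so every older version stays valid and is shared rather than copied.

```python
from typing import Any, NamedTuple, Optional, Tuple


class Node(NamedTuple):
    """One immutable cell of a linked-list stack (hypothetical name)."""
    value: Any
    rest: Optional["Node"]  # None marks the empty stack


def push(stack: Optional[Node], value: Any) -> Node:
    # Allocate one new cell pointing at the old stack; nothing is copied.
    return Node(value, stack)


def pop(stack: Node) -> Tuple[Optional[Node], Any]:
    # Return (remaining stack, removed value); the old stack is untouched.
    return stack.rest, stack.value


def to_list(stack: Optional[Node]) -> list:
    # Helper for inspection only: walk the cells from top to bottom.
    out = []
    while stack is not None:
        out.append(stack.value)
        stack = stack.rest
    return out


s1 = push(push(None, "Alice"), "Bob")
s2 = push(s1, "Carol")  # one version built on s1
s3 = push(s1, "Eve")    # a second version, also built on s1
```

Here `s2` and `s3` are two different versions that both share the cells of `s1`; only the "root" variable a program holds decides which version it sees.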

Class Exercise 1

How would you implement:


       list update(list, index, new_value)
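
One possible answer, sketched in Python under the assumption that a list is a chain of immutable cells (`Node` and `from_values` are hypothetical helper names). The cells in front of `index` are rebuilt, and everything after `index` is shared with the original list, so the cost is proportional to `index` rather than to the full length.

```python
from typing import Any, NamedTuple, Optional


class Node(NamedTuple):
    """One immutable list cell (hypothetical name)."""
    value: Any
    rest: Optional["Node"]  # None marks the empty list


def from_values(values) -> Optional[Node]:
    # Build a linked list whose first cell holds values[0].
    lst = None
    for v in reversed(values):
        lst = Node(v, lst)
    return lst


def update(lst: Optional[Node], index: int, new_value: Any) -> Node:
    if lst is None:
        raise IndexError("index out of range")
    if index == 0:
        # New cell for the changed position; the tail is reused as-is.
        return Node(new_value, lst.rest)
    # Copy this cell and recurse; only cells before `index` are rebuilt.
    return Node(lst.value, update(lst.rest, index - 1, new_value))


x = from_values(["Alice", "Bob", "Carol", "Dave"])
x2 = update(x, 2, "Eve")
```

After the call, `x` still reads Alice, Bob, Carol, Dave, while `x2` reads Alice, Bob, Eve, Dave and shares the "Dave" cell with `x`.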
    

Class Exercise 2

Implement a set with:


         set init()
         boolean is_member(set, elem)
         set insert(set, elem)
    

Lazy Evaluation

Can we do better?

Putting off Work

x = "expensive()"    Fast  (Just saving a 'todo')
print(x)             Slow  (Performing the 'todo')
print(x)             Fast  ('todo' already done)
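
The quoted `"expensive()"` above appears to stand for a deferred computation. A minimal Python sketch of that idea, assuming a hand-rolled `Lazy` wrapper (a hypothetical name, not a standard library class): creating it only stores the 'todo', the first `force()` runs it, and later calls return the cached result.

```python
class Lazy:
    """A deferred computation that runs at most once (hypothetical helper)."""

    def __init__(self, compute):
        self._compute = compute  # the saved 'todo'
        self._done = False
        self._value = None

    def force(self):
        if not self._done:
            # First read: perform the 'todo' and remember the answer.
            self._value = self._compute()
            self._done = True
            self._compute = None  # drop the closure so it can be collected
        return self._value


calls = []  # records how many times the expensive work actually ran


def expensive():
    calls.append(1)
    return sum(range(1_000_000))


x = Lazy(expensive)  # fast: nothing has been computed yet
```

The cache is a mutation inside `Lazy`, but callers cannot observe it: `force()` returns the same value every time, so the wrapper still behaves like immutable data.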

Class Exercise 3

Make it better!

Putting off Work


      concatenate(a, b) { 
        a', front = pop(a)
        if a' is empty {
          return (front, b)
        } else {
          return (front, "concatenate(a', b)")
        }
      }
    

What is the time complexity of this concatenate?

What happens to reads?
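
A Python sketch of the lazy `concatenate` above, assuming a list is either `None` or a pair `(front, rest)` whose `rest` may be a deferred computation. `Lazy`, `force`, `from_values`, and `to_list` are hypothetical helpers. Unlike the pseudocode, this variant also accepts an empty `a` and always defers the recursive call instead of testing whether `a'` is empty.

```python
class Lazy:
    """A deferred computation that runs at most once (hypothetical helper)."""

    def __init__(self, compute):
        self._compute, self._done, self._value = compute, False, None

    def force(self):
        if not self._done:
            self._value, self._done = self._compute(), True
            self._compute = None
        return self._value


def force(x):
    # Plain values pass through; deferred ones are evaluated (once).
    return x.force() if isinstance(x, Lazy) else x


def from_values(values):
    lst = None
    for v in reversed(values):
        lst = (v, lst)
    return lst


def concatenate(a, b):
    a = force(a)
    if a is None:
        return b  # assumption: an empty `a` simply yields `b`
    front, rest = a
    # Constant work now; the rest of the concatenation is a saved 'todo'.
    return (front, Lazy(lambda: concatenate(rest, b)))


def to_list(lst):
    out = []
    lst = force(lst)
    while lst is not None:
        out.append(lst[0])
        lst = force(lst[1])  # each read may have to finish one 'todo'
    return out


a = from_values([1, 2])
b = from_values([3, 4])
c = concatenate(a, b)
```

In this sketch each call to `concatenate` does a constant amount of work up front, and each step of a read through the first list pays for one deferred call.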

Lazy Evaluation

Save work for later...

  • ... and avoid work that is never required.
  • ... to spread out work over multiple calls.
  • ... for better "amortized" costs.

Amortized Analysis

Allow operation A to 'pay it forward' for another operation B that hasn't happened yet.

... or allow an operation B to 'borrow' from another operation A that hasn't happened yet.

  • A's time complexity goes up by X.
  • B's time complexity goes down by X.

Example: Amortized Queues


      queue enqueue(queue, item) {
        return {
          current : queue.current, 
          todo : push(queue.todo, item)
        }
      }
    

What is the cost?

Example: Amortized Queues


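    queue dequeue(queue) {
      if(queue.current != NULL){
        // Common case: drop the front item of the current stack.
        return { current: pop(queue.current), todo: queue.todo }

      } else if(queue.todo != NULL) {
        // Current is empty: reverse todo into dequeue order, then drop its front item.
        return { current: pop(reverse(queue.todo)), todo: NULL }

      } else { 
        // Both stacks are empty: there is nothing to dequeue.
        return { current: NULL, todo: NULL }

      }
    }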
    

What is the cost?
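
A runnable Python sketch of the two-stack queue, under a few assumptions that differ from the pseudocode: `dequeue` returns the removed item together with the new queue, and it raises on an empty queue instead of returning an empty one. `Node`, `Queue`, `EMPTY`, and `reverse` are illustrative names.

```python
from typing import Any, NamedTuple, Optional, Tuple


class Node(NamedTuple):
    """One immutable stack cell (hypothetical name)."""
    value: Any
    rest: Optional["Node"]


class Queue(NamedTuple):
    current: Optional[Node]  # items ready to dequeue, oldest on top
    todo: Optional[Node]     # newly enqueued items, newest on top


EMPTY = Queue(None, None)


def reverse(stack: Optional[Node]) -> Optional[Node]:
    # O(N): rebuild the stack in the opposite order.
    out = None
    while stack is not None:
        out = Node(stack.value, out)
        stack = stack.rest
    return out


def enqueue(q: Queue, item: Any) -> Queue:
    # O(1): push onto the todo stack; `current` is shared, not copied.
    return Queue(q.current, Node(item, q.todo))


def dequeue(q: Queue) -> Tuple[Queue, Any]:
    if q.current is None:
        if q.todo is None:
            raise IndexError("dequeue from empty queue")
        # Occasional O(N) step: move everything from todo into current.
        q = Queue(reverse(q.todo), None)
    # O(1): pop the oldest item off the current stack.
    return Queue(q.current.rest, q.todo), q.current.value


q = EMPTY
for item in ["Alice", "Bob", "Carol"]:
    q = enqueue(q, item)
q2, first = dequeue(q)   # triggers the reverse
q3, second = dequeue(q2)  # plain pop
```

Most calls to `dequeue` are a single pop; the reverse runs only when `current` is empty, and each item is moved by at most one reverse as long as each queue version is dequeued from only once.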

Example: Amortized Analysis

enqueue(): Push onto todo stack
$O(1) + \text{create } 1 \text{ credit}$
dequeue(): Pop current OR Reverse todo
Either:
  • Pop current queue: $O(1)$
  • Reverse a todo stack of $N$ items ($O(N)$ actual work): $O(1) + \text{consume } N \text{ credits}$

Critical requirement of amortized analysis: Must ensure that every credit consumed was once created.
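
A worked version of the credit accounting, as a sketch that assumes each push, pop, or cell moved by a reverse counts as one step, and that every queue version is used only once (dequeuing twice from the same old version could run the same reverse twice and spend its credits twice):

```latex
\begin{align*}
\text{enqueue:}\quad
  & \underbrace{1}_{\text{push}} + \underbrace{1}_{\text{credit created}} = 2 = O(1) \\
\text{dequeue (pop current):}\quad
  & 1 = O(1) \\
\text{dequeue (reverse } N \text{ items, then pop):}\quad
  & \underbrace{N + 1}_{\text{actual steps}} - \underbrace{N}_{\text{credits consumed}} = 1 = O(1) \\
\text{any } n \text{ enqueues and } m \text{ dequeues:}\quad
  & \text{total actual steps} \le 2n + m
\end{align*}
```

The last line holds because every item on the todo stack carries the one credit created when it was enqueued, and that credit is consumed by the single reverse that moves the item, so no more than $n$ credits are ever spent.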