The Online Data Interactions Lab
https://odin.cse.buffalo.edu

Oliver Kennedy

What does The ODIn Lab do?

  • Data(bases): How do we represent data and knowledge. How do we make it possible for users to quickly and easily answer questions aout it?
  • Uncertainty: How do you reliably and efficiently answer questions about incomplete information? How should uncertainty be presented to users who may not be familiar with statistics? Is the user better served by a less accurate answer?
  • Data Structures: What is the most efficient way of storing data so it can be queried later? What happens when the workload changes? Can we dynamically construct data structures on the fly to respond to workloads?
  • (Programming) Languages: How do users phrase queries? What kind of properties of the language (or mode of interaction) can we exploit to answer a user's queries more quickly and accurately?

(azure.microsoft.com)

(timoelliott.com)

Cleaning Data

  • Schema Mismatches
  • Entity Duplication
  • Invalid/Missing Values
  • ... and much more

Cleaning Data is Hard!

  • Clean everything upfront -- can take days or weeks
  • Clean what you need -- easy to get disorganized
  • On-Demand Data Cleaning -- leverage automation

Mimir

  • Lenses that automatically clarify data by making guesses.
  • Lots of gimmickery for working with existing databases.
  • Interface layer for communicating the DB's uncertainty.

Demo

Teaching

  • Fall 2015 - CSE 662 - Database Languages and Runtimes
    • Programming Languages meets Databases
  • Spring 2016 - CSE 462 - Database Systems
    • Build your own database

The ODIn Lab - Projects

  • Mimir (with Jan Chomicki)
    • Ying Yang
    • Niccolo Meneghetti
    • Aringam Nandi
    • Vinayak Karuppasamy
  • Insider Threats (with Varun Chandola, Shambhu Upadhyay, Hung Ngo)
    • Ting Xie
    • Gokhan Kul
    • Duc Thanh Anh Luong
  • Pocket Data/ASTralDB (with Luke Ziarek, Geoff Challen)
    • Jerry Ajay

(if you see one of these people, ask them about their project!)