CSE-4/562 Database Systems (Spring 2019)
Data Management Systems (including Relational Databases, Non-Relational Databases, and NoSQL storage systems) form the basis of the Big Data Economy we now live in. A data management system is responsible for storing data, enabling efficient access to that data, and mediating concurrent modifications. This class approaches the challenges of designing a data management system from a standpoint that is both principled and practical. The course revolves around a term-long programming assignment, in which you will build a system that answers SQL queries efficiently. Course lectures will focus on the conceptual basis for this system and will discuss how the techniques you learn generalize (e.g., to the use of NoSQL systems).
In this course, you will learn...
- ... how to efficiently store and retrieve data programmatically.
- ... how to optimize big-data computations.
- ... how to use index structures to accelerate computations.
- ... how to safely and efficiently manipulate data concurrently.
- ... how to recover state after software and hardware failures.
- ... how to query and update distributed data consistently.
Course Details
- Class M/W/F 4:00-4:50 PM in Hoch 114
- CSE 462 Recitation W 5:00-5:50 PM in Cooke 248
Instructors:
- Oliver Kennedy: Capen 211 (inside the library); Weds: 10:00-12:00 (Starting Feb 6)
- TAs:
- Ninjas:
- William Spoth: Davis TA Area, Fridays 9:00-11:00 AM
- Darshana Balakrishnan: 212 Capen Hall, Thursdays 2:00-4:00 PM
- Course Discussions: Piazza
- Textbook:
- "Database Systems: The Complete Book" 2e. by Garcia-Molina, Ullman, and Widom
- Optional References:
- Software:
- Homework Submission: Autolab
- Project Submission: DuBStep
- Project Groups: 1-3 people
- Grading:
- 50% theory
- 10% Homeworks (Skip any 4 for any reason)
- 20% (or 15%) Midterm on Mar.13 (in class)
- 20% (or 25%) Comprehensive Final (see HUB for time/location)
- 50% projects
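As a rough illustration of how the weights above combine, the arithmetic can be sketched in Python. The function name `course_score` and the rule for choosing between the two exam weightings are assumptions made only for this example: the syllabus lists both weightings (20%/20% and 15%/25%) but does not say how one is selected, so the sketch simply takes whichever yields the higher score.

```python
def course_score(homework, midterm, final, projects):
    """Combine component scores (each a percentage from 0 to 100) into a course score.

    Hypothetical helper for illustration only; not an official grade calculator.
    """
    # Weighting A: 10% homework, 20% midterm, 20% final, 50% projects.
    score_a = 0.10 * homework + 0.20 * midterm + 0.20 * final + 0.50 * projects
    # Weighting B: 10% homework, 15% midterm, 25% final, 50% projects.
    score_b = 0.10 * homework + 0.15 * midterm + 0.25 * final + 0.50 * projects
    # ASSUMPTION: the better of the two weightings is used. The syllabus does
    # not state the selection rule, so confirm with the instructor.
    return max(score_a, score_b)
```

For example, with component scores of 80 (homework), 60 (midterm), 90 (final), and 85 (projects), the first weighting gives 80.5 and the second gives 82.0, so under the stated assumption the sketch returns about 82.0.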
Lecture Schedule
- Jan 28
- Jan 30: Class Cancelled (Snow)
- Feb 1: Checkpoint 0, JSQLParser, SQL Overview (slides | sql); Ch. 16.1, 2.3, 6.1-6.4
- Feb 4: Relational Algebra (slides); Ch. 2.4, 5.1
- Feb 6: Relational Algebra Equivalences (slides); Ch. 16.2
- Feb 8: Checkpoint 1 (slides); Ch. 15.1-15.3, 16.3; Due: Checkpoint 0
- Feb 11: Basic Algorithms (slides); Ch. 15.1-15.5, 16.7
- Feb 13: Extended Relational Algebra (slides); Ch. 5.2, 15.4; Due: Homework 1
- Feb 15: Physical Data Layout (slides); Ch. 13.1-13.7, 15.7, 16.7
- Feb 18: Indexing (Intro + Tree Indexes) (slides); Ch. 8.3-8.4, 14.1-14.2, 14.4
- Feb 20
- Feb 22: Indexing (Modern Indexes; Usage) and View Selection (slides); Ch. 8.1-8.2, 15.4-15.5, 15.8
- Feb 25: Cost-Based Optimization (slides); Ch. 16
- Feb 27: Cost-Based Optimization (slides); Ch. 16; Due: Homework 3
- Mar 1: Checkpoint 2 (slides); Ch. 15.1-15.5, 16.2-16.3, 16.7; Due: Checkpoint 1
- Mar 4
- Mar 6: Data Sketching (slides); Due: Homework 4
- Mar 8
- Mar 11
- Mar 13
- Mar 15
- Mar 18: Spring Break
- Mar 20: Spring Break
- Mar 22: Spring Break
- Mar 25: Materialized Views (slides); Ch. 8.5
- Mar 27: Theory of Transactions (slides); Ch. 18.1-18.2, 19.1
- Mar 29: Project Workshop
- Apr 1: Pessimistic Transaction Control (slides); Ch. 18.3-18.7, 19.2; Due: Checkpoint 2
- Apr 3: Optimistic Transaction Control (slides); Ch. 18.8-18.9; Due: Homework 5
- Apr 5: Optimistic Transaction Control (contd.); Ch. 18.8-18.9
- Apr 8
- Apr 10: Logging + ARIES Recovery (slides); Ch. 17.1-17.5; Due: Homework 6
- Apr 12: Distributed Query Processing (slides)
- Apr 15: Distributed Query Processing (contd.) (slides); Ch. 20.1-20.4
- Apr 17: Distributed Query Processing (contd.) (slides); Ch. 20.1-20.4; Due: Homework 7
- Apr 19: Distributed Data Updates (slides); Ch. 20.5-20.6
- Apr 22: Distributed Data Updates (contd.) (slides)
- Apr 24: Provenance / Datalog (slides); Due: Homework 8
- Apr 26: Provenance Semirings (slides)
- Apr 29: Implementing Provenance Queries (slides); Ch. 23.5
- May 1: Incomplete Databases (slides); Due: Homework 9
- May 3
- May 6: Probabilistic Databases (slides)
- May 8
- May 10: Demo Day
Academic Integrity
Students may discuss and advise one another on their lab projects, but groups are expected to turn in their own work. Discussing concepts is permitted. Referencing another group's code is not. Cheating on an exam or project submission will result in a grade of F in the course for all involved. It is the CSE department's policy not to provide financial support to any student disciplined for plagiarism. University policies on academic integrity can be reviewed at:
CSE Departmental Policy on Academic Integrity
UB's University-Wide Undergraduate Academic Integrity Policy
The Graduate School Policy Library
Medical Emergencies
Accommodations for medical emergencies will be made on a case-by-case basis. Requests for extensions based on medical emergencies must be accompanied by documentation of the emergency from student health services:
Student Health Services
Accessibility Resources
If you have a diagnosed disability (physical, learning, or psychological) that will make it difficult for you to carry out the course work as outlined, or that requires accommodations such as recruiting note-takers, readers, or extended time on exams or assignments, please advise the instructor during the first two weeks of the course so that we may review possible arrangements for reasonable accommodations. In addition, if you have not yet done so, contact:
The Office of Accessibility Resources.