CSE-4/562 Database Systems (Spring 2019)
Data Management Systems (including Relational Databases, Non-Relational Databases, and NoSQL storage systems) form the basis of the Big Data Economy we now live in. A data management system is responsible for storing data, enabling efficient access to that data, and mediating concurrent modifications. This class approaches the challenges of designing a data management system from a standpoint that is both principled and practical. The course revolves around a term-long programming assignment, in which you will build a system that answers SQL queries efficiently. Course lectures will focus on the conceptual basis for this system, and will discuss how the techniques you learn generalize (e.g., to the use of NoSQL systems).
In this course, you will learn...
- ... how to efficiently store and retrieve data programmatically.
- ... how to optimize big-data computations.
- ... how to use index structures to accelerate computations.
- ... how to safely and efficiently manipulate data concurrently.
- ... how to recover state after software and hardware failures.
- ... how to query and update distributed data consistently.
Course Details
- Class M/W/F 4:00-4:50 PM in Hoch 114
- CSE 462 Recitation W 5:00-5:50 PM in Cooke 248
Instructors:
- Oliver Kennedy: Capen 211 (inside the library); Weds: 10:00-12:00 (Starting Feb 6)
- TAs:
- Ninjas:
- William Spoth: Davis TA Area, Fridays 9:00-11:00 AM
- Darshana Balakrishnan: 212 Capen Hall, Thursdays 2:00-4:00 PM
- Carl Nuessle (Availability TBD)
- Course Discussions: Piazza
- Textbook:
- "Database Systems: The Complete Book" 2e. by Garcia-Molina, Ullman, and Widom
- Optional References:
- Software:
- Homework Submission: Autolab
- Project Submission: DuBStep
- Project Groups: 1-3 people
- Grading:
- 50% theory
- 10% Homework (skip any 4 for any reason)
- 20% (or 15%) Midterm on Mar. 13 (in class)
- 20% (or 25%) Comprehensive Final (see HUB for time/location)
- 50% projects
- 5% Checkpoint 0 due on Feb. 8
- 10% Checkpoint 1 due on Feb. 25
- 10% Checkpoint 2 due on Mar. 11
- 15% Checkpoint 3 due on Apr. 8
- 10% Checkpoint 4 due on May 6
Lecture Schedule
- Jan 28
- Jan 30: Class Cancelled: Snow
- Feb 1: Checkpoint 0, JSQLParser, SQL Overview (slides | sql); Ch. 16.1, 2.3, 6.1-6.4
- Feb 4: Relational Algebra (slides); Ch. 2.4, 5.1
- Feb 6: Relational Algebra Equivalences (slides); Ch. 16.2
- Feb 8 (Due: Checkpoint 0): Checkpoint 1 (slides); Ch. 15.1-15.3, 16.3
- Feb 11: Basic Algorithms (slides); Ch. 15.1-15.5, 16.7
- Feb 13 (Due: Homework 1): Extended Relational Algebra (slides); Ch. 5.2, 15.4
- Feb 15: Physical Data Layout (slides); Ch. 13.1-13.7, 15.7, 16.7
- Feb 18: Indexing (Intro + Tree Indexes) (slides); Ch. 8.3-8.4, 14.1-14.2, 14.4
- Feb 20
- Feb 22: External (2-Pass) Algorithms; Ch. 15.4-15.5, 15.8
- Feb 25 (Due: Checkpoint 1): Checkpoint 2; Ch. 15.1-15.5, 16.2-16.3, 16.7
- Feb 27 (Due: Homework 3): External (2-Pass) Algorithms (contd.)
- Mar 1: Query Optimization; Ch. 16.2, 16.6, 16.7
- Mar 4: Database Statistics 1/2; Ch. 16.4-16.5
- Mar 6 (Due: Homework 4): Database Statistics 2/2; Ch. 16.4-16.5
- Mar 8
- Mar 11 (Due: Checkpoint 2): Midterm Review
- Mar 13: Midterm
- Mar 15
- Mar 18: Spring Break
- Mar 20: Spring Break
- Mar 22: Spring Break
- Mar 25: Materialized Views; Ch. 8.5
- Mar 27 (Due: Homework 5): SQL DDL and Constraints; Ch. 7.1-7.4
- Mar 29: Managing Updates
- Apr 1: Theory of Transactions; Ch. 18.1-18.2, 19.1
- Apr 3 (Due: Homework 6): Pessimistic Transaction Control; Ch. 18.3-18.7, 19.2
- Apr 5: Pessimistic Transaction Control (contd.); Ch. 18.3-18.7, 19.2
- Apr 8 (Due: Checkpoint 3): Checkpoint 4; Ch. 15.7, 16.4-16.6
- Apr 10 (Due: Homework 7): Optimistic Transaction Control; Ch. 18.8-18.9
- Apr 12: Logging + ARIES Recovery; Ch. 17.1-17.5
- Apr 15: Logging + ARIES Recovery (contd.); Ch. 17.1-17.5
- Apr 17 (Due: Homework 8): Distributed Algorithms; Ch. 20.1-20.4
- Apr 19: Distributed Algorithms (contd.); Ch. 20.1-20.4
- Apr 22: Distributed Updates; Ch. 20.5-20.6
- Apr 24 (Due: Homework 9): Distributed Updates (contd.); Ch. 20.5-20.6
- Apr 26
- Apr 29: Approximate Query Processing; Ch. 23.5
- May 1 (Due: Homework 10): Provenance Queries
- May 3: Provenance for Uncertainty
- May 6 (Due: Homework 11): Buffer Day
- May 8: Final Exam Review 1
- May 10: Final Exam Review 2
Academic Integrity
Students may discuss and advise one another on their lab projects, but groups are expected to turn in their own work. Discussing concepts is permitted. Referencing another group's code is not. Cheating on an exam or project submission will result in a grade of F in the course for all involved. It is the CSE department's policy not to provide financial support to any student disciplined for plagiarism. University policies on academic integrity can be reviewed at:
CSE Departmental Policy on Academic Integrity
UB's University-Wide Undergraduate Academic Integrity Policy
The Graduate School Policy Library
Medical Emergencies
Accommodations for medical emergencies will be made on a case-by-case basis. Requests for extensions based on medical emergencies must be accompanied by documentation of the emergency from student health services:
Student Health Services
Accessibility Resources
If you have a diagnosed disability (physical, learning, or psychological) that will make it difficult for you to carry out the course work as outlined, or that requires accommodations such as recruiting note-takers, readers, or extended time on exams or assignments, please advise the instructor during the first two weeks of the course so that we may review possible arrangements for reasonable accommodations. In addition, if you have not yet done so, contact:
The Office of Accessibility Resources.