CSE-4/562 Database Systems (Spring 2018)
Data Management Systems (including Relational Databases, Non-Relational Databases, and NoSQL storage systems) form the basis of the Big Data Economy we now live in. A data management system is responsible for storing data, enabling efficient access to that data, and mediating concurrent modifications. This class approaches the challenges of designing a data management system from a standpoint that is both principled and practical. The course revolves around a term-long programming assignment, in which you will build a system that answers SQL queries efficiently. Course lectures will focus on the conceptual basis for this system, and will discuss how the techniques you learn generalize (e.g., to the use of NoSQL systems).
In this course, you will learn...
- ... how to efficiently store and retrieve data programmatically.
- ... how to optimize big-data computations.
- ... how to use index structures to accelerate computations.
- ... how to safely and efficiently manipulate data concurrently.
- ... how to recover state after software and hardware failures.
- ... how to query and update distributed data consistently.
Course Details
- Class M/W/F 4:00-4:50 PM in Cooke 121
- Recitation/Hacking Session W 5:00-5:50 PM in Cooke 248
Instructors:
- TAs:
- William Spoth (Davis Hall TA Area; Thursday 1-3)
- Saurav Singhi (Davis Hall TA Area; Tuesday 1-3)
- Alexander Stachnik (Davis Hall TA Area; Friday 2:30-4)
- Course Discussions: Piazza
- Textbook:
- "Database Systems: The Complete Book" 2e. by Garcia-Molina, Ullman, and Widom
- Optional References:
- Software:
- Project Submission: TBD
- Project Groups: 1-3 people
- Grading:
- 50% theory
- 10% Homework (1% per homework; only the 10 best grades count)
- 20% (or 15%) Midterm on Mar 12 (in class)
- 20% (or 25%) Comprehensive Final on May 18; 3:30-6:30 (in NSC 225)
- 50% projects
Lecture Schedule
-
Jan 29: Introduction
( slides | print )
-
Jan 31: SQL Overview + Physical Data Organization
( slides | print )
Ch. 2.3, 6.1-6.4, and 13.5-13.7
-
Feb 2: Practicum – Checkpoint 0 + Java + GIT
( slides )
-
Feb 5: Relational Algebra
( slides | print )
Ch 2.4, 5.1
-
Feb 7: Relational Algebra Equivalences
( slides | print )
Ch 2.4, 5.1, 16.2
-
Due: Checkpoint 0
Feb 9: Practicum – Project 1 and Translating SQL to RA
( slides )
-
Feb 12: Query Evaluation
( slides | print )
Ch 15.4-15.5
-
Feb 14: Extended Relational Algebra
( slides | print | homework 1 )
Ch 5.2, 13.1-13.3, 15.7-15.8
-
Feb 16: Practicum – Evaluating Relational Algebra
( slides )
-
Feb 19: Indexing (Intro + Tree Indexes)
( slides )
Ch. 8.3, 14.1-14.4
-
Feb 21: Practicum – Project 2
( slides )
-
Due: Checkpoint 1
Feb 23: Indexing (Hash Indexes + Learned Indexes)
( slides )
-
Feb 26: Aggregation and Access Paths
( slides )
-
Feb 28: Cost-Based Optimization
( slides | homework 2 )
Ch 15.1-15.3, 16.1, 16.3
-
Mar 2: Practicum – [Class Canceled]
( slides )
-
Mar 5: Database Statistics 1/2
( slides )
-
Mar 7: Database Statistics 2/2
( slides )
-
Due: Homework 2
Mar 9: Review – Midterm Review
( slides | homework 2 answers )
-
Mar 12: Midterm
( 2013 | 2014 | 2015 | 2017 )
-
Mar 14: Practicum – Translating SQL to RA and Optimization Review
( slides )
-
Due: Checkpoint 2
Mar 16: Practicum – Project 3 and Optimization Review
-
Mar 19: Break – Spring Break
-
Mar 21: Break – Spring Break
-
Mar 23: Break – Spring Break
-
Mar 26: Views
( slides )
-
Mar 28: Materialized Views
( slides )
-
Mar 30: Practicum – Algorithms for Aggregation
( slides )
-
Apr 2: Buffer Management
( slides )
-
Apr 4: Logging + ARIES Recovery
( slides )
-
Apr 6: Practicum – Open Help Session
( slides )
-
Apr 9: Theory of Transactions
( slides )
-
Apr 11: Theory of Transactions (contd.)
-
Due: Checkpoint 3
Apr 13: Practicum – Project 4 and Memory-Aware Algorithms Review
( slides )
-
Apr 16: Locking and Deadlock Avoidance
( slides )
-
Apr 18: Optimistic Transaction Control
( slides )
-
Apr 20: Practicum – In-Memory Indexes
( slides )
-
Apr 23: Parallel Query Evaluation
( slides )
-
Apr 25: BloomJoin
( slides )
-
Apr 27: Practicum – Precomputation
( slides )
-
Apr 30: Parallel Updates and Consistency
( slides )
-
May 2: ... continued
-
May 4: Practicum – Open Help Session
( notes )
-
Due: Homework 3
May 7: Probabilistic Databases
-
May 9: Review – Final Exam Review 1
( slides | homework 3 answers )
-
May 11: Review – Final Exam Review 2
( 2013 | 2014 | 2015 | 2017 )
Academic Integrity
Students may discuss and advise one another on their lab projects, but groups are expected to turn in their own work. Discussing concepts is permitted. Referencing another group's code is not. Cheating on any course deliverable will result in an automatic grade of F in the course. It is the CSE department's policy not to provide financial support to any student disciplined for plagiarism. The University's policy on academic integrity can be reviewed at:
The Graduate School Academic Integrity Policy
Medical Emergencies
Accommodations for medical emergencies will be made on a case-by-case basis. Requests for extensions based on medical emergencies must be accompanied by documentation of the emergency from student health services:
Student Health Services
Accessibility Resources
If you have a diagnosed disability (physical, learning, or psychological) that will make it difficult for you to carry out the course work as outlined, or that requires accommodations such as recruiting note-takers, readers, or extended time on exams or assignments, please advise the instructor during the first two weeks of the course so that we may review possible arrangements for reasonable accommodations. In addition, if you have not yet done so, contact:
The Office of Accessibility Resources.