April 14, 2021
Sp4rk_T1t4n (Hari) | 8.819752931594849 |
S | 9.216266870498657 |
star | 9.415964841842651 |
OptimusPrime | 9.521755933761597 |
strayed | 9.534846782684326 |
AA | 16.85392713546753 |
Mr.Peanut Butter | 17.192047357559204 |
MS | 18.200833559036255 |
iCheerios | 18.433584213256836 |
| Time | T1 | T2 |
|---|---|---|
| | | W(A) |
| | W(A) | |
| | | W(B) |
| | W(B) | |
| | | W(C) |
| ↓ | W(C) | |
T2 "happens before" T1 three times (once each on A, B, and C)
The schedule is conflict-serializable
... but two-phase locking won't work if we can't predict T1's accesses.
We don't know what a transaction will do... until it does.
Idea: Let the transaction do it.
(Then fix it if it broke anything later)
Collect...
Pick a serial order (e.g., the order in which transactions reach the validation phase)
Make sure the transaction's operations follow this order
Which conflicts we need to check for are different depending on how the phases overlap.
Note: The validation and write phases are NOT instantaneous.
Validation (including selecting a serial order) is a critical section: only one transaction at a time.
Allow write phases to proceed concurrently.
If T1's write phase ends before T2's read phase starts, the transactions are already serial.
If T1's write phase ends before T2's write phase starts, there's a possibility of write-read conflicts. Abort unless: $$WriteSet(T_1) \cap ReadSet(T_2) = \emptyset$$
Otherwise write-read and write-write conflicts are possible. Abort unless: $$WriteSet(T_1) \cap ReadSet(T_2) = \emptyset$$ $$WriteSet(T_1) \cap WriteSet(T_2) = \emptyset$$
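The three cases above can be sketched as a single check. This is a minimal illustration, not a full validator: the `Txn` record, its logical-time fields, and the `validate` function are hypothetical names, and T1 is assumed to come before T2 in the chosen serial order.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    """Hypothetical bookkeeping record for one transaction."""
    name: str
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)
    read_start: int = 0   # logical time the read phase began
    write_start: int = 0  # logical time the write phase began
    write_end: int = 0    # logical time the write phase finished


def validate(t1: Txn, t2: Txn) -> bool:
    """Return True if T2 may follow T1 in the serial order, False to abort."""
    # Case 1: T1 finished writing before T2 read anything: already serial.
    if t1.write_end < t2.read_start:
        return True
    no_write_read = not (t1.write_set & t2.read_set)
    # Case 2: T1 finished writing before T2 started writing:
    # only write-read conflicts are possible.
    if t1.write_end < t2.write_start:
        return no_write_read
    # Case 3: write phases overlap: check write-write conflicts as well.
    no_write_write = not (t1.write_set & t2.write_set)
    return no_write_read and no_write_write
```

For example, if T1 writes `A` and finishes at time 5 while T2 started reading at time 3 and read `A`, `validate` returns `False` and T2 must abort.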
Idea: Check while the transaction is running
(abort immediately if a violation happens)
Each object $A$ gets a read timestamp ($RTS(A)$) and a write timestamp ($WTS(A)$)
Each transaction $\mathcal T$ gets a timestamp ($TS(\mathcal T)$).
(note that these can be logical timestamps like sequence numbers)
Basic Idea: Require transactions to follow $TS(\mathcal T)$ as a serial order.
Abort (and restart) transactions that would break this order.
Assign restarted transactions a brand new timestamp for fairness
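A minimal sketch of how these checks might look in code. The exact read and write rules are not spelled out here, so this assumes the usual textbook rules: a read aborts if a newer transaction already wrote the object, a write aborts if a newer transaction already read it, and a write older than the current $WTS$ is silently ignored. `TimestampStore` and `Abort` are hypothetical names, and timestamps are assumed to be positive integers.

```python
class Abort(Exception):
    """Raised when an operation would break the timestamp order."""


class TimestampStore:
    def __init__(self):
        self.value = {}  # object -> current value
        self.rts = {}    # object -> largest timestamp that read it
        self.wts = {}    # object -> timestamp of the write that produced it

    def read(self, ts, key):
        # A newer transaction already wrote this object: too late to read.
        if ts < self.wts.get(key, 0):
            raise Abort(f"read of {key} by TS={ts}")
        self.rts[key] = max(self.rts.get(key, 0), ts)
        return self.value.get(key)

    def write(self, ts, key, val):
        # A newer transaction already read the old value: too late to write.
        if ts < self.rts.get(key, 0):
            raise Abort(f"write of {key} by TS={ts}")
        # A newer write is already in place: ignore this one.
        if ts < self.wts.get(key, 0):
            return False
        self.wts[key] = ts
        self.value[key] = val
        return True


# Replay five blind writes, assuming TS(T1)=1, TS(T2)=2, TS(T3)=3.
store = TimestampStore()
results = [
    store.write(3, "A", "from T3"),
    store.write(2, "A", "from T2"),
    store.write(1, "A", "from T1"),
    store.write(1, "B", "from T1"),
    store.write(2, "B", "from T2"),
]
```

Note that `read` updates `rts`, so every read also modifies shared state.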
| Time | T1 | T2 | T3 |
|---|---|---|---|
| | | | W(A): WTS(A) = 3 |
| | | W(A): Ignore! | |
| | W(A): Ignore! | | |
| | W(B): WTS(B) = 1 | | |
| ↓ | | W(B): WTS(B) = 2 | |
A: $T_3 \rightarrow T_2 \rightarrow T_1$
B: $T_1 \rightarrow T_2$
Cycle... but allowed by Timestamp Concurrency Control
Timestamp CC DOES NOT guarantee conflict-serializability
So is it correct?
| Time | T1 | T2 | T3 |
|---|---|---|---|
| | | | W(A) |
| | | W(A) | |
| | W(A) | | |
| | W(B) | | |
| ↓ | | W(B) | |
T3's write to $A$ "hides" T2's
Two schedules are conflict-equivalent when you can transform one into the other by reordering any pair of operations that...
Two schedules are view-equivalent when you can transform one into the other by reordering any pair of operations that...
A schedule is view serializable if it is view-equivalent to some serial schedule
Timestamp concurrency control is guaranteed to produce view-serializable schedules.
On the happens-before graph, throw away edges created by "hidden" write-write conflicts.
If the resulting graph is acyclic, the schedule is view serializable
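A small sketch of this graph test, applied to the three-transaction write schedule from this section. The edge list is written out by hand, and marking every write-write edge on $A$ as hidden is an assumption made for illustration. `has_cycle` is a hypothetical helper built on an ordinary depth-first search.

```python
def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:  # back edge: found a cycle
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


# (source, destination, hidden?) happens-before edges.
# Assumption: the write-write edges on A are the hidden ones.
edges = [
    ("T3", "T2", True),   # A: T3 wrote before T2
    ("T2", "T1", True),   # A: T2 wrote before T1
    ("T3", "T1", True),   # A: T3 wrote before T1
    ("T1", "T2", False),  # B: T1 wrote before T2
]

all_edges = [(s, d) for s, d, _ in edges]
visible_edges = [(s, d) for s, d, hidden in edges if not hidden]

conflict_serializable = not has_cycle(all_edges)
view_serializable = not has_cycle(visible_edges)
```

With all edges kept, T2 and T1 form a cycle, so the conflict test fails. With the hidden edges dropped, only $T_1 \rightarrow T_2$ remains and the graph is acyclic.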
Read timestamps are expensive.
(every read becomes a write)
Snapshot Isolation in Oracle, PostgreSQL, SQL Server, etc... is Timestamp Concurrency Control without read timestamps.
Write-write conflicts will be detected.
Write-before-read conflicts will be detected.
Read-before-write conflicts will not be detected.
Another problem... recoverability
| Time | T1 | T2 |
|---|---|---|
| | W(A) | |
| | | R(A) |
| | | W(B) |
| | | COMMIT |
| ↓ | ABORT | |
oops...
Observation: Write-Read conflicts are avoidable... the necessary data existed at one point
Idea: Keep old versions of objects.
Each version of an object has...