0% found this document useful (0 votes)
47 views94 pages

CC Part 2 No Quizzes

Uploaded by

est09serra
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
47 views94 pages

CC Part 2 No Quizzes

Uploaded by

est09serra
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 94

Introduction to Transaction

Management

CMPSCI 445

1
Txn Semantics Summary
 Transactions are how programmers make
sense of concurrent behavior
 ACID properties are a contract between
DB and programmer
 A tool to build understandable code
Interleaving execution is vital for
performance
 DB has to run operations concurrently, but
 Simulate that only a single txn was run at
a time
 This is Concurrency Control 2
Anomalies with Interleaved Execution

 Not all interleavings of operations are okay.


 Anomaly: two consistency-preserving
committed transactions that lead to an
inconsistent state.
 Types of anomalies:
 Reading Uncommitted Data (WR Conflicts)
“dirty reads”
 Unrepeatable Reads (RW Conflicts)
 Overwriting Uncommitted Data (WW Conflicts)

3
Reading Uncommitted Data
(WR Conflict)
T1: Transfer T2: Interest
“Dirty Read” R(A)
W(A)
R(A)
Inconsistent result of A is W(A)
exposed to transaction T2
R(B)
(We saw this schedule in
the banking example) W(B)
Commit
Transaction is no longer
ISOLATED from other R(B)
transactions’ effects
W(B)
Commit 4
Acceptable Dirty Read

If we are just checking availability of an


airline seat, a dirty read might be fine! (Why
is that?)
Reservation transaction:
EXEC SQL select occupied into :occ
from Flights
where Num= ‘123’ and date=11-03-99
and seat=‘23f’;
if (!occ) {EXEC SQL
update Flights
set occupied=true
where Num= ‘123’ and date=11-03-99
and seat=‘23f’;}
else {notify user that seat is unavailable}

5
Unrepeatable Reads

T1 could see two values for A, T1 T2


although it has not changed A itself. R(A)
(This could not happen in a serial
execution. ) R(A)
W(A)
Commit
R(A)
W(A)
Commit

6
Overwriting Uncommitted Data
(WW Conflicts)
T1 T2
W(A) Here T2 overwrites value of B that
has been modified by T1.
W(B)
Example: Harry and Larry are 2
W(B) employees. Salaries must be
kept equal.
Commit
W(A) T1: set salaries to 1000
T2 set salaries to 2000
Commit

7
Overwriting Uncommitted Data
(WW Conflicts)
Here T2 overwrites value of B that
T1 T2 has been modified by T1.
W(A) Example: Harry and Larry are 2
W(B) employees. Salaries must be
kept equal.
W(B)
T1: set salaries to 1000
Commit T2 set salaries to 2000
T1; T2 => salaries = 2000
W(A) T2; T1 => salaries = 1000
Commit This interleaving, Harry = 1000,
Larry = 2000 (lucky larry)

8
Lock-Based Concurrency Control
 Just like in procedural programming
 locks can be used to maintain orders of
access to shared objects
 Lock - associated with some object
 shared (r) or exclusive (w)
 Locking protocol - set of rules to be followed
by each transaction to ensure good
properties.

9
Lock Compatibility Matrix

Locks on a data item are granted based on a


lock compatibility matrix:
Mode of Data Item
None Shared Exclusive

Request mode { Shared


Exclusive Y
Y Y
N
N
N

When a transaction requests a lock, it must wait


(block) until the lock is granted

*there’s a whole literature about how to handle waiting for


locked resources 10
Transactions and locks
 Transactions now have two new operations
 lock (written as X for exclusive S for
shared)
 unlock (written as U)
 Depending on the locking protocol
transactions will request and release locks
around data accesses

11
Example Txn with locking
T1

lock-X(A)
R(A)
W(A)
unlock(A)
lock-S(B)
R(B)
unlock(B)

12
Goals of a locking protocol
 Ensure consistency/isolation
 Serializability for this lecture
 Enable interleaving of txn operations

 If locks don’t make things faster, we


shouldn’t use them
 Lock operations should be fast
 Protocol should try to maximize consistent
concurrency

13
Two-Phase Locking (2PL)
 Two-Phase Locking Protocol
 Each Xact must obtain a S (shared) lock on object
before reading, and an X (exclusive) lock on object
before writing.
 A transaction can not request additional locks
once it releases any locks.
• This implies two phases:
• growing phase
• shrinking phase
Example of 2PL
T1 T2
X(A)
R(A)
W(A)
X(B)
U(A)
X(A)
R(A)
W(A)
R(B)
W(B)
U(B)
X(B)
R(B)
W(B)
U(A); U(B)
Commit
Commit
15
Example of 2PL
T1 T2
X(A)
growing phase R(A)
W(A)
X(B)
U(A)
X(A)
R(A)
shrinking phase W(A) growing phase
R(B)
W(B)
U(B)
X(B)
R(B)
W(B)
U(A); U(B) shrinking phase
Commit
Commit
16
Not admissible under 2PL

T1 T2
R(A)
W(A)
R(A)
W(A)
R(B)
W(B)
Commit
R(B)
W(B)
Commit
17
Not admissible under 2PL
Why?
T1 T2
R(A)
W(A)
R(A)
W(A)
R(B)
W(B)
Commit
R(B)
W(B)
Commit
18
Not admissible under 2PL
Why?
T1 T2
X(A)

X(A)
X(B)
U(B)
U(A)
Commit
X(B)
U(B)
U(A)
Commit
19
Not admissible under 2PL
Why?
T1 T2
X(A)

X(A) T1 holds A’s lock


X(B) T2 won’t be able
U(B) to proceed
U(A)
Commit
X(B)
U(B)
U(A)
Commit
20
Lock-based protocols

 2PL ensures conflict serializability


 Transactions can be ordered by their end of
growing phase (called lock point)
 A 2PL schedule is equivalent to the serial
schedule where transactions ordered by lock
point order.
 2PL ensures serializability-consistent
interleaving
 but is it fast enough?
 What about recoverability?
Strict Two-Phase Locking (Strict 2PL)
 Strict Two-phase Locking Protocol:
 Each Xact must obtain a S (shared) lock on object
before reading, and an X (exclusive) lock on object
before writing.
 A transaction can not request additional locks
once it releases any locks.
 All X (exclusive) locks acquired by a transaction
must be held until completion (commit/abort).
Schedule following strict 2PL
T1 T2
S(A)
R(A)
S(A)
R(A)
X(B)
R(B)
W(B)
Commit
X(C)
R(C)
W(C)
Commit
23
Lock-based protocols

 Strict 2PL ensures conflict serializable and


cascadeless schedules
 Writers hold an X lock until they commit.
 Serializability and Cascadeless recoverability
 but locks are held for longer
 the longer locks are held => the less concurrency
you can support
 Consistency + Recoverability vs. Performance
tradeoff
 This is why many databases default to weaker
consistency (more concurrency)
Lock Management

 Lock and unlock requests are handled by the lock


manager, an internal software module of the DBMS
 Lock table entry (for each object) contains:
 Number of transactions currently holding a lock
 Type of lock held (shared or exclusive)
 Pointer to queue of lock requests
 Locking and unlocking have to be atomic operations
 Locks cannot be half-held or half-released
 Built on OS primitives and/or CPU primitives
 Slower (10-100x) than non-atomic memory
accesses
Lock Management
 Lock upgrade: transaction that holds a shared lock
can be upgraded to hold an exclusive lock
 allows max. concurrency until you realize you
need eXclusive lock
 useful for UPDATE (only X on the rows that need
to be written, everyone else gets S)
 special case: X granted immediately if no one else
holds S, puts request at front of queue otherwise
 Does NOT prevent deadlock
 T1: S(A); T2: S(A); T1: Up(A) ; T2: Up(A)
 T2’s S lock blocks T1’s Upgrade request
 T1’s S lock blocks T2’s Upgrade request
Deadlock

T1 T2
X(A) granted

X(B) granted

X(B) queued
X(A) queued

 Deadlock: Cycle of transactions waiting for


locks to be released by each other.

27
Deadlocks
 Tend to be rare in practice.*
 Two ways of dealing with deadlocks:
 Deadlock prevention
 lock protocol makes deadlock impossible
 transactions may have to ‘restart’
 Deadlock detection
 a separate process examines txn state to detect
deadlock
 and then transactions are aborted to clear the
deadlock
*tend to start cropping up when the load on the system increases greatly.
More txns => More concurrency => more opportunities for deadlock
Deadlock Detection

 Create a waits-for graph:


 Nodes are transactions
 There is an edge from Ti to Tj if Ti is waiting for Tj
to release a lock
 add edge when queueing a lock request,
 remove edge when granting lock request.
 Periodically check for cycles in the waits-for
graph
 Resolve by aborting a transaction on cycle,
releasing its locks.
Deadlock Detection (Continued)
T1 T2 T3 T4
S(A)
R(A)
X(B)
W(B)
S(B)
S(C)
R(C)
X(C)
X(B)
X(A)

T1 T2

T4 T3
Deadlock Detection (Continued)
T1 T2 T3 T4
S(A)
R(A)
X(B)
W(B)
B is held by T2 S(B)
S(C)
R(C)
X(C)
X(B)
X(A)

T1 T2

T4 T3
Deadlock Detection (Continued)
T1 T2 T3 T4
S(A)
R(A)
X(B)
W(B)
S(B)
S(C)
R(C)
C is held by T3 X(C)
X(B)
X(A)

T1 T2

T4 T3
Deadlock Detection (Continued)
T1 T2 T3 T4
S(A)
R(A)
X(B)
W(B)
S(B)
S(C)
R(C)
X(C)
X(B) B is held by T2
X(A)

T1 T2

T4 T3
Deadlock Detection (Continued)
T1 T2 T3 T4
S(A)
R(A)
X(B)
W(B)
S(B)
S(C)
R(C)
X(C)
X(B)
X(A) A is held by T1

T1 T2
A waits-for cycle means
deadlock

T4 T3
Deadlock Prevention
 Idea 1: Assign priorities based on
timestamps. If Ti wants a lock that Tj holds.
Two policies are possible:
 Wait-Die: If Ti has higher priority, Ti waits for Tj;
otherwise Ti aborts
 aborts most recent txn until deadlock is broken
 Wound-wait: If Ti has higher priority, Tj aborts;
otherwise Ti waits
 Aborts txn when lock conflict is detected

 If txn re-starts, re-use its original


timestamp (to avoid starvation).
Deadlock Prevention
 Idea 2: Conservative 2PL:
 transaction grabs all locks it will need up
front, before issuing any statements
 All blocking happens at the beginning of
txn
 no waiting in the middle of txn
 How can you know what locks you need
before you even start?
 Not always feasible
Performance of Locking
 Lock-based schemes resolve conflicting
schedules by blocking and aborting
 in practice few deadlocks and relatively few
aborts
 most of penalty from blocking
 To increase throughput
 lock smallest objects possible
 reduce time locks are held
 reduce hotspots
• a hotspot is a popular/highly contended object

37
Performance Graphs
 If concurrency is a win, then running more
transactions means higher throughput
Throughput DB linear

# transactions
38
Performance Graphs
 If concurrency is a win, then running more
transactions means higher throughput
Throughput DB linear

Initially, we get near


linear speedup.
E.g. 2x txns means
2x throughput
# transactions
39
Performance Graphs
 If concurrency is a win, then running more
transactions means higher throughput
Throughput DB linear

Then the system begins


to plateau until it hits some
maximum value
# transactions
40
Performance Graphs
 If concurrency is a win, then running more
transactions means higher throughput
Throughput DB linear

What’s going on here?

# transactions
41
Thrashing
 Locking causes blocking. The more popular
an object is, the more transactions are trying
to lock it (contention)
 More contention means blocking is more
costly
 vicious cycle of more contention causing
more slowdowns which lead to even more
contention
 When adding more transactions reduces
throughput we say the system is thrashing

42
Transaction support in SQL
 Explicit transactions with BEGIN/START
TRANSACTION
 ‘autocommit mode’ Transaction
automatically started for SELECT, UPDATE,
CREATE
 Transaction ends with COMMIT or
ROLLBACK (abort)
 SQL 99 supports SAVEPOINTs which are
simple nested transactions
 ABORT can roll back to an intermediate
point rather than just to the beginning
43
What should we lock?
T1 T2

SELECT S.rating, MIN(S.age) INSERT Sailors(Name, Rating,


FROM Sailors S Age) VALUES (“Mary”, 8, 18)
WHERE S.rating = 8

 Lock Tables? (Coarse locking):


 T1 S-lock on Sailors;
 T2 X-lock on Sailors
 Lock rows? (Fine-grained locking):
 T1 S-lock on all rows with rating=8;
 T2 X-lock on Joe’s tuple.
 Fine granularity locking is generally better. But it
introduces a problem: phantoms. 44
The Phantom Problem
T1 T2
SELECT MIN(age) INSERT Sailors(Name, Rating,
FROM Sailors S Age) VALUES (“Mary”, 8, 18)
WHERE rating = 8

SELECT MAX(age)
FROM Sailors S
WHERE rating = 8

 Consider the following schedule


 Initially, no Sailor has rating = 8
 T1 runs the first query, which scans Sailors. It locks no
row and returns an empty table.
 T2 inserts a row for Mary at the end of the relation on a
page not previously seen by T1. T1 has no lock so T2 goes
ahead and commits.
 T1 executes the second query and returns the age of
Mary. This violates serializability despite using locks! 45
The Phantom Problem
T1 T2
SELECT MIN(age)
FROM Sailors S INSERT Sailors(Name, Rating,
WHERE rating = 8 Age) VALUES (“Mary”, 8, 18)
SELECT MAX(age) COMMIT
FROM Sailors S
WHERE rating = 8

 Consider the following schedule


 Initially, no Sailor has rating = 8

46
The Phantom Problem
T1 T2
SELECT MIN(age) No Results, no rows locked
FROM Sailors S
WHERE rating = 8 INSERT Sailors(Name, Rating,
Age) VALUES (“Mary”, 8, 18)
SELECT MAX(age)
COMMIT
FROM Sailors S
WHERE rating = 8

 Consider the following schedule


 Initially, no Sailor has rating = 8
 T1’s first SELECT runs first, no results, no rows locked

47
The Phantom Problem
T1 T2
SELECT MIN(age) No Results, no rows locked
FROM Sailors S
WHERE rating = 8 INSERT Sailors(Name, Rating,
Age) VALUES (“Mary”, 8, 18)
SELECT MAX(age)
COMMIT
FROM Sailors S
WHERE rating = 8

 Consider the following schedule


 Initially, no Sailor has rating = 8
 T1’s first SELECT runs first, no results, no rows locked
 T2’s INSERT runs, adds a new row b/c no conflicting
locks detected

48
The Phantom Problem
T1 T2
SELECT MIN(age) No Results, no rows locked
FROM Sailors S
WHERE rating = 8 INSERT Sailors(Name, Rating,
Age) VALUES (“Mary”, 8, 18)
SELECT MAX(age)
COMMIT
FROM Sailors S
WHERE rating = 8

 Consider the following schedule


 Initially, no Sailor has rating = 8
 T1’s first SELECT runs first, no results, no rows locked
 T2’s INSERT runs, adds a new row b/c no conflicting
locks detected
 T1’s second SELECT runs, now returns a sailor

49
The Phantom Problem
 T1 can only lock rows that already exist
 wants to lock all potential rows with rating = 8
 how do you lock a row that doesn’t exist?
 Idea 1: Lock whole table
 no modification can slip by a transaction
 no other transaction can write the table
•or read it after eXclusive locking
 really bad for performance
 Idea 2: use index to lock ranges of values
 Idea 3: prevent new rows that match the
selection condition
50
Data
Index
Index Locking
r=1
 If there is an index on the rating field, T1
should lock the index page containing the data
entries with rating = 8.
 If there are no records with rating = 8, T1 must
lock the index page where such a data entry would
be, if it existed!
 If there is no suitable index:
 T1 locks all pages when doing table scan
 Also locks the file/table to prevent new
pages from being added
Predicate Locking

 Grant lock on all records that satisfy some


logical predicate, e.g. age > 2*salary.
 Index locking is a special case of predicate
locking for which an index supports efficient
implementation of the predicate lock.
 In general, predicate locking has a lot of
locking overhead.
 In general it’s constraint satisfaction
Locking in B+ Trees
 How can we efficiently lock a particular leaf
node?
 One solution: Ignore the tree structure, just lock
pages while traversing the tree, following 2PL.
 This has terrible performance!
 Root node (and many higher level nodes) become
bottlenecks because every tree access begins at the
root.
 No different from locking whole table
Two Useful Observations
 Higher levels of the tree only direct searches
for leaf pages.
 All the action is at the leaves
 that’s where the key+values are
 Why not just lock the leaf page(s)?
 A split creates a new leaf and moves half
the keys into it
 So you need to lock internal nodes to
prevent splits
 But maybe not always?
A Simple Tree Locking Algorithm
 Search: Start at root and go down;
repeatedly, S lock child then unlock parent.
 Insert/Delete: Start at root and go down,
obtaining X locks. Once child is locked,
check if it is safe:
 If child is safe, release all locks on ancestors.
 Safe node: Node such that changes will not
propagate up beyond this node.
 Inserts: Node is not full. (no split possible)
 Deletes: Node is not half-empty. (no merge
possible)
Tree Locking Example T1: Search 38*
T2: Insert 45*
Shared root T3: Search 22*

Exclusive … 20 …



35 …

23 38 44

leaves

20* 22* 23* 31* 35* 36* 38* 41* 44*


Tree Locking Example T1: Search 38*
T2: Insert 45*
Shared root T3: Search 22*

T1
Exclusive … 20 …



35 …

23 38 44

leaves

20* 22* 23* 31* 35* 36* 38* 41* 44*


Tree Locking Example T1: Search 38*
T2: Insert 45*
Shared root T3: Search 22*

T1
Exclusive … 20 …



T1 35 …

23 38 44

leaves

20* 22* 23* 31* 35* 36* 38* 41* 44*


Tree Locking Example T1: Search 38*
T2: Insert 45*
Shared root T3: Search 22*

Exclusive … 20 …



T1 35 …

23 T1 38 44

leaves

20* 22* 23* 31* 35* 36* 38* 41* 44*


Tree Locking Example T1: Search 38*
T2: Insert 45*
Shared root T3: Search 22*

Exclusive T2 … 20 …



T1 35 …

23 T1 38 44

leaves

20* 22* 23* 31* 35* 36* 38* 41* 44*


Tree Locking Example T1: Search 38*
T2: Insert 45*
Shared root T3: Search 22*

Exclusive T2 … 20 …



35 …

23 T1 38 44

leaves
T1

20* 22* 23* 31* 35* 36* 38* 41* 44*


Tree Locking Example T1: Search 38*
T2: Insert 45*
Shared root T3: Search 22*

Exclusive T2 … 20 …



T2 35 …

23 38 44

leaves
T1

20* 22* 23* 31* 35* 36* 38* 41* 44*


Tree Locking Example T1: Search 38*
T2: Insert 45*
Shared root T3: Search 22*

Exclusive T2 … 20 …



T2 35 …

23 T2 38 44

leaves
T1

20* 22* 23* 31* 35* 36* 38* 41* 44*


Tree Locking Example T1: Search 38*
T2: Insert 45*
Shared root T3: Search 22*

Exclusive T2 … 20 …



T2 35 …

23 T2 38 44

leaves
T1 T2

20* 22* 23* 31* 35* 36* 38* 41* 44*


Tree Locking Example T1: Search 38*
T2: Insert 45*
Shared root T3: Search 22*

Exclusive T2 … 20 …



T2 35 …

23 T2 38 44

leaves
T1 T2

20* 22* 23* 31* 35* 36* 38* 41* 44*

Leaf is Safe!
Tree Locking Example T1: Search 38*
T2: Insert 45*
Shared root T3: Search 22*

Exclusive T3 … 20 …



T2 35 …

23 T2 38 44

leaves
T1 T2

20* 22* 23* 31* 35* 36* 38* 41* 44*

Leaf is Safe!
Tree Locking Example T1: Search 38*
T2: Insert 45*
Shared root T3: Search 22*

Exclusive T3 … 20 …



T3 35 …

23 T2 38 44

leaves
T1 T2

20* 22* 23* 31* 35* 36* 38* 41* 44*

Leaf is Safe!
Tree Locking Example T1: Search 38*
T2: Insert 45*
Shared root T3: Search 22*

Exclusive … 20 …



T3 35 …

T3 23 38 44

leaves
T1 T2

20* 22* 23* 31* 35* 36* 38* 41* 44*

Leaf is Safe!
Tree Locking Example T1: Search 38*
T2: Insert 45*
Shared root T3: Search 22*

Exclusive … 20 …



35 …

T3 23 38 44

leaves
T1 T2
T3
20* 22* 23* 31* 35* 36* 38* 41* 44*

Leaf is Safe!
Tree Locking Example T1: Search 38*
T2: Insert 45*
Shared root T3: Search 22*

Exclusive … 20 …



35 …

23 38 44

leaves
T1 T2
T3
20* 22* 23* 31* 35* 36* 38* 41* 44*

Leaf is Safe!
Tree Locking Example T1: Search 38*
T2: Insert 45*
Shared root T3: Search 22*

Exclusive … 20 …



35 …

23 38 44

leaves
T1 T2
T3
20* 22* 23* 31* 35* 36* 38* 41* 44*

Leaf is Safe!
Only 3 pages are locked, tree can still be used by other transactions!
Specify isolation level

 General rules of thumb w.r.t. isolation:


 Fully serializable isolation is more expensive than
“no isolation”
• We can’t do as many things concurrently (or
we have to undo them frequently)
 For performance, we generally want to
specify the most relaxed isolation level that’s
acceptable
 Note that we’re “slightly” violating a correctness
constraint to get performance!

72
Specifying isolation level in SQL

SET TRANSACTION [READ WRITE | READ ONLY]


ISOLATION LEVEL [LEVEL];

LEVEL = SERIALIZABLE
REPEATABLE READ
Less isolation
READ COMMITTED
READ UNCOMMITED

73
REPEATABLE READ

 T reads only changes made by committed


transactions
 No value read/written by T is changed by
another transaction until T completes.
 Phantoms possible

74
READ COMMITTED
 T reads only changes made by committed
transactions
 No value read/written by T is changed by
another transaction until T completes.
 Value read by T may be modified while T in
progress.

75
READ UNCOMMITTED

 Greatest exposure to other transactions


 Dirty reads possible
 Does not obtain shared locks before reading
 Thus no locks ever requested.
 Not always allowed (postgres does not support this)

76
Specifying Acceptable Isolation Levels

 To signal to the system that a dirty read is


acceptable,
SET TRANSACTION READ WRITE
ISOLATION LEVEL READ UNCOMMITTED;

 In addition, there are


 SET TRANSACTION
ISOLATION LEVEL READ COMMITTED;
SET TRANSACTION
ISOLATION LEVEL REPEATABLE READ;

77
Summary of Isolation Levels

Level Dirty Read Unrepeatable Read Phantoms

READ UN- Maybe Maybe Maybe


COMMITTED

READ No Maybe Maybe


COMMITTED

REPEATABLE No No Maybe
READ

SERIALIZABLE No No No

78
Non-locking Concurrency Control
 Locks are complicated
 deadlock detection/prevention
 protocol design and implementation
 Locks can be expensive

 pay cost of locking even if no concurrent access


occurred
 coordination is expensive across threads/cores
 getting comparatively more expensive over time

79
Optimistic Concurrency Control
 Idea: Don’t lock, buffer writes and validate before
‘publishing’
 Transactions are assigned timestamps at start
 Each transaction has its own workspace
 Reads copy values into workspace
 Intermediate writes happen locally in workspace
 Validation checks that nothing has changed

 validation failure => transaction restart


 Write on successful validation all writes are
published to global state 80
Optimistic Concurrency Control
 Copying can be expensive, and validation cost is
crucial for performance
 Idea: tag each record with timestamp of transaction
that last wrote it
 Read: copy record timestamps instead of data
 Validation: compare timestamps of records
 Write: update timestamps and values of records
 Hybrid systems sometimes use locking for writes
only (validation still necessary)

81
Optimistic Concurrency Control
 Commits may now fail
 validation fails
 transactions need to restart
 Not a good fit for statement-by-statement SQL
 Not common in SQL DBs

 Used in Transactional Memory systems


 Good fit for low-contention systems

82
Multi Version Concurrency Control
 Idea: Instead of locking reads, just preserve all
versions of records
 Concurrent transactions can just read ‘visible’
versions
 updates are invisible
 updates create new version (no modify in place)
 Only need to lock if transactions are really updating
the same record
 can do this without locking but requires
transaction restart on conflict
83
MVCC example
Assume A and B have Version/Timestamp 1

T1 TS=2 T2 TS=4
R(A)
X(A); W(A)
R(A)
R(B)
R(B)
X(B); W(B)
Commit
Commit

84
MVCC example
Assume A and B have Version/Timestamp 1

T1 TS=2 T2 TS=4
R(A) read V 1
X(A); W(A)
R(A)
R(B)
R(B)
X(B); W(B)
Commit
Commit

85
MVCC example
Assume A and B have Version/Timestamp 1

T1 TS=2 T2 TS=4
R(A) read V 1
X(A); W(A) create V2
R(A)
R(B)
R(B)
X(B); W(B)
Commit
Commit

86
MVCC example
Assume A and B have Version/Timestamp 1

T1 TS=2 T2 TS=4
R(A) read V 1
X(A); W(A) create V2
R(A) read V1
R(B)
R(B)
X(B); W(B)
Commit
Commit

87
MVCC example
Assume A and B have Version/Timestamp 1

T1 TS=2 T2 TS=4
R(A) read V 1
X(A); W(A) create V2
R(A) read V1
R(B) read V1
R(B)
X(B); W(B)
Commit
Commit

88
MVCC example
Assume A and B have Version/Timestamp 1

T1 TS=2 T2 TS=4
R(A) read V 1
X(A); W(A) create V2
R(A) read V1
R(B) read V1
R(B) read V1
X(B); W(B) create V2
Commit
Commit

89
MVCC example
Assume A and B have Version/Timestamp 1

T1 TS=2 T2 TS=4
R(A) read V 1
X(A); W(A) create V2
R(A) read V1
R(B) read V1
R(B) read V1
X(B); W(B) create V2
Commit B V2 now visible
Commit

90
MVCC example
Assume A and B have Version/Timestamp 1

T1 TS=2 T2 TS=4
R(A) read V 1
X(A); W(A) create V2
R(A) read V1
R(B) read V1
R(B) read V1
X(B); W(B) create V2
Commit B V2 now visible
Commit A V2 now visible

91
Multi Version Concurrency Control
 Readers don’t block readers
 Readers don’t block writers
 Writers don’t block readers
 Only writers block writers
 Each transaction effectively creates a ‘snapshot’ of DB

 all reads are consistent with this snapshot


 Vanilla MVCC supports SNAPSHOT isolation

 weaker than SERIALIZABLE


 stronger than REPEATABLE READ w/ locks
92
Multi Version Concurrency Control
 Fairly popular in databases
 Used by postgres and oracle
 Complications

 old versions accumulate, need to be GCed


(vacuuming in postgres)
 new record versions not guaranteed to be on the
same data page as the prev. version
•bad locality
 Indexes need to track ALL versions
93
Summary
 Concurrency control and recovery are among the
most important functions provided by a DBMS.
 Users need not worry about concurrency.
 System guarantees nice properties: ACID
 This is implemented using a locking protocol

 Users can trade isolation for performance using SQL


commands
 Consistent concurrency can be implemented in
numerous ways
 different contexts make different approaches better
94

You might also like