Husseinnasser.
com
Updated-Oct/2021
ACID
Atomicity, Consistency, Isolation and Durability in
Relational Database Systems
Agenda
● What is a Transaction?
● Atomicity
● Isolation
● Consistency
● Durability
● Quiz
What is a Transaction?
Transaction
● A collection of queries
● One unit of work
● E.g. Account deposit (SELECT, UPDATE, UPDATE)
Transaction Lifespan
● Transaction BEGIN
● Transaction COMMIT
● Transaction ROLLBACK
● Transaction unexpected ending = ROLLBACK (e.g. crash)
Nature of Transactions
● Usually Transactions are used to change and modify data
● However, it is perfectly normal to have a read only
transaction
● Example, you want to generate a report and you want to
get consistent snapshot based at the time of transaction
● We will learn more about this in the Isolation section
ACCOUNT_ID BALANCE
Transaction 1 $1000
$900
2 $600
$500
Send $100 From Account 1 to Account 2
BEGIN TX1
SELECT BALANCE FROM ACCOUNT WHERE ID = 1
BALANCE > 100
UPDATE ACCOUNT SET BALANCE = BALANCE - 100 WHERE ID = 1
UPDATE ACCOUNT SET BALANCE = BALANCE + 100 WHERE ID = 2
COMMIT TX1
Summary
What is a Transaction?
Atomicity
Atomicity
● All queries in a transaction must succeed.
● If one query fails, all prior successful queries in the transaction
should rollback.
● If the database went down prior to a commit of a transaction,
all the successful queries in the transactions should rollback
ACCOUNT_ID BALANCE
Atomicity 1 $1000
$900
2 $500
Send $100 From Account 1 to Account 2
BEGIN TX1
SELECT BALANCE FROM ACCOUNT WHERE ID = 1
BALANCE > 100
UPDATE ACCOUNT SET BALANCE = BALANCE - 100 WHERE ID = 1
ACCOUNT_ID BALANCE
Atomicity 1 $900
2 $500
● After we restarted the machine the first account has been debited but the
other account has not been credited.
● This is really bad as we just lost data, and the information is inconsistent
● An atomic transaction is a transaction that will rollback all queries if one or
more queries failed.
● The database should clean this up after restart.
Summary
Atomicity
Isolation
Isolation
● Can my inflight transaction see changes made by other
transactions?
● Read phenomena
● Isolation Levels
Isolation - Read phenomena
● Dirty reads
● Non-repeatable reads
● Phantom reads
● Lost updates
SALES
PID QNT PRICE
Dirty Reads
Product 1 10
10
15 $5
Product 2 20 $4
BEGIN TX1 BEGIN TX2
SELECT PID, QNT*PRICE FROM SALES
Product 1, 50
UPDATE SALES SET QNT = QNT+5
Product 2, 80
WHERE PID =1
SELECT SUM(QNT*PRICE) FROM SALES
We get $155 when it should be $130
We read a “dirty” value that has not been
committed
COMMIT TX1 ROLLBACK TX2
SALES
PID QNT PRICE
Non-repeatable read
Product 1 10
15 $5
Product 2 20 $4
BEGIN TX1 BEGIN TX2
SELECT PID, QNT*PRICE FROM SALES
Product 1, 50 UPDATE SALES SET QNT = QNT+5
Product 2, 80 WHERE PID =1
COMMIT TX2
SELECT SUM(QNT*PRICE) FROM SALES
We get $155 when it should be $130
We did read a committed value, but it
COMMIT TX1
gave us inconsistent results
SALES
PID QNT PRICE
Phantom read
Product 1 10 $5
Product 2 20 $4
BEGIN TX1 Product 3 10 1
$1
SELECT PID, QNT*PRICE FROM SALES BEGIN TX2
Product 1, 50 INSERT INTO SALES
Product 2, 80 VALUES (‘Product 3’, 10, 1)
COMMIT TX2
SELECT SUM(QNT*PRICE) FROM SALES
We get $140 when it should be $130
We read a committed value that showed up in our
COMMIT TX1
range query
SALES
PID QNT PRICE
Lost updates
Product 1 10
15
20 $5
Product 2 20 $4
BEGIN TX1 BEGIN TX2
UPDATE SALES SET QNT = QNT+10
WHERE PID =1
UPDATE SALES SET QNT = QNT+5
WHERE PID =1
COMMIT TX2
SELECT SUM(QNT*PRICE) FROM SALES
We get $155 when it should be $180
Our update was overwritten another
COMMIT TX1 transaction and as a result “lost”
Isolation - Isolation Levels for inflight transactions
● Read uncommitted - No Isolation, any change from the outside is visible to the
transaction, committed or not.
● Read committed - Each query in a transaction only sees committed changes by
other transactions
● Repeatable Read - The transaction will make sure that when a query reads a row,
that row will remain unchanged while its running.
● Snapshot - Each query in a transaction only sees changes that have been
committed up to the start of the transaction. It's like a snapshot version of the
database at that moment.
● Serializable - Transactions are run as if they serialized one after the other.
● Each DBMS implements Isolation level differently
Isolation Levels vs read phenomena
https://en.wikipedia.org/wiki/Isolation_(database_systems)
Database Implementation of Isolation
● Each DBMS implements Isolation level differently
● Pessimistic - Row level locks, table locks, page locks to avoid lost updates
● Optimistic - No locks, just track if things changed and fail the transaction if so
● Repeatable read “locks” the rows it reads but it could be expensive if you
read a lot of rows, postgres implements RR as snapshot. That is why you
don’t get phantom reads with postgres in repeatable read
● Serializable are usually implemented with optimistic concurrency control, you
can implement it pessimistically with SELECT FOR UPDATE
Summary
Isolation
Consistency
Consistency
● Consistency in Data
● Consistency in reads
Consistency in Data
● Defined by the user
● Referential integrity (foreign keys)
● Atomicity
● Isolation
Consistency in Data
Pictures Picture_Likes
ID (PK) BLOB LIKES USER (PK) PICTURE_ID (PK)(FK)
1 xx 2 Jon 1
2 xx 1 Edmond 1
Jon 2
Spot inconsistency in this data
Pictures Picture_Likes
ID (PK) BLOB LIKES USER (PK) PICTURE_ID (PK)(FK)
1 xx 5 Jon 1
2 xx 1 Edmond 1
Jon 2
Edmond 4
Consistency in reads
Update X
Read
X
X
Consistency in reads
● If a transaction committed a change will a new transaction
immediately see the change?
● Affects the system as a whole
● Relational and NoSQL databases suffer from this
● Eventual consistency
Summary
Consistency
Durability
Durability
● Changes made by committed transactions must be
persisted in a durable non-volatile storage.
● Durability techniques
○ WAL - Write ahead log
○ Asynchronous snapshot
○ AOF
Durability - WAL
● Writing a lot of data to disk is expensive (indexes, data
files, columns, rows, etc..)
● That is why DBMSs persist a compressed version of the
changes known as WAL (write-ahead-log segments)
Durability - OS Cache
● A write request in OS usually goes to the OS cache
● When the writes go the OS cache, an OS crash, machine
restart could lead to loss of data
● Fsync OS command forces writes to always go to disk
● fsync can be expensive and slows down commits
Summary
Durability
Summary
● What is a Transaction?
● Atomicity
● Isolation
● Consistency
● Durability