Data Processing Systems

ACID
Atomic – complete or not at all
Consistent – constraints retained
Isolation – as if alone on the system
Durable – cannot undo commits

N-ary Storage (NSM)
Effective for insertion, deletion
For transactional (OLTP) systems

Decomposed Storage (DSM)
Efficient analysis (OLAP)
Tuple reconstruction required for searching & deletion
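A minimal sketch (relation and values made up) contrasting the two layouts: NSM keeps whole tuples together, DSM keeps one array per attribute.

    # NSM (row store): one record per tuple, cheap whole-tuple insert/delete
    nsm = [
        (1, "ann", 23),
        (2, "bob", 31),
    ]

    # DSM (column store): one array per attribute, cheap single-column scans
    dsm = {
        "id":   [1, 2],
        "name": ["ann", "bob"],
        "age":  [23, 31],
    }

    total_age = sum(dsm["age"])                                # OLAP scan touches one column
    tuple_1 = tuple(dsm[c][1] for c in ("id", "name", "age"))  # reconstruction needs every column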
Delta/Main
Disjoint NSM delta, DSM main
Delegate operations, perform occasional move to main

Metadata
Sorted – values increase → binary search
Dense – increases by 1 → pointer arithmetic
Re-establish by sorting, then check if dense

Dictionary Compression
Point to the same entry, used for out-of-place variable-length data.
In-place dictionary will always be loaded, but may have duplication
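A minimal sketch (values made up) of an out-of-place dictionary: the column holds fixed-width codes pointing into a shared dictionary of variable-length strings.

    dictionary = ["London", "Paris", "Tokyo"]        # out-of-place, variable-length entries
    column = [0, 1, 0, 0, 2, 1]                      # fixed-width codes, duplicates share entries

    decoded = [dictionary[code] for code in column]  # decompress on read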
Buffer Manager
For disk-based systems, write in open pages, commit to disk
Check pages as mini database

Spanned, Unspanned and Slotted Pages
Spanned – allows large records, but worse random access, tuples/page is not constant
Unspanned – page is a mini database
Slotted – null-terminated variable-sized data (tuples of variable size), store count and offsets (grow towards each other)
Offset type → byte for 256 B, short for 65 KB, int for 4 GB
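A minimal slotted-page sketch, assuming byte-sized offsets (pages up to 256 B); for brevity the slot directory is a Python list rather than bytes at the front of the page.

    PAGE_SIZE = 256

    class SlottedPage:
        def __init__(self):
            self.data = bytearray(PAGE_SIZE)
            self.slots = []             # slot directory: offsets, grows from the front
            self.free_end = PAGE_SIZE   # tuple data grows down from the back

        def insert(self, record: bytes) -> int:
            rec = record + b"\x00"      # null-terminated, variable-sized tuple
            self.free_end -= len(rec)
            self.data[self.free_end:self.free_end + len(rec)] = rec
            self.slots.append(self.free_end)
            return len(self.slots) - 1  # slot id

        def get(self, slot_id: int) -> bytes:
            off = self.slots[slot_id]
            return bytes(self.data[off:self.data.index(0, off)])  # scan to the terminator

    page = SlottedPage()
    sid = page.insert(b"hello")
    print(page.get(sid))   # b'hello'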
Joins
Inner – cross product with selection
Outer – returns all of 1 side, filling nulls
Full outer – union of left and right
Foreign key – 1 join partner

Join Algorithms
Nested loop – O(nm), easy to parallelise
Sort-merge – O(n log n + m log m + n + m), sequential I/O, works for inequality predicates
Hash – probe parallelisable, overallocation, poor if table doesn't fit in memory

Partitioning (Hash join)
Smaller hash domain to partition, parallel smaller joins, localised random access
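A minimal sketch (function and relation names made up) of a hash join with a partitioning pass: both inputs are split on a smaller hash domain, giving independent, parallelisable joins with localised access.

    from collections import defaultdict

    def partitioned_hash_join(left, right, n_parts=4):
        """Join two lists of (key, payload) pairs on key."""
        parts_l, parts_r = defaultdict(list), defaultdict(list)
        for k, v in left:                       # partition pass: smaller hash domain
            parts_l[hash(k) % n_parts].append((k, v))
        for k, v in right:
            parts_r[hash(k) % n_parts].append((k, v))

        out = []
        for p in range(n_parts):                # each partition joined independently
            build = defaultdict(list)
            for k, v in parts_l[p]:             # build hash table on one side
                build[k].append(v)
            for k, w in parts_r[p]:             # probe side is parallelisable
                for v in build[k]:
                    out.append((k, v, w))
        return out

    print(partitioned_hash_join([(1, "a"), (2, "b")], [(1, "x"), (1, "y"), (3, "z")]))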
Indexing (secondary storage)
Clustered/primary – just store all data as a hash table instead of a normal table
Unclustered/secondary – store pointers to tuples in the table

Binned Bitmap
Bitvectors cover a range
Equi-width – equal width bins
Equi-height – sort, determine quantiles

Run-length encoding – replace consecutive duplicates with (value, run length)
Length-prefix sum – store where runs start, length = next - current
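A minimal sketch (values made up) of run-length encoding and the length-prefix-sum variant that stores run start positions instead of lengths.

    from itertools import groupby, accumulate

    values = ["a", "a", "a", "b", "b", "c"]

    rle = [(v, len(list(g))) for v, g in groupby(values)]             # [('a', 3), ('b', 2), ('c', 1)]

    starts = [0] + list(accumulate(length for _, length in rle))      # run starts: [0, 3, 5, 6]
    lengths = [starts[i + 1] - starts[i] for i in range(len(rle))]    # next - current = [3, 2, 1]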
B-Tree
Good updatability & lookup, binary search within a node
Non-root nodes have at least ⌊(n-1)/2⌋ (key: value) pairs
Insert – 1. walk to the correct leaf, 2. on overflow, split, 3. insert the split (middle) element into the parent
Delete – 1. find & delete, 2. if an inner node, replace with the max of the left child, 3. on underflow, replace the splitting key with a neighbour, 4. if still underflow, merge and take the parent's split key, 5. rebalance from the parent

B+-Tree
Keep all data in leaves, fast scans, link leaves. Leaves' payload in pointer space

Volcano
Open, Next, Close
Grouped aggregation – hash table on the attribute, aggregate in slots. Table size is (key + aggregation) * group cardinality * 2
Function ptr → CPU pipeline stall, 15 cycles
Function calls per tuple: Scan – 0. Select, project, group – 2 (read + apply). Cross/join – 1 (hash inline). Out – 1.
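A minimal sketch (operator names made up; Python methods stand in for the function-pointer calls) of the Volcano open/next/close interface, with a scan feeding a selection one tuple per next() call.

    class Scan:                                      # leaf operator: 0 child calls per tuple
        def __init__(self, table): self.table = table
        def open(self): self.it = iter(self.table)
        def next(self): return next(self.it, None)   # None marks end of stream
        def close(self): self.it = None

    class Select:                                    # 2 calls per tuple: read child + apply predicate
        def __init__(self, child, pred): self.child, self.pred = child, pred
        def open(self): self.child.open()
        def next(self):
            t = self.child.next()
            while t is not None and not self.pred(t):
                t = self.child.next()
            return t
        def close(self): self.child.close()

    plan = Select(Scan([(1, 5), (2, 9), (3, 7)]), lambda t: t[1] > 6)
    plan.open()
    while (t := plan.next()) is not None:
        print(t)                                     # (2, 9) then (3, 7)
    plan.close()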
Buffer I/O
Scans – assume spanned pages
If buffers fit in memory, 0 I/O
Sequential – number of occupied pages
Random access – 1 page per access
By-Reference Bulk Processing
Buffer table, pass around positions in the buffer and candidate lists
Only 1 function call used for a child's next
All operators are pipeline breakers
Always using DSM
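A minimal sketch (column and function names made up) of by-reference bulk processing: one call per operator handles a whole column, and operators exchange candidate position lists into the buffer table instead of copying values.

    buffer_table = {                    # columns stay in place (DSM)
        "price": [5, 12, 7, 30, 1],
        "qty":   [2,  1, 4,  3, 9],
    }

    def select_bulk(column, pred, candidates):
        """One call per column: returns qualifying positions, not values."""
        return [pos for pos in candidates if pred(column[pos])]

    def project_bulk(column, candidates):
        return [column[pos] for pos in candidates]

    cand = list(range(len(buffer_table["price"])))                     # all positions
    cand = select_bulk(buffer_table["price"], lambda v: v > 6, cand)   # positions [1, 2, 3]
    print(project_bulk(buffer_table["qty"], cand))                     # [1, 4, 3]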
Page Access Probability
Selectivity s, tuples per page n: P = 1 - (1 - s)^n
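A small worked check of the formula (numbers made up): even at 1% selectivity, with 50 tuples per page roughly 40% of pages are touched.

    def page_access_probability(s: float, n: int) -> float:
        """Probability that a page contains at least one qualifying tuple."""
        return 1 - (1 - s) ** n

    print(page_access_probability(0.01, 50))   # ~0.395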
Dimensions of Optimisations
Algorithm-aware vs agnostic – logical vs physical optimisation
Data-aware vs agnostic – rule-based vs cost-based

Selection Pushdown (Rule-Based Logical)
Push selection through joins
Swap selections if one is more restrictive
Merge if one implies another

Histogram (Cost-Based Logical)
Estimate cardinality by tracking counts of binned unique values
Multidimensional – combinations of values
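A minimal sketch (bin edges and counts made up) of cardinality estimation from an equi-width histogram, assuming values are uniform within a bin.

    bin_edges = [0, 25, 50, 75, 100]       # equi-width bins over the column domain
    bin_counts = [400, 100, 300, 200]      # tuples per bin

    def estimate_less_than(x: float) -> float:
        """Estimated number of tuples with value < x."""
        total = 0.0
        for lo, hi, count in zip(bin_edges, bin_edges[1:], bin_counts):
            if x >= hi:
                total += count                            # whole bin qualifies
            elif x > lo:
                total += count * (x - lo) / (hi - lo)     # uniform within the bin
        return total

    print(estimate_less_than(30))   # 400 + 100 * 5/25 = 420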
Anomalies
Dirty read – read after an uncommitted write
Phantom read – read followed by insert/delete
Dirty write – write after an uncommitted write
Write skew – concurrent updates a=c, c=a
Inconsistent Analysis – insert while scanning
Lost Update – read1, update2, update1

Isolation Levels
Read Uncommitted – read immediately
Read Committed – stops dirty read/write
Repeatable Read – only phantom reads remain
Serialisable – with predicate locking

2-Phase Locking – grow & shrink phases
Deadlock – timeout or cycle detection

Optimistic Concurrency Control
Use timestamped transactions & tuples
Abort if committing to a modified tuple

Multi-Version Concurrency Control
Multiple timestamped versions by snapshotting, may be mid-transaction
Push with timestamp when writing
Time (Stream Processing)
Processing-Time – use time of
processing tuple
Ingestion-Time – time of arrival at the system
Event-Time – external time source
Watermarks/ Punctuations
Watermark – guarantees all tuples
have arrived up to this point
Punctuation – marker in the event stream; can be a watermark or signal other conditions
Windows
Size – Elements in instance
Slide – Distance between starts
Sliding window – slide < size (overlap, e.g. moving avg)
Tumbling window – size = slide
Stream sampling – slide > size
Session window – dynamic based on
events
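A minimal sketch (10/5 second values made up) of assigning an event to windows from size and slide; with slide < size each event lands in size/slide windows, with slide = size in exactly one.

    def window_starts(event_time: float, size: float, slide: float):
        """All window start times whose [start, start + size) interval covers the event."""
        start = (event_time // slide) * slide
        starts = []
        while start > event_time - size:
            starts.append(start)
            start -= slide
        return starts

    print(window_starts(12.0, size=10, slide=5))   # sliding: [10.0, 5.0]
    print(window_starts(12.0, size=5, slide=5))    # tumbling: [10.0]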
Window Aggregation
Distributive – calculable from pairs of partial aggregates
Algebraic – combination of multiple distributive aggregates (e.g. avg = sum/count)
Holistic – look at entire window
Invertible – can add/remove value
by itself
Non-invertible – entire window to
remove
Invertible Distributive – ring buffer implementation
Holistic – overwrite oldest and go over window
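A minimal sketch (count-based window, sum as the invertible distributive aggregate) of the ring-buffer implementation: add the arriving value and invert (subtract) the overwritten one.

    class RunningWindowSum:
        def __init__(self, size: int):
            self.buf = [0] * size          # ring buffer over the window (starts as zeros)
            self.pos = 0
            self.total = 0                 # running, invertible aggregate

        def insert(self, value):
            self.total += value - self.buf[self.pos]   # add new, remove evicted
            self.buf[self.pos] = value                  # overwrite oldest
            self.pos = (self.pos + 1) % len(self.buf)
            return self.total

    w = RunningWindowSum(3)
    print([w.insert(v) for v in [1, 2, 3, 4]])   # [1, 3, 6, 9]; last window is [2, 3, 4]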
2-Stacks Algorithm (Non-Invertible)
Front – newest on top
Back – oldest on top
Each has second stack that
aggregates up
Push front, pop back on change
Push all front to back on empty
back
Aggregate top value of front and
back
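A minimal sketch of the 2-stacks algorithm for a non-invertible aggregate (max here); instead of a literal second stack, each entry stores (value, aggregate of everything below it).

    class TwoStacksMax:
        def __init__(self):
            self.front = []   # newest on top
            self.back = []    # oldest on top

        def push(self, v):                                  # arrivals go on the front stack
            agg = v if not self.front else max(v, self.front[-1][1])
            self.front.append((v, agg))

        def pop(self):                                      # evict the oldest element
            if not self.back:                               # back empty: flip front over,
                while self.front:                           # rebuilding running aggregates
                    v, _ = self.front.pop()
                    agg = v if not self.back else max(v, self.back[-1][1])
                    self.back.append((v, agg))
            self.back.pop()

        def query(self):                                    # combine the two top aggregates
            return max(s[-1][1] for s in (self.front, self.back) if s)

    w, size = TwoStacksMax(), 3
    for i, v in enumerate([3, 1, 4, 1, 5, 9, 2]):
        w.push(v)
        if i >= size:
            w.pop()
        if i >= size - 1:
            print(w.query())   # 4 4 5 9 9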
Handshake Join
Windows move in opposite directions; join tuples when they meet
Bloom filter
Multiple hash functions; each sets a bit as a flag
m bits, n expected unique, k
functions, ε false positive rate
m ≈ -1.44 n log2 ε
k ≈ (m/n) ln 2
ε ≈ (1 - e^(-kn/m))^k
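A minimal Bloom-filter sketch wiring the sizing formulas to the structure; double hashing (an assumption, not from the notes) stands in for the k independent hash functions.

    import math

    class BloomFilter:
        def __init__(self, n_expected: int, eps: float):
            self.m = max(1, math.ceil(-1.44 * n_expected * math.log2(eps)))   # m ≈ -1.44 n log2 ε
            self.k = max(1, round(self.m / n_expected * math.log(2)))         # k ≈ (m/n) ln 2
            self.bits = [False] * self.m

        def _positions(self, item):
            h1, h2 = hash(item), hash((item, "salt"))      # double hashing stand-in
            return [(h1 + i * h2) % self.m for i in range(self.k)]

        def add(self, item):
            for p in self._positions(item):
                self.bits[p] = True                        # set a bit per hash function

        def might_contain(self, item) -> bool:
            return all(self.bits[p] for p in self._positions(item))

    bf = BloomFilter(n_expected=1000, eps=0.01)
    bf.add("tuple-42")
    print(bf.might_contain("tuple-42"))   # True; negatives are certain, positives may be false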