Data Processing Systems

ACID
Atomic – complete or not at all
Consistent – constraints retained
Isolation – as if alone on the system
Durable – cannot undo commits

N-ary Storage (NSM)
Effective for insertion, deletion
For transactional (OLTP) systems

Decomposed Storage (DSM)
Efficient analysis (OLAP)
Tuple reconstruction required for searching & deletion
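A minimal sketch (relation and values made up) contrasting the two layouts: NSM keeps whole tuples together, DSM keeps one array per attribute.

    # NSM (row store): one record per tuple, cheap whole-tuple insert/delete
    nsm = [
        (1, "ann", 23),
        (2, "bob", 31),
    ]

    # DSM (column store): one array per attribute, cheap single-column scans
    dsm = {
        "id":   [1, 2],
        "name": ["ann", "bob"],
        "age":  [23, 31],
    }

    total_age = sum(dsm["age"])                                # OLAP scan touches one column
    tuple_1 = tuple(dsm[c][1] for c in ("id", "name", "age"))  # reconstruction needs every column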
Delta/Main
Disjoint NSM delta, DSM main
Delegate operations, perform occasional move to main

Metadata
Sorted – values increase → binary search
Dense – increases by 1 → pointer arithmetic
Re-establish by sorting, then check if dense

Dictionary Compression
Point to the same entry, used for out-of-place variable-length data.
In-place dictionary will always be loaded, but may have duplication
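A minimal sketch (values made up) of an out-of-place dictionary: the column holds fixed-width codes pointing into a shared dictionary of variable-length strings.

    dictionary = ["London", "Paris", "Tokyo"]        # out-of-place, variable-length entries
    column = [0, 1, 0, 0, 2, 1]                      # fixed-width codes, duplicates share entries

    decoded = [dictionary[code] for code in column]  # decompress on read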
Buffer Manager
For disk-based systems, write in open pages, commit to disk
Check pages as mini database

Spanned, Unspanned and Slotted Pages
Spanned – allows large records, but worse random access, tuples/page is not constant
Unspanned – page is a mini database
Slotted – null-terminated variable-sized data (tuples of variable size), store count and offsets (grow towards each other)
Offset type → byte for 256 B, short for 65 KB, int for 4 GB
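A minimal slotted-page sketch, assuming byte-sized offsets (pages up to 256 B); for brevity the slot directory is a Python list rather than bytes at the front of the page.

    PAGE_SIZE = 256

    class SlottedPage:
        def __init__(self):
            self.data = bytearray(PAGE_SIZE)
            self.slots = []             # slot directory: offsets, grows from the front
            self.free_end = PAGE_SIZE   # tuple data grows down from the back

        def insert(self, record: bytes) -> int:
            rec = record + b"\x00"      # null-terminated, variable-sized tuple
            self.free_end -= len(rec)
            self.data[self.free_end:self.free_end + len(rec)] = rec
            self.slots.append(self.free_end)
            return len(self.slots) - 1  # slot id

        def get(self, slot_id: int) -> bytes:
            off = self.slots[slot_id]
            return bytes(self.data[off:self.data.index(0, off)])  # scan to the terminator

    page = SlottedPage()
    sid = page.insert(b"hello")
    print(page.get(sid))   # b'hello'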
Joins
Inner – cross product with selection
Outer – returns all of 1 side, filling nulls
Full outer – union of left and right
Foreign key – 1 join partner

Join Algorithms
Nested loop – O(nm), easy to parallelise
Sort-merge – O(n log n + m log m + n + m), sequential I/O, works for inequality predicates
Hash – probe parallelisable, overallocation, poor if table doesn't fit in memory

Partitioning (Hash join)
Smaller hash domain to partition, parallel smaller joins, localised random access
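A minimal sketch (function and relation names made up) of a hash join with a partitioning pass: both inputs are split on a smaller hash domain, giving independent, parallelisable joins with localised access.

    from collections import defaultdict

    def partitioned_hash_join(left, right, n_parts=4):
        """Join two lists of (key, payload) pairs on key."""
        parts_l, parts_r = defaultdict(list), defaultdict(list)
        for k, v in left:                       # partition pass: smaller hash domain
            parts_l[hash(k) % n_parts].append((k, v))
        for k, v in right:
            parts_r[hash(k) % n_parts].append((k, v))

        out = []
        for p in range(n_parts):                # each partition joined independently
            build = defaultdict(list)
            for k, v in parts_l[p]:             # build hash table on one side
                build[k].append(v)
            for k, w in parts_r[p]:             # probe side is parallelisable
                for v in build[k]:
                    out.append((k, v, w))
        return out

    print(partitioned_hash_join([(1, "a"), (2, "b")], [(1, "x"), (1, "y"), (3, "z")]))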
Indexing (secondary storage)
Clustered/primary – just store all data as a hash table instead of a normal table
Unclustered/secondary – store pointers to tuples in the table

Binned Bitmap
Bitvectors cover a range
Equi-width – equal width bins
Equi-height – sort, determine quantiles

Run-length encoding – replace consecutive duplicates with (value, run length)
Length-prefix sum – store where runs start, length = next - current
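A minimal sketch (values made up) of run-length encoding and the length-prefix-sum variant that stores run start positions instead of lengths.

    from itertools import groupby, accumulate

    values = ["a", "a", "a", "b", "b", "c"]

    rle = [(v, len(list(g))) for v, g in groupby(values)]             # [('a', 3), ('b', 2), ('c', 1)]

    starts = [0] + list(accumulate(length for _, length in rle))      # run starts: [0, 3, 5, 6]
    lengths = [starts[i + 1] - starts[i] for i in range(len(rle))]    # next - current = [3, 2, 1]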
B-Tree
Good updatability & lookup, binary search within a node
Non-root nodes have at least ⌊(n-1)/2⌋ (key: value) pairs
Insert – 1. walk to the correct leaf, 2. on overflow, split, 3. insert the split (middle) element into the parent
Delete – 1. find & delete, 2. if an inner node, replace with the max of the left child, 3. on underflow, replace the splitting key with a neighbour, 4. if still underflow, merge and take the parent's split key, 5. rebalance from the parent

B+-Tree
Keep all data in leaves, fast scans, link leaves. Leaves' payload in pointer space

Volcano
Open, Next, Close
Grouped aggregation – hash table on the attribute, aggregate in slots. Table size is (key + aggregation) * group cardinality * 2
Function ptr → CPU pipeline stall, 15 cycles
Function calls per tuple: Scan – 0. Select, project, group – 2 (read + apply). Cross/join – 1 (hash inline). Out – 1.
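A minimal sketch (operator names made up; Python methods stand in for the function-pointer calls) of the Volcano open/next/close interface, with a scan feeding a selection one tuple per next() call.

    class Scan:                                      # leaf operator: 0 child calls per tuple
        def __init__(self, table): self.table = table
        def open(self): self.it = iter(self.table)
        def next(self): return next(self.it, None)   # None marks end of stream
        def close(self): self.it = None

    class Select:                                    # 2 calls per tuple: read child + apply predicate
        def __init__(self, child, pred): self.child, self.pred = child, pred
        def open(self): self.child.open()
        def next(self):
            t = self.child.next()
            while t is not None and not self.pred(t):
                t = self.child.next()
            return t
        def close(self): self.child.close()

    plan = Select(Scan([(1, 5), (2, 9), (3, 7)]), lambda t: t[1] > 6)
    plan.open()
    while (t := plan.next()) is not None:
        print(t)                                     # (2, 9) then (3, 7)
    plan.close()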
Buffer I/O
Scans – assume spanned pages
If buffers fit in memory, 0 I/O
Sequential – number of occupied pages
Random access – 1 page per access
By-Reference Bulk Processing
Buffer table, pass around positions in the buffer and candidate lists
Only 1 function call used for a child's next
All operators are pipeline breakers
Always using DSM
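A minimal sketch (column and function names made up) of by-reference bulk processing: one call per operator handles a whole column, and operators exchange candidate position lists into the buffer table instead of copying values.

    buffer_table = {                    # columns stay in place (DSM)
        "price": [5, 12, 7, 30, 1],
        "qty":   [2,  1, 4,  3, 9],
    }

    def select_bulk(column, pred, candidates):
        """One call per column: returns qualifying positions, not values."""
        return [pos for pos in candidates if pred(column[pos])]

    def project_bulk(column, candidates):
        return [column[pos] for pos in candidates]

    cand = list(range(len(buffer_table["price"])))                     # all positions
    cand = select_bulk(buffer_table["price"], lambda v: v > 6, cand)   # positions [1, 2, 3]
    print(project_bulk(buffer_table["qty"], cand))                     # [1, 4, 3]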
Page Access Probability
Selectivity s, tuples per page n: P = 1 - (1 - s)^n
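A small worked check of the formula (numbers made up): even at 1% selectivity, with 50 tuples per page roughly 40% of pages are touched.

    def page_access_probability(s: float, n: int) -> float:
        """Probability that a page contains at least one qualifying tuple."""
        return 1 - (1 - s) ** n

    print(page_access_probability(0.01, 50))   # ~0.395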
Dimensions of Optimisations
Algorithm-aware vs agnostic – logical vs physical optimisation
Data-aware vs agnostic – rule-based vs cost-based

Selection Pushdown (Rule-Based Logical)
Push selection through joins
Swap selections if one is more restrictive
Merge if one implies another

Histogram (Cost-Based Logical)
Estimate cardinality by tracking counts of binned unique values
Multidimensional – combinations of values
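A minimal sketch (bin edges and counts made up) of cardinality estimation from an equi-width histogram, assuming values are uniform within a bin.

    bin_edges = [0, 25, 50, 75, 100]       # equi-width bins over the column domain
    bin_counts = [400, 100, 300, 200]      # tuples per bin

    def estimate_less_than(x: float) -> float:
        """Estimated number of tuples with value < x."""
        total = 0.0
        for lo, hi, count in zip(bin_edges, bin_edges[1:], bin_counts):
            if x >= hi:
                total += count                            # whole bin qualifies
            elif x > lo:
                total += count * (x - lo) / (hi - lo)     # uniform within the bin
        return total

    print(estimate_less_than(30))   # 400 + 100 * 5/25 = 420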
Anomalies
Dirty read – read after an uncommitted write
Phantom read – read followed by insert/delete
Dirty write – write after an uncommitted write
Write skew – concurrent updates a=c, c=a
Inconsistent Analysis – insert while scanning
Lost Update – read1, update2, update1

Isolation Levels
Read Uncommitted – read immediately
Read Committed – stops dirty read/write
Repeatable Read – only phantom reads remain
Serialisable – with predicate locking

2-Phase Locking – grow & shrink phases
Deadlock – timeout or cycle detection

Optimistic Concurrency Control
Use timestamped transactions & tuples
Abort if committing to a modified tuple

Multi-Version Concurrency Control
Multiple timestamped versions by snapshotting, may be mid-transaction
Push with timestamp when writing
Time (Stream Processing)
Processing-Time – use time of
processing tuple
Ingestion-Time – time of arrival at the system
Event-Time – external time source
Watermarks/ Punctuations
Watermark – guarantees all tuples
have arrived up to this point
Punctuation – marker in the event stream; can be a watermark or signal other conditions
Windows
Size – Elements in instance
Slide – Distance between starts
Sliding window – slide < size (overlap, e.g. moving avg)
Tumbling window – size = slide
Stream sampling – slide > size
Session window – dynamic based on
events
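A minimal sketch (10/5 second values made up) of assigning an event to windows from size and slide; with slide < size each event lands in size/slide windows, with slide = size in exactly one.

    def window_starts(event_time: float, size: float, slide: float):
        """All window start times whose [start, start + size) interval covers the event."""
        start = (event_time // slide) * slide
        starts = []
        while start > event_time - size:
            starts.append(start)
            start -= slide
        return starts

    print(window_starts(12.0, size=10, slide=5))   # sliding: [10.0, 5.0]
    print(window_starts(12.0, size=5, slide=5))    # tumbling: [10.0]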
Window Aggregation
Distributive – calculable from pairs of partial aggregates
Algebraic – combination of multiple distributive aggregates (e.g. avg = sum/count)
Holistic – look at entire window
Invertible – can add/remove value
by itself
Non-invertible – entire window to
remove
Invertible Distributive – ring buffer implementation
Holistic – overwrite oldest and go over window
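A minimal sketch (count-based window, sum as the invertible distributive aggregate) of the ring-buffer implementation: add the arriving value and invert (subtract) the overwritten one.

    class RunningWindowSum:
        def __init__(self, size: int):
            self.buf = [0] * size          # ring buffer over the window (starts as zeros)
            self.pos = 0
            self.total = 0                 # running, invertible aggregate

        def insert(self, value):
            self.total += value - self.buf[self.pos]   # add new, remove evicted
            self.buf[self.pos] = value                  # overwrite oldest
            self.pos = (self.pos + 1) % len(self.buf)
            return self.total

    w = RunningWindowSum(3)
    print([w.insert(v) for v in [1, 2, 3, 4]])   # [1, 3, 6, 9]; last window is [2, 3, 4]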
2-Stacks Algorithm (Non-Invertible)
Front – newest on top
Back – oldest on top
Each has second stack that
aggregates up
Push front, pop back on change
Push all front to back on empty
back
Aggregate top value of front and
back
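A minimal sketch of the 2-stacks algorithm for a non-invertible aggregate (max here); instead of a literal second stack, each entry stores (value, aggregate of everything below it).

    class TwoStacksMax:
        def __init__(self):
            self.front = []   # newest on top
            self.back = []    # oldest on top

        def push(self, v):                                  # arrivals go on the front stack
            agg = v if not self.front else max(v, self.front[-1][1])
            self.front.append((v, agg))

        def pop(self):                                      # evict the oldest element
            if not self.back:                               # back empty: flip front over,
                while self.front:                           # rebuilding running aggregates
                    v, _ = self.front.pop()
                    agg = v if not self.back else max(v, self.back[-1][1])
                    self.back.append((v, agg))
            self.back.pop()

        def query(self):                                    # combine the two top aggregates
            return max(s[-1][1] for s in (self.front, self.back) if s)

    w, size = TwoStacksMax(), 3
    for i, v in enumerate([3, 1, 4, 1, 5, 9, 2]):
        w.push(v)
        if i >= size:
            w.pop()
        if i >= size - 1:
            print(w.query())   # 4 4 5 9 9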
Handshake Join
Windows move in opposite directions; join tuples when they meet
Bloom filter
Multiple hash functions; each sets a bit as a flag
m bits, n expected unique, k
functions, ε false positive rate
m ≈ -1.44 n log2 ε
k ≈ (m/n) ln 2
ε ≈ (1 - e^(-kn/m))^k
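A minimal Bloom-filter sketch wiring the sizing formulas to the structure; double hashing (an assumption, not from the notes) stands in for the k independent hash functions.

    import math

    class BloomFilter:
        def __init__(self, n_expected: int, eps: float):
            self.m = max(1, math.ceil(-1.44 * n_expected * math.log2(eps)))   # m ≈ -1.44 n log2 ε
            self.k = max(1, round(self.m / n_expected * math.log(2)))         # k ≈ (m/n) ln 2
            self.bits = [False] * self.m

        def _positions(self, item):
            h1, h2 = hash(item), hash((item, "salt"))      # double hashing stand-in
            return [(h1 + i * h2) % self.m for i in range(self.k)]

        def add(self, item):
            for p in self._positions(item):
                self.bits[p] = True                        # set a bit per hash function

        def might_contain(self, item) -> bool:
            return all(self.bits[p] for p in self._positions(item))

    bf = BloomFilter(n_expected=1000, eps=0.01)
    bf.add("tuple-42")
    print(bf.might_contain("tuple-42"))   # True; negatives are certain, positives may be false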