0% found this document useful (0 votes)
85 views25 pages

Randomized Data Structures Overview

The document discusses randomized data structures. It begins by motivating the use of randomized data structures to avoid worst-case inputs that can occur with deterministic structures like binary search trees. It then explains how randomizing a data structure can transform average-case runtimes into expected runtimes, providing probabilistic guarantees rather than being dependent on a specific input. Two examples of randomized data structures are presented in more detail: Treaps, which combine binary search trees with heaps, and randomized skip lists, which maintain a probabilistic structure rather than a perfect structure. Both support operations like insertion and deletion in expected logarithmic time.

Uploaded by

Rakhi Mahendra
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
85 views25 pages

Randomized Data Structures Overview

The document discusses randomized data structures. It begins by motivating the use of randomized data structures to avoid worst-case inputs that can occur with deterministic structures like binary search trees. It then explains how randomizing a data structure can transform average-case runtimes into expected runtimes, providing probabilistic guarantees rather than being dependent on a specific input. Two examples of randomized data structures are presented in more detail: Treaps, which combine binary search trees with heaps, and randomized skip lists, which maintain a probabilistic structure rather than a perfect structure. Both support operations like insertion and deletion in expected logarithmic time.

Uploaded by

Rakhi Mahendra
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd

CSE 326

Randomized Data Structures

David Kaplan

Dept of Computer Science & Engineering


Autumn 2001
Motivation for
Randomized Data Structures
We’ve seen many data structures with good average
case performance on random inputs, but bad behavior
on particular inputs
 e.g. Binary Search Trees

 Instead of randomizing the input (since we cannot!),


consider randomizing the data structure
 No bad inputs, just unlucky random numbers
 Expected case good behavior on any input

Randomized Data Structures CSE 326 Autumn 2001 2


Average vs. Expected Time
Average (1/N)   xi
Expectation  Pr(xi)  xi

Deterministic with good average time


 If your application happens to always (or often) use the
“bad” case, you are in big trouble!

Randomized with good expected time


 Once in a while you will have an expensive operation, but no
inputs can make this happen all the time

Like an insurance policy for your algorithm!

Randomized Data Structures CSE 326 Autumn 2001 3


Randomized Data Structures
 Define a property (or subroutine) in an algorithm
 Sample or randomly modify the property
 Use altered property as if it were the true property

Can transform average case runtimes into expected


runtimes (remove input dependency).

Sometimes allows substantial speedup in exchange for


probabilistic unsoundness.

Randomized Data Structures CSE 326 Autumn 2001 4


Randomization in Action
 Quicksort
 Randomized data structures
 Treaps
 Randomized skip lists
 Random number generation
 Primality testing

Randomized Data Structures CSE 326 Autumn 2001 5


Treap
Dictionary Data Structure
Treap is a BST priority
2
 binary tree property 9
key

 search tree property


6 4
7 18
Treap is also a heap
 heap-order property 7 9 10
 random priorities 8 15 30

15
12

Randomized Data Structures CSE 326 Autumn 2001 6


Treap Insert
 Choose a random priority
 Insert as in normal BST
 Rotate up until heap order is restored

2 insert(15) 2 2
9 9 9

6 14 6 14 6 9
7 12 7 12 7 15

7 7 9 7 14
8 8 15 8 12
Randomized Data Structures CSE 326 Autumn 2001 7
Tree + Heap… Why Bother?
 Insert data in sorted order into a treap …
What shape tree comes out?

insert(7) insert(8) insert(9) insert(12)


6 6 2 2
7 7 9 9

7 6 6 15
8 7 7 12

7 7
8 8
Randomized Data Structures CSE 326 Autumn 2001 8
Treap Delete delete(9)

2 6
 Find the key 7
9
 Increase its value to  7 
6 9
 Rotate it to the fringe 7 15 8 9
 Snip it off 7 15 9
8 12 15
15
6 6 12
7 7
7 9 7 9 6
8 15 8 15 7
 15 7 9
9 12 8 15
15  15
12 9 12
Treap Summary
 Implements Dictionary ADT
 insert in expected O(log n) time
 delete in expected O(log n) time
 find in expected O(log n) time
 but worst case O(n)

 Memory use
 O(1) per node
 about the cost of AVL trees

 Very simple to implement


 little overhead – less than AVL trees

Randomized Data Structures CSE 326 Autumn 2001 10


Perfect Binary Skip List
 Sorted linked list
 # of links of a node is its height
 The height i link of each node (that has one)
links to the next node of height i or greater

22
11

19

29
8

10

13

20

23
2

Randomized Data Structures CSE 326 Autumn 2001 11


Find in a Perfect Binary Skip List
 Start i at the maximum height
 Until the node is found or i is one and the
next node is too large:
 If the next node along the i link is less than the
target, traverse to the next node
 Otherwise, decrease i by one

Runtime?

Randomized Data Structures CSE 326 Autumn 2001 12


Randomized Skip List Intuition
It’s far too hard to insert into a perfect skip list

But is perfection necessary?

What matters in a skip list?


 We want way fewer tall nodes than short ones
 Make good progress through the list with each
high traverse

Randomized Data Structures CSE 326 Autumn 2001 13


Randomized Skip List
 Sorted linked list
 # of links of a node is its height
 The height i link of each node (that has one) links to
the next node of height i or greater
 There should be about 1/2 as many height i+1 nodes
as height i nodes
13

22
8

10

20

29
19

23
11
2

Randomized Data Structures CSE 326 Autumn 2001 14


Find in a RSL
 Start i at the maximum height
 Until the node is found or i is one and the
next node is too large:
 If the next node along the i link is less than the
target, traverse to the next node
 Otherwise, decrease i by one

Same as for a perfect skip list!

Runtime?

Randomized Data Structures CSE 326 Autumn 2001 15


Insertion in a RSL
 Flip a coin until it comes up heads; that takes
i flips. Make the new node’s height i.
 Do a find, remembering nodes where we go
down
 Put the node at the spot where the find ends
 Point all the nodes where we went down (up
to the new node’s height) at the new node
 Point the new node’s links where those
redirected pointers were pointing

Randomized Data Structures CSE 326 Autumn 2001 16


RSL Insert Example
insert(22)
with 3 flips

13
8

10

20

29
19

23
11
2

13

22
8

10

20

29
19

23
11
2

Runtime?
Range Queries and Iteration
Range query: search for everything that falls between
two values
 find start point
 walk list at the bottom
 output each node until the end point

Iteration: successively return (in order) each element in


the structure
 start at beginning, walk list to end
 Just like a linked list!
Runtimes?

Randomized Data Structures CSE 326 Autumn 2001 18


Randomized Skip List
Summary
 Implements Dictionary ADT
 insert in expected O(log n)
 find in expected O(log n)

 Memory use
 expected constant memory per node
 about double a linked list

 Less overhead than search trees for iteration


over range queries

Randomized Data Structures CSE 326 Autumn 2001 19


Random Number Generation
Linear congruential generator
 xi+1 = Axi mod M

Choose A, M to minimize repetitions


 M is prime
 (Axi mod M) cycles through {1, … , M-1} before repeating
 Good choices:
M = 231 – 1 = 2,147,483,647
A = 48,271

Must handle overflow!

Randomized Data Structures CSE 326 Autumn 2001 20


Some Properties of Primes
If:
 P is prime
0 < X < P

Then:
 XP-1 = 1 (mod P)
 the only solutions to X2 = 1 (mod P) are:
X = 1 and X = P - 1

Randomized Data Structures CSE 326 Autumn 2001 21


Checking Non-Primality
Exponentiation (pow function, Weiss 2.4.4) can be
modified to detect non-primality (composite-ness)

// Computes x^n mod(M)


HugeInt pow(HugeInt x, HugeInt n, HugeInt M) {
if (n == 0) return 1;
if (n == 1) return x;

HugeInt squared = x * x % M;
// 1 < x < M - 1 && squared == 1 --> M isn’t prime!

if (isEven(n)) return pow(squared, n/2, M);


else return (pow(squared, n/2, M) * x);
}

Randomized Data Structures CSE 326 Autumn 2001 22


Checking Primality
Systematic algorithm
For prime P, for all A such that 0 < A < P
Calculate AP-1 mod P using pow
Check at each step of pow and at end for primality conditions

Randomized algorithm
Use one random A
 If the randomized algorithm reports failure, then P isn’t prime.
 If the randomized algorithm reports success, then P might be prime.
 At most (P-9)/4 values of A fool this prime check
 if algorithm reports success, then P is prime with probability > ¾

Each new A has independent probability of false positive


 We can make probability of false positive arbitrarily small

Randomized Data Structures CSE 326 Autumn 2001 23


Evaluating
Randomized Primality Testing
Your probability of being struck by lightning this year:
0.00004%
Your probability that a number that tests prime 11 times
in a row is actually not prime: 0.00003%

Your probability of winning a lottery of 1 million people


five times in a row: 1 in 2100
Your probability that a number that tests prime 50 times
in a row is actually not prime: 1 in 2100

Randomized Data Structures CSE 326 Autumn 2001 24


Cycles
Randomized
algorithms

Random numbers

Prime numbers

Randomized Data Structures CSE 326 Autumn 2001 25

You might also like