0% found this document useful (0 votes)
102 views6 pages

Counting Frequent Item Set Apriori Algo

The document describes the A-Priori algorithm for finding frequent item sets in a data stream. The algorithm uses two passes over the data: (1) the first pass counts single items and determines which are frequent, (2) the second pass counts pairs of frequent items from the first pass to determine frequent pairs. The algorithm exploits the property that if an item/pair is not frequent, then none of its supersets can be frequent. It stores frequent items and counts of candidate pairs in main memory to identify truly frequent pairs in the second pass without reexamining the entire data stream. This process can be extended to find frequent triples and higher-order sets.

Uploaded by

Tanya Sharma
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
102 views6 pages

Counting Frequent Item Set Apriori Algo

The document describes the A-Priori algorithm for finding frequent item sets in a data stream. The algorithm uses two passes over the data: (1) the first pass counts single items and determines which are frequent, (2) the second pass counts pairs of frequent items from the first pass to determine frequent pairs. The algorithm exploits the property that if an item/pair is not frequent, then none of its supersets can be frequent. It stores frequent items and counts of candidate pairs in main memory to identify truly frequent pairs in the second pass without reexamining the entire data stream. This process can be extended to find frequent triples and higher-order sets.

Uploaded by

Tanya Sharma
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 6

Counting frequent item sets in a

stream

unit4/Counting frequent item sets in a


1
stream
A-Priori Algorithm – (1)
• A two-pass approach called
A-Priori limits the need for
main memory
• Key idea: monotonicity
– If a set of items I appears at
least s times, so does every subset J of I
• Contrapositive for pairs:
If item i does not appear in s baskets, then no
pair including i can appear in s baskets

• So, how does A-Priori find freq. pairs?


unit4/Counting frequent item sets in a
2
stream
A-Priori Algorithm – (2)

unit4/Counting frequent item sets in a


3
stream
Main-Memory: Picture of A-Priori

Frequent items
Item counts
Main memory

Counts of
pairs of
frequent items
(candidate
pairs)

Pass 1 Pass 2

unit4/Counting frequent item sets in a


4
stream
Detail for A-Priori
• You can use the
triangular matrix method
with n = number of Item counts Frequent Old
item
frequent items items
#s
– May save space compared
Counts of pairs

Main memory
with storing triples of frequent
Counts of
• Trick: re-number itemsof
pairs
frequent items
frequent items 1,2,… and
keep a table relating new
numbers to original item Pass 1 Pass 2
numbers
unit4/Counting frequent item sets in a
5
stream
Frequent Triples, Etc.
• For each k, we construct two sets of
k-tuples (sets of size k):
– Ck = candidate k-tuples = those that might be
frequent sets (support > s) based on information
from the pass for k–1
– Lk = the set of truly frequent k-tuples
Count All pairs Count
All of items To be
the items the pairs explained
items from L1

C1 Filter L1 Construct C2 Filter L2 Construct C3


unit4/Counting frequent item sets in a
6
stream

You might also like