Association Pattern Mining - Intro
Subhasis Ray
2023-04-12
Are customers buying cereals likely to buy milk?
A database of transactions
tid  itemset
1    Apple, Coke, DVD
2    Bread, Coke, Egg
3    Apple, Bread, Coke, Egg
4    Bread, Egg
In terms of probabilities
support(A => B) = P(A ∪ B), where A ∪ B means A and B
both occurring in the same transaction (note the reversal
from probability notation, where joint occurrence is written P(A ∩ B))
confidence(A => B) = P(B | A)
confidence(A => B) = P(A ∪ B) / P(A)
= support(A ∪ B) / support(A)
= support-count(A ∪ B) / support-count(A)
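The definitions above can be checked against the four-transaction database with a minimal sketch. The names `transactions`, `support_count`, `support` and `confidence` are illustrative choices made here, not part of the slides.

```python
# Toy database from the slide: tid -> itemset
transactions = {
    1: {"Apple", "Coke", "DVD"},
    2: {"Bread", "Coke", "Egg"},
    3: {"Apple", "Bread", "Coke", "Egg"},
    4: {"Bread", "Egg"},
}


def support_count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions.values() if items <= t)


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return support_count(itemset) / len(transactions)


def confidence(antecedent, consequent):
    """support-count(A ∪ B) / support-count(A) for the rule A => B.

    Assumes the antecedent occurs in at least one transaction;
    otherwise the division below raises ZeroDivisionError.
    """
    both = set(antecedent) | set(consequent)
    return support_count(both) / support_count(antecedent)


print(support({"Bread", "Egg"}))       # support of Bread => Egg
print(confidence({"Bread"}, {"Egg"}))  # confidence of Bread => Egg
print(confidence({"Coke"}, {"Bread"})) # confidence of Coke => Bread
```

Here Bread and Egg appear together in transactions 2, 3 and 4, so the rule Bread => Egg has support 3/4 and confidence 3/3, while Coke => Bread has confidence 2/3.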
Now |C| = \sum_{i=1}^{k} \binom{|U|}{i}
What is |C| with |U| = 1000 and k = 10?
~ 2.7 * 10^23
271 orders of magnitude improvement!
But enumerating them will still take longer than the age of the
universe!
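The candidate count for |U| = 1000 and k = 10 can be reproduced with exact integer arithmetic. This is a small sketch; the function name `candidate_count` is an assumption made for illustration.

```python
from math import comb


def candidate_count(n_items, k):
    """Number of non-empty itemsets of size at most k over n_items items."""
    return sum(comb(n_items, i) for i in range(1, k + 1))


size = candidate_count(1000, 10)
print(f"{size:.1e}")  # scientific notation, one decimal place
```

The sum is dominated by its last term, the number of itemsets of size exactly 10, which is why the total stays on the order of 10^23.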
References
Charu Aggarwal
Han, Kamber, Pei
Hongbo Du