The document discusses the naïve Bayes classification model for document classification. It explains that the model calculates the probability that a document belongs to each class (e.g., p(app | words) and p(other | words)) from the words in the document. The model makes the naïve assumption that word probabilities are independent of one another, which greatly simplifies the calculation. Words are not truly independent in real documents, but naïve Bayes often performs reasonably well anyway, because classification depends only on which class probability comes out larger, not on the probabilities being calculated correctly.
Data Smart
Everything You Ever Needed to Know about Spreadsheets but Were Too Afraid to Ask

Cluster Analysis Part I: Using K-Means to Segment Your Customer Base

When You Name a Product Mandrill, You're Going to Get Some Signal and Some Noise

Supervised artificial intelligence models: the naïve Bayes model. In supervised artificial intelligence, you train a model to make predictions using data that has already been classified. The most common use of naïve Bayes is document classification: training data are provided to the training algorithm, and the model can then classify new documents into those categories using what it has learned (p. 77).

The World's Fastest Intro to Probability Theory
High-Level Class Probabilities Are Often Assumed to Be Equal
A Couple More Odds and Ends

Using Bayes Rule to Create an AI Model

Treat each tweet as a bag of words, which means breaking each tweet up into words (often called tokens) at spaces and punctuation. There are two classes of tweets: "app" for the Mandrill.com tweets and "other" for everything else.
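The book does this tokenization in Excel; purely as an illustration of the same idea, here is a minimal Python sketch (the sample tweet and the exact punctuation handling are assumptions, not the book's recipe):

    import re

    def tokenize(text):
        """Break a tweet into lowercase tokens at spaces and punctuation."""
        cleaned = re.sub(r"[^\w\s]", " ", text.lower())  # punctuation -> spaces
        return cleaned.split()                           # split on whitespace

    print(tokenize("Just installed the Mandrill app, loving it!"))
    # -> ['just', 'installed', 'the', 'mandrill', 'app', 'loving', 'it']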
You care about these two probabilities:

    p(app | word1, word2, word3, ...)
    p(other | word1, word2, word3, ...)

These are the probabilities of a tweet being about the app or about something else, given that you see the words word1, word2, word3, etc. The standard implementation of a naïve Bayes model classifies a new document into whichever of these two classes is most likely given the words. The decision rule that picks the most likely class given the words is called the maximum a posteriori (MAP) rule.

Using Bayes Rule, you can rewrite the conditional app probability as follows:

    p(app | word1, word2, ...) = p(app) p(word1, word2, ... | app) / p(word1, word2, ...)

Similarly:

    p(other | word1, word2, ...) = p(other) p(word1, word2, ... | other) / p(word1, word2, ...)

Both right-hand sides share the denominator p(word1, word2, ...), so the MAP decision reduces to asking which is larger:

    p(app) p(word1, word2, ... | app)  or  p(other) p(word1, word2, ... | other)
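In code, the MAP rule is just a comparison of those two products. A sketch with made-up probabilities (none of these numbers come from the book):

    def map_classify(tokens, priors, word_probs, unseen=1e-9):
        """Return the class with the largest p(class) * product of p(word | class)."""
        scores = {}
        for cls, prior in priors.items():
            score = prior
            for w in tokens:
                score *= word_probs[cls].get(w, unseen)  # floor for unseen words (an assumption)
            scores[cls] = score
        return max(scores, key=scores.get)

    # Hypothetical values for illustration only.
    priors = {"app": 0.5, "other": 0.5}
    word_probs = {
        "app":   {"mandrill": 0.05, "email": 0.04, "api": 0.03},
        "other": {"mandrill": 0.01, "monkey": 0.06, "zoo": 0.02},
    }
    print(map_classify(["mandrill", "api"], priors, word_probs))  # -> app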
Now assume that the probabilities of these words being in the document are independent of one another. Then:

    p(app) p(word1, word2, ... | app) = p(app) p(word1 | app) p(word2 | app) p(word3 | app) ...
    p(other) p(word1, word2, ... | other) = p(other) p(word1 | other) p(word2 | other) p(word3 | other) ...

The independence assumption lets you break the joint conditional probability of the bag of words given the class into probabilities of single words given the class. However, words are not independent of one another in a document! The saving grace is that the MAP rule doesn't care whether you calculated your class probabilities correctly; it only cares which incorrectly calculated probability is larger. By assuming independence of words, you inject all sorts of error into the calculation, but at least the sloppiness is applied across the board, so the comparisons used in the MAP rule tend to come out in the same direction they would have had you applied all sorts of fancier linguistic understanding to the model.
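One practical consequence of the product form: multiplying many small per-word probabilities underflows quickly, so summing logarithms is the standard equivalent trick. A minimal sketch with placeholder values (the probabilities are assumptions for illustration):

    import math

    # Hypothetical per-word conditionals p(word | app), for illustration only.
    p_word_given_app = {"word1": 0.02, "word2": 0.01, "word3": 0.005}
    p_app = 0.5

    # Independence assumption: the joint conditional is a product of single-word terms.
    product = p_app
    for p in p_word_given_app.values():
        product *= p

    # The same comparison in log space avoids underflow, even for long documents.
    log_score = math.log(p_app) + sum(math.log(p) for p in p_word_given_app.values())

    print(product)              # 5e-07
    print(math.exp(log_score))  # ~5e-07, the same value up to rounding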
Notes, formulae, problems, solutions; work the steps in Excel (a Python version of the same pipeline is sketched after this list):

Removing Extraneous Punctuation
Splitting on Spaces
Counting Tokens and Calculating Probabilities
And We Have a Model! Let's Use It
Let's Get This Excel Party Started
Wrapping Up
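The book builds these steps in spreadsheet columns; the sketch below mirrors the same pipeline end to end in Python on a made-up two-tweet training set (all tweets, counts, and the add-one smoothing are illustrative assumptions, not the book's worked example):

    import math
    import re
    from collections import Counter

    # Removing extraneous punctuation and splitting on spaces.
    def tokenize(text):
        return re.sub(r"[^\w\s]", " ", text.lower()).split()

    # Hypothetical labeled training tweets, for illustration only.
    training = [
        ("Mandrill API makes transactional email easy", "app"),
        ("Saw a mandrill at the zoo today", "other"),
    ]

    # Counting tokens and calculating probabilities (add-one smoothing assumed).
    counts = {"app": Counter(), "other": Counter()}
    for text, cls in training:
        counts[cls].update(tokenize(text))
    vocab = {w for ctr in counts.values() for w in ctr}

    def log_score(tokens, cls, prior=0.5):
        total = sum(counts[cls].values()) + len(vocab)
        return math.log(prior) + sum(
            math.log((counts[cls][w] + 1) / total) for w in tokens)

    # And we have a model! Classify a new tweet with the MAP rule.
    tweet = tokenize("the Mandrill API is easy")
    print(max(("app", "other"), key=lambda c: log_score(tweet, c)))  # -> app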
Naïve Bayes and the Incredible Lightness of Being an Idiot
Optimisation Modelling: Because That Fresh Squeezed Orange Juice Ain't Gonna Blend Itself
Cluster Analysis Part II: Network Graphs and Community Detection
The Granddaddy of Supervised Artificial Intelligence: Regression
Ensemble Models: A Whole Lot of Bad Pizza
Forecasting: Breathe Easy; You Can't Win
Outlier Detection: Just Because They're Odd Doesn't Mean They're Unimportant
Moving from Spreadsheets into R
Conclusion

Data Smart: Using Data Science to Transform Information into Insight, by John W. Foreman, 2014