Pre-Processing
[Pipeline diagram: Tokenization → Filtering → Stemming → Conversion → pre-processed Tokens]
Data Pre-Processing (cont..)
Example of a pre-processed tweet (after tokenization, filtering and stemming):
great win Fabulous Innings another adopted home ground Excellent role played take India over the line
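Below is a minimal sketch of such a pre-processing pipeline in Python, assuming NLTK is available; the regular expression, stop-word list and PorterStemmer are illustrative stand-ins for whatever tools the project actually uses:

import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

# One-time downloads for the tokenizer and stop-word list
nltk.download("punkt")
nltk.download("stopwords")

def preprocess(tweet):
    # Conversion: lowercase and strip user mentions, URLs and punctuation
    tweet = tweet.lower()
    tweet = re.sub(r"@\w+|https?://\S+|[^a-z\s]", " ", tweet)
    # Tokenization: split the cleaned text into tokens
    tokens = nltk.word_tokenize(tweet)
    # Filtering: drop stop words
    tokens = [t for t in tokens if t not in stopwords.words("english")]
    # Stemming: reduce each remaining word to its stem
    stemmer = PorterStemmer()
    return [stemmer.stem(t) for t in tokens]

print(preprocess("Fabulous innings! A great win at another adopted home ground."))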
Feature Extraction
Feature extraction is the selection of useful words from the tweets in the pre-processed data set. In the feature extraction step, we extract the aspects (features) from the pre-processed Twitter dataset.
There are different ways of extracting features: unigrams, bigrams and, in general, n-grams.
For example: "she is not bad."
If the word 'bad' occurs, the sentiment is not necessarily negative. If we consider 2-grams, the feature 'not bad' is also taken into account, i.e. the statement is most likely a positive one. Therefore, using n-grams as features in classification can improve the result.
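A small sketch of n-gram feature extraction in plain Python; the helper name and the example sentence are illustrative, not part of the original pipeline:

def ngrams(tokens, n):
    # Return the list of n-grams (as tuples) over a token sequence
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "she is not bad".split()
print(ngrams(tokens, 1))   # unigrams: ('she',), ('is',), ('not',), ('bad',)
print(ngrams(tokens, 2))   # bigrams:  ('she', 'is'), ('is', 'not'), ('not', 'bad')
# The bigram ('not', 'bad') lets the classifier see the negated phrase
# as a single feature instead of the negative-looking unigram 'bad'.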
Part-of-Speech (POS) tags such as adjectives, adverbs, verbs and nouns are good indicators of subjectivity and sentiment, which helps determine the polarity of the tweet.
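A small POS-tagging sketch using NLTK's default tagger (the sentence is illustrative; any tagger that emits adjective/adverb/verb/noun tags would serve):

import nltk

nltk.download("punkt")
nltk.download("averaged_perceptron_tagger")

tokens = nltk.word_tokenize("Fabulous innings, what a great win!")
print(nltk.pos_tag(tokens))
# Adjectives (tagged JJ) such as 'Fabulous' and 'great' are strong sentiment cues.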
Negation is a very important and difficult feature to interpret. The presence of a negation in a tweet can flip the polarity of the sentiment.
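One common way to encode negation (an assumed technique, not necessarily the one used in these slides) is to mark the words that follow a negation term:

NEGATIONS = {"not", "no", "never", "n't"}

def mark_negation(tokens):
    # Prefix tokens that follow a negation word with 'NOT_' so the
    # classifier sees 'NOT_bad' as a different feature from 'bad'.
    marked, negate = [], False
    for tok in tokens:
        if tok in NEGATIONS:
            negate = True
            marked.append(tok)
        else:
            marked.append("NOT_" + tok if negate else tok)
    return marked

print(mark_negation("she is not bad".split()))
# ['she', 'is', 'not', 'NOT_bad']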
Naïve Bayes
Naïve Bayes is a machine-learning-based probabilistic approach to sentiment analysis. It is based on Bayes' theorem with an assumption of independence among the predictors (features).
[Frequency table of weather conditions (e.g. Sunny, Rainy) against the Yes / No label; the relevant counts are quoted below.]
If we want to calculate the probability of Yes given that it is a sunny day, then:
=> P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny).
Here we have P(Sunny | Yes) = 3/9 = 0.33, P(Sunny) = 5/14 = 0.36, P(Yes) = 9/14 = 0.64.
Now, P(Yes | Sunny) = 0.33 * 0.64 / 0.36 ≈ 0.60, which is higher than P(No | Sunny), so the prediction for a sunny day is Yes.
In our case, we have training and test data. The training data is used to generate the features that train the algorithm. Assume we have n tweets, of which k are positive and (n - k) are negative.
We are classifying the tweets into two classes (positive / negative).
Example: @xyz when you are happy , you look beautiful !!!
@xyz I am sad .
A = P(positive | tweet) = P(tweet | positive) * P(positive) / P(tweet)
B = P(negative | tweet) = P(tweet | negative) * P(negative) / P(tweet)
Dropping the common denominator P(tweet) and assuming the words are independent:
P(positive | tweet) ∝ P(happy | positive) * P(beautiful | positive) * P(positive)
Similarly,
P(negative | tweet) ∝ P(sad | negative) * P(negative)
Feature counts from the training data:
Features     Positive    Negative
beautiful        4           1
sad              2           5
happy            3           0
total            9           6
If (A > B) then the tweet is positive; otherwise, the tweet is negative.
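A minimal sketch of this Naive Bayes decision in Python, using the counts from the table above. The equal class priors and the add-one (Laplace) smoothing are assumptions for illustration; the slides themselves do not specify either:

# Feature counts copied from the table above
counts = {
    "beautiful": {"positive": 4, "negative": 1},
    "sad":       {"positive": 2, "negative": 5},
    "happy":     {"positive": 3, "negative": 0},
}
totals = {"positive": 9, "negative": 6}
priors = {"positive": 0.5, "negative": 0.5}   # assumed equal priors (k/n in general)

def class_score(tweet_words, label):
    # P(tweet | label) * P(label) with add-one smoothing, so a zero count
    # (e.g. 'happy' never seen as negative) does not wipe out the class.
    vocab_size = len(counts)
    score = priors[label]
    for word in tweet_words:
        if word in counts:
            score *= (counts[word][label] + 1) / (totals[label] + vocab_size)
    return score

tweet = ["happy", "beautiful"]            # "@xyz when you are happy, you look beautiful !!!"
A = class_score(tweet, "positive")
B = class_score(tweet, "negative")
print("positive" if A > B else "negative")   # -> positive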
SUPPORT VECTOR MACHINE
1. "Support Vector Machine" (SVM) is a supervised machine learning algorithm. It is used for both classification and regression.
2. SVM is a supervised learning method that sorts data into two categories.
3. In other words, given labeled training data (supervised learning), the algorithm outputs an optimal hyperplane which categorizes new examples.
HOW DOES IT WORK?
SCENARIO ONE
Here we have three hyperplanes (A, B and C). We need to identify the right hyperplane to classify star and circle.
HOW DOES IT WORK?
SCENARIO TWO
Here we have three hyperplanes (A, B and C) and all of them segregate the classes well. So how can we identify the right hyperplane? The right hyperplane is the one that maximizes the margin, i.e. the distance to the nearest data point of either class.
Sentiment Analysis using SVM
Sentiment analysis is treated as a classification task, as it classifies the orientation of a text into positive or negative.
             POSITIVE    NEGATIVE
POSITIVE       3000          0
NEGATIVE        900          0
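A minimal sketch of sentiment classification with a linear SVM, assuming scikit-learn and a tiny illustrative training set (the tweets and labels here are placeholders, not the project's corpus):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Placeholder training tweets; a real run would use the labelled corpus.
train_tweets = [
    "what a great win, fabulous innings",
    "so happy, you look beautiful",
    "this is a sad day",
    "terrible performance, very disappointing",
]
train_labels = ["positive", "positive", "negative", "negative"]

# TF-IDF features (uni- and bigrams) feeding a linear SVM classifier.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(train_tweets, train_labels)

print(model.predict(["fabulous win today", "such a sad innings"]))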
CALCULATING ACCURACY
Accuracy is calculated as the number of correctly predicted reviews divided by the total number of reviews present in the corpus. The formula for calculating accuracy is given as:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Confusion Matrix
TP=TRUE POSITIVE
TN=TRUE NEGATIVE
FP=FALSE POSITIVE
FN=FALSE NEGATIVE
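As a worked example, assuming the matrix above reads actual class by row and predicted class by column (i.e. TP = 3000, FN = 0, FP = 900, TN = 0):
Accuracy = (3000 + 0) / (3000 + 0 + 900 + 0) = 3000 / 3900 ≈ 0.77, i.e. about 77% of the reviews are predicted correctly.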
A study by Twitter in 2015 showed that 15% of tweets during TV prime time contain at least one emoji, which is why emojis are a major factor to consider. The polarity of an emoticon is based on the score it carries. The polarity of a tweet is the sum of the polarity of the textual part and the emoticon part. Following is a list of some of the emoticons along with their scores:
As we can see, the scores of the negative emoticons are already negative, so when they are added to the polarity of the textual part of the tweet, the polarity of the tweet changes accordingly.
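A minimal sketch of combining the two parts; the emoticon scores below are made up purely for illustration, the actual values are the ones listed on the slide:

# Illustrative emoticon scores; the real values come from the slide's score table.
EMOTICON_SCORES = {":)": 1.0, ":D": 2.0, ":(": -1.0, ":'(": -2.0}

def tweet_polarity(text_polarity, tweet):
    # Total polarity = polarity of the textual part + sum of emoticon scores.
    emoticon_part = sum(score for emo, score in EMOTICON_SCORES.items() if emo in tweet)
    return text_polarity + emoticon_part

# A mildly positive text pulled negative by a crying emoticon:
print(tweet_polarity(0.5, "great game :'("))   # 0.5 + (-2.0) = -1.5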