Sentiment Analysis of Online Data For Business Analytics: Synopsis

The document proposes a system for sentiment analysis of online data from sources like Twitter, Amazon reviews, and other social media platforms. It involves collecting and cleaning the data, performing sentiment analysis using techniques like convolutional neural networks and support vector machines, identifying fake reviews, and providing visualizations and insights. The system is designed to be an all-in-one analytics tool with features like location-based analysis and an attractive user interface.

Uploaded by

sachin mohan

SYNOPSIS

Sentiment Analysis of Online Data for Business Analytics

ABSTRACT

In recent years, social networks have gained unprecedented popularity in business
because of their potential for business growth. Companies can learn more about
consumers’ sentiments towards their products and services, and use this knowledge
to better understand the market and improve their brand. As a result, companies
regularly reinvent their marketing strategies and campaigns to fit consumers’
preferences. Social analysis harnesses the vast volume of data in social
networks to mine information that is critical for strategic decision making. It uses
machine learning techniques and tools to determine patterns and trends and to gain
actionable insights.

The system is designed to analyse customer comments on Twitter and other
platforms using a convolutional neural network combined with a support vector
machine for text sentiment analysis. Instead of requiring someone to read through
thousands of reviews, the proposed model classifies the reviews by polarity and
learns from them. It also adds a feature for identifying fake reviews by applying
supervised machine learning algorithms. A location based analysis feature helps in
identifying the interests of people from a particular region. Data from several
sources, such as Twitter comments, Amazon product reviews and hotel reviews, are used.
INTRODUCTION:

Social media has redefined how companies strategize their business processes.
Social media contains a massive volume of unstructured data (e.g. tweets,
comments, blogs, forum discussions, user posts, and reviews) that
can be used for business intelligence such as customer profiling and content
analytics. Twitter, which is an online social networking service, is mainly used as a
marketing and promotion tool by most companies. Specifically, Twitter data
contains not only user information, but also texts that carry subjective
information (such as user sentiments) about a particular issue. From a business
perspective, the wealth of tweets is enough for companies to gather sufficient
feedback about their products and services from their customers without having to
pay for costly customer surveys and interviews. On the other hand, analyzing
and extracting information from unstructured data poses a formidable challenge to
data miners. Humans can easily find patterns and trends in documents, but this
ability is limited when a large amount of data is involved.

The system makes use of sentiment analysis in business applications. Furthermore,
this paper demonstrates the text analysis process in reviewing the public opinion of
customers towards a certain brand and presents hidden knowledge (e.g. customer
and business insights) that can be used for decision making after the text analysis is
performed. Moreover, because academic literature on text analytics of Twitter data
is limited, this paper attempts to contribute to this developing field by providing
a practical guide on how to mine and analyse customers’ tweets.

METHODOLOGY:
1. Connect to a live media (Twitter, Facebook, Yelp, Amazon) data stream, extract the data using the API and store
it on Hadoop
2. Process the data in Hadoop: restructure, filter and provide useful insights from it
3. Create tables in Hadoop
4. Create an attractive, user friendly interface for end users to run queries
5. Do sentiment analysis by comparing the sentiments of the public about a subject
6. Provide location based classification and analysis
7. Identify fake reviews using supervised machine learning algorithms
8. Provide visualization of analytics
9. Create pie chart, percentage calculation, bar diagram, word cloud and histogram
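
As a rough illustration of the data behind the word cloud and histogram in step 9, the sketch below counts word frequencies across a set of texts. It is written in Python with the standard library only, not in the R/Shiny stack the synopsis describes, and the function name `word_frequencies` is a hypothetical helper introduced here.

```python
from collections import Counter

def word_frequencies(texts, top_n=10):
    """Count how often each word occurs across all texts.

    Returns the top_n (word, count) pairs, most frequent first.
    These counts are the usual input to a word cloud or histogram.
    """
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts.most_common(top_n)

# Illustrative input only; not data from the project.
sample_texts = ["good camera", "good price"]
top_words = word_frequencies(sample_texts, top_n=1)
```

A plotting layer would then map each count to a bar height or a font size.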

Existing System                                           | Proposed System
Focus on Twitter data                                     | Focus on Twitter, Amazon, Yelp and other online sources
No facility for fake review identification                | Integrated with fake review identification using supervised machine learning algorithms
Location based features available on independent models   | Integrated location based review analysis
No user friendly interface                                | Attractive user friendly interface
Not an all in one analysis system                         | Enhanced to be an all in one analysis system
SVM not used                                              | SVM is used

OVERVIEW
Tweets are imported using R and the data is cleaned by removing emoticons and
URLs. Lexical Analysis and a Naive Bayes Classifier are used to predict the
sentiment of tweets, and the opinion is then expressed graphically through
ggplots, histograms, pie charts, word clouds and tables. The front end has been created
using Shiny.

FEATURES
1. Extraction of data
(i) Create a Twitter application
(ii) twitteR - Provides an interface to the Twitter web API
(iii) ROAuth - R interface for OAuth
(iv) Create a Twitter authenticated credential object (using the keys from step (i) and
the cacert.pem certificate): this is done using the consumer key, consumer secret,
access token and access secret.
(v) During authentication, we are redirected automatically to a URL where we
click on Authorize app as shown in the image below and enter the unique
7-digit number to get linked to the account from which feeds are being taken.

2. Cleaning data
The tweets are cleaned in R by removing:
● Extra punctuation
● Stop words (the most commonly used words in a language, such as the, is, at,
which, and on)
● Redundant blank spaces
● Emoticons
● URLs
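
The cleaning steps listed above can be sketched as follows. The synopsis performs them in R; this version uses Python's standard library purely to illustrate the order of operations. The stop-word list is a tiny illustrative subset, the emoticon pattern is a rough assumption that will miss some forms, and `clean_tweet` is a hypothetical name.

```python
import re
import string

# Small illustrative stop-word list; a real run would load a fuller one.
STOP_WORDS = {"the", "is", "at", "which", "and", "on"}

# URLs starting with http://, https:// or www.
URL_RE = re.compile(r"(https?://|www\.)\S+")

# Rough matcher (an assumption): ASCII emoticons such as :) ;-( =D that
# stand as separate tokens, plus the main Unicode emoji/symbol blocks.
EMOTICON_RE = re.compile(
    r"(?<!\S)[:;=][\-o\*']?[\)\]\(\[dDpP/\\|](?!\w)"
    "|[\U0001F300-\U0001FAFF\u2600-\u27BF]"
)

# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def clean_tweet(text):
    """Remove URLs, emoticons, punctuation, stop words and extra spaces."""
    text = URL_RE.sub(" ", text)        # URLs first, before ':' and '/' are touched
    text = EMOTICON_RE.sub(" ", text)   # then emoticons and emoji
    text = text.translate(PUNCT_TABLE).lower()
    words = [w for w in text.split() if w not in STOP_WORDS]
    return " ".join(words)              # split/join collapses redundant blanks

cleaned = clean_tweet("The camera is great :) and cheap!!  http://t.co/abc")
```

URLs are stripped before punctuation so that the address is removed as a whole rather than left behind as letter fragments.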

3. Loading Word Database

A database of positive and negative words, created by Hu and Liu, is
loaded into R. This is used for Lexical Analysis, where the words in the tweets are
compared with the words in the database and the sentiment is predicted.
For movie tweets, the Naive Bayes Machine Learning Algorithm is used. AFINN is
a list of English words rated for valence with an integer between minus five
(negative) and plus five (positive). The words were manually labeled by Finn
Årup Nielsen in 2009-2011. The file is tab-separated. The version used is
AFINN-111, the newest version, with 2477 words and phrases.
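
A minimal sketch of loading a tab-separated valence list and scoring a text against it is given below, assuming one `term<TAB>score` entry per line as described above. It is written in Python rather than R, the function names are hypothetical, and the three sample entries are made up to show the format; they are not copied from the AFINN file.

```python
import io

def load_valence_lexicon(lines):
    """Parse 'term<TAB>score' lines into a dict of term -> integer valence."""
    lexicon = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split on the last tab so multi-word phrases stay intact.
        term, score = line.rsplit("\t", 1)
        lexicon[term] = int(score)
    return lexicon

def score_text(text, lexicon):
    """Sum the valence of every uni-gram found in the lexicon.

    Returns (total score, label), where the label is positive,
    negative or neutral depending on the sign of the total.
    """
    total = sum(lexicon.get(word, 0) for word in text.lower().split())
    if total > 0:
        label = "positive"
    elif total < 0:
        label = "negative"
    else:
        label = "neutral"
    return total, label

# Made-up entries in the same tab-separated format (not real AFINN values).
SAMPLE = "good\t3\nbad\t-3\nslow\t-1\n"
lexicon = load_valence_lexicon(io.StringIO(SAMPLE))
result = score_text("good camera but slow delivery", lexicon)
```

Words that are absent from the list contribute zero, so a text with no matches is labeled neutral.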

4. Algorithms used
● Lexical Analysis: By comparing uni-grams to the pre-loaded word
database, each tweet is assigned a sentiment score (positive, negative or
neutral) and an overall score is calculated.
● Naive Bayes Machine Learning Algorithm: Training data sets are
used to teach the machine which kinds of sentences are categorized as
positive and which are categorized as negative. On arrival of a new
tweet or sentence, the machine uses this algorithm to assign a
category to the new data and to add a level to the emotion.
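
To make the Naive Bayes step concrete, here is a small multinomial Naive Bayes text classifier with add-one (Laplace) smoothing, written from scratch in Python. It is a sketch of the general technique, not the project's R implementation; the class name and the four training sentences are hypothetical, and the choice of add-one smoothing is an assumption.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesText:
    """Multinomial Naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.class_counts = Counter()            # documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        words = [w for w in text.lower().split() if w in self.vocab]
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_logp = None, -math.inf
        for label, n_label in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            logp = math.log(n_label / n_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                logp += math.log((count + 1) / (total_words + vocab_size))
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label

# Hypothetical training sentences, for illustration only.
train_texts = [
    "great movie loved it",
    "wonderful acting great story",
    "boring movie hated it",
    "terrible plot boring acting",
]
train_labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayesText().fit(train_texts, train_labels)
```

Calling `model.predict("great story")` returns the class with the highest posterior score; words never seen in training are ignored.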

5. Classification based on location and type of data

The data collected can be classified based on the location of users. Analysis can thus be made
by continent, country, state or a particular area of users.
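
One way to picture this grouping is to average the sentiment score per region at a chosen level. The Python sketch below assumes each record is a dictionary that carries a numeric `score` and optional location fields such as `country` or `state`; these field names, the function name and the sample records are all hypothetical.

```python
from collections import defaultdict

def sentiment_by_region(records, level="country"):
    """Average the sentiment score of records grouped by a location field.

    Records that lack the chosen field are grouped under 'unknown'.
    """
    totals = defaultdict(lambda: [0, 0])  # region -> [score sum, record count]
    for rec in records:
        region = rec.get(level) or "unknown"
        totals[region][0] += rec["score"]
        totals[region][1] += 1
    return {region: s / n for region, (s, n) in totals.items()}

# Illustrative records only; not data from the project.
records = [
    {"country": "IN", "state": "Kerala", "score": 2},
    {"country": "IN", "state": "Goa", "score": -1},
    {"country": "US", "state": "Texas", "score": 3},
    {"score": 1},
]
by_country = sentiment_by_region(records, level="country")
by_state = sentiment_by_region(records, level="state")
```

Changing `level` switches between coarser and finer regions without touching the rest of the pipeline.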

6. Fake review identification

Fake reviews are identified by applying supervised machine learning algorithms
to the collected reviews.

7. Results representation
In the table tab of our Shiny web app, as shown below, we present the
scores, the tweets and the percentage of positive/negative emotion in the text.
This is calculated using simple arithmetic to give a better understanding of the
overall sentiment.
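
The percentage calculation can be sketched as the share of each sentiment label among all classified texts. This is an assumed reading of the "simple arithmetic" mentioned above, shown in Python; `sentiment_percentages` is a hypothetical helper and the labels are illustrative.

```python
from collections import Counter

def sentiment_percentages(labels):
    """Return the percentage of positive, negative and neutral labels."""
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(labels)
    return {
        label: round(100 * counts[label] / total, 1)
        for label in ("positive", "negative", "neutral")
    }

# Illustrative labels only.
shares = sentiment_percentages(["positive", "positive", "negative", "neutral"])
```

The same dictionary can feed a pie chart or a summary row in the results table.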

SYSTEM REQUIREMENTS
● Installation of R
● Installation of RStudio
● Installation of Hadoop HDFS
● Media authentication to access the APIs (Twitter, Amazon, etc.)

Software Requirements
• Operating System : Windows XP or above
• Web Server : Apache
• IDE : NetBeans
• Language : R
• Interface framework : Shiny
• Storage : Hadoop (HDFS)

Hardware Requirements
• Processor : Intel Core i5
• Speed : 2.1 GHz
• RAM : 4 GB
