BI UNIT 2 & 3

UNIT 2

Decision making is the process of making choices by identifying a decision, gathering information,
and assessing alternative resolutions.

Using a step-by-step decision-making process can help you make more deliberate, thoughtful
decisions by organizing relevant information and defining alternatives. This approach increases the
chances that you will choose the most satisfying alternative possible.

Step 1: Identify the decision


You realize that you need to make a decision. Try to clearly define the nature of the decision you
must make. This first step is very important.

Step 2: Gather relevant information


Collect some pertinent information before you make your decision: what information is needed, the
best sources of information, and how to get it. This step involves both internal and external “work.”
Some information is internal: you’ll seek it through a process of self-assessment. Other information
is external: you’ll find it online, in books, from other people, and from other sources.

Step 3: Identify the alternatives


As you collect information, you will probably identify several possible paths of action, or
alternatives. You can also use your imagination and additional information to construct new
alternatives. In this step, you will list all possible and desirable alternatives.

Step 4: Weigh the evidence


Draw on your information and emotions to imagine what it would be like if you carried out each of
the alternatives to the end. Evaluate whether the need identified in Step 1 would be met or
resolved through the use of each alternative. As you go through this difficult internal process, you’ll
begin to favor certain alternatives: those that seem to have a higher potential for reaching your
goal. Finally, place the alternatives in a priority order, based upon your own value system.

Step 5: Choose among alternatives


Once you have weighed all the evidence, you are ready to select the alternative that seems to be the best one for you. You may even choose a combination of alternatives. Your choice in Step 5 may very likely be the same as or similar to the alternative you placed at the top of your list at the end of Step 4.

Step 6: Take action


You’re now ready to take some positive action by beginning to implement the alternative you
chose in Step 5.

Step 7: Review your decision & its consequences


In this final step, consider the results of your decision and evaluate whether or not it has resolved
the need you identified in Step 1. If the decision has not met the identified need, you may want to
repeat certain steps of the process to make a new decision. For example, you might want to gather
more detailed or somewhat different information or explore additional alternatives.
What is a Decision Support System (DSS)?
A decision support system (DSS) is an information system that aids a business in
decision-making activities that require judgment, determination, and a sequence of
actions.

The information system assists the mid- and high-level management of an organization by analyzing huge volumes of unstructured data and accumulating information that can help solve problems and aid decision-making. A DSS is either human-powered, automated, or a combination of both.

Purpose of a Decision Support System


A decision support system produces detailed information reports by gathering and
analyzing data. Hence, a DSS is different from a normal operations application, whose
goal is to collect data and not analyze it.

In an organization, a DSS is used by the planning departments – such as the operations department – which collect data and create reports that can be used by managers for decision-making. Mainly, a DSS is used in sales projection, for inventory and operations-related data, and to present information to customers in an easy-to-understand manner.

Theoretically, a DSS can be employed in various knowledge domains, from business organizations to forest management and the medical field. One of the main applications of a DSS in an organization is real-time reporting. It can be very helpful for organizations that take part in just-in-time (JIT) inventory management.

In a JIT inventory system, the organization requires real-time data on its inventory levels to place orders "just in time" and prevent production delays that would cause a negative domino effect. Therefore, a DSS is more tailored to the individual or organization making the decision than a traditional system.

Components of a Decision Support System


The three main components of a DSS framework are:

1. Model Management System

The model management system stores the models that managers can use in their decision-making. The models are used in decisions regarding the financial health of the organization and in forecasting demand for a good or service.

2. User Interface

The user interface includes tools that help the end-user of a DSS to navigate through
the system.

3. Knowledge Base

The knowledge base includes information from internal sources (information collected
in a transaction process system) and external sources (newspapers and online
databases).
Types of Decision Support Systems
 Communication-driven: Allows companies to support tasks that require
more than one person to work on the task. It includes integrated tools such as
Microsoft SharePoint Workspace and Google Docs.
 Model-driven: Allows access to and management of financial, organizational, and statistical models. Data is collected, and parameters are determined using the information provided by users. The information is then built into a decision-making model to analyze situations. An example of a model-driven DSS is Dicodess – an open-source model-driven DSS.
 Knowledge-driven: Provides factual and specialized solutions to situations
using stored facts, procedures, rules, or interactive decision-making structures
like flowcharts.
 Document-driven: Manages unstructured information in different electronic
formats.
 Data-driven: Helps companies to store and analyze internal and external
data.

Advantages of a Decision Support System


 A decision support system increases the speed and efficiency of decision-making activities. This is possible because a DSS can collect and analyze real-time data.
 It promotes training within the organization, as specific skills must be
developed to implement and run a DSS within an organization.
 It automates monotonous managerial processes, which means more of the
manager’s time can be spent on decision-making.
 It improves interpersonal communication within the organization.

Disadvantages of a Decision Support System


 The cost to develop and implement a DSS is a huge capital investment, which
makes it less accessible to smaller organizations.
 A company can develop a dependence on a DSS, as it is integrated into daily
decision-making processes to improve efficiency and speed. However,
managers tend to rely on the system too much, which takes away the
subjectivity aspect of decision-making.
 A DSS may lead to information overload because an information system tends
to consider all aspects of a problem. It creates a dilemma for end-users, as
they are left with multiple choices.
 Implementation of a DSS can cause fear and backlash from lower-level
employees. Many of them are not comfortable with new technology and are
afraid of losing their jobs to technology.

4 Phases of the Decision-Making Process

Simon's model defines four phases of the decision-making process:

 Intelligence Phase
 Design Phase
 Choice Phase
 Implementation Phase
Intelligence Phase
Firstly, the decision-making process starts with the intelligence phase. In this first phase, decision makers examine reality and try to identify problems or opportunities correctly. This phase is not only related to Simon's decision-making process but also to other fields and methodologies. For example, we like to practice the Lean Startup methodology, which emphasizes the importance of defining the right problem before building anything (a product or a business).

Additionally, data is one of the pillars of digital transformation. Organizations need to become data driven, which means the proper usage and implementation of Business Intelligence (BI) systems. Business Intelligence implementations are considered successful only if you have clear business needs and see real benefits from them. Business Intelligence is not just about data. It should be connected with organizational goals and objectives!

Therefore, the intelligence phase includes actions like:

 Defining organizational objectives
 Data collection
 Problem identification and classification
 ….

The intelligence phase can last a really long time. But since the decision-making process starts with this phase, it should be done properly. This is a key ingredient in every business success.

Design Phase
The main goal of the design phase is to define and construct a model that represents the system, by defining relationships between the collected variables. Once we validate the model, we define the criteria of choice and search for several possible solutions to the defined problem (opportunity). We wrap up the design phase by predicting the future outcomes for each alternative.

Choice Phase
In this phase we are actually making decisions; the end product of this phase is a decision. The decision is made by evaluating and selecting among the alternatives defined in the previous step. If we are sure that the decision we made can actually be achieved, we are ready for the next phase.

Implementation Phase
All the previous steps we've made (intelligence, design, and choice) are now implemented. Implementation can be either successful or not. Successful implementation results in a solution to the defined problem. On the other hand, failure brings us back to an earlier phase.

We have described Simon's model which, even today, serves as the basis of most models of the decision-making process. The process is described as a series of events that precede the final decision. It is important to note that, at any point, the decision maker may choose to return to a previous step for additional validation.

Even though Simon's model has sometimes been criticized as being too general, it remains a valuable concept: a framework for how organizations and managers make decisions, and a reminder of the importance of structured decision-making.

Unit-3 - Predictive modeling and sentiment analysis
What are Neural Networks?
Neural networks extract identifying features from data, lacking pre-
programmed understanding. Network components include neurons,
connections, weights, biases, propagation functions, and a learning rule.
Neurons receive inputs, governed by thresholds and activation functions.
Connections involve weights and biases regulating information transfer.
Learning, adjusting weights and biases, occurs in three stages: input
computation, output generation, and iterative refinement enhancing the
network’s proficiency in diverse tasks.
These include:
1. The neural network is stimulated by a new environment.
2. The free parameters of the neural network are changed as a result of this stimulation.
3. The neural network then responds in a new way to the environment because of the changes in its free parameters.

Importance of Neural Networks

The ability of neural networks to identify patterns, solve intricate puzzles, and adjust to changing surroundings is essential. Their capacity to learn from data has far-reaching effects, ranging from revolutionizing technologies like natural language processing and self-driving automobiles to automating decision-making processes and increasing efficiency in numerous industries. The development of artificial intelligence is largely dependent on neural networks, which also drive innovation and influence the direction of technology.

How do Neural Networks work?


Let's understand how a neural network works with an example:
Consider a neural network for email classification. The input layer takes
features like email content, sender information, and subject. These inputs,
multiplied by adjusted weights, pass through hidden layers. The network,
through training, learns to recognize patterns indicating whether an email is
spam or not. The output layer, with a binary activation function, predicts
whether the email is spam (1) or not (0). As the network iteratively refines
its weights through backpropagation, it becomes adept at distinguishing
between spam and legitimate emails, showcasing the practicality of neural
networks in real-world applications like email filtering.
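
To make the email example concrete, here is a minimal scikit-learn sketch of such a classifier. It is illustrative only: the three numeric features stand in for real email attributes (e.g., counts of suspicious words), and the tiny dataset is made up.

```python
# Illustrative sketch: a small feed-forward network for binary spam
# classification. The feature vectors are hypothetical stand-ins for
# real email features such as word counts.
import numpy as np
from sklearn.neural_network import MLPClassifier

# Made-up training data: each row encodes one email; 1 = spam, 0 = not spam.
X = np.array([[3, 0, 1], [0, 2, 0], [4, 1, 1], [0, 3, 0]])
y = np.array([1, 0, 1, 0])

# One hidden layer; fit() adjusts the weights via backpropagation.
clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
clf.fit(X, y)

print(clf.predict([[2, 0, 1]]))  # classify a new, spam-like email
```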

Working of a Neural Network

Neural networks are complex systems that mimic some features of the functioning of the human brain. A network is composed of an input layer, one or more hidden layers, and an output layer, each made up of coupled artificial neurons. The two stages of the basic process are forward propagation and backpropagation.
Forward Propagation
 Input Layer: Each feature in the input layer is represented by a node on
the network, which receives input data.
 Weights and Connections: The weight of each neuronal connection
indicates how strong the connection is. Throughout training, these
weights are changed.
 Hidden Layers: Each hidden layer neuron processes inputs by
multiplying them by weights, adding them up, and then passing them
through an activation function. By doing this, non-linearity is introduced,
enabling the network to recognize intricate patterns.
 Output: The final result is produced by repeating the process until the
output layer is reached.
Backpropagation
 Loss Calculation: The network's output is evaluated against the real target values, and a loss function is used to compute the difference. For a regression problem, the Mean Squared Error (MSE) is commonly used as the cost function.

Loss Function: MSE = (1/n) Σ (yᵢ − ŷᵢ)², where yᵢ is the target value and ŷᵢ is the network's prediction for sample i.
 Gradient Descent: Gradient descent is then used by the network to
reduce the loss. To lower the inaccuracy, weights are changed based on
the derivative of the loss with respect to each weight.
 Adjusting weights: The weights are adjusted at each connection by
applying this iterative process, or backpropagation, backward across the
network.
 Training: During training with different data samples, the entire process
of forward propagation, loss calculation, and backpropagation is done
iteratively, enabling the network to adapt and learn patterns from the
data.
 Activation Functions: Non-linearity is introduced into the model by activation functions like the rectified linear unit (ReLU) or sigmoid. Whether a neuron "fires" is decided based on its whole weighted input.
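
The forward propagation and backpropagation stages described above can be sketched end to end in plain NumPy. This is a minimal illustration under assumed toy data and layer sizes, not a production implementation.

```python
# Minimal sketch: one hidden layer, sigmoid activations, MSE loss,
# trained by gradient descent. Data and shapes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(16, 3))                  # 16 samples, 3 features
y = (X.sum(axis=1, keepdims=True) > 0) * 1.0  # toy binary target

W1, b1 = rng.normal(size=(3, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.5

for epoch in range(500):
    # Forward propagation: weighted sums passed through activations.
    h = sigmoid(X @ W1 + b1)
    y_hat = sigmoid(h @ W2 + b2)
    loss = np.mean((y - y_hat) ** 2)          # MSE loss

    # Backpropagation: chain rule gives each weight's gradient,
    # applied backward from the output layer to the input layer.
    d_out = (y_hat - y) * y_hat * (1 - y_hat) * (2 / y.size)
    dW2, db2 = h.T @ d_out, d_out.sum(axis=0, keepdims=True)
    d_hid = (d_out @ W2.T) * h * (1 - h)
    dW1, db1 = X.T @ d_hid, d_hid.sum(axis=0, keepdims=True)

    # Gradient descent: step each weight against its gradient.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"final MSE: {loss:.4f}")
```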
Types of Neural Networks
Several types of neural networks can be used, including:
 Feedforward Networks: A feedforward neural network is a simple
artificial neural network architecture in which data moves from input to
output in a single direction. It has input, hidden, and output layers;
feedback loops are absent. Its straightforward architecture makes it
appropriate for a number of applications, such as regression and pattern
recognition.
 Multilayer Perceptron (MLP): MLP is a type of feedforward neural
network with three or more layers, including an input layer, one or more
hidden layers, and an output layer. It uses nonlinear activation functions.
 Convolutional Neural Network (CNN): A Convolutional Neural
Network (CNN) is a specialized artificial neural network designed for
image processing. It employs convolutional layers to automatically learn
hierarchical features from input images, enabling effective image
recognition and classification. CNNs have revolutionized computer vision
and are pivotal in tasks like object detection and image analysis.
 Recurrent Neural Network (RNN): An artificial neural network type
intended for sequential data processing is called a Recurrent Neural
Network (RNN). It is appropriate for applications where contextual
dependencies are critical, such as time series prediction and natural
language processing, since it makes use of feedback loops, which enable
information to survive within the network.
 Long Short-Term Memory (LSTM): LSTM is a type of RNN that is
designed to overcome the vanishing gradient problem in training RNNs. It
uses memory cells and gates to selectively read, write, and erase
information.
Support Vector Machine
Support Vector Machine (SVM) is a supervised machine learning algorithm used for both classification and regression. Though it can handle regression problems as well, it is best suited for classification. The main objective of the SVM algorithm is to find the optimal hyperplane in an N-dimensional space that can separate the data points of different classes in the feature space. The hyperplane is chosen so that the margin between the closest points of different classes is as large as possible. The dimension of the hyperplane depends upon the number of features. If the number of input features is two, then the hyperplane is just a line. If the number of input features is three, then the hyperplane becomes a 2-D plane. It becomes difficult to imagine when the number of features exceeds three.
Let’s consider two independent variables x1, x2, and one dependent variable which is either a blue
circle or a red circle.
From the figure above it is very clear that there are multiple lines (our hyperplane here is a line because we are considering only two input features, x1 and x2) that segregate our data points, i.e., classify between the red and blue circles. So how do we choose the best line or, in general, the best hyperplane that segregates our data points?

How does SVM work?

One reasonable choice as the best hyperplane is the one that represents the
largest separation or margin between the two classes.

So we choose the hyperplane whose distance to the nearest data point on each side is maximized. If such a hyperplane exists, it is known as the maximum-margin hyperplane / hard margin. So from the above figure, we choose L2. Let's consider a scenario like the one shown below.
Here we have one blue ball within the boundary of the red balls. So how does SVM classify the data? It's simple! The blue ball within the boundary of the red ones is an outlier of the blue balls. The SVM algorithm has the characteristic of ignoring outliers and finding the best hyperplane that maximizes the margin; SVM is robust to outliers.
So with this kind of data point, what SVM does is find the maximum margin, as with the previous data sets, and add a penalty each time a point crosses the margin. The margins in these types of cases are called soft margins. When there is a soft margin in the data set, the SVM tries to minimize (1/margin) + λ(∑ penalty). Hinge loss is a commonly used penalty: if there are no violations, there is no hinge loss; if there are violations, the hinge loss is proportional to the distance of the violation.

Support Vector Machine Terminology

1. Hyperplane: The hyperplane is the decision boundary used to separate the data points of different classes in a feature space. In the case of linear classification, it is a linear equation, i.e. wx + b = 0.
2. Support Vectors: Support vectors are the closest data points to the hyperplane, and they play a critical role in deciding the hyperplane and margin.
3. Margin: Margin is the distance between the support vectors and the hyperplane. The main objective of the support vector machine algorithm is to maximize the margin, as a wider margin indicates better classification performance.
4. Kernel: A kernel is a mathematical function used in SVM to map the original input data points into high-dimensional feature spaces, so that the hyperplane can be found easily even if the data points are not linearly separable in the original input space. Some of the common kernel functions are linear, polynomial, radial basis function (RBF), and sigmoid.
5. Hard Margin: The maximum-margin hyperplane or the hard margin
hyperplane is a hyperplane that properly separates the data points of
different categories without any misclassifications.
6. Soft Margin: When the data is not perfectly separable or contains
outliers, SVM permits a soft margin technique. Each data point has a slack
variable introduced by the soft-margin SVM formulation, which softens the
strict margin requirement and permits certain misclassifications or
violations. It discovers a compromise between increasing the margin and
reducing violations.
7. C: The regularisation parameter C in SVM balances margin maximisation against misclassification penalties. It decides the penalty for crossing the margin or misclassifying data items. A greater value of C imposes a stricter penalty, which results in a smaller margin and perhaps fewer misclassifications.
8. Hinge Loss: A typical loss function in SVMs is hinge loss. It punishes
incorrect classifications or margin violations. The objective function in
SVM is frequently formed by combining it with the regularisation term.
9. Dual Problem: SVM can be solved through the dual of the optimisation problem, which requires locating the Lagrange multipliers related to the support vectors. The dual formulation enables the use of kernel tricks and more efficient computation.

Types of Support Vector Machine

Based on the nature of the decision boundary, Support Vector Machines (SVM) can be divided into two main types:
 Linear SVM: Linear SVMs use a linear decision boundary to separate the
data points of different classes. When the data can be precisely linearly
separated, linear SVMs are very suitable. This means that a single straight
line (in 2D) or a hyperplane (in higher dimensions) can entirely divide the
data points into their respective classes. A hyperplane that maximizes the
margin between the classes is the decision boundary.
 Non-Linear SVM: Non-Linear SVM can be used to classify data when it
cannot be separated into two classes by a straight line (in the case of 2D).
By using kernel functions, nonlinear SVMs can handle nonlinearly
separable data. The original input data is transformed by these kernel
functions into a higher-dimensional feature space, where the data points
can be linearly separated. A linear SVM is used to locate a nonlinear
decision boundary in this modified space.

Advantages of SVM

 Effective in high-dimensional cases.
 It is memory efficient, as it uses a subset of training points in the decision function, called support vectors.
 Different kernel functions can be specified for the decision function, and it is possible to specify custom kernels.
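
A minimal scikit-learn sketch ties these ideas together: the C argument is the regularisation parameter discussed above, and the kernel handles data that is not linearly separable. The synthetic dataset is purely illustrative.

```python
# Soft-margin SVM sketch with an RBF kernel on synthetic data.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=2, n_redundant=0,
                           random_state=0)

# Larger C = stricter penalty on margin violations (closer to a hard margin).
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X, y)

print("number of support vectors:", clf.support_vectors_.shape[0])
print("training accuracy:", clf.score(X, y))
```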
What is the K-Nearest Neighbors Algorithm?
KNN is one of the most basic yet essential classification algorithms in
machine learning. It belongs to the supervised learning domain and finds
intense application in pattern recognition, data mining, and intrusion
detection.
It is widely applicable in real-life scenarios since it is non-parametric, meaning it does not make any underlying assumptions about the distribution of data (as opposed to other algorithms such as GMM, which assume a Gaussian distribution of the given data). We are given some prior data (also called training data), which classifies coordinates into groups identified by an attribute.

Why do we need a KNN algorithm?


The K-Nearest Neighbors (K-NN) algorithm is a versatile and widely used machine learning algorithm, valued primarily for its simplicity and ease of implementation. It does not require any assumptions about the underlying data distribution. It can also handle both numerical and categorical data, making it a flexible choice for various types of datasets in classification and regression tasks. It is a non-parametric method that makes predictions based on the similarity of data points in a given dataset. K-NN is also less sensitive to outliers compared to some other algorithms.
The K-NN algorithm works by finding the K nearest neighbors to a given data
point based on a distance metric, such as Euclidean distance. The class or
value of the data point is then determined by the majority vote or average of
the K neighbors. This approach allows the algorithm to adapt to different
patterns and make predictions based on the local structure of the data.

Distance Metrics Used in KNN Algorithm


As we know, the KNN algorithm helps us identify the nearest points or groups for a query point. But to determine the closest groups or the nearest points for a query point, we need some metric. For this purpose, we use the distance metrics below:

Euclidean Distance

This is nothing but the Cartesian distance between two points in the plane/hyperplane. Euclidean distance can also be visualized as the length of the straight line joining the two points under consideration. This metric helps us calculate the net displacement between two states of an object.
Manhattan Distance

The Manhattan distance metric is generally used when we are interested in the total distance traveled by the object rather than its displacement. This metric is calculated by summing the absolute differences between the coordinates of the points in n dimensions.

Minkowski Distance

We can say that the Euclidean, as well as the Manhattan distance, are
special cases of the Minkowski distance.
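
All three metrics can be computed in a few lines of NumPy; the two points below are arbitrary examples, and the code shows that Minkowski with r = 2 and r = 1 reproduces the Euclidean and Manhattan distances.

```python
# Sketch: Euclidean, Manhattan, and Minkowski distances in n dimensions.
import numpy as np

p, q = np.array([1.0, 2.0, 3.0]), np.array([4.0, 0.0, 3.0])

euclidean = np.sqrt(np.sum((p - q) ** 2))  # straight-line distance
manhattan = np.sum(np.abs(p - q))          # sum of absolute differences

def minkowski(a, b, r):
    # r = 2 gives Euclidean distance, r = 1 gives Manhattan distance.
    return np.sum(np.abs(a - b) ** r) ** (1 / r)

print(euclidean, minkowski(p, q, 2))  # both ~3.606
print(manhattan, minkowski(p, q, 1))  # both 5.0
```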

How to choose the value of k for KNN Algorithm?


The value of k is very crucial in the KNN algorithm, as it defines the number of neighbors considered. The value of k in the k-nearest neighbors (k-NN) algorithm should be chosen based on the input data. If the input data has more outliers or noise, a higher value of k would be better. It is recommended to choose an odd value for k to avoid ties in classification. Cross-validation methods can help in selecting the best k value for the given dataset, as in the sketch below.
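
As a sketch of such cross-validation, the snippet below searches odd values of k with scikit-learn; the bundled iris dataset is used purely for illustration.

```python
# Choosing k by 5-fold cross-validation over odd candidate values.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Odd values of k help avoid ties in majority voting.
grid = GridSearchCV(KNeighborsClassifier(),
                    {"n_neighbors": [1, 3, 5, 7, 9, 11]}, cv=5)
grid.fit(X, y)
print("best k:", grid.best_params_["n_neighbors"])
```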

Workings of KNN algorithm


The K-Nearest Neighbors (KNN) algorithm operates on the principle of similarity: it predicts the label or value of a new data point by considering the labels or values of its K nearest neighbors in the training dataset.

A step-by-step explanation of how KNN works is given below:

Step 1: Selecting the optimal value of K

 K represents the number of nearest neighbors that need to be considered while making a prediction.

Step 2: Calculating distance


 To measure the similarity between the target point and the training data points, Euclidean distance is used. The distance is calculated between each data point in the dataset and the target point.

Step 3: Finding Nearest Neighbors

 The k data points with the smallest distances to the target point are the
nearest neighbors.

Step 4: Voting for Classification or Taking Average for Regression

 In a classification problem, the class label is determined by performing majority voting. The class with the most occurrences among the neighbors becomes the predicted class for the target data point.
 In a regression problem, the output is calculated by taking the average of the target values of the K nearest neighbors. The calculated average value becomes the predicted output for the target data point.
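
The four steps above can be written from scratch in a few lines. This is a minimal sketch for the classification case; the tiny two-cluster dataset is made up for illustration.

```python
# From-scratch KNN classification: distances, k nearest, majority vote.
from collections import Counter
import numpy as np

X_train = np.array([[1, 1], [1, 2], [2, 1], [6, 6], [7, 6], [6, 7]])
y_train = np.array([0, 0, 0, 1, 1, 1])

def knn_predict(x, k=3):
    dists = np.sqrt(((X_train - x) ** 2).sum(axis=1))      # Step 2: distances
    nearest = np.argsort(dists)[:k]                        # Step 3: k nearest
    return Counter(y_train[nearest]).most_common(1)[0][0]  # Step 4: vote

print(knn_predict(np.array([2, 2])))  # -> 0 (near the first cluster)
print(knn_predict(np.array([6, 5])))  # -> 1 (near the second cluster)
```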
Advantages of the KNN Algorithm
 Easy to implement as the complexity of the algorithm is not that high.
 Adapts Easily – KNN stores all the training data in memory, so whenever a new example or data point is added, the algorithm adjusts itself and the new example contributes to future predictions as well.
 Few Hyperparameters – The only parameters required in training a KNN algorithm are the value of k and the choice of distance metric used for evaluation.
Disadvantages of the KNN Algorithm
 Does not scale – The KNN algorithm is often called a lazy algorithm: all computation is deferred to prediction time, which requires a lot of computing power and data storage. This makes the algorithm both time-consuming and resource-exhausting.
 Curse of Dimensionality – KNN is affected by the curse of dimensionality (the so-called peaking phenomenon), which means the algorithm has a hard time classifying data points properly when the dimensionality is too high.
 Prone to Overfitting – Because the algorithm is affected by the curse of dimensionality, it is prone to overfitting as well. Feature selection and dimensionality reduction techniques are generally applied to deal with this problem.
What is Sentiment Analysis?
Sentiment analysis is the process of classifying whether a block of text is positive, negative, or neutral. The goal of sentiment mining is to analyze people's opinions in a way that can help businesses expand. It focuses not only on polarity (positive, negative & neutral) but also on emotions (happy, sad, angry, etc.). It uses various Natural Language Processing approaches, such as rule-based, automatic, and hybrid methods.
Let's consider a scenario: if we want to analyze whether a product satisfies customer requirements, or whether there is a need for this product in the market, we can use sentiment analysis to monitor that product's reviews. Sentiment analysis is also efficient to use when there is a large set of unstructured data that we want to classify by tagging it automatically.
Why is Sentiment Analysis Important?
Sentiment analysis captures the contextual meaning of words, indicating the social sentiment around a brand, and helps the business determine whether the product they are manufacturing is going to be in demand in the market or not.
1. Sentiment analysis is required as it stores data in an efficient, cost-friendly way.
2. Sentiment analysis solves real-time issues and can help you handle real-time scenarios.
Here are some key reasons why sentiment analysis is important for
business:
 Customer Feedback Analysis: Businesses can analyze customer reviews, comments, and feedback to understand the sentiment behind them, helping to identify areas for improvement and address customer concerns, ultimately enhancing customer satisfaction.
 Brand Reputation Management: Sentiment analysis allows businesses
to monitor their brand reputation in real-time.
By tracking mentions and sentiments on social media, review platforms,
and other online channels, companies can respond promptly to both
positive and negative sentiments, mitigating potential damage to their
brand.
 Product Development and Innovation: Understanding customer
sentiment helps identify features and aspects of their products or services
that are well-received or need improvement. This information is
invaluable for product development and innovation, enabling companies
to align their offerings with customer preferences.
 Competitor Analysis: Sentiment Analysis can be used to compare the
sentiment around a company’s products or services with those of
competitors.
Businesses identify their strengths and weaknesses relative to
competitors, allowing for strategic decision-making.
 Marketing Campaign Effectiveness:
Businesses can evaluate the success of their marketing campaigns by
analyzing the sentiment of online discussions and social media mentions.
Positive sentiment indicates that the campaign is resonating with the
target audience, while negative sentiment may signal the need for
adjustments.
What are the Types of Sentiment Analysis?
Fine-Grained Sentiment Analysis

This depends on the polarity of the text. The categories can be: very positive, positive, neutral, negative, or very negative. The rating is done on a scale of 1 to 5: if the rating is 5 it is very positive, 2 is negative, and 3 is neutral.

Emotion detection

The sentiments happy, sad, angry, upset, jolly, pleasant, and so on come
under emotion detection. It is also known as a lexicon method of sentiment
analysis.

Aspect-Based Sentiment Analysis


It focuses on a particular aspect. For instance, if a person wants to check a feature of a cell phone, such as the battery, screen, or camera quality, aspect-based sentiment analysis is used.

Multilingual Sentiment Analysis

Multilingual sentiment analysis covers different languages, where the classification into positive, negative, and neutral still needs to be done. This is highly challenging and comparatively difficult.

How does Sentiment Analysis work?


Sentiment analysis in NLP is used to determine the sentiment expressed in a piece of text, such as a review, comment, or social media post. The goal is to identify whether the expressed sentiment is positive, negative, or neutral. Let's go over the process in two general steps:

Preprocessing

We start by collecting the text data that needs to be analysed for sentiment, such as customer reviews, social media posts, news articles, or any other form of textual content. The collected text is then pre-processed to clean and standardize the data through various tasks:
 Removing irrelevant information (e.g., HTML tags, special characters).
 Tokenization: Breaking the text into individual words or tokens.
 Removing stop words (common words like “and,” “the,” etc. that don’t
contribute much to sentiment).
 Stemming or Lemmatization: Reducing words to their root form.
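
A minimal pure-Python sketch of these preprocessing tasks is shown below. The stop-word list and suffix rules are tiny illustrations of the idea; a real pipeline would use a library such as NLTK or spaCy.

```python
# Cleaning, tokenization, stop-word removal, and naive suffix stemming.
import re

STOP_WORDS = {"and", "the", "is", "a", "to", "it"}  # illustrative subset

def preprocess(text):
    text = re.sub(r"<[^>]+>", " ", text.lower())  # strip HTML tags
    tokens = re.findall(r"[a-z']+", text)         # tokenize into words
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # Naive stemming: chop common suffixes to approximate a root form.
    return [re.sub(r"(ing|ed|s)$", "", t) for t in tokens]

print(preprocess("<p>The setup was amazing and it worked!</p>"))
# -> ['setup', 'wa', 'amaz', 'work']
```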

Analysis

Text is converted into a numerical form for analysis using techniques like bag-of-words or word embeddings (e.g., Word2Vec, GloVe). Models are then trained with labeled datasets, associating text with sentiments (positive, negative, or neutral). After training and validation, the model predicts sentiment on new data, assigning labels based on learned patterns.
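
A minimal sketch of this analysis stage: bag-of-words features feeding a Naive Bayes classifier. The four labeled examples are made up purely for illustration.

```python
# Bag-of-words features plus a simple sentiment classifier.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["love this product", "terrible and slow",
         "really good value", "worst purchase ever"]
labels = ["positive", "negative", "positive", "negative"]

vec = CountVectorizer()
X = vec.fit_transform(texts)  # bag-of-words matrix (word counts)

model = MultinomialNB().fit(X, labels)
print(model.predict(vec.transform(["really love it"])))  # ['positive']
```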

What are the Approaches to Sentiment Analysis?


There are four main approaches used:

Rule-based

Here, the lexicon method, tokenization, and parsing fall under the rule-based approach. The approach counts the number of positive and negative words in the given text. If the number of positive words is greater than the number of negative words, the sentiment is positive; otherwise, it is negative.
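
A minimal sketch of this counting approach is shown below; the two word lists are illustrative stand-ins for a real sentiment lexicon.

```python
# Rule-based sentiment: compare counts of positive and negative words.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "poor"}

def rule_based_sentiment(text):
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return "positive"
    return "negative" if neg > pos else "neutral"

print(rule_based_sentiment("Great camera but terrible battery and poor screen"))
# -> 'negative' (1 positive word vs 2 negative words)
```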

Machine Learning

This approach works with machine learning techniques. First, models are trained on labeled datasets and predictive analysis is done. Next, features are extracted from the text, and classifiers such as Naive Bayes, Support Vector Machines, hidden Markov models, and conditional random fields are applied.

Neural Network

In the last few years, neural networks have evolved at a very fast rate. This approach involves using artificial neural networks, which are inspired by the structure of the human brain, to classify text into positive, negative, or neutral sentiments. It includes recurrent neural networks, long short-term memory networks, gated recurrent units, etc., to process sequential data like text.

Hybrid Approach

It is the combination of two or more approaches, e.g., the rule-based and machine learning approaches. The benefit is that the accuracy is higher compared to the individual approaches.

Sentiment Analysis Applications


1. Social media monitoring
2. Customer support ticket analysis
3. Brand monitoring and reputation management
4. Listen to voice of the customer (VoC)
5. Listen to voice of the employee
6. Product analysis
7. Market research and competitive research

Social media monitoring

Social media posts often contain some of the most honest opinions
about your products, services, and businesses because they’re
unsolicited.

With the help of sentiment analysis software, you can wade through
all that data in minutes, to analyze individual emotions and overall
public sentiment on every social platform.

Sentiment analysis can read beyond simple definitions to detect sarcasm, read common chat acronyms (lol, rofl, etc.), and correct for common mistakes like misused and misspelled words.

Comment 1

“Love the user interface. Setup took five minutes and we were
ready to go.”

Comment 2

“Took me 2 hours to set up, then I find out I have to update my OS.
Love it!”
Sentiment analysis would classify the second comment as negative,
even though they both use words that, without context, would be
considered positive.

Keeping track of customer comments allows you to engage with customers in real time.

You'll be able to quickly respond to negative or positive comments, and get regular, dependable insights about your customers, which you can use to monitor your progress from one quarter to the next.

Customer support

Customer support management presents many challenges due to the sheer number of requests, varied topics, and diverse branches within a company – not to mention the urgency of any given request.

Sentiment analysis with natural language understanding (NLU) reads regular human language for meaning, emotion, tone, and more, to understand customer requests, just as a person would. You can automatically process customer support tickets, online chats, phone calls, and emails by sentiment to prioritize any urgent issues.

Try out our sentiment analysis classifier to see how sentiment analysis could be used to sort thousands of customer support messages instantly by understanding words and phrases that contain negative opinions.

Brand monitoring and reputation management

Brand monitoring is one of the most popular applications of sentiment analysis in business. Bad reviews can snowball online, and the longer you leave them, the worse the situation will be. With sentiment analysis tools, you will be notified about negative brand mentions immediately.

Not only that, you can keep track of your brand’s image and
reputation over time or at any given moment, so you can monitor
your progress. Whether monitoring news stories, blogs, forums, and
social media for information about your brand, you can transform
this data into usable information and statistics.

You can also trust machine learning to follow trends and anticipate
outcomes, to stay ahead and go from reactive to proactive.

Listen to voice of the customer (VoC)


Combine and evaluate all of your customer feedback from the web,
customer surveys, chats, call centers, and emails. Sentiment
analysis allows you to categorize and structure this data to identify
patterns and discover recurring topics and concerns.

Listening to the voice of your customers, and learning how to communicate with your customers – what works and what doesn't – will help you create a personalized customer experience.

Listen to your employees

By analyzing the sentiment of employee feedback, you'll know how to better engage your employees, reduce turnover, and increase productivity.

Use sentiment analysis to evaluate employee surveys or analyze Glassdoor reviews, emails, Slack messages, and more.

Process unstructured data to go beyond who and what to uncover the why – discover the most common topics and concerns to keep your employees happy and productive.

Product analysis

Find out what the public is saying about a new product right after
launch, or analyze years of feedback you may have never seen. You
can search keywords for a particular product feature (interface, UX,
functionality) and use aspect-based sentiment analysis to find only
the information you need.

Discover how a product is perceived by your target audience, which elements of your product need to be improved, and what will make your most valuable customers happy. All with sentiment analysis.

Market and competitor research

Use sentiment analysis for market and competitor research. Find out
who’s receiving positive mentions among your competitors, and
how your marketing efforts compare.

Analyze the positive language your competitors are using to speak to their customers and weave some of this language into your own brand messaging and tone-of-voice guide.

Sentiment Analysis Process


Step 1: Data collection
This is one of the most important steps in the sentiment analysis process.
Everything from here on will be dependent on the quality of the data that has
been gathered and how it has been annotated or labelled.


API Data - Data can be uploaded through Live APIs for social media. A news
API can help you glean information from all kinds of news publishers, while a
Facebook API can allow you to take all the publicly available data you need
from its platform. You can also use open source repositories like Kaggle,
or Amazon reviews.



Manual - If you have data that you already have from a CRM tool, you can
manually upload that onto the sentiment analysis API as a .csv file.

Step 2: Data processing

The processing of the data will depend on the kind of information it contains - text, image, video, or audio. Repustate IQ's sentiment analysis steps also include handling video content analysis with the same ease as text analytics. Below are the sub-tasks.


Audio transcription - The audio from the video data is transcribed through speech-to-text software to ensure that any video or audio file (e.g., a podcast) in the data is not overlooked.



Caption overlay - If there are any captions appearing in the video, they are
extracted by Repustate IQ and analyzed for any appearing entities, aspects,
or topics that you have identified as important.



Image overlay - Similarly, the platform recognizes and captures any images in the video or text data through OCR (optical character recognition).


Logo recognition - Repustate IQ's intelligent data scanner immediately recognizes any logos that appear in the video background. This includes even logos that appear on the clothes of the presenter, or on an item like a pen or mug on the desk. It even picks up logos from background posters. This is done so that not even the smallest detail goes unnoticed when the platform conducts sentiment analysis of your brand.



Text extraction - All the text is similarly recognized and extracted in the
sentiment analysis process. This includes emojis and hashtags as well, which
are a vital part of social media sentiment analysis. Unlike other sentiment
analysis platforms, Repustate IQ ensures that emojis are never left out of data
processing because that could lead to false positives or negatives.

Step 3: Data analysis

There are many subtasks that need to be done for this stage of the sentiment
analysis process.


Training the model - A dedicated, classified, and labeled sentiment analysis dataset that will be used to train the model needs to be pre-processed and manually labelled. It is this labelled data that is used to train the model, by comparing the correctly classified data with the incorrectly classified data. This helps improve the custom model that is created for a brand.



Multilingual data - In sentiment analysis steps that include multilingual data
processing, Repustate IQ has the dataset for each language individually
annotated and trained. This is because the platform does not rely on
translations at all since all the information can get distorted, or nuances lost
due to vast differences in certain languages like Spanish and Korean. This is
the reason why Repustate gives the highest accuracy scores compared to
other platforms.



Custom Tags - In this part of the process, custom tags for aspects and
themes will be created for the data such as brand mentions, product name,
etc. Once the model has been trained, it will automatically segregate text
based on these custom created tags.



Topic Classification - The topic classifier attaches a theme to a text. For
example, this text “The dresses were awesome, and I found some really good
scarves as well.” will be tagged as the topic “clothes”.



Sentiment Analysis - Each aspect and theme is isolated at this stage by the platform and then analysed for sentiment. Sentiment scores are given in the range of -1 to +1, with a neutral statement scored as zero. This assignment of polarity is important because, even as the platform assigns different scores to different aspects like convenience, speed, cleanliness, functionality, drinks, ambience, etc., it is the aggregate score that is calculated to know the sentiment of the audience towards the brand. So, if 3 of the 7 aspects receive a poor rating (-0.65) and 4 of them receive good ones (+0.5), the sentiment analyzer will give an average score as the overall sentiment of the brand, as in the worked example below.
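
That aggregation can be worked through directly with the figures quoted above:

```python
# Averaging aspect-level scores (range -1 to +1) into an overall score.
aspect_scores = [-0.65, -0.65, -0.65, 0.5, 0.5, 0.5, 0.5]  # 3 poor, 4 good

overall = sum(aspect_scores) / len(aspect_scores)
print(round(overall, 3))  # ~0.007, i.e. a near-neutral overall sentiment
```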

Step 4: Data visualization

Once all the steps in the sentiment analysis process have been covered, the
insights are quickly turned into actionable reports in the form of graphs and
charts. These reports can then be shared within teams as well. These visual
reports are really important because it is through them that you see granular,
aspect-based results. For example, when you get an average score for your
brand, you can filter the results in the sentiment analysis dashboard to see
which aspects got how high a score and which ones got low scores. This will
give you an idea as to what areas need your attention more than others.
Thus, at this stage of the sentiment analysis steps, you get actionable insights
that you can use to decide the right course of action for your growth plans.

What is Speech Analytics?


Speech analytics, also called interaction analytics, is technology that leverages artificial
intelligence to understand, process, and analyze human speech. Contact centers use speech
analytics to assess call recordings and transcripts from digital channels such as chat and text
messages. The fact that speech analytics software can analyze 100% of contacts 24/7 means
contact centers can be more proactive and have a more accurate view of what happens during
customer interactions.

Speech analytics enables the examination of customer interactions to extract meaningful insights
and customer sentiment by converting spoken words into structured data points. Leveraging
natural language processing (NLP), automatic speech recognition (ASR), machine learning, and
artificial intelligence (AI), it extracts customer preferences, behavior, and emotions from customer
conversations.

The power of speech analytics applications


Contact centers metamorphose into potent information hubs when speech analytics is brought into
the mix. The applications of speech analytics are vast and varied, with game-changing effects:

1. Enhancing customer experience: Speech analytics software extracts customer emotions and sentiments from voice calls. Businesses can tailor their services for an enhanced customer experience by understanding customer needs and preferences. Real-time speech analysis can provide agents with on-the-spot guidance and suggestions to better serve customers during calls.
2. Improving operational efficiency: The ability to sift through thousands of recorded calls
to identify trends and patterns helps businesses streamline their operations. Speech
analytics tools analyze the conversation and customer data and suggest areas where
contact center agents can be more efficient or need additional training. Moreover, by
automating the process of sifting through calls, the speech analytics solution frees up
valuable human resources for more critical tasks.
3. Monitoring quality assurance: Ensuring customer interactions align with the company's
quality standards is paramount. Speech analytics can be programmed to flag calls that
deviate from these standards, allowing for timely intervention and coaching. This
contributes to consistency in contact center performance and safeguards the brand
image.
4. Reducing customer churn: Understanding why customers leave is as important as
knowing why they stay. Speech analytics identifies at-risk customers by analyzing
dissatisfaction or mentions of competitor brands. Discovering these customer insights
allows businesses to proactively engage these customers with retention strategies.
5. Sentiment analysis: By analyzing tone, speech patterns, and keywords, speech analytics
can gauge customer emotions during calls. This is invaluable for understanding the
customer's mind and can influence how agents handle the call. For example, an angry
customer may require a more empathetic approach.
6. Compliance and risk management: Speech analytics can ensure that calls comply with
legal regulations. Businesses can mitigate potential legal issues by automatically
detecting non-compliance or risky language.
7. Cost reduction: By identifying common customer issues and queries, businesses can
develop strategies or self-service options to handle them more efficiently. This reduces
the workload on contact centers, resulting in cost savings.
8. Sales optimization: Speech analytics can identify successful sales tactics and behaviors
from previous calls, which can be used to train agents. This leads to more effective sales
calls.
9. Product and service development: Listening to customer feedback through speech
analytics can offer insights into what customers like or dislike about products or services.
These insights can be used for product development and service improvement.
10. Competitive intelligence: Analyzing mentions of competitors and what context they are
brought up in can provide businesses with valuable intelligence on how they are
positioned against competitors in the minds of customers.
