Ba 04

This document provides an overview of business analytics objectives and content, including statistical learning techniques like regression and predictive modeling. It discusses classification techniques like logistic regression, K-nearest neighbors (KNN), and clustering algorithms like K-means and hierarchical clustering. Classification examples involve medical diagnosis, fraud detection, and identifying disease-causing genetic mutations. The document explains key classification concepts such as conditional probability, likelihood functions, and evaluating different values of K in KNN.

Uploaded by Naman Jain
Copyright: © All Rights Reserved

1

Business Analytics
2

Objectives:
• Statistical learning, including quantitative and qualitative analysis techniques
• Predictive analytics using linear, polynomial and logistic regression techniques, and model comparison
• Use of the above analysis and visualization to aid decision making
3

Content:
• Business Analytics - Introduction
• Statistical Methods for Business Analytics
• Basics of Hypothesis Testing
• Correlation and Regression
• Multiple Linear Regression
• Model Comparison and Performance
• Classification
• Time Series Analysis
4

Classification:
• A person arrives at the emergency room with a set of symptoms that could possibly be attributed to one of three medical conditions. Which of the three conditions does the individual have?
• An online banking service must be able to determine whether or not a transaction being performed on the site is fraudulent, on the basis of the user’s IP address, past transaction history, and so forth.
• On the basis of DNA sequence data for a number of patients with and without a given disease, a biologist would like to figure out which DNA mutations are deleterious (disease-causing) and which are not.
5

Classification:

Why not MRA (multiple regression analysis)?
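One common answer to this question, assuming MRA here means ordinary linear regression fitted to a 0/1 response: the fitted values are not constrained to lie between 0 and 1, so they cannot always be read as probabilities. The sketch below uses made-up data and a hypothetical helper `fit_simple_ols` to show fitted values falling outside that range.

```python
# Hypothetical data (not from the slides): one predictor x and a 0/1 label y.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

def fit_simple_ols(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

b0, b1 = fit_simple_ols(xs, ys)
fitted = [b0 + b1 * x for x in xs]

# Fitted "probabilities" below 0 or above 1 have no valid interpretation.
out_of_range = [(x, round(f, 3)) for x, f in zip(xs, fitted) if f < 0 or f > 1]
print(out_of_range)
```

With more than two unordered classes there is a second problem: coding them as 1, 2, 3 imposes an ordering and spacing that the categories do not have.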


6

Classification:

Conditional Probability: Prob(Y = 1 | X = x)


7

Classification: Prob(Y = 1 | X = x)


8

Logistic Regression: Prob (X)

Likelihood Function
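The formulas on this slide did not survive extraction. As a reference, the standard textbook forms for logistic regression with a single predictor are given below; the symbols (beta_0, beta_1, p, ell) are the usual conventions and are assumed here, not taken from the slide.

```latex
% Logistic model for the conditional probability and its log-odds (logit) form
p(X) = \Pr(Y = 1 \mid X)
     = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}},
\qquad
\log\!\left(\frac{p(X)}{1 - p(X)}\right) = \beta_0 + \beta_1 X

% Likelihood function maximised to estimate beta_0 and beta_1
\ell(\beta_0, \beta_1)
  = \prod_{i:\, y_i = 1} p(x_i) \prod_{i':\, y_{i'} = 0} \bigl(1 - p(x_{i'})\bigr)
```

The coefficient estimates are the values of beta_0 and beta_1 that maximise this likelihood.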
9

Logistic Regression: Prob (X)

Likelihood Function
10

Logistic Regression: Prob (X)


11

Logistic Regression: Prob (X)


12

Multiple Logistic Regression: Prob (X)
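A minimal sketch of multiple logistic regression, fitted by plain gradient ascent on the log-likelihood. The data, the function names and the step size are all illustrative assumptions; statistical software typically uses a faster method such as iteratively reweighted least squares.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(beta, x):
    """Pr(Y = 1 | X = x) for coefficients beta = [b0, b1, ..., bp]."""
    z = beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))
    return sigmoid(z)

def log_likelihood(beta, X, y):
    """Sum of log p(x_i) over class 1 and log(1 - p(x_i)) over class 0."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict_proba(beta, xi)
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total

def fit_logistic(X, y, learning_rate=0.1, n_iter=5000):
    """Maximise the log-likelihood with fixed-step gradient ascent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = yi - predict_proba(beta, xi)  # per-observation gradient term
            grad[0] += err
            for j in range(p):
                grad[j + 1] += err * xi[j]
        beta = [b + learning_rate * g / n for b, g in zip(beta, grad)]
    return beta

# Hypothetical data: two predictors per observation, 0/1 response.
X = [[0.5, 1.0], [1.0, 1.5], [1.5, 0.5], [2.0, 2.0], [2.5, 1.0], [3.0, 2.5],
     [3.0, 2.5], [3.5, 1.5], [4.0, 3.0], [4.5, 2.0], [5.0, 3.5], [2.0, 2.0]]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1]

beta = fit_logistic(X, y)
print("coefficients:", [round(b, 3) for b in beta])
print("Pr(Y=1 | x=[4.0, 2.0]):", round(predict_proba(beta, [4.0, 2.0]), 3))
```

Each coefficient is read as the change in log-odds for a one-unit increase in that predictor, holding the others fixed.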


13

Classification – Multi Level Categorical DV

Bayes Theorem
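The slide's formula is missing from the extracted text. For a response with K classes, Bayes' theorem is usually written as below, where pi_k is the prior probability of class k and f_k(x) is the density of X within class k (notation assumed, not taken from the slide).

```latex
\Pr(Y = k \mid X = x) = \frac{\pi_k \, f_k(x)}{\sum_{l=1}^{K} \pi_l \, f_l(x)}
```

An observation is then assigned to the class with the largest posterior probability.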
14

Classification - KNN
15

Classification - KNN
16

Classification – KNN (Value of K)
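A small sketch of how the choice of K changes a KNN prediction, assuming Euclidean distance and majority voting. The points, labels and the helper name `knn_predict` are invented for illustration.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x0, k):
    """Classify x0 by majority vote among its k nearest training points."""
    dists = sorted((math.dist(x, x0), label) for x, label in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training set: class A near (1.5, 1.5), class B near (6.5, 6.5),
# plus one stray B observation sitting inside the A region.
train_X = [(1, 1), (1, 2), (2, 1), (2, 2),
           (6, 6), (6, 7), (7, 6), (7, 7),
           (2.5, 2.5)]
train_y = ["A", "A", "A", "A", "B", "B", "B", "B", "B"]

x0 = (2.6, 2.4)
pred_k1 = knn_predict(train_X, train_y, x0, k=1)  # follows the single stray point
pred_k5 = knn_predict(train_X, train_y, x0, k=5)  # outvoted by surrounding A points
print(pred_k1, pred_k5)
```

A small K gives a flexible, noise-sensitive boundary; a large K gives a smoother one that can miss local structure. K is usually chosen by comparing error on held-out data.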


17

Classification – KNN – For Regression


18

Classification – KNN – For Regression

Figure panels: K = 1 and K = 9
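The comparison between K = 1 and K = 9 can be sketched as follows, assuming a single predictor and a prediction equal to the mean response of the K nearest training points. The data and the helper name `knn_regress` are hypothetical.

```python
def knn_regress(train_x, train_y, x0, k):
    """Predict at x0 as the mean response of the k nearest training points."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - x0))[:k]
    return sum(y for _, y in nearest) / k

# Hypothetical one-dimensional training data.
train_x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
train_y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 20.2]

pred_k1 = knn_regress(train_x, train_y, 3, k=1)  # reproduces the training value at x = 3
pred_k9 = knn_regress(train_x, train_y, 3, k=9)  # average over nine neighbours
print(pred_k1, round(pred_k9, 3))
```

With K = 1 the fit passes through every training point (low bias, high variance); with K = 9 it is an average over a wide neighbourhood, which is smoother but can be pulled away from the local trend.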
19

Classification – Clustering

Unsupervised Learning
20

Classification – Clustering

• K Means

• Hierarchical
21

Classification – Clustering – K Means


22

Classification – Clustering – K Means


23

Classification – Clustering – K Means


24

Classification – Clustering – K Means


25

Classification – Clustering – K Means


1. Randomly assign a number, from 1 to K, to each of the observations. These serve as initial cluster assignments for the observations.
2. Iterate until the cluster assignments stop changing:
   (a) For each of the K clusters, compute the cluster centroid. The kth cluster centroid is the vector of the p feature means for the observations in the kth cluster.
   (b) Assign each observation to the cluster whose centroid is closest (where closest is defined using Euclidean distance).
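The steps above can be sketched in plain Python as follows. The function name `k_means`, the `seed` and `max_iter` parameters, the handling of empty clusters and the sample points are assumptions added for illustration; clusters are numbered from 0 rather than 1.

```python
import math
import random

def k_means(points, k, seed=0, max_iter=100):
    """K-means following the listed steps; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Step 1: randomly assign each observation to one of the k clusters.
    labels = [rng.randrange(k) for _ in points]
    centroids = []
    # Step 2: iterate until assignments stop changing (max_iter is a safety cap).
    for _ in range(max_iter):
        # Step 2(a): each centroid is the vector of feature means of its cluster.
        centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                # Assumption (not in the slide): re-seed an empty cluster
                # with a randomly chosen observation.
                members = [rng.choice(points)]
            dim = len(members[0])
            centroids.append(
                tuple(sum(m[j] for m in members) / len(members) for j in range(dim))
            )
        # Step 2(b): reassign each observation to the closest centroid (Euclidean).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids

# Hypothetical data: two well-separated groups of three points each.
points = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (8.5, 9), (9, 8)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

Because the result depends on the random initial assignment, the algorithm is usually run from several starting points and the solution with the smallest within-cluster variation is kept.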
26

Classification – Clustering – K Means


27

Classification – Clustering-Hierarchical
28

Classification – Clustering-Hierarchical
29

Classification – Clustering-Hierarchical
30

Classification – Clustering-Hierarchical
31

Classification – Clustering-Hierarchical
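The hierarchical clustering slides are figure-only in this extract. As a rough illustration, the sketch below implements bottom-up (agglomerative) clustering. Complete linkage with Euclidean distance is assumed, since the slides do not state which linkage or dissimilarity is used; the function name and data are hypothetical.

```python
import math

def agglomerative(points, n_clusters, linkage=max):
    """Bottom-up clustering. linkage=max is complete linkage, min is single."""
    clusters = [[i] for i in range(len(points))]  # start: one cluster per point
    merges = []  # (members_a, members_b, height), one entry per fusion
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest inter-cluster dissimilarity.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return clusters, merges

# Hypothetical data: two well-separated groups of three points each.
points = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (8.5, 9), (9, 8)]
clusters, merges = agglomerative(points, n_clusters=2)
print(clusters)
for members_a, members_b, height in merges:
    print(members_a, "+", members_b, "at height", round(height, 3))
```

The recorded merge heights are what a dendrogram plots on its vertical axis; cutting the tree at a chosen height yields the clusters, so the number of clusters does not have to be fixed in advance.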
32

Let’s Practice

???
