
Support Vector Machine (with Numerical Example)

Balaji C · 7 min read · Jan 19, 2023


SVM is one of the most popular supervised machine learning algorithms. It can be used for both classification and regression, but it is mainly used for classification.

The main goal of SVM is to find the best decision boundary that segregates the n-dimensional feature space into classes, so that new data points can be placed in the correct category.

This best decision boundary is called a hyper-plane.

SVM chooses the extreme points/vectors that help in creating the hyper-plane. These extreme cases are called support vectors, hence the algorithm is termed a support vector machine.
[Figure: Linear SVM]

There are mainly two types of SVM.

1. Linear SVM: Linear SVM is used for linearly separable data. If a dataset can be classified into two classes with a single straight line, the data is termed linearly separable and the classifier used is called a Linear SVM classifier.

Numerical example for Linear SVM:

Q. Positively labelled data points: (3,1), (3,-1), (6,1), (6,-1); negatively labelled data points: (1,0), (0,1), (0,-1), (-1,0).

Solution: the output is +1 for all positively labelled points and -1 for all negatively labelled points.

[Figure: plot of the labelled points]

Now augment every point with a bias coordinate of 1:

s₁ = (3,1) => s₁` = (3,1,1) | s₂ = (3,-1) => s₂` = (3,-1,1)
s₃ = (6,1) => s₃` = (6,1,1) | s₄ = (6,-1) => s₄` = (6,-1,1)
s₅ = (1,0) => s₅` = (1,0,1) | s₆ = (0,1) => s₆` = (0,1,1)
s₇ = (0,-1) => s₇` = (0,-1,1) | s₈ = (-1,0) => s₈` = (-1,0,1)

***************************************************************

From the graph we can see that one negative point, (1,0,1), and two positive points, (3,1,1) and (3,-1,1), form the support vectors. Call them S₁ = (1,0,1), S₂ = (3,1,1) and S₃ = (3,-1,1), with unknown multipliers α₁, α₂ and α₃.

Generalized equations (one per support vector, with target -1 for the negative point and +1 for the positive ones):

α₁(S₁·S₁) + α₂(S₂·S₁) + α₃(S₃·S₁) = -1 → (1)

α₁(S₁·S₂) + α₂(S₂·S₂) + α₃(S₃·S₂) = +1 → (2)
α₁(S₁·S₃) + α₂(S₂·S₃) + α₃(S₃·S₃) = +1 → (3)

*****************************************************************

Substituting the support vectors:

eq (1): α₁ (1,0,1)·(1,0,1) + α₂ (3,1,1)·(1,0,1) + α₃ (3,-1,1)·(1,0,1) = -1

eq (2): α₁ (1,0,1)·(3,1,1) + α₂ (3,1,1)·(3,1,1) + α₃ (3,-1,1)·(3,1,1) = +1

eq (3): α₁ (1,0,1)·(3,-1,1) + α₂ (3,1,1)·(3,-1,1) + α₃ (3,-1,1)·(3,-1,1) = +1

****************************************************************************

2α₁ + 4α₂ + 4α₃ = -1 → (4)

4α₁ + 11α₂ + 9α₃ = 1 → (5)

4α₁ + 9α₂ + 11α₃ = 1 → (6)

Subtracting (6) from (5) gives α₂ = α₃; substituting this into (4) and (5) and solving the remaining two equations gives

α₁ = -3.5, α₂ = 0.75 and α₃ = 0.75
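These values are easy to verify numerically. Here is a minimal sketch (not from the original article) that builds the Gram matrix of the augmented support vectors and solves equations (4)-(6) with NumPy:

import numpy as np

# Augmented support vectors S1 = (1,0,1), S2 = (3,1,1), S3 = (3,-1,1)
S = np.array([[1, 0, 1],
              [3, 1, 1],
              [3, -1, 1]], dtype=float)

G = S @ S.T                      # Gram matrix of pairwise dot products = coefficients of eqs (4)-(6)
t = np.array([-1.0, 1.0, 1.0])   # right-hand sides of eqs (1)-(3)
print(np.linalg.solve(G, t))     # ≈ [-3.5, 0.75, 0.75]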

****************************************************************************

To find the hyper-plane:

W` = Σ αᵢ Sᵢ

W` = -3.5 * (1,0,1) + 0.75 * (3,1,1) + 0.75 * (3,-1,1)

W` = (1, 0, -2)

The last component of the augmented weight vector is the bias, so w = (1,0) and b = -2. The hyper-plane is w·x + b = 0, i.e. x₁ - 2 = 0: the vertical line x₁ = 2, which splits the data points into the two classes.
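As a sanity check (a sketch, not part of the original article), fitting scikit-learn's linear SVC on the same eight points with a very large C, which approximates the hard-margin problem solved by hand above, recovers essentially the same hyper-plane:

import numpy as np
from sklearn.svm import SVC

X = np.array([[3, 1], [3, -1], [6, 1], [6, -1],    # positively labelled points
              [1, 0], [0, 1], [0, -1], [-1, 0]])   # negatively labelled points
y = np.array([1, 1, 1, 1, -1, -1, -1, -1])

clf = SVC(kernel='linear', C=1e6).fit(X, y)   # large C ~ hard margin
print(clf.coef_, clf.intercept_)              # ≈ [[1. 0.]] and [-2.]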

[Figure: final graph for linear SVM]

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets

# load the iris dataset and keep only the first two features (sepal length/width)
iris = datasets.load_iris()
X = iris.data[:, :2]
y = iris.target

C = 1.0  # SVM regularization parameter
svc = svm.SVC(kernel='linear', C=C).fit(X, y)

# build a mesh grid covering the feature space
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
h = (x_max - x_min) / 100  # step size of the mesh
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))

# predict on the grid and plot the decision regions
Z = svc.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
plt.contourf(xx, yy, Z, cmap=plt.cm.Paired, alpha=0.8)

plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Paired)
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
plt.xlim(xx.min(), xx.max())
plt.title('SVC with linear kernel')
plt.show()

2. Non-linear SVM: Non-linear SVM is used for non-linearly separable data. If a dataset cannot be classified with a straight line, the data is termed non-linear and the classifier used is called a Non-linear SVM classifier.

In non-linear SVM we use a kernel to map the low-dimensional data into a higher-dimensional feature space in which a separating hyper-plane can be found.
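For intuition, here is a minimal sketch (not from the original article) of the kernel trick in scikit-learn: an RBF kernel can handle ring-shaped data that no straight line can split, without ever constructing the higher-dimensional features explicitly. The toy points mirror the numerical example that follows:

import numpy as np
from sklearn.svm import SVC

# outer square (positive) around an inner square (negative): not linearly separable
X = np.array([[2, 2], [2, -2], [-2, -2], [-2, 2],
              [1, 1], [1, -1], [-1, -1], [-1, 1]])
y = np.array([1, 1, 1, 1, -1, -1, -1, -1])

rbf = SVC(kernel='rbf', C=10).fit(X, y)   # kernel trick: implicit high-dimensional feature map
print(rbf.score(X, y))                    # accuracy on the training points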
Numerical example for Non-Linear SVM:

Q. Positively labelled data points: (2,2), (2,-2), (-2,-2), (-2,2); negatively labelled data points: (1,1), (1,-1), (-1,-1), (-1,1).

Solution: We have to find a hyper-plane that divides the data into two classes. The data is not linearly separable, so we first transform it from one feature space to another using a kernel (feature map).

Kernel SVM condition (feature map): if √(x₁² + x₂²) > 2, the point is mapped to φ(x₁, x₂) = (4 - x₂ + |x₁ - x₂|, 4 - x₁ + |x₁ - x₂|); otherwise it is left unchanged.

(1) For the positively labelled data:

φ(2,2): √(2² + 2²) = √8 > 2,
so φ₁ = 4 - 2 + |2 - 2| = 2 and φ₂ = 4 - 2 + |2 - 2| = 2, giving (2, 2)

***********************************************************************

φ(2,-2): √(2² + 2²) = √8 > 2,
so φ₁ = 4 + 2 + |2 + 2| = 10 and φ₂ = 4 - 2 + |2 + 2| = 6, giving (10, 6)

***********************************************************************

φ(-2,-2): √(2² + 2²) = √8 > 2,
so φ₁ = 4 + 2 + |-2 + 2| = 6 and φ₂ = 4 + 2 + |-2 + 2| = 6, giving (6, 6)

***********************************************************************

φ(-2,2): √(2² + 2²) = √8 > 2,
so φ₁ = 4 - 2 + |-2 - 2| = 6 and φ₂ = 4 + 2 + |-2 - 2| = 10, giving (6, 10)

***********************************************************************

Hence the positively labelled data is transformed to (2,2), (10,6), (6,6), (6,10).

(2) For the negatively labelled data points:

All of them have coordinates (±1, ±1), so √(x₁² + x₂²) = √2 < 2 and the points are left unchanged.
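As a quick illustration (a sketch, not from the original article), the same mapping can be written as a small Python function and applied to all eight points:

def phi(x1, x2):
    # feature map used above: points whose norm exceeds 2 are remapped,
    # points whose norm is at most 2 are kept as they are
    if (x1 ** 2 + x2 ** 2) ** 0.5 > 2:
        return (4 - x2 + abs(x1 - x2), 4 - x1 + abs(x1 - x2))
    return (x1, x2)

positives = [(2, 2), (2, -2), (-2, -2), (-2, 2)]
negatives = [(1, 1), (1, -1), (-1, -1), (-1, 1)]
print([phi(*p) for p in positives])   # [(2, 2), (10, 6), (6, 6), (6, 10)]
print([phi(*p) for p in negatives])   # unchanged: [(1, 1), (1, -1), (-1, -1), (-1, 1)]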

[Figure: final plot of the transformed data]

From the graph we can see the support vectors are S₁ = (1,1) and S₂ = (2,2). Augmenting them with a bias coordinate of 1: S₁` = (1,1,1) and S₂` = (2,2,1).

α₁(S₁`·S₁`) + α₂(S₂`·S₁`) = -1 → (1)

α₁(S₁`·S₂`) + α₂(S₂`·S₂`) = +1 → (2)

α₁ (1,1,1)·(1,1,1) + α₂ (1,1,1)·(2,2,1) = -1 → (1)

α₁ (1,1,1)·(2,2,1) + α₂ (2,2,1)·(2,2,1) = +1 → (2)

3α₁ + 5α₂ = -1 → (1) and 5α₁ + 9α₂ = 1 → (2)

On solving these equations: α₁ = -7 and α₂ = 4

To find the hyper-plane:

W` = Σ αᵢ Sᵢ`

W` = -7 * (1,1,1) + 4 * (2,2,1)

W` = (1, 1, -3)

So w = (1,1) and b = -3, and the hyper-plane in the transformed space is x₁ + x₂ - 3 = 0.

On drawing the graph, the hyper-plane is the line x₁ + x₂ = 3, which crosses both axes at 3.
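A quick numeric check (a sketch, not from the original article): plugging the transformed points into w·x + b with w = (1,1) and b = -3 shows that every positive point scores at least +1 and every negative point at most -1:

w, b = (1, 1), -3
positives = [(2, 2), (10, 6), (6, 6), (6, 10)]    # transformed positive points
negatives = [(1, 1), (1, -1), (-1, -1), (-1, 1)]  # negative points (unchanged by the map)

print([w[0] * x1 + w[1] * x2 + b for (x1, x2) in positives])   # [1, 13, 9, 13]
print([w[0] * x1 + w[1] * x2 + b for (x1, x2) in negatives])   # [-1, -3, -5, -3]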

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import style
from sklearn.svm import SVC

style.use('fivethirtyeight')

# create mesh grids
def make_meshgrid(x, y, h=.02):
    x_min, x_max = x.min() - 1, x.max() + 1
    y_min, y_max = y.min() - 1, y.max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    return xx, yy

# plot the contours
def plot_contours(ax, clf, xx, yy, **params):
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    out = ax.contourf(xx, yy, Z, **params)
    return out

color = ['r', 'b', 'g', 'k']

iris = pd.read_csv("iris-data.txt").values

# use petal length and petal width as the two features
features = iris[0:150, 2:4]

# one-vs-rest labels:
# level1 contains 1 for class1 and 0 for all others.
# level2 contains 1 for class2 and 0 for all others.
# level3 contains 1 for class3 and 0 for all others.
level1 = np.zeros(150)
level2 = np.zeros(150)
level3 = np.zeros(150)
for i in range(150):
    if i >= 0 and i < 50:
        level1[i] = 1
    elif i >= 50 and i < 100:
        level2[i] = 1
    elif i >= 100 and i < 150:
        level3[i] = 1

# create 3 SVMs with RBF kernels
svm1 = SVC(kernel='rbf')
svm2 = SVC(kernel='rbf')
svm3 = SVC(kernel='rbf')

# fit each SVM
svm1.fit(features, level1)
svm2.fit(features, level2)
svm3.fit(features, level3)

fig, ax = plt.subplots()
X0, X1 = iris[:, 2], iris[:, 3]
xx, yy = make_meshgrid(X0, X1)

# plot the contours of the three one-vs-rest classifiers
plot_contours(ax, svm1, xx, yy, cmap=plt.get_cmap('hot'), alpha=0.8)
plot_contours(ax, svm2, xx, yy, cmap=plt.get_cmap('hot'), alpha=0.3)
plot_contours(ax, svm3, xx, yy, cmap=plt.get_cmap('hot'), alpha=0.5)

# scatter the data points coloured by class (column 4 holds the class index)
for i in range(len(iris)):
    plt.scatter(iris[i][2], iris[i][3], s=30, c=color[int(iris[i][4])])
plt.show()

Advantages:

It works really well when there is a clear margin of separation.

It is effective in high dimensional spaces.

It is effective in cases where the number of dimensions is greater than the number of samples.

It uses only a subset of the training points in the decision function (the support vectors), so it is also memory efficient.

Disadvantages:

It doesn't perform well on large datasets, because the required training time is high.

It also doesn't perform very well when the dataset has a lot of noise, i.e. when the target classes overlap.

SVM doesn't directly provide probability estimates; these are calculated using an expensive five-fold cross-validation (available through the SVC class of the Python scikit-learn library).
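As a small illustration of the last point (a sketch, reusing the iris features from the earlier snippet), scikit-learn's SVC only exposes probability estimates when probability=True is set, which triggers the internal cross-validated calibration:

from sklearn import svm, datasets

iris = datasets.load_iris()
X, y = iris.data[:, :2], iris.target

# probability=True enables Platt-style calibration via internal cross-validation
clf = svm.SVC(kernel='linear', probability=True).fit(X, y)
print(clf.predict_proba(X[:3]))   # per-class probability estimates for the first 3 samples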
