SVM-1

Support Vector Machine (SVM) is a supervised machine learning algorithm primarily used for classification, aiming to find the optimal hyperplane that separates data into classes by maximizing the margin between support vectors. Linear SVM is applicable for linearly separable data, while Non-Linear SVM employs mapping functions to transform non-linearly separable data into a higher-dimensional space for classification. The document also includes a step-by-step example of finding and drawing an optimal hyperplane for a linearly separable dataset.

Support Vector Machine (SVM) in Machine Learning

•  Definition: SVM is a popular supervised machine learning algorithm used for classification and regression problems. However, it is typically used for classification.
•  Supervised Learning: SVM requires labelled data as input.
•  Classification Problems: These involve predicting a target variable with a discrete number of possibilities (e.g., spam or not spam).
•  Regression Problems: These involve predicting a target variable with continuous values (e.g., a salary increase).
How SVM Works for Classification
•  The primary goal of an SVM algorithm is to find the best decision boundary or hyperplane that segregates the given dataset into multiple classes.
•  Once this hyperplane is established, new examples can be classified into one of the classes based on which side of the hyperplane they fall on (see the sketch below).
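A minimal sketch of this decision rule in Python (the weight vector and bias below are taken from the linearly separable example worked out later in this document; any learned values would work the same way):

import numpy as np

# Assumed, pre-learned parameters of the hyperplane w . x + b = 0
w = np.array([2/3, 0.0])
b = -5/3

def classify(x):
    # A point is assigned to the positive or negative class depending on
    # which side of the hyperplane it falls on.
    return +1 if np.dot(w, x) + b >= 0 else -1

print(classify(np.array([4, 1])))   # +1 (positive side)
print(classify(np.array([0, 1])))   # -1 (negative side)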
Hyperplane
•  The decision boundary in SVM is called a hyperplane.
•  The dimension of the hyperplane depends on the number of features in the dataset.
o For two features (X, Y), the hyperplane will be a straight line.
o For more than two features, the hyperplane will be a plane or a higher-dimensional hyperplane.
•  The crucial aspect is to draw the hyperplane in a way that achieves the maximum margin.
Finding the Best Hyperplane: Maximum Margin
•  There can be multiple lines (in 2D) or hyperplanes that can separate the data.
•  The best hyperplane is the one that has the maximum margin.
•  To determine the maximum margin, we consider the nearest data points from each class to the possible hyperplanes.
Support Vectors
•  The nearest data points to a particular hyperplane are called support vectors.
•  These support vectors play a crucial role in defining and determining the optimal hyperplane with the maximum margin.
Linear SVM
•  Definition: A linear SVM is used when the data can be separated with a straight line (in 2D) or a linear hyperplane (in higher dimensions).
•  To find the optimal linear hyperplane:
o Consider all possible straight lines that can separate the data.
o For each line, identify the support vectors (the nearest data points from each class).
o Draw parallel lines to the hyperplane that pass through the support vectors.
o Calculate the margin, which is the distance between these parallel lines.
o The hyperplane with the maximum margin is the optimal one.
•  Mathematically, for a hyperplane defined by W ⋅ X + b = 0, the support vectors on either side lie on the parallel lines W ⋅ X + b = 1 and W ⋅ X + b = -1. The margin between these lines is 2 / ||W||, and the goal is to maximize this margin (equivalently, to minimize ||W||).
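As a small numerical illustration of the margin formula (the weight vector below is the one obtained in the worked example later in this document; any weight vector would do):

import numpy as np

w = np.array([2/3, 0.0])   # weight vector of the hyperplane w . x + b = 0
# The margin boundaries are w . x + b = +1 and w . x + b = -1,
# and the distance between them is 2 / ||w||.
margin = 2 / np.linalg.norm(w)
print(margin)   # 3.0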
Non-Linear SVM
•  Definition: A non-linear SVM is used when the data cannot be separated by a straight line or a linear hyperplane.
•  Mapping Functions: To handle non-linearly separable data, SVM utilizes mapping functions (or kernels) to transform the original data into a higher-dimensional space where it becomes linearly separable.
•  Example: Consider data points arranged in a circular pattern. A mapping function like z = x² + y² can transform this 2D data into a 3D space (with axes x, y, and z) where the classes can be separated by a linear hyperplane (see the sketch below).
•  After applying the mapping function and achieving linear separability in the higher-dimensional space, the principles of finding the hyperplane with the maximum margin (as in linear SVM) are applied.
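A minimal sketch of this mapping idea (the circular dataset below is made up purely for illustration):

import numpy as np

# Inner circle (one class) and outer ring (other class) in 2D: not linearly separable.
angles = np.linspace(0, 2 * np.pi, 50)
inner = np.c_[0.5 * np.cos(angles), 0.5 * np.sin(angles)]   # radius 0.5
outer = np.c_[2.0 * np.cos(angles), 2.0 * np.sin(angles)]   # radius 2.0

# The mapping z = x^2 + y^2 lifts each 2D point to a 3D point (x, y, z).
def lift(points):
    z = points[:, 0] ** 2 + points[:, 1] ** 2
    return np.c_[points, z]

# In the lifted space the horizontal plane z = 1 separates the two classes.
print(lift(inner)[:, 2].max())   # 0.25, below 1
print(lift(outer)[:, 2].min())   # 4.0, above 1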

In summary, SVM aims to find the optimal hyperplane that separates data into different classes by maximizing the margin between the closest data points (support vectors) of each class. Linear SVM works when the data is linearly separable, while Non-Linear SVM uses mapping functions to handle non-linearly separable data by projecting it into a higher-dimensional space.
Drawing a Hyperplane in Linear SVM: A Solved Example
This section provides a step-by-step demonstration of how to find and draw an optimal hyperplane for a linearly separable dataset in SVM.
1. Understanding the Data Set
•  The example dataset consists of positive and negative examples in a 2D space.
•  Positive examples: (4, 1), (4, -1), (6, 0).
•  Negative examples: (1, 0), (0, 1), (0, -1).
•  The goal is to find a hyperplane (a line in 2D) that best separates these two classes. This aligns with the general goal of SVM, which is to find the best decision boundary to segregate data.

2. Plotting the Data Points
•  The first step is to visualize the data by plotting the positive and negative examples on a 2D graph. This allows for a visual understanding of the data distribution and potential separating lines (a plotting sketch is given below).
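A minimal plotting sketch (assuming matplotlib is available; styling choices are arbitrary):

import matplotlib.pyplot as plt

positive = [(4, 1), (4, -1), (6, 0)]
negative = [(1, 0), (0, 1), (0, -1)]

# Scatter the two classes with different markers so the separation is visible.
plt.scatter(*zip(*positive), marker='+', label='positive')
plt.scatter(*zip(*negative), marker='o', label='negative')
plt.legend()
plt.show()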
3. Identifying Support Vectors
•  Support vectors are the data points that are closest to the hyperplane and play a crucial role in defining it.
•  In this example, the identified support vectors are:
o Negative class: (1, 0)
o Positive class: (4, 1), (4, -1)
4. Formulating the Hyperplane Equation using Support Vectors
•  To find the equation of the optimal hyperplane, we use the support vectors.
•  A bias term is added to each support vector to create an augmented vector. If a support vector S is (x, y), the augmented vector S̄ becomes (x, y, 1).
o S1 (1, 0) → S1̄ (1, 0, 1)
o S2 (4, 1) → S2̄ (4, 1, 1)
o S3 (4, -1) → S3̄ (4, -1, 1)
•  We need to find coefficients (alpha values) for each support vector. Let these be α₁, α₂, and α₃ for S1̄, S2̄, and S3̄ respectively.
•  These alpha values are determined by solving a system of equations based on the dot products of the augmented support vectors and their class labels (+1 for positive, -1 for negative); a programmatic setup of these quantities is sketched after this list:
o α₁ (S1̄ ⋅ S1̄) + α₂ (S2̄ ⋅ S1̄) + α₃ (S3̄ ⋅ S1̄) = -1 (since S1̄ is from the negative class)
o α₁ (S1̄ ⋅ S2̄) + α₂ (S2̄ ⋅ S2̄) + α₃ (S3̄ ⋅ S2̄) = +1 (since S2̄ is from the positive class)
o α₁ (S1̄ ⋅ S3̄) + α₂ (S2̄ ⋅ S3̄) + α₃ (S3̄ ⋅ S3̄) = +1 (since S3̄ is from the positive class)
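A small sketch of how the augmented vectors and their dot products can be set up programmatically (purely illustrative):

import numpy as np

support_vectors = np.array([[1, 0], [4, 1], [4, -1]], dtype=float)
labels = np.array([-1.0, 1.0, 1.0])   # class labels, used as the right-hand side of the system

# Append a 1 to each support vector to form the augmented vectors (x, y, 1).
S = np.hstack([support_vectors, np.ones((3, 1))])

# Matrix of all pairwise dot products S̄i . S̄j used in the equations above.
K = S @ S.T
print(K)   # [[2, 5, 5], [5, 18, 16], [5, 16, 18]]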
5. Calculating Dot Products and Forming Simultaneous Equations
•  The dot products between the augmented support vectors are calculated:
o S1̄ ⋅ S1̄ = (1*1) + (0*0) + (1*1) = 2
o S2̄ ⋅ S1̄ = (4*1) + (1*0) + (1*1) = 5
o S3̄ ⋅ S1̄ = (4*1) + (-1*0) + (1*1) = 5
o S1̄ ⋅ S2̄ = (1*4) + (0*1) + (1*1) = 5
o S2̄ ⋅ S2̄ = (4*4) + (1*1) + (1*1) = 18
o S3̄ ⋅ S2̄ = (4*4) + (-1*1) + (1*1) = 16
o S1̄ ⋅ S3̄ = (1*4) + (0*-1) + (1*1) = 5
o S2̄ ⋅ S3̄ = (4*4) + (1*-1) + (1*1) = 16
o S3̄ ⋅ S3̄ = (4*4) + (-1*-1) + (1*1) = 18
•  Substituting these values into the equations, we get the following system of simultaneous equations:
o 2α₁ + 5α₂ + 5α₃ = -1
o 5α₁ + 18α₂ + 16α₃ = 1
o 5α₁ + 16α₂ + 18α₃ = 1
6. Solving for Alpha Values
•  Solving these simultaneous equations, we obtain the values for α₁, α₂, and α₃. (Subtracting the third equation from the second gives 2α₂ - 2α₃ = 0, so α₂ = α₃, which reduces the system to two equations in two unknowns.)
o α₁ = -22/9 ≈ -2.44
o α₂ = 7/18 ≈ 0.39
o α₃ = 7/18 ≈ 0.39
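These values can be checked numerically by solving the 3x3 system (a minimal sketch using NumPy):

import numpy as np

# Matrix of dot products between the augmented support vectors, and the class labels.
K = np.array([[2.0,  5.0,  5.0],
              [5.0, 18.0, 16.0],
              [5.0, 16.0, 18.0]])
y = np.array([-1.0, 1.0, 1.0])

alphas = np.linalg.solve(K, y)
print(alphas)   # approximately [-2.444, 0.389, 0.389]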
7. Determining the Weight Vector (W) and Bias (b)
•  The augmented weight vector is calculated as a linear combination of the augmented support vectors and their corresponding alpha values:
o W̄ = Σ (αᵢ * S̄ᵢ) = α₁ * S1̄ + α₂ * S2̄ + α₃ * S3̄ = (2/3, 0, -5/3)
•  The first two components give the weight vector W = (2/3, 0), and the last component gives the bias b = -5/3.
•  So the hyperplane equation W ⋅ X + b = 0 becomes (2/3)x₁ + 0·x₂ - 5/3 = 0, which simplifies to x₁ = 2.5.
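Continuing the numerical check (the alpha values are carried over from the previous step):

import numpy as np

S = np.array([[1.0,  0.0, 1.0],   # augmented S1 (negative class)
              [4.0,  1.0, 1.0],   # augmented S2 (positive class)
              [4.0, -1.0, 1.0]])  # augmented S3 (positive class)
alphas = np.array([-22/9, 7/18, 7/18])

w_aug = alphas @ S            # augmented weight vector (w1, w2, b)
w, b = w_aug[:2], w_aug[2]
print(w, b)                   # [0.667, 0.0], -1.667  ->  hyperplane (2/3)x1 - 5/3 = 0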
8. Drawing the Hyperplane
•  The equation of the hyperplane is (2/3)x₁ - 5/3 = 0, i.e., the vertical line x₁ = 2.5 in the 2D plane.
•  This line passes through (2.5, 0) and lies midway between the negative support vector (1, 0) and the positive support vectors (4, 1) and (4, -1).
•  The support vectors lie on the parallel margin boundaries x₁ = 1 and x₁ = 4, so the margin is 2 / ||W|| = 2 / (2/3) = 3 (a final numerical check is sketched below).
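A final sanity check (minimal sketch): each support vector should satisfy W ⋅ X + b = ±1, and the margin should equal 2 / ||W||:

import numpy as np

w, b = np.array([2/3, 0.0]), -5/3
for point, label in [((1, 0), -1), ((4, 1), +1), ((4, -1), +1)]:
    print(point, label, round(np.dot(w, point) + b, 3))   # -1 for the negative support vector, +1 for the positive ones

print(2 / np.linalg.norm(w))   # margin = 3.0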
