Lab 2 SVM

1 Least Squares Support Vector Classifier (LS-SVC)
1.1 Mathematical Foundations
The Least Squares Support Vector Classifier (LS-SVC) seeks to determine a linear separator by
minimizing a regularized squared error cost. Its primal form is expressed as:
$$\min_{w,b,\xi}\ \frac{1}{2}\left(\|w\|^{2} + C\sum_{i=1}^{n}\xi_i^{2}\right)$$

subject to:

$$y_i\left(w^{T}x_i + b\right) = 1 - \xi_i,\quad i = 1,\dots,n$$

where:

• $w \in \mathbb{R}^{d}$ is the weight vector,
• $b \in \mathbb{R}$ is the bias,
• $C > 0$ is a penalty parameter controlling the trade-off between the margin and the error.
The Lagrangian of this problem is:

$$L(w,b,\xi,\lambda) = \frac{1}{2}\left(\|w\|^{2} + C\sum_{i=1}^{n}\xi_i^{2}\right) - \sum_{i=1}^{n}\lambda_i\left[y_i\left(w^{T}x_i + b\right) - 1 + \xi_i\right]$$
Setting the partial derivatives of $L$ to zero gives the stationarity conditions:

• For $w$:

$$\frac{\partial L}{\partial w} = w - \sum_{i=1}^{n}\lambda_i y_i x_i = 0 \;\Rightarrow\; w = \sum_{i=1}^{n}\lambda_i y_i x_i$$

• For $b$:

$$\frac{\partial L}{\partial b} = -\sum_{i=1}^{n}\lambda_i y_i = 0 \;\Rightarrow\; \sum_{i=1}^{n}\lambda_i y_i = 0$$

• For $\xi_i$:

$$\frac{\partial L}{\partial \xi_i} = C\xi_i - \lambda_i = 0 \;\Rightarrow\; \xi_i = \frac{\lambda_i}{C}$$
Substituting these conditions back into the Lagrangian and simplifying:

$$L(w,b,\xi,\lambda) = \frac{1}{2}\|w\|^{2} - \sum_{i=1}^{n}\lambda_i y_i w^{T}x_i - b\sum_{i=1}^{n}\lambda_i y_i + \sum_{i=1}^{n}\left(\frac{C}{2}\xi_i^{2} - \lambda_i\xi_i\right) + \sum_{i=1}^{n}\lambda_i$$

$$L(\lambda) = -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\lambda_i\lambda_j y_i y_j x_i^{T}x_j + \sum_{i=1}^{n}\lambda_i - \frac{1}{2C}\sum_{i=1}^{n}\lambda_i^{2}$$

we obtain the dual optimization problem:

$$\max_{\lambda}\ \sum_{i=1}^{n}\lambda_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\lambda_i\lambda_j y_i y_j x_i^{T}x_j - \frac{1}{2C}\sum_{i=1}^{n}\lambda_i^{2}$$

subject to:

$$\sum_{i=1}^{n}\lambda_i y_i = 0$$
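Since the stationarity conditions and the equality constraints are linear in $\lambda$ and $b$, the dual above can also be solved directly as a linear (KKT) system rather than with an iterative QP solver. The following is a minimal NumPy sketch under that assumption; the function name fit_ls_svc_dual is illustrative and this is not the cvxopt-based class used in Section 1.2.

import numpy as np

def fit_ls_svc_dual(X, y, C=1.0):
    """Solve the LS-SVC dual via its KKT linear system (assumed equality-constrained form).

    The conditions w = sum_i lambda_i y_i x_i and xi_i = lambda_i / C, together with
    y_i (w^T x_i + b) = 1 - xi_i and sum_i lambda_i y_i = 0, form one linear system
    in (b, lambda).
    """
    n = X.shape[0]
    Omega = (y[:, None] * y[None, :]) * (X @ X.T)   # Omega_ij = y_i y_j x_i^T x_j
    K = np.zeros((n + 1, n + 1))
    K[0, 1:] = y
    K[1:, 0] = y
    K[1:, 1:] = Omega + np.eye(n) / C
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(K, rhs)
    b, lam = sol[0], sol[1:]
    w = (lam * y) @ X                                # recover the primal weights
    return w, b, lam

# Usage on a toy problem (labels must be in {-1, +1}):
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.concatenate([-np.ones(20), np.ones(20)])
w, b, lam = fit_ls_svc_dual(X, y, C=1.0)
print(np.mean(np.sign(X @ w + b) == y))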
1.2 Implementation
Primal Form Implementation
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.utils.validation import check_X_y, check_array, check_is_fitted
from sklearn.utils.multiclass import unique_labels
from sklearn.metrics import pairwise_kernels
from cvxopt import matrix, solvers
from sklearn.preprocessing import StandardScaler
# Ensure y is in {-1, 1}
y = y.astype(float).ravel()
assert set(y) == {-1.0, 1.0}, "Labels must be -1 or 1"

# Linear term of the QP over the stacked variable z = [w, b, xi]:
# a penalty of C on each slack entry
q = np.zeros(n_features + 1 + n_samples)
q[-n_samples:] = self.C

# Solve the QP (the equality constraints Aeq z = beq are passed via keywords)
solvers.options['show_progress'] = False
sol = solvers.qp(matrix(P), matrix(q), A=matrix(Aeq), b=matrix(beq))

# Unpack the solution into weights, bias and slacks
sol_x = np.array(sol['x']).flatten()
self.coef_ = sol_x[:n_features]
self.intercept_ = sol_x[n_features]
self.xi_ = sol_x[-n_samples:]
return self
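For comparison, here is a minimal self-contained sketch of the primal solution that assumes the equality-constrained objective of Section 1.1: eliminating the slacks via $\xi_i = 1 - y_i(w^T x_i + b)$ turns the primal into an unconstrained regularized least-squares problem with a closed-form solution. The helper fit_ls_svc_primal below is illustrative, not the LS_SVC_Primal class above.

import numpy as np

def fit_ls_svc_primal(X, y, C=1.0):
    """Closed-form LS-SVC primal (assumed equality-constrained formulation).

    Eliminating xi_i = 1 - y_i (w^T x_i + b) gives
    min_{w,b} (1/2)||w||^2 + (C/2) sum_i (1 - y_i (w^T x_i + b))^2,
    whose normal equations are solved directly below.
    """
    n, d = X.shape
    Z = np.hstack([X, np.ones((n, 1))])   # augmented design matrix [X, 1]
    D = np.eye(d + 1)
    D[-1, -1] = 0.0                       # do not regularize the bias
    beta = np.linalg.solve(D + C * Z.T @ Z, C * Z.T @ y)
    return beta[:-1], beta[-1]            # (w, b)

# Usage (labels in {-1, +1}):
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.concatenate([-np.ones(20), np.ones(20)])
w, b = fit_ls_svc_primal(X, y, C=1.0)
print(np.mean(np.sign(X @ w + b) == y))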
Visualization
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
# Shift points away from the boundary, but keep some points close to the other class
# For most points, shift as before
X[:, 0] = np.where(y > 0, X[:, 0] + 1, X[:, 0] - 1)

# Pick a few random points in the positive class and move them slightly left (closer to the negative class)
np.random.seed(0)
close_pos = np.random.choice(pos_idx, 4, replace=False)
X[close_pos, 0] -= 1.5  # move closer to the boundary

# Pick a few random points in the negative class and move them slightly right (closer to the positive class)
close_neg = np.random.choice(neg_idx, 4, replace=False)
X[close_neg, 0] += 1.5  # move closer to the boundary
LS_SVC_Primal(C=1)
Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
y_pred = model.predict(X)
accuracy = np.mean(y_pred == y)
print(f"Accuracy: {accuracy}")
Accuracy: 0.56
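The mesh grid (xx, yy) used by the predict call above is not shown in the extract. A minimal sketch of how the decision boundary could be drawn, assuming a fitted model and the 2-D arrays X, y from the previous cell, is:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

# Dense grid covering the data range (xx, yy feed the predict call shown above)
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 300),
                     np.linspace(y_min, y_max, 300))

# Predicted label for every grid point, reshaped back to the grid
Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Filled class regions plus the training points
cmap = ListedColormap(["#FFAAAA", "#AAAAFF"])
plt.contourf(xx, yy, Z, alpha=0.4, cmap=cmap)
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap, edgecolors="k")
plt.title("LS-SVC decision boundary")
plt.show()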
# Train Dual LS-SVM
model = LS_SVC_Dual(C=1)
model.fit(X, y)
LS_SVC_Dual(C=1)
y_pred = model.predict(X)
accuracy = np.mean(y_pred == y)
print(f"Accuracy: {accuracy}")
Accuracy: 0.96
1.3 Questions
1. How does the regularization parameter C influence LS-SVC?
The parameter C balances the emphasis between maximizing the margin and
reducing classification errors. A high value of C strongly penalizes
misclassifications, resulting in a tighter margin and a tendency to overfit the
training data. On the other hand, a lower C value permits a larger margin but might
increase the risk of underfitting.
2 P-SVC

2.1 Mathematical Foundations

The P-SVC primal problem regularizes the bias term $b$ together with the weights:

$$\min_{w,b,\xi}\ \frac{1}{2}\left(\|w\|^{2} + b^{2} + C\sum_{i=1}^{n}\xi_i^{2}\right)$$

subject to the same equality constraints $y_i\left(w^{T}x_i + b\right) = 1 - \xi_i$, $i = 1,\dots,n$, with the same notation as in Section 1. The Lagrangian is:

$$L(w,b,\xi,\lambda) = \frac{1}{2}\left(\|w\|^{2} + b^{2} + C\sum_{i=1}^{n}\xi_i^{2}\right) - \sum_{i=1}^{n}\lambda_i\left[y_i\left(w^{T}x_i + b\right) - 1 + \xi_i\right]$$
Stationarity Conditions
To find the saddle point, we compute the partial derivatives and set them to zero:
• Derivative w.r.t. $w$:

$$\frac{\partial L}{\partial w} = w - \sum_{i=1}^{n}\lambda_i y_i x_i = 0 \;\Rightarrow\; w = \sum_{i=1}^{n}\lambda_i y_i x_i$$

• Derivative w.r.t. $b$:

$$\frac{\partial L}{\partial b} = b - \sum_{i=1}^{n}\lambda_i y_i = 0 \;\Rightarrow\; b = \sum_{i=1}^{n}\lambda_i y_i$$

• Derivative w.r.t. $\xi_i$:

$$\frac{\partial L}{\partial \xi_i} = C\xi_i - \lambda_i = 0 \;\Rightarrow\; \xi_i = \frac{\lambda_i}{C}$$
Dual Problem
Substitute the expressions for $w$, $b$, and $\xi_i$ into the Lagrangian to obtain the dual formulation. After simplification:

$$L(\lambda) = -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\lambda_i\lambda_j y_i y_j x_i^{T}x_j + \sum_{i=1}^{n}\lambda_i - \frac{1}{2C}\sum_{i=1}^{n}\lambda_i^{2} - \frac{1}{2}\left(\sum_{i=1}^{n}\lambda_i y_i\right)^{2}$$

so the dual problem becomes:

$$\max_{\lambda}\ \sum_{i=1}^{n}\lambda_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\lambda_i\lambda_j y_i y_j\left(x_i^{T}x_j + 1\right) - \frac{1}{2C}\sum_{i=1}^{n}\lambda_i^{2}$$

subject to:

$$\lambda \in \mathbb{R}^{n}$$
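Because the P-SVC dual is an unconstrained concave quadratic in $\lambda$, its maximizer can be obtained by setting the gradient to zero, which gives a single linear system. The sketch below follows that route with NumPy; the function fit_p_svc_dual is illustrative and is not the P_SVC_Dual class used in Section 2.2.

import numpy as np

def fit_p_svc_dual(X, y, C=10.0):
    """Maximize the unconstrained P-SVC dual by solving its normal equations.

    Gradient of the dual:  1 - (Ktilde + I/C) lambda = 0,
    with Ktilde_ij = y_i y_j (x_i^T x_j + 1).
    """
    n = X.shape[0]
    Ktilde = (y[:, None] * y[None, :]) * (X @ X.T + 1.0)
    lam = np.linalg.solve(Ktilde + np.eye(n) / C, np.ones(n))
    w = (lam * y) @ X      # stationarity: w = sum_i lambda_i y_i x_i
    b = np.dot(lam, y)     # stationarity: b = sum_i lambda_i y_i
    return w, b, lam

# Usage (labels in {-1, +1}):
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, (25, 2)), rng.normal(2, 1, (25, 2))])
y = np.concatenate([-np.ones(25), np.ones(25)])
w, b, lam = fit_p_svc_dual(X, y, C=10.0)
print(np.mean(np.sign(X @ w + b) == y))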
2.2 Implementation
Primal Form Implementation
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.utils.validation import check_X_y, check_array, check_is_fitted
from sklearn.utils.multiclass import unique_labels
import numpy as np
# Solve
sol = solvers.qp(P, q, A, b)
return self
P_SVC_Primal(C=10.0)
model = P_SVC_Dual(C=10.0)
model.fit(X, y)
P_SVC_Dual(C=10.0)
y_pred = model.predict(X)
accuracy = np.mean(y_pred == y)
print(f"Accuracy: {accuracy}")
Accuracy: 1.0
2.3 Questions
1. How does P-SVC differ from LS-SVC in terms of the slack variables ξ i?
Both P-SVC and LS-SVC include the slack variables $\xi_i$ squared in the objective function, resulting in a quadratic penalty for misclassification. However, a key difference is that P-SVC includes the bias term $b$ in the regularization term (the objective minimizes $\|w\|^{2} + b^{2} + C\sum_i \xi_i^{2}$), whereas LS-SVC typically does not penalize $b$. This impacts the optimization formulation and can influence the geometry of the decision boundary.
2. How does the parameter C influence the solution of P-SVC?

The parameter C controls the trade-off between margin size and classification error. A larger value of C places more emphasis on minimizing the training error, producing a narrower margin and a tighter fit to the training data; in the standard hinge-loss SVC this usually leaves fewer points as support vectors. In contrast, a smaller C allows more margin violations, producing a wider margin and typically more support vectors, which may improve generalization on noisy data.
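This trade-off can be checked empirically. The snippet below is a small illustration with scikit-learn's standard hinge-loss SVC (not the squared-slack P-SVC implemented in this lab), which shows the same qualitative behaviour of the support-vector count as C varies.

import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_blobs

# Two noisy, partially overlapping classes
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.0, random_state=0)

# Count support vectors of a linear soft-margin SVC as C grows
for C in [0.01, 0.1, 1.0, 10.0, 100.0]:
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(f"C={C:>6}: support vectors = {clf.n_support_.sum()}, "
          f"train accuracy = {clf.score(X, y):.3f}")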
3 ν-SVC

3.1 Mathematical Foundations

The ν-SVC replaces the penalty parameter $C$ with a parameter $\nu \in (0,1]$ and introduces a margin variable $\rho$. Its primal form is:

$$\min_{w,b,\xi,\rho}\ \frac{1}{2}\|w\|^{2} - \nu\rho + \frac{1}{n}\sum_{i=1}^{n}\xi_i$$

subject to:

$$y_i\left(w^{T}x_i + b\right) \ge \rho - \xi_i,\quad \xi_i \ge 0,\quad i = 1,\dots,n$$

where:

• $\rho \ge 0$ is the margin variable,
• $\xi_i \ge 0$ are the slack variables,
• $\nu$ controls the fraction of margin errors and of support vectors.
Dual Formulation
The dual form of $ \nu $-SVC can be derived using the method of Lagrange multipliers. The
Lagrangian is:
$$L(w,b,\xi,\rho,\lambda,\beta) = \frac{1}{2}\|w\|^{2} - \nu\rho + \frac{1}{n}\sum_{i=1}^{n}\xi_i - \sum_{i=1}^{n}\lambda_i\left[y_i\left(w^{T}x_i + b\right) - \rho + \xi_i\right] - \sum_{i=1}^{n}\beta_i\xi_i$$
Taking the derivatives with respect to $ w $, $ b $, $ \xi_i $, and $ \rho $, and setting them to
zero, we get:
$$w = \sum_{i=1}^{n}\lambda_i y_i x_i$$

$$\sum_{i=1}^{n}\lambda_i y_i = 0$$

$$\lambda_i + \beta_i = \frac{1}{n}$$

$$\sum_{i=1}^{n}\lambda_i = \nu$$
Substituting these back into the Lagrangian, we obtain the dual problem:
$$\max_{\lambda}\ -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\lambda_i\lambda_j y_i y_j x_i^{T}x_j$$

subject to:

$$\sum_{i=1}^{n}\lambda_i y_i = 0$$

$$0 \le \lambda_i \le \frac{1}{n},\quad i = 1,\dots,n$$

$$\sum_{i=1}^{n}\lambda_i = \nu$$
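This dual is a standard QP with box and equality constraints, so it maps directly onto a generic QP solver. Below is a minimal cvxopt sketch of that mapping (the small diagonal term reg is an assumption borrowed from the lab's classes to keep the Gram matrix well conditioned); the function fit_nu_svc_dual is illustrative and is not the Nu_SVC_Dual class from Section 3.2. Recovering $b$ and $\rho$ from the margin support vectors is omitted.

import numpy as np
from cvxopt import matrix, solvers

def fit_nu_svc_dual(X, y, nu=0.5, reg=1e-6):
    """Solve the nu-SVC dual QP:
        min (1/2) lam^T P lam  s.t.  0 <= lam_i <= 1/n,
                                     sum_i lam_i y_i = 0,  sum_i lam_i = nu.
    """
    n = X.shape[0]
    P = (y[:, None] * y[None, :]) * (X @ X.T) + reg * np.eye(n)
    q = np.zeros(n)
    # Box constraints 0 <= lam <= 1/n written as G lam <= h
    G = np.vstack([-np.eye(n), np.eye(n)])
    h = np.concatenate([np.zeros(n), np.ones(n) / n])
    # Equality constraints: sum_i lam_i y_i = 0 and sum_i lam_i = nu
    A = np.vstack([y.astype(float), np.ones(n)])
    b = np.array([0.0, nu])
    solvers.options['show_progress'] = False
    sol = solvers.qp(matrix(P), matrix(q), matrix(G), matrix(h), matrix(A), matrix(b))
    lam = np.array(sol['x']).ravel()
    w = (lam * y) @ X   # stationarity: w = sum_i lam_i y_i x_i
    return w, lam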
3.2 Implementation
Primal Form Implementation
class Nu_SVC_Primal(BaseEstimator, ClassifierMixin):
def __init__(self, nu=0.5, reg=1e-6):
self.nu = nu
self.reg = reg # Regularization term
if sol['status'] != 'optimal':
raise ValueError("Optimization failed")
self.coef_ = np.array(sol['x'][:n_features])
self.intercept_ = sol['x'][n_features]
self.xi_ = np.array(sol['x'][-n_samples-1:-1])
self.rho_ = sol['x'][-1]
return self
Parameters:
-----------
X : array-like of shape (n_samples, n_features)
Training data.
Returns:
--------
self : object
Fitted estimator.
"""
X, y = check_X_y(X, y)
n_samples, n_features = X.shape
self.classes_ = unique_labels(y)
y = 2 * y - 1 # Convert labels to -1 and 1
if sol['status'] != 'optimal':
raise ValueError("Optimization failed")
return self
Parameters:
-----------
X : array-like of shape (n_samples, n_features)
Test data.
Returns:
--------
y_pred : array-like of shape (n_samples,)
Predicted class labels.
"""
check_is_fitted(self)
X = check_array(X)
Nu_SVC_Primal(reg=1000000.0)
y_pred = model.predict(X)
accuracy = np.mean(y_pred == y)
print(f"Accuracy: {accuracy}")
Accuracy: 0.4952
y_pred = model.predict(X)
accuracy = np.mean(y_pred == y)
print(f"Accuracy: {accuracy}")
Accuracy: 0.4
3.3 Questions
1. What is the role of the parameter ν in ν-SVC?

The parameter ν controls the balance between margin size and classification error. It sets an upper bound on the fraction of margin errors (misclassifications) and a lower bound on the fraction of support vectors. A smaller ν therefore tolerates fewer margin errors and yields fewer support vectors, while a larger ν allows more margin errors and results in a wider margin and more support vectors.
2. How does ν control the trade-off between margin size and classification error?

ν directly affects both the margin and the number of support vectors. A larger ν permits a wider margin with more margin errors, which may increase training error but improve generalization. Conversely, a smaller ν tightens the margin and tolerates fewer violations, potentially lowering training error but risking overfitting.
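One quick way to observe this empirically is with scikit-learn's NuSVC, which solves the same ν-parameterized problem; as ν grows, the fraction of support vectors grows with it (ν is a lower bound on that fraction).

import numpy as np
from sklearn.svm import NuSVC
from sklearn.datasets import make_blobs

# Overlapping two-class data so that margin errors are possible
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=0)

# The fraction of support vectors is lower-bounded by nu
for nu in [0.05, 0.1, 0.25, 0.5, 0.75]:
    clf = NuSVC(nu=nu, kernel="linear").fit(X, y)
    frac_sv = clf.n_support_.sum() / len(X)
    print(f"nu={nu:>4}: fraction of support vectors = {frac_sv:.2f}, "
          f"train accuracy = {clf.score(X, y):.3f}")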
Explanation:
• Classification Accuracy: All methods achieve high accuracy, but ν-SVC and P-SVC often generalize better.
• Number of Support Vectors: ν-SVC typically has fewer support vectors because it explicitly controls this number.
• Robustness to Outliers: P-SVC and ν-SVC are more robust due to their regularization.
4.2 Discussion
Strengths and Weaknesses:
• LS-SVC:
– Strengths: Simple to implement and works well on linearly separable data.
• Imbalanced Data: ν-SVC is suitable as it allows control over the fraction of support vectors.