Machine Learning:
Clustering Algorithms
Instructor: Sabina Mammadova
Agenda
• Unsupervised Learning
• Agglomerative Clustering
• K-Means Algorithm
• Choosing K – Elbow method & Silhouette Analysis
• Dimensionality Reduction
Machine Learning Algorithms
• Supervised Learning
  • Regression: Linear Regression, Polynomial Regression, Support Vector Regression, Decision Tree Regression, Random Forest Regression
  • Classification: Logistic Regression, K-Nearest Neighbors, Support Vector Machines, Decision Tree, Random Forest, Naïve Bayes
• Unsupervised Learning
  • Clustering: K-Means, Hierarchical, DBSCAN
  • Association Analysis: Apriori, FP-Growth
  • Dimensionality Reduction: PCA, LDA
• Reinforcement Learning: Q-Learning, Deep Q-Networks…
Difference between Supervised and Unsupervised Learning
Supervised Learning:
• Input data is labelled
• There is a training phase
• Data is modelled based on training dataset
• Known number of classes (for classification)
Unsupervised Learning:
• Input data is unlabeled
• There is no training phase
• Uses properties of given data for clustering
• Unknown number of classes
What is Unsupervised Learning?
• Unsupervised learning is a type of machine learning where an
algorithm learns patterns and structures from data without
labeled outputs. Unlike supervised learning, where models are
trained using labeled input-output pairs (e.g., images labeled as
"cat" or "dog"), unsupervised learning works with unlabeled
data and tries to discover hidden patterns or relationships within
it.
• It is mainly used for data exploration, feature learning, and
preprocessing in machine learning pipelines.
• Hard to Evaluate: No ground truth to compare results with.
• Interpretability: The patterns found by the algorithm may not
always be meaningful or useful.
Unsupervised Learning Algorithms
• Dimensionality Reduction — the task of reducing the number of input features in a dataset.
• Anomaly Detection — the task of detecting instances that are very different from the norm.
• Clustering — the task of grouping similar instances into clusters.
Unsupervised Learning Algorithms
• Unsupervised learning includes transformations, clustering, and
anomaly detection algorithms, each with real-world applications.
• Transformations like dimensionality reduction help in
bioinformatics for gene expression analysis, while topic
extraction is used in news categorization and social media
monitoring.
• Clustering groups similar data points, enabling customer
segmentation in marketing and facial recognition in social
media.
• Anomaly detection algorithms identify unusual patterns, making
them essential for fraud detection in banking, cybersecurity
threat detection, and fault detection in manufacturing, helping
to recognize deviations from normal behavior.
Unsupervised Learning Algorithms
• One of the biggest challenges in unsupervised learning is evaluating whether the
algorithm has learned something useful. Since there are no labels in the data, we don’t
have a correct answer to compare the model’s output against. For example, imagine
using a clustering algorithm to group customers based on their purchasing behavior.
The algorithm might group people who buy luxury items separately from those who
buy everyday essentials. While this is a valid way to categorize customers, it may not
be what we were expecting—perhaps we wanted to group them based on shopping
frequency instead. However, because there are no predefined labels, we cannot
directly tell the algorithm what we want. The only way to assess the results is through
manual inspection.
• Due to this challenge, unsupervised learning is mostly used for exploratory data
analysis, helping data scientists uncover hidden patterns in the data rather than
making final decisions in automated systems. Another important use of unsupervised
learning is preprocessing for supervised learning. For example, dimensionality
reduction can simplify complex data, making it easier for supervised algorithms to
work efficiently while also improving their accuracy. Additionally, techniques like
scaling and normalization, which adjust data values to a consistent range, are also
considered unsupervised because they don’t rely on labeled data. These preprocessing
steps are crucial for improving the performance of machine learning models.
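• A minimal sketch of such unsupervised preprocessing with scikit-learn scalers; the tiny toy array is an illustrative assumption, not data from these slides.

import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])                # toy data with very different feature scales
print(StandardScaler().fit_transform(X))    # standardization: zero mean, unit variance per feature
print(MinMaxScaler().fit_transform(X))      # normalization: rescale each feature to the [0, 1] range
# Neither scaler uses labels, which is why scaling is considered an unsupervised transformation.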
Agglomerative
Clustering
Agglomerative Clustering
• Agglomerative clustering refers to a collection of clustering
algorithms that all build upon the same principles: the algorithm
starts by declaring each point its own cluster, and then merges
the two most similar clusters until some stopping criterion is
satisfied. The stopping criterion implemented in scikit-learn is the
number of clusters, so similar clusters are merged until only the
specified number of clusters are left. There are several linkage
criteria that specify how exactly the “most similar cluster” is
measured. This measure is always defined between two existing
clusters.
• Not appropriate for large datasets: the algorithm relies on pairwise distances between clusters, so its time and memory cost grow quickly with the number of samples.
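• A minimal scikit-learn sketch of the idea above; the synthetic blob data and the choice of n_clusters=3 are illustrative assumptions.

from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)   # synthetic data for illustration
agg = AgglomerativeClustering(n_clusters=3, linkage="ward")    # merge clusters until 3 remain
labels = agg.fit_predict(X)                                    # cluster assignment for each point
print(labels[:10])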
Computing Distance Matrix
• The default choice, ward, picks the two clusters to merge such that the variance within all clusters increases the least. This often leads to clusters that are relatively equally sized.
• Average linkage merges the two clusters that have the smallest average
distance between all their points.
• Complete linkage (also known as maximum linkage) merges the two clusters
that have the smallest maximum distance between their points.
• Single linkage: Merges the two clusters that have the smallest minimum
distance between any of their points. Sensitive to noise and outliers.
• Centroid linkage: Merges clusters based on the distance between their centroids
(mean points). Less sensitive to outliers than single linkage but can cause
inversion (clusters merging in unexpected ways).
• Median Linkage: Similar to centroid linkage but uses the median instead of the
mean when computing cluster distances.
• Ward works on most datasets. If the clusters have very dissimilar numbers of
members (if one is much bigger than all the others, for example), average or
complete might work better.
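• A short sketch, assuming the same kind of synthetic blob data, comparing the four linkage criteria available in scikit-learn's AgglomerativeClustering (centroid and median linkage are provided by SciPy's scipy.cluster.hierarchy.linkage instead):

import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)
for linkage in ["ward", "average", "complete", "single"]:
    labels = AgglomerativeClustering(n_clusters=3, linkage=linkage).fit_predict(X)
    print(linkage, np.bincount(labels))    # cluster sizes under each linkage rule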
What is Dendrogram?
• A Dendrogram is a diagram that represents the hierarchical relationship between objects. The
Dendrogram is used to display the distance between each pair of sequentially merged objects.
• These are commonly used in studying hierarchical clusters before deciding the number of
clusters significant to the dataset.
• The distance at which the two clusters combine is referred to as the dendrogram distance.
• The primary use of a dendrogram is to work out the best way to allocate objects to clusters.
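• A sketch of building and plotting a dendrogram with SciPy; the blob data is an illustrative assumption.

import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=50, centers=3, random_state=42)
Z = linkage(X, method="ward")    # linkage matrix: which clusters merge, and at what distance
dendrogram(Z)                    # the height of each merge is the dendrogram distance
plt.xlabel("Sample index")
plt.ylabel("Cluster distance")
plt.show()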
Hierarchical Agglomerative Clustering
• It is also known as the bottom-up approach or hierarchical agglomerative clustering (HAC). Unlike flat clustering, hierarchical clustering provides a structured way to group data. This clustering algorithm does not require us to prespecify the number of clusters. Bottom-up algorithms treat each data point as a singleton cluster at the outset and then successively agglomerate pairs of clusters until all clusters have been merged into a single cluster that contains all the data.
Hierarchical Divisive Clustering
• It is also known as the top-down approach. This algorithm also does not require us to prespecify the number of clusters. Top-down clustering requires a method for splitting a cluster that contains the whole data and proceeds by splitting clusters recursively until individual data points have been split into singleton clusters.
K-Means Algorithm
What is K-Means Algorithm?
• K-Means Clustering is an unsupervised learning algorithm which groups an unlabeled dataset into different clusters. Here K defines the number of pre-defined clusters that need to be created in the process: if K=2, there will be two clusters, for K=3 there will be three clusters, and so on.
• It allows us to cluster the data into different groups and provides a convenient way to discover the categories present in an unlabeled dataset without the need for any training.
• It is a centroid-based algorithm, where each cluster is associated with a centroid.
• The algorithm takes the unlabeled dataset as input, divides it into K clusters, and repeats the process until the cluster assignments stop changing. The value of K must be chosen in advance.
What is K-Means Algorithm?
The k-means clustering algorithm mainly performs two tasks:
• Determines the best value for K center points or centroids by an iterative process.
• Assigns each data point to its closest center; the data points assigned to a particular center form a cluster.
How does the K-Means Algorithm Work?
• Step 1: Select the number K to decide the number of clusters.
• Step 2: Select K random points as the initial centroids (they do not have to come from the input dataset).
• Step 3: Assign each data point to its closest centroid; this forms the K clusters.
• Step 4: Place a new centroid in each cluster at the mean of its assigned points (this minimizes the within-cluster variance).
• Step 5: Repeat Step 3: reassign each data point to its new closest centroid.
• Step 6: If any reassignment occurred, go back to Step 4; otherwise, go to FINISH.
• Step 7: The model is ready.
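• A minimal K-Means sketch with scikit-learn; the blob data and K=3 are illustrative assumptions.

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)   # K must be chosen in advance
kmeans.fit(X)
print(kmeans.cluster_centers_)    # final centroids
print(kmeans.labels_[:10])        # cluster assignments of the first 10 points
print(kmeans.inertia_)            # sum of squared distances to the closest centroid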
How does the K-Means Algorithm Work?
(Figure: ten numbered panels illustrating successive iterations of the K-Means algorithm.)
Choosing K – Elbow
method & Silhouette
Analysis
Elbow Method – optimal value of K
• Perform K-Means clustering on the dataset for a range of values of K (e.g., K = 1
to 10).
• As K increases, inertia (the within-cluster sum of squared distances between data points and their centroids) decreases, because adding more clusters brings each data point closer to its nearest centroid.
• The "elbow" point in the plot is where the rate of decrease sharply slows down.
This point suggests a good balance between the number of clusters and the
amount of variance explained.
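• A sketch of the elbow method: compute inertia for K = 1 to 10 and look for the bend in the curve. The blob data is an illustrative assumption.

import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)
inertias = []
for k in range(1, 11):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias.append(km.inertia_)              # within-cluster sum of squared distances
plt.plot(range(1, 11), inertias, marker="o")  # the "elbow" suggests a good K
plt.xlabel("Number of clusters K")
plt.ylabel("Inertia")
plt.show()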
Silhouette Analysis
• Silhouette analysis can be used to study the separation distance between the
resulting clusters. The silhouette plot displays a measure of how close each
point in one cluster is to points in the neighboring clusters and thus provides a
way to assess parameters like number of clusters visually. This measure has a
range of [-1, 1].
• Silhouette coefficients (as these values are called) near +1 indicate that the sample is far away from the neighboring clusters. A value of 0 indicates that the sample is on or very close to the decision boundary between two neighboring clusters, and negative values indicate that those samples might have been assigned to the wrong cluster.
• To calculate the silhouette coefficient of a point i, we need the mean distance from the point to all other points in its own cluster, a(i), and the mean distance from the point to all points in the nearest neighboring cluster, b(i). The coefficient is then s(i) = (b(i) − a(i)) / max(a(i), b(i)).
Silhouette Analysis
• For n_clusters = 2 The average silhouette_score is : 0.7049787496083262
• For n_clusters = 3 The average silhouette_score is : 0.5882004012129721
• For n_clusters = 4 The average silhouette_score is : 0.650518663272943
• For n_clusters = 5 The average silhouette_score is : 0.561464362648773
• For n_clusters = 6 The average silhouette_score is : 0.4857596147013469
https://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_silhouette_analysis.html
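• A sketch of computing the average silhouette score for several values of K; the blob data is an illustrative assumption (the scores above come from the scikit-learn example linked on this slide, which uses its own sample data).

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=500, centers=4, random_state=1)
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=10).fit_predict(X)
    print(f"For n_clusters = {k} the average silhouette_score is: {silhouette_score(X, labels):.3f}")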
Dimensionality
Reduction
Dimensionality Reduction
• Dimensionality reduction is the process of reducing the number of
features (variables) in a dataset while preserving as much important
information as possible. In real-world data, many features can be
correlated or redundant, making models more complex and harder to
interpret. By reducing dimensions, we can simplify the data, improve
computational efficiency, and enhance visualization.
• Imagine you have a dataset with 100 different features describing
customer behavior in an online store. Many of these features might
carry overlapping information—like "total money spent" and "average
purchase value." Instead of analyzing all 100 features, dimensionality
reduction methods help find the most informative ones or create new
features that summarize the essential patterns in the data.
Dimensionality Reduction
• There are two main approaches to dimensionality reduction:
• Feature Selection – Choosing a subset of the most important original
features based on certain criteria (e.g., removing low-variance or highly
correlated features).
• Feature Extraction – Creating new, fewer features that capture the essential
patterns of the data. Techniques like Principal Component Analysis (PCA)
and t-SNE fall into this category.
• A major benefit of dimensionality reduction is that it helps in data
visualization. If a dataset has 50 or 100 features, it's impossible to plot
directly. But by reducing it to two or three dimensions, we can create
scatter plots that reveal meaningful clusters and relationships.
• However, reducing dimensions also has risks—some details might be
lost, and the transformed features may not always have clear
interpretations. Therefore, it's important to balance simplicity with
accuracy, choosing the right number of dimensions based on the
specific problem and dataset.
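• A minimal feature-selection sketch using scikit-learn's VarianceThreshold, which drops constant (zero-variance) features; the tiny toy matrix is an illustrative assumption.

import numpy as np
from sklearn.feature_selection import VarianceThreshold

X = np.array([[0.0, 2.1, 5.0],
              [0.0, 1.9, 3.0],
              [0.0, 2.0, 4.0]])            # the first feature never varies
selector = VarianceThreshold()             # default threshold of 0.0 removes constant features
X_reduced = selector.fit_transform(X)
print(X_reduced.shape)                     # (3, 2): the constant feature was dropped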
What is PCA?
• Principal Component Analysis (PCA) is a technique used to
reduce the dimensionality of a dataset while preserving as
much variance (information) as possible. It transforms
correlated features into a new set of uncorrelated features
called principal components.
• PCA is widely used in:
  • Data compression – Reducing storage needs.
  • Feature selection – Removing less important features.
  • Noise reduction – Eliminating redundant information.
  • Visualization – Plotting high-dimensional data in 2D or 3D.
• Dimensionality is reduced by selecting only the top k principal components instead of using all the original features.
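• A minimal PCA sketch with scikit-learn; the Iris dataset and n_components=2 are illustrative assumptions.

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)   # PCA is sensitive to feature scales
pca = PCA(n_components=2)                      # keep only the top 2 principal components
X_2d = pca.fit_transform(X_scaled)
print(X_2d.shape)                              # (150, 2)
print(pca.explained_variance_ratio_)           # share of variance captured by each component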
Eigenvectors and Eigenvalues
• Eigenvectors are the new directions or axes that we use to transform our
data. In PCA, they represent the directions of maximum variance (the
most important features of the data).
• Eigenvalues tell us how much variance (or "information") is captured by
each eigenvector. A higher eigenvalue means the corresponding
eigenvector (principal component) is more important because it captures
more variance.
• In PCA:
• We first calculate the covariance matrix of our data.
• We find eigenvectors and eigenvalues of the covariance matrix.
• The eigenvectors are the new axes for the transformed data (principal
components).
• The eigenvalues tell us how much each principal component explains the
variance in the data.
• We pick the principal components with the largest eigenvalues to reduce the
number of dimensions while keeping most of the data's information.
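• A sketch of the steps listed above using NumPy; the random data is an illustrative assumption.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                    # 100 samples, 3 features
X_centered = X - X.mean(axis=0)                  # center the data
cov = np.cov(X_centered, rowvar=False)           # covariance matrix of the features
eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh: for symmetric matrices, eigenvalues ascending
order = np.argsort(eigenvalues)[::-1]            # sort components by explained variance
top2 = eigenvectors[:, order[:2]]                # eigenvectors with the two largest eigenvalues
X_projected = X_centered @ top2                  # project the data onto the principal components
print(eigenvalues[order])                        # how much variance each component explains
print(X_projected.shape)                         # (100, 2)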
Thank you!