
Unsupervised Learning

Unsupervised learning is a machine learning approach that analyzes unlabeled data to identify patterns and relationships without prior guidance. It includes techniques such as clustering, which groups data based on similarities, and association rule learning, which discovers relationships among data items. K-means clustering is a specific algorithm used to categorize data into clusters by iteratively assigning data points to the nearest centroid until stable clusters are formed.

Uploaded by

gokulk200507

UNSUPERVISED LEARNING

What is Unsupervised learning?

• Unsupervised learning is a type of machine learning that
works with data that has no labels or categories. The main
goal is to find patterns and relationships in the data without
any guidance.
• In this approach, the machine analyzes unorganized
information and groups it based on similarities, patterns, or
differences. Unlike supervised learning, there is no teacher and
no labeled training data. The machine must uncover hidden
structures in the data on its own.
• For example, unsupervised learning can analyze animal data
and group the animals by their traits and behavior. These
groups could correspond to different species, making it
possible to organize the animals without pre-existing labels.
Example to understand
• Imagine you have a large dataset of unlabeled images containing both dogs
and cats. The model has never been told which images show a dog and which
show a cat, and it has no pre-existing labels or categories for these animals.
Your task is to use unsupervised learning to separate the dogs from the cats.
• Suppose the model is given a set of pictures of dogs and cats that it has
never seen.
• Because the machine has no idea about the features of dogs and cats, it
cannot name the groups ‘dogs’ and ‘cats’. But it can group the pictures
according to their similarities, patterns, and differences, i.e., it can split
them into two parts. The first may contain all the pictures with dogs in them
and the second all the pictures with cats in them. Nothing was learned
beforehand, which means there were no labeled training data or examples.
• It allows the model to work on its own to discover patterns and information
that was previously undetected. It mainly deals with unlabeled data.
Types of Unsupervised Learning

Unsupervised learning is classified into two categories of algorithms:


• Clustering: A clustering problem is where you want to discover the
inherent groupings in the data, such as grouping customers by purchasing
behavior.
• The task of grouping data points based on their similarity to each other is
called clustering or cluster analysis. It is a branch of unsupervised
learning that aims to gain insights from unlabelled data points. Typical
applications include marketing, product recommendations, and customer
segmentation.
• Think of a dataset of customers' shopping habits. Clustering can help you
group customers with similar purchasing behaviors, and these groups can
then be used for targeted marketing.
• Association: An association rule learning problem is where you want to
discover rules that describe large portions of your data, such as people
that buy X also tend to buy Y.
• Association rule mining finds interesting associations and relationships
among large sets of data items, e.g., milk and bread being bought together.
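A rule such as "people who buy milk also tend to buy bread" is usually scored by its support and confidence. The following is a minimal sketch of those two measures in plain Python; the baskets and the helper names `support` and `confidence` are made up for illustration and do not come from the slides.

```python
# Hypothetical shopping baskets, invented for illustration only.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "eggs"},
    {"milk", "eggs"},
    {"milk", "bread", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


rule_support = support({"milk", "bread"}, transactions)
rule_confidence = confidence({"milk"}, {"bread"}, transactions)
print(f"support(milk, bread) = {rule_support:.2f}")
print(f"confidence(milk -> bread) = {rule_confidence:.2f}")
```

Here milk and bread appear together in 3 of the 5 baskets, and 3 of the 4 baskets with milk also contain bread. Association rule mining algorithms search for all rules whose support and confidence exceed chosen thresholds.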
One of the most important clustering algorithms is:
K-means Clustering:
K-means clustering is an unsupervised machine learning algorithm which
groups an unlabeled dataset into different clusters.
Understanding K-means Clustering:
• K-means clustering is a technique used to organize data into groups based
on their similarity. For example, an online store can use K-means to group
customers by purchase frequency and spending, creating segments
like Budget Shoppers, Frequent Buyers and Big Spenders for
personalised marketing.
• The algorithm works by first randomly picking some central points
called centroids. Each data point is then assigned to the closest
centroid, forming a cluster, and each centroid is moved to the mean of
the points assigned to it.
• This process repeats until the centroids stop changing, at which point the
clusters are final. The goal of clustering is to divide the data points into
clusters so that similar data points belong to the same group.
How does k-means clustering work?

• ‘K’ in the name of the algorithm represents the number of groups/clusters
we want to sort our items into.

The algorithm works as follows:


1. First, we randomly initialize k points, called means or cluster centroids.
2. We assign each item to its closest mean, and we update the mean’s coordinates,
which are the averages of the items assigned to that cluster so far.
3. We repeat the process for a given number of iterations, and at the end we have our
clusters.
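The three steps above can be sketched in NumPy roughly as follows. This is an illustrative version, not the slides' own implementation. It rests on a few assumptions: the initial centroids are drawn from the data points themselves, all points are reassigned before the means are recomputed (the common batch variant), and the loop stops early once the centroids no longer move. The function name `kmeans` and the toy data are hypothetical.

```python
import numpy as np


def kmeans(X, k, n_iters=100, seed=0):
    """Batch k-means on an array of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct data points as the initial centroids (assumption).
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 2a: assign every point to its nearest centroid (Euclidean distance).
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Step 2b: move each centroid to the mean of its assigned points;
        # a cluster that received no points keeps its previous centroid.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 3: repeat, stopping early once the centroids stop changing.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels


# Toy data: two well-separated groups of three points each.
X_toy = np.array([[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
                  [8.0, 8.0], [8.2, 7.9], [7.9, 8.3]])
centroids, labels = kmeans(X_toy, k=2)
print(labels)
print(centroids)
```

Which group receives label 0 and which receives label 1 depends on the random initialization, so only the grouping itself is meaningful.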
Implementation of K-Means Clustering in Python

We will use a synthetic blobs dataset and show how clusters are made.
Step 1: Importing the necessary libraries
We are importing NumPy for numerical computations, Matplotlib to plot the graph,
and make_blobs from sklearn.datasets.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
Step 2: Create the custom dataset with make_blobs and plot it
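# 500 two-dimensional points drawn around 3 centers; y holds each point's blob index
X, y = make_blobs(n_samples=500, n_features=2, centers=3, random_state=23)
fig = plt.figure(0)
plt.grid(True)
plt.scatter(X[:, 0], X[:, 1])  # plot the raw points without using the labels
plt.show()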
Output: a scatter plot of the 500 generated points.
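The walkthrough stops after plotting the raw data. One possible next step, sketched here with scikit-learn's `KMeans` rather than a hand-written loop, is to fit the model on the same blobs. The choices `n_clusters=3` (matching the three generated centers), `n_init=10` and `random_state=0` are assumptions made for this sketch.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Same synthetic data as in Step 2.
X, y = make_blobs(n_samples=500, n_features=2, centers=3, random_state=23)

# n_init=10 reruns the algorithm from 10 random starts and keeps the best run.
model = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = model.fit_predict(X)  # cluster index (0, 1 or 2) for every point

print(model.cluster_centers_)  # one (x, y) centroid per cluster
print(model.inertia_)          # within-cluster sum of squared distances
```

Passing `c=labels` to `plt.scatter` would color each point by its assigned cluster.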