Group 2 ML Assignment PDF

The document is an assignment from Injibara University on the topic of Machine Learning, specifically focusing on supervised and unsupervised learning. It covers various algorithms, their applications, advantages, and disadvantages, along with practical implementation examples in Python. The assignment includes contributions from a group of students and is submitted to a faculty member.

Uploaded by

ah4710519

INJIBARA UNIVERSITY

College Of Engineering and Technology

Department of Information System

Course Title: Introduction to Machine Learning

Course Code: InSy4101

Group Two Assignment

Group members and ID numbers:
1. Abaynew Kassaye, IUNSR/0080/13
2. Tamirat Gola, IUNSR/0743/13
3. Abdi Dawit, IUNSR/0004/13
4. Ahmed Habib, IUNSR/0065/13
5. Meti Beshuma, IUNSR/0562/13
6. Ediget Lema, IUNSR/0276/13
7. Guadnaw Mogase, 0902
8. Tewodrose Tekalign, 0765

Submitted to: Mis Habtamu (MSc)

Submission date: 17/04/2017 E.C.


Contents
Chapter One
Supervised machine learning
Introduction
1.1. Supervised machine learning
1.2. Types of Supervised Learning Algorithms
1.2.1. Regression Algorithm
1.2.2. Classification Algorithm
1.3. Applications of Supervised learning
1.4. Advantages of Supervised learning
1.5. Disadvantages of Supervised learning
Chapter Two
Unsupervised Machine learning
2.1. Unsupervised Machine learning
2.2. Types of Unsupervised Learning Algorithm
2.2.1. Clustering
2.2.2. Association rule learning
2.2. Application of Unsupervised learning
2.3. Advantages of Unsupervised learning
2.4. Disadvantages of Unsupervised learning
Conclusion
References

Figure 1: Supervised learning
Figure 2: Classification of SML Algorithms
Figure 3: Unsupervised Machine Learning
Figure 4: Classification of UML Algorithm

Chapter One
Supervised machine learning

Introduction

What is machine learning? Machine learning is a branch of artificial intelligence (AI) and computer science that focuses on using data and algorithms to imitate the way humans learn, gradually improving in accuracy.

Machine learning is a field of computer science that gives computers the ability to learn without being explicitly programmed.

Supervised learning and unsupervised learning are two main types of machine learning.

1.1. Supervised machine learning

What is supervised learning?


Supervised learning is a type of machine learning in which an algorithm learns from labeled data. Labeled data is data that has been tagged with a correct answer or classification.

In supervised learning, the machine is trained on a set of labeled data, which means that each input is paired with the desired output.

Supervised learning is often used for tasks such as classification, regression, and object detection.

For example, a labeled dataset of images of elephants, camels, and cows would have each image tagged with either “Elephant”, “Camel”, or “Cow”.

Figure 1: Supervised learning

Key Points:

• Supervised learning involves training a machine on labeled data.

• Labeled data consists of examples paired with the correct answer or classification.

• The machine learns the relationship between inputs (for example, animal images) and outputs (the corresponding animal labels).

• The trained machine can then make predictions on new, unlabeled data.
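
The key points above can be sketched as a short workflow. This is a minimal illustration, assuming scikit-learn is installed; the choice of the built-in iris dataset and of a decision tree as the model is ours, not part of the assignment. Part of the labeled data is held back to play the role of new, unseen examples.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Labeled data: each row of X (features) is paired with a label in y
X, y = load_iris(return_X_y=True)

# Hold back part of the data to play the role of "new" examples
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Learn the input-to-label relationship from the labeled training set
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Predict labels for examples the model has not seen
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy on held-out data: {accuracy:.2f}")
```

Comparing the predictions with the held-back labels gives an estimate of how well the learned relationship generalizes.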

1.2. Types of Supervised Learning Algorithms

Supervised learning algorithms are classified into two categories:

1. Regression Algorithm

2. Classification Algorithm

Figure 2: Classification of SML Algorithms

1.2.1. Regression Algorithm


• Regression algorithms learn a function that maps the input features to an output value.

• A regression algorithm is a type of supervised learning algorithm used to predict continuous values, such as house prices, stock prices, or customer churn rates.

• By contrast, a task with categorical labels (e.g., apple/orange recognition) is a classification problem rather than a regression problem.

• A regression problem is one in which the output variable is a real value, such as “dollars” or “weight”.

Some common regression algorithms include:

A. Linear Regression

• It is a foundational algorithm in supervised machine learning, used for predicting a continuous target variable based on one or more input features.
• It is one of the simplest and most widely used techniques for modeling relationships between variables.
• It assumes a linear relationship between the input features and the target variable.

With a single input feature, the relationship between the variables is modeled as a straight line:

y = β₀ + β₁x + ϵ

Where:

• y: Dependent variable (target)
• x: Independent variable (feature)
• β₀: Intercept
• β₁: Slope
• ϵ: Error term
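
To make the roles of the intercept and slope concrete, here is a minimal sketch, assuming NumPy is available. The data points are hypothetical and generated without noise from a known line (intercept 2, slope 3), so the fit should recover those two values.

```python
import numpy as np

# Hypothetical noise-free data generated from y = 2 + 3x
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x

# Fit a degree-1 polynomial; polyfit returns [slope, intercept]
slope, intercept = np.polyfit(x, y, deg=1)
print(f"Intercept: {intercept:.2f}, slope: {slope:.2f}")
```

With real data the error term is not zero, so the estimated intercept and slope only approximate the underlying relationship.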

Example:

A simple example using scikit-learn to predict house prices based on features such as size, number of rooms, and age of the house.

• Implementation:

In Python, you can use libraries such as scikit-learn:

import numpy as np
from sklearn.linear_model import LinearRegression

# Sample data: [size, number_of_rooms, age]
X = np.array([[1400, 3, 20], [1600, 4, 15], [1700, 3, 25], [1500, 3, 10]])
y = np.array([300000, 400000, 350000, 320000])  # Target: house prices

# Create and fit the model
model = LinearRegression()
model.fit(X, y)

# Make predictions
new_house = np.array([[1600, 3, 5]])
predicted_price = model.predict(new_house)
print(f"Predicted price: ${predicted_price[0]:,.2f}")

B. Multiple Linear Regression

• Multiple linear regression (MLR) is an extension of simple linear regression that predicts a continuous target variable from multiple input features. Instead of using just one independent variable, it uses two or more.
• This statistical technique forecasts the value of a dependent variable by analyzing its linear associations with two or more independent variables.
• The goal of MLR is to predict the value of the dependent variable based on the values of the independent variables.

Equation: The relationship between the dependent variable Y and the independent variables X₁, X₂, ..., Xₙ is modeled as:

Y = β₀ + β₁X₁ + β₂X₂ + ... + βₙXₙ + ∊

Where:

• Y is the dependent variable.

• X₁, X₂, ..., Xₙ are the independent variables.

• β₀ is the intercept.

• β₁, β₂, ..., βₙ are the coefficients of the independent variables.

• ∊ is the error term.
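
As a rough illustration of how the coefficients can be estimated, the sketch below uses NumPy's least-squares solver. The data are hypothetical and generated without noise from known coefficients (intercept 1, coefficients 2 and 3), so the estimate should recover them; the variable names are ours.

```python
import numpy as np

# Hypothetical noise-free data generated from Y = 1 + 2*X1 + 3*X2
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
Y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

# Prepend a column of ones so the first coefficient is the intercept
X_design = np.column_stack([np.ones(len(X)), X])

# Least-squares estimate of [intercept, coefficient of X1, coefficient of X2]
beta, *_ = np.linalg.lstsq(X_design, Y, rcond=None)
print(np.round(beta, 2))
```

The column of ones plays the same role as the constant term that statistical libraries add for the intercept.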

Example implementation: multiple linear regression in Python using the statsmodels library.
python

import pandas as pd
import statsmodels.api as sm

# Sample dataset
# Note: in this toy data, Age is an exact linear function of Area
# (Age = 0.05 * Area - 65), so the individual coefficients are not uniquely
# determined; real data should contain features that are not perfectly collinear.
data = {
    'Area': [1500, 1600, 1700, 1800, 1900],
    'Bedrooms': [3, 3, 4, 4, 5],
    'Age': [10, 15, 20, 25, 30],
    'Price': [300000, 320000, 350000, 370000, 400000]}
df = pd.DataFrame(data)

# Independent variables
X = df[['Area', 'Bedrooms', 'Age']]

# Adding a constant for the intercept term
X = sm.add_constant(X)

# Dependent variable
y = df['Price']

# Creating and fitting the model
model = sm.OLS(y, X).fit()

# Making predictions
predictions = model.predict(X)

# Summary of the model
print(model.summary())

C. Polynomial Regression

• Polynomial regression is a form of regression analysis in which the relationship between the independent variable X and the dependent variable Y is modeled as an nth-degree polynomial.
• It is used when the data shows a non-linear relationship that can be better captured by polynomial terms.

The general form of a polynomial regression model is:

Y = β₀ + β₁X + β₂X² + β₃X³ + ... + βₙXⁿ + ∊

Where:

• Y is the dependent variable.

• X is the independent variable.

• n is the degree of the polynomial.

• β₀, β₁, ..., βₙ are the coefficients.

• ∊ is the error term.

Example of Polynomial Regression in Python

Let’s consider a dataset where we want to model the relationship between the independent variable X and the dependent variable y using a polynomial function.

python

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Sample data
X = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9]).reshape(-1, 1)
y = np.array([3, 9, 10, 27, 36, 45, 54, 63, 72])

# Transform to polynomial features
degree = 2
poly = PolynomialFeatures(degree)
X_poly = poly.fit_transform(X)

# Create and fit the model
model = LinearRegression()
model.fit(X_poly, y)

# Make predictions
X_predict = np.linspace(1, 9, 100).reshape(-1, 1)
X_predict_poly = poly.transform(X_predict)
y_predict = model.predict(X_predict_poly)

# Plot the results
plt.scatter(X, y, color='blue', label='Original data')
plt.plot(X_predict, y_predict, color='red', label='Polynomial fit')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.title('Polynomial Regression')
plt.show()

1.2.2. Classification Algorithm

• Classification is a type of supervised learning that is used to predict categorical values, such as:

  • whether a customer will churn or not,

  • whether an email is spam or not, or

  • whether a medical image shows a tumor or not.

• Classification algorithms learn a function that maps the input features to a probability distribution over the output classes.
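
The idea of a probability distribution over the output classes can be seen directly in scikit-learn, where many classifiers expose a `predict_proba` method. The sketch below is illustrative only; the toy data are hypothetical and logistic regression is just one possible model.

```python
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one feature, two classes (0 and 1)
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [0, 0, 0, 1, 1, 1]

model = LogisticRegression()
model.fit(X, y)

# Each row is a probability distribution over the classes [0, 1]
proba = model.predict_proba([[1.5], [5.5]])
print(model.classes_)
print(proba.round(3))

# The predicted class is the one with the highest probability
predicted = model.predict([[1.5], [5.5]])
print(predicted)
```

Each row of probabilities sums to 1, and the predicted label is simply the class with the largest value in that row.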

Some common classification algorithms include:

1. Logistic Regression
• It is a statistical method used for binary classification problems, that is, for predicting a binary outcome.
• The outcome variable is categorical and typically takes one of two possible values (e.g., 0 or 1, True or False, Yes or No).
• Despite its name, logistic regression is used for classification rather than regression.

Example: Determining whether an email is spam or not.

• Implementation:

Python

from sklearn.linear_model import LogisticRegression

# Sample data
X = [[1, 2], [2, 3], [3, 4], [4, 5]]
y = [0, 0, 1, 1]

# Create and fit the model
model = LogisticRegression()
model.fit(X, y)

# Make predictions
predictions = model.predict([[3, 5]])
print(predictions)

2. Decision Trees

• Decision trees are a popular and intuitive method for both classification and regression tasks in machine learning.
• They model decisions and their possible consequences in a tree-like structure.
• Each internal node represents a decision based on a feature, each branch represents the outcome of that decision, and each leaf node represents a final outcome (class label or predicted value).
• Main Purpose: Classification and regression.

Example: Classifying whether a loan applicant is high or low risk based on their credit score, income, and loan amount.

• Implementation:

Python

from sklearn.tree import DecisionTreeClassifier

# Sample data: [credit_score, income, loan_amount]
X = [[600, 50, 20000], [700, 60, 15000], [800, 70, 30000]]
y = [0, 0, 1]  # 0: Low risk, 1: High risk

# Create and fit the model
model = DecisionTreeClassifier()
model.fit(X, y)

# Make predictions
predictions = model.predict([[650, 55, 18000]])
print(predictions)
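
To see the nodes, branches, and leaves described above, the learned tree can be printed as text. This is a small sketch using scikit-learn's `export_text` helper on the same toy loan data; the feature names are labels we chose for readability.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Same toy loan data as above: [credit_score, income, loan_amount]
X = [[600, 50, 20000], [700, 60, 15000], [800, 70, 30000]]
y = [0, 0, 1]  # 0: Low risk, 1: High risk

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# Print the learned rules: internal nodes test a feature, leaves give a class
rules = export_text(model, feature_names=["credit_score", "income", "loan_amount"])
print(rules)
```

With only three samples a single split is enough to separate the classes, so the printed tree has one decision node and two leaves.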

3. Support Vector Machines (SVM)

• SVMs are a class of supervised machine learning algorithms primarily used for classification tasks, but they can also be adapted for regression.
• The main idea behind SVM is to find the optimal hyperplane that separates data points of different classes in a high-dimensional space.
• Main Purpose: Classification tasks.

Example: Identifying handwritten digits from images.

• Implementation:

Python

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load dataset
digits = datasets.load_digits()
X = digits.data
y = digits.target

# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

# Create and fit the model
model = SVC()
model.fit(X_train, y_train)

# Make predictions
predictions = model.predict(X_test)
print(predictions)

4. k-Nearest Neighbors (k-NN)

• k-NN is a simple yet powerful supervised machine learning algorithm used for both classification and regression tasks.
• The core idea behind k-NN is to classify a data point based on how its nearest neighbors are classified.
• Main Purpose: Both classification and regression.

Example: Recommending movies to users based on the preferences of similar users.

• Implementation:

Python

from sklearn.neighbors import KNeighborsClassifier

# Sample data
X = [[0], [1], [2], [3]]
y = [0, 0, 1, 1]

# Create and fit the model
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X, y)

# Make predictions
predictions = model.predict([[1.5]])
print(predictions)
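
The neighbor-voting idea behind k-NN can also be written out by hand. The following is a minimal sketch, not scikit-learn's implementation; `knn_predict` is a hypothetical helper, and the query point 2.6 was chosen so that the three nearest neighbors are unambiguous.

```python
import numpy as np
from collections import Counter

# Hypothetical toy data: one feature, two classes
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

def knn_predict(query, X, y, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = np.linalg.norm(X - query, axis=1)  # Euclidean distances
    nearest = np.argsort(distances)[:k]            # indices of the k closest points
    votes = Counter(y[nearest].tolist())           # count labels among the neighbors
    return votes.most_common(1)[0][0]

# Nearest points to 2.6 are 3, 2, and 1, with labels 1, 1, 0
result = knn_predict(np.array([2.6]), X, y)
print(result)
```

Two of the three nearest neighbors carry label 1, so the majority vote assigns the query to class 1.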

5. Neural Networks

• Neural networks are a class of algorithms inspired by the structure and function of the human brain.
• They consist of interconnected layers of nodes (neurons) that process data and can learn complex patterns through training.
• Neural networks are widely used for various tasks, including classification, regression, image recognition, natural language processing, and more.

Example: Simple Neural Network for Binary Classification

Implementation

python

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Sample data: Inputs and their corresponding labels
X = np.array([[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]])
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])  # 0: below 6, 1: 6 and above

# Create the model
model = Sequential([
    Dense(10, input_dim=1, activation='relu'),  # Hidden layer with 10 neurons, fed by 1 input feature
    Dense(1, activation='sigmoid')  # Output layer with 1 neuron
])

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the model on the data
model.fit(X, y, epochs=100, verbose=0)

# Make predictions (sigmoid outputs: estimated probability of class 1)
predictions = model.predict(np.array([[5], [6]]))
print(f"Prediction for 5: {predictions[0][0]:.2f}")
print(f"Prediction for 6: {predictions[1][0]:.2f}")

6. Naive Bayes

• It is a probabilistic classification algorithm based on Bayes' Theorem, with the "naive" assumption that features are independent of each other given the class label.
• Despite this assumption, Naive Bayes performs remarkably well in many real-world applications, particularly in text classification and spam detection.
• Bayes' Theorem: The foundation of Naive Bayes, which describes the probability of a class given the features:

P(C|X) = P(X|C) · P(C) / P(X)

• P(C|X): Posterior probability of class C given features X
• P(X|C): Likelihood of features X given class C
• P(C): Prior probability of class C
• P(X): Marginal probability of the features X (the evidence)
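
A small worked calculation may help show how the terms of Bayes' Theorem combine. All probabilities below are hypothetical values invented for illustration (a spam filter looking at a single word); they do not come from any dataset.

```python
# Hypothetical probabilities for a spam filter (illustrative values only)
p_spam = 0.2               # P(C): prior probability that an email is spam
p_word_given_spam = 0.6    # P(X|C): probability the word appears in a spam email
p_word_given_ham = 0.05    # probability the word appears in a non-spam email

# P(X): total probability of seeing the word, summed over both classes
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# Bayes' Theorem: P(C|X) = P(X|C) * P(C) / P(X)
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(f"P(spam | word) = {p_spam_given_word:.2f}")
```

Here P(X) = 0.12 + 0.04 = 0.16, so the posterior is 0.12 / 0.16 = 0.75: seeing the word raises the probability of spam from 20% to 75%.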

Example Implementation: Gaussian Naive Bayes

python

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load the iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# For simplicity, use only the first two classes (setosa and versicolor)
X = X[y != 2]
y = y[y != 2]

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create the Gaussian Naive Bayes model
model = GaussianNB()

# Train the model on the training data
model.fit(X_train, y_train)

# Make predictions on the testing data
y_pred = model.predict(X_test)

# Evaluate the model's accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

1.3. Applications of Supervised learning


Supervised learning can be used to solve a wide variety of problems, including:

• Spam filtering: Models can be trained to identify and classify spam emails based on their content, helping users avoid unwanted messages.

• Image classification: Models can automatically classify images into different categories, such as animals, objects, or scenes, facilitating tasks like image search, content moderation, and image-based product recommendations.

• Medical diagnosis: Models can assist in medical diagnosis by analyzing patient data, such as medical images, test results, and patient history, to identify patterns that suggest specific diseases or conditions.

• Fraud detection: Models can analyze financial transactions and identify patterns that indicate fraudulent activity, helping financial institutions prevent fraud and protect their customers.

1.4. Advantages of Supervised learning

• Supervised learning makes it possible to collect data and produce outputs based on previous experience.

• It helps to optimize performance criteria with the help of experience.

• It can perform both classification and regression tasks.

• It allows the result for a new sample to be estimated or mapped.

1.5. Disadvantages of Supervised learning

• Classifying big data can be challenging.

• Training supervised learning models can require a lot of computation time.

• It requires a labeled data set.

Chapter Two
Unsupervised Machine learning

2.1. Unsupervised Machine learning

What is unsupervised learning?


• Unsupervised learning is a type of machine learning that learns from unlabeled data.

• This means that the data does not have any pre-existing labels or categories. The goal of unsupervised learning is to discover patterns and relationships in the data without any explicit guidance.

For example, given a collection of unlabeled animal images, an algorithm can group similar images together. These groupings might correspond to various animal species, allowing you to categorize the animals without depending on labels that already exist.

Figure 3: Unsupervised Machine Learning

Key Points

• Unsupervised learning allows the model to discover patterns and relationships in unlabeled data.

• Clustering algorithms group similar data points together based on their inherent characteristics.

• Feature extraction captures essential information from the data, enabling the model to make meaningful distinctions.

• Label association assigns categories to the clusters based on the extracted patterns and characteristics.

2.2. Types of Unsupervised Learning Algorithm:

Unsupervised learning algorithms can be further categorized into two types: clustering and association rule learning.

Figure 4: Classification of UML Algorithm

2.2.1. Clustering
• Clustering is a type of unsupervised learning that is used to group similar data points together.

• A clustering problem is one where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior.
• Many clustering algorithms, such as k-means, work iteratively: they assign each data point to its nearest cluster center and then update the centers, so that points end up close to their own cluster and far from the others.

Example Implementation of Agglomerative Clustering in Python

Implementation

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import AgglomerativeClustering
from scipy.cluster.hierarchy import dendrogram, linkage

# Generate synthetic data
X, _ = make_blobs(n_samples=100, centers=3, cluster_std=0.60, random_state=0)

# Apply Agglomerative Clustering
agg_clustering = AgglomerativeClustering(n_clusters=3, linkage='ward')
labels = agg_clustering.fit_predict(X)

# Plot the clustered data
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='rainbow')
plt.title('Agglomerative Hierarchical Clustering')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.show()

# Create the linkage matrix and plot the dendrogram
Z = linkage(X, 'ward')
plt.figure(figsize=(10, 5))
dendrogram(Z, truncate_mode='level', p=5)
plt.title('Hierarchical Clustering Dendrogram')
plt.xlabel('Sample index')
plt.ylabel('Distance')
plt.show()

Some common clustering and related unsupervised algorithms include:

1. Hierarchical clustering: Hierarchical clustering is a method of cluster analysis that seeks to build a hierarchy of clusters. It is commonly used in data mining and statistics for finding meaningful patterns or groups in data. There are two main types of hierarchical clustering: agglomerative (bottom-up) and divisive (top-down).

2. K-means clustering: K-means is a popular unsupervised machine learning algorithm used to partition a dataset into K distinct, non-overlapping clusters. It aims to group data points in such a way that points in the same cluster are more similar to each other than to those in other clusters.
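
A minimal K-means sketch, assuming scikit-learn is installed. The six points are hypothetical and deliberately form two well-separated groups, so the partition into K = 2 clusters is easy to check by eye.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy data: two well-separated groups of 2-D points
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.9, 8.3]])

# Partition the data into K = 2 clusters
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print(labels)                   # cluster index assigned to each point
print(kmeans.cluster_centers_)  # mean of the points in each cluster
```

The numeric cluster indices are arbitrary; what matters is that the first three points share one label and the last three share the other.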

3. Principal Component Analysis (PCA): PCA is a dimensionality reduction algorithm used to reduce the number of variables in a dataset while preserving as much information as possible.

It transforms the original variables into a new set of uncorrelated variables called principal components, ordered by the amount of variance they capture from the data.
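
As an illustration, the sketch below reduces the four features of the built-in iris dataset to two principal components with scikit-learn. The choice of dataset and of two components is ours, made only to keep the example short.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Iris has 4 numeric features per sample
X = load_iris().data

# Keep the 2 principal components that capture the most variance
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X.shape, "->", X_reduced.shape)
print(pca.explained_variance_ratio_)  # share of variance captured by each component
```

The explained variance ratios are listed in descending order, which reflects the ordering of principal components by captured variance.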

4. Singular Value Decomposition (SVD): SVD is a mathematical technique used in linear algebra to factorize a matrix into three distinct matrices.

It is commonly used in various applications, such as data compression, noise reduction, and latent semantic analysis in natural language processing.

• Factorization: SVD decomposes a given matrix A into three matrices: U, Σ, and Vᵀ.

A = UΣVᵀ

• U: An orthogonal matrix containing the left singular vectors.
• Σ: A diagonal matrix containing the singular values.
• Vᵀ: The transpose of an orthogonal matrix V whose columns are the right singular vectors.
• Properties:
  o The singular values in Σ are non-negative and sorted in descending order.
  o The columns of U and V are orthonormal.
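
The factorization and its properties can be checked numerically with NumPy. The 3x2 matrix below is a hypothetical example; the sketch uses the thin form of the decomposition (`full_matrices=False`), which is an implementation choice on our part.

```python
import numpy as np

# Hypothetical 3x2 matrix
A = np.array([[3.0, 1.0], [1.0, 3.0], [1.0, 1.0]])

# Thin SVD: U is 3x2, s holds the singular values, Vt is 2x2
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rebuild A from its three factors
A_rebuilt = U @ np.diag(s) @ Vt

print(s)                                  # non-negative, in descending order
print(np.allclose(A, A_rebuilt))          # the factors reproduce A
print(np.allclose(U.T @ U, np.eye(2)))    # columns of U are orthonormal
print(np.allclose(Vt @ Vt.T, np.eye(2)))  # columns of V are orthonormal
```

Truncating the smallest singular values before rebuilding the matrix is the basis of the compression and noise-reduction uses mentioned above.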

5. Independent Component Analysis (ICA): ICA is a computational method for separating a multivariate signal into additive, independent non-Gaussian signals.

It is primarily used in the fields of signal processing and data analysis for tasks such as blind source separation and feature extraction.
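
A rough sketch of blind source separation using scikit-learn's `FastICA`, one common ICA implementation. The two source signals and the mixing matrix are assumptions made up for this illustration.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Two hypothetical independent, non-Gaussian source signals
t = np.linspace(0, 8, 2000)
s1 = np.sin(2 * t)            # sinusoid
s2 = np.sign(np.sin(3 * t))   # square wave
S = np.column_stack([s1, s2])

# Mix the sources with an assumed mixing matrix to simulate two sensors
mixing = np.array([[1.0, 0.5], [0.5, 2.0]])
X = S @ mixing.T

# Estimate the independent components from the mixed observations only
ica = FastICA(n_components=2, random_state=0)
S_estimated = ica.fit_transform(X)

print(S_estimated.shape)  # one column per recovered component
```

ICA recovers the sources only up to their order, sign, and scale, so the estimated columns should be compared with the originals by shape rather than by exact values.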

2.2.2. Association rule learning


• Association rule learning is a type of unsupervised learning that is used to identify patterns in data.

• An association rule learning problem is one where you want to discover rules that describe large portions of your data, such as "people who buy X also tend to buy Y".

• Association rule learning algorithms work by finding relationships between different items in a dataset.

Some common association rule learning algorithms include:


1. Apriori Algorithm

• It is used to identify frequent itemsets in a transaction database and generate association rules.
• It uses a bottom-up approach, starting with individual items and extending them to larger itemsets as long as those itemsets appear sufficiently often.

Example:

Consider a transaction database with the following transactions:

• T1: {Bread, Milk}
• T2: {Bread, Diaper, Beer, Eggs}
• T3: {Milk, Diaper, Beer, Coke}
• T4: {Bread, Milk, Diaper, Beer}
• T5: {Bread, Milk, Diaper, Coke}

Using Apriori, we can find frequent itemsets and generate rules like {Bread} -> {Milk}, indicating that customers who buy bread are likely to buy milk.
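
The bottom-up idea can be sketched in plain Python for the first two levels (single items, then pairs). This is a simplified illustration rather than a full Apriori implementation; the minimum support threshold of 0.6 is an assumption, since the text does not give one, and `support` is a helper defined here.

```python
from itertools import combinations

# The five example transactions
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

min_support = 0.6  # assumed threshold, not given in the text

# Level 1: frequent single items
items = sorted({item for t in transactions for item in t})
frequent_1 = [frozenset([i]) for i in items if support({i}) >= min_support]

# Level 2: candidate pairs are built only from frequent single items
frequent_items = sorted(i for s in frequent_1 for i in s)
frequent_2 = [frozenset(p) for p in combinations(frequent_items, 2)
              if support(set(p)) >= min_support]

# Confidence of the rule {Bread} -> {Milk}
confidence = support({"Bread", "Milk"}) / support({"Bread"})
print(sorted(map(sorted, frequent_2)))
print(f"confidence = {confidence:.2f}")
```

Bread appears in 4 of 5 transactions and {Bread, Milk} in 3 of 5, so the rule {Bread} -> {Milk} has confidence 0.6 / 0.8 = 0.75.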

2. Eclat Algorithm

• The Eclat algorithm (Equivalence Class Clustering and bottom-up Lattice Traversal) is another method for frequent itemset mining, which uses a depth-first search approach.

Example:

Given the same transactions, Eclat transforms the data into a vertical format that maps each item to the set of transactions containing it:

• Bread: {T1, T2, T4, T5}
• Milk: {T1, T3, T4, T5}
• Diaper: {T2, T3, T4, T5}

By intersecting these sets, we identify frequent itemsets like {Bread, Milk}, which appears in transactions {T1, T4, T5}.
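
The transformation and the intersection step can be reproduced with Python sets. This sketch covers only the vertical layout and a single intersection, not the full depth-first search; the variable names are ours.

```python
# The five example transactions, keyed by transaction ID
transactions = {
    "T1": {"Bread", "Milk"},
    "T2": {"Bread", "Diaper", "Beer", "Eggs"},
    "T3": {"Milk", "Diaper", "Beer", "Coke"},
    "T4": {"Bread", "Milk", "Diaper", "Beer"},
    "T5": {"Bread", "Milk", "Diaper", "Coke"},
}

# Vertical format: map each item to the set of transaction IDs containing it
tidsets = {}
for tid, items in transactions.items():
    for item in items:
        tidsets.setdefault(item, set()).add(tid)

# The support of an itemset is the size of the intersection of its ID sets
bread_and_milk = tidsets["Bread"] & tidsets["Milk"]
print(sorted(bread_and_milk))
```

The intersection contains T1, T4, and T5, so {Bread, Milk} has a support count of 3.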

3. FP-Growth Algorithm

• The FP-Growth algorithm (Frequent Pattern Growth) is an efficient alternative to Apriori that avoids the costly candidate generation step by using a compact data structure called an FP-Tree (Frequent Pattern Tree).

Example:

With our transaction database:

1. FP-Tree Construction:
   o Identify frequent items.
   o Construct the tree by inserting transactions and updating counts.
2. Mining:
   o Traverse the tree to find frequent itemsets like {Diaper, Beer}, which might indicate a strong association between these items.
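
The construction step can be sketched in plain Python. This covers only building the tree, not mining it; the `Node` class, the `show` helper, and the minimum support count of 3 are all assumptions introduced for illustration.

```python
from collections import Counter

transactions = [
    ["Bread", "Milk"],
    ["Bread", "Diaper", "Beer", "Eggs"],
    ["Milk", "Diaper", "Beer", "Coke"],
    ["Bread", "Milk", "Diaper", "Beer"],
    ["Bread", "Milk", "Diaper", "Coke"],
]
min_count = 3  # assumed minimum support count, not given in the text

# Step 1: count items and keep only the frequent ones
counts = Counter(item for t in transactions for item in t)
frequent = {item: c for item, c in counts.items() if c >= min_count}

class Node:
    """One FP-Tree node: an item, its count, and its child nodes."""
    def __init__(self, item):
        self.item, self.count, self.children = item, 0, {}

# Step 2: insert each transaction, with items sorted by descending frequency
# (ties broken alphabetically) so that shared prefixes share tree nodes
root = Node(None)
for t in transactions:
    ordered = sorted((i for i in t if i in frequent), key=lambda i: (-frequent[i], i))
    node = root
    for item in ordered:
        node = node.children.setdefault(item, Node(item))
        node.count += 1

def show(node, depth=0):
    """Print the tree, one node per line, indented by depth."""
    for child in node.children.values():
        print("  " * depth + f"{child.item}: {child.count}")
        show(child, depth + 1)

show(root)
```

Because four of the five transactions start with Bread after sorting, they share a single Bread node with count 4, which is how the tree stays compact.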

2.2. Application of Unsupervised learning


Unsupervised learning can be used to solve a wide variety of problems, including:

• Anomaly detection: Unsupervised learning can identify unusual patterns or deviations from normal behavior in data, enabling the detection of fraud, intrusion, or system failures.

• Scientific discovery: Unsupervised learning can uncover hidden relationships and patterns in scientific data, leading to new hypotheses and insights in various scientific fields.

• Recommendation systems: Unsupervised learning can identify patterns and similarities in user behavior and preferences to recommend products, movies, or music that align with users' interests.

• Customer segmentation: Unsupervised learning can identify groups of customers with similar characteristics, allowing businesses to target marketing campaigns and improve customer service more effectively.

• Image analysis: Unsupervised learning can group images based on their content, facilitating tasks such as image classification, object detection, and image retrieval.

2.3. Advantages of Unsupervised learning
• It does not require training data to be labeled.

• Dimensionality reduction can be readily accomplished using unsupervised learning.

• It is capable of finding previously unknown patterns in data.

• Unsupervised learning can help you gain insights from unlabeled data that you might not have been able to get otherwise.

• Unsupervised learning is good at finding patterns and relationships in data without being told what to look for. This can help you learn new things about your data.

2.4. Disadvantages of Unsupervised learning


• It is difficult to measure accuracy or evaluate model performance because there are no predefined answers or labels to compare against.

• The results are often less accurate.

• The user needs to spend time interpreting and labeling the groups that the algorithm produces.

• Unsupervised learning can be sensitive to data quality, including missing values, outliers, and noisy data.

Conclusion
• Supervised and unsupervised learning are two main types of machine learning, and both are powerful tools that can be used to solve a wide variety of problems.

• In supervised learning, the machine is trained on a set of labeled data, which means that the input data is paired with the desired output.

• Unsupervised learning learns from unlabeled data. Its goal is to discover patterns and relationships in the data without any explicit guidance.

• Supervised learning is well-suited for tasks where the desired output is known, while unsupervised learning is well-suited for tasks where the desired output is unknown.

References

• Bishop, C. M. (2006). *Pattern Recognition and Machine Learning*. Springer.

• Murphy, K. P. (2012). *Machine Learning: A Probabilistic Perspective*. MIT Press.

• Géron, A. (2019). *Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow*. O'Reilly Media.

• Official Scikit-learn Documentation: https://scikit-learn.org/stable/documentation.html
