
Answers

Explain the Decision Tree Algorithm with an Example.

What is a Decision Tree?


A Decision Tree is a machine learning model that makes decisions by asking a
series of yes/no (or true/false) questions, like a flowchart. It splits data into
branches based on features (like “Is it raining?” or “Is the temperature high?”),
and each branch leads to a decision or prediction (like “Play outside” or “Stay
inside”). It’s called a “tree” because it starts at a root (the first question) and
branches out to leaves (the final answers).
Think of it like playing a game of “20 Questions” to figure something out—each
question narrows down the possibilities until you get to an answer.

Why use Decision Trees?


There are many algorithms in machine learning, so choosing the best
algorithm for the given dataset and problem is the main point to remember
while creating a machine learning model. Below are two reasons for using
the Decision Tree:

- Decision Trees usually mimic human thinking ability while making a
decision, so they are easy to understand.

- The logic behind a decision tree can be easily understood because it
shows a tree-like structure.

Working of Decision Tree Algorithm 🌳


The Decision Tree builds a model by splitting the dataset based on certain
rules/conditions, and keeps splitting until it reaches the final decision.
Let’s break it down 👇
✍️ Working of Decision Tree (with Weather Example):
1. 👉 Select the Best Attribute as Root Node:

From the dataset, choose the feature with the highest Information Gain or
lowest Gini Index.
Example: From the weather dataset, we select "Weather" as the root node
(Sunny, Cloudy, Rainy).
2. 👉 Create Branches for Each Value of the Attribute:
Make branches from the root for each value.
Example: From "Weather," we make 3 branches: Sunny, Cloudy, and Rainy.
3. 👉 Split Data into Subsets:
For each branch, separate the data where the condition is true.
Example: For the "Sunny" branch, we now look at only the data where weather
is sunny.
4. 👉 Repeat Steps for Subsets:
For each subset, choose the next best attribute and repeat the process.
Example: For "Sunny," the next best attribute is "Humidity":
- If Humidity is High → Don’t play
- If Humidity is Normal → Play
5. 👉 Stop When a Decision is Reached:
When there are no more attributes to split or data is pure, we make a final
decision (leaf node).

Example:
- If Weather is Cloudy → Always Play (for Cloudy, the decision is always Yes,
so there is no need to check any other attribute; every Cloudy example in the
dataset is Yes).

- If Weather is Rainy → Check Wind:


- Strong → Don’t play

- Weak → Play

- If Weather is Sunny → Check Humidity:

- High → Don’t play

- Normal → Play

                Weather
              /    |    \
         Sunny  Cloudy  Rainy
           |       |       |
       Humidity   Play    Wind
        /    \            /    \
     High  Normal    Strong   Weak
      No    Yes        No      Yes
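
As a rough illustration, here is a minimal sketch of this tree being learned with scikit-learn (an assumption; the toy rows below are invented to match the rules above, not taken from a real dataset):

```python
# A minimal sketch, assuming scikit-learn and pandas are available.
# The toy weather rows below are invented to match the rules above.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

data = pd.DataFrame({
    "Weather":  ["Sunny", "Sunny", "Cloudy", "Cloudy", "Rainy", "Rainy"],
    "Humidity": ["High",  "Normal", "High",  "Normal", "High",  "Normal"],
    "Wind":     ["Weak",  "Weak",   "Strong", "Weak",  "Strong", "Weak"],
    "Play":     ["No",    "Yes",    "Yes",    "Yes",   "No",     "Yes"],
})

# One-hot encode the categorical features so the tree can split on them
X = pd.get_dummies(data[["Weather", "Humidity", "Wind"]])
y = data["Play"]

# criterion="entropy" makes the tree choose splits by Information Gain
tree = DecisionTreeClassifier(criterion="entropy", random_state=0)
tree.fit(X, y)

# Print the learned flowchart of yes/no questions
print(export_text(tree, feature_names=list(X.columns)))
```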

Supervised vs. Unsupervised Learning


| Supervised Learning | Unsupervised Learning |
| --- | --- |
| Supervised learning algorithms are trained using labeled data. | Unsupervised learning algorithms are trained using unlabeled data. |
| A supervised learning model takes direct feedback to check whether it is predicting the correct output. | An unsupervised learning model does not take any feedback. |
| A supervised learning model predicts the output. | An unsupervised learning model finds the hidden patterns in data. |
| In supervised learning, input data is provided to the model along with the output. | In unsupervised learning, only input data is provided to the model. |
| The goal of supervised learning is to train the model so that it can predict the output when new data is given. | The goal of unsupervised learning is to find the hidden patterns and useful insights from an unknown dataset. |
| Supervised learning needs supervision to train the model. | Unsupervised learning does not need any supervision to train the model. |
| Supervised learning can be categorized into Classification and Regression problems. | Unsupervised learning can be classified into Clustering and Association problems. |
| Supervised learning can be used for cases where we know the inputs as well as the corresponding outputs. | Unsupervised learning can be used for cases where we have only input data and no corresponding output data. |
| A supervised learning model produces an accurate result. | An unsupervised learning model may give a less accurate result compared to supervised learning. |
| Supervised learning is not close to true Artificial Intelligence, as we first train the model for each data point, and only then can it predict the correct output. | Unsupervised learning is closer to true Artificial Intelligence, as it learns in the same way a child learns daily routine things from experience. |
| It includes algorithms such as Linear Regression, Logistic Regression, Support Vector Machine, Decision Tree, Bayesian Logic, etc. | It includes algorithms such as Clustering, KNN, and PCA. |
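
To make the contrast concrete, here is a minimal sketch (assuming scikit-learn; the data points are arbitrary) showing that a supervised model is fit on inputs together with labels, while an unsupervised model is fit on the inputs alone:

```python
# A minimal sketch, assuming scikit-learn; the data points are arbitrary.
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X = [[1.0], [2.0], [8.0], [9.0]]   # input data
y = [0, 0, 1, 1]                   # labels: only supervised learning sees these

# Supervised: trained on (input, output) pairs, then predicts outputs
clf = LogisticRegression().fit(X, y)
print(clf.predict([[8.5]]))        # -> a predicted label for new data

# Unsupervised: sees only the inputs and finds hidden structure itself
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)                  # -> cluster assignments found from X alone
```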

Write a short note on: (a) Cross-Validation (b) Inductive Bias
Cross Validation
Cross-validation is a method used to check how well our machine
learning model will work on new, unseen data.

When we train a model, it learns from the given data. But just training is
not enough — we need to test if the model will perform well on data it
has never seen before.

In cross-validation, we split our dataset into multiple parts (called "folds").

For example, if we use 5-fold cross-validation:

We divide the data into 5 equal parts.

We train the model on 4 parts and test it on the remaining 1 part.

We repeat this process 5 times, changing the test part each time.

Finally, we take the average of all the test results to understand how
well the model is performing overall.

This helps us:

Avoid the mistake of testing the model on the same data it learned from.

Get a better idea of how the model will work in real life.

Reduce problems like overfitting or underfitting.

In short, cross-validation makes sure our model is not just memorizing the
training data but is actually learning useful patterns.
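
As a rough sketch of 5-fold cross-validation (assuming scikit-learn; the model and the built-in iris dataset are just placeholders):

```python
# A minimal sketch, assuming scikit-learn; model and dataset are placeholders.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# 5-fold CV: train on 4 parts, test on the remaining 1, repeated 5 times
scores = cross_val_score(model, X, y, cv=5)

print(scores)          # one test score per fold
print(scores.mean())   # the average shows overall performance
```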

Inductive bias
When your ML model picks a solution from the hypothesis space, it
needs some guidance. It can't try every possible solution (especially if
the hypothesis space is infinite); that guidance is called inductive bias.

Inductive bias refers to the set of assumptions that a learning algorithm
uses to predict outputs for inputs it has never seen before.

Example: Imagine you are guessing your friend’s favourite color.

- If you assume they like bright colors, you’ll guess yellow, red, etc.

- If you assume they like neutral shades, you’ll guess white, grey, etc.

That assumption is your inductive bias. It helps you guess faster, without
trying every possible color in the world.
Explain Logistic Regression with an Example
Logistic regression is a supervised machine learning algorithm used
for classification tasks.

But instead of predicting values directly like Linear Regression, it predicts
the probability that something belongs to a certain class using a special
function called the sigmoid function, σ(z) = 1 / (1 + e^(-z)), which squashes
any real number into a value between 0 and 1.

It’s referred to as regression because it is an extension of linear
regression, but it is mainly used for classification problems.

📊 Example:
Let’s say we want to predict whether a student will pass an exam or not based
on hours of study.

Input: Hours studied

Output: Pass (Yes/No)

We train the model with past student data:

| Hours Studied | Passed |
| --- | --- |
| 1 | No |
| 2 | No |
| 3 | Yes |
| 4 | Yes |

After training, if a new student studied 2.5 hours, logistic regression might give
a probability like 0.65.
→ Since 0.65 > 0.5, we classify the result as “Yes” (student will pass).
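
A minimal sketch of this exact example with scikit-learn (an assumption; with only four training rows, the probability the fitted model prints will differ from the illustrative 0.65):

```python
# A minimal sketch, assuming scikit-learn; trained on the tiny table above,
# so the printed probability will differ from the illustrative 0.65.
from sklearn.linear_model import LogisticRegression

X = [[1], [2], [3], [4]]   # hours studied
y = [0, 0, 1, 1]           # 0 = No (fail), 1 = Yes (pass)

model = LogisticRegression()
model.fit(X, y)

# The sigmoid squashes the linear score into a probability between 0 and 1
p_pass = model.predict_proba([[2.5]])[0][1]
print(p_pass)                            # probability of passing
print("Yes" if p_pass > 0.5 else "No")   # classify with the 0.5 threshold
```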
With respect to SVM, explain the terms: 1) Hyperplane 2)
Maximum Margin Hyperplane (MMH) 3) Support Vector.

1) Hyperplane
A hyperplane is simply a decision boundary created by the SVM algorithm to
separate data into different classes.

In 2D (two features), this is just a straight line.

In 3D, it becomes a flat surface or a plane.

In more than 3 dimensions, we call it a hyperplane (a general term).

The main goal of the hyperplane is to separate data points of one class from
those of another as cleanly as possible.

📌 Example:
Imagine we are trying to classify fruits as either apples or oranges based on
their weight and color. The hyperplane would be the line (in 2D) or surface (in
3D) that divides apples on one side and oranges on the other.

2) Maximum Margin Hyperplane (MMH)


Out of the many possible hyperplanes that can separate the data, SVM chooses
the one that creates the maximum margin between the hyperplane and the
closest data points from both classes. This special hyperplane is called the
Maximum Margin Hyperplane.

A larger margin means the model is more confident in its predictions.

It helps the model make better predictions on new (unseen) data.

It avoids overfitting by not sticking too closely to the training points.

3) Support Vectors
Support vectors are the most important data points in SVM. These are the
points from each class that are closest to the hyperplane.

These points “support” the position of the hyperplane.

If you remove or change a support vector, the position of the hyperplane
can also change.

The MMH is created in such a way that it lies exactly halfway between the
support vectors of both classes.

📌 Example:
In our fruit example, the apple and orange that are nearest to the separating
line are support vectors. The entire decision boundary depends on them.
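
To tie the three terms together, here is a minimal sketch (assuming scikit-learn; the 2D points are invented stand-ins for fruit features) that fits a linear SVM and inspects the hyperplane, the margin, and the support vectors:

```python
# A minimal sketch, assuming scikit-learn; the 2D points are invented
# stand-ins for fruit features such as weight and a color score.
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 2], [2, 3], [2, 1],    # class 0 ("apples")
              [6, 5], [7, 7], [6, 6]])   # class 1 ("oranges")
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel learns a straight-line hyperplane w·x + b = 0
svm = SVC(kernel="linear", C=1.0)
svm.fit(X, y)

print(svm.coef_, svm.intercept_)   # w and b: the hyperplane itself
print(svm.support_vectors_)        # the closest points that "support" the MMH

# The margin width is 2 / ||w||; the MMH is the hyperplane maximizing it
w = svm.coef_[0]
print(2 / np.linalg.norm(w))
```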
