ML Answers
The logic behind a decision tree is easy to understand because it follows a tree-like structure.
1. 👉 Select the Best Attribute as the Root Node:
From the dataset, choose the feature with the highest Information Gain or
lowest Gini Index.
Example: From the weather dataset, we select "Weather" as the root node
(Sunny, Cloudy, Rainy).
2. 👉 Create Branches for Each Value of the Attribute:
Make branches from the root for each value.
Example: From "Weather," we make 3 branches: Sunny, Cloudy, and Rainy.
3. 👉 Split Data into Subsets:
For each branch, separate the data where the condition is true.
Example: For the "Sunny" branch, we now look at only the data where weather
is sunny.
4. 👉 Repeat Steps for Subsets:
For each subset, choose the next best attribute and repeat the process.
Example: For "Sunny," the next best attribute is "Humidity":
- If Humidity is High → Don’t play
- If Humidity is Normal → Play
5. 👉 Stop When a Decision is Reached:
When there are no more attributes to split on, or the data is pure, we make a
final decision (leaf node).
Example:
- If Weather is Cloudy → Always Play (in the dataset the decision for Cloudy is
always "Yes", so there is no need to check any other attribute).
- If Weather is Rainy → check Wind: Weak → Play, Strong → Don't play.
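The attribute-selection step (Step 1) can be sketched in a few lines. The mini weather dataset below is an assumed one (classic play-tennis style rows, not taken from the answer above), just to show how Information Gain picks the root node:

```python
import math
from collections import Counter

# Hypothetical weather dataset: (Weather, Humidity, Wind, Play?)
rows = [
    ("Sunny",  "High",   "Weak",   "No"),
    ("Sunny",  "High",   "Strong", "No"),
    ("Cloudy", "High",   "Weak",   "Yes"),
    ("Rainy",  "High",   "Weak",   "Yes"),
    ("Rainy",  "Normal", "Weak",   "Yes"),
    ("Rainy",  "Normal", "Strong", "No"),
    ("Cloudy", "Normal", "Strong", "Yes"),
    ("Sunny",  "High",   "Weak",   "No"),
    ("Sunny",  "Normal", "Weak",   "Yes"),
    ("Rainy",  "Normal", "Weak",   "Yes"),
    ("Sunny",  "Normal", "Strong", "Yes"),
    ("Cloudy", "High",   "Strong", "Yes"),
    ("Cloudy", "Normal", "Weak",   "Yes"),
    ("Rainy",  "High",   "Strong", "No"),
]

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, col):
    """Entropy of the labels minus the weighted entropy after splitting on col."""
    labels = [r[-1] for r in rows]
    gain = entropy(labels)
    for value in {r[col] for r in rows}:
        subset = [r[-1] for r in rows if r[col] == value]
        gain -= len(subset) / len(rows) * entropy(subset)
    return gain

names = ["Weather", "Humidity", "Wind"]
best = max(range(3), key=lambda c: info_gain(rows, c))
print(names[best])  # → Weather (highest Information Gain, so it becomes the root)
```

With this dataset, "Weather" has the largest gain, "Humidity" comes second, and "Wind" last, which is why "Weather" sits at the root of the tree.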
              Weather
            /    |    \
       Sunny  Cloudy   Rainy
        /        |        \
   Humidity    Play       Wind
    /    \              /     \
 High   Normal     Strong    Weak
  No     Yes         No       Yes
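The tree above can be read directly as nested if/else rules; a short sketch:

```python
# The decision tree above, written as nested if/else rules.
def play(weather, humidity=None, wind=None):
    if weather == "Cloudy":
        return "Yes"                                  # Cloudy → always Play
    if weather == "Sunny":
        return "Yes" if humidity == "Normal" else "No"
    if weather == "Rainy":
        return "Yes" if wind == "Weak" else "No"

print(play("Sunny", humidity="High"))  # → No
print(play("Cloudy"))                  # → Yes
print(play("Rainy", wind="Weak"))      # → Yes
```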
Supervised Learning vs. Unsupervised Learning:
1. A supervised learning model predicts the output, while an unsupervised
learning model finds hidden patterns in the data.
2. Supervised learning needs supervision (labelled data) to train the model;
unsupervised learning does not need any supervision.
3. Supervised learning is used when we know both the inputs and the
corresponding outputs; unsupervised learning is used when we have only
input data and no corresponding output data.
4. Supervised learning is not close to true Artificial Intelligence, as we first
train the model on labelled data and only then can it predict the correct
output; unsupervised learning is closer to true Artificial Intelligence, as it
learns the way a child learns daily-routine things from experience.
Write Short note on: (a) Cross Validation (b) Inductive bias
Cross Validation
Cross-validation is a method used to check how well our machine
learning model will work on new, unseen data.
When we train a model, it learns from the given data. But training alone is
not enough; we also need to test whether the model performs well on data it
has never seen before.
In k-fold cross-validation (say k = 5), we split the data into 5 equal parts,
train the model on 4 parts, and test it on the remaining part.
We repeat this process 5 times, changing the test part each time.
Finally, we take the average of all the test results to understand how
well the model is performing overall.
This helps us to:
- Avoid the mistake of testing the model on the same data it learned from.
- Get a better idea of how the model will work in real life.
In short, cross-validation makes sure our model is not just memorizing the
training data but is actually learning useful patterns.
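The five-fold procedure can be sketched in plain Python. The 1-nearest-neighbour "model" and the toy data here are hypothetical stand-ins, just to have something to train and score:

```python
# Minimal 5-fold cross-validation sketch in pure Python.
# Toy data: feature x in 0..9, label 1 when x > 5 (hypothetical).
data = [(x, int(x > 5)) for x in range(10)]

def predict(train, x):
    # 1-nearest-neighbour: copy the label of the closest training point.
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

k = 5
fold_size = len(data) // k
scores = []
for i in range(k):
    # Fold i is the test set; everything else is the training set.
    test = data[i * fold_size:(i + 1) * fold_size]
    train = data[:i * fold_size] + data[(i + 1) * fold_size:]
    correct = sum(predict(train, x) == y for x, y in test)
    scores.append(correct / len(test))

print(scores, sum(scores) / k)  # per-fold accuracies and their average
```

Note how every data point gets used for testing exactly once, so the average score reflects the whole dataset, not one lucky split.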
Inductive bias
When an ML model picks a solution from the hypothesis space, it needs some
guidance; it cannot try every possible solution (especially if the hypothesis
space is infinite). That guidance is called the inductive bias.
Example: Suppose you have to guess a friend's favourite colour.
- If you assume they like bright colours, you'll guess yellow, red, etc.
- If you assume they like neutral shades, you'll guess white, grey, etc.
That assumption is your inductive bias. It helps you guess faster, without
trying every possible colour in the world.
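The same idea can be shown in code (hypothetical data): restricting the hypothesis space to straight lines through the origin is itself an inductive bias, because it lets us search one parameter instead of every possible function:

```python
# Inductive bias as a restriction on the hypothesis space (sketch).
# Bias assumed here: the relationship is linear, y = w * x.
data = [(1, 2), (2, 4), (3, 6)]  # hypothetical (x, y) pairs

def loss(w):
    # Squared error of the linear hypothesis with slope w.
    return sum((y - w * x) ** 2 for x, y in data)

# Thanks to the linear bias, the "search" is over a handful of slopes,
# not over all conceivable functions.
candidates = [0.5, 1.0, 1.5, 2.0, 2.5]
best_w = min(candidates, key=loss)
print(best_w)  # → 2.0, the slope that fits y = 2x exactly
```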
Explain Logistic Regression with example
Logistic regression is a supervised machine learning algorithm used
for classification tasks.
📊 Example:
Let’s say we want to predict whether a student will pass an exam or not based
on hours of study.
Hours of Study   Output: Pass (Yes/No)
1                No
2                No
3                Yes
4                Yes
After training, if a new student studied 2.5 hours, logistic regression might give
a probability like 0.65.
→ Since 0.65 > 0.5, we classify the result as “Yes” (student will pass).
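A minimal from-scratch sketch of logistic regression fitted to the table above with plain gradient descent; the learning rate and epoch count are arbitrary choices, not tuned values:

```python
import math

# Logistic regression on the hours-studied data, trained from scratch.
X = [1.0, 2.0, 3.0, 4.0]
y = [0, 0, 1, 1]          # 0 = fail, 1 = pass

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b = 0.0, 0.0
lr = 0.1                  # learning rate (arbitrary choice)
for _ in range(2000):     # number of epochs (arbitrary choice)
    for xi, yi in zip(X, y):
        p = sigmoid(w * xi + b)
        w -= lr * (p - yi) * xi   # gradient of the log-loss w.r.t. w
        b -= lr * (p - yi)        # gradient of the log-loss w.r.t. b

for hours in (1.5, 3.5):
    p = sigmoid(w * hours + b)
    print(hours, round(p, 2), "Pass" if p > 0.5 else "Fail")
```

The learned decision boundary falls between 2 and 3 hours, so probabilities below it come out under 0.5 ("Fail") and above it over 0.5 ("Pass"), matching the threshold rule described above.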
With respect to SVM explain the terms: 1) Hyperplane 2)
Maximum Margin Hyperplane (MMH) 3) Support Vector.
1) Hyperplane
A hyperplane is simply a decision boundary created by the SVM algorithm to
separate data into different classes.
The main goal of the hyperplane is to separate data points of one class from
those of another as cleanly as possible.
📌 Example:
Imagine we are trying to classify fruits as either apples or oranges based on
their weight and colour. The hyperplane would be the line (in 2D) or plane (in
3D) that divides apples on one side from oranges on the other.
2) Maximum Margin Hyperplane (MMH)
Among all possible hyperplanes, SVM chooses the one with the maximum
distance (margin) to the closest data points from both classes. This special
hyperplane is called the Maximum Margin Hyperplane.
3) Support Vectors
Support vectors are the most important data points in SVM. These are the
points from each class that are closest to the hyperplane.
The MMH is created in such a way that it lies exactly halfway between the
support vectors of both classes.
📌 Example:
In our fruit example, the apple and orange that are nearest to the separating
line are support vectors. The entire decision boundary depends on them.
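All three terms can be illustrated with a tiny one-dimensional sketch (the fruit weights below are hypothetical): in 1-D a hyperplane is just a threshold, the support vectors are the closest points of each class, and the maximum-margin threshold sits exactly halfway between them:

```python
# 1-D SVM intuition sketch with hypothetical fruit weights in grams.
apples = [110, 120, 130]   # class A
oranges = [160, 170, 180]  # class B

sv_apple = max(apples)     # the apple closest to the boundary (support vector)
sv_orange = min(oranges)   # the orange closest to the boundary (support vector)

boundary = (sv_apple + sv_orange) / 2   # maximum-margin "hyperplane" (threshold)
margin = sv_orange - sv_apple           # width of the gap between the classes

print(boundary, margin)  # → 145.0 30
```

Notice that only the two support vectors determine the boundary; moving any other apple or orange (without crossing them) changes nothing, which is exactly the point made above.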