ML Questions
Module 1 - Introduction to Machine Learning
Important Topics
Types of Machine Learning
Issues in Machine Learning
Applications of Machine Learning
Steps in developing a Machine Learning Application
Training Error, Generalization Error, Overfitting, Underfitting,
Bias-Variance Trade-off
Important questions
What are the issues and challenges in Machine Learning?
Explain the steps of developing Machine Learning applications.
Explain any five applications of Machine Learning.
Explain the steps required for selecting the right ML algorithm. How do you
choose the right ML algorithm?
Explain any five business applications of Machine learning.
Explain Machine Learning and its types.
Module 2 - Learning with Regression and Trees
Important Topics
Linear Regression
Multivariate Linear Regression
Logistic Regression
Decision Trees
Constructing Decision Trees using Gini Index
Classification and Regression Trees (CART)
Important questions
Explain Regression line, Scatter plot, Error in prediction and Best fitting line.
Explain Linear regression along with an example + Numerical
Explain Logistic Regression.
Write a note on Decision Tree + Numerical
Create a decision tree using Gini Index to classify the following dataset.
2022 Dec
Sr No – 1 2 3 4 5 6 7 8 9 10 11 12
Income –
Age –
Own Car –
Explain Gini Index along with an example.
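As a rough illustration of the Gini Index calculation, the sketch below computes the impurity of a node and the weighted impurity of a candidate split. The class counts are hypothetical and are not taken from any question paper; the function names gini_impurity and weighted_gini are likewise only illustrative.

```python
def gini_impurity(class_counts):
    """Gini impurity of one node: 1 - sum(p_k^2) over the class proportions."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


def weighted_gini(branches):
    """Gini of a split: branch impurities weighted by branch size."""
    total = sum(sum(branch) for branch in branches)
    return sum(sum(branch) / total * gini_impurity(branch) for branch in branches)


# Hypothetical counts (assumption): 12 records, 6 "Yes" and 6 "No".
parent = [6, 6]
# Hypothetical split on some attribute: each branch listed as [Yes, No].
split = [[5, 1], [1, 5]]

parent_gini = gini_impurity(parent)
split_gini = weighted_gini(split)
print("Gini before split:", round(parent_gini, 4))
print("Gini after split: ", round(split_gini, 4))
```

When building the tree, the attribute whose split gives the lowest weighted Gini is normally chosen as the next node.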
Consider the example below where the mass, 𝑦 (grams), of a chemical is
related to the time, 𝑥 (seconds), for which the chemical reaction has been
taking place according to the table. Find the equation of the regression line.
Also explain performance evaluation measures for regression. 2024 May
Time – 5 7 12 16 20
Mass – 40 120 180 210 240
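The numerical above can be checked with a short least-squares sketch. It fits y = a + b·x to the Time and Mass values in the table and then computes a few commonly used regression measures (MAE, MSE, RMSE and R²). The choice of these four measures and the function names are assumptions made for illustration, not something the question paper specifies.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def regression_metrics(xs, ys, slope, intercept):
    """MAE, MSE, RMSE and R^2 of the fitted line on the given points."""
    n = len(xs)
    errors = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    mse = sum(e * e for e in errors) / n
    mean_y = sum(ys) / n
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "mse": mse,
        "rmse": mse ** 0.5,
        "r2": 1.0 - sum(e * e for e in errors) / ss_tot,
    }


time_s = [5, 7, 12, 16, 20]        # x, seconds (from the table)
mass_g = [40, 120, 180, 210, 240]  # y, grams (from the table)

slope, intercept = fit_line(time_s, mass_g)
metrics = regression_metrics(time_s, mass_g, slope, intercept)
print(f"y = {intercept:.3f} + {slope:.3f} x")
print({name: round(value, 3) for name, value in metrics.items()})
```

By hand, the means are 12 and 158, Sxy = 1880 and Sxx = 154, so the slope is 1880/154 ≈ 12.208 and the intercept is 158 − 12.208 × 12 ≈ 11.506.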
Module 3 - Ensemble Learning
Important Topics
Understanding Ensembles
K-fold cross-validation
Boosting
Stumping
Bagging
Subagging
Random Forest
XGBoost
Different ways to combine classifiers
Important questions
Explain the Random Forest algorithm in detail.
Explain the different ways to combine the classifiers.
Explain the concept of bagging and boosting.
Explain the necessity of cross-validation in ML applications.
Explain the concept of k-fold cross-validation.
Compare Bagging and Boosting with reference to ensemble learning. Explain
how these methods help to improve the performance of the machine learning
model.
Explain the ensemble learning algorithm Random Forest and its use cases in
real-world applications.
Module 4 - Support Vector Machine
Important Topics
Constrained Optimization
Optimal Decision Boundary
Margins and Support Vectors
SVM as a Constrained Optimization Problem
Quadratic Programming
SVM for linear and non-linear classification
Kernel Trick
Support Vector Regression
Multiclass Classification
Important questions
Explain Multiclass Classification.
Explain the concept of margin and support vector.
Define Support Vector Machine.
Explain how margin is computed and optimal hyper-plane is decided.
Explain the support vector machine as a constrained optimization problem.
Define the following terminologies with reference to Support vector machine:
Hyperplane, Support Vectors, Hard Margin, Soft Margin, Kernel
Explain the kernel trick in Support Vector Machine.
Consider the use case of email spam detection. Identify and explain the
suitable machine learning techniques for this task.
Module 5 - Learning with Clustering
Important Topics
Introduction to clustering with an overview of distance metrics and
major clustering approaches
Graph-based Clustering: Clustering with minimal spanning tree
Model-based Clustering: Expectation Maximization Algorithm
Density-based Clustering: DBSCAN
Important questions
Explain the distance metrics used in clustering.
Explain clustering with minimal spanning tree with reference to graph-based
clustering.
Explain the EM algorithm.
Explain the K-means algorithm + Numerical
Explain the concept of the Expectation Maximization algorithm.
Write a detailed note on DBSCAN.
Explain the DBSCAN algorithm along with an example.
What is density-based clustering? Explain the steps used for the clustering task
using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
algorithm.
Explain Clustering with minimal spanning tree along with an example.
Module 6 - Dimensionality Reduction
Important Topics
Dimensionality Reduction Techniques
Principal Component Analysis
Linear Discriminant Analysis
Singular Value Decomposition
Important questions
Compute the Linear Discriminant projection for the following two-dimensional
dataset. 2022 Dec
X1 = (x1,x2) = {(4,1), (2,4), (2,3), (3,6), (4,4)}
X2 = (x1,x2) = {(9,10), (6,8), (9,5), (8,7), (10,8)}
(Datasets provided in the question papers.)
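A possible way to check the LDA numerical is sketched below, treating the two point sets listed above as class 1 and class 2. It uses the usual two-class Fisher direction, w proportional to Sw⁻¹(μ1 − μ2), where Sw is the sum of the two within-class scatter matrices. The helper names are illustrative, and the result is normalised to unit length, which is a convention chosen here rather than one required by the question.

```python
import math


def mean_vec(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points, mu):
    """2x2 within-class scatter: sum of (x - mu)(x - mu)^T, returned as (sxx, sxy, syy)."""
    sxx = sum((p[0] - mu[0]) ** 2 for p in points)
    sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points)
    syy = sum((p[1] - mu[1]) ** 2 for p in points)
    return sxx, sxy, syy


def lda_direction(class1, class2):
    """Unit vector along Sw^-1 (mu1 - mu2) for two 2-D classes."""
    mu1, mu2 = mean_vec(class1), mean_vec(class2)
    s1, s2 = scatter(class1, mu1), scatter(class2, mu2)
    a, b, c = (s1[0] + s2[0], s1[1] + s2[1], s1[2] + s2[2])  # Sw = [[a, b], [b, c]]
    det = a * c - b * b
    dx, dy = mu1[0] - mu2[0], mu1[1] - mu2[1]
    # Multiply the 2x2 inverse of Sw by the mean difference.
    wx = (c * dx - b * dy) / det
    wy = (-b * dx + a * dy) / det
    norm = math.hypot(wx, wy)
    return wx / norm, wy / norm


class1 = [(4, 1), (2, 4), (2, 3), (3, 6), (4, 4)]
class2 = [(9, 10), (6, 8), (9, 5), (8, 7), (10, 8)]

w = lda_direction(class1, class2)
proj1 = [w[0] * x + w[1] * y for x, y in class1]
proj2 = [w[0] * x + w[1] * y for x, y in class2]
print("projection direction:", tuple(round(v, 4) for v in w))
print("class 1 projections:", [round(v, 2) for v in proj1])
print("class 2 projections:", [round(v, 2) for v in proj2])
```

The sign of w is arbitrary: flipping both components gives the same projection line, so a hand solution may report the direction with the opposite sign.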
Write a detailed note on Principal Component Analysis for Dimension
Reduction.
Find the SVD of the matrix A = [[2, 2], [-1, 1]] (rows listed in order).
(Matrix provided in the question paper.)
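The SVD numerical can be verified with the sketch below, assuming the rows of A are (2, 2) and (-1, 1). It follows the usual hand method: take the eigenvalues and eigenvectors of AᵀA to get the singular values and V, then obtain each column of U as A·v/σ. The function svd_2x2 is a hypothetical helper written for this 2×2 case only; it assumes A has full rank and that AᵀA has a non-zero off-diagonal entry.

```python
import math


def svd_2x2(a):
    """SVD of a full-rank 2x2 matrix via the eigen-decomposition of A^T A.

    Returns (us, sigmas, vs), where us[k] and vs[k] are the k-th columns of
    U and V, so that A = sum_k sigmas[k] * us[k] * vs[k]^T.
    """
    (p, q), (r, s) = a
    # Entries of the symmetric matrix A^T A = [[m11, m12], [m12, m22]].
    m11 = p * p + r * r
    m12 = p * q + r * s
    m22 = q * q + s * s
    if abs(m12) < 1e-12:
        raise ValueError("sketch assumes A^T A has a non-zero off-diagonal entry")
    half_trace = (m11 + m22) / 2.0
    disc = math.sqrt(((m11 - m22) / 2.0) ** 2 + m12 ** 2)
    eigenvalues = [half_trace + disc, max(half_trace - disc, 0.0)]

    us, sigmas, vs = [], [], []
    for lam in eigenvalues:
        sigma = math.sqrt(lam)
        if sigma < 1e-12:
            raise ValueError("sketch assumes A has full rank")
        # (m12, lam - m11) solves (A^T A - lam I) v = 0 when m12 != 0.
        vx, vy = m12, lam - m11
        norm = math.hypot(vx, vy)
        vx, vy = vx / norm, vy / norm
        # Left singular vector: u = A v / sigma.
        us.append(((p * vx + q * vy) / sigma, (r * vx + s * vy) / sigma))
        sigmas.append(sigma)
        vs.append((vx, vy))
    return us, sigmas, vs


A = [[2.0, 2.0], [-1.0, 1.0]]
us, sigmas, vs = svd_2x2(A)

# Rebuild A from the factors as a sanity check.
rebuilt = [
    [sum(sigmas[k] * us[k][i] * vs[k][j] for k in range(2)) for j in range(2)]
    for i in range(2)
]
print("singular values:", [round(x, 4) for x in sigmas])
print("U columns:", [tuple(round(x, 4) for x in u) for u in us])
print("V columns:", [tuple(round(x, 4) for x in v) for v in vs])
print("rebuilt A:", [[round(x, 4) for x in row] for row in rebuilt])
```

For this matrix AᵀA = [[5, 3], [3, 5]], with eigenvalues 8 and 2, so the singular values are 2√2 and √2. The signs of matching columns of U and V can be flipped together without changing the product.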
What is dimensionality reduction? Explain how it can be utilized for
classification and clustering tasks in Machine learning.
Explain the Dimensionality reduction technique Linear Discriminant Analysis
and its real-world applications.
Explain Linear Discriminant Analysis for Dimension Reduction + Numerical