ML Question Papers
Paper / Subject Code: 42171 / MACHINE LEARNING
Duration: 3hrs [Max Marks: 80]
B Compare Bagging and Boosting with reference to ensemble learning. Explain how these [10]
methods help to improve the performance of the machine learning model.
Q3 A Consider the example below where the mass, y (grams), of a chemical is related to the [10]
time, x (seconds), for which the chemical reaction has been taking place according to the
table. Find the equation of the regression line. Also explain performance evaluation
measures for regression.
Time, x(seconds) 7 12 16 20
Mass, y(grams) 40 120 180 210 240
B What is Density based clustering? Explain the steps used for clustering task using [10]
A Explain Clustering with minimal spanning tree along with example. [10]
B Consider the dataset given below with 3 features Color, Wig, Num. Ears and one output [10]
variable Emotion
G G G B R R R
Color B R
Wig N
B 2 3
i) Find the root node of the decision tree using GINI index.
ii) Explain the techniques that can be used to handle overfitting in decision trees.
A Consider the use case of Email spam detection. Identify and explain the suitable machine [10]
learning technique for this task.
B Explain the Dimensionality reduction technique Linear Discriminant Analysis and its [10]
real-world applications.
Q6 A Define the following terminologies with reference to Support Vector Machine: [10]
Hyperplane, Support Vectors, Hard Margin, Soft Margin, Kernel
B Explain Ensemble learning algorithm Random Forest and its use cases in real world [10]
applications.
57037 Page 1 of 1
Paper / Subject Code: 42171 / MACHINE LEARNING
B.E. SEM VII / COMP / C SCHEME / DEC 2023 / 26.12.2023
Time: 3hrs [Total Marks: 80]
N.B. : (1) Question No 1 is Compulsory.
(2) Attempt any three questions out of the remaining five.
(3) Assume suitable data, if required and state it clearly.
Q1 Attempt any FOUR from the following [20]
A Explain how to choose the right algorithm for machine learning application.
B Explain Linear Discriminant Analysis.
C Explain any five performance measures along with example.
D Differentiate between Logistic regression and Support vector machine.
E Explain the following: Receiver operating characteristic curve and Area under curve.
Q2 A Explain clustering with minimal spanning tree with reference to Graph based clustering. [10]
B Explain the terms overfitting, underfitting, bias & variance tradeoff w.r.t. Machine Learning. [10]
Q3 A Explain the concept of regression and enlist its types. A clinical trial gave the data for BMI [10]
and Cholesterol level for 10 Patients as shown in table below. Identify the machine learning
method used to solve the above problem and predict the likely value of Cholesterol level for
someone who has BMI of 27.
BMI 17 21 24 28 14 16 22 15 18
Cholesterol 140 189 210 240 130 100 35 166 130 170
B Explain the necessity of cross validation in Machine learning applications and K-fold cross [10]
validation in detail.
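As a rough illustration of the K-fold idea asked about above, the sketch below splits sample indices into k contiguous folds and uses each fold once as the test set. The function name `k_fold_indices` and the choice of 10 samples with k = 5 are assumptions made only for this example; real pipelines usually shuffle the data first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical setting: 10 samples, 5 folds.
folds = list(k_fold_indices(10, 5))
for train, test in folds:
    print("test:", test, "train:", train)
```

Every sample lands in exactly one test fold, so the averaged score uses all of the data for both training and validation.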
Q4 A Explain support vector machine as a constrained optimization problem. [10]
B Explain the concept of decision tree. Consider the dataset given in a table below. The dataset [10]
has 3 features as Past Trend, Open interest, Trading volume and one class label as Return.
Compute the Gini Index for all features and specify which node will be chosen as a root node
in decision tree.
Past Trend   Open Interest   Trading Volume   Return
Positive     Low             High             Up
Negative     High            Low              Down
Positive     Low             High             Up
Positive     High            High             Up
Negative     Low             High             Down
Positive     Low             Low              Down
Negative     High            High             Down
Negative     Low             High             Down
Positive     Low             Low              Down
Positive     High            High             Up
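A minimal sketch of the Gini computation for the table above, assuming the usual weighted Gini index of a multiway split (the weighted average of the impurity of each attribute value). The helper names `gini` and `weighted_gini` are chosen here for illustration; the feature with the lowest weighted Gini would be taken as the root.

```python
from collections import Counter

# Rows copied from the table: (Past Trend, Open Interest, Trading Volume, Return)
ROWS = [
    ("Positive", "Low", "High", "Up"),
    ("Negative", "High", "Low", "Down"),
    ("Positive", "Low", "High", "Up"),
    ("Positive", "High", "High", "Up"),
    ("Negative", "Low", "High", "Down"),
    ("Positive", "Low", "Low", "Down"),
    ("Negative", "High", "High", "Down"),
    ("Negative", "Low", "High", "Down"),
    ("Positive", "Low", "Low", "Down"),
    ("Positive", "High", "High", "Up"),
]
FEATURES = ["Past Trend", "Open Interest", "Trading Volume"]


def gini(labels):
    """Gini impurity 1 - sum(p_i^2) of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def weighted_gini(rows, col):
    """Weighted Gini index after splitting on every value of column `col`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row[-1])
    return sum(len(g) / len(rows) * gini(g) for g in groups.values())


scores = {name: weighted_gini(ROWS, i) for i, name in enumerate(FEATURES)}
root = min(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: {score:.4f}")
print("root:", root)
```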
Q5 A Explain kernel Trick in support vector machine. [10]
B Explain different ways to combine classifiers. [10]
Q6 Write any TWO from the following [20]
A Explain multiclass classification techniques.
B Explain in detail Principal Component Analysis for Dimensionality reduction
C Explain DBSCAN algorithm along with example
38397 Page 1 of 1
Paper / Subject Code: 42171 / MACHINE LEARNING
No hardware Up
no software Up
B Find: SVD [10]
Paper / Subject Code: 42171 / MACHINE LEARNING
3.Con
Duration: 3hrs [Max Marks:80]
line.
c. Explain the concept of margin and support vector.
d. Explain the distance metrics used in clustering.
e. Explain Logistic Regression
Q3. a. Create a decision tree using Gini Index to classify following dataset.
Sr. No.   Income      Age      Own Car
1         Very High   Young    Yes
2         High        Medium   Yes
3         Low         Young    No
4         High        Medium   Yes
5         Very High   Medium   Yes
6         Medium      Young    Yes
7         High        Old      Yes
8         Medium      Medium   No
9         Low         Medium   No
10        Low         Old      No
11        High        Young    Yes
12        Medium      Old      No
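The following is a sketch of one way to grow a tree for the table above, assuming multiway splits chosen by the lowest weighted Gini index and recursion until a node is pure. The function names (`gini`, `split`, `split_gini`, `build`) and the nested-dictionary tree representation are illustrative choices, not something the paper prescribes.

```python
from collections import Counter

# (Income, Age, Own Car) rows as transcribed from the table, in Sr. No. order.
DATA = [
    ("Very High", "Young", "Yes"),
    ("High", "Medium", "Yes"),
    ("Low", "Young", "No"),
    ("High", "Medium", "Yes"),
    ("Very High", "Medium", "Yes"),
    ("Medium", "Young", "Yes"),
    ("High", "Old", "Yes"),
    ("Medium", "Medium", "No"),
    ("Low", "Medium", "No"),
    ("Low", "Old", "No"),
    ("High", "Young", "Yes"),
    ("Medium", "Old", "No"),
]
NAMES = ["Income", "Age"]


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split(rows, col):
    """Group rows by the value they take in column `col`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row)
    return groups


def split_gini(rows, col):
    """Weighted Gini index of the multiway split on column `col`."""
    return sum(
        len(g) / len(rows) * gini([r[-1] for r in g])
        for g in split(rows, col).values()
    )


def build(rows, cols):
    """Return a class label for a leaf, or {feature: {value: subtree}}."""
    labels = [r[-1] for r in rows]
    if len(set(labels)) == 1 or not cols:
        return Counter(labels).most_common(1)[0][0]
    best = min(cols, key=lambda c: split_gini(rows, c))
    rest = [c for c in cols if c != best]
    return {NAMES[best]: {v: build(g, rest) for v, g in split(rows, best).items()}}


tree = build(DATA, [0, 1])
print(tree)
```

Under these assumptions the root split is on Income, and only the Medium income branch needs a further split on Age.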
Q5. a. Compute the Linear Discriminant projection for the following two-dimensional [10]
dataset. X1 = (x1, x2) = {(4,1), (2,4), (2,3), (3,6), (4,4)} and
X2 = (x1, x2) = {(9,10), (6,8), (9,5), (8,7), (10,8)}
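A hedged sketch of the two-class Fisher discriminant for this dataset, assuming the standard direction w = Sw^-1 (mu1 - mu2), where Sw is the sum of the two class scatter matrices. It also assumes the garbled last point of X1 reads (4, 4) and that the stray periods between points are commas. All variable names are illustrative.

```python
# Class samples; the last point of X1 is read as (4, 4) (assumption).
X1 = [(4, 1), (2, 4), (2, 3), (3, 6), (4, 4)]
X2 = [(9, 10), (6, 8), (9, 5), (8, 7), (10, 8)]


def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points, mu):
    """Entries (sxx, sxy, syy) of the scatter matrix sum((x - mu)(x - mu)^T)."""
    sxx = sum((p[0] - mu[0]) ** 2 for p in points)
    sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points)
    syy = sum((p[1] - mu[1]) ** 2 for p in points)
    return sxx, sxy, syy


mu1, mu2 = mean(X1), mean(X2)
a1, b1, c1 = scatter(X1, mu1)
a2, b2, c2 = scatter(X2, mu2)
# Within-class scatter Sw = [[a, b], [b, c]]
a, b, c = a1 + a2, b1 + b2, c1 + c2

# w = Sw^-1 (mu1 - mu2), using the closed-form inverse of a 2x2 matrix.
dx, dy = mu1[0] - mu2[0], mu1[1] - mu2[1]
det = a * c - b * b
w = ((c * dx - b * dy) / det, (-b * dx + a * dy) / det)

# Only the direction matters, so report the unit vector.
norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
w_unit = (w[0] / norm, w[1] / norm)
print("mu1 =", mu1, "mu2 =", mu2)
print("projection direction =", w_unit)
```

The sign of the vector depends on the order of the class means; projecting onto w or onto -w gives the same separation.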
b. Explain EM algorithm. [10]
Paper / Subject Code: 52701 / Elective- III ) Machine Learning
B.E. (COMPUTER) (SEM VII) (CBSGS) DEC 2019 / 04.12.2019
Time: 03 Hours Marks: 80
Note: 1. Question 1 is compulsory
2. Answer any three out of remaining five questions.
3. Assume any suitable data wherever required and justify the same.
Q2 a) Compare and contrast Linear and Logistic regressions with respect to their [10]
mechanisms of prediction.
b) Consider the 2-D dataset given in the table below. Construct an SVM classifier model. [10]
Given (2, 1), (2, -1) and (4, 0) as support vectors, estimate the parameters of the
model and classify (4, 2). Why is SVM called an optimal binary hyperplane
classifier?
(X1, X2)  (1, -1)  (2, -1)  (6, -1)  (4, 0)  (6, 0)  (1, 1)  (2, 1)  (5, 1)
Class     C1       C1       C2       C2      C2      C1       C1      C2
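One possible way to estimate the parameters is sketched below. It assumes a linear hard-margin model in which each support vector lies exactly on its margin, so w1*x1 + w2*x2 + b = -1 for class C1 and +1 for class C2; the -1/+1 encoding of C1/C2 and the helper names `solve` and `classify` are assumptions of this example.

```python
from fractions import Fraction

# Support vectors from the question; C1 -> -1 and C2 -> +1 is an assumed encoding.
SUPPORT = [((2, 1), -1), ((2, -1), -1), ((4, 0), +1)]


def solve(A, y):
    """Solve the square system A z = y by Gauss-Jordan elimination (exact)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(t)] for row, t in zip(A, y)]
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]


# One margin equation per support vector: w1*x1 + w2*x2 + b = label
A = [[x1, x2, 1] for (x1, x2), _ in SUPPORT]
y = [label for _, label in SUPPORT]
w1, w2, b = solve(A, y)


def classify(point):
    """Return the class on whose side of the hyperplane the point falls."""
    score = w1 * point[0] + w2 * point[1] + b
    return "C2" if score >= 0 else "C1"


print("w =", (w1, w2), "b =", b)
print("(4, 2) ->", classify((4, 2)))
```

With these assumptions the separating hyperplane is the vertical line x1 = 3, which lies midway between the closest points of the two classes.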
Q3 a) You are given a data set on cancer detection. You have built a classification model [10]
and achieved an accuracy of 96%. Why shouldn't you be happy with your model
performance? What can you do about it?
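One common reason to distrust such a figure is class imbalance. The sketch below uses a purely hypothetical test set (1000 patients, 40 with cancer) and a degenerate model that always predicts "no cancer"; the numbers are invented for illustration and are not taken from the question.

```python
# Hypothetical test set: 1000 patients, 40 of whom have cancer (label 1).
y_true = [1] * 40 + [0] * 960
# A degenerate model that predicts "no cancer" (label 0) for everyone.
y_pred = [0] * 1000

pairs = list(zip(y_true, y_pred))
tp = sum(t == 1 and p == 1 for t, p in pairs)  # cancers caught
tn = sum(t == 0 and p == 0 for t, p in pairs)  # healthy, correctly cleared
fp = sum(t == 0 and p == 1 for t, p in pairs)  # false alarms
fn = sum(t == 1 and p == 0 for t, p in pairs)  # cancers missed

accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn) if tp + fn else 0.0
precision = tp / (tp + fp) if tp + fp else 0.0

print(f"accuracy={accuracy:.2f} recall={recall:.2f} precision={precision:.2f}")
```

The model reaches 96% accuracy while missing every cancer case, which is why recall, precision, F1 or the ROC curve are usually reported for such data.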
b) What is a HMM? What are the issues in Hidden Markov Model (HMM)? [10]
Q4 a) You came to know that your model is suffering from low bias and high variance. [10]
Which algorithm should you use to tackle it? Why?
b) Differentiate between simple linkage, average linkage and complete linkage [10]
algorithms. Use complete linkage algorithm to find the clusters from the following
dataset.
24 24
y 4 48 4 12
Q5 a) Draw the block diagram of Error Back Propagation Algorithm and explain with flow [10]
chart the concept of Back Propagation.
b) The following table consists of training data from an employee database. The data [10]
have been generalized. For example, "31...35" for age represents the age range of
31 to 35. For a given row entry, count represents the number of data tuples having
the values for department, status, age, and salary given in that row. Let the status be
the class-label attribute.
(i) Design a multilayer feed-forward neural network for the given data. Label
the nodes in the input and output layers.
(ii) Using the multilayer feed-forward neural network obtained in (i), show the
weight values after one iteration of the back propagation algorithm, given the
training instance "(sales, senior, 31...35, 46K...50K)".
Assume initial weight values and biases. Assume learning rate to be 0.9. Use binary
input and draw (one input layer, one output layer and one hidden layer) neural
network. Solve the problem for one epoch.
department status age salary count
systems 4145
marketing 40
marketing 31 33
secretary 46 ... 50 36K 405
secretary 26
Page 2 of 2
Paper / Subject Code: 52701 / Elective- III 1) Machine Learning
Q2 a) Use the k-means clustering algorithm and Euclidean distance to cluster the following [10]
eight examples into three clusters:
A1 = (2, 10), A2 = (2, 5), A3 = (8, 4), A4 = (5, 8), A5 = (7, 5), A6 = (6, 4), A7 = (1, 2),
A8 = (4, 9). Find the new centroid at every new point entry into the cluster group.
Assume initial cluster centers as A1, A4 and A7.
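For checking a hand computation, here is a sketch of the batch (Lloyd) form of k-means on these eight points. It is an assumption of this example that the stray "1" before the coordinates of A7 is an OCR artifact, so A7 is read as (1, 2). Note that the question asks for the centroid to be updated after every single point entry, which is a sequential variant; the batch version below recomputes centroids only after all points are assigned, so intermediate centroids can differ.

```python
import math

POINTS = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}


def kmeans(points, centers, max_iter=100):
    """Batch k-means: assign every point, then recompute all centroids."""
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for name, p in points.items():
            nearest = min(
                range(len(centers)), key=lambda i: math.dist(p, centers[i])
            )
            clusters[nearest].append(name)
        new_centers = [
            tuple(sum(points[n][d] for n in c) / len(c) for d in range(2))
            if c else centers[i]  # keep the old center if a cluster is empty
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return clusters, centers


initial = [POINTS["A1"], POINTS["A4"], POINTS["A7"]]
clusters, centers = kmeans(POINTS, initial)
for members, center in zip(clusters, centers):
    print(members, "centroid:", tuple(round(v, 2) for v in center))
```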
b) Compare and contrast Linear and Logistic regressions with respect to their [10]
mechanisms of prediction.
Q3 a) Find predicted value of Y for one epoch and RMSE using Linear regression. [10]
X    Y-Actual
2
3
4    6
5    9
6    11
7    13
8    15
9    17
10   20
b) Find the new revised theta for the given problem using Expectation-Maximization [10]
Algorithm for one epoch.
1  H T T T H H T H T H
2  H H H H H T H H H H
3  H T H H H H H T H
4  H T H T T T H H T
5  T H H H T H H H T H
θA = 0.6 and θB = 0.5
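A sketch of a single EM iteration for this kind of two-coin problem, under the usual assumptions that each row was produced by one of two coins A and B chosen with equal prior probability, and that theta is the probability of heads. The toss sequences are typed in exactly as they are legible here; rows 3 and 4 show only nine tosses, so the result may differ from the original paper if a toss was lost in scanning. The function name `em_step` is illustrative.

```python
# Sequences as legible in the source; rows 3 and 4 show only nine tosses.
TOSSES = [
    "HTTTHHTHTH",
    "HHHHHTHHHH",
    "HTHHHHHTH",
    "HTHTTTHHT",
    "THHHTHHHTH",
]


def em_step(tosses, theta_a, theta_b):
    """One EM iteration; returns the revised (theta_a, theta_b)."""
    heads_a = tails_a = heads_b = tails_b = 0.0
    for seq in tosses:
        h, t = seq.count("H"), seq.count("T")
        # E-step: responsibility of each coin for this row (equal priors,
        # the binomial coefficient cancels in the ratio).
        like_a = theta_a ** h * (1 - theta_a) ** t
        like_b = theta_b ** h * (1 - theta_b) ** t
        resp_a = like_a / (like_a + like_b)
        resp_b = 1.0 - resp_a
        heads_a += resp_a * h
        tails_a += resp_a * t
        heads_b += resp_b * h
        tails_b += resp_b * t
    # M-step: expected heads divided by expected tosses for each coin.
    return heads_a / (heads_a + tails_a), heads_b / (heads_b + tails_b)


theta_a, theta_b = em_step(TOSSES, 0.6, 0.5)
print(f"theta_A = {theta_a:.3f}, theta_B = {theta_b:.3f}")
```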
69459 Page 1 of 2
Q4 a) For the given set of points identify clusters using single linkage and draw the [10]
dendrogram with cluster separation line emerging at 1.3. Find how many clusters are
formed below the line?
Dist  A     B     C     D     E     F
A     0.00  0.71  5.66  3.61  4.24  3.20
B     0.71  0.00  4.95  2.92  3.54  2.50
C     5.66  4.95  0.00  2.24  1.41  2.50
D     3.61  2.92  2.24  0.00  1.00  0.50
E     4.24  3.54  1.41  1.00  0.00  1.12
F     3.20  2.50  2.50  0.50  1.12  0.00
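The merging order can be checked with a small agglomerative sketch. It assumes the matrix is the symmetric 6 x 6 table for points A to F (the unlabeled fourth row is taken to be D and the sixth column F), and it stops merging as soon as the closest pair of clusters is farther apart than the cut line. The function name `single_linkage` is illustrative.

```python
LABELS = ["A", "B", "C", "D", "E", "F"]
# Distance matrix from the question (row D and column F labels are assumed).
DIST = [
    [0.00, 0.71, 5.66, 3.61, 4.24, 3.20],
    [0.71, 0.00, 4.95, 2.92, 3.54, 2.50],
    [5.66, 4.95, 0.00, 2.24, 1.41, 2.50],
    [3.61, 2.92, 2.24, 0.00, 1.00, 0.50],
    [4.24, 3.54, 1.41, 1.00, 0.00, 1.12],
    [3.20, 2.50, 2.50, 0.50, 1.12, 0.00],
]


def single_linkage(labels, dist, cut):
    """Merge closest clusters (minimum pairwise distance) up to `cut`."""
    clusters = [[i] for i in range(len(labels))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist[p][q] for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > cut:
            break  # every remaining merge lies above the cut line
        merged = clusters[i] + clusters[j]
        merges.append((d, sorted(labels[p] for p in merged)))
        clusters[i] = merged
        del clusters[j]
    return [sorted(labels[p] for p in c) for c in clusters], merges


clusters, merges = single_linkage(LABELS, DIST, cut=1.3)
for height, members in merges:
    print(f"merge at {height:.2f}: {members}")
print("clusters below the line:", clusters)
```

The recorded merge heights are the levels at which the corresponding branches join in the dendrogram.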
b) Use Principal Component Analysis (PCA) to arrive at the transformed matrix for [10]
the given matrix A.
A= 2 10 -1
4 3 1 0.5