
Machine Learning Bangalore City University 2024

Question paper of Machine learning Bangalore City University 2024

Uploaded by

Adi Vardharaj


Here are the answers categorized by the marks allocated for each question:

### SECTION-A (2 Marks Each)

1. Define machine learning.


- Answer: Machine learning is a subset of artificial intelligence in which algorithms and
statistical models enable computers to perform specific tasks without being explicitly
programmed for them, by learning patterns from data and making inferences from those patterns.

2. What is a dataset?
- Answer: A dataset is a collection of data, typically organized in a structured format such
as a table, where each row represents an observation and each column represents a variable.

3. Define regression. Give an example.


- Answer: Regression is a type of supervised learning technique used to predict a
continuous output variable based on one or more input features. Example: Predicting house
prices based on features like size, location, and number of bedrooms.
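The house-price example can be made concrete with a one-feature least-squares fit. This is only a sketch: the sizes and prices are invented illustration data, and `fit_line` is a hypothetical helper name, not a library function.

```python
# Ordinary least squares for one feature: price ~ slope * size + intercept.
# The numbers below are made up purely for illustration.
sizes = [50.0, 70.0, 90.0, 110.0]       # house size (square metres)
prices = [150.0, 200.0, 250.0, 300.0]   # price (thousands)

def fit_line(xs, ys):
    """Return (slope, intercept) minimising the squared prediction error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(sizes, prices)
predicted = slope * 100.0 + intercept   # predicted price for a 100 m^2 house
print(slope, intercept, predicted)
```

With this toy data the points lie exactly on a line, so the fitted slope is 2.5, the intercept is 25, and the prediction for a size of 100 is 275.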

4. Define clustering. Mention one application.


- Answer: Clustering is an unsupervised learning technique used to group similar data
points into clusters. Application: Customer segmentation in marketing, where customers are
grouped based on purchasing behavior.

5. Mention any two tools used for machine learning.


- Answer: TensorFlow, Scikit-learn.

6. What is data splitting?


- Answer: Data splitting is the process of dividing a dataset into two or more subsets,
typically a training set and a testing set, to evaluate the performance of a machine learning
model.
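A minimal sketch of a train/test split using only the standard library. The helper name `split_dataset`, the 80/20 ratio, and the seed are assumptions chosen for illustration; in practice a library routine would usually be used.

```python
import random

def split_dataset(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and return (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(10))                       # stand-in for 10 observations
train, test = split_dataset(data, test_fraction=0.2, seed=42)
print(len(train), len(test))                 # 8 training rows, 2 testing rows
```

Shuffling before splitting avoids a test set made up only of the last rows, which matters when the data is stored in some sorted order.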

### SECTION-B (5 Marks Each)

7. Explain types of machine learning with examples.


- Answer:
- Supervised Learning: Algorithms are trained on labeled data. Example: Linear
regression for predicting housing prices.
- Unsupervised Learning: Algorithms are used to find patterns in unlabeled data.
Example: K-means clustering for customer segmentation.
- Reinforcement Learning: Algorithms learn by interacting with an environment to
maximize some reward. Example: Training a robot to walk.

8. Explain exploratory data analysis and data cleaning.


- Answer:
- Exploratory Data Analysis (EDA): The process of analyzing data sets to summarize
their main characteristics, often using visual methods. It helps in understanding data
distribution, spotting anomalies, and testing hypotheses.
- Data Cleaning: The process of fixing or removing incorrect, corrupted, improperly
formatted, duplicate, or incomplete data within a dataset. It is a critical step to ensure data
quality and accuracy.
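As a small illustration of both ideas, the sketch below removes duplicate and incomplete records and then compares the mean with the median. The records are invented, and the cleaning choices (drop duplicates, drop rows with missing values) are just one possible policy.

```python
import statistics

# Hypothetical (id, value) records with one duplicate, one missing value,
# and one suspiciously large value.
records = [("A", 12.0), ("B", 15.0), ("B", 15.0), ("C", None), ("D", 14.0), ("E", 95.0)]

# Data cleaning: drop exact duplicates (keeping order), then incomplete rows.
deduped = list(dict.fromkeys(records))
complete = [r for r in deduped if r[1] is not None]

# EDA: simple summary statistics of the remaining values.
values = [v for _, v in complete]
mean_value = statistics.mean(values)
median_value = statistics.median(values)
print(len(complete), mean_value, median_value)
```

Here the mean (34.0) sits far above the median (14.5), which is the kind of gap that hints at an outlier such as the value 95.0 and would prompt a closer look.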

9. Explain Bayes' theorem with an example.


- Answer: Bayes' theorem describes the probability of an event based on prior knowledge
of conditions related to the event.
\[
P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}
\]
Example: Diagnosing a disease based on test results. If we know the prevalence of the
disease and the accuracy of the test, we can calculate the probability that a person has the
disease given a positive test result.
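To make the disease example concrete, the sketch below plugs in assumed numbers: a prevalence of 1%, a test sensitivity of 95%, and a false-positive rate of 5%. These figures are illustrative only and do not come from the question paper.

```python
# A = person has the disease, B = test is positive.
p_disease = 0.01             # P(A): assumed prevalence
p_pos_given_disease = 0.95   # P(B|A): assumed sensitivity
p_pos_given_healthy = 0.05   # P(B|not A): assumed false-positive rate

# Total probability of a positive test, P(B).
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B).
posterior = p_pos_given_disease * p_disease / p_pos
print(round(posterior, 3))
```

Under these assumptions the posterior is only about 16%: because the disease is rare, most positive results come from the much larger healthy group.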

10. Explain K-means clustering for image segmentation.


- Answer: K-means clustering can be used for image segmentation by grouping pixels
with similar color intensities. The algorithm partitions the image into K clusters, assigning
each pixel to the cluster with the nearest mean value, effectively segmenting the image into
regions.
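The idea can be sketched on a tiny grayscale "image" stored as a list of rows. Everything here is an illustrative assumption: the pixel values, the helper name `kmeans_1d`, and the evenly spaced initial centres (real implementations typically use random or k-means++ initialisation and work on colour vectors).

```python
def kmeans_1d(values, k, iterations=20):
    """K-means on scalar intensities; returns (centers, labels)."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the initial centres across the value range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]

    def nearest(v):
        return min(range(k), key=lambda c: abs(v - centers[c]))

    for _ in range(iterations):
        labels = [nearest(v) for v in values]            # assignment step
        for c in range(k):                               # update step
            members = [v for v, l in zip(values, labels) if l == c]
            if members:
                centers[c] = sum(members) / len(members)
    return centers, [nearest(v) for v in values]

image = [
    [10, 12, 200, 210],
    [11, 13, 205, 215],
    [9, 14, 198, 220],
]
pixels = [p for row in image for p in row]               # flatten to a pixel list
centers, labels = kmeans_1d(pixels, k=2)
width = len(image[0])
segmented = [labels[i:i + width] for i in range(0, len(labels), width)]
print(segmented)
```

The reshaped label grid is the segmentation: the dark left half of the image falls into cluster 0 and the bright right half into cluster 1.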

11. Explain how DBSCAN works.


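- Answer: DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a
clustering algorithm that groups points that are closely packed together and marks points
in low-density regions as outliers (noise). It relies on two parameters: epsilon (the maximum
distance between two points for them to be considered neighbors) and MinPts (the minimum
number of points required to form a dense region). A point with at least MinPts points within
distance epsilon is a core point. Clusters grow outward from core points, and a non-core point
that lies within epsilon of a core point joins that cluster as a border point.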
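A compact, unoptimised sketch of the procedure follows. The function name `dbscan`, the sample points, and the parameter values are assumptions for illustration; this version counts a point as its own neighbor, which is one common convention.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)            # None means "not visited yet"

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE                # may later become a border point
            continue
        cluster += 1                         # i is a core point: start a cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster          # noise reachable from a core point
            if labels[j] is not None:
                continue                     # already assigned; do not expand
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # j is also core: keep growing
                queue.extend(j_neighbors)
    return labels

points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

On this toy data the two tight groups become clusters 0 and 1, and the isolated point at (50, 50) is labelled as noise.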

12. Write and explain K-Nearest Neighbour Algorithm.


- Answer: K-Nearest Neighbour (KNN) is a simple, instance-based learning algorithm
used for classification and regression. Given a new data point, KNN computes its distance
to every training point, identifies the 'k' closest ones, and predicts the output as the
majority class among them (in classification) or the average of their target values (in
regression).
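The classification case can be written out in a few lines. This is a sketch under stated assumptions: Euclidean distance, a hypothetical helper `knn_predict`, and a made-up two-class dataset.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify query by majority vote among its k nearest training points."""
    # Step 1: distance from the query to every training point.
    distances = sorted(
        (math.dist(p, query), label) for p, label in zip(train_points, train_labels)
    )
    # Step 2: keep the labels of the k closest points.
    k_labels = [label for _, label in distances[:k]]
    # Step 3: return the most frequent label among them.
    return Counter(k_labels).most_common(1)[0][0]

train_points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
train_labels = ["A", "A", "A", "B", "B", "B"]

pred_near_a = knn_predict(train_points, train_labels, (2, 2), k=3)
pred_near_b = knn_predict(train_points, train_labels, (7, 7), k=3)
print(pred_near_a, pred_near_b)
```

Choosing an odd `k` for two-class problems avoids tied votes. For regression, step 3 would average the neighbors' numeric targets instead.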

### SECTION-C (8 Marks Each)

13. Explain main challenges of Machine Learning.


- Answer:
- Data Quality and Quantity: High-quality, labeled data is often required for training
effective models.
- Overfitting and Underfitting: Balancing model complexity to generalize well on new
data.
- Feature Engineering: Identifying the right features that influence the outcome.
- Model Interpretability: Ensuring that models are understandable and explainable.

14. Explain how to prepare the data for Machine Learning Algorithms.
- Answer:
- Data Cleaning: Remove or fix missing and noisy data.
- Data Transformation: Scale or normalize features, encode categorical variables.
- Feature Selection/Engineering: Select relevant features or create new ones.
- Splitting the Data: Divide data into training and testing sets.
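The cleaning and transformation steps above might look like this on a tiny example. The column values are invented, and mean imputation, min-max scaling, and one-hot encoding are just one reasonable choice for each step.

```python
def impute_mean(values):
    """Data cleaning: replace missing entries (None) with the column mean."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]

def min_max_scale(values):
    """Data transformation: rescale to [0, 1]; assumes values are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(categories):
    """Encode each category as a 0/1 vector over the sorted set of categories."""
    vocab = sorted(set(categories))
    return [[1 if c == v else 0 for v in vocab] for c in categories]

ages = [20, None, 40, 30]                                  # numeric feature with a gap
cities = ["Mysuru", "Bengaluru", "Mysuru", "Hubballi"]     # categorical feature

ages_filled = impute_mean(ages)
ages_scaled = min_max_scale(ages_filled)
city_vectors = one_hot(cities)
print(ages_scaled, city_vectors)
```

In a real pipeline the imputation mean and the scaling range would be computed on the training set only and then reused on the test set, so that no information leaks from the test data.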

15. Explain confusion matrix and performance evaluation metrics in classification.


- Answer:
A confusion matrix is a table used to evaluate the performance of a classification model.
It shows the counts of true positives (TP), true negatives (TN), false positives (FP), and
false negatives (FN), from which the following metrics are computed:
- Accuracy: \(\frac{TP + TN}{TP + TN + FP + FN}\)
- Precision: \(\frac{TP}{TP + FP}\)
- Recall: \(\frac{TP}{TP + FN}\)
- F1 Score: \(2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}\)
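A short worked example of these formulas for a binary classifier. The label lists are invented, and `confusion_counts` is a hypothetical helper name.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for a binary classification problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, tn, fp, fn

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # actual classes
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]   # model predictions

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(tp, tn, fp, fn, accuracy, precision, recall, f1)
```

Here TP = 3, TN = 5, FP = 1, FN = 1, giving an accuracy of 0.8 and precision, recall, and F1 of 0.75 each. The sketch does not guard against division by zero, which occurs when the model predicts no positives or the data contains none.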

16. Explain any four unsupervised learning techniques.


- Answer:
- K-means Clustering: Partitioning data into K clusters based on feature similarity.
- Hierarchical Clustering: Building a tree of clusters by either merging or splitting
clusters iteratively.
- DBSCAN: Clustering based on density of data points, identifying outliers.
- Principal Component Analysis (PCA): Reducing dimensionality of data while retaining
most of the variance.

17. Explain:
a) Scikit-learn and pandas.
- Answer:
- Scikit-learn: A machine learning library in Python providing simple and efficient
tools for data mining and data analysis.
- Pandas: A data manipulation and analysis library in Python, providing data structures
like DataFrame for handling tabular data.

b) Explain the steps to select and train a model.


- Answer:
- Data Collection: Gather relevant data.
- Data Preprocessing: Clean and transform the data.
- Feature Selection: Choose relevant features for the model.
- Model Selection: Choose the appropriate algorithm.
- Model Training: Train the model on the training data.
- Model Evaluation: Test the model on the testing data.
- Hyperparameter Tuning: Optimize the model's hyperparameters, ideally using a
validation set or cross-validation rather than the testing data.
- Model Deployment: Deploy the model for practical use.

18. Write a note on:
a) Entropy and information gain.
- Answer:
- Entropy: A measure of randomness or impurity in the data. Higher entropy means
more disorder.
- Information Gain: The reduction in entropy achieved by splitting a dataset
according to a given attribute. It is used to decide which feature to split on at each step in
building a decision tree; the attribute with the highest information gain is chosen.
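Both quantities are easy to compute directly. The sketch below uses the Shannon entropy with base-2 logarithms, so values are in bits; the labels and attribute values are invented, and the function names are illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    """H = -sum(p * log2(p)) over the class proportions p."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, feature_values):
    """Entropy before the split minus the weighted entropy after it."""
    n = len(labels)
    groups = {}
    for label, value in zip(labels, feature_values):
        groups.setdefault(value, []).append(label)
    weighted = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - weighted

play = ["yes", "yes", "no", "no"]               # class labels
outlook = ["sunny", "sunny", "rain", "rain"]    # separates the classes perfectly
temperature = ["hot", "cold", "hot", "cold"]    # tells us nothing about the class

gain_outlook = information_gain(play, outlook)
gain_temperature = information_gain(play, temperature)
print(entropy(play), gain_outlook, gain_temperature)
```

The labels have an entropy of 1 bit. Splitting on `outlook` removes all of it (gain of 1), while splitting on `temperature` removes none (gain of 0), so a decision tree would split on `outlook`.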

b) Partitioning clustering and hierarchical clustering.


- Answer:
- Partitioning Clustering: Divides data into distinct, non-overlapping subsets (clusters).
Example: K-means clustering.
- Hierarchical Clustering: Builds a tree of clusters, where each node represents a
cluster of similar points. Can be agglomerative (bottom-up) or divisive (top-down).
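The agglomerative (bottom-up) variant can be sketched in a few lines. The assumptions here are one-dimensional points, single-linkage distance (the closest pair between two clusters), and stopping at a requested number of clusters; `agglomerative` is a hypothetical helper name.

```python
def agglomerative(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[p] for p in points]          # start: every point is its own cluster

    def single_link(a, b):
        return min(abs(x - y) for x in a for y in b)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest single-linkage distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: single_link(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]   # merge j into i
        del clusters[j]
    return clusters

points = [1.0, 1.5, 2.0, 8.0, 8.5, 20.0]
result = agglomerative(points, n_clusters=3)
print(result)
```

Recording the order of the merges, rather than stopping at a fixed count, yields the full tree (dendrogram). A partitioning method such as K-means instead fixes the number of clusters up front and never builds this hierarchy.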
