Machine Learning Question Bank
Unit-1 – Chapter-1
Define Machine Learning? Explain the various application areas of Machine Learning.
Definition of Machine Learning:
1. Meaning of Machine Learning:
Machine Learning (ML) is a branch of Artificial Intelligence that allows machines to learn
patterns from data and make decisions without being explicitly programmed.
2. Arthur Samuel’s Definition:
According to Arthur Samuel, Machine Learning is "the field of study that gives computers the
ability to learn without being explicitly programmed."
3. Tom Mitchell’s Definition:
Tom Mitchell defines ML as: "A computer program is said to learn from experience E with
respect to some class of tasks T and performance measure P, if its performance at tasks in T, as
measured by P, improves with experience E."
Application Areas of Machine Learning:
4. Image Recognition:
ML is used to identify people, objects, or scenes in images. For example, Facebook uses ML to
automatically tag friends in photos using face recognition.
5. Speech Recognition:
It helps convert voice commands into text. Examples include Siri, Google Assistant, and Alexa,
which understand and act on voice inputs.
6. Traffic Prediction:
Google Maps uses ML to predict traffic conditions using GPS data and past traffic trends,
helping users find the fastest routes.
7. Product Recommendation:
E-commerce websites like Amazon and streaming services like Netflix use ML to recommend
products or movies based on user preferences and behaviour.
8. Self-Driving Cars:
Companies like Tesla use ML to train cars to detect objects, follow lanes, and make driving
decisions using real-time sensor data.
9. Spam and Malware Filtering:
Email services use ML algorithms like Naïve Bayes to filter spam and detect harmful
attachments automatically.
10. Medical Diagnosis:
ML helps doctors identify diseases by analysing medical images, health records, and
symptoms—for example, detecting brain tumours or cancer early.
Discuss the Classification of Machine Learning in detail.
Machine Learning is classified into different types based on how the learning process happens and
what kind of data is provided. The three main types are:
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
1. Supervised Learning:
In supervised learning, the model is trained on labelled data, meaning each input is paired with
the correct output. The model learns the input-output relationship and predicts outputs for new
inputs. It is used in applications where historical labelled data is available. Examples include:
• Email Spam Detection
• Risk Assessment
• Image Classification
• Fraud Detection
There are two major types of problems in supervised learning (see the sketch after this list):
• Regression: Predicts continuous values (e.g., house price, temperature)
• Classification: Predicts categories (e.g., spam or not spam)
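A minimal sketch of both problem types, assuming scikit-learn is installed (the tiny datasets, feature names, and values below are invented purely for illustration):

```python
# Regression and classification in a few lines (illustrative data only).
from sklearn.linear_model import LinearRegression, LogisticRegression

# Regression: predict a continuous value, e.g., house price from area.
sizes = [[50], [80], [120], [200]]            # input: area in square metres
prices = [100, 160, 240, 400]                 # output: price in thousands
reg = LinearRegression().fit(sizes, prices)
print(reg.predict([[150]]))                   # predicted price for a new house

# Classification: predict a category, e.g., spam (1) or not spam (0).
features = [[1, 0], [0, 1], [1, 1], [0, 0]]   # [has "bonus", has "lottery"]
labels = [1, 1, 1, 0]
clf = LogisticRegression().fit(features, labels)
print(clf.predict([[1, 0]]))                  # classify a new email
```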
2. Unsupervised Learning:
In unsupervised learning, the model is given unlabelled data. The system tries to learn patterns and
structures from the data without known outputs. It is used for tasks such as:
• Customer Segmentation
• Anomaly Detection
• Market Basket Analysis
• Clustering Images or Documents
The two main types of problems are (see the clustering sketch after this list):
• Clustering: Grouping similar data points (e.g., K-Means, Hierarchical Clustering)
• Association: Discovering rules that describe large portions of data (e.g., Market Basket
Analysis)
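A minimal clustering sketch with K-Means, again assuming scikit-learn (the customer data points are made up to illustrate segmentation):

```python
# Unsupervised learning: group unlabelled customers into 2 segments.
from sklearn.cluster import KMeans

# Each row: [annual spend, visits per month] for one customer (no labels).
customers = [[200, 2], [220, 3], [210, 2], [900, 12], [950, 11], [880, 10]]
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)

print(kmeans.labels_)            # cluster assigned to each customer
print(kmeans.cluster_centers_)   # centre of each discovered segment
```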
3. Reinforcement Learning:
Reinforcement learning is a feedback-based method in which an agent learns by interacting with its
environment, receiving rewards for good actions and penalties for bad ones. It is used in dynamic and
sequential decision-making tasks (a toy sketch follows this list) such as:
• Game Playing (e.g., Chess, Go)
• Robotics
• Self-driving cars
• Industrial automation systems
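A toy sketch of the reward-and-penalty idea: tabular Q-learning on an invented 5-state corridor where only reaching the last state earns a reward (the states, rewards, and parameters are illustrative choices, not from the text):

```python
# Tabular Q-learning on a tiny corridor: states 0..4, goal at state 4.
import random

n_states, actions = 5, [-1, +1]               # actions: move left or right
Q = [[0.0, 0.0] for _ in range(n_states)]     # one value per (state, action)
alpha, gamma, epsilon = 0.5, 0.9, 0.2

for episode in range(200):
    s = 0
    while s != 4:                             # episode ends at the goal
        if random.random() < epsilon:         # explore occasionally
            a = random.randrange(2)
        else:                                 # otherwise act greedily
            a = max((0, 1), key=lambda i: Q[s][i])
        s2 = min(max(s + actions[a], 0), n_states - 1)
        r = 1.0 if s2 == 4 else 0.0           # reward only at the goal
        # Update: reward now plus discounted best future value.
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# The learned policy should choose "right" (index 1) in every state.
print([max((0, 1), key=lambda i: Q[s][i]) for s in range(4)])
```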
Discuss the various steps in designing a learning system.
1. Choosing the Training Experience
The first step is to select the right training data or experience that will be fed into the machine
learning algorithm. This data must be relevant and should have a direct or indirect impact on
the success of the model. For example, in a chess game, the moves played and their
outcomes act as the training experience from which the model learns.
2. Choosing the Target Function
Once the training data is selected, a target function must be defined. This function describes
the goal of learning — it maps the input to the desired output. For example, in a spam detection
system, the target function might be "SpamClassifier" which decides whether an email is spam
or not.
3. Choosing the Representation for the Target Function
After defining the target function, it must be represented in a suitable form such as linear
equations, hierarchical graphs, or tabular formats. This representation helps the machine
understand and apply the logic behind the function. In a chess game, it would represent all
legal and optimal moves.
4. Choosing the Function Approximation Algorithm
The algorithm used to approximate the target function is selected next. This algorithm helps
the system learn from training examples by trial and error. The more examples it sees, the
better it becomes at selecting the correct output. For example, the system may initially make
mistakes but gradually learns from experience and improves its accuracy.
5. Final Design of the Learning System
After going through various training instances, learning from errors, and refining predictions,
the system reaches its final design. This final model can make intelligent decisions or
predictions on new, unseen data. A classic example is Arthur Samuel's checkers program, which
improved its play through experience; by contrast, IBM's Deep Blue, which beat chess champion
Garry Kasparov, relied mainly on brute-force search rather than learning.
Explain the Characteristics of Machine Learning Tasks.
1. Automated Data Visualization
Machine learning offers tools that automatically visualize complex relationships in both structured
and unstructured data. This helps businesses uncover insights and patterns easily, leading to
better decision-making.
2. Automation at Its Best
One of the key characteristics of ML is its ability to automate repetitive and time-consuming tasks.
Industries like finance use ML to automate accounting, expense management, invoicing, and even
customer queries using chatbots.
3. Enhanced Customer Engagement
ML helps businesses improve customer interaction by analysing what kind of content, words, or
products resonate with users. For example, Pinterest uses ML to personalize content suggestions
based on user behaviour.
4. Increased Efficiency with IoT Integration
When combined with Internet of Things (IoT) technologies, machine learning can significantly
improve the efficiency of industrial and business processes. ML analyses IoT-generated data to
optimize operations and reduce waste.
5. Transformation of the Mortgage Market
Machine learning allows financial institutions to better understand customer spending behaviour
and creditworthiness beyond just credit scores. This helps lenders make more informed decisions
in mortgage and loan approvals.
6. Accurate Data Analysis
Unlike traditional trial-and-error methods, machine learning provides powerful algorithms that can
handle large and diverse datasets. This leads to faster, more precise analysis and more reliable
outcomes.
7. Improved Business Intelligence
ML enhances business intelligence by processing big data and extracting useful insights.
Industries like retail, healthcare, and finance use ML to support strategic planning, product
development, and customer service.
Differentiate between training set, testing set and validation set.
| Aspect | Training Set | Validation Set | Testing Set |
|---|---|---|---|
| Purpose | Used to train the machine learning model | Used to tune hyperparameters and improve model performance | Used to evaluate final model performance on unseen data |
| Data Type | Labelled data used to fit the model | Labelled data used for tuning and model selection | Labelled data used only for final evaluation |
| Usage Time | Used during model training | Used during training (for validation and tuning) | Used after training and validation are complete |
| Helps With | Learning patterns, building the model | Preventing overfitting, selecting the best model version | Checking generalization ability of the final model |
| Seen by Model? | Yes, the model directly learns from it | Yes, used indirectly during model tuning | No, completely unseen during training and tuning |
| Effect on Model | Directly affects how the model is trained | Helps adjust parameters to improve performance | Does not influence the model; only measures performance |
| Example Use | Fitting a regression or classification model | Choosing the number of layers in a neural network | Measuring accuracy, precision, recall, etc. |
| Risk If Misused | Model may underfit if data is insufficient | May cause overfitting if used excessively | If leaked into training, results in overestimated performance |
| Typical Proportion | 60–70% of the dataset | 10–20% of the dataset | 20–30% of the dataset |
| Related Concept | Learning from examples | Cross-validation, model selection | Final model evaluation, real-world performance check |
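A hedged sketch of producing such a 60/20/20 split with scikit-learn's train_test_split (the array contents and exact proportions are illustrative):

```python
# Split 50 samples into 60% train, 20% validation, 20% test.
import numpy as np
from sklearn.model_selection import train_test_split

X, y = np.arange(100).reshape(50, 2), np.arange(50)

# First carve off 20% as the untouched test set...
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
# ...then split the rest: 25% of the remaining 80% gives a 20% validation set.
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 30 10 10
```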
Differentiate between Predictive and Descriptive task.
| Aspect | Predictive Tasks | Descriptive Tasks |
|---|---|---|
| Purpose | To predict future or unknown outcomes based on input data | To discover hidden patterns or relationships in the data |
| Target Variable | Involves a known target variable (output is labelled) | Does not involve a target variable (no labels in data) |
| Learning Type | Commonly used in supervised learning | Commonly used in unsupervised learning |
| Examples | Classification (e.g., spam detection), Regression (e.g., predicting house prices) | Clustering (e.g., customer grouping), Association Rule Discovery (e.g., Market Basket) |
| Output | Predicts specific values or class labels | Summarizes or explains structure in data without predictions |
| Data Requirement | Requires historical labelled data (input-output pairs) | Works with unlabelled data |
| Focus | Accuracy of prediction | Understanding structure or distribution of data |
| Alternative Name | Sometimes called supervised predictive modelling | Sometimes called exploratory data analysis |
| Used For | Decision-making, forecasting, risk assessment | Data summarization, insight generation |
| Example | Playing checkers – predicting probability of winning | Subgroup discovery or clustering movies by genres |
Write a note on Learning vs. Designing.
1. Start with Training Data
In a learning-based approach, the process begins by collecting and preparing training data. This
data contains various examples that represent the problem domain and includes relevant input
and output relationships (if supervised learning is used).
2. Feed Data into Machine Learning Algorithm
The training data is fed into a machine learning algorithm. This algorithm is designed to analyze the
data and identify patterns, correlations, or rules hidden within the examples.
3. Build Logical and Mathematical Model
Using the data, the algorithm constructs a logical and mathematical model. This model represents
the learned knowledge and will be used to make future predictions or decisions. The more data the
model is exposed to, the better it becomes at generalizing and improving accuracy.
4. Produce Output Based on Learning
Once the model is built, it can take new input data and generate an output. This output is not the
result of hardcoded logic, but rather the result of what the model has learned from experience (i.e.,
the training data).
Learning vs Designing Philosophy
In traditional designing, the programmer manually creates a set of rules and instructions that the
system must follow. All logic is predefined, and there's no scope for adaptation or improvement
unless the programmer changes the code. In contrast, learning allows the system to automatically
improve over time by analysing more data, thus reducing the need for manual intervention.
Example – Driverless Car
A driverless car, when designed using traditional methods, would require manually coding every
possible traffic scenario. In a learning system, the car is trained using real-world driving data. The ML
algorithm learns traffic rules, object detection, and decision-making by building a model from this
data, which results in smarter and adaptive driving.
Unit-1 – Chapter-2
Describe the following model with example
1. Logical Model.
2. Probabilistic Model.
3. Geometric Model.
1. Logical Model
A logical model uses a series of logical conditions to divide the instance space into segments. These
models typically rely on if-then rules and are closely related to decision trees and rule-based systems.
They help classify data into groups by applying logical expressions. In such models, the learning
process involves determining which conditions lead to which outputs.
For example, in spam filtering, a rule might be:
• if bonus = 1 then Class = spam
• else if lottery = 1 then Class = spam
• else Class = ham
This makes the logical model very easy to interpret and implement. It is especially useful in
applications where rule transparency is required.
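The rules above translate directly into code; a minimal sketch (the feature names follow the text's spam example):

```python
# A rule-based (logical) spam classifier built from if-then conditions.
def classify_email(bonus: int, lottery: int) -> str:
    if bonus == 1:
        return "spam"
    elif lottery == 1:
        return "spam"
    else:
        return "ham"

print(classify_email(bonus=1, lottery=0))  # spam
print(classify_email(bonus=0, lottery=0))  # ham
```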
2. Probabilistic Model
Probabilistic models are based on statistical probability. These models assign a posterior probability
to each possible output class based on given input data, often using Bayes’ theorem. The system
learns the probabilities of the output given the input, using training data to calculate these values.
In the same spam detection example, a probabilistic model may calculate:
• P(spam | bonus = 1, lottery = 0)
If this value is greater than 0.5, the model classifies the email as spam. Probabilistic models are
suitable when it's important to estimate the degree of belief or confidence in the prediction.
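A small sketch of this posterior computed with Bayes' theorem under a naive independence assumption; the prior and conditional probabilities below are invented stand-ins for values that would normally be estimated from training data:

```python
# P(spam | bonus, lottery) via Bayes' theorem (all probabilities made up).
p_spam = 0.4                                  # prior P(spam)
p_bonus = {"spam": 0.8, "ham": 0.1}           # P(bonus = 1 | class)
p_lottery = {"spam": 0.6, "ham": 0.05}        # P(lottery = 1 | class)

def posterior_spam(bonus: int, lottery: int) -> float:
    def joint(cls: str, prior: float) -> float:
        pb = p_bonus[cls] if bonus else 1 - p_bonus[cls]
        pl = p_lottery[cls] if lottery else 1 - p_lottery[cls]
        return pb * pl * prior                # P(features | class) * P(class)
    spam, ham = joint("spam", p_spam), joint("ham", 1 - p_spam)
    return spam / (spam + ham)                # normalize over both classes

p = posterior_spam(bonus=1, lottery=0)
print(p, "-> spam" if p > 0.5 else "-> ham")  # ~0.69 -> spam
```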
3. Geometric Model
A geometric model represents instances as points in a multidimensional space and uses geometric
relationships to make decisions. These models work by creating decision boundaries (such as lines,
planes, or curves) that separate different classes in space. Classification is done by checking which
side of the boundary a point lies on.
An example of a geometric model is a linear classifier, where a straight line (in 2D) or a hyperplane (in
higher dimensions) separates two classes. The decision boundary is given by:
• w · x = t, where w is the weight vector, x is the instance, and t is the decision threshold
Another example is the k-nearest neighbour model, where a new instance is classified based on the
majority class among its closest neighbours in the feature space. Geometric models are effective
when data is numerically represented, and spatial separation exists between categories.
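A minimal sketch of a linear classifier with a fixed boundary w · x = t (the weight vector and threshold here are arbitrary illustrations, not learned values):

```python
# Classify 2D points by which side of the hyperplane w . x = t they lie on.
import numpy as np

w = np.array([1.0, 2.0])   # weight vector: orientation of the boundary
t = 3.0                    # threshold: position of the boundary

def classify(x: np.ndarray) -> str:
    return "positive" if np.dot(w, x) > t else "negative"

print(classify(np.array([2.0, 1.5])))  # w . x = 5.0 -> positive
print(classify(np.array([0.5, 0.5])))  # w . x = 1.5 -> negative
```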
"Machine Learning is all about using the right features to build the right model that achieves the right
task." Justify this statement.
1. Features represent the problem to the model
Features are the measurable properties of input data. If the right features are selected, they
capture the essential patterns needed to solve the problem. Without the right features, even the
best algorithms fail to learn effectively.
2. Models learn from features to perform tasks
The model’s learning capability depends on how well the features describe the data. For example,
in spam classification, features like "bonus" or "lottery" help the model differentiate spam from
non-spam emails.
3. Each model type suits different data and tasks
Logical models work well with rule-based classification, probabilistic models handle uncertainty,
and geometric models are ideal for spatially separable data. Using the wrong model for the feature
type or task can lead to incorrect results.
4. The task defines the goal of learning
Machine learning tasks can be predictive (like classification or regression) or descriptive (like
clustering). Choosing the right task helps determine the learning method and evaluation strategy.
5. Wrong features or models lead to poor performance
If features are irrelevant, or if the model is too simple or too complex for the task, the system may
underfit or overfit, failing to generalize well to new data.
6. All three components are interdependent
The success of a machine learning system depends on the harmony between features, model, and
task. Each one affects the effectiveness of the others, and changing one may require adjusting the
rest.
7. Learning is about mapping inputs (features) to outputs (task)
The entire process of machine learning is to use features as input, a model to learn from those
features, and achieve the correct output — that is, solving the intended task.
8. Real-world systems demonstrate this alignment
In systems like driverless cars or email spam filters, choosing the right sensory or textual features,
the right modelling approach, and a well-defined goal is what enables high performance.
9. Model generalization depends on proper input and output setup
Generalization — the ability of a model to perform well on unseen data — is only possible when
the input features and learning task are correctly aligned with the problem structure.
10. Conclusion
Machine learning is not just about choosing a fancy algorithm. It's about selecting the right
features that describe the data well, applying the right model that can learn from those features,
and targeting the right task that matches the problem goal. Without alignment between all three,
learning will not succeed.
What are the various types of features available? Explain each in brief.
1. Binary Features
Binary features are attributes that can take only two values: typically, 0 or 1. These values
represent true/false, yes/no, or presence/absence of a particular characteristic. Binary features
are widely used in classification tasks, especially where logical decision rules apply.
Example:
In email classification, a binary feature could be:
o bonus = 1 if the word "bonus" is present
o bonus = 0 if it is not present
This feature allows for straightforward rules like:
if bonus = 1 then Class = spam
2. Nominal Features
Nominal features are categorical attributes that can take on one of several discrete values, but
these values have no inherent order. Each category represents a label, and all categories are
treated as equally distinct without any ranking.
Example:
In movie classification, a feature like Genre can take values like:
o Action
o Comedy
o Drama
o Horror
3. Ordinal Features
Ordinal features are like nominal features, but with an important difference: their values have a
clear, meaningful order. However, the distance between the values is not defined.
Example:
A satisfaction rating:
o Poor < Fair < Good < Excellent
While "Excellent" is clearly better than "Good", we cannot quantify how much better. These
features are important in models that can handle ordered information.
4. Quantitative Features
Quantitative features (also called numerical features) are those that take on real numerical values
and have a mathematical meaning. The differences between values are measurable and
consistent.
Example:
o Age: 25, 30, 45
o Price: ₹199, ₹250, ₹399
These features can be directly used in mathematical computations like calculating averages,
distances, or trends, and are essential for regression and geometric models.
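One way the four feature types might be encoded for a learner, sketched with pandas (the column names and values are invented):

```python
# Encoding binary, nominal, ordinal, and quantitative features.
import pandas as pd

df = pd.DataFrame({
    "bonus":  [1, 0, 1],                      # binary: word present or not
    "genre":  ["Action", "Comedy", "Drama"],  # nominal: no inherent order
    "rating": ["Poor", "Good", "Excellent"],  # ordinal: ordered categories
    "price":  [199, 250, 399],                # quantitative: real numbers
})

# Nominal -> one-hot columns, so no false ordering is implied.
df = pd.get_dummies(df, columns=["genre"])
# Ordinal -> integer codes that preserve the ranking.
df["rating"] = df["rating"].map({"Poor": 0, "Fair": 1, "Good": 2, "Excellent": 3})
print(df)
```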
Discuss the need for feature construction and feature transformation, and explain how they can be
achieved.
Need for Feature Construction and Transformation
In machine learning, raw data collected from various sources often contains irrelevant, redundant, or
unstructured information. Models cannot perform efficiently if the input features are poorly
represented. Therefore, feature construction and feature transformation are essential steps to
improve the learning process by making data more meaningful and usable for algorithms.
These processes help in:
• Improving model accuracy by creating more informative features
• Reducing noise and redundancy
• Making data compatible with algorithms that expect input in a certain form
• Helping the model generalize better to unseen data
1. Feature Construction
Feature construction involves creating new features from the existing raw data to enhance the
model’s predictive power. These new features help represent the underlying patterns more clearly.
• Example:
From an email’s text, new features like "bonus", "lottery", and "win" can be extracted. These do not
exist explicitly in the raw data but are constructed based on word presence or frequency.
This process helps transform unstructured text into structured features suitable for machine learning
algorithms.
2. Feature Transformation
Feature transformation refers to modifying existing features to make them more suitable for learning.
This includes scaling, normalizing, or encoding features so that the model can process them
effectively.
• Techniques:
o Text to Binary Transformation: Converting words into binary values — e.g., if the word “bonus”
appears, feature = 1, else 0.
o Word Frequency Count: Converting text features into numerical form based on how often a
word appears in the document.
o Dimensionality Reduction: Reducing the number of features to eliminate noise and improve
performance.
These transformations make the data machine-readable and help algorithms like decision trees,
SVMs, and neural networks learn more efficiently.
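A hedged sketch of the text-to-binary and word-frequency transformations using scikit-learn's CountVectorizer (the three example emails are made up):

```python
# Turn raw text into word-count and binary presence/absence features.
from sklearn.feature_extraction.text import CountVectorizer

emails = ["claim your bonus now", "lottery bonus win", "meeting at noon"]

counts = CountVectorizer().fit_transform(emails)             # word frequencies
binary = CountVectorizer(binary=True).fit_transform(emails)  # 1 if word appears

print(counts.toarray())
print(binary.toarray())
```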
Explain the various approaches that can be used for feature selection.
Feature selection is the process of identifying and selecting the most relevant features from a dataset
to improve model performance, reduce overfitting, and lower computational cost. There are three
main approaches to feature selection:
1. Filter Approach
The filter approach selects features based on their statistical properties, independently of any
learning algorithm. Features are ranked using metrics like information gain, correlation, or chi-square
test, and the top-ranked ones are selected.
• Example: Selecting words in a text classification task based on their frequency or relevance.
• Advantage: Fast and simple; works well as a preprocessing step.
• Limitation: Does not consider feature interactions.
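A small sketch of the filter approach, ranking features with a chi-square score via scikit-learn's SelectKBest (the Iris dataset and k = 2 are illustrative choices):

```python
# Filter-style feature selection: score features, keep the top k.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_iris(return_X_y=True)
selector = SelectKBest(score_func=chi2, k=2).fit(X, y)

print(selector.scores_)                     # chi-square score per feature
print(selector.get_support(indices=True))   # indices of the 2 kept features
```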
2. Wrapper Approach
The wrapper approach evaluates subsets of features by actually training and testing a model on them.
It searches for the best-performing combination of features using techniques like forward selection,
backward elimination, or recursive feature elimination.
• Example: Adding or removing one feature at a time and testing how the model accuracy
changes.
• Advantage: Considers interaction between features.
• Limitation: Computationally expensive, especially with large datasets.
3. Embedded Approach
The embedded approach performs feature selection as part of the model training process. The
learning algorithm itself selects the most important features while building the model.
• Example: Decision trees automatically select the best features at each split; LASSO regression
shrinks less important feature coefficients to zero.
• Advantage: Efficient and integrated with model building.
• Limitation: Specific to certain algorithms.
Discuss the terms bias and variance with respect to overfitting and underfitting.
In machine learning, bias and variance are two key components that contribute to a model's
prediction error. Understanding how they relate to underfitting and overfitting helps in building models
that generalize well on unseen data.
Bias – Linked to Underfitting
Bias refers to the error that is introduced by approximating a complex real-world problem using a
simplified model. When a model has high bias, it means it makes strong assumptions about the data
and fails to learn the underlying patterns properly.
• Such models are often too simple to capture the complexity of the data.
• They tend to ignore important features and relationships in the dataset.
• This results in poor performance both on the training set and the test set.
• The model is said to underfit the data.
Example:
Using a straight line to fit a clearly curved dataset results in high bias. The model cannot learn the
curve and gives inaccurate predictions, even on training data.
Variance – Linked to Overfitting
Variance measures how much a model’s predictions change when it is trained on different subsets of
the data. A model with high variance is very sensitive to the training data and learns even the noise or
random fluctuations.
• Such models are typically too complex relative to the amount of data available.
• They perform very well on training data but fail to generalize on new, unseen data.
• This leads to overfitting, where the model captures noise instead of the true signal.
Example:
A deep decision tree that fits all training examples perfectly, including outliers, may fail to predict well
on test data due to high variance.
Bias-Variance Trade-off
There is a natural trade-off between bias and variance:
• High bias, low variance models are stable but often inaccurate — they underfit.
• Low bias, high variance models are flexible but often unstable — they overfit.
The challenge is to find the optimal model complexity that maintains a balance:
• Enough flexibility to capture real patterns (low bias)
• Enough simplicity to ignore noise (low variance)
This balance leads to low total error and good generalization.
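A small numeric sketch of the trade-off: fitting polynomials of increasing degree to noisy quadratic data typically shows high error everywhere at degree 1 (underfitting) and low training but higher test error at a large degree (overfitting); all data below is synthetic:

```python
# Compare train/test error for underfit, balanced, and overfit models.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
y = x**2 + rng.normal(0, 0.1, size=x.shape)   # true quadratic plus noise
x_train, y_train = x[::2], y[::2]             # alternate points for training
x_test, y_test = x[1::2], y[1::2]             # the rest held out for testing

for degree in (1, 2, 10):                     # underfit, balanced, overfit
    coeffs = np.polyfit(x_train, y_train, degree)
    mse = lambda xs, ys: np.mean((np.polyval(coeffs, xs) - ys) ** 2)
    print(f"degree {degree:2d}: train MSE {mse(x_train, y_train):.4f}, "
          f"test MSE {mse(x_test, y_test):.4f}")
```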