ML Mid1
MOD-1
SA
Q1) Define supervised learning with an example?
Ans:
Supervised learning is a type of machine learning where the model is trained on labeled data. The
algorithm learns to map inputs to outputs based on the provided labels.
Example: Predicting house prices based on labeled data like house size, location, and price.
Q2) Define unsupervised learning with an example?
Ans:
Unsupervised learning is a machine learning method where the algorithm learns patterns from
unlabeled data. The model finds hidden structures without any explicit output labels.
Example: Customer segmentation based on purchasing behavior using clustering algorithms like K-Means.
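A minimal sketch of this segmentation idea with scikit-learn's K-Means (the purchase features and values below are invented for illustration):

# Hypothetical customer-segmentation sketch: K-Means on invented purchase features.
import numpy as np
from sklearn.cluster import KMeans

# Made-up behaviour per customer: [annual_spend, visits_per_month]
X = np.array([[200, 2], [250, 3], [5000, 20], [5200, 22], [900, 8], [950, 7]])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # which segment each customer landed in
print(kmeans.cluster_centers_)  # the centre of each segment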
Q3) Define reinforcement learning with an example?
Ans:
Reinforcement learning is a type of machine learning where an agent learns to make decisions by
interacting with its environment and receiving rewards or penalties.
Example: Training a robot to walk by rewarding it for successful movements and penalizing it for falling.
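The robot itself is beyond a few lines, but the reward-driven learning loop can be sketched with tabular Q-learning on a toy corridor environment (the states, actions, and reward values are invented for illustration):

# Tabular Q-learning on a toy corridor: states 0..4, reward only for reaching state 4.
import numpy as np

n_states, n_actions = 5, 2            # actions: 0 = step left, 1 = step right
Q = np.zeros((n_states, n_actions))   # learned value of each (state, action) pair
alpha, gamma, epsilon = 0.1, 0.9, 0.3

rng = np.random.default_rng(0)
for episode in range(300):
    s = 0
    while s != 4:                     # an episode ends at the goal state
        a = rng.integers(n_actions) if rng.random() < epsilon else int(Q[s].argmax())
        s_next = max(s - 1, 0) if a == 0 else min(s + 1, 4)
        r = 1.0 if s_next == 4 else 0.0                  # reward for reaching the goal
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q.argmax(axis=1))  # learned policy: states 0-3 should prefer action 1 (right)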
Q4) Define deep learning with an example?
Ans:
Deep learning is a subset of machine learning that uses neural networks with many layers (deep
networks) to learn complex patterns in data.
Example: Image recognition systems that classify objects (e.g., identifying cats or dogs) using
Convolutional Neural Networks (CNNs).
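As an illustrative sketch, a small CNN in TensorFlow/Keras (the 64x64 RGB input and the binary cat/dog output are assumptions; real systems train on large labeled image sets):

# Minimal cat/dog-style CNN sketch in TensorFlow/Keras (assumed installed).
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 64, 3)),                 # assumed 64x64 RGB images
    tf.keras.layers.Conv2D(16, 3, activation="relu"),  # learns local visual features
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),    # probability of one class
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
# model.fit(train_images, train_labels, epochs=5)  # hypothetical labeled image data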
Q5) Define semi-supervised learning with an example?
Ans:
Semi-supervised learning is a technique that uses a small amount of labeled data and a large amount of
unlabeled data for training. It helps improve performance when labeling data is expensive or difficult.
Example: A model trained to classify emails as spam using a small labeled dataset and a larger set of
unlabeled emails.
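scikit-learn offers a self-training wrapper that captures this idea: unlabeled examples are marked with -1 and the model labels them iteratively (a sketch on synthetic data, not real email data):

# Semi-supervised sketch: ~90% of labels hidden as -1, then filled in by self-training.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X, y = make_classification(n_samples=200, random_state=0)
rng = np.random.default_rng(0)
y_partial = y.copy()
y_partial[rng.random(len(y)) < 0.9] = -1   # -1 marks "unlabeled" for scikit-learn

clf = SelfTrainingClassifier(LogisticRegression(max_iter=1000)).fit(X, y_partial)
print(clf.score(X, y))                     # accuracy against the true labels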
LA
Q1) What is Machine Learning and explain the different types of Machine Learning?
Ans:
Machine Learning (ML) is a subset of artificial intelligence that allows systems to learn from data and
improve their performance over time without explicit programming. ML algorithms find patterns in data
and make decisions or predictions based on it.
1. Supervised Learning:
Involves labeled data, where the algorithm learns from input-output pairs. It is used for tasks like
classification (e.g., spam detection) and regression (e.g., predicting house prices).
2. Unsupervised Learning:
Works with unlabeled data. The goal is to find hidden patterns or structures. For example,
clustering algorithms like K-Means can group customers into segments based on purchasing
behavior.
3. Reinforcement Learning:
This method learns by interacting with an environment and receiving feedback in the form of
rewards or punishments. It is used in robotics, game AI, and self-driving cars.
4. Semi-supervised Learning:
Uses a small amount of labeled data and a large amount of unlabeled data. It is useful when
labeling data is expensive or time-consuming.
5. Self-supervised Learning:
A variant of unsupervised learning where the system creates its own labels from the input data.
This is popular in NLP and computer vision tasks.
Q2) How is unsupervised learning different from supervised learning with a practical example?
Ans:
The main difference between unsupervised learning and supervised learning lies in the presence of
labeled data:
1. Supervised Learning:
o In supervised learning, the algorithm is trained on a labeled dataset, where each input
has a corresponding output (label). The goal is to learn a mapping from inputs to
outputs.
o Example: Predicting house prices based on labeled data where features like house size
and location are linked to known prices.
2. Unsupervised Learning:
o In unsupervised learning, the algorithm is trained on unlabeled data and must discover hidden patterns or groupings on its own.
o Example: Segmenting customers based on purchasing behavior with a clustering algorithm like K-Means, without any predefined categories.
In supervised learning, the model aims to minimize the error between predicted and actual values, while in unsupervised learning, the focus is on identifying inherent patterns in the data.
Q3) How is reinforcement learning different from deep learning with a practical example?
Ans:
Reinforcement learning and deep learning are different approaches in machine learning:
1. Reinforcement Learning:
o An agent learns to make decisions by interacting with an environment and receiving rewards or penalties as feedback.
o Example: Training a robot to walk. The robot tries different movements and receives positive or negative feedback based on its success, eventually learning to walk efficiently.
2. Deep Learning:
o Uses neural networks with many layers to learn complex patterns directly from large amounts of data.
o Example: Image classification using Convolutional Neural Networks (CNNs). The system learns to classify images (e.g., identifying objects) by learning features from the raw pixel data.
While reinforcement learning focuses on learning through interaction and feedback, deep learning
focuses on finding complex patterns in large amounts of data using neural networks.
Q4) What is the difference between MCAR, MAR, and MNAR in Machine Learning?
Ans:
These terms refer to different mechanisms of missing data in machine learning:
1. MCAR (Missing Completely At Random):
o The probability of a value being missing is unrelated to any variable, observed or unobserved.
o Example: A sensor intermittently fails to log readings, leaving gaps with no pattern.
2. MAR (Missing At Random):
o The missingness depends on other observed features, but not on the missing value itself.
o Example: Age data is missing, but its absence is related to another observed feature like income level.
3. MNAR (Missing Not At Random):
o The missingness depends on the value that is itself missing.
o Example: Individuals with higher income may choose not to disclose their income in surveys, meaning the missing data is not random.
Understanding the type of missing data is essential for choosing the appropriate imputation technique to
handle it in machine learning models.
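The three mechanisms can be simulated on a toy table to see how they differ (synthetic data; the column names and thresholds are invented):

# Simulating MCAR, MAR and MNAR on a synthetic income-and-age table.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000
full = pd.DataFrame({"income": rng.normal(50_000, 15_000, n),
                     "age": rng.integers(18, 70, n).astype(float)})

mcar = full.copy()   # MCAR: age vanishes purely at random
mcar.loc[rng.random(n) < 0.1, "age"] = np.nan

mar = full.copy()    # MAR: age is missing more often when observed income is high
mar.loc[(mar["income"] > 65_000) & (rng.random(n) < 0.5), "age"] = np.nan

mnar = full.copy()   # MNAR: high incomes hide themselves (depends on the missing value)
mnar.loc[mnar["income"] > 80_000, "income"] = np.nan

for name, d in [("MCAR", mcar), ("MAR", mar), ("MNAR", mnar)]:
    print(name, d.isna().mean().round(3).to_dict())  # fraction missing per column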
Q5) What are the applications of Machine Learning?
Ans:
Machine learning has a wide range of applications across industries, impacting various aspects of life:
1. Healthcare:
o Used for medical diagnosis, drug discovery, and predicting disease outbreaks. For
example, machine learning models can analyze medical images to detect diseases like
cancer.
2. Finance:
o Machine learning is used for fraud detection, stock market predictions, and credit
scoring. It helps banks and financial institutions identify unusual transaction patterns
that may indicate fraud.
3. E-commerce:
o Recommendation systems like those used by Amazon and Netflix are powered by
machine learning. They suggest products or content based on user behavior and
preferences.
4. Autonomous Vehicles:
o Self-driving cars use machine learning to perceive their environment, make decisions,
and navigate safely.
5. Manufacturing:
o Machine learning supports predictive maintenance and quality inspection, using sensor data to detect equipment faults and product defects early.
MOD-2
SA
Q1) Define Normal Distribution?
Ans:
A normal distribution is a symmetrical, bell-shaped distribution where most data points cluster around
the mean. In a normal distribution, the mean, median, and mode are all equal. It is commonly used in
statistics and machine learning to model natural data patterns.
Q2) Define Skewed Distribution?
Ans:
A skewed distribution is an asymmetrical distribution where data points are not evenly distributed
around the mean. It can be positively skewed (right tail is longer) or negatively skewed (left tail is
longer). Skewed data can affect the performance of machine learning models.
Q3) Why is data transformation important in Machine Learning?
Ans:
Data transformation is important in machine learning to ensure that data is in the appropriate format for
model training. It helps to normalize, scale, or standardize features, making models more accurate and
efficient by removing bias, improving convergence, and handling outliers.
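For instance, a sketch of two common transformations, a log transform for a skewed feature and standard scaling (synthetic data):

# Two common transformations: log1p for a skewed feature, StandardScaler for scale.
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
incomes = rng.lognormal(mean=10, sigma=1, size=(500, 1))  # right-skewed feature

logged = np.log1p(incomes)                        # compresses the long right tail
scaled = StandardScaler().fit_transform(logged)   # rescales to mean 0, std 1

print(round(scaled.mean(), 3), round(scaled.std(), 3))  # ~0.0 and ~1.0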
LA
Q1) What are the different types of data distributions in Machine Learning?
Ans:
In machine learning, different types of data distributions are important to understand for effective model
development. The most common types include:
1. Normal Distribution: Also called Gaussian distribution, where data points are symmetrically
distributed around the mean. It's shaped like a bell curve.
2. Uniform Distribution: All outcomes have an equal chance of occurring. The probability is
constant, leading to a flat, horizontal distribution.
3. Skewed Distribution: In skewed data, most of the data points are located on one side, either
left-skewed (negative skew) or right-skewed (positive skew).
4. Binomial Distribution: Used for binary data, representing the probability of success or failure
(e.g., heads/tails in coin tosses).
5. Poisson Distribution: Describes the number of events occurring within a fixed interval of time or
space, assuming the events happen with a known constant rate.
6. Exponential Distribution: Used to model the time between events in a Poisson process, where
events occur continuously and independently.
Understanding the data distribution helps in selecting the right algorithms and transformation
techniques to improve model performance.
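Each of these distributions can be sampled with NumPy to inspect its shape (the parameters below are arbitrary choices):

# Drawing samples from the six distributions listed above (arbitrary parameters).
import numpy as np

rng = np.random.default_rng(0)
samples = {
    "normal":      rng.normal(loc=0.0, scale=1.0, size=10_000),
    "uniform":     rng.uniform(low=0.0, high=1.0, size=10_000),
    "skewed":      rng.lognormal(mean=0.0, sigma=1.0, size=10_000),  # right-skewed
    "binomial":    rng.binomial(n=10, p=0.5, size=10_000),
    "poisson":     rng.poisson(lam=3.0, size=10_000),
    "exponential": rng.exponential(scale=1.0, size=10_000),
}
for name, s in samples.items():
    print(f"{name:12s} mean={s.mean():6.2f}  std={s.std():5.2f}")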
Q2) How do you handle imbalanced data in Machine Learning?
Ans:
Handling imbalanced data in machine learning involves adjusting the dataset or algorithm to ensure the
minority class is well-represented. Common approaches include:
1. Resampling Techniques:
o Oversampling: Replicating instances of the minority class to balance the class
distribution (e.g., using SMOTE - Synthetic Minority Over-sampling Technique).
o Undersampling: Reducing the number of majority class instances to match the minority
class, which might lead to loss of information.
2. Use of Different Evaluation Metrics: Instead of accuracy, metrics like Precision, Recall, F1-score,
and ROC-AUC are used to evaluate the model’s performance on imbalanced datasets.
3. Cost-Sensitive Learning: Assigning a higher misclassification cost to the minority class helps the
algorithm focus more on correctly classifying it. Algorithms such as decision trees and SVMs can
be adapted for this purpose.
4. Ensemble Methods: Techniques like Random Forest or Boosting (e.g., AdaBoost, XGBoost) can
be helpful as they build multiple models and can adjust for class imbalance.
5. Synthetic Data Generation: Using techniques like SMOTE to synthetically generate new instances
for the minority class, thus balancing the dataset.
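A sketch of the SMOTE route using the imbalanced-learn package (assumed installed) on a synthetic 95/5 dataset:

# Rebalancing a 95/5 synthetic dataset with SMOTE (imbalanced-learn assumed installed).
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

X, y = make_classification(n_samples=1000, weights=[0.95, 0.05], random_state=0)
print("before:", Counter(y))              # heavily skewed toward class 0

X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print("after: ", Counter(y_res))          # synthetic minority samples even it out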
Q3) Explain the filter method of feature selection?
Ans:
Filter feature selection is a method used to select relevant features (variables) before training a machine
learning model, independent of the model. It evaluates each feature individually using statistical
techniques and ranks them based on their relevance.
1. Methods Used:
o Correlation Coefficient: Measures the correlation between each feature and the target
variable.
o Chi-Square Test: Tests the dependence between categorical features and the target
variable.
o Variance Threshold: Removes features with low variance, assuming they don’t carry
useful information.
2. Advantages:
o Fast and computationally efficient because it does not involve running the machine
learning model.
3. Disadvantages:
o Since it’s independent of the model, it doesn’t account for interactions between features
that might impact model performance.
Filter methods are useful when working with high-dimensional datasets, where reducing feature space
can enhance model training and interpretation.
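These filter techniques map onto scikit-learn utilities; a sketch on synthetic data (the F-test stands in for correlation-style scoring here, since chi-square requires non-negative features):

# Filter-style selection with scikit-learn: variance threshold + univariate scoring.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, VarianceThreshold, f_classif

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

X_var = VarianceThreshold(threshold=0.1).fit_transform(X)  # drop near-constant columns
print(X_var.shape)

# Score each feature against the target independently and keep the best 5.
selector = SelectKBest(score_func=f_classif, k=5).fit(X, y)
print(selector.get_support())  # boolean mask of the kept features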
Q4) Explain the wrapper method of feature selection?
Ans:
Wrapper feature selection is a method that uses a machine learning model to evaluate the importance
of features by assessing their impact on the model’s performance. It wraps the feature selection process
around the model training process.
1. Approaches:
o Forward Selection: Starts with no features and iteratively adds features that improve
model performance until no further improvement is observed.
o Backward Elimination: Starts with all features and iteratively removes the least
important features.
o Recursive Feature Elimination (RFE): Works by recursively removing the least important
feature, based on the model’s performance, until the optimal set of features is found.
2. Advantages:
o Model-specific, meaning it finds the feature set that gives the best performance for a
specific model.
o It takes into account feature interactions, which can be beneficial for complex datasets.
3. Disadvantages:
o Computationally expensive, since the model must be trained and evaluated repeatedly for many candidate feature subsets.
Wrapper methods provide more accurate feature selection but at the cost of higher computation, making them suitable when accuracy is prioritized over speed.
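For instance, recursive feature elimination is available directly in scikit-learn (synthetic data; the choice of 4 features is arbitrary):

# Wrapper-style selection: RFE repeatedly drops the weakest feature for this model.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=10, n_informative=4,
                           random_state=0)

rfe = RFE(estimator=LogisticRegression(max_iter=1000), n_features_to_select=4)
rfe.fit(X, y)
print(rfe.support_)   # True for the 4 features kept
print(rfe.ranking_)   # 1 = selected; larger numbers were eliminated earlier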
Q5) Explain the embedded method of feature selection?
Ans:
Embedded feature selection integrates the feature selection process into the training of the machine
learning model itself. The model determines which features contribute most to its performance during
the learning process.
1. Methods Used:
o Decision Trees and Random Forests: These algorithms inherently rank feature importance based on how well features split the data at each node.
o LASSO (L1 Regularization): Penalizes coefficient sizes so that uninformative features are shrunk to exactly zero and effectively dropped.
2. Advantages:
o More efficient than wrapper methods since feature selection is done during model
training, reducing the need for multiple iterations.
o Often more accurate than filter methods because it’s specific to the model being trained.
3. Disadvantages:
o The selected features are tied to the specific model being trained, so they may not transfer well to other algorithms.
Embedded methods strike a balance between the efficiency of filter methods and the accuracy of wrapper methods, making them useful for many real-world applications.
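A sketch of both embedded flavours on synthetic data (the model settings are arbitrary choices):

# Embedded selection: importances fall out of model training itself.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=300, n_features=10, n_informative=3, random_state=0)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(forest.feature_importances_.round(3))   # impurity-based feature importances

lasso = Lasso(alpha=1.0).fit(X, y)            # L1 penalty zeroes weak coefficients
print((lasso.coef_ != 0).sum(), "features kept by Lasso")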
MOD-3
SA
Q1) Define Linear Regression?
Ans:
Linear Regression is a supervised machine learning algorithm used to model the relationship between a
dependent variable and one or more independent variables by fitting a linear equation to the data. It
predicts continuous outcomes, like predicting house prices based on size and location.
Q2) Define Logistic Regression?
Ans:
Logistic Regression is a classification algorithm used to predict a binary outcome (e.g., yes/no, 0/1) based
on one or more predictor variables. It uses the logistic function to model the probability of the target
variable belonging to a particular class.
Q3) List the techniques used in filter and wrapper feature selection?
Ans:
Filter methods:
1. Chi-Square Test
2. Correlation Coefficient
3. Variance Threshold
Wrapper methods:
1. Forward Selection
2. Backward Elimination
LA
Q1) How to detect outliers in Machine Learning?
Ans:
Outliers are data points that significantly differ from other observations in a dataset. Detecting outliers is
crucial as they can distort the training process and lead to inaccurate models. Some common methods to
detect outliers in machine learning include:
1. Statistical Methods:
o Z-Score: Measures how many standard deviations a data point is from the mean. If the Z-score is above a threshold (e.g., greater than 3), the data point is considered an outlier.
o IQR (Interquartile Range): Uses the range between the first quartile (Q1) and the third quartile (Q3). Data points outside the range [Q1 - 1.5*IQR, Q3 + 1.5*IQR] are flagged as outliers.
2. Visualization:
o Box Plot: A graphical tool that highlights outliers as points beyond the "whiskers."
o Scatter Plot: Can visually reveal data points that are far from others in a two-dimensional space.
3. Distance-Based Methods:
o Euclidean Distance: In high-dimensional data, calculating the distance from a data point
to its neighbors. If the distance is much greater than that of other points, it may be an
outlier.
4. Model-Based Methods:
o DBSCAN (Density-Based Spatial Clustering): Classifies data points that do not belong to
a cluster as outliers.
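The Z-score and IQR rules take only a few lines of NumPy (one outlier is planted in otherwise normal synthetic data):

# Flagging outliers with the Z-score and IQR rules described above.
import numpy as np

rng = np.random.default_rng(0)
data = np.append(rng.normal(loc=50, scale=5, size=200), [120.0])  # one planted outlier

# Z-score rule: more than 3 standard deviations from the mean.
z = (data - data.mean()) / data.std()
print("z-score outliers:", data[np.abs(z) > 3])

# IQR rule: outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
mask = (data < q1 - 1.5 * iqr) | (data > q3 + 1.5 * iqr)
print("IQR outliers:", data[mask])  # the planted point, possibly a borderline one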
Q2) Explain Logistic Regression with a practical example?
Ans:
Logistic Regression is a supervised machine learning algorithm used for binary classification problems. It
predicts the probability of an outcome that has two possible values (e.g., 0 or 1). The logistic function
(sigmoid) is used to map predicted values to a range of 0 to 1.
Practical Example:
Consider classifying emails as spam or not spam, where each email is described by features such as the frequency of suspicious words and the number of links, and labeled spam (1) or not spam (0).
By training a logistic regression model on this dataset, it learns the relationship between the input
features and the probability of an email being spam. After training, the model can predict the probability
of a new email being spam, and if the probability exceeds a threshold (e.g., 0.5), the email is classified as
spam.
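A sketch of that workflow on a tiny invented dataset (the two feature columns, counts of suspicious words and links, are hypothetical):

# Toy spam classifier; the two feature columns are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

#           [suspicious_words, num_links]        label: 1 = spam, 0 = not spam
X = np.array([[8, 5], [7, 4], [9, 6], [1, 0], [0, 1], [2, 0], [6, 5], [1, 1]])
y = np.array([1, 1, 1, 0, 0, 0, 1, 0])

model = LogisticRegression().fit(X, y)
prob_spam = model.predict_proba([[5, 3]])[0, 1]  # probability for a new email
print(f"P(spam) = {prob_spam:.2f}")              # classify as spam if above 0.5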
Q3) Explain Linear Regression with a practical example?
Ans:
Linear Regression is a supervised learning algorithm used to predict a continuous output based on input
features. It models the relationship between the dependent variable (Y) and one or more independent
variables (X) by fitting a linear equation to the data.
Practical Example:
Suppose we predict house prices from house size. After training, the model learns the linear equation Price = 300 * Size + 50,000, where 300 is the learned price per square foot and 50,000 is the intercept (base price).
For example, if the house size is 2,000 square feet, the predicted price would be:
Price = 300 * 2000 + 50,000 = $650,000.
Linear regression finds the best-fitting line by minimizing the error between predicted and actual values,
often using the least squares method.
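The same equation can be recovered with scikit-learn from synthetic (size, price) pairs generated by exactly that rule:

# Recovering Price = 300 * Size + 50,000 from synthetic (size, price) pairs.
import numpy as np
from sklearn.linear_model import LinearRegression

sizes = np.array([[1000], [1500], [2000], [2500], [3000]])  # square feet
prices = 300 * sizes.ravel() + 50_000                       # the rule from the text

model = LinearRegression().fit(sizes, prices)
print(model.coef_[0], model.intercept_)   # ~300 and ~50,000
print(model.predict([[2000]]))            # ~650,000, matching the example above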
Q4) Explain the differences between Linear Regression and Logistic Regression?
Ans:
Linear Regression and Logistic Regression are both supervised learning algorithms, but they differ in
purpose and approach:
1. Purpose:
o Linear Regression: Used for predicting a continuous numerical value (e.g., predicting
house prices).
o Logistic Regression: Used for binary classification problems (e.g., predicting whether an
email is spam or not).
2. Output:
o Linear Regression: Produces a continuous, unbounded numerical value.
o Logistic Regression: Produces a probability between 0 and 1, which is thresholded into a class label.
3. Equation:
o Linear Regression: Uses the straight-line equation Y = mX + b.
o Logistic Regression: Uses the sigmoid function P(Y=1) = 1 / (1 + e^-(mX + b)) to map the output to a probability.
4. Use Case:
o Linear Regression: Used for regression tasks, like forecasting sales or estimating prices.
o Logistic Regression: Used for classification tasks, like disease prediction (yes/no), or fraud detection.
5. Loss Function:
o Linear Regression: Uses Mean Squared Error (MSE) to minimize the difference between
predicted and actual values.
o Logistic Regression: Uses Log Loss (Cross-Entropy) to measure the difference between
predicted probabilities and actual class labels.
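The output difference in point 2 is easy to see numerically: the same score mX + b is unbounded as a linear output but squashed into (0, 1) by the sigmoid (slope and intercept below are arbitrary example values):

# Same score mX + b, read two ways: raw value vs. sigmoid probability.
import numpy as np

m, b = 2.0, -1.0                      # example slope and intercept
x = np.array([-3.0, 0.0, 0.5, 3.0])
score = m * x + b                     # linear regression output: unbounded
prob = 1 / (1 + np.exp(-score))       # logistic regression output: in (0, 1)
print(score)                          # [-7. -1.  0.  5.]
print(prob.round(3))                  # [0.001 0.269 0.5   0.993]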
Q5) What are the various sources of data used in Machine Learning?
Ans:
Data is the core component of machine learning. The various sources of data used for training and
building machine learning models include:
1. Public Datasets:
o Many organizations provide datasets for research purposes. Popular sources include
Kaggle, UCI Machine Learning Repository, and Google Dataset Search.
2. Web Scraping:
o Data can be gathered from websites using scraping tools like BeautifulSoup or Scrapy. This data is often used for applications such as price comparisons, sentiment analysis, and news aggregation (a minimal scraping sketch appears at the end of this answer).
3. User-Generated Data:
o Social media platforms, forums, and review sites generate vast amounts of data. For
example, Twitter, Amazon, and Reddit provide text data used for natural language
processing (NLP).
4. Transactional Data:
o This includes data generated through e-commerce transactions, financial records, or any
system that logs interactions. It’s commonly used for recommendation systems and
fraud detection.
5. Sensor and IoT Data:
o Devices equipped with sensors (e.g., smartwatches, medical devices) generate real-time data used in predictive maintenance, health monitoring, and smart city applications.
Machine learning practitioners choose data sources based on the problem being solved, data availability,
and the structure of the data (e.g., structured or unstructured).
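A minimal scraping sketch with requests and BeautifulSoup (the URL and the markup are placeholders; real scraping should respect a site's robots.txt and terms of service):

# Minimal scraping sketch; the URL and the h2.title markup are placeholders.
import requests
from bs4 import BeautifulSoup

response = requests.get("https://example.com/products", timeout=10)
soup = BeautifulSoup(response.text, "html.parser")

# Hypothetical markup: each product name sits in an <h2 class="title"> tag.
for tag in soup.find_all("h2", class_="title"):
    print(tag.get_text(strip=True))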