
Machine Learning Unit I

The document provides an overview of Machine Learning, detailing its definition, types, and various algorithms used in supervised, unsupervised, and reinforcement learning. It discusses the importance of functions, relations, and probabilistic models in the machine learning process, emphasizing their roles in building predictive models and understanding data patterns. Additionally, it covers specific algorithms and their applications, illustrating the practical aspects of machine learning in real-world scenarios.

Uploaded by

suganyaphd20

KPR College of Arts Science and Research

Machine Learning - Unit I

Algorithmic models of learning


Learning classifiers
Functions
Relations
Grammars
Probabilistic models
Value functions
Behaviors and programs from experience
Bayesian, maximum a posteriori, and minimum description
length frameworks

III CSDA - Machine Learning


Introduction to Machine Learning


What is Machine Learning
Humans learn from their experiences, whereas computers and machines
traditionally do only what we instruct them to do.
But can a machine also learn from experience or past data, as a human does?
This is where machine learning comes in.

 Machine learning is a subset of artificial intelligence.
 It focuses primarily on the creation of algorithms that enable a computer to
independently learn from data and previous experiences.
 Arthur Samuel first used the term "machine learning" in 1959.
 Without being explicitly programmed, machine learning enables a machine to:
 Automatically learn from data
 Improve performance from experience
 Make predictions
 Machine learning algorithms create a mathematical model that, without being
explicitly programmed, aids in making predictions or decisions with the
assistance of sample historical data, or training data.
 For the purpose of developing predictive models, machine learning brings
together statistics and computer science.
 In machine learning, algorithms that learn from historical data are either
constructed or reused.
 Performance generally improves as more relevant data is provided, although
not necessarily in strict proportion.

How does Machine Learning work


 A machine learning system builds prediction models, learns from previous data,
and predicts the output of new data whenever it receives it.
 More training data generally helps to build a better model, which in turn
improves the accuracy of the predicted output.

Features of Machine Learning


 Machine learning uses data to detect various patterns in a given dataset.
 It can learn from past data and improve automatically.
 It is a data-driven technology.
 Machine learning is similar to data mining in that it also deals with large
amounts of data.
Definition of learning:
 A computer program is said to learn from experience E with respect to some
class of tasks T and performance measure P, if its performance at tasks T, as
measured by P, improves with experience E.
Example 1 : Handwriting recognition learning problem
 Task T : Recognizing and classifying handwritten words within images
 Performance P : Percent of words correctly classified
 Training experience E : A dataset of handwritten words with given classifications
Example 2 : A robot driving learning problem
 Task T : Driving on highways using vision sensors
 Performance P : Average distance travelled before an error
 Training experience E : A sequence of images and steering commands recorded while
observing a human driver

 Different machine learning models


 There are many machine learning models, and almost all of them are based on
certain machine learning algorithms.
 Popular classification and regression algorithms fall under supervised machine
learning, and clustering algorithms are generally deployed in unsupervised
machine learning scenarios.
 Supervised Machine Learning
 Logistic Regression: Logistic regression is used to determine whether an input
belongs to a certain group or not.
 SVM: Support Vector Machines (SVMs) represent each object as a point in an
n-dimensional space and use a hyperplane to separate the objects into classes
based on their features.
 Naive Bayes: Naive Bayes is an algorithm that assumes independence among
features and uses probability to classify objects based on those features.
 Decision Trees: Decision trees are also classifiers. They determine what
category an input falls into by traversing the nodes of a tree from the root
down to a leaf.

 Linear Regression: Linear regression is used to identify relationships between
the variable of interest and the inputs, and predict its values based on the
values of the input variables.
 kNN: The k Nearest Neighbors technique finds the k closest objects to a query
point in a dataset and predicts the most frequent class (for classification) or
the average value (for regression) among those objects.
 Random Forest: Random forest is a collection of many decision trees from
random subsets of the data, resulting in a combination of trees that may be
more accurate in prediction than a single decision tree.
 Boosting algorithms: Boosting algorithms, such as Gradient Boosting Machine,
XGBoost, and LightGBM, use ensemble learning. They combine the predictions
of many weak models trained in sequence, each one correcting the errors of the
models before it.
 Unsupervised Machine Learning
 K-Means: The K-Means algorithm finds similarities between objects and groups
them into K different clusters.
 Hierarchical Clustering: Hierarchical clustering builds a tree of nested clusters
without having to specify the number of clusters.
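
As an illustration of one of the supervised models listed above, here is a minimal sketch of kNN classification in plain Python. The function name `knn_predict`, the toy points, and the choice of Euclidean distance with k = 3 are assumptions made for this example, not part of the original slides.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training indices by Euclidean distance to the query point.
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    nearest_labels = [train_labels[i] for i in order[:k]]
    # The most frequent label among the k neighbours is the prediction.
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data: two small groups of 2-D points (values invented for illustration).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (5.1, 5.0)))
```

For regression, the majority vote would be replaced by the average of the neighbours' target values.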

Learning Classification:
Machine learning can be classified into three types:
 Supervised learning
 Unsupervised learning
 Reinforcement learning
 Supervised learning
 In supervised learning, sample labeled data are provided to the machine
learning system for training, and the system then predicts the output based on
the training data.
 The mapping of the input data to the output data is the objective of supervised
learning.
 Spam filtering is an example of supervised learning.

 Supervised learning can be grouped further in two categories of algorithms:
 Classification
 Regression
 Supervised machine learning is widely deployed in image recognition, utilizing a
technique called classification.
 Supervised machine learning is also used in predicting demographics such as
population growth or health metrics, utilizing a technique called regression.

 Unsupervised learning
 Unsupervised learning is a learning method in which a machine learns without
any supervision.
 The training is provided to the machine with the set of data that has not been
labeled, classified, or categorized, and the algorithm needs to act on that data
without any supervision.
 The goal of unsupervised learning is to restructure the input data into new
features or a group of objects with similar patterns.
 It can be further classified into two categories of algorithms:
 Clustering
 Association
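
To make the clustering category concrete, the following is a small sketch of the K-Means idea on one-dimensional data. The function name `k_means_1d`, the starting centers, and the data values are hypothetical choices for this example; practical implementations work on multi-dimensional data and choose starting centers more carefully.

```python
def k_means_1d(values, centers, iterations=10):
    """Run a basic K-Means loop on 1-D data, starting from the given centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


data = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
centers, clusters = k_means_1d(data, centers=[0.0, 10.0])
print(centers)
print(clusters)
```

No labels are used anywhere: the grouping comes only from the distances between the unlabeled values.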

 Reinforcement Learning
 Reinforcement learning is a feedback-based learning method, in which a
learning agent gets a reward for each right action and gets a penalty for each
wrong action.
 The agent learns automatically from this feedback and improves its
performance.
 In reinforcement learning, the agent interacts with the environment and
explores it.
 The goal of an agent is to get the most reward points, and hence, it improves its
performance.
 A robotic dog that automatically learns how to move its limbs is an
example of reinforcement learning.

 Functions in Machine Learning


 Functions play a central role in machine learning, serving various purposes
throughout the different stages of the machine learning pipeline.
 Objective Function (Loss Function):
 The objective function, also known as the loss function, quantifies the
difference between the predicted output and the actual target values.
 Activation Functions:
 Activation functions introduce non-linearity to the model, allowing it to learn
complex patterns.
 Common activation functions include the sigmoid, tanh, and rectified linear
unit (ReLU).
 These functions are applied to the weighted sum of inputs of each neuron in a
neural network to produce the neuron's output.
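
The three activation functions named above can be sketched in a few lines of Python. The helper `neuron` and its example weights are assumptions added for illustration only.

```python
import math


def sigmoid(z):
    """Squash z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squash z into the range (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Pass positive values through and clip negative values to zero."""
    return max(0.0, z)


def neuron(inputs, weights, bias, activation):
    """Apply an activation function to the weighted sum of the inputs."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


print(sigmoid(0.0), tanh(0.0), relu(-2.0))
print(neuron([1.0, 2.0], [0.5, 0.25], 0.0, sigmoid))
```

Without a non-linear activation, stacking several such neurons would still compute only a linear function of the inputs.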

 Cost Function:
 The cost function is a broader term that encompasses the objective function
but may also include additional regularization terms.
 It represents the overall cost associated with the model's predictions and is
used during the training process to adjust the model parameters.
 Model Functions (Predictive Functions):
 The model function represents the learned relationship between the input
features and the output.
 In supervised learning, this function is what the model uses to make
predictions on new, unseen data.
 Optimization Function:
 Optimization functions are used to adjust the model parameters during the
training process to minimize the objective function.
 Gradient descent is a common optimization algorithm that iteratively updates
the model parameters in the direction that reduces the value of the objective
function.
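
The interplay of model function, objective function, and optimization can be sketched with gradient descent on a straight-line model. The mean squared error loss, the learning rate, the step count, and the toy data below are assumptions chosen for this example.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction errors of the model function under the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient to reduce the objective function.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (invented for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

If the learning rate is set too high, the updates overshoot and the loss grows instead of shrinking, which is why it is usually tuned.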

 Evaluation Metrics:
 Evaluation metrics are functions used to assess the performance of a machine
learning model.
 Examples include accuracy, precision, recall, F1 score, and area under the
receiver operating characteristic (ROC) curve.
 Decision Functions:
 In classification tasks, decision functions determine the class assignment for a
given input.
 These functions often apply a threshold to the predicted probability for
binary decisions, or pick the most probable class for multiclass decisions.
 Transformation Functions:
 Feature transformation functions are used to preprocess and transform input
data.
 This includes tasks such as normalization, standardization, and encoding
categorical variables.
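
As a sketch of how a decision function and evaluation metrics fit together, the code below thresholds predicted probabilities and then computes accuracy, precision, recall, and F1 score. The function names, the 0.5 threshold, and the sample values are assumptions for illustration.

```python
def decide(probabilities, threshold=0.5):
    """Binary decision function: predict class 1 when the probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


def evaluate(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1


# Invented predicted probabilities and true labels.
probabilities = [0.9, 0.8, 0.4, 0.3, 0.7, 0.2]
y_true = [1, 1, 1, 0, 0, 0]
y_pred = decide(probabilities)
accuracy, precision, recall, f1 = evaluate(y_true, y_pred)
print(y_pred)
print(accuracy, precision, recall, f1)
```

Raising the threshold typically trades recall for precision, so the threshold is usually chosen with the application's error costs in mind.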

 Regularization Functions:
 Regularization functions, such as L1 and L2 regularization, are used to prevent
overfitting by penalizing large weights in the model.
 These functions are added to the cost function during training.
 Kernel Functions:
 Kernel functions are used in support vector machines (SVMs) and other
kernelized algorithms.
 They compute the similarity between pairs of data points as if the points had
been mapped into a higher-dimensional space, allowing linear models to capture
non-linear relationships.
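
A minimal sketch of both ideas follows: an L2 penalty added to a data loss, and a radial basis function (RBF) kernel as one commonly used kernel. The function names and the parameters `lam` and `gamma` are assumptions introduced for this example.

```python
import math


def l2_penalty(weights, lam):
    """L2 regularization term: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)


def regularized_cost(data_loss, weights, lam):
    """Cost function = data loss plus the regularization penalty."""
    return data_loss + l2_penalty(weights, lam)


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel: similarity decays with the squared distance between x and y."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


print(regularized_cost(1.0, [3.0, 4.0], lam=0.1))
print(rbf_kernel((0.0, 0.0), (0.0, 0.0)))
print(rbf_kernel((0.0, 0.0), (1.0, 0.0)))
```

Larger weights increase the penalty, so minimizing the regularized cost pushes the model toward smaller weights.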

 Relations in Machine Learning


 Feature Relations:
 In supervised learning, the relationships between input features and the target
variable are crucial.
 Correlation:
 Correlation measures the statistical association between two variables. In the
context of machine learning, understanding the correlation between features
can be important for feature selection and model interpretation.
 Dependency:
 Dependencies between variables indicate how changes in one variable affect
another. This is especially relevant in probabilistic graphical models and
Bayesian networks.
 Model Relations:
 In ensemble learning, different models may be combined to improve overall
performance. Understanding the relationships and interactions between
individual models is key to leveraging the benefits of ensemble methods.
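
The correlation between two features can be sketched with the Pearson correlation coefficient, one common measure of linear association. The function name `pearson` and the sample values are assumptions for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


feature = [1.0, 2.0, 3.0, 4.0]
print(pearson(feature, [2.0, 4.0, 6.0, 8.0]))   # rises with the feature
print(pearson(feature, [8.0, 6.0, 4.0, 2.0]))   # falls as the feature rises
```

Two features with a coefficient close to +1 or -1 carry largely the same linear information, which is one reason correlation is used in feature selection.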

 Data Relations:
 Relationships within the dataset, such as temporal dependencies or spatial
correlations, can be essential for certain types of models.
 For example, time series models rely on the temporal relationship between
data points.
 Class Relations:
 In classification tasks, understanding the relationships between different classes
is important.
 This can involve exploring class imbalances, hierarchical relationships, or
dependencies between classes.
 Hyperparameter Relations:
 The performance of a machine learning model can be influenced by
hyperparameters.
 Understanding how changes in hyperparameters affect model performance is
crucial for hyperparameter tuning.

 Interpretability:
 Interpretable models allow us to understand the relationships and decision-
making processes within the model. This is particularly important in
applications where model interpretability is a requirement, such as in
healthcare or finance.
 Understanding and analyzing these different relations are fundamental to
building effective and interpretable machine learning models. The specific
context and type of machine learning problem will determine which
relationships are most relevant to consider.

 Grammar in Machine Learning


 Grammar in the context of machine learning often refers to the structure and
rules governing the representation and manipulation of data or language within
a given model or system.
 Natural Language Processing (NLP):
 In NLP, grammar is crucial for tasks such as syntax parsing, where the goal is to
understand the grammatical structure of sentences.
 Grammar rules help machines recognize the relationships between words in a
sentence.
 Grammar-based Models:
 Some machine learning models, particularly in the field of language processing,
are based on grammatical rules.
 For example, probabilistic context-free grammars (PCFGs) or dependency
grammars can be used to model the syntactic structure of sentences.
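
To illustrate the PCFG idea, the sketch below scores a parse tree as the product of the probabilities of the rules it uses. The toy grammar, its probabilities, and the tuple-based tree encoding are all invented for this example.

```python
# Toy PCFG: (left-hand side, right-hand side) -> rule probability.
# For each left-hand side, the probabilities sum to 1.
RULES = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("she",)): 0.4,
    ("NP", ("fish",)): 0.6,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("V",)): 0.3,
    ("V", ("eats",)): 1.0,
}


def tree_probability(tree):
    """Probability of a parse tree: the product of the rules used to build it."""
    label, *children = tree
    # A child is either a word (string) or a subtree whose first item is its label.
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    probability = RULES[(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            probability *= tree_probability(child)
    return probability


# Parse tree for the sentence "she eats fish".
parse = ("S", ("NP", "she"), ("VP", ("V", "eats"), ("NP", "fish")))
print(tree_probability(parse))
```

When a sentence has several possible parses, a parser built on such a grammar can prefer the one with the highest probability.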

 Text Generation:
 In text generation tasks, grammar is essential for generating coherent and
grammatically correct sentences. This involves training models to understand
and reproduce the grammatical structures found in a given language.
 Grammar Checking:
 Machine learning is used in grammar checking tools to identify and correct
grammatical errors in written text. These models learn from large datasets to
recognize patterns of correct grammar and flag potential errors.
 Code Parsing:
 In machine learning for programming languages, models may be designed to
understand the grammar of code. This can involve parsing code to identify its
syntactic structure, which is useful for tasks like code completion or bug
detection.

 Structured Data:
 In structured data analysis, understanding the grammar of the data is crucial.
 This involves recognizing patterns and relationships within the data, such as the
grammar of a database schema or the structure of time series data.
 Grammar-based Models for Image Generation:
 In image generation tasks, grammar can be used to represent hierarchical
structures.
 For example, in scenes with multiple objects, a grammar-based model might
capture the relationships between objects and their spatial arrangement.
 Reinforcement Learning Policies:
 In reinforcement learning, policies can be thought of as a kind of grammar for
decision-making.
 Agents learn policies that dictate their actions based on the current state, and
the quality of these policies is crucial for the success of the learning process.

 Probabilistic Models
 Probabilistic models in machine learning use probability distributions to
represent and reason about uncertainty in data and predictions.
 These models are particularly useful when dealing with incomplete or noisy
data, as they provide a principled way to express uncertainty and make
predictions in the presence of variability.
 Types of probabilistic models used in machine learning:
 Probabilistic Graphical Models (PGMs)
 Bayesian Models
 Gaussian Processes
 Hidden Markov Models (HMMs)
 Mixture Models
 Dirichlet Process Models
 Variational Autoencoders (VAEs)
 Monte Carlo Methods
 Conditional Random Fields (CRFs)

 Probabilistic Graphical Models (PGMs):


 PGMs are a powerful framework that combines graph theory and probability
theory to model complex relationships and dependencies among variables.
 Examples include Bayesian Networks and Markov Random Fields.
 Bayesian Models:
 Bayesian models use Bayes' theorem to update beliefs about parameters or
hypotheses based on observed data.
 Bayesian models include Bayesian linear regression, Bayesian neural networks,
and Bayesian hierarchical models.
 Gaussian Processes:
 Gaussian processes are a non-parametric approach to modeling functions.
 They define a distribution over functions and provide a flexible framework for
regression and classification tasks.

 Hidden Markov Models (HMMs):


 HMMs are a type of probabilistic model used for sequential data where the
underlying system is assumed to be a Markov process with hidden states. They
have applications in speech recognition, natural language processing, and
bioinformatics.
 Mixture Models:
 Mixture models assume that the data is generated from a mixture of several
probability distributions.
 Mixture models are useful for capturing data made up of several
subpopulations; the Gaussian mixture model is a common example.
 Dirichlet Process Models:
 Dirichlet processes are used in non-parametric Bayesian models to model
distributions over an unknown number of components.
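
For the Hidden Markov Models described above, a standard computation is the forward algorithm, which gives the probability of an observation sequence by summing over all hidden state paths. The two-state weather model and all of its probabilities below are invented for illustration.

```python
# Hypothetical HMM: hidden weather states emit observable activities.
START = {"Rainy": 0.6, "Sunny": 0.4}
TRANS = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}
EMIT = {
    "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}


def forward(observations):
    """Probability of the observation sequence under the HMM (forward algorithm)."""
    # alpha[s] = probability of the observations so far, ending in hidden state s.
    alpha = {s: START[s] * EMIT[s][observations[0]] for s in START}
    for obs in observations[1:]:
        alpha = {
            s: EMIT[s][obs] * sum(alpha[prev] * TRANS[prev][s] for prev in alpha)
            for s in START
        }
    return sum(alpha.values())


print(forward(["walk", "shop"]))
```

The hidden states are never observed directly; only the activities are, which is what makes the model "hidden".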

 Variational Autoencoders (VAEs):


 VAEs are a type of generative model that combines variational inference and
deep learning.
 They model the underlying data distribution by mapping input data into a
latent space, allowing for the generation of new samples.
 Monte Carlo Methods:
 Monte Carlo methods, such as Markov Chain Monte Carlo (MCMC), are used
for sampling from complex probability distributions.
 Conditional Random Fields (CRFs):
 CRFs model the conditional probability of a set of output variables given a set
of input variables.
 They are commonly used in structured prediction tasks, such as sequence
labelling in natural language processing.

 Value Functions
 Reinforcement Learning:
 In reinforcement learning, the value function is a critical concept used to assess
the goodness of a state or a state-action pair.
 There are two main types of value functions: the state value function (V-
function) and the action value function (Q-function).
 The state value function, denoted as V(s), represents the expected cumulative
future rewards when starting from a particular state s and following a certain
policy.
 The action value function, denoted as Q(s,a), represents the expected
cumulative future rewards when starting from a state s, taking action a, and
then following a certain policy.
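
The two value functions can be sketched on a tiny deterministic environment. The three-state chain, the reward of 1 for reaching the last state, the discount factor `GAMMA`, and the fixed "always move right" policy are all assumptions made for this example.

```python
GAMMA = 0.9      # discount factor (an assumption for this sketch)
TERMINAL = 2     # states are 0, 1, 2; state 2 ends the episode


def step(state, action):
    """Deterministic toy dynamics: return (next_state, reward)."""
    if action == "right":
        next_state = min(state + 1, TERMINAL)
    else:
        next_state = max(state - 1, 0)
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def policy(state):
    """Fixed policy being evaluated: always move right."""
    return "right"


def evaluate_policy(sweeps=50):
    """Estimate V(s) for the fixed policy by repeated one-step backups."""
    v = {0: 0.0, 1: 0.0, 2: 0.0}
    for _ in range(sweeps):
        for s in (0, 1):
            next_state, reward = step(s, policy(s))
            v[s] = reward + GAMMA * v[next_state]
    return v


def q_value(v, state, action):
    """Q(s, a): take `action` once, then follow the policy afterwards."""
    next_state, reward = step(state, action)
    return reward + GAMMA * v[next_state]


v = evaluate_policy()
print(v)
print(q_value(v, 0, "right"), q_value(v, 0, "left"))
```

Comparing Q(s, a) across actions shows which action is better in a state, which is how an agent can improve its policy.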

 Supervised Learning - Value Prediction:
 In some regression problems, the term "value function" might refer to the
function that predicts a continuous output variable (value) based on input
features.
 For example, in a real estate prediction task, the value function could predict
the price of a house based on features like square footage, location, and
number of bedrooms.
 Cost or Loss Function:
 The term "value function" is sometimes used interchangeably with the cost
function or loss function in supervised learning.
 The cost function measures the difference between the predicted values and
the true values in the training data.
 The goal in training a model is often to minimize this value function.

 Value Function Approximation:
 In certain algorithms, particularly in function approximation methods, the value
function may be approximated using a model or a set of parameters.
 For example, in approximate dynamic programming or deep reinforcement
learning, a neural network might be used to approximate the Q-function or V-
function.

 Behaviors:
 In reinforcement learning, behaviors refer to the actions taken by an agent in
response to the state of the environment.
 The behavior of an agent is defined by its policy, which is a strategy or a set of
rules that determines how the agent selects actions in different states.
 Behaviors are crucial for learning because they affect the outcomes and
rewards the agent receives.
 Programs:
 In the context of machine learning, particularly in reinforcement learning,
"programs" can refer to the algorithms or procedures implemented by an agent
to interact with the environment.
 These programs include the decision-making process (policy) and the
mechanisms for learning and adapting over time.

 Bayesian Framework
 Maximum A Posteriori (MAP)
 Minimum Description Length (MDL)
 The Bayesian, MAP, and MDL frameworks are used in statistics and machine
learning to make decisions, estimate parameters, or perform model selection.
 Bayesian Framework:
 The Bayesian framework is based on Bayes' theorem, which describes the
probability of a hypothesis given the data.
 In the context of machine learning, Bayesian methods provide a probabilistic
approach to modeling uncertainty.
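 The prior distribution is updated with the observed data, through the
likelihood function, to give the posterior distribution.
 Bayes' theorem is expressed as follows:
$$P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)}$$
where H is the hypothesis, D is the observed data, P(H) is the prior,
P(D | H) is the likelihood, and P(H | D) is the posterior.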
 Bayesian methods are used for parameter estimation, model selection, and
uncertainty quantification.
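
A small numeric sketch of a Bayesian update, using a spam filtering setting: given that a message contains a certain word, how likely is it to be spam? The function name `posterior` and all probabilities are invented for illustration.

```python
def posterior(prior, likelihood, likelihood_if_not):
    """Posterior probability of a hypothesis via Bayes' theorem.

    prior             : P(H), belief in the hypothesis before seeing the data
    likelihood        : P(D | H), probability of the data if H is true
    likelihood_if_not : P(D | not H), probability of the data if H is false
    """
    # Evidence P(D), summed over both cases (H true, H false).
    evidence = likelihood * prior + likelihood_if_not * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 20% of mail is spam; the word appears in 60% of spam
# and in 5% of legitimate mail.
p_spam_given_word = posterior(prior=0.2, likelihood=0.6, likelihood_if_not=0.05)
print(p_spam_given_word)
```

Seeing the word raises the belief that the message is spam from the prior of 0.2 to a much higher posterior, which is the sense in which data updates beliefs.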

 Maximum A Posteriori (MAP)
 MAP estimation is a point estimate method within the Bayesian framework.
 It seeks to find the point in the parameter space that maximizes the posterior
distribution.
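 Mathematically, it can be expressed as follows:
$$\hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta} P(\theta \mid D) = \arg\max_{\theta} P(D \mid \theta)\, P(\theta)$$
where \theta denotes the parameters and D the observed data.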
 This framework is particularly useful when you want a single point estimate for
the parameters, combining information from the likelihood and the prior.
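
As a sketch of MAP estimation, consider estimating the probability of heads of a coin from observed flips with a Beta(a, b) prior. The coin setting, the prior parameters, and the function names are assumptions for this example; the closed form used here holds for a, b > 1.

```python
import math


def log_posterior(theta, heads, flips, a, b):
    """Unnormalized log posterior: log likelihood plus log Beta(a, b) prior."""
    log_likelihood = heads * math.log(theta) + (flips - heads) * math.log(1.0 - theta)
    log_prior = (a - 1.0) * math.log(theta) + (b - 1.0) * math.log(1.0 - theta)
    return log_likelihood + log_prior


def map_estimate_grid(heads, flips, a, b, grid_size=1000):
    """MAP estimate by searching a grid of candidate values strictly inside (0, 1)."""
    candidates = [i / grid_size for i in range(1, grid_size)]
    return max(candidates, key=lambda t: log_posterior(t, heads, flips, a, b))


def map_estimate_closed_form(heads, flips, a, b):
    """Mode of the Beta posterior (valid when a > 1 and b > 1)."""
    return (heads + a - 1.0) / (flips + a + b - 2.0)


heads, flips = 7, 10
print(heads / flips)                                   # maximum likelihood estimate
print(map_estimate_closed_form(heads, flips, 2.0, 2.0))
print(map_estimate_grid(heads, flips, 2.0, 2.0))
```

The prior pulls the MAP estimate from the raw frequency toward 0.5; with more flips the likelihood dominates and the two estimates move closer together.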

 Minimum Description Length (MDL) Framework:
 MDL is a principle of model selection that balances the complexity of a model and its
ability to explain the data.
 The idea is to choose the model that provides the most concise description of the data.
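 MDL can be expressed as the sum of the coding length of the model and the coding
length of the data given the model:
$$L(M, D) = L(M) + L(D \mid M)$$
where L(M) is the number of bits needed to describe the model M, and L(D | M) is the
number of bits needed to describe the data D with the help of M.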
 The MDL framework penalizes overly complex models, promoting a trade-off between
model fit and simplicity.
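
The MDL trade-off can be sketched by comparing two models of a 0/1 sequence: a fair coin with no free parameters, and a biased coin with one estimated parameter. Charging 0.5 * log2(n) bits for the parameter is a common approximation and, like the function names and data, an assumption of this example.

```python
import math


def data_bits(sequence, p):
    """L(D | M): bits needed to encode a 0/1 sequence under a Bernoulli(p) model."""
    return -sum(math.log2(p if x == 1 else 1.0 - p) for x in sequence)


def description_length(sequence, model):
    """Total description length L(M) + L(D | M) in bits for the named model."""
    n = len(sequence)
    if model == "fair":
        # No free parameters, so the model cost is taken as 0 bits.
        return 0.0 + data_bits(sequence, 0.5)
    # "biased": one parameter estimated from the data, charged 0.5 * log2(n) bits.
    p_hat = sum(sequence) / n
    p_hat = min(max(p_hat, 1e-9), 1.0 - 1e-9)  # keep the logarithms finite
    return 0.5 * math.log2(n) + data_bits(sequence, p_hat)


def select_model(sequence):
    """Pick the model with the shortest total description length."""
    return min(["fair", "biased"], key=lambda m: description_length(sequence, m))


skewed = [1] * 18 + [0] * 2
balanced = [1, 0] * 10
print(select_model(skewed))
print(select_model(balanced))
```

The extra parameter is only worth its cost when it shortens the encoding of the data by more than the bits spent on describing it.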
