
Module 1

The document provides an overview of optimization in machine learning, detailing its importance, key components, and various challenges faced during the optimization process. It elaborates on gradient descent, a popular optimization algorithm, and its types, advantages, and disadvantages. Additionally, it discusses machine learning's definition, types (supervised, unsupervised, reinforcement), and classification algorithms, highlighting their applications and challenges.

Uploaded by

awanit6202374449
Copyright
© All Rights Reserved

❖ Introduction to Optimization:

Optimization is the process of iteratively adjusting a model's internal parameters to minimize a "loss
function," which quantifies the error between the model's predictions and the actual target values.
Optimization is the heart of machine learning: without it, ML models cannot learn effectively.
The goal of optimization is to improve model performance by reducing error.
Gradient descent, stochastic gradient descent, Adam, Adagrad, and RMSprop are a few of the optimization
methods used in machine learning.
Optimization is also used well beyond ML. For example, in manufacturing, it determines the production
quantities of different products that maximize profit given limited resources like materials, labor, and machine time.
In logistics, it finds the most efficient routes for delivery trucks to minimize fuel costs and delivery times.
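
To make the idea of a loss function concrete, here is a minimal sketch using mean squared error, one common choice of loss. The function name and the sample numbers are assumptions made for illustration; they are not taken from the notes above.

```python
def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions and targets."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical target values and two sets of predictions.
targets = [3.0, 5.0, 7.0]
good_guess = [3.0, 5.0, 6.0]   # off by 1 on a single example
poor_guess = [0.0, 0.0, 0.0]   # far from every target

loss_good = mean_squared_error(good_guess, targets)
loss_poor = mean_squared_error(poor_guess, targets)
print(loss_good, loss_poor)    # the better predictions give the smaller loss
```

An optimizer's job is to move the model's parameters so that this number goes down.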

Key Components of Optimization-


• Objective Function: The main goal of optimization is to improve this function. It represents the
quantity to be maximized (e.g., profit, efficiency) or minimized (e.g., cost, error). In machine learning,
this is often the loss function.
• Parameters: The internal values within a model that are adjusted during the training process to improve
its performance.
• Variables: These are the adjustable factors or inputs that we can change to achieve the desired outcome.
• Constraints: A constraint is a condition or restriction that the solution must satisfy while optimizing. In
ML, optimization is not always unconstrained; sometimes we restrict parameters or outputs.
• Feasible Region: This is the set of all potential solutions that satisfy the constraints. Only solutions
within this region are considered valid.
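
The components above can be sketched on a tiny version of the manufacturing example. All numbers here (profits per unit, machine hours, labor hours) are hypothetical, and the exhaustive search is only practical because the problem is so small; it is a sketch of the vocabulary, not a recommended solver.

```python
from itertools import product

# Hypothetical profit per unit of products A and B.
PROFIT_A, PROFIT_B = 3, 5


def objective(a, b):
    """Objective function: total profit, to be maximized."""
    return PROFIT_A * a + PROFIT_B * b


def is_feasible(a, b):
    """Constraints: assumed limits on machine time and labor."""
    return (
        a >= 0 and b >= 0
        and a <= 4               # machine 1: 1 hour per unit of A, 4 hours available
        and 2 * b <= 12          # machine 2: 2 hours per unit of B, 12 hours available
        and 3 * a + 2 * b <= 18  # labor: 18 hours available
    )


# Variables: the production quantities (a, b).
# Feasible region: every candidate plan that satisfies all constraints.
feasible_region = [
    (a, b) for a, b in product(range(0, 11), repeat=2) if is_feasible(a, b)
]

best_plan = max(feasible_region, key=lambda plan: objective(*plan))
best_profit = objective(*best_plan)
print(best_plan, best_profit)
```

Under these assumed numbers the best plan is 2 units of A and 6 units of B, for a profit of 36.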

Importance of Optimization-
• Improved Accuracy: Optimization directly leads to more accurate predictions by reducing the error
between predicted and actual outputs.
• Generalization: The ultimate goal is to create models that perform well not just on the training data but
also on new, real-world data.
• Resource Efficiency: It helps in allocating resources effectively, reducing waste and cost.
• Better Performance: It leads to designs, processes, and systems that perform at their highest potential.
• Wide Applicability: Optimization techniques are used in virtually every industry, from business and
finance to computer science and environmental management.

Optimization Challenges-
• Overfitting: The model performs well on training data but poorly on unseen data.
• Local Minima: Optimization algorithms may converge to a suboptimal local minimum rather than
the global minimum, especially with complex, non-convex loss functions.
• Learning Rate: Choosing an optimal learning rate is a critical challenge. If the rate is too high, the
algorithm may overshoot the minimum. If it is too low, convergence can be very slow.
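• Vanishing and Exploding Gradients: In deep networks, gradients can become too small or too
large, destabilizing the training process. Remedies include gradient clipping, careful weight
initialization, and normalization layers; adaptive optimizers like Adam and RMSprop also help by
rescaling each parameter's step size.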
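
The learning-rate challenge can be seen on a very small problem. The sketch below minimizes f(x) = x² (gradient 2x, minimum at x = 0) with three learning rates. The function, the starting point, and the rates are assumptions chosen only to make the effect visible.

```python
def run_gradient_descent(learning_rate, start=1.0, steps=20):
    """Minimize f(x) = x**2 with plain gradient descent and return the final x."""
    x = start
    for _ in range(steps):
        gradient = 2 * x                  # derivative of x**2
        x = x - learning_rate * gradient  # step against the gradient
    return x


too_low = run_gradient_descent(0.001)    # barely moves away from the start
reasonable = run_gradient_descent(0.1)   # gets close to the minimum at 0
too_high = run_gradient_descent(1.1)     # overshoots more on every step

print(too_low, reasonable, too_high)
```

With the low rate, x is still near its starting value after 20 steps. With the high rate, each step jumps past the minimum and lands farther away, so the iterates diverge.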

❖ Gradient Descent:
Gradient Descent is one of the most popular optimization algorithms in machine learning and deep learning.
It is used to minimize a loss function (a measure of the difference between predicted and actual values). It works
by iteratively adjusting parameters in the direction that reduces the loss.
The update rule is,
θ_new = θ_old − η · ∇L(θ_old)
Where, θ = parameters (weights, biases)
L(θ) = loss function
∇L(θ) = gradient of the loss with respect to θ (slope)
η = learning rate (step size)
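
As a worked instance of the update rule, the sketch below applies a single step to the assumed loss L(θ) = (θ − 3)², whose gradient is 2(θ − 3). The loss, the starting value θ_old = 0, and η = 0.1 are illustrative choices, not values from the notes.

```python
def loss(theta):
    """Assumed example loss with its minimum at theta = 3."""
    return (theta - 3.0) ** 2


def gradient(theta):
    """Derivative of the example loss with respect to theta."""
    return 2.0 * (theta - 3.0)


def gradient_step(theta_old, learning_rate):
    """One application of theta_new = theta_old - eta * grad L(theta_old)."""
    return theta_old - learning_rate * gradient(theta_old)


theta_old = 0.0
theta_new = gradient_step(theta_old, learning_rate=0.1)

# The gradient at 0 is -6, so the step moves theta up by about 0.6,
# and the loss drops from 9 to roughly 5.76.
print(theta_new, loss(theta_old), loss(theta_new))
```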

Types of Gradient Descent:


• Batch Gradient Descent (BGD): It calculates the gradient using the entire training dataset in each
iteration. It produces a stable, accurate gradient and converges smoothly toward the minimum, but it can
be computationally expensive for large datasets.
• Stochastic Gradient Descent (SGD): It calculates the gradient and updates parameters using only a
single randomly selected training example per iteration. Each update is much cheaper, but the updates are noisy.
• Mini-batch Gradient Descent: It splits the training data into small batches and performs an update for
each batch. This is the most popular and balanced approach. It is more stable than SGD and more
computationally efficient than BGD.
• Momentum-based Gradient Descent: It speeds up convergence by adding a fraction of the previous
update (the "velocity") to the current update.
• Adagrad: Adagrad adapts the learning rate of each parameter based on the accumulated sum of its past
squared gradients.
• RMSprop: RMSprop is similar to Adagrad but uses an exponentially decaying moving average of squared
gradients, so the effective learning rate does not keep shrinking toward zero.
• Adam: Adam combines momentum with RMSprop-style scaling by using moving averages of both the
gradients and the squared gradients.
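
The first three variants differ only in how many examples feed each update. The sketch below, which assumes a one-parameter model y = w · x and a small noiseless dataset invented for illustration, uses a single `batch_size` argument to switch between them. It does not cover the momentum or adaptive variants.

```python
import random


def gradient_on_batch(w, batch):
    """Gradient of mean squared error for the model y = w * x on one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def train(data, batch_size, learning_rate=0.05, epochs=200, seed=0):
    """Return the fitted weight and the number of parameter updates made."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    w = 0.0
    updates = 0
    for _ in range(epochs):
        shuffled = data[:]
        rng.shuffle(shuffled)
        for start in range(0, len(shuffled), batch_size):
            batch = shuffled[start:start + batch_size]
            w -= learning_rate * gradient_on_batch(w, batch)
            updates += 1
    return w, updates


# Hypothetical noiseless data generated from y = 2x.
data = [(x, 2.0 * x) for x in [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]]

w_batch, n_batch = train(data, batch_size=len(data))  # batch gradient descent
w_sgd, n_sgd = train(data, batch_size=1)              # stochastic gradient descent
w_mini, n_mini = train(data, batch_size=4)            # mini-batch gradient descent

print(w_batch, n_batch)
print(w_sgd, n_sgd)
print(w_mini, n_mini)
```

All three runs recover a weight close to 2 on this easy problem. The visible difference is the number of updates per pass over the data: one for BGD, eight for SGD, and two for mini-batches of four.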

Working of Gradient Descent:


Step 1: Start with Initial Parameters: Begin with random guesses for parameters (weights, biases).

Step 2: Calculate the Gradient: Find the gradient of the loss function with respect to the parameters at the
current point. The gradient tells us the direction of the steepest ascent (where the function is increasing the
most).
Step 3: Update the Parameters: Move in the opposite direction of the gradient to reduce the function's value.
The size of the step is determined by the learning rate.
Step 4: Repeat the Process: Continue this process for several iterations until the parameters converge to the
values that minimize the function.
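
The four steps can be sketched as a short loop that fits a straight line y = w · x + b by minimizing mean squared error. The data, learning rate, and stopping tolerance are assumptions for illustration. The parameters start at zero rather than at random guesses so that the run is reproducible.

```python
def fit_line(xs, ys, learning_rate=0.05, iterations=2000, tolerance=1e-9):
    """Fit y = w * x + b with batch gradient descent on mean squared error."""
    # Step 1: initial parameters (zeros here instead of random guesses).
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        # Step 2: gradient of the loss with respect to w and b.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # Step 4 (stopping rule): quit once the gradient is nearly zero.
        if abs(grad_w) < tolerance and abs(grad_b) < tolerance:
            break
        # Step 3: move against the gradient, scaled by the learning rate.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(w, b)   # close to 2 and 1
```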

Advantages:
• Simple and easy to implement.
• Scales from small to large datasets, especially in its mini-batch form.
• Foundation of almost all deep learning optimizers.

Disadvantages:
• May get stuck in local minima.
• Sensitive to the choice of learning rate.
• For very large datasets, computing the full gradient in every iteration can be expensive and time-consuming.

❖ Machine Learning:
Machine learning (ML) is defined as a discipline of artificial intelligence (AI) that provides machines the
ability to automatically learn from data and past experiences to identify patterns and make predictions with
minimal human intervention.
The accuracy of the predicted output depends on the amount and quality of the data: in general, more data helps
to build a better model that predicts the output more accurately.
Arthur Samuel coined the term “Machine Learning” in 1959. He is credited with defining machine learning as “the
field of study that gives computers the ability to learn without being explicitly programmed.”
Examples: If you have used Netflix, you will know that it recommends movies or shows based on what you have
watched earlier. Machine learning is used to make these recommendations and to select content that matches
your taste, using your viewing history.
The second example is Facebook. When you upload a photo on Facebook, it can recognize a person
in that photo and suggest mutual friends. ML is used for these predictions. It uses data such as your friend
list and available photos, and it makes predictions based on that.

Features of Machine Learning-


• Machine learning uses data to detect various patterns in a given dataset.
• It can learn from past data and improve automatically.

• It is a data-driven technology.
• Machine learning is quite similar to data mining, as it also deals with huge amounts of data.

Need for Machine Learning-


The need for machine learning is increasing day by day, because it can perform tasks that are too complex for a
person to program directly. As humans, we have limitations: we cannot process huge amounts of data manually,
so we need computer systems, and this is where machine learning makes things easier for us.

We can train machine learning algorithms by providing them with a huge amount of data and letting them explore
the data, construct models, and predict the required output automatically. The performance of a
machine learning algorithm depends on the amount of data, and it can be measured by a cost function.
With the help of machine learning, we can save both time and money.

The importance of machine learning can be easily understood from its use cases. Currently, machine learning
is used in self-driving cars, cyber fraud detection, face recognition, friend suggestions by Facebook, etc.
Various top companies such as Netflix and Amazon have built machine learning models that use a vast
amount of data to analyze user interest and recommend products accordingly.

Types of Machine Learning:


In general, machine learning algorithms can be classified into three types.
1. Supervised Learning
Supervised learning is the type of machine learning in which machines are trained using well "labelled"
training data, and on the basis of that data, machines predict the output. Labelled data means that the input
data is already tagged with the correct output.
In supervised learning, the training data provided to the machines works as the supervisor that teaches the
machines to predict the output correctly. It applies the same concept as a student learning under the supervision
of a teacher.
Supervised learning is a process of providing input data as well as correct output data to the machine
learning model. The aim of a supervised learning algorithm is to find a mapping function that maps the
input variable (x) to the output variable (y).
Supervised learning can be further divided into two types of problems: Regression and Classification.
In the real world, supervised learning can be used for risk assessment, image classification, fraud
detection, medical diagnosis, spam filtering, etc.
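
A minimal sketch of the supervised setting: labelled examples pair each input x with its correct output y, and the learned mapping is then applied to new inputs. The two numeric features and their values are hypothetical. The one-nearest-neighbour rule is used only because it is short; it stands in for any supervised algorithm.

```python
import math


def predict_nearest_neighbor(training_data, new_point):
    """Return the label of the training example closest to new_point."""
    _, nearest_label = min(
        training_data,
        key=lambda pair: math.dist(pair[0], new_point),
    )
    return nearest_label


# Labelled training data: (features, correct output).
# The two features are hypothetical scores extracted from an email.
training_data = [
    ((1.0, 1.2), "not spam"),
    ((0.8, 0.9), "not spam"),
    ((4.0, 4.2), "spam"),
    ((4.5, 3.8), "spam"),
]

prediction_a = predict_nearest_neighbor(training_data, (1.1, 1.0))
prediction_b = predict_nearest_neighbor(training_data, (4.2, 4.0))
print(prediction_a, prediction_b)
```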
Advantages-
• With the help of supervised learning, the model can predict the output on the basis of prior
experiences.
• In supervised learning, we can have an exact idea about the classes of objects.
• Supervised learning models help us to solve various real-world problems such as fraud detection,
spam filtering, etc.

Disadvantages
• Supervised learning models are not well suited to tasks for which labelled examples are hard to obtain.
• Supervised learning may fail to predict the correct output if the test data differs substantially from the
training dataset.
• Training can require a lot of computation time.
• In supervised learning, we need enough knowledge about the classes of objects.

2. Unsupervised Learning
Unsupervised learning is a machine learning technique in which models are trained on unlabelled data, without
the supervision of correct outputs.
The primary goal of unsupervised learning is often to discover hidden patterns, similarities, or clusters
within the data, which can then be used for various purposes, such as data exploration, visualization,
dimensionality reduction, and more.
Unsupervised learning algorithms can be further categorized into two types of problems: Clustering
and Association.
Popular algorithms include K-means clustering and Principal Component Analysis (used for dimensionality
reduction).
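
As an illustration of clustering, here is a minimal sketch of K-means on one-dimensional data. The points and the fixed starting centroids are assumptions chosen so that the run is reproducible; in practice the starting centroids are usually picked at random.

```python
def k_means_1d(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: an empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Unlabelled data with two visible groups.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
print(centroids)
print(clusters)
```

No labels are supplied. The two groups are discovered from the distances between points alone.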

Advantages-
• Unsupervised learning can be applied to more complex tasks than supervised learning because it does
not require labelled input data.
• Unsupervised learning is often preferable because unlabelled data is easier to obtain than labelled data.

Disadvantages-
• Unsupervised learning is intrinsically more difficult than supervised learning as it does not have
corresponding output.
• The result of the unsupervised learning algorithm might be less accurate as input data is not labeled,
and algorithms do not know the exact output in advance.
3. Reinforcement Learning
Reinforcement learning works on a feedback-based process, in which an AI agent (a software
component) automatically explores its surroundings by trial and error: taking actions, learning from
experience, and improving its performance. The agent is rewarded for each good action and punished
for each bad action; hence the goal of a reinforcement learning agent is to maximize its rewards. In
reinforcement learning, there is no labelled data as in supervised learning, and agents learn from their
experiences only.

The reinforcement learning process is similar to how a human being learns; for example, a child learns various
things through experience in day-to-day life.
An example of reinforcement learning is playing a game, where the game is the environment, the moves of
the agent at each step lead to new states, and the goal of the agent is to get a high score. The agent receives
feedback in terms of punishments and rewards. Due to its way of working, reinforcement learning is
employed in different fields such as game theory, operations research, information theory, and multi-agent
systems.
Reinforcement can be further categorized into two types: positive reinforcement and negative
reinforcement.
In the real world, reinforcement learning can be used for video games, resource management, robotics,
text mining, etc.
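
The reward-driven loop described above can be sketched with tabular Q-learning, one common reinforcement learning algorithm (the notes do not name a specific one). The environment is a hypothetical corridor of five cells in which the agent starts at cell 0 and earns a reward of 1 for reaching cell 4. All hyperparameters are illustrative assumptions.

```python
import random

N_STATES = 5                 # cells 0..4; cell 4 is the goal
ACTIONS = ["left", "right"]


def step(state, action):
    """Environment: move one cell; reward 1 only when the goal is reached."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)          # explore
            else:
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice(                  # exploit, ties broken randomly
                    [a for a in ACTIONS if q[(state, a)] == best]
                )
            next_state, reward, done = step(state, action)
            future = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Nudge the estimate toward reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = next_state
    return q


q_table = train()
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

After training, the greedy policy moves right from every cell. The agent was never told this; it learned it from rewards alone.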

Advantages
• It helps in solving complex real-world problems which are difficult to be solved by general
techniques.
• The learning model of RL is similar to the learning of human beings; hence it can achieve very accurate
results on suitable problems.
• Helps in achieving long term results.

Disadvantages
• These algorithms are not preferred for simple problems.
• These algorithms require huge data and computations.
• Too much reinforcement learning can lead to an overload of states which can weaken the results.
• The curse of dimensionality limits reinforcement learning for real physical systems.

❖ Classification:
Classification algorithms are used when the output variable is categorical, which means it takes one of two or
more classes such as Yes-No, Male-Female, True-False, etc. It deals with discrete outcomes.
Some popular classification algorithms which come under supervised learning are Random Forest, Decision
Trees, Logistic Regression, Support vector Machines and many more.
There are four primary types of classification tasks in machine learning-
1. Binary Classification- Binary classification is a task for which there are only two possible outcomes. It
means predicting which of the two classes is correct. For example, an email filter classifying a
message as either "spam" or "not spam".
Some popular algorithms that can be used to divide things into two groups are Logistic Regression, k-Nearest
Neighbors, Decision Trees, Support Vector Machine, and Naive Bayes.

2. Multi-Class Classification- Multi-class classification has more than two mutually exclusive class
labels. In this classification, the goal is to find the single class to which a given input belongs.
Face classification, plant species classification, and optical character recognition are some of the
examples.
Popular algorithms that can be used for multi-class classification include k-Nearest Neighbors, Decision
Trees, Naive Bayes, Random Forest, and Gradient Boosting.
3. Multi-Label Classification- Multi-label classification is a classification task in which each sample is
mapped to a collection of target labels. This classification task involves making predictions about one or
more classes. For example, a news story can be about Games, People, and a Location all together at the
same time.
4. Imbalanced Classification- This describes a classification problem where the distribution of data is
uneven among the classes, with one or more classes being significantly underrepresented. This is not a
type of output, but a data condition that requires special handling. Examples include fraud detection and
medical diagnostic tests.
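
The sketch below shows what the target looks like in each task type, and why imbalanced data needs special handling. The class names, the 95-to-5 split, and the helper functions are assumptions made for illustration.

```python
# What a single target looks like in each task type (hypothetical values).
binary_target = "spam"                                 # one of exactly two classes
multi_class_target = "rose"                            # exactly one of several classes
multi_label_target = {"Games", "People", "Location"}   # any subset of the labels


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of the truly positive examples that were predicted positive."""
    predictions_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in predictions_for_positives) / len(predictions_for_positives)


# Imbalanced data: 95 legitimate transactions and 5 fraudulent ones.
y_true = ["legit"] * 95 + ["fraud"] * 5
always_legit = ["legit"] * 100   # a useless model that never flags fraud

acc = accuracy(y_true, always_legit)
fraud_recall = recall(y_true, always_legit, positive="fraud")
print(acc, fraud_recall)
```

The useless model scores 95% accuracy while catching none of the fraud. This is why metrics such as recall are checked alongside accuracy on imbalanced problems.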

Advantages of classification
1. Organized and Efficient: Classification helps organize information, making it easier to find specific
items based on their category.
2. Accurate Predictions: Classification models can be trained to predict the category of new, unseen data,
enabling informed decision-making.
3. Improved Understanding: Classification can simplify complex data by grouping similar items together,
making it easier for humans to grasp the underlying patterns and relationships.
4. Universal Applicability: Classification principles can be applied across various fields, including
biology, data science, and information management.

Disadvantages of classification
1. Requires High-Quality Data: The effectiveness of classification heavily relies on the quality and
accuracy of the data used for training. Errors or biases in the data can lead to inaccurate classifications.
2. May Struggle with Unstructured Data: Classification methods often work best with structured data.
Dealing with unstructured data, such as text or images, can be challenging and may require specialized
techniques.
3. Complexity in Choosing Methods: Selecting the appropriate classification method for a specific task
can be complex, requiring expertise in data analysis and machine learning.
