DL Unit5

Uploaded by

Kavya

2. With a neat sketch, explain Generative Adversarial Networks

Generative Adversarial Networks (GANs) are a powerful class of neural networks used for
unsupervised learning. A GAN is made up of two neural networks, a discriminator and a
generator, which are trained adversarially to produce artificial data that closely resembles real data.

 The Generator turns random noise into samples and attempts to fool the Discriminator, which
is tasked with accurately distinguishing between generated and genuine data.

 Realistic, high-quality samples are produced as a result of this competitive interaction, which
drives both networks toward advancement.

 GANs are proving to be highly versatile artificial intelligence tools, as evidenced by their
extensive use in image synthesis, style transfer, and text-to-image synthesis.

 They have also revolutionized generative modeling.

Through adversarial training, these models engage in a competitive interplay until the generator
becomes adept at creating realistic samples, fooling the discriminator approximately half the time.

Generative Adversarial Networks (GANs) can be broken down into three parts:

 Generative: To learn a generative model, which describes how data is generated in terms of
a probabilistic model.

 Adversarial: The word adversarial refers to setting one thing up against another. This means
that, in the context of GANs, the generative result is compared with the actual images in the
data set. A mechanism known as a discriminator is used to apply a model that attempts to
distinguish between real and fake images.

 Networks: Use deep neural networks as artificial intelligence (AI) algorithms for training
purposes.

Types of GANs

1. Vanilla GAN: This is the simplest type of GAN. Here, the Generator and the Discriminator are
simple multi-layer perceptrons. The algorithm is straightforward: it optimizes the GAN
objective using stochastic gradient descent.
2. Conditional GAN (CGAN): CGAN can be described as a deep learning method in which some
conditional parameters are put into place.

 In CGAN, an additional parameter ‘y’ is added to the Generator for generating the
corresponding data.

 Labels are also put into the input to the Discriminator in order for the Discriminator
to help distinguish the real data from the fake generated data.

3. Deep Convolutional GAN (DCGAN): DCGAN is one of the most popular and also the most
successful implementations of GAN. It is composed of ConvNets in place of multi-layer
perceptrons.

 The ConvNets are implemented without max pooling, which is in fact replaced by
convolutional stride.

 Also, the layers are not fully connected.

4. Laplacian Pyramid GAN (LAPGAN): The Laplacian pyramid is a linear invertible image
representation consisting of a set of band-pass images, spaced an octave apart, plus a low-
frequency residual.

 This approach uses multiple numbers of Generator and Discriminator networks and
different levels of the Laplacian Pyramid.

 This approach is mainly used because it produces very high-quality images. The
image is down-sampled at first at each layer of the pyramid and then it is again up-
scaled at each layer in a backward pass where the image acquires some noise from
the Conditional GAN at these layers until it reaches its original size.

5. Super Resolution GAN (SRGAN): As the name suggests, SRGAN is a way of designing a GAN in
which a deep neural network is used along with an adversarial network in order to produce
higher-resolution images. This type of GAN is particularly useful for up-scaling
native low-resolution images to enhance their details while minimizing errors.
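
As a rough illustration of the conditioning described for CGAN above, the sketch below appends a one-hot encoding of the label y to the noise vector before it would be passed to the Generator (the same helper could be applied to the Discriminator's input). The helper names `one_hot` and `conditioned_input`, and the choice of one-hot concatenation, are assumptions made for illustration; the notes do not say how y is injected.

```python
def one_hot(label, num_classes):
    """Return a one-hot list for an integer class label."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def conditioned_input(vector, label, num_classes):
    """Concatenate a noise vector (or a data sample) with the one-hot label y."""
    return list(vector) + one_hot(label, num_classes)


noise = [0.3, -1.2, 0.7]
# The generator would receive this longer vector instead of the raw noise.
print(conditioned_input(noise, label=2, num_classes=4))
```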

Architecture of GANs

A Generative Adversarial Network (GAN) is composed of two primary parts, which are the Generator
and the Discriminator.

Generator Model

A key element responsible for creating fresh, realistic data in a Generative Adversarial Network
(GAN) is the generator model. The generator takes random noise as input and converts it into
complex data samples, such as text or images. It is commonly implemented as a deep neural network.

Through training, its layers of learnable parameters come to capture the underlying distribution of
the training data. Backpropagation fine-tunes these parameters so that the generator’s output
increasingly mimics real data.

The generator’s ability to generate high-quality, varied samples that can fool the discriminator is
what makes it successful.

Generator Loss
The objective of the generator in a GAN is to produce synthetic samples that are realistic enough to
fool the discriminator. The generator achieves this by minimizing its loss function $J_G$. The loss is
minimized when the log probability is maximized, i.e., when the discriminator is highly likely to
classify the generated samples as real:

$$J_G = -\frac{1}{m} \sum_{i=1}^{m} \log D(G(z_i))$$

Where,

 $J_G$ measures how well the generator is fooling the discriminator.

 $m$ is the number of generated samples in the batch and $z_i$ is the i-th noise vector.

 $\log D(G(z_i))$ is the log probability that the discriminator classifies the generated sample
$G(z_i)$ as real.

 The generator aims to minimize this loss, encouraging the production of samples that the
discriminator classifies as real ($D(G(z_i))$ close to 1).
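
To make the generator loss concrete, the following minimal sketch evaluates J_G from a list of discriminator scores D(G(z_i)) on generated samples. The function name `generator_loss` and the example scores are made up for illustration.

```python
import math


def generator_loss(d_on_fake):
    """J_G = -(1/m) * sum(log D(G(z_i))) over discriminator scores for generated samples."""
    m = len(d_on_fake)
    return -sum(math.log(p) for p in d_on_fake) / m


# Scores near 1 mean the discriminator is being fooled, so the loss is small.
print(generator_loss([0.9, 0.8, 0.95]))
# Scores near 0 mean the fakes are being caught, so the loss is large.
print(generator_loss([0.1, 0.2, 0.05]))
```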

Discriminator Model

An artificial neural network called a discriminator model is used in Generative Adversarial Networks
(GANs) to differentiate between generated and actual input. By evaluating input samples and
allocating probability of authenticity, the discriminator functions as a binary classifier.

Over time, the discriminator learns to differentiate between genuine data from the dataset and
artificial samples created by the generator. This allows it to progressively hone its parameters and
increase its level of proficiency.

When dealing with image data, its architecture usually uses convolutional layers; other modalities
use structures suited to them. The discriminator’s aim during adversarial training is to maximize
its capacity to accurately identify generated samples as fake and real samples as authentic. The
discriminator grows increasingly discriminating as a result of its interaction with the generator,
which helps the GAN produce extremely realistic-looking synthetic data overall.

Discriminator Loss

The discriminator minimizes the negative log likelihood of correctly classifying both generated and real
samples. This loss incentivizes the discriminator to classify generated samples as fake
and real samples as real:

$$J_D = -\frac{1}{m} \sum_{i=1}^{m} \log D(x_i) - \frac{1}{m} \sum_{i=1}^{m} \log\left(1 - D(G(z_i))\right)$$

 $J_D$ assesses the discriminator’s ability to discern between generated and real samples.

 The log likelihood that the discriminator will accurately classify real data is represented
by $\log D(x_i)$.

 The log likelihood that the discriminator will correctly classify generated samples as fake
is represented by $\log(1 - D(G(z_i)))$.

 The discriminator aims to reduce this loss by accurately identifying generated and real
samples.
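
The discriminator loss can be evaluated the same way. This is a small sketch, assuming we already have the discriminator's scores on a batch of real samples and on a batch of generated samples; the name `discriminator_loss` and the example scores are illustrative only. Each term is averaged over its own batch, which matches the formula when both batches have size m.

```python
import math


def discriminator_loss(d_on_real, d_on_fake):
    """J_D = -mean(log D(x_i)) - mean(log(1 - D(G(z_i))))."""
    real_term = -sum(math.log(p) for p in d_on_real) / len(d_on_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_on_fake) / len(d_on_fake)
    return real_term + fake_term


# A discriminator that scores real data high and fakes low has a small loss.
print(discriminator_loss([0.9, 0.95], [0.1, 0.05]))
# A confused discriminator has a larger loss.
print(discriminator_loss([0.4, 0.5], [0.6, 0.5]))
```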

MinMax Loss

In a Generative Adversarial Network (GAN), the minimax objective is given by:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$

Where,

 V(D, G) is the value function, G is the generator network and D is the discriminator network.

 Actual data samples drawn from the true data distribution $p_{data}(x)$ are
represented by x.

 Random noise sampled from a prior distribution $p_z(z)$ (usually a normal or uniform
distribution) is represented by z.

 D(x) is the probability the discriminator assigns to a real sample x being real.

 D(G(z)) is the probability that the discriminator will identify generated data coming from the
generator as authentic.
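
As a quick numerical check of the minimax objective, the sketch below estimates the value function from batches of discriminator outputs (the expectations are replaced by batch averages; this quantity is the negative of the discriminator loss). When the discriminator can no longer tell real from generated data and outputs 0.5 everywhere, the value is log(0.5) + log(0.5) = -log 4. The function name `value_function` is an assumption for illustration.

```python
import math


def value_function(d_on_real, d_on_fake):
    """Batch estimate of E[log D(x)] + E[log(1 - D(G(z)))]."""
    real_term = sum(math.log(p) for p in d_on_real) / len(d_on_real)
    fake_term = sum(math.log(1.0 - p) for p in d_on_fake) / len(d_on_fake)
    return real_term + fake_term


# Discriminator outputs 0.5 for everything: the idealized end point of training.
v_equilibrium = value_function([0.5] * 4, [0.5] * 4)
print(v_equilibrium, -math.log(4))

# A discriminator that separates real from fake achieves a higher value.
print(value_function([0.9], [0.1]))
```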

How does a GAN work?

The steps involved in how a GAN works:

1. Initialization: Two neural networks are created: a Generator (G) and a Discriminator (D).

 G is tasked with creating new data, like images or text, that closely resembles real
data.

 D acts as a critic, trying to distinguish between real data (from a training dataset) and
the data generated by G.

2. Generator’s First Move: G takes a random noise vector as input. This noise vector contains
random values and acts as the starting point for G’s creation process. Using its internal layers
and learned patterns, G transforms the noise vector into a new data sample, like a generated
image.

3. Discriminator’s Turn: D receives two kinds of inputs:

 Real data samples from the training dataset.

 The data samples generated by G in the previous step.

D’s job is to analyze each input and determine whether it’s real data or something G
generated. It outputs a probability score between 0 and 1. A score of 1 indicates the
data is likely real, and 0 suggests it’s fake.

4. The Learning Process: Now, the adversarial part comes in:

 If D correctly identifies real data as real (score close to 1) and generated data as fake
(score close to 0), D is doing its job well and its loss is low, while G’s loss is high
because its samples are not yet convincing.

 However, the key is to continuously improve. If D consistently identifies everything
correctly, it won’t learn much more. So, the goal is for G to eventually trick D.

5. Generator’s Improvement:

 When D mistakenly labels G’s creation as real (score close to 1), it’s a sign that G is on
the right track. In this case, G receives a significant positive update, while D receives
a penalty for being fooled.

 This feedback helps G improve its generation process to create more realistic data.

6. Discriminator’s Adaptation:

 Conversely, if D correctly identifies G’s fake data (score close to 0), but G receives no
reward, D is further strengthened in its discrimination abilities.

 This ongoing duel between G and D refines both networks over time.

As training progresses, G gets better at generating realistic data, making it harder for D to tell the
difference. Ideally, G becomes so adept that D can’t reliably distinguish real from fake data. At this
point, G is considered well-trained and can be used to generate new, realistic data samples.
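
The alternating updates described in the steps above can be sketched on a deliberately tiny one-dimensional GAN. Everything here is an assumption made for illustration: the generator is the line G(z) = a*z + b, the discriminator is D(x) = sigmoid(w*x + c), the gradients are derived by hand from the generator and discriminator losses given earlier, and the name `gan_step` is made up. Real GANs use deep networks and an automatic-differentiation library.

```python
import math


def sigmoid(u):
    u = max(-30.0, min(30.0, u))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-u))


def gan_step(params, real_batch, noise_batch, lr=0.1):
    """One discriminator update followed by one generator update."""
    a, b, w, c = params["a"], params["b"], params["w"], params["c"]
    fake_batch = [a * z + b for z in noise_batch]  # generator's move

    # Discriminator's turn: gradient of J_D with respect to w and c.
    grad_w = grad_c = 0.0
    for x in real_batch:
        d = sigmoid(w * x + c)
        grad_w += -(1.0 - d) * x / len(real_batch)
        grad_c += -(1.0 - d) / len(real_batch)
    for g in fake_batch:
        d = sigmoid(w * g + c)
        grad_w += d * g / len(fake_batch)
        grad_c += d / len(fake_batch)
    w -= lr * grad_w
    c -= lr * grad_c

    # Generator's improvement: gradient of J_G = -mean(log D(G(z))) w.r.t. a and b.
    grad_a = grad_b = 0.0
    for z in noise_batch:
        d = sigmoid(w * (a * z + b) + c)
        dloss_dg = -(1.0 - d) * w  # derivative of -log D(g) with respect to g
        grad_a += dloss_dg * z / len(noise_batch)
        grad_b += dloss_dg / len(noise_batch)
    a -= lr * grad_a
    b -= lr * grad_b
    return {"a": a, "b": b, "w": w, "c": c}


start = {"a": 1.0, "b": 0.0, "w": 0.0, "c": 0.0}
after = gan_step(start, real_batch=[4.0, 4.0], noise_batch=[-1.0, 1.0])
print(after)
```

In this toy run the real data sits at 4 while the generated samples are centred on 0, so the discriminator's weight w becomes positive (larger inputs look more real) and the generator's offset b moves upward, toward the real data. Repeating `gan_step` many times gives the ongoing duel described above, although even this tiny example is not guaranteed to converge smoothly.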

3. Differentiate multitask Vs Multi-view Deep Learning?

Multi-Task Learning (MTL) is a type of machine learning technique where a
model is trained to perform multiple tasks simultaneously. In deep learning, MTL
refers to training a neural network to perform multiple tasks by sharing some of
the network’s layers and parameters across tasks.
In MTL, the goal is to improve the generalization performance of the model by
leveraging the information shared across tasks. By sharing some of the network’s
parameters, the model can learn a more efficient and compact representation of
the data, which can be beneficial when the tasks are related or have some
commonalities.
There are different ways to implement MTL in deep learning, but the most
common approach is to use a shared feature extractor and multiple task-specific
heads. The shared feature extractor is a part of the network that is shared across
tasks and is used to extract features from the input data. The task-specific heads
are used to make predictions for each task and are typically connected to the
shared feature extractor.
Another approach is to use a shared decision-making layer, where the decision-
making layer is shared across tasks, and the task-specific layers are connected to
the shared decision-making layer.
MTL can be useful in many applications such as natural language processing,
computer vision, and healthcare, where multiple tasks are related or have some
commonalities. It is also useful when data is limited, because the model can
improve its generalization performance by leveraging the information shared
across tasks.
However, MTL also has its own limitations, such as when the tasks are very
different, in which case sharing parameters can hurt performance.
In deep learning, Multi-Task Learning builds on standard neural networks, so
familiarity with their basic concepts helps in understanding it.

What is Multi-Task Learning?
Multi-Task Learning is a sub-field of Machine Learning that aims to solve
multiple different tasks at the same time by taking advantage of the
similarities between them. This can improve learning efficiency and also act as
a regularizer, as discussed below. Formally, if there are n tasks (conventional
deep learning approaches aim to solve just one task using one particular model),
where these n tasks or a subset of them are related to each other but not
exactly identical, MTL helps improve the learning of a particular model by using
the knowledge contained in all n tasks.

Intuition behind Multi-Task Learning (MTL):
With deep learning models, we usually aim to learn a good representation of the
features or attributes of the input data to predict a specific value. Formally,
we aim to optimize a particular function by training a model and tuning the
hyperparameters until the performance can’t be increased further. With MTL, it
might be possible to increase performance even further by forcing the model to
learn a more generalized representation, as it learns (updates its weights) not
just for one specific task but for several. Humans learn in a similar way: we
often learn better when we learn multiple related tasks instead of focusing on
one specific task for a long time.

MTL as a regularizer:
In machine learning terms, MTL can also be viewed as a way of inducing bias. It
is a form of inductive transfer: using multiple tasks induces a bias that
prefers hypotheses that can explain all n tasks. By introducing this inductive
bias, MTL acts as a regularizer. It reduces the risk of overfitting and also
reduces the model’s ability to fit random noise during training.

The major techniques for MTL are the following.

Hard Parameter Sharing – Common hidden layers are shared by all tasks, while
several task-specific layers are kept towards the end of the model. Because the
shared layers must learn a representation that serves every task, the risk of
overfitting is reduced.

Figure: Hard Parameter Sharing
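
Hard parameter sharing can be sketched as one shared feature extractor feeding two task-specific heads. The weights below are made-up numbers and the names `multitask_forward`, `SHARED_W`, `HEAD_A_W` and `HEAD_B_W` are assumptions for illustration; a real model would learn these weights by summing the losses of all tasks.

```python
def linear(weights, bias, x):
    """Dense layer: `weights` is a list of rows, one row per output unit."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


# Shared hidden layer used by every task (illustrative values).
SHARED_W = [[1.0, -1.0, 0.0], [0.5, 0.5, 0.5]]
SHARED_B = [0.0, 0.0]
# Task-specific heads: task A has one output, task B has two.
HEAD_A_W, HEAD_A_B = [[1.0, 1.0]], [0.0]
HEAD_B_W, HEAD_B_B = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]


def multitask_forward(x):
    features = relu(linear(SHARED_W, SHARED_B, x))  # computed once, shared by both heads
    return {
        "task_a": linear(HEAD_A_W, HEAD_A_B, features),
        "task_b": linear(HEAD_B_W, HEAD_B_B, features),
    }


print(multitask_forward([2.0, 1.0, 3.0]))
```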

Soft Parameter Sharing – Each task has its own model with its own set of weights
and biases, and the distance between these parameters in different models is
regularized so that the parameters become similar and can represent all the tasks.
Figure: Soft Parameter Sharing
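
For soft parameter sharing, a minimal sketch of the regularized objective is shown below, assuming two task models whose parameters are flattened into equal-length lists and a squared Euclidean distance as the penalty. The names `soft_sharing_loss` and `lam` (the penalty strength) are hypothetical; other distance measures are possible.

```python
def squared_distance(params_a, params_b):
    """Squared Euclidean distance between two parameter vectors."""
    return sum((a - b) ** 2 for a, b in zip(params_a, params_b))


def soft_sharing_loss(task_loss_a, task_loss_b, params_a, params_b, lam=0.1):
    """Sum of task losses plus a penalty that pulls the two models together."""
    return task_loss_a + task_loss_b + lam * squared_distance(params_a, params_b)


# Identical parameters add no penalty; diverging parameters are penalized.
print(soft_sharing_loss(0.4, 0.6, [1.0, 2.0], [1.0, 2.0], lam=0.5))
print(soft_sharing_loss(0.4, 0.6, [1.0, 2.0], [1.0, 4.0], lam=0.5))
```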

Assumptions and Considerations – Sharing knowledge among tasks through MTL is
useful only when the tasks are related; when this assumption is violated,
performance can decline significantly.

Applications: MTL techniques have found various uses. Some of the major
applications are:
 Object detection and Facial recognition
 Self Driving Cars: Pedestrians, stop signs and other obstacles can be
detected together
 Multi-domain collaborative filtering for web applications
 Stock Prediction
 Language Modelling and other NLP applications

Important points:

Here are some important points to consider when implementing Multi-Task
Learning (MTL) for deep learning:
1. Task relatedness: MTL is most effective when the tasks are related or
have some commonalities, as is often the case in natural language
processing, computer vision, and healthcare.
2. Data limitation: MTL can be useful when the data is limited, as it
allows the model to leverage the information shared across tasks to
improve the generalization performance.
3. Shared feature extractor: A common approach in MTL is to use a
shared feature extractor, which is a part of the network that is shared
across tasks and is used to extract features from the input data.
4. Task-specific heads: Task-specific heads are used to make predictions
for each task and are typically connected to the shared feature extractor.
5. Shared decision-making layer: Another approach is to use a shared
decision-making layer, where the decision-making layer is shared
across tasks, and the task-specific layers are connected to the shared
decision-making layer.
6. Careful architecture design: The architecture of MTL should be
carefully designed to accommodate the different tasks and to make sure
that the shared features are useful for all tasks.
7. Overfitting: MTL models can be prone to overfitting if the model is not
regularized properly.
8. Avoiding negative transfer: When the tasks are very different or
independent, MTL can lead to worse performance than training a
separate single-task model for each task. Therefore, it is important to
make sure that the shared features are useful for all tasks.

4. Autoencoders

What are Autoencoders?


Autoencoders are a specialized class of algorithms that can learn efficient
representations of input data with no need for labels. They are a class of
artificial neural networks designed for unsupervised learning. Learning to
compress and effectively represent input data without specific labels is the
essential principle of an autoencoder. This is accomplished using a two-fold
structure that consists of an encoder and a decoder. The encoder transforms the
input data into a reduced-dimensional representation, which is often referred to
as the “latent space” or “encoding”. From that representation, the decoder
rebuilds the initial input. This process of encoding and decoding forces the
network to identify the essential features of the data and so to learn
meaningful patterns.
Architecture of Autoencoder in Deep Learning
The general architecture of an autoencoder includes an encoder, decoder, and
bottleneck layer.
1. Encoder
 The input layer takes raw input data.
 The hidden layers progressively reduce the dimensionality of the
input, capturing important features and patterns. These layers
compose the encoder.
 The bottleneck layer (latent space) is the final hidden layer, where
the dimensionality is significantly reduced. This layer represents the
compressed encoding of the input data.
2. Decoder
 The decoder takes the encoded representation from the bottleneck
layer and expands it back toward the dimensionality of the original input.
 The hidden layers progressively increase the dimensionality and aim
to reconstruct the original input.
 The output layer produces the reconstructed output, which ideally
should be as close as possible to the input data.
3. The loss function used during training is typically a reconstruction loss,
measuring the difference between the input and the reconstructed output.
Common choices include mean squared error (MSE) for continuous data or
binary cross-entropy for binary data.
4. During training, the autoencoder learns to minimize the reconstruction
loss, forcing the network to capture the most important features of the
input data in the bottleneck layer.
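
The encoder, bottleneck, decoder and reconstruction loss can be illustrated with a tiny linear autoencoder that squeezes 4 inputs through a 2-unit bottleneck. The weights are hand-picked for illustration (each code unit averages a pair of inputs) rather than learned, and the names `encode`, `decode` and `mse` are assumptions.

```python
def linear(weights, x):
    """Bias-free dense layer: one row of weights per output unit."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


# Encoder: 4 inputs -> 2-unit bottleneck (averages each pair of inputs).
ENCODER_W = [[0.5, 0.5, 0.0, 0.0],
             [0.0, 0.0, 0.5, 0.5]]
# Decoder: 2-unit code -> 4 outputs (copies each code value back to its pair).
DECODER_W = [[1.0, 0.0],
             [1.0, 0.0],
             [0.0, 1.0],
             [0.0, 1.0]]


def encode(x):
    return linear(ENCODER_W, x)


def decode(code):
    return linear(DECODER_W, code)


def mse(x, x_hat):
    """Reconstruction loss: mean squared error between input and output."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


x = [0.0, 2.0, 3.0, 3.0]
code = encode(x)
x_hat = decode(code)
print(code, x_hat, mse(x, x_hat))
```

The second pair of inputs (3, 3) is reconstructed exactly, while the first pair (0, 2) is replaced by its average, so the reconstruction loss is non-zero: the bottleneck keeps only what fits in the compressed code.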
After training, typically only the encoder part of the autoencoder is retained,
to encode data similar to that used in training. To keep the network from simply
copying its input, it can be constrained in different ways:
 Keep small Hidden Layers: If the size of each hidden layer is kept as
small as possible, then the network will be forced to pick up only the
representative features of the data thus encoding the data.
 Regularization: In this method, a loss term is added to the cost function
which encourages the network to train in ways other than copying the
input.
 Denoising: Another way of constraining the network is to add noise to the
input and teach the network how to remove the noise from the data.
 Tuning the Activation Functions: This method involves changing the
activation functions of various nodes so that a majority of the nodes are
dormant thus, effectively reducing the size of the hidden layers.
Types of Autoencoders
There are several types of autoencoders. The advantages and disadvantages of
each variation are discussed below:
Denoising Autoencoder
Denoising autoencoder works on a partially corrupted input and trains to recover
the original undistorted image. As mentioned above, this method is an effective
way to constrain the network from simply copying the input and thus learn the
underlying structure and important features of the data.
Advantages
1. This type of autoencoder can extract important features and reduce the
noise or the useless features.
2. Denoising autoencoders can be used as a form of data augmentation: the
restored images can be used as augmented data, thus generating additional
training samples.
Disadvantages
1. Selecting the right type and level of noise to introduce can be challenging
and may require domain knowledge.
2. The denoising process can result in the loss of some information that is needed
from the original input. This loss can impact the accuracy of the output.
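
The training data for a denoising autoencoder can be prepared as in the sketch below: the network is fed a corrupted copy of each sample while the loss is computed against the clean original. The corruption scheme (Gaussian noise plus optional random zeroing) and the names `corrupt`, `denoising_pair`, `noise_std` and `drop_prob` are assumptions chosen for illustration; the right kind and level of noise depends on the data.

```python
import random


def corrupt(x, noise_std=0.1, drop_prob=0.0, rng=random):
    """Return a noisy copy of x: additive Gaussian noise plus optional random zeroing."""
    noisy = []
    for v in x:
        if rng.random() < drop_prob:
            noisy.append(0.0)  # masking noise: drop this value entirely
        else:
            noisy.append(v + rng.gauss(0.0, noise_std))
    return noisy


def denoising_pair(clean, **kwargs):
    """Training pair: (corrupted input shown to the network, clean target for the loss)."""
    return corrupt(clean, **kwargs), list(clean)


rng = random.Random(0)  # seeded so the example is repeatable
noisy_input, target = denoising_pair([1.0, 2.0, 3.0], noise_std=0.2, rng=rng)
print(noisy_input, target)
```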
Sparse Autoencoder
This type of autoencoder typically contains more hidden units than the input but
only a few are allowed to be active at once. This property is called the sparsity of
the network. The sparsity of the network can be controlled by either manually
zeroing the required hidden units, tuning the activation functions or by adding a
loss term to the cost function.
Advantages
1. The sparsity constraint in sparse autoencoders helps in filtering out noise
and irrelevant features during the encoding process.
2. These autoencoders often learn important and meaningful features due to
their emphasis on sparse activations.
Disadvantages
1. The choice of hyperparameters plays a significant role in the performance
of this autoencoder. Different inputs should result in the activation of
different nodes of the network.
2. The application of sparsity constraint increases computational complexity.
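
Two of the ways of controlling sparsity mentioned above can be sketched in a few lines: adding a loss term that penalizes active hidden units, and manually zeroing all but a few hidden units. The L1 penalty and the keep-the-k-largest rule are common choices assumed here for illustration; the names `l1_sparsity_penalty`, `keep_top_k` and `lam` are hypothetical.

```python
def l1_sparsity_penalty(activations, lam=0.01):
    """Extra loss term: grows with the total magnitude of the hidden activations."""
    return lam * sum(abs(a) for a in activations)


def keep_top_k(activations, k):
    """Manually zero every hidden unit except the k with the largest magnitude."""
    k = max(0, k)
    ranked = sorted(range(len(activations)),
                    key=lambda i: abs(activations[i]), reverse=True)
    keep = set(ranked[:k])
    return [a if i in keep else 0.0 for i, a in enumerate(activations)]


hidden = [0.1, -2.0, 0.5, 1.5]
print(l1_sparsity_penalty(hidden, lam=0.1))
print(keep_top_k(hidden, k=2))
```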
Variational Autoencoder
A variational autoencoder makes strong assumptions about the distribution of latent
variables and uses the Stochastic Gradient Variational Bayes estimator in the
training process. It assumes that the data is generated by a directed graphical
model and tries to learn an approximation $q_\phi(z|x)$ to the posterior
distribution $p_\theta(z|x)$, where $\phi$ and $\theta$ are the parameters of
the encoder and the decoder respectively.
Advantages
1. Variational Autoencoders are used to generate new data points that
resemble the original training data. These samples are learned from the
latent space.
2. A Variational Autoencoder is a probabilistic framework that is used to learn a
compressed representation of the data that captures its underlying structure
and variations, so it is useful in detecting anomalies and data exploration.
Disadvantages
1. Variational Autoencoders use approximations to estimate the true
distribution of the latent variables. This approximation introduces some
level of error, which can affect the quality of generated samples.
2. The generated samples may only cover a limited subset of the true data
distribution. This can result in a lack of diversity in generated samples.
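
Two ingredients commonly used when training a variational autoencoder are sketched below, under the usual assumptions (not stated in these notes) that the encoder outputs the mean and log-variance of a diagonal Gaussian and that the prior over latent variables is a standard normal: the reparameterized sample z = mu + sigma * eps, and the closed-form KL divergence added to the reconstruction loss. The function names are illustrative.

```python
import math
import random


def reparameterize(mu, log_var, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), where sigma = exp(0.5 * log_var)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


rng = random.Random(0)
mu, log_var = [0.5, -1.0], [0.0, -2.0]
print(reparameterize(mu, log_var, rng=rng))
print(kl_to_standard_normal(mu, log_var))
```

The KL term is zero when the encoder's distribution already equals the prior (mean 0, log-variance 0) and grows as it moves away from it.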
Convolutional Autoencoder
Convolutional autoencoders are a type of autoencoder that use convolutional
neural networks (CNNs) as their building blocks. The encoder consists of
multiple layers that take an image or a grid as input and pass it through
successive convolution layers, forming a compressed representation of the input.
The decoder is the mirror image of the encoder: it deconvolves the compressed
representation and tries to reconstruct the original image.
Advantages
1. A convolutional autoencoder can compress high-dimensional image data into
a lower-dimensional representation. This improves the storage efficiency
and transmission of image data.
2. Convolutional autoencoder can reconstruct missing parts of an image. It
can also handle images with slight variations in object position or
orientation.
Disadvantages
1. These autoencoders are prone to overfitting. Proper regularization
techniques should be used to tackle this issue.
2. Compression of data can cause data loss which can result in reconstruction
of a lower quality image.

Real-life Applications of Autoencoders


Image Compression/Denoising
One of the main applications of Autoencoders is to compress images to reduce
their overall file size while trying to keep as much of the valuable information as
possible or restore images that have been degraded over time.
Anomaly Detection
Since Autoencoders are good at distinguishing essential characteristics of data
from noise, they can be used in order to detect anomalies (e.g., if an image has
been photoshopped, if there are unusual activities in a network, etc.)
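
A common way to use an autoencoder for anomaly detection is to flag inputs whose reconstruction error exceeds a threshold, since the network reconstructs familiar data well and unfamiliar data poorly. The sketch below assumes this approach; `toy_reconstruct` is a stand-in for a trained autoencoder (it can only reproduce flat signals), and the threshold value is arbitrary.

```python
def reconstruction_error(x, x_hat):
    """Mean squared error between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def flag_anomalies(inputs, reconstruct, threshold):
    """Return True for each input whose reconstruction error exceeds the threshold."""
    return [reconstruction_error(x, reconstruct(x)) > threshold for x in inputs]


def toy_reconstruct(x):
    """Stand-in for a trained autoencoder that has only learned flat signals."""
    mean = sum(x) / len(x)
    return [mean] * len(x)


samples = [
    [1.0, 1.0, 1.0, 1.0],  # normal
    [1.0, 1.1, 0.9, 1.0],  # normal with slight noise
    [1.0, 5.0, 1.0, 1.0],  # contains a spike
]
print(flag_anomalies(samples, toy_reconstruct, threshold=0.5))
```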
Data Generation
Variational Autoencoders and Generative Adversarial Networks (GAN) are
frequently used in order to generate synthetic data (e.g., realistic images of
people).

5. Applications of deep learning with examples

10 deep learning applications


Deep learning applications are making an impact across many different
industries. You might even already use some of these applications in your
everyday life. Let’s examine ten examples highlighting deep learning’s broad use
to understand it better.
1. Fraud detection
Deep learning algorithms can identify security issues to help protect against
fraud. For example, deep learning algorithms can detect suspicious attempts to
log into your accounts and notify you, as well as inform you if your chosen
password isn’t strong enough.
2. Customer service
You may have used online customer service and interacted with a chatbot that
answers your questions, or used a virtual assistant on your smartphone. Deep
learning allows these systems to improve their responses over time.
3. Financial services
Several financial services can rely on assistance from deep learning. Predictive
analytics helps support investment portfolios and trading assets in the stock
market, as well as allowing banks to mitigate risk relating to loan approvals.
4. Natural language processing
Natural language processing is an important part of deep learning applications
that rely on interpreting text and speech. Customer service chatbots, language
translators, and sentiment analysis are all examples of applications benefitting
from natural language processing.
5. Facial recognition
An area of deep learning known as computer vision allows deep learning
algorithms to recognize specific features in pictures and videos. With this
technique, you can use deep learning for facial recognition, identifying you by
your own unique features.
6. Self-driving vehicles
Autonomous vehicles use deep learning to learn how to operate and handle
different situations while driving, and it allows vehicles to detect traffic lights,
recognize signs, and avoid pedestrians.
7. Predictive analytics
Deep learning models can analyze large amounts of historical information to
make accurate predictions about the future. Predictive analytics helps businesses
in several aspects, including forecasting revenue, product development, decision-
making, and manufacturing.
8. Recommender systems
Online services often use recommender systems with enhanced capabilities
provided by deep learning models. With enough data, these deep learning models
can predict the probabilities of certain interactions based on the history of
previous interactions. Industries such as streaming services, e-commerce, and
social media implement recommender systems.
9. Health care
Deep learning applications in the health care industry serve multiple purposes.
Not only can they assist in developing treatment solutions, but deep learning
algorithms are also capable of understanding medical images and helping doctors
diagnose patients by detecting cancer cells.
10. Industrial
Deep learning applications in industrial automation help keep workers safe in
factories by enabling machines to detect dangerous situations, such as when
objects or people are too close to the machines.
