DL Unit 5
Generative Adversarial Networks (GANs) are a powerful class of neural networks used for
unsupervised learning. GANs are made up of two neural networks, a discriminator and a
generator. They use adversarial training to produce artificial data that closely resembles actual data.
The Generator attempts to fool the Discriminator, which is tasked with accurately
distinguishing between generated and genuine data, by producing samples from random noise.
This competitive interaction drives both networks to improve, resulting in realistic, high-quality samples.
GANs are proving to be highly versatile artificial intelligence tools, as evidenced by their
extensive use in image synthesis, style transfer, and text-to-image synthesis.
Through adversarial training, these models engage in a competitive interplay until the generator
becomes adept at creating realistic samples, fooling the discriminator approximately half the time.
Generative Adversarial Networks (GANs) can be broken down into three parts:
Generative: To learn a generative model, which describes how data is generated in terms of
a probabilistic model.
Adversarial: The word adversarial refers to setting one thing up against another. This means
that, in the context of GANs, the generative result is compared with the actual images in the
data set. A mechanism known as a discriminator is used to apply a model that attempts to
distinguish between real and fake images.
Networks: Deep neural networks are used as the artificial intelligence (AI) models for both components during training.
Types of GANs
1. Vanilla GAN: This is the simplest type of GAN. Here, the Generator and the Discriminator are basic multi-layer perceptrons. In vanilla GAN, the algorithm is simple: it tries to optimize the minimax objective using stochastic gradient descent.
2. Conditional GAN (CGAN): CGAN is a deep learning method in which some conditional parameters are put into place.
In CGAN, an additional parameter ‘y’ (such as a class label) is added to the Generator for generating the corresponding data.
Labels are also fed to the Discriminator to help it distinguish the real data from the fake generated data (see the sketch after this list).
3. Deep Convolutional GAN (DCGAN): DCGAN is one of the most popular and most successful implementations of GAN. It is composed of ConvNets in place of multi-layer perceptrons.
The ConvNets are implemented without max pooling, which is replaced by strided convolutions (also shown in the sketch after this list).
4. Laplacian Pyramid GAN (LAPGAN): The Laplacian pyramid is a linear invertible image
representation consisting of a set of band-pass images, spaced an octave apart, plus a low-
frequency residual.
This approach uses multiple Generator and Discriminator networks, one at each level of the Laplacian pyramid.
This approach is mainly used because it produces very high-quality images. The image is first down-sampled at each level of the pyramid and then up-scaled again in a backward pass, where the image acquires detail from the Conditional GAN at each level until it reaches its original size.
5. Super Resolution GAN (SRGAN): SRGAN as the name suggests is a way of designing a GAN in
which a deep neural network is used along with an adversarial network in order to produce
higher-resolution images. This type of GAN is particularly useful in optimally up-scaling native low-resolution images to enhance their details while minimizing errors.
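A minimal PyTorch sketch of the two ideas above: CGAN-style label conditioning (the class label is embedded and concatenated to the input) and DCGAN-style strided convolutions in place of max pooling. The layer sizes, the 10-class label set, and the 64x64 single-channel images are illustrative assumptions, not taken from the text:

import torch
import torch.nn as nn

class ConditionalDiscriminator(nn.Module):
    """Real/fake classifier conditioned on a class label (CGAN idea).
    All downsampling is done by stride-2 convolutions (DCGAN idea)."""
    def __init__(self, num_classes=10, img_size=64):
        super().__init__()
        self.img_size = img_size
        # Embed the label into one extra image-sized "channel".
        self.label_map = nn.Embedding(num_classes, img_size * img_size)
        self.net = nn.Sequential(
            nn.Conv2d(2, 64, kernel_size=4, stride=2, padding=1),    # 64 -> 32
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),  # 32 -> 16
            nn.LeakyReLU(0.2),
            nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1), # 16 -> 8
            nn.LeakyReLU(0.2),
            nn.Flatten(),
            nn.Linear(256 * 8 * 8, 1),
            nn.Sigmoid(),  # probability that the (image, label) pair is real
        )

    def forward(self, img, label):
        # img: (B, 1, 64, 64); label: (B,) integer class ids
        lbl = self.label_map(label).view(-1, 1, self.img_size, self.img_size)
        return self.net(torch.cat([img, lbl], dim=1))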
Architecture of GANs
A Generative Adversarial Network (GAN) is composed of two primary parts, which are the Generator
and the Discriminator.
Generator Model
A key element responsible for creating fresh, accurate data in a Generative Adversarial Network
(GAN) is the generator model. The generator takes random noise as input and converts it into complex data samples, such as text or images. It is commonly depicted as a deep neural network.
Through training, layers of learnable parameters capture the underlying distribution of the training data. Using backpropagation to fine-tune its parameters, the generator adjusts its output to produce samples that closely mimic real data.
The generator’s ability to generate high-quality, varied samples that can fool the discriminator is
what makes it successful.
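A minimal PyTorch sketch of such a generator as a multi-layer perceptron. The 100-dimensional noise vector and the flattened 28x28 output are illustrative assumptions:

import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a random noise vector z to a flattened 28x28 sample."""
    def __init__(self, noise_dim=100, out_dim=28 * 28):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 512),
            nn.ReLU(),
            nn.Linear(512, out_dim),
            nn.Tanh(),  # outputs in [-1, 1], matching real data normalized there
        )

    def forward(self, z):
        return self.net(z)

G = Generator()
fake_batch = G(torch.randn(16, 100))  # 16 generated samples from random noise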
Generator Loss
The objective of the generator in a GAN is to produce synthetic samples that are realistic enough to
fool the discriminator. The generator achieves this by minimizing its loss function $J_G$. The loss is minimized when the log probability is maximized, i.e., when the discriminator is highly likely to classify the generated samples as real:

J_G = -\frac{1}{m} \sum_{i=1}^{m} \log D(G(z_i))

Where,
$\log D(G(z_i))$ represents the log probability of the discriminator being correct for generated samples.
The generator aims to minimize this loss, encouraging the production of samples that the discriminator classifies as real ($D(G(z_i))$ close to 1).
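The same loss written as code, a sketch in which `disc` and `gen` stand for the discriminator and generator models defined in this section; the small epsilon is a common numerical-stability addition, not part of the formula:

import torch

def generator_loss(disc, gen, z):
    # J_G = -(1/m) * sum_i log D(G(z_i))
    d_on_fake = disc(gen(z))                    # D(G(z_i)) for a batch of noise
    return -torch.log(d_on_fake + 1e-8).mean()  # mean over the m samples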
Discriminator Model
An artificial neural network called a discriminator model is used in Generative Adversarial Networks
(GANs) to differentiate between generated and actual input. The discriminator functions as a binary classifier, evaluating input samples and assigning each a probability of authenticity.
Over time, the discriminator learns to differentiate between genuine data from the dataset and
artificial samples created by the generator. This allows it to progressively hone its parameters and
increase its level of proficiency.
Its architecture typically uses convolutional layers when dealing with image data, or structures pertinent to other modalities. The aim of the adversarial training procedure is to maximize the discriminator’s capacity to accurately identify generated samples as fake and real samples as authentic. Through its interaction with the generator, the discriminator grows increasingly discriminating, which helps the GAN produce extremely realistic-looking synthetic data overall.
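A matching multi-layer perceptron discriminator sketch, with sizes chosen (as an assumption) to pair with the generator sketch above:

import torch.nn as nn

class Discriminator(nn.Module):
    """Binary classifier: outputs the probability that the input is real."""
    def __init__(self, in_dim=28 * 28):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid(),  # probability of authenticity in [0, 1]
        )

    def forward(self, x):
        return self.net(x)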
Discriminator Loss
The discriminator minimizes the negative log likelihood of correctly classifying both generated and real samples. This loss incentivizes the discriminator to accurately classify generated samples as fake and real samples as real:

J_D = -\frac{1}{m} \sum_{i=1}^{m} \log D(x_i) - \frac{1}{m} \sum_{i=1}^{m} \log\left(1 - D(G(z_i))\right)

Where,
$J_D$ assesses the discriminator’s ability to discern between generated and actual samples.
$\log D(x_i)$ represents the log likelihood that the discriminator will accurately classify real data as real.
$\log(1 - D(G(z_i)))$ represents the log likelihood that the discriminator will correctly classify generated samples as fake.
The discriminator aims to reduce this loss by accurately identifying artificial and real samples.
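The same loss as code, under the same assumptions as the generator-loss sketch:

import torch

def discriminator_loss(disc, real, fake):
    # J_D = -(1/m) * sum_i log D(x_i) - (1/m) * sum_i log(1 - D(G(z_i)))
    eps = 1e-8  # numerical stability
    loss_real = -torch.log(disc(real) + eps).mean()      # real classified as real
    loss_fake = -torch.log(1 - disc(fake) + eps).mean()  # fake classified as fake
    return loss_real + loss_fake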
MinMax Loss
In a Generative Adversarial Network (GAN), the minimax objective is:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]

Where,
$x$ represents actual data samples drawn from the true data distribution $p_{\text{data}}(x)$.
$D(x)$ represents the discriminator’s probability of correctly identifying actual data as real.
$D(G(z))$ is the probability that the discriminator will identify generated data coming from the generator as authentic.
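A short derivation clarifies what this objective optimizes. Holding G fixed, the value function can be maximized pointwise in D(x); since $a \log y + b \log(1 - y)$ is maximized at $y^{*} = a / (a + b)$, the optimal discriminator is:

D^{*}(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}

where $p_g$ is the distribution of the generator’s samples. When the generator succeeds and $p_g = p_{\text{data}}$, this reduces to $D^{*}(x) = 1/2$, matching the earlier remark that a well-trained generator fools the discriminator about half the time.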
Working of GANs
1. Initialization: Two neural networks are created: a Generator (G) and a Discriminator (D).
G is tasked with creating new data, like images or text, that closely resembles real
data.
D acts as a critic, trying to distinguish between real data (from a training dataset) and
the data generated by G.
2. Generator’s First Move: G takes a random noise vector as input. This noise vector contains
random values and acts as the starting point for G’s creation process. Using its internal layers
and learned patterns, G transforms the noise vector into a new data sample, like a generated
image.
3. Discriminator’s Turn: D receives two types of input: real data samples from the training dataset, and the data samples generated by G in the previous step. D’s job is to analyze each input and determine whether it’s real data or something G cooked up. It outputs a probability score between 0 and 1. A score of 1 indicates the data is likely real, and 0 suggests it’s fake.
4. Adversarial Learning: If D correctly identifies real data as real (score close to 1) and generated data as fake (score close to 0), both G and D are rewarded to a small degree, because they’re both doing their jobs well.
5. Generator’s Improvement:
When D mistakenly labels G’s creation as real (score close to 1), it’s a sign that G is on
the right track. In this case, G receives a significant positive update, while D receives
a penalty for being fooled.
This feedback helps G improve its generation process to create more realistic data.
6. Discriminator’s Adaptation:
Conversely, if D correctly identifies G’s fake data (score close to 0), G receives no reward and D is further strengthened in its discrimination abilities.
This ongoing duel between G and D refines both networks over time.
As training progresses, G gets better at generating realistic data, making it harder for D to tell the
difference. Ideally, G becomes so adept that D can’t reliably distinguish real from fake data. At this
point, G is considered well-trained and can be used to generate new, realistic data samples.
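A condensed training loop tying these steps together, reusing the hypothetical Generator and Discriminator sketches from the Architecture section. The `dataloader`, learning rates, and batch shapes are assumptions, and the generator update uses the common non-saturating variant (maximize log D(G(z)) instead of minimizing log(1 - D(G(z)))):

import torch
import torch.nn as nn

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

for real in dataloader:  # assumed iterable of (batch, 784) tensors in [-1, 1]
    m = real.size(0)
    ones, zeros = torch.ones(m, 1), torch.zeros(m, 1)

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    fake = G(torch.randn(m, 100)).detach()  # detach: don't update G here
    loss_d = bce(D(real), ones) + bce(D(fake), zeros)
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # Generator step: try to make D output 1 on freshly generated fakes.
    fake = G(torch.randn(m, 100))
    loss_g = bce(D(fake), ones)
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()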
Soft Parameter Sharing
Each model has its own set of weights and biases, and the distance between these parameters across the different models is regularized so that the parameters become similar and can represent all the tasks.
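A minimal sketch of the regularizer, assuming two hypothetical task models with identical architectures:

import torch

def soft_sharing_penalty(model_a, model_b, strength=1e-2):
    # Sum of squared L2 distances between corresponding parameters,
    # pulling the two models' weights toward each other during training.
    penalty = sum((pa - pb).pow(2).sum()
                  for pa, pb in zip(model_a.parameters(), model_b.parameters()))
    return strength * penalty

# Usage: total_loss = loss_task_a + loss_task_b + soft_sharing_penalty(net_a, net_b)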