SinGAN: Learning a Generative Model
from a Single Natural Image
Jishnu Padalunkal, Shobhit Sundriyal
Fellowship.ai (Jan ’20 Cohort)
AI Reading Group
05-Mar-2020
Outline
● Introduction to GANs
● GAN Ecosystem
● SinGAN - Intro
● SinGAN - Architecture
● SinGAN - Experiments
● SinGAN - Results
Premise
● Classical GANs
○ Require a large image dataset
○ Produce dataset-specific results
○ Are application specific (super resolution (SR), texture synthesis)
○ Most classical GANs suffer from all of the above limitations.
● This is where SinGAN comes to the rescue
○ Not dependent on a large dataset or on a specific task.
○ Given a single natural image, it can generate diverse random samples by learning the image's internal patch distribution.
○ Results are not application or dataset specific.
○ Hence, useful in a wide range of applications.
A Quick Look at Classic GANs
GAN Ecosystem
● Training requires a lot of image data
● Single Generator
○ Generates images similar to original image distribution
● Single Discriminator
○ Tries to discriminate real and generated (fake) images
● Adversarial Loss
○ This is the standard loss that is found in every classical GAN architecture
GAN Loss Function
● L_adv(G, D) = E_x[log D(x)] + E_z[log(1 − D(G(z)))]
● D(x) is the discriminator's estimate of the probability that real data instance x is real.
● E_x is the expected value over all real data instances.
● G(z) is the generator's output when given noise z.
● D(G(z)) is the discriminator's estimate of the probability that a fake instance is real.
● E_z is the expected value over all random inputs to the generator (in effect, the expected value over all generated fake instances G(z)).
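The objective above can be evaluated directly on discriminator outputs. A minimal numpy sketch (illustrative, not part of the original slides): the discriminator loss is the negated objective, and the generator uses the common non-saturating form, maximizing log D(G(z)) rather than minimizing log(1 − D(G(z))).

```python
import numpy as np

def gan_losses(d_real, d_fake):
    """Vanilla GAN losses from discriminator probabilities.

    d_real = D(x) in (0, 1), d_fake = D(G(z)) in (0, 1).
    Returns (discriminator loss, non-saturating generator loss),
    each averaged over the batch.
    """
    eps = 1e-12  # numerical floor to keep log() finite
    # Discriminator: maximize log D(x) + log(1 - D(G(z))), i.e. minimize the negation.
    d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
    # Generator (non-saturating): maximize log D(G(z)).
    g_loss = -np.mean(np.log(d_fake + eps))
    return d_loss, g_loss
```

A near-perfect discriminator (D(x) ≈ 1, D(G(z)) ≈ 0) drives its own loss toward zero while the generator loss grows, which is exactly the adversarial tension the loss encodes.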
Some Examples
● Classic GANs
○ SRGAN
○ DCGAN
○ EDSR
● Distribution learning using single image
○ PSGAN (works on textures)
○ Deep Texture Synthesis (works on textures)
○ ZSSR
● NOTE: These are listed only for reference; we will stick to SinGAN.
SinGAN - Intro
SinGAN - Applications
SinGAN - Similar works
● PSGAN
● Deep Texture Synthesis
● Several recent works proposed to “overfit” a deep model to a single training
example.
● However, these methods are designed for specific tasks (e.g., super
resolution, texture expansion).
● Shocher et al. were the first to introduce an internal GAN based model for a
single natural image, and illustrated it in the context of retargeting.
● However, their generation is conditioned on an input image (i.e., mapping
images to images).
SinGAN - Similar works
● In contrast, SinGAN is purely generative (i.e. maps noise to image samples),
and thus suits many different image manipulation tasks.
● Single image GANs have been explored only in the context of texture
generation.
● These models do not generate meaningful samples when trained on
non-texture images.
● SinGAN, on the other hand, is not restricted to textures and can handle general
natural images.
SinGAN - Some results with Similar works
SinGAN’s Multi-Scale Pipeline (Architecture)
SinGAN - Generator Architecture
SinGAN - Experiments
● Training
○ Adversarial loss (WGAN-GP, i.e. Wasserstein loss) + reconstruction loss:
■ min over G_n, max over D_n of: L_adv(G_n, D_n) + α L_rec(G_n)
○ Reconstruction loss (the sample generated from a fixed noise setting must reproduce the training image):
■ L_rec = || G_n(0, upsampled reconstruction from scale n+1) − x_n ||²
○ Reconstruction noise maps are fixed to {z_rec, 0, 0, 0, 0, 0}; scales run n = N, N−1, ..., 0
○ Training takes about 30 minutes on a 1080 Ti GPU for an image of size 256 × 256 pixels, and
generation at test time is well under one second per image.
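The two per-scale losses above can be sketched as plain functions. This is an illustrative numpy sketch under the paper's stated setup (WGAN critic objective, α = 10), not the authors' code; the gradient-penalty term is assumed to be computed separately and passed in.

```python
import numpy as np

ALPHA = 10.0  # reconstruction-loss weight alpha reported in the slides

def critic_wgan_loss(d_real, d_fake, grad_penalty):
    """WGAN-GP critic loss at one scale: the critic maximizes
    E[D(x)] - E[D(G(z))], so we minimize the negation plus the
    (externally computed) gradient penalty."""
    return -(np.mean(d_real) - np.mean(d_fake)) + grad_penalty

def generator_loss(d_fake, generated_rec, real_image, alpha=ALPHA):
    """Generator loss at one scale: adversarial term (fool the critic)
    plus alpha times the squared reconstruction error, where
    generated_rec is the sample produced from the fixed noise maps
    {z_rec, 0, 0, ...}."""
    adv = -np.mean(d_fake)
    rec = np.mean((generated_rec - real_image) ** 2)
    return adv + alpha * rec
```

The reconstruction term is what anchors each scale to the single training image; without it, nothing forces the pyramid of generators to be able to reproduce the original photo.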
SinGAN - Experiments
● Optimization
○ At each scale, the weights of the generator and discriminator are initialized to those from the
previous trained scale (except for the coarsest scale or when changing the number of kernels,
in which cases we use random initialization).
○ Each scale trains for 2000 iterations.
○ Adam optimizer is used with a learning rate of 0.0005 (decreased by a factor of 0.1 after 1600
iterations) and momentum parameters β1 = 0.5, β2 = 0.999.
○ The weight of the reconstruction loss is α = 10
○ All LeakyReLU activations have a slope of 0.2 for negative values.
○ The generator’s last conv-block activation at each scale is Tanh instead of LeakyReLU.
○ The discriminator’s last conv-block at each scale includes neither normalization nor activation.
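The optimization schedule above is simple enough to state as code. A minimal sketch of the reported hyperparameters (constant names are illustrative):

```python
# Reported per-scale training hyperparameters.
LR = 5e-4            # Adam learning rate
BETAS = (0.5, 0.999) # Adam momentum parameters (beta1, beta2)
ITERS = 2000         # iterations per scale
LR_DECAY_STEP = 1600 # iteration at which the learning rate drops
LR_DECAY_GAMMA = 0.1 # multiplicative decay factor

def learning_rate(iteration):
    """Step schedule: constant 5e-4, dropped by 10x after 1600 iterations."""
    return LR * (LR_DECAY_GAMMA if iteration >= LR_DECAY_STEP else 1.0)
```

In a typical PyTorch setup this corresponds to an Adam optimizer with a step learning-rate scheduler, re-created at each scale with weights copied from the previous one.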
SinGAN - Observations
● Boundary conditions and the effect of padding
● The padding configuration strongly affects the diversity between samples at the corners.
● Zero padding in each convolutional layer (layer padding) leads to only minor variability at the
corners.
● Padding only the input to the net (initial padding) leads to somewhat increased variability at the
corners.
● Finally, padding the input with noise leads to high variability.
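The padding variants compared above can be illustrated with a small helper. This is a hypothetical sketch (not the authors' code) that pads a 2-D map either with zeros, as in the low-diversity configurations, or with fresh noise, as in the high-diversity one:

```python
import numpy as np

def pad_input(x, pad, mode="noise", rng=None):
    """Pad an (H, W) map before it enters the network.

    mode="zeros": deterministic border -> low corner diversity,
    because the corners see the same context on every draw.
    mode="noise": a fresh random border on every draw -> high
    corner diversity.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = x.shape
    out = np.zeros((h + 2 * pad, w + 2 * pad))
    if mode == "noise":
        out = rng.standard_normal(out.shape)
    out[pad:h + pad, pad:w + pad] = x  # interior keeps the input unchanged
    return out
```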
Results
Effects of scales
Effects of scale cont..
“Real/Fake” test
Super Resolution Results
● Values reported as (PSNR / NIQE) pairs
SR Results (BSD100 Dataset)
● The original undistorted image has the best perceptual quality and
therefore the lowest NIQE score.
SIFID
● SIFID (Single Image FID) applies the FID metric to the internal patch statistics of a single image; lower values mean better image quality and diversity.
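At its core, FID (and hence SIFID) is the Fréchet distance between two Gaussians fitted to deep-feature activations. A sketch of that formula, assuming the feature means and covariances have already been computed:

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between Gaussians N(mu1, sigma1) and
    N(mu2, sigma2) fitted to feature activations. SIFID applies
    this to per-image internal (patch) feature statistics instead
    of dataset-level statistics."""
    diff = mu1 - mu2
    covmean = sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):  # numerical noise can add tiny imaginary parts
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

Identical distributions yield a distance of zero, and the distance grows as the generated statistics drift away from the real ones, which is why lower is better.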
Slight modification for Super Resolution
Paint to Image Results
Image Editing Results
Harmonization
Animation
Failure Cases
Useful Links
● Project Web Page
○ https://webee.technion.ac.il/people/tomermic/SinGAN/SinGAN.htm
● Paper
○ https://arxiv.org/abs/1905.01164
● Supplementary Material
○ https://tomer.net.technion.ac.il/files/2019/09/SingleImageGan_SM.pdf
● Code
○ https://github.com/tamarott/SinGAN
Questions?
Thank you
