SinGAN: Learning a Generative Model
from a Single Natural Image
Jishnu Padalunkal, Shobhit Sundriyal
Fellowship.ai (Jan ’20 Cohort)
AI Reading Group
05-Mar-2020
Outline
● Introduction to GANs
● GAN Ecosystem
● SinGAN - Intro
● SinGAN - Architecture
● SinGAN - Experiments
● SinGAN - Results
Premise
● Classical GANs
○ Require a large image dataset
○ Produce dataset-specific results
○ Are application specific (super resolution (SR), texture synthesis)
○ Most classical GANs suffer from all of the above limitations.
● This is where SinGAN comes to the rescue
○ Not dependent on a large dataset or on a specific task.
○ Given a single natural image, it can generate diverse random samples by learning the image's internal patch distribution.
○ Results are not application or dataset specific.
○ Hence, useful in a wide range of applications.
A Quick Look at Classic GANs
GAN Ecosystem
● Training requires a lot of image data
● Single Generator
○ Generates images similar to original image distribution
● Single Discriminator
○ Tries to discriminate real and generated (fake) images
● Adversarial Loss
○ This is the standard loss that is found in every classical GAN architecture
GAN Loss Function
● L_adv(G, D) = E_x[log D(x)] + E_z[log(1 − D(G(z)))]
● D(x) is the discriminator's estimate of the probability that real data instance x is real.
● E_x is the expected value over all real data instances.
● G(z) is the generator's output when given noise z.
● D(G(z)) is the discriminator's estimate of the probability that a fake instance is real.
● E_z is the expected value over all random inputs to the generator (in effect, the expected value over all generated fake instances G(z)).
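The objective above can be evaluated directly on discriminator outputs. A minimal numpy sketch (illustrative, not part of the original slides): the discriminator loss is the negated objective, and the generator uses the common non-saturating form, maximizing log D(G(z)) rather than minimizing log(1 − D(G(z))).

```python
import numpy as np

def gan_losses(d_real, d_fake):
    """Vanilla GAN losses from discriminator probabilities.

    d_real = D(x) in (0, 1), d_fake = D(G(z)) in (0, 1).
    Returns (discriminator loss, non-saturating generator loss),
    each averaged over the batch.
    """
    eps = 1e-12  # numerical floor to keep log() finite
    # Discriminator: maximize log D(x) + log(1 - D(G(z))), i.e. minimize the negation.
    d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
    # Generator (non-saturating): maximize log D(G(z)).
    g_loss = -np.mean(np.log(d_fake + eps))
    return d_loss, g_loss
```

A near-perfect discriminator (D(x) ≈ 1, D(G(z)) ≈ 0) drives its own loss toward zero while the generator loss grows, which is exactly the adversarial tension the loss encodes.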
Some Examples
● Classic GANs
○ SRGAN
○ DCGAN
○ EDSR
● Distribution learning using single image
○ PSGAN (works on textures)
○ Deep Texture Synthesis (works on textures)
○ ZSSR
● NOTE: These are listed only for reference; we will stick to SinGAN.
SinGAN - Intro
SinGAN - Applications
SinGAN - Similar works
● PSGAN
● Deep Texture Synthesis
● Several recent works proposed to “overfit” a deep model to a single training
example.
● However, these methods are designed for specific tasks (e.g., super
resolution, texture expansion).
● Shocher et al. were the first to introduce an internal GAN based model for a
single natural image, and illustrated it in the context of retargeting.
● However, their generation is conditioned on an input image (i.e., mapping
images to images).
SinGAN - Similar works
● In contrast, SinGAN is purely generative (i.e. maps noise to image samples),
and thus suits many different image manipulation tasks.
● Single image GANs have been explored only in the context of texture
generation.
● These models do not generate meaningful samples when trained on
non-texture images.
● SinGAN, on the other hand, is not restricted to textures and can handle general
natural images.
SinGAN - Some results with Similar works
SinGAN’s Multi-Scale Pipeline (Architecture)
SinGAN - Generator Architecture
SinGAN - Experiments
● Training
○ Adversarial loss (WGAN-GP, i.e. Wasserstein loss) + reconstruction loss:
■ min over G_n, max over D_n of: L_adv(G_n, D_n) + α L_rec(G_n)
○ Reconstruction loss (the sample generated from a fixed noise setting must reproduce the training image):
■ L_rec = || G_n(0, upsampled reconstruction from scale n+1) − x_n ||²
○ Reconstruction noise maps are fixed to {z_rec, 0, 0, 0, 0, 0}; scales run n = N, N−1, ..., 0
○ Training takes about 30 minutes on a 1080 Ti GPU for an image of size 256 × 256 pixels, and
generation at test time is well under one second per image.
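The two per-scale losses above can be sketched as plain functions. This is an illustrative numpy sketch under the paper's stated setup (WGAN critic objective, α = 10), not the authors' code; the gradient-penalty term is assumed to be computed separately and passed in.

```python
import numpy as np

ALPHA = 10.0  # reconstruction-loss weight alpha reported in the slides

def critic_wgan_loss(d_real, d_fake, grad_penalty):
    """WGAN-GP critic loss at one scale: the critic maximizes
    E[D(x)] - E[D(G(z))], so we minimize the negation plus the
    (externally computed) gradient penalty."""
    return -(np.mean(d_real) - np.mean(d_fake)) + grad_penalty

def generator_loss(d_fake, generated_rec, real_image, alpha=ALPHA):
    """Generator loss at one scale: adversarial term (fool the critic)
    plus alpha times the squared reconstruction error, where
    generated_rec is the sample produced from the fixed noise maps
    {z_rec, 0, 0, ...}."""
    adv = -np.mean(d_fake)
    rec = np.mean((generated_rec - real_image) ** 2)
    return adv + alpha * rec
```

The reconstruction term is what anchors each scale to the single training image; without it, nothing forces the pyramid of generators to be able to reproduce the original photo.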
SinGAN - Experiments
● Optimization
○ At each scale, the weights of the generator and discriminator are initialized to those from the
previous trained scale (except for the coarsest scale or when changing the number of kernels,
in which cases we use random initialization).
○ Each scale trains for 2000 iterations.
○ Adam optimizer is used with a learning rate of 0.0005 (decreased by a factor of 0.1 after 1600
iterations) and momentum parameters β1 = 0.5, β2 = 0.999.
○ The weight of the reconstruction loss is α = 10
○ All LeakyReLU activations have a slope of 0.2 for negative values.
○ The generator’s last conv-block activation at each scale is Tanh instead of LeakyReLU.
○ The discriminator’s last conv-block at each scale includes neither normalization nor activation.
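The optimization schedule above is simple enough to state as code. A minimal sketch of the reported hyperparameters (constant names are illustrative):

```python
# Reported per-scale training hyperparameters.
LR = 5e-4            # Adam learning rate
BETAS = (0.5, 0.999) # Adam momentum parameters (beta1, beta2)
ITERS = 2000         # iterations per scale
LR_DECAY_STEP = 1600 # iteration at which the learning rate drops
LR_DECAY_GAMMA = 0.1 # multiplicative decay factor

def learning_rate(iteration):
    """Step schedule: constant 5e-4, dropped by 10x after 1600 iterations."""
    return LR * (LR_DECAY_GAMMA if iteration >= LR_DECAY_STEP else 1.0)
```

In a typical PyTorch setup this corresponds to an Adam optimizer with a step learning-rate scheduler, re-created at each scale with weights copied from the previous one.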
SinGAN - Observations
● Boundary conditions and the effect of padding
● The padding configuration strongly affects the diversity between samples at the corners.
● Zero padding in each convolutional layer (layer padding) leads to only minor variability at the
corners.
● Padding only the input to the net (initial padding) leads to somewhat increased variability at the
corners.
● Finally, padding the input with noise leads to high variability.
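The padding variants compared above can be illustrated with a small helper. This is a hypothetical sketch (not the authors' code) that pads a 2-D map either with zeros, as in the low-diversity configurations, or with fresh noise, as in the high-diversity one:

```python
import numpy as np

def pad_input(x, pad, mode="noise", rng=None):
    """Pad an (H, W) map before it enters the network.

    mode="zeros": deterministic border -> low corner diversity,
    because the corners see the same context on every draw.
    mode="noise": a fresh random border on every draw -> high
    corner diversity.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = x.shape
    out = np.zeros((h + 2 * pad, w + 2 * pad))
    if mode == "noise":
        out = rng.standard_normal(out.shape)
    out[pad:h + pad, pad:w + pad] = x  # interior keeps the input unchanged
    return out
```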
Results
Effects of scales
Effects of scale cont..
“Real/Fake” test
Super Resolution Results
● Values reported as (PSNR / NIQE) pairs
SR Results (BSD100 Dataset)
● The original undistorted image has the best perceptual quality and
therefore the lowest NIQE score.
SIFID
● SIFID (Single Image FID) applies the FID metric to the internal patch statistics of a single image; lower values mean better image quality and diversity.
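At its core, FID (and hence SIFID) is the Fréchet distance between two Gaussians fitted to deep-feature activations. A sketch of that formula, assuming the feature means and covariances have already been computed:

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between Gaussians N(mu1, sigma1) and
    N(mu2, sigma2) fitted to feature activations. SIFID applies
    this to per-image internal (patch) feature statistics instead
    of dataset-level statistics."""
    diff = mu1 - mu2
    covmean = sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):  # numerical noise can add tiny imaginary parts
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

Identical distributions yield a distance of zero, and the distance grows as the generated statistics drift away from the real ones, which is why lower is better.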
Slight modification for Super Resolution
Paint to Image Results
Image Editing Results
Harmonization
Animation
Failure Cases
Useful Links
● Project Web Page
○ https://webee.technion.ac.il/people/tomermic/SinGAN/SinGAN.htm
● Paper
○ https://arxiv.org/abs/1905.01164
● Supplementary Material
○ https://tomer.net.technion.ac.il/files/2019/09/SingleImageGan_SM.pdf
● Code
○ https://github.com/tamarott/SinGAN
Questions?
Thank you
