Introduction to Deep Generative Modeling
Lecture #1
HY-673 – Computer Science Dept, University of Crete
Professors: Yannis Pantazis & Yannis Stylianou
TAs: Michail Raptakis & Michail Spanakis
What is this course about?
✓ Statistical Generative Models
✓ A Generative Model (GM) is defined as a probability distribution, p(x).
✓ A statistical GM is a trainable probabilistic model, p_θ(x) (see the sketch below).
✓ A deep GM is a statistical generative model parametrized by a neural network.
✓ p(x), and in many cases p_θ(x), are not analytically known; only samples are available!
✓ Data (x): complex, (un)structured samples (e.g., images, speech, molecules, text, etc.)
✓ Prior knowledge: parametric form (e.g., Gaussian, mixture, softmax), loss function (e.g., maximum likelihood, divergence), optimization algorithm, invariance/equivariance, laws of physics, prior distribution, etc.
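To make "trainable probabilistic model" concrete, here is a minimal sketch (illustrative, not from the slides): a diagonal-Gaussian p_θ(x) over 2-D data in PyTorch, whose mean and scale are the learnable parameters θ.

```python
import torch

# A minimal trainable probabilistic model p_theta(x):
# a diagonal Gaussian over 2-D data with learnable mean and scale.
dim = 2
mu = torch.zeros(dim, requires_grad=True)         # learnable mean
log_sigma = torch.zeros(dim, requires_grad=True)  # learnable log-std (keeps sigma > 0)

def log_prob(x):
    """Log-density log p_theta(x) of the diagonal Gaussian."""
    dist = torch.distributions.Normal(mu, log_sigma.exp())
    return dist.log_prob(x).sum(dim=-1)

def sample(n):
    """Draw n new samples x ~ p_theta(x)."""
    dist = torch.distributions.Normal(mu, log_sigma.exp())
    return dist.sample((n,))
```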
What is this course about?
✓ A dataset with images, e.g., of bedrooms (LSUN dataset)
[figure: real bedroom images vs. images drawn from the GM]
  data distribution: p(x) or p_data(x) or p_d(x); GM's distribution: p_θ(x) or p_g(x)
✓ Goal: Find θ ∈ Θ such that p_θ(x) ≈ p_data(x).
✓ It is generative because sampling from p_θ(x) generates new unseen images.
[figure: generated bedroom samples] … ~ p_θ(x)
What is this course about?
[figure: data samples x_i ~ p_data; training searches the parametric model space for the p_θ that minimizes a distance d(p_data, p_θ)]
We will study:
✓ Families of Generative Models
✓ Algorithms to train these GMs
✓ Network architectures
✓ Loss functions & distances between probability density functions (see the sketch below)
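Continuing the Gaussian sketch above: training by maximum likelihood minimizes the negative log-likelihood of the data, which equals KL(p_data || p_θ) up to a θ-independent constant, one concrete instance of minimizing a distance between distributions. The data below is synthetic and purely illustrative.

```python
# Continues the previous sketch (mu, log_sigma, log_prob, sample, dim).
data = torch.randn(1000, dim) * 1.5 + 3.0   # stand-in for real data samples
opt = torch.optim.Adam([mu, log_sigma], lr=0.05)
for step in range(200):
    nll = -log_prob(data).mean()            # Monte Carlo estimate of E[-log p_theta(x)]
    opt.zero_grad()
    nll.backward()
    opt.step()
# After training, sample(16) yields new points resembling the data.
```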
What is this course about?
✓ Conditional Generative Models
✓ A conditional GM is defined as a conditional probability distribution, p(x|y) (sketched below).
✓ y: conditioning variable(s) (e.g., label/class, text, captions, speaker id, style, rotation, thickness, …)
[figure: generated digits] … ~ p_θ(x|y), y: digit label
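A minimal way to realize p_θ(x|y) (illustrative, not the slides' method) is to give each label value its own parameters; deep conditional GMs instead feed an embedding of y into a neural network.

```python
import torch

dim, num_classes = 2, 10
# One learnable diagonal Gaussian per class label y.
mus = torch.zeros(num_classes, dim, requires_grad=True)
log_sigmas = torch.zeros(num_classes, dim, requires_grad=True)

def cond_log_prob(x, y):
    """log p_theta(x | y) for integer labels y."""
    dist = torch.distributions.Normal(mus[y], log_sigmas[y].exp())
    return dist.log_prob(x).sum(dim=-1)

def cond_sample(y):
    """Draw x ~ p_theta(x | y) for the given labels y."""
    dist = torch.distributions.Normal(mus[y], log_sigmas[y].exp())
    return dist.sample()
```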
Discriminative vs Generative Models
Data: x → Label: y ("Cat")
✓ Discriminative Model
  ✓ Learn the probability distribution p(y|x)
✓ Generative Model
  ✓ Learn the probability distribution p(x)
✓ Conditional GM
  ✓ Learn p(x|y) (the sketch below contrasts the three)
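One concrete illustration of how these relate (illustrative, not from the slides): a Gaussian naive-Bayes classifier fits the generative pieces p(x|y) and p(y), then recovers the discriminative distribution p(y|x) by Bayes' rule.

```python
import numpy as np

# Toy 1-D data for two classes (hypothetical: "cat" = 0, "dog" = 1).
rng = np.random.default_rng(0)
x0 = rng.normal(-1.0, 1.0, 100)   # class-0 feature samples
x1 = rng.normal(+2.0, 1.0, 100)   # class-1 feature samples

# Generative side: fit p(x|y) as a Gaussian per class, p(y) as frequencies.
means = np.array([x0.mean(), x1.mean()])
stds = np.array([x0.std(), x1.std()])
priors = np.array([x0.size, x1.size], dtype=float)
priors /= priors.sum()

def p_x_given_y(x, y):
    """Gaussian density p(x | y)."""
    z = (x - means[y]) / stds[y]
    return np.exp(-0.5 * z * z) / (stds[y] * np.sqrt(2 * np.pi))

def p_y_given_x(x):
    """Discriminative distribution p(y | x), obtained by Bayes' rule."""
    joint = np.array([p_x_given_y(x, y) * priors[y] for y in (0, 1)])
    return joint / joint.sum()

print(p_y_given_x(1.5))   # posterior class probabilities at x = 1.5
```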
Families of Generative Models
✓ Energy-based Models (EBMs)
✓ Generative Adversarial Nets (GANs)
✓ Variational Auto-Encoders (VAEs)
✓ Normalizing Flows (NFs)
✓ Diffusion Probabilistic Models (DPMs)
✓ Deep Autoregressive Models (ARMs)
Families of Generative Models
GMs, grouped by how they handle the likelihood:
  Exact:        ARMs: (R)NADE, WaveNet, WaveRNN, GPT, …
                NFs:  Planar, Coupling, MAFs/IAFs, …
  Approximate:  VAEs: Vanilla, β-VAE, VQ-VAE, …
                EBMs: Belief nets, Boltzmann machines, …
                DPMs: diffusion, denoising, score, …
  Implicit:     GANs: Vanilla, WGAN, f-GAN, (f, Γ)-GAN, …
                GGFs: KALE, Lipschitz-reg., …
Less-known Families of GMs
✓ Generative Stochastic Networks (GSNs)
✓ Generative Gradient Flows (GGFs)
✓ Specific EBMs
  ✓ Deep Belief Networks
  ✓ Deep Boltzmann Machines
  …
✓ Generative Flow Networks (GFlowNets)
…
Progress in Image Generation
✓ Face generation: rapid progress in image quality
Image Super-Resolution
✓ Several inverse problems can be solved with conditional GMs.
✓ Inverse problems: from measurements, calculate/infer the causes.
✓ P(high resolution | low resolution)
✓ Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network - Ledig et al. - CVPR 2017
  https://openaccess.thecvf.com/content_cvpr_2017/html/Ledig_Photo-Realistic_Single_Image_CVPR_2017_paper.html
Image Inpainting
✓ P(full image | masked image)
✓ DeepFill (v2): Free-Form Image Inpainting With Gated Convolution - Yu et al. - ICCV 2019
  https://openaccess.thecvf.com/content_ICCV_2019/html/Yu_Free-Form_Image_Inpainting_With_Gated_Convolution_ICCV_2019_paper.html
Image Colorization
✓ P(colored image | grayscale image)
✓ PalGAN: Image Colorization with Palette Generative Adversarial Networks - Wang et al. - ECCV 2022
  https://link.springer.com/chapter/10.1007/978-3-031-19784-0_16
Text2Image Translation
✓ Recent advancements:
  ✓ DALL-E 2
  ✓ Stable Diffusion
  ✓ Imagen
  ✓ GLIDE
  ✓ Midjourney
✓ P(image | text)
[image: Théâtre D’opéra Spatial by Jason Allen and Midjourney]
OpenAI’s DALL-E 2
✓ Text → text embedding → image embedding → low-resolution image → medium-resolution image → high-resolution image
✓ P(high-res image | text caption) = P(image emb | text caption) × P(high-res image | image emb) (sketched below)
✓ Hierarchical Text-Conditional Image Generation with CLIP Latents - Ramesh et al. - 2022
  https://cdn.openai.com/papers/dall-e-2.pdf
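Operationally, the factorization means ancestral sampling in two stages. The sketch below is hypothetical pseudocode in Python: `prior_model` and `decoder_model` stand in for DALL-E 2's diffusion prior and decoder and are not real APIs.

```python
def sample_image(text_caption, prior_model, decoder_model):
    """Ancestral sampling through the two-stage factorization
    P(image | caption) = P(image emb | caption) * P(image | image emb).
    `prior_model` and `decoder_model` are hypothetical callables."""
    # Stage 1: sample a CLIP-style image embedding given the caption.
    image_emb = prior_model.sample(text_caption)   # ~ P(image emb | text caption)
    # Stage 2: decode the embedding into pixels; upsampling models then
    # take the image from low to high resolution.
    image = decoder_model.sample(image_emb)        # ~ P(image | image emb)
    return image
```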
Image2Image Translation
✓ Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks (CycleGAN) - Zhu et al. - ICCV 2017
  https://openaccess.thecvf.com/content_iccv_2017/html/Zhu_Unpaired_Image-To-Image_Translation_ICCV_2017_paper.html
Speech & Audio Synthesis
✓ P(x_{t+1} | x_t, x_{t-1}, …, text) (sampling loop sketched below)
✓ WaveNet, WaveRNN, Parallel WaveNet, MelGAN, WaveDiff, …
[figure: WaveNet listening-test results for text-to-speech (parametric vs. concatenative vs. WaveNet) and unconditional music generation; van den Oord et al., 2016]
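The factorization above generates audio one sample at a time. Below is a minimal, illustrative sampling loop in PyTorch; `next_sample_logits` is a hypothetical stand-in for a WaveNet-style network over 256 quantized amplitude levels.

```python
import torch

def generate_audio(next_sample_logits, conditioning, n_samples, context_len=1024):
    """Autoregressive sampling: x_{t+1} ~ P(x_{t+1} | x_t, x_{t-1}, ..., text).
    `next_sample_logits` is a hypothetical model mapping (recent samples,
    conditioning) to logits over 256 quantized amplitude levels."""
    waveform = [128]                               # start from mid-level "silence"
    for _ in range(n_samples):
        context = torch.tensor(waveform[-context_len:])
        logits = next_sample_logits(context, conditioning)   # shape: (256,)
        probs = torch.softmax(logits, dim=-1)
        waveform.append(torch.multinomial(probs, 1).item())  # sample next level
    return waveform
```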
(Natural) Language Generation
✓ P(next word | previous words) (see the snippet below)
✓ https://app.inferkit.com/demo
✓ GPT-3
  ✓ Generative Pre-trained Transformer
✓ https://deepai.org/machine-learning-model/text-generator
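For hands-on experimentation, generating text from a pretrained autoregressive LM takes a few lines. This sketch assumes the Hugging Face `transformers` package and the small `gpt2` checkpoint, neither of which the slides prescribe.

```python
from transformers import pipeline

# Load a small pretrained autoregressive language model.
generator = pipeline("text-generation", model="gpt2")

# Sample a continuation token by token from P(next token | previous tokens).
out = generator("Deep generative models are", max_new_tokens=40, do_sample=True)
print(out[0]["generated_text"])
```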
(Natural) Language Generation
✓ Enormous model size (trillion parameters?)
✓ Enormous & diverse training data
✓ Multimodal capabilities
✓ In-context learning (a.k.a. prompting)
✓ Reinforcement learning
✓ Impressive performance
  ✓ Coherence, relevance, proficiency
✓ Safety & ethics
✓ Few steps from AGI
Geometric Design
✓ Just meshing around with GPT4 (ESA proposal)
Molecule/Drug/Protein Design
✓ MolGAN: An implicit generative model for small molecular graphs - De Cao & Kipf - ICML 2018
Driving forces in GM progress
✓ Representation learning
  ✓ Leveraging the exponential growth of data & of model parameters via self-supervised learning
  ✓ Gave rise to the Foundation Models
✓ Computational resources are also increasing exponentially.
✓ Better understanding of the models and algorithms acts as a key enabler.
✓ Unlocks human productivity & creativity.
✓ Ideally, it will accelerate the scientific discovery process.
Challenges in GMs
✓ Representation: How do we model the joint distribution of many random variables?
  ✓ Need compact & meaningful representations
✓ Learning (a.k.a. quality assessment): What are proper comparison metrics between probability distributions? (see the sketch below)
✓ Reliability: Can we trust the generated outcomes? Are they consistent?
✓ Alignment: Do they perform according to the user's input?
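As a small illustration of one such metric (illustrative, not from the slides): the KL divergence between two Gaussians has a closed form in PyTorch, and when a distribution is only known through samples it can be estimated by Monte Carlo, which is exactly the situation in generative modeling.

```python
import torch

# Exact KL divergence KL(p || q) between two univariate Gaussians:
# one concrete "distance" between probability distributions.
p = torch.distributions.Normal(0.0, 1.0)
q = torch.distributions.Normal(1.0, 2.0)
print(torch.distributions.kl_divergence(p, q))  # closed-form value

# Sample-based estimate E_p[log p(x) - log q(x)]: all that is possible
# when p (like p_data) is only accessible through samples.
x = p.sample((100_000,))
print((p.log_prob(x) - q.log_prob(x)).mean())   # ≈ the exact value above
```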
Prerequisites
✓ Very good knowledge of probability theory, multivariate calculus & linear algebra.
✓ Intermediate knowledge of machine learning & neural networks.
✓ Proficiency in some programming language, preferably Python, is required.
Course Syllabus
✓ Basics in probability theory (1W)
✓ Shallow generative models - GMMs (1W)
✓ Exact (i.e., fully-observed) likelihood GMs
  ✓ AR models (1.5W)
  ✓ Normalizing flows (1.5W)
✓ Approximate likelihood
  ✓ VAEs (2W)
  ✓ Diffusion/Score-based models (2W)
  ✓ EBMs (1W)
✓ Implicit
  ✓ GANs (2W)
[figure: GM taxonomy. Exact: ARMs, NFs; Approximate: VAEs, EBMs, DPMs; Implicit: GANs, GGFs]
Logistics
✓ Teaching Assistant: Michail Raptakis (PhD candidate)
✓ Weekly tutorial (Friday 10:00-12:00): Python/PyTorch basics, neural network architectures and training, practice problems to help with the homework, and solutions to selected homework problems.
✓ Textbook: Probabilistic Machine Learning: Advanced Topics by Kevin P. Murphy
  https://probml.github.io/pml-book/book2.html
✓ Seminal papers will be distributed.
Grading policy
✓ Final exam (30% of total grade)
  ✓ Open notes
  ✓ NO internet
✓ 5-6 homework sets (40% of total grade)
  ✓ Mix of theoretical and programming problems
  ✓ Equally weighted
✓ Project: paper implementation & presentation (30% of total grade)
  ✓ Implementation: 10%
  ✓ Final report: 10%
  ✓ Presentation: 10%
Project
✓ Select from a given list of papers or propose a paper (which has to be approved)
✓ Categories of papers:
  ✓ Application of deep generative models to a novel task/dataset
  ✓ Algorithmic improvements to the learning, inference and/or evaluation of deep generative models
  ✓ Theoretical analysis of any aspect of existing deep generative models
✓ Groups of up to 2 students per project
✓ Computational resources might be provided (Colab, local GPUs, etc.)