
CS294-158 Deep Unsupervised Learning

Lecture 3 Likelihood Models: Flow Models

Pieter Abbeel, Xi (Peter) Chen, Jonathan Ho, Aravind Srinivas, Alex Li, Wilson Yan
UC Berkeley
Our Goal Today
■ How to fit a density model p_θ(x) with continuous data x

■ What do we want from this model?

■ Good fit to the training data (really, the underlying distribution!)
■ For new x, ability to evaluate p_θ(x)
■ Ability to sample from p_θ(x)
■ And, ideally, a latent representation that’s meaningful

■ Differences from Autoregressive Models from last lecture

Outline
■ Foundations of Flows (1-D)
■ 2-D Flows
■ N-D Flows
■ Autoregressive Flows and Inverse Autoregressive Flows
■ RealNVP (like) architectures
■ Glow, Flow++, FFJORD
■ Dequantization

Quick Refresher: Probability Density Models

How to fit a density model?
Continuous data (sample draws):
0.22159854, 0.84525919, 0.09121633, 0.364252 , 0.30738086,
0.32240615, 0.24371194, 0.22400792, 0.39181847, 0.16407012,
0.84685229, 0.15944969, 0.79142357, 0.6505366 , 0.33123603,
0.81409325, 0.74042126, 0.67950372, 0.74073271, 0.37091554,
0.83476616, 0.38346571, 0.33561352, 0.74100048, 0.32061713,
0.09172335, 0.39037131, 0.80496586, 0.80301971, 0.32048452,
0.79428266, 0.6961708 , 0.20183965, 0.82621227, 0.367292 ,
0.76095756, 0.10125199, 0.41495427, 0.85999877, 0.23004346,
0.28881973, 0.41211802, 0.24764836, 0.72743029, 0.20749136,
0.29877091, 0.75781455, 0.29219608, 0.79681589, 0.86823823,
0.29936483, 0.02948181, 0.78528968, 0.84015573, 0.40391632,
0.77816356, 0.75039186, 0.84709016, 0.76950307, 0.29772759,
0.41163966, 0.24862007, 0.34249207, 0.74363912, 0.38303383, …

Maximum likelihood:
arg max_θ Σ_i log p_θ(x^(i))
Equivalently, minimize the expected negative log-likelihood:
min_θ E_x[−log p_θ(x)]
Example Density Model: Mixtures of Gaussians

p_θ(x) = Σ_{i=1}^{k} π_i N(x; μ_i, σ_i^2)

Parameters: means μ_i and variances σ_i^2 of the k components, and mixture weights π_i
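For concreteness, a minimal NumPy sketch of evaluating and sampling such a density (all parameter values here are illustrative):

    import numpy as np

    # Illustrative 1-D mixture with k = 3 components.
    weights = np.array([0.3, 0.5, 0.2])   # mixture weights, sum to 1
    means   = np.array([-2.0, 0.0, 3.0])  # component means
    stds    = np.array([0.5, 1.0, 0.8])   # component standard deviations

    def mog_density(x):
        # p(x) = sum_i pi_i * N(x; mu_i, sigma_i^2)
        comps = np.exp(-0.5 * ((x[..., None] - means) / stds) ** 2) \
                / (stds * np.sqrt(2 * np.pi))
        return (weights * comps).sum(axis=-1)

    def mog_sample(n, rng):
        # Sampling: 1) pick a cluster center, 2) add Gaussian noise.
        idx = rng.choice(len(weights), size=n, p=weights)
        return means[idx] + stds[idx] * rng.standard_normal(n)

    rng = np.random.default_rng(0)
    print(mog_density(np.array([0.0])), mog_sample(5, rng))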
Aside on Mixtures of Gaussians
Do mixtures of Gaussians work for high-dimensional data?

Not really. The sampling process is:
1. Pick a cluster center
2. Add Gaussian noise

Imagine this for modeling natural images! The only way a realistic image can be generated is if it is a cluster center, i.e. if it is already stored directly in the parameters.
How to fit a general density model?

Suppose p_θ(x) is an arbitrary parameterized function of x (e.g. a neural net). Then:

■ How to ensure it is a proper distribution (non-negative, integrates to 1)?
■ How to sample?
■ Latent representation?
Flows: Main Idea
Fit the density by transforming the data through an invertible, differentiable map to a simple base distribution.

Generally: z = f_θ(x), with f_θ invertible and differentiable
Normalizing flow: the case where the base distribution is a standard normal, z ~ N(0, 1)

How to train? How to evaluate p_θ(x)? How to sample?
Flows: Training
Maximum likelihood, as before:
max_θ Σ_i log p_θ(x^(i))
But what is p_θ(x) in terms of the flow f_θ?
Change of Variables
For z = f_θ(x) with base density p_z:
p_θ(x) = p_z(f_θ(x)) |df_θ(x)/dx|

Note: requires f_θ invertible & differentiable
Flows: Training
max_θ Σ_i log p_θ(x^(i)) = max_θ Σ_i [ log p_z(f_θ(x^(i))) + log |df_θ/dx (x^(i))| ]

→ assuming we have an expression for f_θ and its derivative, this can be optimized with Stochastic Gradient Descent
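A minimal sketch of this training loop on toy 1-D data, using the simplest possible invertible map: an affine flow z = exp(log_a)·x + b to a standard normal. Everything here (data, names) is illustrative:

    import math
    import torch

    x = 3.0 + 2.0 * torch.randn(1000)       # toy continuous data

    # Invertible, differentiable flow: z = exp(log_a) * x + b
    log_a = torch.zeros(1, requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    opt = torch.optim.Adam([log_a, b], lr=1e-2)

    for step in range(2000):
        z = torch.exp(log_a) * x + b
        # log p_theta(x) = log p_z(f(x)) + log |df/dx|, with p_z = N(0, 1)
        log_pz = -0.5 * z ** 2 - 0.5 * math.log(2 * math.pi)
        log_det = log_a                      # df/dx = exp(log_a) > 0
        loss = -(log_pz + log_det).mean()    # negative log-likelihood
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Sampling: z ~ N(0, 1), then x = f^{-1}(z) = (z - b) / exp(log_a)
    with torch.no_grad():
        samples = (torch.randn(5) - b) / torch.exp(log_a)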
Flows: Sampling

Step 1: sample z ~ p_z(z)

Step 2: x = f_θ^{-1}(z)
What do we need to keep in mind for f?

Recall, the change-of-variables formula requires f to be:
- Invertible & differentiable
Example: Flow to Uniform z
[Figure, before and after training: true distribution of x, the flow x → z, and the empirical distribution of z]
Example: Flow to Beta(5,5) z
[Figure, before and after training: true distribution of x, the flow x → z, and the empirical distribution of z]
Example: Flow to Gaussian z
[Figure, before and after training: true distribution of x, the flow x → z, and the empirical distribution of z]
Practical Parameterizations of Flows
Requirement: invertible and differentiable

■ Cumulative distribution functions (CDFs)
■ E.g. Gaussian mixture density, mixture of logistics
■ Neural net
■ If each layer is a flow, then a sequence of layers is a flow
■ Each layer:
■ ReLU?
■ Sigmoid?
■ Tanh?
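On the activation question: ReLU maps every non-positive input to 0, so it is not invertible; sigmoid and tanh are strictly monotonic, hence invertible onto their ranges, with diagonal Jacobians. A small sketch for sigmoid:

    import torch

    x = torch.linspace(-3, 3, 7)

    # Sigmoid: strictly increasing, invertible from R onto (0, 1).
    z = torch.sigmoid(x)
    x_rec = torch.log(z / (1 - z))        # the logit recovers x exactly
    print(torch.allclose(x, x_rec, atol=1e-5))

    # As an elementwise flow, the Jacobian is diagonal:
    # log |dz/dx| = log sigmoid'(x) = log z + log(1 - z), summed over dims.
    log_det = (torch.log(z) + torch.log1p(-z)).sum()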
How general are flows?
- Can every (smooth) distribution be represented by a
(normalizing) flow? [considering 1-D for now]

Refresher: Cumulative Distribution Function (CDF)
F_X(x) = P(X ≤ x) = ∫_{−∞}^{x} p_X(t) dt, monotonically increasing from 0 to 1
Sampling via inverse CDF
Sampling from the model: draw u ~ Uniform[0, 1], then set x = F^{-1}(u)

The CDF is an invertible, differentiable map from data to [0, 1]
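A quick sketch with a distribution whose inverse CDF is closed-form, the Exponential(λ): F(x) = 1 − e^{−λx}, so F^{-1}(u) = −ln(1 − u)/λ (λ = 2 is an arbitrary choice):

    import numpy as np

    rng = np.random.default_rng(0)
    lam = 2.0

    u = rng.uniform(size=100_000)    # Step 1: u ~ Uniform[0, 1]
    x = -np.log1p(-u) / lam          # Step 2: x = F^{-1}(u)

    print(x.mean())                  # ~ 1 / lam = 0.5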
How general are flows?
- The CDF turns any (smooth) density into Uniform[0, 1]
- The inverse of a flow is also a flow

→ composing the flow to uniform with the inverse of the target's flow to uniform can turn any (smooth) p(x) into any (smooth) p(z)
2-D Autoregressive Flow
Decompose p(x1, x2) = p(x1) p(x2 | x1), and make each conditional a 1-D flow:
z1 = f_θ(x1), z2 = f_θ(x2; x1)
2-D Autoregressive Flow: Two Moons
Architecture:
■ Base distribution: Uniform[0,1]^2

■ x1: mixture of 5 Gaussians


■ x2: mixture of 5 Gaussians, conditioned on x1

2-D Autoregressive Flow: Face
Architecture:
■ Base distribution: Uniform[0,1]^2

■ x1: mixture of 5 Gaussians


■ x2: mixture of 5 Gaussians, conditioned on x1

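A minimal PyTorch sketch of this architecture (names and layer sizes are illustrative, not the course reference code): each dimension passes through the CDF of a 5-component Gaussian mixture, which maps it to Uniform[0, 1]; the mixture parameters for x2 come from a small net conditioned on x1.

    import torch
    import torch.nn as nn

    K = 5  # mixture components per dimension

    def mixture_cdf_flow(x, logits, means, log_stds):
        # z = sum_i w_i * Phi((x - mu_i) / sigma_i), in (0, 1)
        # log |dz/dx| = log sum_i w_i * N(x; mu_i, sigma_i^2)
        w = torch.softmax(logits, dim=-1)
        comp = torch.distributions.Normal(means, log_stds.exp())
        z = (w * comp.cdf(x.unsqueeze(-1))).sum(-1)
        log_det = torch.logsumexp(w.log() + comp.log_prob(x.unsqueeze(-1)), -1)
        return z, log_det

    params1 = nn.Parameter(torch.randn(3 * K) * 0.1)  # x1: unconditional
    cond_net = nn.Sequential(                         # x2: conditioned on x1
        nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 3 * K))

    def log_prob(x):  # x: (batch, 2); base distribution Uniform[0,1]^2
        l1, m1, s1 = params1.expand(x.shape[0], -1).chunk(3, dim=-1)
        z1, ld1 = mixture_cdf_flow(x[:, 0], l1, m1, s1)
        l2, m2, s2 = cond_net(x[:, :1]).chunk(3, dim=-1)
        z2, ld2 = mixture_cdf_flow(x[:, 1], l2, m2, s2)
        return ld1 + ld2   # log p_z(z) = 0 on [0, 1]^2

Training maximizes log_prob(x).mean() over params1 and cond_net.parameters() with SGD, exactly as in the 1-D case.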
High-dimensional data

x → z: f (inference)
z → x: f^{-1} (sampling)

x and z must have the same dimension
Autoregressive flows
■ The sampling process of a Bayes net is a flow
■ If autoregressive, this flow is called an autoregressive flow:
z_i = f_θ(x_i; x_{1:i-1}) for each dimension i
■ Sampling is an invertible mapping from z to x
Autoregressive flows
■ How to fit autoregressive flows? Map x to z:
z_i = f_θ(x_i; x_{1:i-1}), fully parallelizable across dimensions
■ Notice
■ x → z has the same structure as the log-likelihood computation of an autoregressive model
■ z → x has the same structure as the sampling procedure of an autoregressive model
Inverse autoregressive flows
■ The inverse of an autoregressive flow is also a flow, called the inverse autoregressive flow (IAF)

■ x → z has the same structure as sampling in an autoregressive model

■ z → x has the same structure as the log-likelihood computation of an autoregressive model. So, IAF sampling is fast
AF vs IAF
■ Autoregressive flow
■ Fast evaluation of p(x) for arbitrary x
■ Slow sampling
■ Inverse autoregressive flow
■ Slow evaluation of p(x) for arbitrary x, so training directly by
maximum likelihood is slow.
■ Fast sampling
■ Fast evaluation of p(x) if x is a sample
■ There are models (Parallel WaveNet, IAF-VAE) that exploit
IAF’s fast sampling

AF and IAF
Naively, both end up being as deep as the number of variables!
- E.g. 1MP image → 1M layers…

Can do parameter sharing as in Autoregressive Models from lecture 2 [e.g. RNN, masking]
Change of MANY variables
For z ~ p(z), the sampling process f^{-1} linearly transforms a small cube dz to a small parallelepiped dx. Probability is conserved:
p_x(x) dx = p_z(z) dz  ⇒  p_x(x) = p_z(f(x)) |det ∂f(x)/∂x|

Intuition: x is likely if it maps to a “large” region in z space
Flow models: training
Change-of-variables formula lets us compute the density over x:
p_θ(x) = p_z(f_θ(x)) |det ∂f_θ(x)/∂x|

Train with maximum likelihood:
max_θ Σ_i [ log p_z(f_θ(x^(i))) + log |det ∂f_θ(x^(i))/∂x| ]

New key requirement: the Jacobian determinant must be easy to calculate and differentiate!
Constructing flows: composition
■ Flows can be composed:
x → f1 → f2 → … → fk → z
■ The inverse is the composition of the inverses in reverse order, and the log-determinants add: log |det ∂z/∂x| = Σ_j log |det ∂f_j/∂f_{j-1}|
■ Easy way to increase expressiveness
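A minimal sketch of composition (the per-layer interface here is an assumption):

    import torch

    class ComposedFlow:
        def __init__(self, layers):
            self.layers = layers  # each layer: forward(x) -> (z, log_det)

        def forward(self, x):
            total_log_det = torch.zeros(x.shape[0])
            for layer in self.layers:
                x, log_det = layer.forward(x)
                total_log_det = total_log_det + log_det   # log-dets add
            return x, total_log_det

        def inverse(self, z):
            for layer in reversed(self.layers):
                z = layer.inverse(z)   # inverses compose in reverse order
            return z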
Affine flows
■ Another name for affine flow: multivariate Gaussian
■ Parameters: an invertible matrix A and a vector b
■ f(x) = A^{-1}(x - b)
■ Sampling: x = Az + b, where z ~ N(0, I)
■ Log-likelihood is expensive when dimension is large:
■ The Jacobian of f is A^{-1}
■ Log-likelihood involves calculating det(A)
Elementwise flows
f_θ(x1, …, xd) = (f_θ(x1), …, f_θ(xd)): apply a 1-D flow to each coordinate

■ Lots of freedom in the elementwise flow
■ Can use elementwise affine functions or CDF flows
■ The Jacobian is diagonal, so the determinant is easy to evaluate
NICE/RealNVP
Affine coupling layer
■ Split variables in half: x_{1:d/2}, x_{d/2+1:d}
z_{1:d/2} = x_{1:d/2}
z_{d/2+1:d} = x_{d/2+1:d} ⊙ exp(s_θ(x_{1:d/2})) + t_θ(x_{1:d/2})
■ Invertible! Note that s_θ and t_θ can be arbitrary neural nets with no restrictions
■ Think of them as data-parameterized elementwise flows
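A minimal sketch of this layer in PyTorch (illustrative, not the NICE/RealNVP reference implementation); note it matches the layer interface assumed in the composition sketch above:

    import torch
    import torch.nn as nn

    class AffineCoupling(nn.Module):
        def __init__(self, dim, hidden=64):
            super().__init__()
            self.d = dim // 2
            # s_theta and t_theta: unrestricted nets on the first half.
            self.net = nn.Sequential(
                nn.Linear(self.d, hidden), nn.ReLU(),
                nn.Linear(hidden, 2 * (dim - self.d)))

        def forward(self, x):
            x1, x2 = x[:, :self.d], x[:, self.d:]
            s, t = self.net(x1).chunk(2, dim=-1)
            z2 = x2 * torch.exp(s) + t        # z1 = x1 (unchanged)
            log_det = s.sum(dim=-1)           # triangular Jacobian
            return torch.cat([x1, z2], dim=-1), log_det

        def inverse(self, z):
            z1, z2 = z[:, :self.d], z[:, self.d:]
            s, t = self.net(z1).chunk(2, dim=-1)
            x2 = (z2 - t) * torch.exp(-s)     # invert without inverting the net
            return torch.cat([z1, x2], dim=-1)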
NICE/RealNVP
■ It also has a tractable Jacobian determinant: ∂z/∂x is triangular, with diagonal (I, diag(exp(s_θ(x_{1:d/2}))))
■ The Jacobian is triangular, so its determinant is the product of the diagonal entries:
log |det ∂z/∂x| = Σ_i s_θ(x_{1:d/2})_i
RealNVP
■ Takeaway: coupling layers allow unrestricted neural nets to
be used in flows, while preserving invertibility and tractability

[Dinh et al. Density estimation using Real NVP. ICLR 2017]

RealNVP Architecture
Input x: 32x32xc image
■ Layer 1: (Checkerboard x3, channel squeeze, channel x3)
■ Split result to get x1: 16x16x2c and z1: 16x16x2c (fine-grained latents)
■ Layer 2: (Checkerboard x3, channel squeeze, channel x3)
■ Split result to get x2: 8x8x4c and z2: 8x8x4c (coarser latents)
■ Layer 3: (Checkerboard x3, channel squeeze, channel x3)
■ Get z3: 4x4x16c (latents for highest-level details)

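For reference, a sketch of the two reshaping/masking primitives this architecture relies on, in a common NCHW implementation (an assumption, not the paper's exact code):

    import torch

    def checkerboard_mask(h, w):
        # 1 where (row + col) is even; selects the half the coupling copies.
        return ((torch.arange(h).unsqueeze(1) + torch.arange(w)) % 2 == 0).float()

    def squeeze(x):
        # (B, C, H, W) -> (B, 4C, H/2, W/2): trade spatial size for channels.
        b, c, h, w = x.shape
        x = x.view(b, c, h // 2, 2, w // 2, 2)
        x = x.permute(0, 1, 3, 5, 2, 4).contiguous()
        return x.view(b, 4 * c, h // 2, w // 2)

    print(squeeze(torch.randn(1, 3, 32, 32)).shape)  # (1, 12, 16, 16)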
RealNVP: How to partition variables?

Good vs Bad Partitioning
Good: checkerboard x4; channel squeeze; channel x3; channel unsqueeze; checkerboard x3
Bad: (mask top half; mask bottom half; mask left half; mask right half) x2
Choice of coupling transformation
■ A Bayes net defines the coupling dependency, but which invertible transformation f to use is a design question

■ Affine transformation is the most commonly used one (NICE, RealNVP, IAF-VAE, …)

■ More complex, nonlinear transformations → better performance
■ CDFs and inverse CDFs for mixtures of Gaussians or logistics (Flow++)
■ Piecewise linear/quadratic functions (Neural Importance Sampling)
NN architecture also matters
■ Flow++ = MoL transformation + self-attention in the NN
■ Bayes net (coupling dependency), transformation function class, and NN architecture all play a role in a flow’s performance. Still an open area of research.
Other classes of flows
■ Glow
■ Invertible 1x1 convolutions
■ Large-scale training

■ Continuous-time flows (FFJORD)
■ Allows for unrestricted architectures. Invertibility and fast log-probability computation guaranteed.
Flow on Discrete Data Without Dequantization...

Continuous flows for discrete data
■ A problem arises when fitting continuous density models to discrete data: degeneracy
■ When the data are 3-bit pixel values, x ∈ {0, 1, …, 7}:
■ What density does the model assign to values between bins, like 0.4, 0.42, …?
■ Correct semantics: we want the integral of the probability density within a discrete interval to approximate the discrete probability mass
Continuous flows for discrete data
■ Solution: dequantization. Add noise to the data: model y = x + u

■ We draw the noise u uniformly from [0, 1)^D

[Theis, Oord, Bethge, 2016]
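A minimal sketch for 8-bit images (the 8-bit range is an assumption; the slide's 3-bit example would divide by 8 instead):

    import torch

    x = torch.randint(0, 256, (16, 3, 32, 32)).float()  # discrete pixels
    u = torch.rand_like(x)                               # u ~ Uniform[0, 1)
    y = (x + u) / 256.0                                  # dequantized, in [0, 1)
    # Train the flow on y; its likelihood lower-bounds the discrete model's
    # log-likelihood (Theis et al., 2016).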
Flow on Discrete Data With Dequantization

Future directions
■ The ultimate goal: a likelihood-based model with
■ fast sampling
■ fast inference
■ fast training
■ good samples
■ good compression
■ Flows seem to let us achieve some of these criteria.
■ But how exactly do we design and compose flows for great
performance? That’s an open question.

Bibliography
NICE: Dinh, Laurent, David Krueger, and Yoshua Bengio. "NICE: Non-linear independent components estimation." arXiv preprint arXiv:1410.8516 (2014).
RealNVP: Dinh, Laurent, Jascha Sohl-Dickstein, and Samy Bengio. "Density estimation using Real NVP." arXiv preprint arXiv:1605.08803 (2016).
AF: Chen, Xi, et al. "Variational lossy autoencoder." arXiv preprint arXiv:1611.02731 (2016); Papamakarios, George, Theo Pavlakou, and Iain Murray. "Masked autoregressive flow for density estimation." Advances in Neural Information Processing Systems. 2017.
IAF: Kingma, Durk P., et al. "Improved variational inference with inverse autoregressive flow." Advances in Neural Information Processing Systems. 2016.
Flow++: Ho, Jonathan, et al. "Flow++: Improving flow-based generative models with variational dequantization and architecture design." arXiv preprint arXiv:1902.00275 (2019).
Neural Importance Sampling: Müller, Thomas, et al. "Neural importance sampling." arXiv preprint arXiv:1808.03
Glow: Kingma, Durk P., and Prafulla Dhariwal. "Glow: Generative flow with invertible 1x1 convolutions." Advances in Neural Information Processing Systems. 2018.
FFJORD: Grathwohl, Will, et al. "FFJORD: Free-form continuous dynamics for scalable reversible generative models." arXiv preprint arXiv:1810.01367 (2018).
Neural Autoregressive Flow: Huang, Chin-Wei, et al. "Neural autoregressive flows." arXiv preprint arXiv:1804.00779 (2018).
Residual Flows: Chen, Ricky T. Q., Jens Behrmann, David Duvenaud, and Jörn-Henrik Jacobsen. "Residual flows for invertible generative modeling." arXiv preprint arXiv:1906.02735 (2019).
Normalizing Flows tutorial: Papamakarios, George, Eric Nalisnick, Danilo Jimenez Rezende, Shakir Mohamed, and Balaji Lakshminarayanan. "Normalizing flows for probabilistic modeling and inference." https://arxiv.org/abs/1912.02762
