Sparse Coding and Dictionary Learning for Image Analysis


Part II: Dictionary Learning for signal reconstruction

Francis Bach, Julien Mairal, Jean Ponce and Guillermo Sapiro

ICCV’09 tutorial, Kyoto, 28th September 2009

Francis Bach, Julien Mairal, Jean Ponce and Guillermo Sapiro Dictionary Learning for signal reconstruction 1/43
What this part is about
The learning of compact representations of images
adapted to restoration tasks.
A fast online algorithm for learning dictionaries and
factorizing matrices in general.
Various formulations for image and video processing.

The Image Denoising Problem

$$\underbrace{y}_{\text{measurements}} = \underbrace{x_{\text{orig}}}_{\text{original image}} + \underbrace{w}_{\text{noise}}$$

Sparse representations for image restoration

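$$\underbrace{y}_{\text{measurements}} = \underbrace{x_{\text{orig}}}_{\text{original image}} + \underbrace{w}_{\text{noise}}$$

Energy minimization problem - MAP estimation

$$E(x) = \underbrace{\|y - x\|_2^2}_{\text{relation to measurements}} + \underbrace{\mathrm{Pr}(x)}_{\text{prior}}$$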

Some classical priors


Smoothness: $\lambda \|Lx\|_2^2$
Total variation: $\lambda \|\nabla x\|_1^2$
Wavelet sparsity: $\lambda \|Wx\|_1$
...
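
As a concrete instance of the energy above, the smoothness prior admits a closed-form minimizer. The following is a minimal sketch, assuming a 1-D signal and taking L to be the first-order finite-difference operator (the slides do not specify L); `denoise_smooth` and the synthetic test signal are illustrative names, not part of the tutorial.

```python
import numpy as np

def denoise_smooth(y, lam):
    """Minimize ||y - x||_2^2 + lam * ||L x||_2^2 for a 1-D signal y.

    L is assumed to be the first-order finite-difference operator.
    """
    n = y.size
    # L has shape (n-1, n): (L x)[i] = x[i+1] - x[i]
    L = np.diff(np.eye(n), axis=0)
    # Setting the gradient to zero gives (I + lam * L^T L) x = y
    return np.linalg.solve(np.eye(n) + lam * L.T @ L, y)

rng = np.random.default_rng(0)
x_orig = np.sin(np.linspace(0, 2 * np.pi, 200))
y = x_orig + 0.3 * rng.standard_normal(200)   # measurements = original + noise
x_hat = denoise_smooth(y, lam=20.0)
print(np.linalg.norm(y - x_orig), np.linalg.norm(x_hat - x_orig))
```

The quadratic prior makes the problem linear, which is why a single solve suffices; the total-variation and wavelet priors are not differentiable and need iterative solvers instead.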

Sparse representations for image restoration
Sparsity and redundancy
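$$\mathrm{Pr}(x) = \lambda \|\alpha\|_0 \quad \text{for } x \approx D\alpha$$

$$\underbrace{x}_{x \in \mathbb{R}^m} = \underbrace{\begin{pmatrix} d_1 & d_2 & \cdots & d_p \end{pmatrix}}_{D \in \mathbb{R}^{m \times p}} \underbrace{\begin{pmatrix} \alpha[1] \\ \alpha[2] \\ \vdots \\ \alpha[p] \end{pmatrix}}_{\alpha \in \mathbb{R}^p,\ \text{sparse}}$$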

Sparse representations for image restoration
Designed dictionaries
[Haar, 1910], [Zweig, Morlet, Grossman ∼70s], [Meyer, Mallat,
Daubechies, Coifman, Donoho, Candes ∼80s-today]. . . (see [Mallat,
1999])
Wavelets, Curvelets, Wedgelets, Bandlets, . . . lets

Learned dictionaries of patches


[Olshausen and Field, 1997], [Engan et al., 1999], [Lewicki and
Sejnowski, 2000], [Aharon et al., 2006] , [Roth and Black, 2005], [Lee
et al., 2007]
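$$\min_{\alpha_i,\, D \in \mathcal{C}} \sum_i \underbrace{\frac{1}{2}\|x_i - D\alpha_i\|_2^2}_{\text{reconstruction}} + \lambda \underbrace{\psi(\alpha_i)}_{\text{sparsity}}$$
$\psi(\alpha) = \|\alpha\|_0$ ("$\ell_0$ pseudo-norm")
$\psi(\alpha) = \|\alpha\|_1$ ($\ell_1$ norm)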
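
With the ℓ0 choice of ψ and a fixed D, the sparse coding step is combinatorial, and greedy pursuit is a common approximation. The sketch below uses orthogonal matching pursuit, which is an assumption here (the slides do not name a solver), and replaces the penalty λ by a hard bound `k` on the number of nonzero coefficients. The function name `omp` and the random dictionary are illustrative.

```python
import numpy as np

def omp(D, x, k):
    """Greedy approximation of min ||x - D a||_2^2 s.t. ||a||_0 <= k.

    Assumes the columns (atoms) of D have unit l2 norm.
    """
    residual = x.copy()
    support = []
    alpha = np.zeros(D.shape[1])
    for _ in range(k):
        # pick the atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        # least-squares refit of x on all selected atoms
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    if support:
        alpha[support] = coef
    return alpha

rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)         # unit-norm atoms
x = 2.0 * D[:, 3] - 1.5 * D[:, 100]    # a signal built from two atoms
alpha = omp(D, x, k=2)
print(np.count_nonzero(alpha))
```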

Sparse representations for image restoration
Solving the denoising problem
[Elad and Aharon, 2006]
Extract all overlapping 8 × 8 patches yi .
Solve a matrix factorization problem:
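$$\min_{\alpha_i,\, D \in \mathcal{C}} \sum_{i=1}^{n} \underbrace{\frac{1}{2}\|y_i - D\alpha_i\|_2^2}_{\text{reconstruction}} + \lambda \underbrace{\psi(\alpha_i)}_{\text{sparsity}},$$

with $n > 100{,}000$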


Average the reconstruction of each patch.
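
The first and last steps of this recipe (patch extraction and averaging of overlaps) can be sketched as follows. This is a minimal illustration for a grayscale image stored as a 2-D array; the helper names `extract_patches` and `average_patches` are hypothetical, and the matrix factorization step in the middle is replaced by an identity placeholder.

```python
import numpy as np

def extract_patches(img, s=8):
    """Return all overlapping s x s patches as columns of an (s*s, n) matrix."""
    H, W = img.shape
    cols = [img[i:i + s, j:j + s].ravel()
            for i in range(H - s + 1) for j in range(W - s + 1)]
    return np.stack(cols, axis=1)

def average_patches(patches, shape, s=8):
    """Put patches back at their locations and average the overlaps."""
    H, W = shape
    acc = np.zeros(shape)   # sum of patch values per pixel
    cnt = np.zeros(shape)   # number of patches covering each pixel
    k = 0
    for i in range(H - s + 1):
        for j in range(W - s + 1):
            acc[i:i + s, j:j + s] += patches[:, k].reshape(s, s)
            cnt[i:i + s, j:j + s] += 1
            k += 1
    return acc / cnt

rng = np.random.default_rng(0)
img = rng.random((16, 16))
Y = extract_patches(img)       # one column per patch y_i
X_hat = Y                      # placeholder for the reconstructions D @ alpha_i
restored = average_patches(X_hat, img.shape)
```

Because the two helpers traverse patch positions in the same order, column `k` of the patch matrix always returns to the location it came from.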

Sparse representations for image restoration
K-SVD: [Elad and Aharon, 2006]

Figure: Dictionary trained on a noisy version of the image boat.

Sparse representations for image restoration
Inpainting, Demosaicking
$$\min_{D \in \mathcal{C},\, \alpha} \sum_i \frac{1}{2}\|\beta_i \otimes (y_i - D\alpha_i)\|_2^2 + \lambda_i \psi(\alpha_i)$$
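
To illustrate the masked objective for a single patch, the sketch below assumes that β is a binary mask (1 for observed pixels, 0 for missing ones), that ⊗ is the elementwise product, and that ψ is the ℓ1 norm. ISTA is used as the solver, which is a choice made here and not stated on the slide. The code is fit on observed pixels only, and the missing pixel is then read off Dα. The tiny two-atom dictionary is made up for the example.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def masked_ista(D, y, beta, lam, n_iter=500):
    """ISTA for min_a 0.5 * ||beta * (y - D a)||_2^2 + lam * ||a||_1."""
    Dm = beta[:, None] * D            # mask applied to every atom
    ym = beta * y                     # masked measurements
    step = 1.0 / max(np.linalg.norm(Dm, 2) ** 2, 1e-12)
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = Dm.T @ (Dm @ a - ym)   # gradient of the masked quadratic term
        a = soft_threshold(a - step * grad, step * lam)
    return a

D = 0.5 * np.array([[1.0, 1.0], [1.0, -1.0], [1.0, 1.0], [1.0, -1.0]])
y = np.array([1.0, 1.0, 1.0, 0.0])     # last pixel is missing; its value is ignored
beta = np.array([1.0, 1.0, 1.0, 0.0])
a = masked_ista(D, y, beta, lam=1e-3)
filled = D @ a                         # estimate for every pixel, including the missing one
print(filled)
```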

RAW Image Processing (see our poster)

White balance.
Denoising.
Black subtraction.
Conversion to sRGB.
Demosaicking.
Gamma correction.

Sparse representations for image restoration
[Mairal, Bach, Ponce, Sapiro, and Zisserman, 2009c]

Sparse representations for image restoration
[Mairal, Sapiro, and Elad, 2008b]

Sparse representations for image restoration
Inpainting, [Mairal, Elad, and Sapiro, 2008a]

Sparse representations for video restoration
Key ideas for video processing
[Protter and Elad, 2009]
Using a 3D dictionary.
Processing of many frames at the same time.
Dictionary propagation.

Sparse representations for image restoration
Inpainting, [Mairal, Sapiro, and Elad, 2008b]

Figure: Inpainting results.

Sparse representations for image restoration
Color video denoising, [Mairal, Sapiro, and Elad, 2008b]

Figure: Denoising results. σ = 25

Optimization for Dictionary Learning

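$$\min_{\alpha \in \mathbb{R}^{p \times n},\, D \in \mathcal{C}} \sum_{i=1}^{n} \frac{1}{2}\|x_i - D\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1$$

$$\mathcal{C} \triangleq \{D \in \mathbb{R}^{m \times p}\ \text{s.t.}\ \forall j = 1, \dots, p,\ \|d_j\|_2 \le 1\}.$$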

Classical optimization alternates between D and α.


Good results, but very slow!
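
A minimal sketch of such a batch alternating scheme is given below. Several choices are assumptions made for illustration: ISTA for the sparse coding step, a single projected gradient step (not a full minimization) for the dictionary step, and random data in place of image patches. All function names are hypothetical.

```python
import numpy as np

def project_columns(D):
    """Euclidean projection onto C: rescale columns whose l2 norm exceeds 1."""
    norms = np.linalg.norm(D, axis=0)
    return D / np.maximum(norms, 1.0)

def ista_codes(X, D, lam, n_iter=100):
    """Sparse coding step: l1-regularized least squares for all columns of X."""
    step = 1.0 / max(np.linalg.norm(D, 2) ** 2, 1e-12)
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        Z = A - step * D.T @ (D @ A - X)
        A = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)
    return A

def objective(X, D, A, lam):
    """Sum over samples of 0.5 * ||x_i - D a_i||_2^2 + lam * ||a_i||_1."""
    return 0.5 * np.sum((X - D @ A) ** 2) + lam * np.abs(A).sum()

def batch_dictionary_learning(X, p, lam, n_outer=20, seed=0):
    rng = np.random.default_rng(seed)
    D = project_columns(rng.standard_normal((X.shape[0], p)))
    for _ in range(n_outer):
        A = ista_codes(X, D, lam)                            # fix D, update alpha
        step = 1.0 / max(np.linalg.norm(A, 2) ** 2, 1e-12)
        D = project_columns(D - step * (D @ A - X) @ A.T)    # fix alpha, one step on D
    return D, ista_codes(X, D, lam)

rng = np.random.default_rng(1)
X = rng.standard_normal((16, 200))     # stand-in for n = 200 vectorized patches
D, A = batch_dictionary_learning(X, p=32, lam=0.1)
print(objective(X, D, A, 0.1))
```

Every outer iteration re-codes all n samples, which is the cost that becomes prohibitive when n reaches the hundreds of thousands of patches mentioned earlier.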

Optimization for Dictionary Learning
[Mairal, Bach, Ponce, and Sapiro, 2009a]

Classical formulation of dictionary learning


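$$\min_{D \in \mathcal{C}} f_n(D) = \min_{D \in \mathcal{C}} \frac{1}{n}\sum_{i=1}^{n} l(x_i, D),$$

where

$$l(x, D) \triangleq \min_{\alpha \in \mathbb{R}^p} \frac{1}{2}\|x - D\alpha\|_2^2 + \lambda\|\alpha\|_1.$$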

Which formulation are we interested in?


$$\min_{D \in \mathcal{C}} \Big\{ f(D) = \mathbb{E}_x[l(x, D)] \approx \lim_{n \to +\infty} \frac{1}{n}\sum_{i=1}^{n} l(x_i, D) \Big\}$$

[Bottou and Bousquet, 2008]: Online learning can


handle potentially infinite or dynamic datasets,
be dramatically faster than batch algorithms.
Optimization for Dictionary Learning
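Require: $D_0 \in \mathbb{R}^{m \times p}$ (initial dictionary); $\lambda \in \mathbb{R}$
1: $A_0 = 0$, $B_0 = 0$.
2: for $t = 1, \dots, T$ do
3: Draw $x_t$
4: Sparse Coding

$$\alpha_t \leftarrow \arg\min_{\alpha \in \mathbb{R}^p} \frac{1}{2}\|x_t - D_{t-1}\alpha\|_2^2 + \lambda\|\alpha\|_1,$$

5: Aggregate sufficient statistics

$$A_t \leftarrow A_{t-1} + \alpha_t\alpha_t^T, \quad B_t \leftarrow B_{t-1} + x_t\alpha_t^T$$

6: Dictionary Update (block-coordinate descent)

$$D_t \leftarrow \arg\min_{D \in \mathcal{C}} \frac{1}{t}\sum_{i=1}^{t}\Big(\frac{1}{2}\|x_i - D\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1\Big).$$

7: end for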
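
The loop above can be sketched in a few lines of NumPy. This is a hedged illustration, not the SPAMS implementation: ISTA stands in for the sparse coding solver in step 4, samples are drawn uniformly from a fixed matrix, and the column update in step 6 is one pass of block-coordinate descent written from the statistics A and B, following the scheme as understood from [Mairal, Bach, Ponce, and Sapiro, 2009a]. All function names are hypothetical.

```python
import numpy as np

def ista(x, D, lam, n_iter=100):
    """Step 4: l1-regularized sparse coding of one sample (ISTA, an assumed solver)."""
    step = 1.0 / max(np.linalg.norm(D, 2) ** 2, 1e-12)
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - step * D.T @ (D @ a - x)
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return a

def update_dictionary(D, A, B, n_pass=1):
    """Step 6: block-coordinate descent over the columns of D, using only A and B."""
    D = D.copy()
    for _ in range(n_pass):
        for j in range(D.shape[1]):
            if A[j, j] < 1e-12:
                continue                       # atom never used so far: leave it unchanged
            u = D[:, j] + (B[:, j] - D @ A[:, j]) / A[j, j]
            D[:, j] = u / max(np.linalg.norm(u), 1.0)   # project onto ||d_j||_2 <= 1
    return D

def online_dictionary_learning(X, p, lam, T, seed=0):
    rng = np.random.default_rng(seed)
    m, n = X.shape
    D = rng.standard_normal((m, p))
    D /= np.linalg.norm(D, axis=0)             # D_0 with unit-norm atoms
    A = np.zeros((p, p))                       # step 1
    B = np.zeros((m, p))
    for _ in range(T):                         # step 2
        x = X[:, rng.integers(n)]              # step 3: draw x_t
        a = ista(x, D, lam)                    # step 4
        A += np.outer(a, a)                    # step 5
        B += np.outer(x, a)
        D = update_dictionary(D, A, B)         # step 6
    return D

rng = np.random.default_rng(1)
X = rng.standard_normal((16, 500))             # stand-in for vectorized patches
D = online_dictionary_learning(X, p=32, lam=0.1, T=200)
print(np.linalg.norm(D, axis=0).max())
```

The memory footprint is independent of the number of samples seen: only D, A and B are kept, never the past x_i or α_i.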
Optimization for Dictionary Learning
Which guarantees do we have?
Under a few reasonable assumptions,
we build a surrogate function $\hat{f}_t$ of the expected cost $f$ verifying

$$\lim_{t \to +\infty} \hat{f}_t(D_t) - f(D_t) = 0,$$

Dt is asymptotically close to a stationary point.
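
A short worked expansion shows why the dictionary update only needs the matrices A_t and B_t. It assumes that the surrogate is the objective minimized in the dictionary update step of the algorithm, which the slides suggest but do not state explicitly.

```latex
\begin{align*}
\hat{f}_t(D)
  &= \frac{1}{t}\sum_{i=1}^{t}\Big(\frac{1}{2}\|x_i - D\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1\Big) \\
  &= \frac{1}{t}\Big(\frac{1}{2}\operatorname{Tr}(D^T D A_t) - \operatorname{Tr}(D^T B_t)\Big)
     + \frac{1}{t}\sum_{i=1}^{t}\Big(\frac{1}{2}\|x_i\|_2^2 + \lambda\|\alpha_i\|_1\Big),
\end{align*}
\qquad \text{with } A_t = \sum_{i=1}^{t}\alpha_i\alpha_i^T,\quad B_t = \sum_{i=1}^{t} x_i\alpha_i^T .
```

The last sum does not depend on D, so minimizing the surrogate over D only involves A_t and B_t, and the past samples need not be stored.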

Extensions (all implemented in SPAMS)


non-negative matrix decompositions.
sparse PCA (sparse dictionaries).
fused-lasso regularizations (piecewise constant dictionaries)

Optimization for Dictionary Learning
Experimental results, batch vs online

m = 8 × 8, p = 256
Optimization for Dictionary Learning
Experimental results, batch vs online

m = 12 × 12 × 3, p = 512
Optimization for Dictionary Learning
Inpainting a 12-Mpixel photograph

Extension to NMF and sparse PCA
[Mairal, Bach, Ponce, and Sapiro, 2009b]

NMF extension
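$$\min_{\alpha \in \mathbb{R}^{p \times n},\, D \in \mathcal{C}} \sum_{i=1}^{n} \frac{1}{2}\|x_i - D\alpha_i\|_2^2 \quad \text{s.t.}\quad \alpha_i \ge 0,\ D \ge 0.$$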

SPCA extension
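$$\min_{\alpha \in \mathbb{R}^{p \times n},\, D \in \mathcal{C}'} \sum_{i=1}^{n} \frac{1}{2}\|x_i - D\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1$$

$$\mathcal{C}' \triangleq \{D \in \mathbb{R}^{m \times p}\ \text{s.t.}\ \forall j,\ \|d_j\|_2^2 + \gamma\|d_j\|_1 \le 1\}.$$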
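
For the NMF variant, the two non-negativity constraints can be handled by projections. The sketch below is an illustration under stated assumptions, not the method of the cited paper: it uses projected gradient for the non-negative codes, and clipping followed by rescaling for the dictionary, which is the Euclidean projection onto the set of non-negative columns with norm at most 1. The sparse PCA set C' needs a different projection and is not covered. Function names are hypothetical.

```python
import numpy as np

def nn_codes(X, D, n_iter=200):
    """Projected gradient for min_{A >= 0} 0.5 * ||X - D A||_F^2."""
    step = 1.0 / max(np.linalg.norm(D, 2) ** 2, 1e-12)
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        # gradient step on the quadratic term, then projection onto A >= 0
        A = np.maximum(A - step * D.T @ (D @ A - X), 0.0)
    return A

def project_nonneg_ball(D):
    """Projection onto {D >= 0, ||d_j||_2 <= 1}: clip, then rescale long columns."""
    D = np.maximum(D, 0.0)
    return D / np.maximum(np.linalg.norm(D, axis=0), 1.0)

D = project_nonneg_ball(np.array([[3.0, -1.0], [4.0, 0.5]]))
A = nn_codes(np.array([[1.0], [-2.0]]), np.eye(2))
print(D)
print(A)
```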

Extension to NMF and sparse PCA
Faces: Extended Yale Database B

(a) PCA (b) NNMF (c) DL

Extension to NMF and sparse PCA
Faces: Extended Yale Database B

(d) SPCA, τ = 70% (e) SPCA, τ = 30% (f) SPCA, τ = 10%

Extension to NMF and sparse PCA
Natural Patches

(a) PCA (b) NNMF (c) DL

Extension to NMF and sparse PCA
Natural Patches

(d) SPCA, τ = 70% (e) SPCA, τ = 30% (f) SPCA, τ = 10%

Summary of this part
The dictionary learning framework leads to
state-of-the-art results for many image and video processing tasks.
Online learning techniques are well suited to this
problem and allow training sets with millions of
patches.

References I
M. Aharon, M. Elad, and A. M. Bruckstein. K-SVD: An algorithm for designing
overcomplete dictionaries for sparse representation. IEEE Transactions on
Signal Processing, 54(11):4311–4322, November 2006.
L. Bottou and O. Bousquet. The trade-offs of large scale learning. In J.C. Platt,
D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information
Processing Systems, volume 20, pages 161–168. MIT Press, Cambridge, MA, 2008.
M. Elad and M. Aharon. Image denoising via sparse and redundant representations
over learned dictionaries. IEEE Transactions on Image Processing, 15(12):
3736–3745, December 2006.
K. Engan, S. O. Aase, and J. H. Husoy. Frame based signal compression using
method of optimal directions (MOD). In Proceedings of the 1999 IEEE
International Symposium on Circuits and Systems, volume 4, 1999.
A. Haar. Zur Theorie der orthogonalen Funktionensysteme. Mathematische Annalen,
69:331–371, 1910.
H. Lee, A. Battle, R. Raina, and A. Y. Ng. Efficient sparse coding algorithms. In
B. Schölkopf, J. Platt, and T. Hoffman, editors, Advances in Neural Information
Processing Systems, volume 19, pages 801–808. MIT Press, Cambridge, MA, 2007.

References II
M. S. Lewicki and T. J. Sejnowski. Learning overcomplete representations. Neural
Computation, 12(2):337–365, 2000.
J. Mairal, M. Elad, and G. Sapiro. Sparse representation for color image restoration.
IEEE Transactions on Image Processing, 17(1):53–69, January 2008a.
J. Mairal, G. Sapiro, and M. Elad. Learning multiscale sparse representations for
image and video restoration. SIAM Multiscale Modelling and Simulation, 7(1):
214–241, April 2008b.
J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online dictionary learning for sparse
coding. In Proceedings of the International Conference on Machine Learning
(ICML), 2009a.
J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online learning for matrix factorization
and sparse coding. ArXiv:0908.0050v1, 2009b. submitted.
J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman. Non-local sparse models
for image restoration. In Proceedings of the IEEE International Conference on
Computer Vision (ICCV), 2009c.
S. Mallat. A Wavelet Tour of Signal Processing, Second Edition. Academic Press,
New York, September 1999.

References III
B. A. Olshausen and D. J. Field. Sparse coding with an overcomplete basis set: A
strategy employed by V1? Vision Research, 37:3311–3325, 1997.
M. Protter and M. Elad. Image sequence denoising via sparse and redundant
representations. IEEE Transactions on Image Processing, 18(1):27–36, 2009.
S. Roth and M. J. Black. Fields of experts: A framework for learning image priors. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), 2005.
