Lecture 02


Computer Vision

CS-E4850, 5 study credits
Juho Kannala
Aalto University

Lecture 2: Image Processing

• Lecture concentrates on image filtering
• Relevant reading: Chapter 3 of Szeliski’s book

Acknowledgement: many slides from James Hays, Derek Hoiem, Svetlana Lazebnik, Esa Rahtu, Steve Seitz, David Lowe, Kristen Grauman, Alexei Efros, and others.

Three views of filtering

• Image filters in the spatial domain
– A filter is a mathematical operation on a grid of numbers
– Smoothing, sharpening, edge detection

• Image filters in the frequency domain
– Filtering is a way to modify the frequencies of images
– Hybrid images, sampling, image resizing

• Templates and image pyramids
– Filtering is a way to match a template to the image
– Detection, coarse-to-fine registration

Source: J. Hays
Image filtering

• Image filtering: compute a function of the local neighborhood at each position
• Really important in practice!
• Enhance images (denoise, resize, increase contrast, etc.)
• Extract information from images (texture, edges, distinctive points, etc.)
• Detect patterns (template matching)
• Deep convolutional networks (sequence of filters and non-linear functions)
Motivation: Image denoising

• How can we reduce noise in a photograph?

Source: Lazebnik
Moving average

• Let’s replace each pixel with a weighted average of its neighborhood
• The weights are called the filter kernel
• The weights for the average of a 3x3 neighborhood (each weight is 1/9, so the weights sum to 1):

(1/9) ×
1 1 1
1 1 1
1 1 1

“box filter”
Source: D. Lowe
Image filtering

Worked example: the 10x10 image f[.,.] is filtered with the 3x3 box kernel g[.,.] = (1/9) × [1 1 1; 1 1 1; 1 1 1]. Each output pixel h[m, n] is the average of the 3x3 neighborhood of f centered at [m, n]:

$h[m,n] = \sum_{k,l} g[k,l]\, f[m+k,\, n+l]$

Input f[.,.]:

0 0  0  0  0  0  0  0 0 0
0 0  0  0  0  0  0  0 0 0
0 0  0 90 90 90 90 90 0 0
0 0  0 90 90 90 90 90 0 0
0 0  0 90 90 90 90 90 0 0
0 0  0 90  0 90 90 90 0 0
0 0  0 90 90 90 90 90 0 0
0 0  0  0  0  0  0  0 0 0
0 0 90  0  0  0  0  0 0 0
0 0  0  0  0  0  0  0 0 0

Output h[.,.] (computed only where the 3x3 window fits inside f, so it is 8x8):

 0 10 20 30 30 30 20 10
 0 20 40 60 60 60 40 20
 0 30 60 90 90 90 60 30
 0 30 50 80 80 90 60 30
 0 30 50 80 80 90 60 30
 0 20 30 50 50 60 40 20
10 20 30 30 30 30 20 10
10 10 10  0  0  0  0  0

Credit: S. Seitz
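The box-filter walkthrough can be reproduced with a few lines of Python. This is only a sketch: the function name `correlate2d_valid` is made up for illustration, the kernel is indexed from its top-left corner (so k, l run over 0..2 rather than -1..1), and the 1/9 normalization is assumed from the output values shown on the slides.

```python
def correlate2d_valid(f, g):
    """h[m][n] = sum_{k,l} g[k][l] * f[m+k][n+l], only where g fits inside f."""
    kh, kw = len(g), len(g[0])
    out_h, out_w = len(f) - kh + 1, len(f[0]) - kw + 1
    h = [[0.0] * out_w for _ in range(out_h)]
    for m in range(out_h):
        for n in range(out_w):
            h[m][n] = sum(g[k][l] * f[m + k][n + l]
                          for k in range(kh) for l in range(kw))
    return h

# The 10x10 input image from the slides.
f = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 0, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 90, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

# 3x3 box kernel: every weight is 1/9 (assumed normalization).
box = [[1 / 9] * 3 for _ in range(3)]

# Round to integers to hide floating-point noise such as 29.999999999999996.
h = [[round(v) for v in row] for row in correlate2d_valid(f, box)]
for row in h:
    print(row)
```

The first printed row corresponds to the partial results 0, 10, 20, 30, 30 that the slides build up step by step.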
Box filter - what does it do?

• Replaces each pixel with an average of its neighborhood
• Achieves smoothing effect (removes sharp features)

g[.,.] = (1/9) ×
1 1 1
1 1 1
1 1 1

Source: D. Lowe
Smoothing with box filter
Practice with linear filters

What does this filter do to the original image?

0 0 0
0 1 0
0 0 0

Answer: the filtered image equals the original (no change).

Source: D. Lowe
Practice with linear filters

What does this filter do to the original image?

0 0 0
0 0 1
0 0 0

Answer: the image is shifted left by 1 pixel (with the filtering convention h[m, n] = ∑ g[k, l] f[m + k, n + l] used above, each output pixel copies its right-hand neighbor).

Source: D. Lowe
Practice with linear filters

What does this filter do to the original image?

0 0 0           1 1 1
0 2 0  − (1/9)  1 1 1
0 0 0           1 1 1

(Note that filter sums to 1)

Answer: sharpening filter
- Accentuates differences with local average

Source: D. Lowe
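The three practice kernels can be tried on a tiny step-edge image. The sketch below is illustrative only: `correlate2d_valid` is a hypothetical helper, the test image is invented, and the sharpening kernel is built as 2 × impulse minus the 1/9 box filter, which is the reading consistent with the note that the filter sums to 1.

```python
def correlate2d_valid(f, g):
    """h[m][n] = sum_{k,l} g[k][l] * f[m+k][n+l], only where g fits inside f."""
    kh, kw = len(g), len(g[0])
    return [[sum(g[k][l] * f[m + k][n + l]
                 for k in range(kh) for l in range(kw))
             for n in range(len(f[0]) - kw + 1)]
            for m in range(len(f) - kh + 1)]

identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
shift = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
box = [[1 / 9] * 3 for _ in range(3)]
# Sharpening kernel: twice the impulse minus the local average.
sharpen = [[2 * identity[k][l] - box[k][l] for l in range(3)] for k in range(3)]

# Three identical rows with a step edge between columns 2 and 3.
image = [[0, 0, 0, 9, 9, 9] for _ in range(3)]

same = [round(v) for v in correlate2d_valid(image, identity)[0]]
shifted = [round(v) for v in correlate2d_valid(image, shift)[0]]
sharp = [round(v) for v in correlate2d_valid(image, sharpen)[0]]

print(same)     # centers of columns 1..4, unchanged
print(shifted)  # the edge appears one pixel further left
print(sharp)    # under- and overshoot on either side of the edge
print(sum(sum(row) for row in sharpen))  # approximately 1
```

The undershoot and overshoot around the edge are what makes the sharpened image look crisper.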
Sharpening

Source: D. Lowe
Other filters

 1  0 -1
 2  0 -2
 1  0 -1
Sobel

Vertical Edge
(absolute value)
Other filters

 1  2  1
 0  0  0
-1 -2 -1
Sobel

Horizontal Edge
(absolute value)
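As a rough illustration of the two Sobel kernels, the sketch below applies both to an invented image containing a single vertical edge. The helper name `correlate2d_valid` and the test image are assumptions made for this example.

```python
def correlate2d_valid(f, g):
    """h[m][n] = sum_{k,l} g[k][l] * f[m+k][n+l], only where g fits inside f."""
    kh, kw = len(g), len(g[0])
    return [[sum(g[k][l] * f[m + k][n + l]
                 for k in range(kh) for l in range(kw))
             for n in range(len(f[0]) - kw + 1)]
            for m in range(len(f) - kh + 1)]

sobel_vertical_edges = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
sobel_horizontal_edges = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]

# Dark on the left, bright on the right: one vertical edge, no horizontal edge.
image = [[0, 0, 0, 9, 9, 9] for _ in range(4)]

vertical = [[abs(v) for v in row]
            for row in correlate2d_valid(image, sobel_vertical_edges)]
horizontal = [[abs(v) for v in row]
              for row in correlate2d_valid(image, sobel_horizontal_edges)]

print(vertical)    # strong response in the two columns next to the edge
print(horizontal)  # all zeros: nothing changes from row to row
```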
Key properties

• Linearity:
filter(f1 + f2) = filter(f1) + filter(f2)

• Shift invariance:
filter(shift(f)) = shift(filter(f))
-> same behavior regardless of pixel location

• Theoretical result: any linear shift-invariant operator can be represented as a convolution

Source: S. Lazebnik
Properties in more detail

• Commutative: a * b = b * a
– Conceptually no difference between filter and signal
• Associative: a * (b * c) = (a * b) * c
– Often apply several filters one after another: (((a * b1) * b2) * b3)
– This is equivalent to applying one filter: a * (b1 * b2 * b3)
• Distributes over addition: a * (b + c) = (a * b) + (a * c)
• Scalars factor out: ka * b = a * kb = k (a * b)
• Identity: unit impulse e = […, 0, 0, 1, 0, 0, …], a * e = a

Source: S. Lazebnik
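These algebraic properties are easy to check numerically. The following is a minimal sketch using a hand-written 1D "full" convolution on short integer sequences; `conv1d` and the sample sequences are illustrative choices, not part of the lecture.

```python
def conv1d(a, b):
    """Full discrete convolution of two finite sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

a = [1, 2, 3]
b = [0, 1, -1]
c = [1, 0, 2]
e = [1]  # unit impulse
k = 5

commutative = conv1d(a, b) == conv1d(b, a)
associative = conv1d(a, conv1d(b, c)) == conv1d(conv1d(a, b), c)
b_plus_c = [x + y for x, y in zip(b, c)]
distributive = conv1d(a, b_plus_c) == [x + y for x, y in zip(conv1d(a, b), conv1d(a, c))]
scalars = conv1d([k * x for x in a], b) == [k * x for x in conv1d(a, b)]
identity = conv1d(a, e) == a

print(commutative, associative, distributive, scalars, identity)
```

Associativity is the property that lets several small filters be pre-combined into one kernel before touching the image.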
Filtering vs. Convolution

I = image, f = filter

• 2D filtering:
h = filter2(f, I); or h = imfilter(I, f);
$h[m,n] = \sum_{k,l} f[k,l]\, I[m+k,\, n+l]$

• 2D convolution:
h = conv2(f, I);
$h[m,n] = \sum_{k,l} f[k,l]\, I[m-k,\, n-l]$
Definition of convolution

• Let f be the image and g be the kernel. The output of convolving f with g is denoted f * g

$(f * g)[m,n] = \sum_{k,l} f[m-k,\, n-l]\, g[k,l]$

• Convention: kernel is “flipped”
• See MATLAB functions: conv2, filter2, imfilter (the latter two don’t flip the kernel)
Source: F. Durand
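The relation between filtering and convolution can be made concrete: convolution gives the same numbers as filtering with the kernel flipped in both directions. The sketch below checks this on the "valid" region only; the helper names and the sample data are assumptions for illustration.

```python
def correlate2d_valid(f, g):
    """Filtering: h[m][n] = sum_{k,l} g[k][l] * f[m+k][n+l]."""
    kh, kw = len(g), len(g[0])
    return [[sum(g[k][l] * f[m + k][n + l]
                 for k in range(kh) for l in range(kw))
             for n in range(len(f[0]) - kw + 1)]
            for m in range(len(f) - kh + 1)]

def convolve2d_valid(f, g):
    """Convolution: (f * g)[m][n] = sum_{k,l} f[m-k][n-l] * g[k][l]."""
    kh, kw = len(g), len(g[0])
    return [[sum(f[m - k][n - l] * g[k][l]
                 for k in range(kh) for l in range(kw))
             for n in range(kw - 1, len(f[0]))]
            for m in range(kh - 1, len(f))]

def flip(g):
    """Flip a kernel top-to-bottom and left-to-right."""
    return [row[::-1] for row in g[::-1]]

f = [[1, 2, 0, 3], [4, 1, 1, 0], [2, 5, 3, 1], [0, 2, 4, 6]]
asymmetric = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
symmetric = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]

conv = convolve2d_valid(f, asymmetric)
corr = correlate2d_valid(f, asymmetric)
corr_flipped = correlate2d_valid(f, flip(asymmetric))

print(conv == corr_flipped)  # True: convolution = filtering with a flipped kernel
print(conv == corr)          # False for an asymmetric kernel
print(convolve2d_valid(f, symmetric) == correlate2d_valid(f, symmetric))  # True
```

For symmetric kernels such as the box or Gaussian filter the flip makes no difference, which is why the two operations are often used interchangeably.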
Important filter - Gaussian

• Spatially-weighted average

0.003 0.013 0.022 0.013 0.003
0.013 0.059 0.097 0.059 0.013
0.022 0.097 0.159 0.097 0.022
0.013 0.059 0.097 0.059 0.013
0.003 0.013 0.022 0.013 0.003

5 x 5, σ = 1

Credit: C. Rasmussen
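The 5 x 5 table can be regenerated by sampling the 2D Gaussian density at integer offsets. This is a sketch under that assumption; the function name `gaussian_kernel_2d` is hypothetical.

```python
import math

def gaussian_kernel_2d(size, sigma):
    """Sample exp(-(x^2 + y^2) / (2 sigma^2)) / (2 pi sigma^2) on a size x size grid."""
    r = size // 2
    return [[math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
             for x in range(-r, r + 1)]
            for y in range(-r, r + 1)]

kernel = gaussian_kernel_2d(5, 1.0)
rounded = [[round(v, 3) for v in row] for row in kernel]
for row in rounded:
    print(row)

# The sampled weights do not sum exactly to 1 because the tails are cut off,
# so implementations usually renormalize the kernel before filtering.
total = sum(sum(row) for row in kernel)
normalized = [[v / total for v in row] for row in kernel]
print(total)
```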
Smoothing with Gaussian filter
Smoothing with box filter
Gaussian filters

• Remove “high-frequency” components from the image (low-pass filter)
– Images become smoother

• Convolution with self is another Gaussian
– So we can smooth with a small-width kernel, repeat, and get the same result as a larger-width kernel would have given
– Convolving two times with a Gaussian kernel of width σ is the same as convolving once with a kernel of width σ√2

• Separable kernel
– Factors into a product of two 1D Gaussians

Source: K. Grauman
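The σ√2 rule follows from the fact that variances add under convolution. A small numerical check, assuming a truncated and normalized 1D Gaussian (the helper names and the choice σ = 2, radius 10 are illustrative):

```python
import math

def gaussian_1d(sigma, radius):
    """Normalized samples of a 1D Gaussian on [-radius, radius]."""
    w = [math.exp(-x * x / (2 * sigma ** 2)) for x in range(-radius, radius + 1)]
    total = sum(w)
    return [v / total for v in w]

def conv1d(a, b):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def std(kernel):
    """Standard deviation of a normalized kernel, treated as a distribution."""
    center = (len(kernel) - 1) / 2
    mean = sum((i - center) * w for i, w in enumerate(kernel))
    var = sum((i - center - mean) ** 2 * w for i, w in enumerate(kernel))
    return math.sqrt(var)

g = gaussian_1d(2.0, 10)
gg = conv1d(g, g)          # smoothing twice = smoothing once with g * g
ratio = std(gg) / std(g)
print(ratio, math.sqrt(2))
```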
Separability of the Gaussian filter
Separability example

• 2D filtering (center location only)
• The filter factors into a product of 1D filters
• Perform filtering along rows
• Followed by filtering along the remaining column

[Figure: worked numeric example of row filtering followed by column filtering]

Source: K. Grauman
Separability

• Why is separability useful in practice?
• A filter of size k x k requires k² operations per pixel
• Only 2k operations per pixel for separable kernels (k for the row pass plus k for the column pass)
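A short sketch of the saving: filtering with a separable 3x3 kernel gives the same result as a 1x3 row pass followed by a 3x1 column pass. The binomial weights [1, 2, 1], the helper name and the test image are assumptions chosen so that the arithmetic stays in exact integers.

```python
def correlate2d_valid(f, g):
    """h[m][n] = sum_{k,l} g[k][l] * f[m+k][n+l], only where g fits inside f."""
    kh, kw = len(g), len(g[0])
    return [[sum(g[k][l] * f[m + k][n + l]
                 for k in range(kh) for l in range(kw))
             for n in range(len(f[0]) - kw + 1)]
            for m in range(len(f) - kh + 1)]

w = [1, 2, 1]                               # 1D factor
row_kernel = [w]                            # 1 x 3
col_kernel = [[v] for v in w]               # 3 x 1
kernel_2d = [[a * b for b in w] for a in w] # outer product, 3 x 3

image = [[(3 * m + 5 * n) % 7 for n in range(6)] for m in range(5)]

direct = correlate2d_valid(image, kernel_2d)
two_pass = correlate2d_valid(correlate2d_valid(image, row_kernel), col_kernel)

k = len(w)
print(direct == two_pass)                   # True
print(k * k, "vs", 2 * k, "multiplications per output pixel")
```

For a 3x3 kernel the gain is small (9 vs 6), but for a 31x31 Gaussian it is 961 vs 62.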


Practical matters: what happens near the edge?

• The filter window falls off the edge of the image
• Need to extrapolate:
– clip filter (black)
Matlab: imfilter(f, g, 0)
– wrap around
Matlab: imfilter(f, g, 'circular')
– copy edge
Matlab: imfilter(f, g, 'replicate')
– reflect across edge
Matlab: imfilter(f, g, 'symmetric')
Source: S. Marschner
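The four extrapolation rules differ only in which source pixel an out-of-range index maps to. The 1D sketch below shows one possible index mapping for each mode; `pad1d` and its mode names are hypothetical, chosen to mirror the imfilter options, and the mapping assumes the padding radius is no larger than the row length.

```python
def pad1d(row, r, mode):
    """Extend a row by r samples on each side using the given border rule."""
    n = len(row)
    out = []
    for i in range(-r, n + r):
        if 0 <= i < n:
            out.append(row[i])
        elif mode == "zero":            # clip filter (black)
            out.append(0)
        elif mode == "circular":        # wrap around
            out.append(row[i % n])
        elif mode == "replicate":       # copy edge
            out.append(row[min(max(i, 0), n - 1)])
        elif mode == "symmetric":       # reflect across edge (edge pixel repeated)
            out.append(row[-i - 1] if i < 0 else row[2 * n - 1 - i])
        else:
            raise ValueError(f"unknown mode: {mode}")
    return out

row = [1, 2, 3, 4]
for mode in ("zero", "circular", "replicate", "symmetric"):
    print(mode, pad1d(row, 2, mode))
```

After padding, the "valid" filter output has the same size as the original row, so the border rule fully determines what happens near the edge.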
Practical matters

• What is the size of the output?
• Matlab: filter2(g, f, shape)
– shape = 'full': output size is the sum of the sizes of f and g (minus 1 in each dimension)
– shape = 'same': output size is the same as f
– shape = 'valid': output size is the difference of the sizes of f and g (plus 1 in each dimension)

[Figure: kernel g positioned at the corners of image f for the full, same and valid cases]
Source: S. Lazebnik
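The size rules can be written down per dimension. A minimal sketch (the function name `output_size` is made up for illustration):

```python
def output_size(f_size, g_size, shape):
    """Output length along one dimension for image size f_size and kernel size g_size."""
    if shape == "full":    # every position where g overlaps f by at least one pixel
        return f_size + g_size - 1
    if shape == "same":    # one output per input pixel
        return f_size
    if shape == "valid":   # only positions where g lies entirely inside f
        return max(f_size - g_size + 1, 0)
    raise ValueError(f"unknown shape: {shape}")

for shape in ("full", "same", "valid"):
    print(shape, output_size(10, 3, shape))
```

The 'valid' case is the one used in the box-filter walkthrough, where a 10x10 image and a 3x3 kernel give an 8x8 result.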
Why does a Gaussian give a smoother output than a box filter?

[Figure: Gaussian vs. box filter]

Source: D. Hoiem
Why does a lower-resolution image still make sense? What is lost?

Source: D. Hoiem
Thinking in terms of frequency
Jean Baptiste Joseph Fourier (1768-1830)

• He had a crazy idea in 1807: any univariate function can be rewritten as a weighted sum of sines and cosines of different frequencies.
• Don’t believe it?
– Neither did Lagrange, Laplace, Poisson and other big wigs
– Not translated into English until 1878!
• But it’s (mostly) true!
– Called Fourier Series
– There are some subtle restrictions

“...the manner in which the author arrives at these equations is not exempt of difficulties and...his analysis to integrate them still leaves something to be desired on the score of generality and even rigour.”

[Figure labels: Laplace, Legendre, Lagrange]

Slides: Efros
A sum of sines

• Our building block:
$A \sin(\omega x + \phi)$
• Add enough of them to get any signal f(x) you want!
A sum of sines

• Example:
$g(t) = \sin(2\pi f t) + \tfrac{1}{3}\sin(2\pi (3f)\, t)$

[Figure: g(t) shown as the sum of its two sine components]
Example: Music

• We think of music in terms of frequencies at different magnitudes

Source: D. Hoiem
2D signals

• We can also think of all kinds of other signals the same way

Source: D. Hoiem
Other signals

• We can also think of all kinds of other signals the same way

Source: D. Hoiem
Fourier analysis in images

• In the 2D case we have a two-dimensional frequency (which also encodes the 2D orientation of the sine wave)

[Figure: intensity image and its Fourier image]

Slide adapted from D. Hoiem http://sharp.bu.edu/~slehar/fourier/fourier.html#filtering

Signals can be composed

[Figure: two Fourier images added to give a third]

http://sharp.bu.edu/~slehar/fourier/fourier.html#filtering
Source: D. Hoiem More: http://www.cs.unm.edu/~brayer/vision/fourier.html
Fourier Bases

[Figure: log magnitude spectrum with annotations: low frequencies; strong vertical frequency (sharp horizontal edge); strong horizontal frequency (sharp vertical edge); diagonal frequencies]

This change of basis is the Fourier Transform
Source: Hays, Hoiem


Fourier Transform

• Fourier transform stores the magnitude and phase at each frequency
– Magnitude encodes how much signal there is at a particular frequency
– Phase encodes spatial information (indirectly)
– For mathematical convenience, this is often notated in terms of complex numbers with real part R(ω) and imaginary part I(ω)

Amplitude: $A = \pm\sqrt{R(\omega)^2 + I(\omega)^2}$
Phase: $\phi = \tan^{-1}\frac{I(\omega)}{R(\omega)}$
Euler’s formula: $e^{i\theta} = \cos\theta + i\sin\theta$

Source: D. Hoiem
Source: L. Xie
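To see magnitude and phase come out of the real and imaginary parts, the sketch below takes the discrete Fourier transform of a sampled cosine with known amplitude and phase. The naive `dft` function, the signal and the normalization (no 1/N factor in the forward transform) are assumptions for this example; `atan2` is used as the quadrant-aware form of tan⁻¹(I/R).

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform: X[k] = sum_n x[n] e^{-2 pi i k n / N}."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

N = 8
A_true, phi_true, k0 = 3.0, math.pi / 4, 2
x = [A_true * math.cos(2 * math.pi * k0 * n / N + phi_true) for n in range(N)]

X = dft(x)
R, I = X[k0].real, X[k0].imag
amplitude = math.sqrt(R * R + I * I)   # equals A_true * N / 2 for this signal
phase = math.atan2(I, R)               # recovers phi_true

print(amplitude, phase)
```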
Computing 2D-DFT

[Equations: DFT and IDFT definitions]

• Discrete 2-D Fourier and inverse Fourier transforms are implemented in fft2 and ifft2, respectively
• fftshift: Move origin (DC component) to image center for display
• Example:
>> I = imread('test.png'); % Load grayscale image
>> F = fftshift(fft2(I)); % Shifted transform
>> imshow(log(abs(F)),[]); % Show log magnitude
>> imshow(angle(F),[]); % Show phase angle
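For readers without MATLAB, the same steps can be sketched in plain Python on a tiny synthetic image. `dft2` and `fftshift2` are hypothetical stand-ins for fft2 and fftshift, the forward transform is assumed to carry no 1/(MN) factor, and log(1 + |F|) is used instead of log(|F|) to avoid taking the logarithm of zero.

```python
import cmath
import math

def dft2(img):
    """Naive 2D DFT: F[u][v] = sum_{m,n} img[m][n] e^{-2 pi i (u m / M + v n / N)}."""
    M, N = len(img), len(img[0])
    return [[sum(img[m][n] * cmath.exp(-2j * math.pi * (u * m / M + v * n / N))
                 for m in range(M) for n in range(N))
             for v in range(N)]
            for u in range(M)]

def fftshift2(F):
    """Move the DC term from index [0][0] to the center [M//2][N//2]."""
    M, N = len(F), len(F[0])
    return [[F[(u - M // 2) % M][(v - N // 2) % N] for v in range(N)]
            for u in range(M)]

# 4x4 image: constant brightness 2 plus one horizontal cosine cycle per row.
img = [[2 + math.cos(2 * math.pi * n / 4) for n in range(4)] for _ in range(4)]

S = fftshift2(dft2(img))
log_magnitude = [[math.log(1 + abs(v)) for v in row] for row in S]
phase = [[cmath.phase(v) for v in row] for row in S]

for row in S:
    print([round(abs(v), 6) for v in row])
```

After the shift, the large DC value sits in the middle and the two peaks of the cosine appear immediately to its left and right.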
The Convolution Theorem

• The Fourier transform of the convolution of two functions is the product of their Fourier transforms

$F[g * h] = F[g]\, F[h]$

• The inverse Fourier transform of the product of two Fourier transforms is the convolution of the two inverse Fourier transforms

$F^{-1}[g\, h] = F^{-1}[g] * F^{-1}[h]$

• Convolution in spatial domain is equivalent to multiplication in frequency domain!

Source: D. Hoiem
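A quick numerical check of the theorem in 1D. For the discrete Fourier transform the statement holds for circular convolution (finite signals are treated as periodic); linear convolution needs zero-padding first. The helper names and the sample signals are assumptions for this sketch.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def circular_conv(g, h):
    """(g * h)[n] = sum_k g[k] h[(n - k) mod N] for two length-N signals."""
    N = len(g)
    return [sum(g[k] * h[(n - k) % N] for k in range(N)) for n in range(N)]

# Zero-padded to length 8 so the circular result matches linear convolution.
g = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
h = [1.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]

lhs = dft(circular_conv(g, h))                  # F[g * h]
rhs = [a * b for a, b in zip(dft(g), dft(h))]   # F[g] F[h]
max_error = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_error)
```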
Properties of Fourier Transforms

• Linearity
• The Fourier transform of a real signal is conjugate-symmetric about the origin (so its magnitude is symmetric)
• The energy of the signal is the same as the energy of its Fourier transform

See Szeliski Book (3.4)
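The three properties can be verified on short test signals. This sketch assumes the unnormalized forward DFT, for which energy conservation (Parseval) carries a factor 1/N on the frequency side; the helper name and signals are illustrative.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform without normalization."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [1.0, 4.0, 2.0, 0.0, 3.0, 5.0, 1.0, 2.0]
y = [0.0, 1.0, 0.0, -1.0, 2.0, 0.0, 1.0, 1.0]
N = len(x)
X, Y = dft(x), dft(y)

# Linearity: F[2x + 3y] = 2 F[x] + 3 F[y]
combo = dft([2 * a + 3 * b for a, b in zip(x, y)])
linearity_error = max(abs(c - (2 * a + 3 * b)) for c, a, b in zip(combo, X, Y))

# Symmetry for a real signal: X[N - k] is the complex conjugate of X[k]
symmetry_error = max(abs(X[(N - k) % N] - X[k].conjugate()) for k in range(N))

# Energy: sum |x|^2 = (1/N) sum |X|^2
energy_time = sum(v * v for v in x)
energy_freq = sum(abs(v) ** 2 for v in X) / N

print(linearity_error, symmetry_error, energy_time, energy_freq)
```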


Questions

• Which has more information, the phase or the magnitude?
• What happens if you take the phase from one image and combine it with the magnitude from another image?
Example: amplitude vs. phase

A = “Aron”, P = “Phyllis”
FA = fft2(A), FP = fft2(P)

[Figure panels: log(abs(FA)) and log(abs(FP)); angle(FA) and angle(FP); reconstructions ifft2(abs(FA) .* exp(1i*angle(FP))) and ifft2(abs(FP) .* exp(1i*angle(FA)))]

Source: L. Xie
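The magnitude/phase swap can be sketched in 1D with plain Python. The two short signals stand in for the two photographs, and `dft` / `idft` are naive hypothetical helpers; the point is only how the mixed spectrum is assembled.

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

a = [1.0, 3.0, 2.0, 5.0, 4.0, 1.0, 0.0, 2.0]   # supplies the magnitude
p = [0.0, 0.0, 4.0, 4.0, 4.0, 0.0, 0.0, 1.0]   # supplies the phase
FA, FP = dft(a), dft(p)

# Magnitude of A combined with the phase of P: |FA| * exp(i * angle(FP)).
mixed_spectrum = [abs(A) * cmath.exp(1j * cmath.phase(P)) for A, P in zip(FA, FP)]
mixed = idft(mixed_spectrum)

# For real inputs the result is real up to rounding error.
max_imag = max(abs(v.imag) for v in mixed)
mixed_real = [v.real for v in mixed]
print(max_imag)
print([round(v, 3) for v in mixed_real])
```

In the image version of this experiment the reconstruction tends to resemble the image that supplied the phase, which is the point of the question above.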
What does all this have to do with filtering?
Filtering in spatial domain

 1  0 -1
 2  0 -2
 1  0 -1

[Figure: image * Sobel kernel = filtered image]

Source: D. Hoiem
Filtering in frequency domain

[Figure: FFT of the image and FFT of the kernel are combined in the frequency domain; the inverse FFT gives the filtered image]

Source: D. Hoiem
Why does a Gaussian give a smoother output than a box filter?

[Figure: Gaussian vs. box filter]

Source: D. Hoiem
Gaussian filter
Box filter
Why does a lower-resolution image still make sense? What is lost?

Source: D. Hoiem
Subsampling by a factor of two

Throw away every other row and column to create a ½ size image
Problem: Aliasing

• One-dimensional example (sine wave):

[Figure: sampled sine wave]

Aliasing in graphics

• Characteristic errors may appear: “checkerboards disintegrate”, “striped shirts look funny”, ...
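Nyquist-Shannon sampling theorem

• When sampling a signal at discrete intervals, the sampling frequency must be ≥ 2 × fmax, where fmax is the highest frequency present in the signal
• This allows the original signal to be reconstructed perfectly from the sampled version

[Figure: “good” (sampled densely enough) vs. “bad” (undersampled) examples]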
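A numerical illustration of what goes wrong below the Nyquist rate: a 9 Hz sine sampled at 10 Hz produces exactly the same samples as a 1 Hz sine with flipped sign. The frequencies and the helper name `sample` are arbitrary choices for this sketch.

```python
import math

fs = 10.0  # sampling frequency in Hz (assumed for this example)

def sample(freq, count=10):
    """Sample sin(2 pi freq t) at t = n / fs."""
    return [math.sin(2 * math.pi * freq * n / fs) for n in range(count)]

high = sample(9.0)    # needs fs >= 18 Hz, but is sampled at only 10 Hz
alias = sample(-1.0)  # a 1 Hz sine with inverted sign

max_difference = max(abs(a - b) for a, b in zip(high, alias))
print(max_difference)  # essentially zero: the samples cannot tell the two apart
```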
Solution: Anti-aliasing

• Option 1: Sample more often
• Option 2: Get rid of frequencies greater than half the new sampling frequency (i.e. filter)
-> Loss of information, but still better than aliasing

• Example algorithm for downsampling by factor 2 (Matlab):
1. Apply low-pass filter
im_blur = imfilter(image, fspecial('gaussian', 7, 1));
2. Sample every other pixel
im_small = im_blur(1:2:end, 1:2:end);
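The two steps translate directly to Python. The sketch below is not a port of imfilter or fspecial: it assumes a separable 7-tap Gaussian with σ = 1 and replicated borders (imfilter pads with zeros by default), and all helper names are made up. The striped test image is chosen so that naive subsampling fails visibly.

```python
import math

def gaussian_1d(size, sigma):
    """Normalized 1D Gaussian weights (a 2D Gaussian is separable)."""
    r = size // 2
    w = [math.exp(-x * x / (2 * sigma ** 2)) for x in range(-r, r + 1)]
    total = sum(w)
    return [v / total for v in w]

def blur_rows(img, w):
    """Filter every row with weights w, replicating the border pixels."""
    r = len(w) // 2
    out = []
    for row in img:
        n = len(row)
        out.append([sum(w[k + r] * row[min(max(i + k, 0), n - 1)]
                        for k in range(-r, r + 1))
                    for i in range(n)])
    return out

def transpose(img):
    return [list(col) for col in zip(*img)]

def downsample_by_2(img, size=7, sigma=1.0):
    w = gaussian_1d(size, sigma)
    # Step 1: low-pass filter (rows, then columns).
    blurred = transpose(blur_rows(transpose(blur_rows(img, w)), w))
    # Step 2: keep every other row and column.
    return [row[::2] for row in blurred[::2]]

# One-pixel-wide vertical stripes: the highest frequency the grid can hold.
stripes = [[255 * (n % 2) for n in range(16)] for _ in range(8)]

naive = [row[::2] for row in stripes[::2]]   # no pre-filtering
small = downsample_by_2(stripes)

print(naive[0])                              # all 0: stripes alias to black
print([round(v, 1) for v in small[0]])       # mid-gray away from the borders
```

Without the blur the half-size image claims the scene is black; with it, the stripes are averaged to the gray that a viewer would see from a distance.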
Subsampling without pre-filtering

[Figure: 1/2, 1/4 (2x zoom), 1/8 (4x zoom)]

Credit: S. Seitz
Subsampling with pre-filtering

[Figure: Gaussian 1/2, G 1/4, G 1/8]

Credit: S. Seitz
Why does a lower-resolution image still make sense? What is lost?

Source: D. Hoiem
Hybrid Images

A. Oliva, A. Torralba, P.G. Schyns, “Hybrid Images,” SIGGRAPH 2006

Source: D. Hoiem
Why do we get distance-dependent interpretation of a hybrid image?

Adapted from a slide by D. Hoiem

Clues from Human Perception

• Early processing in humans filters for various orientations and scales of frequency
• Perceptual cues in the mid-high frequencies dominate perception
• When we see an image from far away, we are effectively subsampling it (and low pass filtering)

Early Visual Processing: Multi-scale edge and blob filters

Source: D. Hoiem
Hybrid Image in FFT

[Figure: hybrid image, low-passed image, high-passed image]

Source: D. Hoiem
Thus, we get distance-dependent interpretation of a hybrid image

Adapted from a slide by D. Hoiem

Template matching using filtering
Template matching

• Goal: find the template patch in the image
• Approach: Filter image using the template
• What is a good filter function (i.e. similarity measure) between two patches?

Source: D. Hoiem
Matching with filters

• Goal: find the template patch in the image
• Method 1: filter the image with eye patch

$h[m,n] = \sum_{k,l} g[k,l]\, f[m+k,\, n+l]$

f = image
g = filter

What went wrong?

[Figure: input, filtered image]
Source: D. Hoiem


Matching with filters

• Goal: find the template patch in the image
• Method 2: filter with zero-mean eye

$h[m,n] = \sum_{k,l} (g[k,l] - \bar{g})\, f[m+k,\, n+l]$

where $\bar{g}$ is the mean of template g

[Figure: input, filtered image (scaled), thresholded image with true detections and false detections marked]
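Methods 1 and 2 can be compared on a 1D toy signal. In the sketch below the signal contains a bright flat region and, further right, one exact copy of the template; the data and the helper name `correlate1d` are invented for illustration.

```python
def correlate1d(f, g):
    """h[m] = sum_k g[k] * f[m + k], only where g fits inside f."""
    return [sum(g[k] * f[m + k] for k in range(len(g)))
            for m in range(len(f) - len(g) + 1)]

template = [1, 5, 1]
signal = [1, 1, 1, 6, 6, 6, 6, 1, 5, 1, 1, 1]   # template occurs at index 7

# Method 1: filter with the raw template. Bright regions win, whatever their shape.
raw = correlate1d(signal, template)
raw_best = max(range(len(raw)), key=lambda m: raw[m])

# Method 2: subtract the template mean first. Flat regions now score zero.
mean = sum(template) / len(template)
zero_mean = correlate1d(signal, [v - mean for v in template])
zm_best = max(range(len(zero_mean)), key=lambda m: zero_mean[m])

print(raw, raw_best)
print([round(v, 2) for v in zero_mean], zm_best)
```

The raw filter peaks inside the bright plateau, which is "what went wrong" with Method 1; the zero-mean filter peaks at the true location but still responds to the plateau edges, which is where false detections come from.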


Matching with filters

• Goal: find the template patch in the image
• Method 3: Normalized cross-correlation

$h[m,n] = \frac{\sum_{k,l} (g[k,l] - \bar{g})\,(f[m+k,\, n+l] - \bar{f}_{m,n})}{\left(\sum_{k,l} (g[k,l] - \bar{g})^2 \; \sum_{k,l} (f[m+k,\, n+l] - \bar{f}_{m,n})^2\right)^{0.5}}$

where $\bar{g}$ is the mean of the template and $\bar{f}_{m,n}$ is the mean of the image patch at [m, n]

Matlab: normxcorr2(template, im)

Source: D. Hoiem
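The formula can be implemented directly. The sketch below is a plain-Python illustration, not a re-implementation of normxcorr2: it evaluates only positions where the template fits entirely inside the image, returns 0 for constant patches, and uses invented test data in which the template is embedded with a different gain and offset.

```python
import math

def ncc(f, g):
    """Normalized cross-correlation of template g with image f (valid region only)."""
    kh, kw = len(g), len(g[0])
    count = kh * kw
    g_mean = sum(map(sum, g)) / count
    gz = [[v - g_mean for v in row] for row in g]          # zero-mean template
    g_energy = sum(v * v for row in gz for v in row)
    out = []
    for m in range(len(f) - kh + 1):
        out_row = []
        for n in range(len(f[0]) - kw + 1):
            patch = [f[m + k][n + l] for k in range(kh) for l in range(kw)]
            p_mean = sum(patch) / count
            pz = [v - p_mean for v in patch]               # zero-mean patch
            num = sum(gz[k][l] * pz[k * kw + l]
                      for k in range(kh) for l in range(kw))
            denom = math.sqrt(g_energy * sum(v * v for v in pz))
            out_row.append(num / denom if denom > 0 else 0.0)
        out.append(out_row)
    return out

template = [[1, 2, 1], [0, 5, 0], [1, 2, 1]]
image = [[(7 * m + 3 * n) % 5 for n in range(8)] for m in range(6)]
# Embed a brighter, higher-contrast copy of the template at row 2, column 3.
for k in range(3):
    for l in range(3):
        image[2 + k][3 + l] = 3 * template[k][l] + 20

h = ncc(image, template)
print(round(h[2][3], 6))                 # 1.0 despite the gain and offset change
print(round(max(max(row) for row in h), 6))
```

Scores lie between -1 and 1, and a patch that differs from the template only by brightness and contrast still scores 1, which is the invariance claimed for this method.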
Matching with filters

• Goal: find the template patch in the image
• Method 3: Normalized cross-correlation

[Figure: input, normalized cross-correlation, thresholded image with true detections marked]

Q: What is the best method to use?

A: Depends
• Zero-mean filter: fast but not a great matcher
• Normalized cross-correlation: slow but invariant to local average intensity and contrast

Source: D. Hoiem
Q: What if we want to find larger or smaller eyes?

A: Image pyramids: multiresolution image representations

• Repeated decimation with a Gaussian low-pass filter gives a Gaussian pyramid
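A Gaussian pyramid is just the blur-then-subsample step applied repeatedly. The sketch below works on a 1D signal for brevity (for images the same steps run along rows and then columns); the 5-tap binomial weights [1, 4, 6, 4, 1] / 16 are an assumed stand-in for a Gaussian, and the function names are hypothetical.

```python
def blur(signal, weights=(1, 4, 6, 4, 1)):
    """Low-pass filter with binomial weights, replicating the border samples."""
    r = len(weights) // 2
    total = sum(weights)
    n = len(signal)
    return [sum(weights[k + r] * signal[min(max(i + k, 0), n - 1)]
                for k in range(-r, r + 1)) / total
            for i in range(n)]

def gaussian_pyramid(signal, levels):
    """Level 0 is the input; each further level is blurred and subsampled by 2."""
    pyramid = [list(signal)]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1])[::2])
    return pyramid

signal = [float(i % 4) for i in range(16)]
pyramid = gaussian_pyramid(signal, 4)
sizes = [len(level) for level in pyramid]
print(sizes)
for level in pyramid:
    print([round(v, 2) for v in level])
```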
Template Matching with Image Pyramids

Input: Image, Template

1. Match template at current scale
2. Downsample image
• In practice, scale step of 1.1 to 1.2
3. Repeat 1-2 until image is very small
4. Take responses above some threshold, perhaps with non-maxima suppression

[Figure: image → Gaussian filter → low-pass filtered image → sample → low-res image]
Source: D. Hoiem
Laplacian pyramid

• Contains the difference images between two successive Gaussian pyramid levels

[Figure: the information captured at each level of a Gaussian (top) and Laplacian (bottom) pyramid, shown at full resolution]
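One way to sketch the idea in code: each Laplacian level stores what is lost when going from one Gaussian level to the next, so the original can be rebuilt by adding the levels back from coarse to fine. The 1D setting, the binomial blur and in particular the nearest-neighbor `expand` step are assumptions made for this example; practical implementations usually interpolate more smoothly when upsampling.

```python
def blur(signal, weights=(1, 4, 6, 4, 1)):
    """Low-pass filter with binomial weights, replicating the border samples."""
    r = len(weights) // 2
    total = sum(weights)
    n = len(signal)
    return [sum(weights[k + r] * signal[min(max(i + k, 0), n - 1)]
                for k in range(-r, r + 1)) / total
            for i in range(n)]

def expand(signal, n):
    """Upsample to length n by repeating samples (assumed interpolation)."""
    return [signal[i // 2] for i in range(n)]

def laplacian_pyramid(signal, levels):
    gauss = [list(signal)]
    for _ in range(levels - 1):
        gauss.append(blur(gauss[-1])[::2])
    # Difference between each Gaussian level and the expanded next-coarser level.
    lap = [[a - b for a, b in zip(g, expand(g_next, len(g)))]
           for g, g_next in zip(gauss, gauss[1:])]
    lap.append(gauss[-1])   # keep the coarsest Gaussian level as the last entry
    return lap

def reconstruct(lap):
    current = lap[-1]
    for band in reversed(lap[:-1]):
        current = [b + e for b, e in zip(band, expand(current, len(band)))]
    return current

signal = [float((3 * i) % 7) for i in range(16)]
lap = laplacian_pyramid(signal, 4)
rebuilt = reconstruct(lap)
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
print([len(level) for level in lap], max_error)
```

Because reconstruction uses the same `expand` as the decomposition, it is exact up to rounding regardless of how crude the upsampling is.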
Major uses of image pyramids

• Compression
• Object detection
– Scale search
– Features
• Detecting stable interest points
• Registration
– Coarse-to-fine

Source: D. Hoiem
Things to Remember

• Image filtering: compute a function of the local neighborhood at each position
• Sometimes it makes sense to think of images and filtering in the frequency domain
• Can be faster to filter using FFT for large images (N log N vs. N² for auto-correlation)
• Template matching: localize given template in image
• Image pyramid: multiresolution representation of image (remember to low-pass filter before sub-sampling)
