Feature Descriptors
What is a Feature Descriptor?
• It is a simplified representation of
the image that contains only the
most important information about
the image.
• Image descriptors and feature descriptors govern how an image is abstracted and quantified, while feature vectors are the output of descriptors and are used to quantify the image.
• Taken as a whole, this process is called feature extraction.
[Figure: the original image, and the same image reduced to only its shape and edges]
Image Descriptors
Image Descriptor: An image
descriptor is an algorithm
and methodology that governs how an
input image is globally quantified and
returns a feature vector abstractly
representing the image contents.
Feature Descriptors
• Feature Descriptor: A feature descriptor is an algorithm and methodology that governs how an input region of an image is locally quantified. A feature descriptor accepts a single input image and returns multiple feature vectors.
Feature Descriptors
• There are a number of feature descriptors out there.
Here are a few of the most popular ones:
HOG: Histogram of Oriented Gradients
SIFT: Scale-Invariant Feature Transform
SURF: Speeded-Up Robust Features
Histogram of Oriented Gradients
• Some important aspects of HOG that make it different
from other feature descriptors:
• The HOG descriptor focuses on the structure or the shape of
an object.
• How is this different from the edge features we extract for
images?
• In the case of edge features, we only identify if the pixel is an
edge or not.
• HOG is able to provide the edge direction as well.
• This is done by extracting the magnitude and orientation (direction) of the
gradient at the edges.
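To make the contrast with plain edge detection concrete, here is a minimal sketch of computing a gradient's magnitude and direction at one pixel. The function name `pixel_gradient`, the centered-difference scheme, and the 3×3 patch are illustrative assumptions, not something the slides prescribe.

```python
import math

def pixel_gradient(image, row, col):
    """Return (magnitude, direction in degrees) at one interior pixel.

    Assumption: gradients are estimated with simple centered differences.
    """
    gx = image[row][col + 1] - image[row][col - 1]  # change along x
    gy = image[row + 1][col] - image[row - 1][col]  # change along y
    magnitude = math.hypot(gx, gy)
    direction = math.degrees(math.atan2(gy, gx))
    return magnitude, direction

# Hypothetical grayscale patch with a vertical edge (dark left, bright right).
patch = [
    [10, 50, 90],
    [10, 50, 90],
    [10, 50, 90],
]

magnitude, direction = pixel_gradient(patch, 1, 1)
print(magnitude, direction)
```

An edge detector would only report that the center pixel lies on an edge. The pair returned here also says how strong the intensity change is and which way it points.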
HOG Features
• Additionally, these orientations are calculated
in ‘localized’ portions.
• This means that the complete image is broken down into
smaller regions and for each region, the gradients and
orientation are calculated.
• Finally, HOG generates a histogram for each
of these regions separately. The histograms are built
from the gradient magnitudes and orientations of the pixel values.
Process of Calculating HOG
• Step 1: Pre-processing
• Step 2: Compute the gradient vector of every pixel, as well as
its magnitude and direction
• Step 3: Calculate histogram of gradients in 8×8 cells
• Step 4: Normalize gradients in 16×16 blocks
• Step 5: HOG feature vector of the complete image
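As a rough guide to what the final step produces, the sketch below estimates the length of the resulting feature vector. It uses the 8×8 cells, 16×16 blocks, and 9 bins from these slides. The one-cell block stride and the 64×128 image size are assumptions made for illustration, as is the helper name `hog_descriptor_length`.

```python
def hog_descriptor_length(width, height, cell=8, block=16, bins=9):
    """Estimate the number of values in a HOG feature vector."""
    cells_x = width // cell
    cells_y = height // cell
    cells_per_block_side = block // cell  # 16 // 8 = 2 cells per side

    # Assumption: blocks slide one cell at a time, so neighbours overlap.
    blocks_x = cells_x - cells_per_block_side + 1
    blocks_y = cells_y - cells_per_block_side + 1

    # Each block concatenates the histograms of all its cells.
    values_per_block = cells_per_block_side ** 2 * bins
    return blocks_x * blocks_y * values_per_block

# Hypothetical 64x128 input image.
length = hog_descriptor_length(64, 128)
print(length)
```

With these assumptions there are 7 × 15 block positions, each contributing 4 × 9 = 36 values.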
Step 3: Calculate histogram of gradients in 8×8 cells
• A histogram is a plot that shows the frequency distribution of a set of continuous data.
• We are going to take the angle or orientation on the x-axis and the magnitude on the y-axis.
• Now that we have our gradient magnitude and orientation representations, we need to divide our image up into cells and blocks.
A “cell” is a rectangular region defined by the number of pixels that belong in each cell.
For example, if we had a 128 x 128 image and defined our pixels per cell as 4 x 4, we would thus have 32 x 32 = 1024 cells.
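The cell count in the example above is simple integer arithmetic. The sketch below reproduces it. The helper name `count_cells` is hypothetical, and the code assumes square cells and an image size that divides evenly.

```python
def count_cells(image_width, image_height, pixels_per_cell):
    """Return (cells across, cells down, total cells) for square cells."""
    cells_across = image_width // pixels_per_cell
    cells_down = image_height // pixels_per_cell
    return cells_across, cells_down, cells_across * cells_down

# The 128 x 128 image with 4 x 4 pixels per cell from the example.
across, down, total = count_cells(128, 128, 4)
print(across, down, total)
```

Switching to the 8×8 cells used in the rest of Step 3 gives 16 × 16 = 256 cells for the same image.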
Step 3 continued….
[Figure. Center: the RGB patch with its gradients drawn as arrows. Right: the gradients in the same patch shown as numbers.]
Step 3 continued….
• The next step is to create a histogram of gradients in
these 8×8 cells. The histogram contains 9 bins
corresponding to angles 0, 20, 40 … 160.
• The angles lie between 0 and 180 degrees rather than 0 and 360. These are
called “unsigned” gradients, because a gradient and its negative map to the same angle.
Empirically it has been shown that unsigned gradients work better than
signed gradients for pedestrian detection.
Step 3 continued….
Calculating weighted votes in each bin
Step 3 continued….
• The contributions of all the pixels in the 8×8 cell are added up to create the 9-bin histogram. For the patch in the previous slide, it looks like this.
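The accumulation can be sketched as follows. The slides do not spell out the voting rule, so this assumes one common scheme: each pixel's magnitude is split between the two nearest bin centers in proportion to how close its angle is to each, and angles past 160 degrees wrap around to the 0 bin. The function name `cell_histogram` and the sample values are hypothetical.

```python
def cell_histogram(magnitudes, angles, bins=9, bin_width=20.0):
    """Build a 9-bin unsigned-gradient histogram for one cell.

    magnitudes, angles: flat lists with one entry per pixel in the cell.
    """
    hist = [0.0] * bins
    for magnitude, angle in zip(magnitudes, angles):
        angle = angle % 180.0          # fold into the unsigned range [0, 180)
        position = angle / bin_width   # e.g. 30 degrees -> 1.5
        lower = int(position)          # bin centered at or below the angle
        upper = (lower + 1) % bins     # next bin; 160-180 wraps to bin 0
        upper_share = position - lower
        hist[lower] += magnitude * (1.0 - upper_share)
        hist[upper] += magnitude * upper_share
    return hist

# Three hypothetical pixels instead of a full 8x8 cell, to keep it readable.
hist = cell_histogram([4.0, 2.0, 10.0], [40.0, 10.0, 170.0])
print(hist)
```

The pixel at 40 degrees votes entirely for the 40 bin. The pixel at 10 degrees splits evenly between the 0 and 20 bins, and the pixel at 170 degrees splits evenly between the 160 and 0 bins. The histogram total always equals the sum of the magnitudes.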
Step 4: 16×16 Block Normalization
• In the previous step, we created a histogram based on
the gradient of the image.
• Gradients of an image are sensitive to overall lighting.
• If you make the image darker by dividing all pixel values by 2,
the gradient magnitudes are halved, and therefore the
histogram values are halved as well.
• Ideally, we want our descriptor to be independent of
lighting variations.
• In other words, we would like to “normalize” the
histogram so that it is not affected by lighting variations.
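The sensitivity to lighting is easy to check on a single row of pixels. This is a small sketch with made-up intensity values, using centered differences as an assumed gradient estimate.

```python
def horizontal_gradients(row):
    """Centered-difference gradient at each interior pixel of one row."""
    return [row[i + 1] - row[i - 1] for i in range(1, len(row) - 1)]

# Hypothetical row of grayscale intensities.
row = [20, 60, 120, 200, 220]
darker = [value / 2 for value in row]  # the same row at half brightness

original = horizontal_gradients(row)
halved = horizontal_gradients(darker)
print(original, halved)
```

Every gradient in the darker row is exactly half of its counterpart, so any histogram built from these magnitudes scales by the same factor.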
Step 4 continued…
• Let’s say we have an RGB color vector [ 128, 64, 32 ]. The
length of this vector is $\sqrt{128^2 + 64^2 + 32^2} = 146.64$.
• This is also called the L2 norm of the vector.
• Dividing each element of this vector by 146.64 gives us a
normalized vector [0.87, 0.44, 0.22].
• Now consider another vector in which the elements are twice
the value of the first vector: 2 x [ 128, 64, 32 ] = [ 256, 128, 64 ].
• Normalizing [ 256, 128, 64 ] also results in [0.87, 0.44, 0.22],
which is the same as the normalized version of the original RGB
vector.
• We can see that normalizing a vector removes the scale.
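The same calculation can be run directly. This is a minimal sketch; the helper name `l2_normalize` is illustrative, and it assumes the input is not the all-zero vector.

```python
import math

def l2_normalize(vector):
    """Divide each element by the vector's L2 norm (its Euclidean length)."""
    length = math.sqrt(sum(v * v for v in vector))
    return [v / length for v in vector]

first = l2_normalize([128, 64, 32])
second = l2_normalize([256, 128, 64])  # the same vector scaled by 2

print(first)   # approximately [0.873, 0.436, 0.218]
print(second)  # matches `first` up to floating-point rounding
```

Both results have length 1 and point in the same direction, which is why a normalized block of histograms no longer depends on a uniform change in brightness.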