Image Classification and Analysis

Dr. P. K. Mani

Bidhan Chandra Krishi Viswavidyalaya
E-mail: pabitramani@gmail.com
Website: www.bckv.edu.in
Image Processing and Analysis
Classification
• Bands of a single image are used to identify and separate spectral
signatures of landscape features.
• Ordination and other statistical techniques are used to “cluster” pixels of
similar spectral signatures in a theoretical space.
• The maximum likelihood classifier is most often used.
• Each cluster is then assigned to a category and applied to the image to
create a classified image.
• The resulting classified image can now be used and interpreted as a
map.
• The resulting classified image will have errors! Accuracy assessment is
critical. Maps created by image classification should report an estimate of
accuracy.
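As a minimal sketch of what such an accuracy estimate could look like, the following compares classified labels with reference ("ground truth") labels at a handful of check pixels and reports the overall accuracy. The label lists and function names are hypothetical illustrations and do not come from the slides; overall accuracy is only one of several possible measures.

```python
from collections import Counter


def confusion_matrix(reference, classified):
    """Count (reference label, classified label) pairs over the check pixels."""
    if len(reference) != len(classified):
        raise ValueError("label lists must have the same length")
    return Counter(zip(reference, classified))


def overall_accuracy(reference, classified):
    """Fraction of check pixels whose classified label matches the reference."""
    matrix = confusion_matrix(reference, classified)
    correct = sum(n for (ref, cls), n in matrix.items() if ref == cls)
    return correct / len(reference)


# Hypothetical check pixels (not real data).
reference = ["water", "water", "conifer", "conifer", "deciduous", "deciduous"]
classified = ["water", "water", "conifer", "deciduous", "deciduous", "deciduous"]

print(overall_accuracy(reference, classified))
print(confusion_matrix(reference, classified))
```

The off-diagonal entries of the confusion matrix (here one conifer pixel labelled deciduous) show which classes are being confused with each other.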
Image Processing and Analysis
3. Classification
[Figure: Bands 1 to 4 supply spectral signatures to a “black box” (transformation / clustering, maximum likelihood classifier), which outputs the classified image (map).]
Image Classification
In order to make the classifier work with thematic (instead of
spectral) classes, some “knowledge” about the relationship
between classes and feature vectors must be given.
Therefore, supervised classification methods are much more widely
used. The process is divided into two phases: a
training phase, in which the user “trains” the computer by
assigning a limited number of pixels to the classes they
belong to in this particular image, followed by a decision-making
phase, in which the computer assigns a class label to
all (other) image pixels by finding, for each pixel, the
trained class to which it is most similar.
The classes to be used are defined before the training phase.
For each class some “ground truth” is needed.
Guidelines for selecting training areas:
· Training areas should be homogeneous. This can be tested with
histograms, numeric summaries, 2-band scatter plots for investigating
separability of feature classes by pairs of bands, or a 3-D plot of the 3-band
feature space (if the software allows!).
· One large ‘uniform’ training area per feature class is preferable to
several smaller training areas, though this depends on the degree of
variability within each class from site to site and the degree of variability
within an individual site.
· It is easy to extract more than is needed and then examine site statistics
before making a decision.
· Each training area should be easily located in the image: use a
topographic map, nautical chart, or aerial photos to assist;
differential GPS observations may also help.
· If a smaller training area is necessary, then the minimum size is critical.
What should be the size of the training site?
· Note the CCRS statement for MSS: an individual training area should be a
minimum of 3-4 pixels East-West by 6 pixels North-South.
· Others [e.g. Swain and Davis, IDRISI] state 10 × the number of bands used, e.g. an area of
40 pixels if all four MSS bands are used (or approx. 6 pixels × 7 pixels).
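The "10 × number of bands" rule of thumb is simple arithmetic, and a small sketch makes the 40-pixel and 6 × 7 figures easy to reproduce. The function names and the near-square layout are illustrative assumptions, not something prescribed by the cited sources.

```python
import math


def min_training_pixels(n_bands, factor=10):
    """Rule of thumb quoted above: factor x number of bands used."""
    return factor * n_bands


def near_square_block(n_pixels):
    """Smallest near-square block (rows, cols) that holds at least n_pixels."""
    rows = math.isqrt(n_pixels)
    cols = rows
    while rows * cols < n_pixels:
        # Grow the block one side at a time so it stays close to square.
        if cols == rows:
            cols += 1
        else:
            rows += 1
    return rows, cols


pixels_needed = min_training_pixels(4)  # all four MSS bands
print(pixels_needed, near_square_block(pixels_needed))
```

For four bands this gives 40 pixels, and the smallest near-square block covering them is 6 × 7 (42 pixels), which is where the "approx. 6 pixels × 7 pixels" figure comes from.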
It is common to call the three bands “features”. The term
features is used instead of bands because it is very usual to
apply transformations to the image prior to classification.
These are called “feature transformations”, and their results
“derived features”. Principal components are one example.
In one pixel, the values in the (three) features can be
regarded as components of a 3-dimensional vector, the
feature vector. Such a vector can be plotted in a 3-dimensional
space, called feature space. Pixels belonging to
the same (land cover) class and having similar characteristics
end up near each other in the feature space, regardless of
how far they are from each other in the terrain and in the
image. All pixels belonging to a certain class will (hopefully)
form a cluster in the feature space.
Supervised Classification
Supervised classification requires the analyst to select training areas where
he/she knows what is on the ground and then digitize a
polygon within that area…
The computer then creates mean spectral signatures.
[Figure: digital image with known conifer, water, and deciduous areas, each producing a mean spectral signature.]
Supervised Classification
[Figure: the spectral signature of the next pixel to be classified in the multispectral image is compared with the mean spectral signatures (conifer, deciduous, water, unknown) to produce information (the classified image).]
The result is information, in this case a land cover map...
[Figure: land cover map. Legend: water, conifer, deciduous.]
Multispectral image classification is used to extract
thematic information from satellite images in a semi-automatic
way.
Image classification is based on probability theory. Looking at a certain image pixel in M bands
simultaneously, M values are observed at the same time.
Using multispectral SPOT images, where M=3, three reflection
values per pixel are given.
For instance, (34, 25, 117) in one pixel, (34, 24, 119) in another,
and (11, 77, 51) in a third. The values found for
one pixel in several bands are called its feature vector.
It can be seen that the first two sets of values are
quite similar and that the third is different from the other two.
The first two probably belong to the same (land cover) class
and the third belongs to another one.
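One way to make "quite similar" concrete is to measure the Euclidean distance between the feature vectors. The sketch below uses the three example pixels from the text; the choice of Euclidean distance and the function name are assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# The three example feature vectors (M = 3 bands) from the text.
p1, p2, p3 = (34, 25, 117), (34, 24, 119), (11, 77, 51)

d12 = euclidean(p1, p2)
d13 = euclidean(p1, p3)
d23 = euclidean(p2, p3)
print(d12, d13, d23)
```

The first two pixels lie only a couple of units apart in feature space, while the third is far from both, which matches the intuition that it belongs to a different class.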
Unsupervised Classification
The analyst requests the computer
to examine the image and extract a
number of spectrally distinct
clusters…

[Figure: digital image and the spectrally distinct clusters extracted from it (Clusters 1 to 6).]
Unsupervised Classification
[Figure: the next pixel to be classified is compared with the saved clusters (Clusters 1 to 6, or unknown) to build the output classified image.]
Unsupervised Classification
• Recall:
In unsupervised classification, the spectral
data impose constraints on our interpretation
• How?
Rather than defining training sets and
carving out pieces of n-dimensional space, we
define no classes beforehand and instead use
statistical approaches to divide the n-dimensional
space into clusters with the best
separation
• After the fact, we assign class names to those
clusters
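The slides do not name a specific clustering algorithm. As one possible illustration, the sketch below uses k-means, a common choice, which is an assumption here rather than the method the slides prescribe. The pixel values, the starting centers, and the function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vector, centers):
    """Index of the cluster center closest to the vector."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(vector, centers[i]))


def k_means(vectors, initial_centers, max_iterations=20):
    """Plain k-means: assign each vector to its nearest center, then recompute."""
    centers = [tuple(c) for c in initial_centers]
    labels = []
    for _ in range(max_iterations):
        labels = [nearest(v, centers) for v in vectors]
        new_centers = []
        for i, old in enumerate(centers):
            members = [v for v, lab in zip(vectors, labels) if lab == i]
            if members:
                # New center is the per-band mean of the cluster members.
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster where it was
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return labels, centers


# Hypothetical 3-band pixels; no class names are supplied in advance.
pixels = [(34, 25, 117), (34, 24, 119), (11, 77, 51), (12, 75, 50)]
labels, centers = k_means(pixels, initial_centers=[pixels[0], pixels[2]])
print(labels, centers)
```

The output contains only cluster numbers. Attaching names such as "water" or "conifer" to those numbers is the after-the-fact step the analyst still has to perform.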
Supervised Classification
• Common Classifiers:
– Parallelepiped/Box classifier
– Minimum distance to mean
– Maximum likelihood
Supervised Classification
• Parallelepiped/Box
Approach
The Box classifier is the simplest
classification method: in 2-D space,
rectangles are created around the
training feature vectors of each class;
in 3-D space they are actually boxes
(blocks).
The position and size of each box
can be set exactly around the feature
vectors (Min-Max method), or
from the mean vector (which will
be at the center of the box) and the
standard deviations of the feature
vectors, calculated separately per
feature (these determine the size of the
box in that dimension).
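A minimal sketch of the Min-Max variant follows. The training values and function names are hypothetical, and two behaviours are assumptions not stated in the slides: a pixel falling in no box is labelled "unknown", and where boxes overlap the first matching class wins.

```python
def min_max_box(training_vectors):
    """Per-feature minimum and maximum of one class's training vectors."""
    lows = tuple(min(col) for col in zip(*training_vectors))
    highs = tuple(max(col) for col in zip(*training_vectors))
    return lows, highs


def box_classify(vector, boxes):
    """Return the first class whose box contains the vector, else 'unknown'."""
    for name, (lows, highs) in boxes.items():
        if all(lo <= x <= hi for x, lo, hi in zip(vector, lows, highs)):
            return name
    return "unknown"


# Hypothetical 3-band training vectors per class (not real data).
training = {
    "water": [(10, 8, 4), (14, 12, 8)],
    "conifer": [(30, 20, 78), (36, 28, 86)],
}
boxes = {name: min_max_box(vectors) for name, vectors in training.items()}

print(box_classify((12, 10, 6), boxes))
print(box_classify((33, 25, 80), boxes))
print(box_classify((90, 90, 90), boxes))
```

For the mean and standard deviation variant, only `min_max_box` would change: the bounds would become the mean minus and plus some multiple of the per-feature standard deviation.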
Supervised Classification: Statistical Approaches
Minimum distance to mean
The Minimum Distance-to-Mean
classifier
first calculates, for each class, the
mean vector of the training
feature vectors.
Then the feature space is
partitioned by giving each
feature vector the class label of
the nearest mean vector,
according to the Euclidean metric.
Usually it is possible to specify a
maximum distance threshold:
if the nearest mean is still further away than that threshold, it is
assumed that none of the classes is similar enough and the
result will be “unknown”.
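The two steps described above (class means, then nearest mean with an optional threshold) can be sketched as follows. The training values, the threshold of 20, and the function names are hypothetical choices for illustration.

```python
import math


def mean_vector(vectors):
    """Per-feature mean of a class's training feature vectors."""
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))


def min_distance_classify(vector, class_means, max_distance=None):
    """Label of the nearest class mean (Euclidean), or 'unknown' if too far."""
    best_name, best_dist = None, math.inf
    for name, mean in class_means.items():
        dist = math.sqrt(sum((x - m) ** 2 for x, m in zip(vector, mean)))
        if dist < best_dist:
            best_name, best_dist = name, dist
    if max_distance is not None and best_dist > max_distance:
        return "unknown"
    return best_name


# Hypothetical 3-band training vectors per class (not real data).
training = {
    "water": [(12, 9, 5), (14, 11, 7)],
    "conifer": [(30, 22, 80), (34, 26, 84)],
}
class_means = {name: mean_vector(v) for name, v in training.items()}

print(min_distance_classify((33, 25, 80), class_means))
print(min_distance_classify((90, 90, 90), class_means))
print(min_distance_classify((90, 90, 90), class_means, max_distance=20))
```

Without a threshold every pixel receives some class label, however poor the match. The threshold is what allows an "unknown" result.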
Gaussian Maximum Likelihood
classifiers assume that the
feature vectors of each class
are (statistically) distributed
according to a multivariate
normal probability density
function. The training samples
are used to estimate the
parameters of the distributions.
The boundaries between the
different partitions in the feature
space are placed where the
decision changes from one
class to another. They are
called decision boundaries.
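A rough sketch of the idea, restricted to two features so the covariance matrix can be inverted by hand, is given below. It assumes equal prior probabilities for all classes and uses hypothetical training samples and function names. A practical implementation would handle any number of bands with a linear algebra library.

```python
import math


def fit_gaussian_2d(samples):
    """Estimate the mean vector and sample covariance from 2-band samples."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def log_density(vector, params):
    """Log of the bivariate normal density at the given feature vector."""
    (mx, my), (sxx, sxy, syy) = params
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("singular covariance: training set too small or uniform")
    dx, dy = vector[0] - mx, vector[1] - my
    # Squared Mahalanobis distance via the closed-form 2x2 matrix inverse.
    maha = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * (maha + math.log(det)) - math.log(2 * math.pi)


def ml_classify(vector, class_params):
    """Pick the class with the highest likelihood (equal priors assumed)."""
    return max(class_params,
               key=lambda name: log_density(vector, class_params[name]))


# Hypothetical 2-band training samples per class (not real data).
training = {
    "water": [(10, 8), (12, 11), (14, 9), (11, 12)],
    "conifer": [(30, 80), (34, 86), (32, 78), (36, 84)],
}
class_params = {name: fit_gaussian_2d(s) for name, s in training.items()}

print(ml_classify((12, 10), class_params))
print(ml_classify((33, 81), class_params))
```

The decision boundary between two classes is the set of feature vectors where their log densities are equal. Because the covariance term differs per class, that boundary is generally curved rather than a straight line.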
Supervised Classification
• Maximum likelihood
– Pro:
• Most sophisticated; achieves good separation of
classes

– Con:
• Requires strong training set to accurately describe
mean and covariance structure of classes
Classification: Critical Point
• LAND COVER not necessarily equivalent to
LAND USE
– We focus on what’s there: LAND COVER
– Many users are interested in how what’s there
is being used: LAND USE
• Example
– Grass is land cover; pasture and recreational
parks are land uses of grass

Image classification, remote sensing, P K MANI
