Classification Introduction
SANSA Earth Observation Directorate
Outline
Introduction
Why do we classify?
Image Classification process
Remote Sensing data
Preparing training data
– Supervised training
– Unsupervised training
– Considerations
Basic classification principles
Image classification algorithms (Pixel-based and Object-based)
Classification results
Summary
Introduction
Image classification is the process of sorting pixels into a
finite number of individual classes, or categories of data,
based on their data file values.
Classes may be associated with known features on the
ground or may represent areas that appear spectrally different
to the computer.
An example of a classified image is a land cover map,
showing vegetation, bare land, pasture, urban, etc.
Why do we classify?
Satellite images contain only raw data about the cover of the
earth's surface.
Classification interprets what is seen in the image and turns it
into a thematic map of the earth's surface cover.
Products of image classifications can be easily read and
interpreted by non-specialists and used for planning and
decision making.
The quality of the resulting products can be verified in the
field or using other reference data.
Image Classification Process…
Statistics are derived from the spectral characteristics of all
pixels in an image.
Pixels are then sorted into classes based on mathematical or
statistical criteria.
The classification process can be broken down into three
stages:
Training: defining the criteria by which patterns in the data are
recognised. The computer must be trained to recognise patterns.
Classification: based on the training dataset, an algorithm is
chosen to sort all remaining (non-training) pixels into the defined classes.
Post-processing: can include cleaning (removing salt-and-pepper noise),
class merging, accuracy validation and conversion to a vector layer.
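The cleaning step can be illustrated with a 3x3 majority filter. This is only a minimal sketch, assuming NumPy and a small hypothetical classified image; the function name `majority_filter` and the window size are illustrative choices, not part of the course material.

```python
import numpy as np

def majority_filter(classified, n_classes):
    """Replace each pixel with the most common class in its 3x3 neighbourhood."""
    padded = np.pad(classified, 1, mode="edge")
    rows, cols = classified.shape
    # votes[k, r, c] counts how many of the 9 neighbours of (r, c) have class k
    votes = np.zeros((n_classes, rows, cols), dtype=int)
    for dr in range(3):
        for dc in range(3):
            window = padded[dr:dr + rows, dc:dc + cols]
            for k in range(n_classes):
                votes[k] += (window == k)
    return votes.argmax(axis=0)

# Hypothetical classified image: one isolated class-1 pixel inside class 0
classified = np.array([
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
])
cleaned = majority_filter(classified, n_classes=2)
```

In this example the isolated pixel at row 1, column 1 is absorbed into class 0, while the solid bottom row of class 1 is kept.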
Remote sensing data
Training data
Types of classification training
Supervised training:
Analyst selects sets of pixels representative of specific land cover
features. Selection can be made based on:
Recognition in image
External sources, e.g. aerial photos, ground truth data, or maps.
Requires knowledge of the data – particularly spectral information
Requires legend to be defined beforehand, thus knowledge of class
requirements
Result of supervised training is a set of signatures that correspond
to each required cover class.
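A signature can be thought of as the per-class statistics of the training pixels. The sketch below assumes NumPy and made-up reflectance values for two classes; `class_signatures` is a hypothetical helper, and real software may store additional statistics.

```python
import numpy as np

def class_signatures(pixels, labels):
    """Return {class_id: (mean vector, covariance matrix)} from training pixels."""
    signatures = {}
    for class_id in np.unique(labels):
        samples = pixels[labels == class_id]
        signatures[int(class_id)] = (samples.mean(axis=0),
                                     np.cov(samples, rowvar=False))
    return signatures

# Hypothetical training pixels: rows are pixels, columns are two bands (reflectance in %)
train_pixels = np.array([[5.0, 30.0], [6.0, 34.0], [4.0, 32.0],      # class 0: vegetation
                         [15.0, 18.0], [17.0, 20.0], [16.0, 22.0]])  # class 1: soil
train_labels = np.array([0, 0, 0, 1, 1, 1])
signatures = class_signatures(train_pixels, train_labels)
```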
Unsupervised training:
Analyst defines the number of input classes (usually 2-3 times the
number of legend classes required)
Computer automatically groups pixels into the defined
number of clusters based on spectral similarity
Result is a class image with the defined number of classes. The
clusters do not necessarily correspond directly to meaningful
classes in the scene.
Analyst has to make sense of this image post classification (usually
involves merging of classes, etc.).
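The merging step can be sketched as a lookup table from cluster ids to legend classes. The cluster image, the legend names and the mapping below are all hypothetical; in practice the analyst decides the mapping by inspecting each cluster.

```python
import numpy as np

# Hypothetical cluster image from an unsupervised run with 6 clusters
cluster_image = np.array([
    [0, 0, 3, 3],
    [1, 0, 3, 4],
    [1, 2, 5, 4],
])

# Analyst-defined legend and lookup (index = cluster id, value = legend class id)
legend = {0: "water", 1: "vegetation", 2: "bare soil"}
cluster_to_class = np.array([0, 1, 1, 2, 2, 0])

# Indexing the lookup with the cluster image relabels every pixel in one step
class_image = cluster_to_class[cluster_image]
```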
Training Considerations
The training phase of image classification is fundamental as:
“poor training will produce poor results”
Some considerations about training:
1. Sampling restrictions (cost, availability of data and accessibility) may
lead to inadequate sampling, and in turn to undertraining.
2. As the dimensionality of the data increases, the number of training
samples required also increases (Hughes phenomenon).
3. Even when mixed pixels dominate the image, it is common to select only
pure pixels for training. This may lead to low classification
accuracy, because the classifier loses its ability to generalize.
4. The training should account for the variability of each class within
the image.
5. Classes should not overlap with one another, or should overlap only partially.
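One common way to check the overlap between two classes is a separability measure such as the Jeffries-Matusita (JM) distance. It is not named in these slides, so treat the sketch below as one possible check: it assumes NumPy, Gaussian class signatures, and hypothetical mean and covariance values.

```python
import numpy as np

def jeffries_matusita(mean1, cov1, mean2, cov2):
    """JM distance between two Gaussian signatures (0 = identical, 2 = fully separable)."""
    mean_diff = mean1 - mean2
    cov_avg = (cov1 + cov2) / 2.0
    # Bhattacharyya distance: a mean-separation term plus a covariance-difference term
    term1 = mean_diff @ np.linalg.solve(cov_avg, mean_diff) / 8.0
    term2 = 0.5 * np.log(np.linalg.det(cov_avg) /
                         np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2)))
    return 2.0 * (1.0 - np.exp(-(term1 + term2)))

# Hypothetical signatures for two classes in a two-band feature space
veg_mean, soil_mean = np.array([5.0, 32.0]), np.array([16.0, 20.0])
cov = 4.0 * np.eye(2)

jm_different = jeffries_matusita(veg_mean, cov, soil_mean, cov)
jm_identical = jeffries_matusita(veg_mean, cov, veg_mean, cov)
```

Values close to 2 suggest the two classes barely overlap; values close to 0 suggest the training data for the two classes are nearly indistinguishable.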
Basic Classification principles
Pixel-based classifiers use the radiometric properties recorded by
the sensor, e.g. DN/reflectance
Different objects reflect differently
[Figure: reflectance (%) of vegetation and soil across Band 1, Band 2, Band 3, Band 4, Band 5 and Band 7]
A pixel occupies a point in an n-dimensional feature space.
For example: here vegetation and soil pixels occupy different points in a 2D space.
[Figure: the vegetation and soil reflectance curves (Band 1 to Band 7) alongside a 2D scatter plot of the two classes, Band 3 (x-axis) against Band 4 (y-axis)]
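The feature-space idea can be made concrete by reshaping an image so that every pixel becomes one row of band values. This is a minimal sketch assuming NumPy and a hypothetical two-band image stored as (bands, rows, columns); the reflectance values are made up.

```python
import numpy as np

# Hypothetical 2-band image with shape (bands, rows, cols); values are reflectance in %
image = np.array([
    [[5.0, 6.0], [15.0, 16.0]],     # Band 3
    [[30.0, 34.0], [18.0, 20.0]],   # Band 4
])

n_bands, n_rows, n_cols = image.shape
# One row per pixel, one column per band: each row is a point in feature space
feature_space = image.reshape(n_bands, -1).T
```

Here the first pixel maps to the point (5, 30) and the last to (16, 20); with n bands, each pixel would be a point in n dimensions.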
Image classification algorithms
Pixel-based classification algorithms
Supervised vs. unsupervised classification algorithms
Examples of supervised classifiers:
1. Maximum likelihood
2. Parallelepiped
3. Minimum distance to mean
4. Support Vector Machine*
5. Artificial Neural Network*
* non-parametric classifiers
Examples of unsupervised classifiers:
1. K-means
2. ISODATA
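As an illustration of the simplest supervised classifier in the list above, minimum distance to mean can be sketched as follows. The sketch assumes NumPy, Euclidean distance, and hypothetical class means and pixels; `minimum_distance_classify` is an illustrative name.

```python
import numpy as np

def minimum_distance_classify(pixels, class_means):
    """Assign each pixel to the class whose mean is nearest in Euclidean distance."""
    # distances[i, k] = distance from pixel i to the mean of class k
    distances = np.linalg.norm(pixels[:, None, :] - class_means[None, :, :], axis=2)
    return distances.argmin(axis=1)

class_means = np.array([[5.0, 32.0],    # class 0: vegetation
                        [16.0, 20.0]])  # class 1: soil
pixels = np.array([[6.0, 30.0], [14.0, 19.0], [9.0, 28.0]])
predicted = minimum_distance_classify(pixels, class_means)
```

This classifier uses only the class means, so it ignores the spread and orientation of each class in feature space.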
Supervised Classification
Maximum likelihood classifier (MLC)
considers the centre, shape, size
and orientation of classes/clusters.
MLC calculates a statistical distance
(probability) that pixel x belongs to a
specific cluster, based on the mean
values and covariance matrix of the
classes.
Assumes classes have a normal
(Gaussian) distribution.
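The maximum likelihood decision rule can be sketched as picking the class with the highest Gaussian log-likelihood. The sketch assumes NumPy, equal prior probabilities for all classes, and hypothetical means and covariance matrices; it is not the implementation of any particular software package.

```python
import numpy as np

def maximum_likelihood_classify(pixels, means, covariances):
    """Assign each pixel to the class with the highest Gaussian log-likelihood."""
    scores = []
    for mean, cov in zip(means, covariances):
        diff = pixels - mean
        # Squared Mahalanobis distance of every pixel to this class
        mahalanobis_sq = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
        # Log-likelihood up to a constant shared by all classes (equal priors assumed)
        scores.append(-0.5 * (np.log(np.linalg.det(cov)) + mahalanobis_sq))
    return np.argmax(np.array(scores), axis=0)

# Hypothetical signatures: class 0 = vegetation, class 1 = soil
means = [np.array([5.0, 32.0]), np.array([16.0, 20.0])]
covariances = [np.array([[1.0, 0.0], [0.0, 4.0]]),
               np.array([[4.0, 0.0], [0.0, 4.0]])]
pixels = np.array([[6.0, 30.0], [15.0, 21.0]])
predicted = maximum_likelihood_classify(pixels, means, covariances)
```

Because the covariance matrix enters the distance, a class with a wide spread can claim pixels that lie further from its mean than they do from the mean of a compact class.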
Unsupervised classification
Iterative Self-Organizing
Data Analysis Technique
(ISODATA) - one of the most
popular methods of
unsupervised classification.
Only requires:
1. number of classes,
2. maximum number of iterations,
3. convergence threshold.
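The role of the three parameters can be seen in a simplified clustering loop. This sketch is closer to K-means than to full ISODATA: it omits the cluster splitting and merging steps, assumes NumPy, and takes the convergence threshold to mean the fraction of pixels that keep their cluster between iterations. The initialisation scheme is also an assumption.

```python
import numpy as np

def iterative_clustering(pixels, n_classes, max_iterations, convergence_threshold):
    """K-means-style loop; stops when enough pixels keep their cluster between iterations."""
    # Spread the initial centres evenly between the per-band minimum and maximum
    lo, hi = pixels.min(axis=0), pixels.max(axis=0)
    centres = lo + (hi - lo) * np.linspace(0.0, 1.0, n_classes)[:, None]
    labels = np.full(len(pixels), -1)
    for _ in range(max_iterations):
        distances = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2)
        new_labels = distances.argmin(axis=1)
        unchanged = np.mean(new_labels == labels)
        labels = new_labels
        if unchanged >= convergence_threshold:
            break
        for k in range(n_classes):
            if np.any(labels == k):  # leave empty clusters where they are
                centres[k] = pixels[labels == k].mean(axis=0)
    return labels, centres

# Hypothetical pixels in a two-band feature space (two visually distinct groups)
pixels = np.array([[5.0, 30.0], [6.0, 34.0], [4.0, 32.0],
                   [15.0, 18.0], [17.0, 20.0], [16.0, 22.0]])
labels, centres = iterative_clustering(pixels, n_classes=2,
                                       max_iterations=10, convergence_threshold=0.95)
```

The cluster numbers themselves carry no meaning; only the grouping does, which is why the analyst labels the clusters afterwards.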
Non-parametric
Expert Systems (Rule Based)
Decision/Classification trees
Artificial Neural Network (ANN)
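A rule-based classifier can be sketched as a small decision tree on a spectral index. The example below assumes NumPy, uses NDVI computed from red and near-infrared reflectance, and applies illustrative thresholds (0.0 and 0.4) that are not calibrated for any real sensor or scene.

```python
import numpy as np

def rule_based_classify(red, nir):
    """Two-level decision tree on NDVI; thresholds are illustrative, not calibrated."""
    ndvi = (nir - red) / (nir + red)   # assumes nir + red is never zero
    classes = np.full(ndvi.shape, 1)   # default: 1 = bare soil
    classes[ndvi > 0.4] = 2            # 2 = vegetation
    classes[ndvi < 0.0] = 0            # 0 = water
    return classes

# Hypothetical reflectance values (%) for three pixels
red = np.array([5.0, 16.0, 6.0])
nir = np.array([32.0, 20.0, 2.0])
result = rule_based_classify(red, nir)
```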
Object-based classification
Partitions the image into meaningful or semantic objects (segmentation).
Uses colour (spectral), texture and shape (compactness) to describe objects.
Objects can be aggregated into larger objects (bottom-up), or
split into smaller objects (top-down).
Allows integration of indices, DEM and vectors.
Classification can be rule-based or supervised.
[Figure: object hierarchy from Pixel Level up through Scale Level 10, Scale Level 20 and Scale Level 30]
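Once an image has been segmented, each object can be described by statistics of the pixels it contains. The sketch below assumes NumPy and a hypothetical segment image that is already available; the segmentation step itself is not shown.

```python
import numpy as np

# Hypothetical segmentation result: each pixel holds the id of the object it belongs to
segments = np.array([
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 2],
])
# Hypothetical single-band reflectance values (%) for the same pixels
band = np.array([
    [10.0, 12.0, 30.0],
    [11.0, 32.0, 31.0],
    [20.0, 22.0, 21.0],
])

# Mean value per object: sum of pixel values per id divided by pixel count per id
pixel_counts = np.bincount(segments.ravel())
object_means = np.bincount(segments.ravel(), weights=band.ravel()) / pixel_counts

# Map the per-object mean back onto the pixel grid
mean_image = object_means[segments]
```

The same pattern could be repeated for other descriptors (texture, shape) before a rule-based or supervised classifier is applied to the objects rather than to single pixels.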
Classification Results
Validating a classification
• Provides a measure of the confidence you can have in the
produced map.
• Validation points are taken from ground samples, higher-
resolution imagery or other reliable data sources
• Confusion matrix
• Error described in terms of:
– Overall & class accuracy
– User's/producer's accuracy
– Kappa statistic
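The accuracy measures listed above can all be derived from the confusion matrix. The sketch below assumes NumPy, places reference classes in rows and mapped classes in columns (conventions differ between software packages), and uses ten hypothetical validation points.

```python
import numpy as np

def accuracy_report(reference, predicted, n_classes):
    """Confusion matrix (rows = reference, columns = map) and derived accuracy measures."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(matrix, (reference, predicted), 1)
    total = matrix.sum()
    correct = np.diag(matrix)
    overall = correct.sum() / total
    producers = correct / matrix.sum(axis=1)   # per reference class (omission errors)
    users = correct / matrix.sum(axis=0)       # per mapped class (commission errors)
    # Agreement expected by chance, from the row and column totals
    expected = (matrix.sum(axis=1) * matrix.sum(axis=0)).sum() / total**2
    kappa = (overall - expected) / (1.0 - expected)
    return matrix, overall, producers, users, kappa

# Hypothetical validation points: true class and mapped class
reference = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
predicted = np.array([0, 0, 0, 1, 1, 1, 1, 1, 0, 1])
matrix, overall, producers, users, kappa = accuracy_report(reference, predicted, 2)
```

For these points, 8 of 10 agree, so the overall accuracy is 0.8, while kappa is lower (about 0.58) because part of that agreement would be expected by chance.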
Confusion matrix
TIGER Land Cover Course
Examples
CORINE Land Cover: http://www.eea.europa.eu/
Global Land Cover 2000: http://bioval.jrc.ec.europa.eu/
Summary
Use spectral (radiometric) and/or object differences to
distinguish classes
Supervised classification
– Training areas characterize spectral properties of classes
– Assign other pixels to classes by matching with spectral
properties of training sets
Unsupervised classification
– Maximize separability of clusters
– Assign class names to clusters after classification
Contact us
E-Mail: [email protected]
Website: http://www.sansa.org.za
Tel: +27 12 844 0500 (reception)
ask for Customer Services Earth Observation